Wayfair runs on dynamic pricing. The same sofa can carry one price in the morning, a different one by evening, and another the next day, all driven by demand, inventory, and the platform's own algorithmic pricing model. For a shopper waiting on a deal, or an analyst studying a furniture category, the useful thing is not the price right now but how it moves over time. To see that, you need to record the price on a schedule and keep a history.

This guide shows you how to build a Wayfair price tracker in JavaScript with Node.js and cheerio. You fetch a public Wayfair product or category page through the Crawling API, extract each product's title and price, append a timestamped row to a history file, and run the whole thing on a schedule so the file grows into a price log you can chart. The walkthrough stays scoped to public listing data, and the legality section near the end is genuine, not boilerplate, so read it before you point this at real volume.

What you will build

A Node.js tracker that takes a public Wayfair listing URL, retrieves the rendered HTML through the Crawling API, parses each product card, and writes a timestamped snapshot to disk. Run it once and you get the current prices; run it on a schedule and the history file becomes a time series you can analyze. Each record holds these fields:

  • Timestamp an ISO time string marking when the snapshot was taken, the column that turns rows into a time series.
  • Title the product name read from the listing card, for example "Mahwah 98'' Chenille Square Arm Sofa".
  • Price the price as shown on the card, like "$689.99".
  • URL the source listing page the snapshot came from, so you can group history per page.

Why a plain request fails on Wayfair

If you request a Wayfair listing URL with a bare HTTP client, you rarely get the product grid back. Two things work against you. First, Wayfair renders its listing cards in the browser with JavaScript, so the initial HTML is a near-empty shell until the page's scripts run, and the prices you want live inside that rendered markup. Second, Wayfair flags automated traffic: requests from datacenter IPs and patterns that do not look like a real browser get challenged, rate-limited, or blocked before they reach the rendered product data.

So a working Wayfair tracker needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work, and it is a poor fit for something you want to run unattended on a timer. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse with cheerio.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic JavaScript and Node.js. You should be comfortable writing and running a Node script, installing packages with npm, and working with variables, functions, and loops. If you are new to Node, the official docs and any beginner course will get you to the level this tutorial assumes.

Node.js 16 or later. Confirm your version with node --version. Node runs the JavaScript locally, which is what makes scheduled, unattended tracking possible. If you do not have it, install it from the Node.js website or through a version manager like nvm.

A Crawlbase account and token. Sign up, open your dashboard, and copy your token from the account docs page. The free tier gives you 1,000 requests with no card, which is plenty for building and testing a tracker. Treat the token like a password: it authenticates your requests, so keep it out of version control.

Set up the project

Create a project folder, initialize it, and install the libraries the tracker needs.

bash
node --version

mkdir wayfair-price-tracker && cd wayfair-price-tracker
npm init -y

npm install crawlbase cheerio

Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. Node's built-in fs module handles writing the history file, so there is nothing extra to install for that. Create a file named tracker.js in this folder and add the code from the steps below.

Step 1: Fetch the rendered Wayfair page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your token, and request the listing URL. We use the sofas category page as the running example. Checking the status code before you parse keeps failures loud instead of silent.

javascript
const { CrawlingAPI } = require('crawlbase');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

const wayfairPageURL =
  'https://www.wayfair.com/furniture/sb0/sofas-c413892.html';

api
  .get(wayfairPageURL)
  .then((response) => {
    if (response.statusCode === 200) {
      console.log(response.body.slice(0, 500));
    }
  })
  .catch((error) => console.error('API request error:', error));

Run the script with node tracker.js and you should see real Wayfair listing markup at the top of the body, not a stripped-down shell. That confirms rendering works before you write a single selector. Wayfair's furniture pages are heavily JavaScript-rendered, so this render step is what makes the prices appear in the returned HTML at all.

Crawlbase Crawling API

That first request just returned a fully rendered Wayfair page without a headless browser or a proxy on your side. The Crawling API runs the page in a real browser, rotates through residential IPs server-side, and handles the challenges Wayfair throws at scrapers, so you get finished HTML from one call, which is exactly what a scheduled tracker needs to run unattended. Point it at the sofas page on the free tier first.

Step 2: Extract title and price with cheerio

With rendered HTML in hand, load it into cheerio and walk the product cards. Wayfair lays each product out in a repeating container marked data-hb-id="Card", so you select every card, then read the title and price from inside it. To find these yourself on a live page, right-click a product and choose "Inspect," then look for the card element and the attributes that label the name and the price. The fields below are the ones Wayfair exposes through data-test-id attributes.

javascript
const cheerio = require('cheerio');

function parseProducts(html) {
  const $ = cheerio.load(html);
  const products = [];

  // Each listing sits in a card container
  $('div[data-hb-id="Card"]').each((index, element) => {
    const card = $(element);

    let productName = card
      .find('p[data-test-id="ListingCard-ListingCardName-Text"]')
      .text()
      .trim();
    const productPrice = card
      .find('span[data-test-id="PriceDisplay"]')
      .text()
      .trim();

    // Fall back to a label when the name is missing
    if (productName === '') {
      productName = 'Name is not available';
    }

    if (productPrice) {
      products.push({ title: productName, price: productPrice });
    }
  });

  return products;
}

A few details keep this faithful to the page. The product name comes from the p[data-test-id="ListingCard-ListingCardName-Text"] element, and the price comes from the span[data-test-id="PriceDisplay"] element. The .text() method grabs the inner text and .trim() strips the surrounding whitespace so you get a clean value like "$689.99". When a card has no readable name, the parser substitutes "Name is not available" rather than dropping the row, which matches how Wayfair occasionally renders ad or sponsored slots without a standard title.

Selectors drift

Wayfair's attribute hooks (data-hb-id, the data-test-id values) are tied to its front-end build and can change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back empty, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production tracker, not a sign something is broken.

Step 3: Log prices over time to JSON

This is the step that turns a one-off scrape into a tracker. Instead of overwriting a file each run, you read the existing history, append the new snapshot with a timestamp, and write the combined record back. Each run adds one batch of rows stamped with the moment it ran, so the file accumulates a price history you can chart or diff later.

javascript
const fs = require('fs');

const HISTORY_FILE = 'price-history.json';

function loadHistory() {
  if (!fs.existsSync(HISTORY_FILE)) return [];
  return JSON.parse(fs.readFileSync(HISTORY_FILE, 'utf-8'));
}

function appendSnapshot(products, pageUrl) {
  const timestamp = new Date().toISOString();
  const rows = products.map((p) => ({
    timestamp,
    title: p.title,
    price: p.price,
    url: pageUrl,
  }));

  const history = loadHistory();
  history.push(...rows);
  fs.writeFileSync(HISTORY_FILE, JSON.stringify(history, null, 2));
  console.log(`Logged ${rows.length} prices at ${timestamp}`);
}

The loadHistory helper returns an empty array on the first run, when the file does not exist yet, so there is no special-casing needed elsewhere. appendSnapshot stamps every row in a single run with the same ISO timestamp, which makes it trivial to group a run later or to compare a product's price across two timestamps. Because each row carries its source url, you can track several Wayfair pages into the same file and still separate them when you analyze.

Step 4: Assemble the full tracker

Now wire the fetch, the parse, and the logging into one runnable script. This is the complete tracker: run it and it records one snapshot.

javascript
const fs = require('fs');
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });
const HISTORY_FILE = 'price-history.json';
const wayfairPageURL =
  'https://www.wayfair.com/furniture/sb0/sofas-c413892.html';

async function crawl(pageUrl) {
  const response = await api.get(pageUrl);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

function parseProducts(html) {
  const $ = cheerio.load(html);
  const products = [];
  $('div[data-hb-id="Card"]').each((index, element) => {
    const card = $(element);
    let productName = card
      .find('p[data-test-id="ListingCard-ListingCardName-Text"]')
      .text()
      .trim();
    const productPrice = card
      .find('span[data-test-id="PriceDisplay"]')
      .text()
      .trim();
    if (productName === '') productName = 'Name is not available';
    if (productPrice) products.push({ title: productName, price: productPrice });
  });
  return products;
}

function loadHistory() {
  if (!fs.existsSync(HISTORY_FILE)) return [];
  return JSON.parse(fs.readFileSync(HISTORY_FILE, 'utf-8'));
}

function appendSnapshot(products, pageUrl) {
  const timestamp = new Date().toISOString();
  const rows = products.map((p) => ({ timestamp, title: p.title, price: p.price, url: pageUrl }));
  const history = loadHistory();
  history.push(...rows);
  fs.writeFileSync(HISTORY_FILE, JSON.stringify(history, null, 2));
  console.log(`Logged ${rows.length} prices at ${timestamp}`);
}

async function track() {
  const html = await crawl(wayfairPageURL);
  if (!html) return;
  const products = parseProducts(html);
  appendSnapshot(products, wayfairPageURL);
}

track();

Run it with node tracker.js. The first run creates price-history.json and logs the current sofa prices; every later run appends another timestamped batch to the same file. The fetch, parse, and log stages are each small and independent, which makes the script easy to extend, for example to track a different category by changing one URL.

What the output looks like

The history file holds one object per product per run, each carrying the timestamp, title, price, and source URL. After two runs a few hours apart, a single product appears twice with the same title and possibly a different price, which is exactly the signal a tracker exists to capture.

json
[
  {
    "timestamp": "2024-03-18T09:00:00.000Z",
    "title": "Mahwah 98'' Chenille Square Arm Sofa",
    "price": "$689.99",
    "url": "https://www.wayfair.com/furniture/sb0/sofas-c413892.html"
  },
  {
    "timestamp": "2024-03-18T09:00:00.000Z",
    "title": "Adelmina 88.6'' Upholstered Sofa",
    "price": "$444.99",
    "url": "https://www.wayfair.com/furniture/sb0/sofas-c413892.html"
  },
  {
    "timestamp": "2024-03-18T21:00:00.000Z",
    "title": "Mahwah 98'' Chenille Square Arm Sofa",
    "price": "$649.99",
    "url": "https://www.wayfair.com/furniture/sb0/sofas-c413892.html"
  }
]

Export the history to CSV

JSON is convenient for the running log, but a CSV opens straight in Excel or Google Sheets, where charting a price line over time takes a couple of clicks. This helper reads the JSON history and writes a flat price-history.csv with one row per snapshot. It mirrors the legacy approach of writing name and price columns, with the timestamp added so the history is plottable.

javascript
const fs = require('fs');

function exportCsv() {
  const history = JSON.parse(fs.readFileSync('price-history.json', 'utf-8'));
  const headers = ['timestamp', 'title', 'price', 'url'];
  const escape = (value) => `"${String(value).replace(/"/g, '""')}"`;

  const lines = [headers.join(',')];
  for (const row of history) {
    lines.push(headers.map((h) => escape(row[h])).join(','));
  }

  fs.writeFileSync('price-history.csv', lines.join('\n'));
  console.log(`Exported ${history.length} rows to price-history.csv`);
}

exportCsv();

The escape helper wraps every field in quotes and doubles any embedded quotes, which matters because Wayfair product titles are long and often contain commas, inch marks, and other punctuation. The result is a clean time series: one column for when, one for which product, one for the price, ready to pivot on title and chart the price line per item. This is the same pattern you would use for any price intelligence workflow.

Run it on a schedule

A tracker is only useful if it runs by itself. You have two straightforward options. The simplest is the operating system scheduler: on macOS or Linux, a cron entry runs the script at a fixed interval, with each run appending one snapshot.

bash
# Open your crontab
crontab -e

# Run the tracker every day at 9am and 9pm
0 9,21 * * * cd /path/to/wayfair-price-tracker && node tracker.js

If you would rather keep the schedule inside Node, so it travels with the project and runs anywhere Node runs, install node-cron and wrap the existing track function in a schedule.

javascript
// npm install node-cron
const cron = require('node-cron');

// At minute 0 of hours 9 and 21, every day
cron.schedule('0 9,21 * * *', () => {
  console.log('Running scheduled price check...');
  track();
});

Keep the polling interval reasonable. Twice a day is enough to catch most of Wayfair's price movement without putting unnecessary load on the site, and a calm request rate is also the single biggest factor in staying unblocked. Over a few weeks of running, the history file becomes a real record of how each sofa's price moved, which is the whole point of tracking rather than spot-checking. The same recorded series feeds naturally into a price comparison tool if you start tracking competing listings alongside Wayfair's.

Staying unblocked

Even with rendering handled, Wayfair watches for scraper-shaped traffic, and a tracker that runs forever is more exposed than a one-off run. A few habits keep it healthy.

  • Pace your requests. A tracker does not need to poll often. A run every several hours captures the price trend and keeps you far under any rate limit. Resist the urge to scrape on a tight loop.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a limit or a challenge. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A scheduled job that starts returning non-200 responses is telling you the current rate or IP tier is no longer enough. Log those responses and back off rather than letting a silent failure poison your history with gaps.

For the broader playbook on keeping a long-running job alive, see how to scrape websites without getting blocked. If you want to extend the same approach to another marketplace, the guide on how to scrape Walmart prices easily walks the same fetch-then-parse pattern for a different store.

Using a Wayfair price tracker is generally defensible when it monitors only publicly available listing information, but "generally" is doing real work in that sentence. Whether your specific use is allowed depends on Wayfair's terms of service, your jurisdiction, and what you do with the data. Wayfair's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Wayfair's Terms of Use and its robots.txt, and treat both as the boundary for what you collect.

A few lines worth holding to. Collect only public product data: the title and price that anyone can see on a listing page without an account. Keep the tracker to personal or research use, pace it gently, and keep your request volume low enough that you are not straining Wayfair's servers. Avoid personal data, including anything tied to identifiable reviewers beyond the public review text and ratings shown on the page. Do not redistribute Wayfair's copyrighted media, such as product photography, as if it were your own. If you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

This guide is deliberately scoped to public listing prices because that is the line that keeps the work defensible. It does not cover anything behind a login, customer or seller account data, order history, or any attempt to bypass authentication or a challenge you were not meant to pass. If your project needs more than public listings, a sanctioned data agreement with Wayfair is the correct path, not a more aggressive scraper. When in doubt about a commercial use, consult legal advice before you scale up.

Recap

Key takeaways

  • Tracking means recording over time. Wayfair's dynamic pricing makes the current price less useful than the trend, so a tracker appends timestamped snapshots rather than overwriting a single value.
  • Wayfair renders prices client-side. A plain request returns an empty shell, so you must render the page behind a trusted IP before cheerio can read the title and price.
  • The Crawling API does both in one call. It renders the page, rotates residential IPs, and handles challenges, returning finished HTML, which is what lets a scheduled tracker run unattended.
  • cheerio extracts the fields. Select each data-hb-id="Card" container, then read the name from the ListingCard-ListingCardName-Text element and the price from PriceDisplay, expecting those hooks to drift over time.
  • Schedule it and stay on public data. Run twice a day with cron or node-cron, export the history to CSV for charting, and keep the tracker scoped to public listing prices that respect Wayfair's ToS and robots.txt.

Frequently Asked Questions (FAQs)

What is a Wayfair price tracker?

A Wayfair price tracker is a small program that records the prices of products listed on Wayfair on a schedule and keeps a history. Instead of checking a price by hand, it fetches the listing page, reads each product's title and price, and appends a timestamped row to a file. Over time that file becomes a price log you can chart, so you can see fluctuations, spot discounts, and time a purchase around a price drop.

Why does a plain request return incomplete data from Wayfair?

Because Wayfair renders its product cards client-side with JavaScript and challenges automated traffic. A raw HTTP request from a datacenter IP usually returns a near-empty shell or a block page rather than the listing cards, so the prices are not in the HTML you get back. To get a complete page you have to render it behind a trusted IP, which is what the Crawling API handles for you.

How does Wayfair pricing work?

Wayfair uses a dynamic pricing model influenced by demand, availability, competition, and its own algorithmic pricing system, which collects and analyzes data in real time. Sellers set their own prices, and Wayfair adjusts to stay competitive, so the same product can show different prices across locations and even within a single day. That volatility is exactly why recording prices on a schedule is more useful than checking once.

How do I track price drops on Wayfair?

Run the tracker on a schedule so it appends a timestamped snapshot each time, then compare a product's price across two timestamps in the history file. When a later price is lower than an earlier one for the same title, that is a drop. Charting the CSV in a spreadsheet makes drops obvious as dips in the price line, and you can wire an alert on top of the same data once the history is in place.

My selectors return empty values. What changed?

Almost certainly Wayfair's markup. The data-hb-id and data-test-id hooks the parser relies on are tied to Wayfair's front-end build and can change without notice, so selectors that worked last month can break. Re-inspect a live listing page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production tracker.

How do I avoid getting blocked while tracking Wayfair prices?

Keep the polling interval modest, a few times a day is plenty for price trends, and route requests through rotating residential IPs so no single address trips a rate limit or a challenge. The Crawling API manages rotation, a trusted IP pool, and challenge handling for you; if you build your own stack, that is the part to invest in. Watch the status codes from your scheduled runs and back off when you start seeing non-200 responses.

Start Building

Crawl any site at scale, without fighting infrastructure.

Crawlbase handles proxies, fingerprints, and CAPTCHAs so your team ships data pipelines instead of maintaining crawl plumbing. 1,000 requests free, no card required.

Self-serve · No sales call required · Enterprise crawl volumes available