Walmart's Best Sellers pages rank the products selling fastest in each department, from electronics to home goods, and that public ranking is one of the cleanest demand signals on the open web. Retailers watch it to see what is trending, analysts use it to study category shifts, and price trackers anchor their comparisons to whatever is currently in front of shoppers. The data is right there on the page: a rank order, product titles, prices, ratings, and a link to each item.

This guide shows you how to scrape Walmart Best Sellers with JavaScript and Node.js using cheerio. You build a small, runnable scraper that fetches a Walmart Best Sellers list through the Crawling API, parses rank, title, price, rating, and link for each product, and exports the result as JSON and CSV. The whole walkthrough stays scoped to public product-listing data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Node.js script that takes a public Walmart Best Sellers category URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every product on the list. We use the electronics Best Sellers page as the running example and pull these fields per item:

  • Rank: the position in the Best Sellers order, starting at 1.
  • Title: the product name, for example "Apple AirPods with Charging Case (2nd Generation)".
  • Price: the price as shown on the card, like "$69.00".
  • Rating: the star rating text, for instance "4.6 out of 5 Stars."
  • Reviews: the review count as a number, or 0 when the card shows none.
  • Link: the absolute URL of the individual product page.

Why a plain request fails on Walmart

If you request a Walmart Best Sellers URL with a bare HTTP client, you rarely get the product grid back. Two things work against you. First, Walmart renders the listing cards in the browser with JavaScript, so the initial HTML is a near-empty shell until the page's scripts run. Second, Walmart flags automated traffic aggressively: datacenter IPs and request patterns that do not look like a real browser get challenged with a CAPTCHA, rate-limited, or blocked before they reach the rendered product data.

So a working Walmart scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse with cheerio.

Two ways to parse

The Crawling API can return raw HTML for you to parse with cheerio, or you can pass autoparse: 'true' and get structured JSON back directly. This guide writes the cheerio parser by hand so you control every field and selector, but the autoparse option is there when you want Crawlbase to do the extraction for you.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are new to Node, the official docs and any beginner course will get you to the level this tutorial assumes.

Node.js 16 or later. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.

A Crawlbase account and token. Sign up, open your dashboard, and copy your token from the account docs page. The free tier gives you 1,000 requests with no card. Treat the token like a password: it authenticates your requests, so keep it out of version control.
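
One way to keep the token out of version control is to read it from an environment variable instead of pasting it into the script. The sketch below assumes a variable named CRAWLBASE_TOKEN and a helper called getToken; both names are illustrative choices, not something the Crawlbase client requires.

```javascript
// Hypothetical helper: read the token from an environment variable.
// CRAWLBASE_TOKEN is an assumed name, not one the client looks for itself.
function getToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set the CRAWLBASE_TOKEN environment variable first');
  }
  return token;
}

// Usage: const api = new CrawlingAPI({ token: getToken() });
```

On macOS or Linux you could then run the scraper as CRAWLBASE_TOKEN=your_token node walmart-scraper.js, so the token never lands in a committed file.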

Set up the project

Create a project folder, initialize it, and install the two libraries the scraper needs.

bash
node --version

mkdir walmart-best-sellers && cd walmart-best-sellers
npm init -y

npm install crawlbase cheerio

Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. Create a file named walmart-scraper.js in this folder and add the code from the steps below.

Step 1: Fetch the rendered Best Sellers page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your token, and request the Best Sellers URL. Checking the status code before you parse keeps failures loud instead of silent.

javascript
const { CrawlingAPI } = require('crawlbase');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

const walmartPageURL =
  'https://www.walmart.com/shop/best-sellers/electronics';

api
  .get(walmartPageURL)
  .then((response) => {
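    if (response.statusCode === 200) {
      console.log(response.body.slice(0, 500));
    } else {
      // Surface failed requests instead of silently printing nothing
      console.error(`Request failed with status ${response.statusCode}`);
    }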
  })
  .catch((error) => console.error('API request error:', error));

Run the script with node walmart-scraper.js and you should see real Walmart product markup at the top of the body, not a stripped-down shell. That confirms rendering works before you write a single selector. To get parsed JSON back instead of HTML, pass the autoparse option, which is the fastest route when you do not need custom selectors.

javascript
// Ask the API to parse the page and return JSON
const options = { autoparse: 'true' };

api
  .get(walmartPageURL, options)
  .then((response) => {
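    if (response.statusCode === 200) {
      console.log(JSON.parse(response.body));
    } else {
      // Surface failed requests instead of silently printing nothing
      console.error(`Request failed with status ${response.statusCode}`);
    }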
  })
  .catch((error) => console.error('API request error:', error));

That first request just returned a fully rendered Walmart page without a headless browser or a proxy on your side. The Crawling API runs the page in a real browser, rotates through residential IPs server-side, and handles the CAPTCHAs Walmart throws at scrapers, so you get finished HTML (or autoparsed JSON) from one call. Point it at the electronics Best Sellers page on the free tier first.
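
Rather than eyeballing the first 500 characters, you can check programmatically that the response contains rendered product cards. This is a minimal sketch, assuming the title attribute that the Step 2 parser relies on (data-automation-id="product-title") is present once the page has rendered; the helper names are hypothetical and the marker may change along with Walmart's markup.

```javascript
// Hypothetical sanity check: does the HTML contain rendered product cards?
// The marker is an assumption about Walmart's current markup.
const CARD_MARKER = 'data-automation-id="product-title"';

// Count how many times the marker appears in the HTML string.
function countCardMarkers(html) {
  if (typeof html !== 'string' || html.length === 0) return 0;
  return html.split(CARD_MARKER).length - 1;
}

// Treat the page as rendered when at least minCards markers are present.
function looksRendered(html, minCards = 1) {
  return countCardMarkers(html) >= minCards;
}
```

Calling looksRendered(response.body) before parsing gives you an early, explicit signal when the API returned a shell or a challenge page instead of the product grid.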

Step 2: Parse each product with cheerio

With rendered HTML in hand, load it into cheerio and walk the product cards. Walmart lays each Best Seller out in a repeating container, so you select every card, then read the title, price, rating, reviews, and link from inside it. Reading each field defensively keeps one missing value from crashing the run, and tracking the loop index gives you the rank.

javascript
const cheerio = require('cheerio');

function parseBestSellers(html) {
  const $ = cheerio.load(html);
  const products = [];

  const containers = $(
    '.sans-serif.mid-gray.relative.flex.flex-column.w-100.hide-child-opacity'
  );

  containers.each((index, element) => {
    const card = $(element);
    const product = { rank: index + 1 };

    // Title
    product.title = card
      .find('[data-automation-id="product-title"]')
      .text()
      .trim();

    // Price: skip any label such as "current price", keep symbol and number
    const priceString = card
      .find('[data-automation-id="product-price"] .w_iUH7')
      .text()
      .trim();
    const priceMatch = priceString.match(/([$€£])\s*([\d,]+(?:\.\d+)?)/);
    product.price = priceMatch ? `${priceMatch[1]}${priceMatch[2]}` : '';

    // Rating and review count share one block
    const ratingText = card
      .find('.flex.items-center.mt2 .w_iUH7')
      .text()
      .trim();
    const rating = ratingText.replace(/[\d,]+\s*reviews/i, '').trim();
    product.rating = rating !== '' ? rating : 'Rating not available';

    // Allow thousands separators such as "23,387 reviews"
    const reviewsMatch = ratingText.match(/([\d,]+)\s*reviews/i);
    product.reviews = reviewsMatch
      ? parseInt(reviewsMatch[1].replace(/,/g, ''), 10)
      : 0;

    // Link to the product page
    const href = card.find('a[link-identifier]').attr('href');
    product.link = href
      ? new URL(href, 'https://www.walmart.com').href
      : '';

    if (product.title) products.push(product);
  });

  return products;
}

A few details keep this faithful to the page. The product title comes from the data-automation-id="product-title" attribute, and the price sits inside the product-price block as a single string like "current price $69.00", so a regular expression separates the currency symbol from the numeric part. The rating block carries both the star rating and the review count in one run of text, so one regex strips the "N reviews" tail to leave the rating, and a second pulls the review count out as a number. The link is read from the card's anchor href and resolved to an absolute URL so it works outside the page.
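
To see the two regex steps in isolation, here is a small standalone sketch that runs them on plain strings, with no cheerio involved. The sample strings are assumptions modeled on the formats described above, and the helper names splitPrice and splitRating are illustrative. The patterns are a slightly stricter variant that skips a leading label and tolerates thousands separators.

```javascript
// Sample inputs are assumptions modeled on the formats described above.
function splitPrice(text) {
  // Skip any label such as "current price" and capture symbol + number
  const match = text.match(/([$€£])\s*([\d,]+(?:\.\d+)?)/);
  if (!match) return null;
  return {
    symbol: match[1],
    amount: Number(match[2].replace(/,/g, '')),
  };
}

function splitRating(text) {
  // One pattern finds the "N reviews" tail; removing it leaves the rating
  const reviewsMatch = text.match(/([\d,]+)\s*reviews/i);
  return {
    rating: text.replace(/[\d,]+\s*reviews/i, '').trim(),
    reviews: reviewsMatch
      ? parseInt(reviewsMatch[1].replace(/,/g, ''), 10)
      : 0,
  };
}

console.log(splitPrice('current price $1,299.00'));
console.log(splitRating('4.6 out of 5 Stars. 23387 reviews'));
```

Testing the patterns on strings like this makes it easier to tell a regex problem from a selector problem when a field comes back wrong.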

Selectors drift

Walmart's class names (w_iUH7, the long flex flex-column container class, and the rest) are generated and change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back empty, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
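
A lightweight way to notice drift early is to count how many parsed records are missing each field and flag any field that is mostly empty. The sketch below is one possible check, not part of cheerio or the Crawling API; the function names and the 50% threshold are assumptions you would tune for your own runs.

```javascript
// Hypothetical drift check: count how many records are missing each field.
// 'Rating not available' matches the fallback used by the Step 2 parser.
function summarizeMissingFields(products) {
  const fields = ['title', 'price', 'rating', 'link'];
  const missing = Object.fromEntries(fields.map((f) => [f, 0]));
  for (const product of products) {
    for (const field of fields) {
      const value = product[field];
      if (!value || value === 'Rating not available') missing[field] += 1;
    }
  }
  return missing;
}

// Flag a field when more than `threshold` (a fraction) of records lack it.
function driftedFields(products, threshold = 0.5) {
  if (products.length === 0) return ['(no products parsed)'];
  const missing = summarizeMissingFields(products);
  return Object.keys(missing).filter(
    (field) => missing[field] / products.length > threshold
  );
}
```

Running driftedFields(products) after each scrape and logging a warning when it returns anything turns a silent selector break into a visible one.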

Step 3: Assemble the full script with JSON and CSV export

Now wire the fetch and the parse into one runnable script, then write the records to disk as both JSON and CSV. Fetching with autoparse off returns raw HTML, which is what the cheerio parser expects.

javascript
const fs = require('fs');
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

async function crawl(pageUrl) {
  const response = await api.get(pageUrl);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

function toCsv(rows) {
  const headers = ['rank', 'title', 'price', 'rating', 'reviews', 'link'];
  const escape = (value) =>
    `"${String(value).replace(/"/g, '""')}"`;
  const lines = [headers.join(',')];
  for (const row of rows) {
    lines.push(headers.map((h) => escape(row[h])).join(','));
  }
  return lines.join('\n');
}

async function main() {
  const url = 'https://www.walmart.com/shop/best-sellers/electronics';
  const html = await crawl(url);
  if (!html) return;

  const products = parseBestSellers(html);
  fs.writeFileSync('best-sellers.json', JSON.stringify(products, null, 2));
  fs.writeFileSync('best-sellers.csv', toCsv(products));
  console.log(`Saved ${products.length} products to JSON and CSV`);
}

main().catch((error) => console.error('Scrape failed:', error));

Paste the parseBestSellers function from Step 2 into the same file so main can call it. Run it with node walmart-scraper.js and you get two files: best-sellers.json with the full structured records and best-sellers.csv ready to open in a spreadsheet. The toCsv helper quotes every field and doubles any embedded quotes, which matters here because product titles are long and frequently contain commas.
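
To make the quoting rule concrete, here is the same escaping logic applied to a single value. The product title is a made-up example chosen because it contains both a comma and a double quote; escapeCsvField is an illustrative name for the rule the toCsv helper applies to every field.

```javascript
// Same quoting rule as toCsv: wrap in quotes, double any embedded quotes.
const escapeCsvField = (value) =>
  `"${String(value).replace(/"/g, '""')}"`;

// Hypothetical title containing both a comma and a double quote
const title = 'LG 27" Monitor, Black';
console.log(escapeCsvField(title)); // prints "LG 27"" Monitor, Black"
```

A spreadsheet reads that output back as one cell containing the original title, comma and quote included.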

What the output looks like

The JSON file holds one object per product in Best Sellers order, each with the rank, title, price, rating, review count, and link.

json
[
  {
    "rank": 1,
    "title": "Apple AirPods with Charging Case (2nd Generation)",
    "price": "$69.00",
    "rating": "4.6 out of 5 Stars.",
    "reviews": 23387,
    "link": "https://www.walmart.com/ip/Apple-AirPods-with-Charging-Case-2nd-Generation/604342441"
  },
  {
    "rank": 2,
    "title": "SGIN 15.6inch Laptop 4GB DDR4 128GB SSD Windows 11",
    "price": "$259.99",
    "rating": "4.5 out of 5 Stars.",
    "reviews": 2697,
    "link": "https://www.walmart.com/ip/SGIN-15-6inch-Laptop/1044996074"
  }
]

The CSV mirrors the same rows with a header line, so it drops straight into Excel, Google Sheets, or any data pipeline that reads delimited files.

csv
rank,title,price,rating,reviews,link
"1","Apple AirPods with Charging Case (2nd Generation)","$69.00","4.6 out of 5 Stars.","23387","https://www.walmart.com/ip/Apple-AirPods-with-Charging-Case-2nd-Generation/604342441"
"2","SGIN 15.6inch Laptop 4GB DDR4 128GB SSD Windows 11","$259.99","4.5 out of 5 Stars.","2697","https://www.walmart.com/ip/SGIN-15-6inch-Laptop/1044996074"

Scale across categories

One category is a demo; a real job walks several Best Sellers departments. Walmart exposes each one under /shop/best-sellers/, so you can build a list of category slugs, fetch each through the Crawling API, parse it with the same function, and tag every row with its category before you export. As long as the Best Sellers pages share the same card structure, the parser you already wrote should work across them unchanged, but spot-check each new category before you trust its output.

javascript
async function scrapeCategories(categories) {
  const all = [];
  for (const category of categories) {
    const url = `https://www.walmart.com/shop/best-sellers/${category}`;
    const html = await crawl(url);
    if (!html) continue;
    const rows = parseBestSellers(html).map((p) => ({ category, ...p }));
    all.push(...rows);
  }
  return all;
}

scrapeCategories(['electronics', 'toys', 'home'])
  .then((rows) => {
    console.log(`Collected ${rows.length} products across categories`);
  })
  .catch((error) => console.error('Category scrape failed:', error));

This pattern carries straight into product research and price-comparison work. For a deeper look at turning ranked listings into decisions, see how to automate ecommerce product research, and if you want the same approach in another language, the guide to scraping Walmart search with Python covers the search-results side of the catalog.

Staying unblocked

Even with rendering handled, Walmart watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Introduce a delay between category fetches rather than hammering pages in a tight loop. Spreading requests out is one of the most effective ways to stay under Walmart's rate limits.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a limit or a CAPTCHA. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or non-200 responses is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
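
The pacing and back-off habits above can be sketched as a pair of small helpers. This is an illustration under stated assumptions: the helper names are hypothetical, the delay values are placeholders rather than limits published by Walmart or Crawlbase, and fetchPage stands in for any async function that resolves to HTML or null, such as the crawl() helper from Step 3.

```javascript
// Hypothetical pacing helpers; delay values are illustrative only.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 2000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retry a failed fetch a few times, waiting longer after each failure.
async function fetchWithBackoff(fetchPage, url, maxAttempts = 3, baseMs = 2000) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    let html = null;
    try {
      html = await fetchPage(url);
    } catch (error) {
      html = null; // treat a thrown error as a failed attempt
    }
    if (html) return html;
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt, baseMs));
  }
  return null;
}

// Fetch URLs one at a time with a fixed pause between them.
async function fetchPaced(fetchPage, urls, pauseMs = 3000) {
  const pages = [];
  for (const [i, url] of urls.entries()) {
    if (i > 0) await sleep(pauseMs);
    pages.push({ url, html: await fetchWithBackoff(fetchPage, url) });
  }
  return pages;
}
```

Fetching sequentially is a deliberate choice here: it keeps the request rate predictable, and the growing delay after each failure is what "back off" means in practice.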

For the broader playbook, see how to scrape websites without getting blocked. If you want a ranked Best Sellers list from another marketplace as well, the same fetch-then-parse pattern carries over to Amazon in the guide on how to scrape Amazon Best Sellers.

Legality and responsible use

Whether scraping Walmart is allowed depends on Walmart's terms of service, your jurisdiction, and what you do with the data. Walmart's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Walmart's Terms of Use and its robots.txt, and treat both as the boundary for what you collect.

A few lines worth holding to. Collect only public product data: the rank, title, price, rating, review count, and product link that anyone can see on a Best Sellers page without an account. Respect Walmart's stated rate expectations and keep your request volume low enough that you are not straining its servers. Avoid personal data, including anything tied to identifiable reviewers beyond the public review text and star counts shown on the page. Do not redistribute Walmart's copyrighted media, such as product photography, as if it were your own. If you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

For volume or commercial use, Walmart runs the Walmart Developer Portal and the Walmart Affiliate program, which expose sanctioned product and catalog data with clear usage terms. Those are the right tools when you need large volumes, guaranteed structure, or commercial rights. This guide is deliberately scoped to public Best Sellers and category listings because that is the line that keeps the work defensible. It does not cover anything behind a login, customer or seller personal data, order history, or any attempt to bypass authentication or a CAPTCHA you were not meant to pass. If your project needs more than public listings, Walmart's official APIs or a data agreement are the correct path, not a cleverer scraper.

Recap

Key takeaways

  • Walmart renders listings client-side and blocks hard. A plain request returns an empty shell or a CAPTCHA, so you must render the page behind a trusted IP before you parse it.
  • The Crawling API does both in one call. It renders the page, rotates residential IPs, and handles CAPTCHAs; pass autoparse: 'true' for JSON or take raw HTML and parse it yourself.
  • cheerio extracts the fields. Select every product container, then read title, price, rating, reviews, and link, deriving rank from the loop index, and expect the generated class names to drift.
  • Export to JSON and CSV. Write structured records to both formats, quoting CSV fields so comma-heavy product titles stay intact.
  • Stay on public data. Respect Walmart's ToS and robots.txt, pace your requests, and prefer the Walmart Developer Portal or Affiliate program for volume or commercial use.

Frequently Asked Questions (FAQs)

What are Walmart Best Sellers?

Walmart Best Sellers are the products selling fastest in a given department on Walmart's online store, presented as a ranked grid. Each category, such as electronics, toys, or home, has its own Best Sellers page that surfaces its top-selling items. Watching these pages is a quick way to see what is trending and what shoppers are buying right now.

Why does a plain request return incomplete data from Walmart?

Because Walmart renders its product grid client-side with JavaScript and challenges automated traffic with CAPTCHAs. A raw HTTP request from a datacenter IP usually returns an empty shell or a block page rather than the product cards. To get a complete page you have to render it behind a trusted IP, which is what the Crawling API handles for you.

Should I use autoparse or write my own cheerio parser?

Use autoparse when you want structured JSON fast and the default fields cover your needs; pass autoparse: 'true' and the API returns parsed data. Write your own cheerio parser when you need full control over which fields and selectors you extract, as this guide does, so you can shape the records and add fields like rank exactly the way you want.

My selectors return empty values. What changed?

Almost certainly Walmart's markup. Its container classes and generated class names like w_iUH7 change without notice, so selectors that worked last month can break. Re-inspect a live page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

Can I scrape customer personal data from Walmart?

No, and this guide does not cover it. Customer account details, order history, and anything behind a login are not public data. Scraping login-walled content, personal data about reviewers beyond the public review text, or bypassing authentication is out of scope here and runs against Walmart's terms. For sanctioned access the correct route is Walmart's Developer Portal or a data agreement.

How do I avoid getting blocked while scraping Walmart?

Keep your per-IP request rate low, add delays between category fetches, and route through rotating residential IPs so no single address trips a rate limit or a CAPTCHA. The Crawling API manages rotation, a trusted IP pool, and CAPTCHA handling for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
