Walmart is one of the largest retailers on the web, and every public product page carries a stream of customer reviews: a star rating, a short title, the written feedback, the reviewer's display name, and a date. That text is a useful signal for sentiment analysis, product research, and competitive benchmarking, because it tells you in shoppers' own words what works and what does not about a product.
This guide shows you how to scrape Walmart reviews with JavaScript and Node.js using cheerio. You build a small, runnable scraper that fetches a Walmart product page through the Crawling API, parses each review into a clean record, walks the review pagination, and exports the results to JSON and CSV. The whole walkthrough stays scoped to public review text, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.
What you will build
A Node.js script that takes a public Walmart product URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for each review on the page. We will use a single product page as the running example and pull these fields per review:
- Reviewer: the display name shown on the review, for example "Optimistic" or "First Time Shopper".
- Rating: the star rating as Walmart shows it, like "5 out of 5 stars review".
- Title: the short headline the reviewer gave their review.
- Body: the written feedback text of the review.
- Date: the date the review was posted, when present on the card.
Why a plain request fails on Walmart
If you request a Walmart product URL with a bare HTTP client, you get a response with status 200 and only a fraction of the review data in the body. Two things work against you. First, Walmart renders prices, ratings, and the review list in the browser with JavaScript, so the initial HTML is incomplete until the page's scripts run. Second, Walmart flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get challenged, rate-limited, or blocked before they ever reach the rendered content.
So a working Walmart review scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse.
Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Walmart loads the review list client-side, so the JS token gives you the most complete page here. Using the normal token can return a page with the review section empty, leaving you nothing to parse.
Prerequisites
You need a few things in place before writing any code. None of them take long.
Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are new to Node, the official docs and any beginner course will get you to the level this tutorial assumes. For a fuller walkthrough, see our guide on how to build a web scraper with Node.js.
Node.js 16 or later. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.
A Crawlbase account and token. Sign up, open your dashboard, and copy your token from the account docs page. The free tier gives you 1,000 requests with no card required, which is plenty to follow along. Treat the token like a password: it authenticates your requests, so keep it out of version control.
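One way to keep the token out of version control is to read it from an environment variable rather than hard-coding it. The sketch below is a minimal example; the variable name CRAWLBASE_TOKEN and the helper getToken are assumptions made for illustration, not Crawlbase conventions.

```javascript
// Read the Crawlbase token from the environment instead of hard-coding it.
// CRAWLBASE_TOKEN is an assumed variable name; use whatever fits your setup.
function getToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set the CRAWLBASE_TOKEN environment variable first');
  }
  return token;
}

// Usage (shell):  CRAWLBASE_TOKEN=your_token node scraper.js
// In the script:  new CrawlingAPI({ token: getToken() })
```

Failing fast with a clear error when the variable is missing is easier to debug than a rejected API request later in the run.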
Set up the project
Create a project folder, initialize it, and install the two libraries the scraper needs.
    node --version
    mkdir walmart-reviews-scraper && cd walmart-reviews-scraper
    npm init -y
    npm install crawlbase cheerio
Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. If selectors are new to you, the primer on XPath and CSS selectors is a good companion.
Step 1: Fetch the rendered product page
Start by getting the finished page. Import the CrawlingAPI class, initialize it with your token, and request the product URL. We point at a single Walmart product as the example; any public product page with a review section works. Checking the status code before you parse keeps failures loud instead of silent.
    const { CrawlingAPI } = require('crawlbase');

    const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });
    const productUrl = 'https://www.walmart.com/ip/Straight-Talk-Apple-iPhone-14-128GB-Midnight-Prepaid-Smartphone-Locked-to-Straight-Talk/1381920049';

    // Fetch one rendered page; returns the HTML string, or null on failure.
    async function crawl(pageUrl) {
      // Wait for async content, then hold 5 seconds before capturing the page.
      const options = { ajax_wait: 'true', page_wait: 5000 };
      const response = await api.get(pageUrl, options);
      if (response.statusCode === 200) {
        return response.body;
      }
      console.error(`Request failed: ${response.statusCode}`);
      return null;
    }

    // Print the first 500 characters as a quick check that rendering worked.
    crawl(productUrl).then((html) => {
      console.log(html ? html.slice(0, 500) : 'No HTML returned');
    });
The two wait options matter for a client-rendered target like this. ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so late-rendering elements appear before the page is captured. Five seconds is a reasonable start; raise it if the review list comes back empty. Save the code as scraper.js, run it with node scraper.js, and you should see real product markup, not a stripped-down shell. That confirms rendering works before you write a single selector.
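If the review list does come back empty, raising page_wait can be automated. The following is a rough sketch under a few assumptions: fetchPage is a hypothetical helper with the shape (url, options) => html or null (for example, a variant of crawl that accepts the options object), the wait values are arbitrary starting points, and the presence check looks for the item-review-section id that the parser in this guide targets.

```javascript
// Retry with a longer page_wait when the review section is missing.
// fetchPage is an assumed helper: (url, options) => Promise<string | null>.
// Note that every retry is a separate API request.
async function fetchWithLongerWait(fetchPage, url, waits = [5000, 8000, 12000]) {
  for (const pageWait of waits) {
    const html = await fetchPage(url, { ajax_wait: 'true', page_wait: pageWait });
    // Cheap presence check: the id used by the review container selector.
    if (html && html.includes('item-review-section')) {
      return { html, pageWait };
    }
    console.warn(`No review section after ${pageWait} ms, retrying`);
  }
  // Nothing usable came back at any wait value.
  return { html: null, pageWait: null };
}
```

The returned pageWait tells you which value finally worked, so you can use it as the default for later requests instead of retrying every time.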
Walmart needs a rendered page behind a trusted IP, in one call, before its review list even appears. The Crawling API runs the page in a real browser, rotates through residential IPs server-side, and hands you finished HTML, so you skip running a headless fleet and a proxy pool yourself. Point it at a public product page on the free tier first.
Step 2: Parse each review with cheerio
With rendered HTML in hand, load it into cheerio and walk the review cards. Walmart lays each review out in a repeating block inside the review section, so you select every card, then read the reviewer name, rating, title, body, and date from inside it. Reading each field defensively keeps one missing value from crashing the run.
    const cheerio = require('cheerio');

    // Turn rendered product-page HTML into an array of review records.
    function parseReviews(html) {
      const $ = cheerio.load(html);
      const reviews = [];

      $('#item-review-section li.dib').each((_, el) => {
        const card = $(el);
        const rating = card.find('.w_iUH7').text().trim();
        const body = card.find('.lh-copy').text().trim();

        // Skip placeholder list items that carry neither a rating nor a body.
        if (!rating && !body) return;

        reviews.push({
          reviewer: card.find('.f7.gray').first().text().trim() || null,
          rating: rating || null,
          title: card.find('.w_kV33.w_Sl3f.w_mvVb.f5.b').text().trim() || null,
          body: body || null,
          date: card.find('.f7.gray.mt1').text().trim() || null,
        });
      });

      return reviews;
    }
A couple of details keep this resilient. The container selector #item-review-section li.dib and the inner .w_iUH7 (rating) and .lh-copy (review body) selectors come straight from the live Walmart review markup. The reviewer name, title, and date sit in their own small wrappers, so each is read with its own selector. Every field falls back to null when its element is missing, which is common since not every review carries a separate title or a visible date. The early return skips any list item that has neither a rating nor body text, so layout placeholders do not pollute the output.
Walmart's class names (w_iUH7, lh-copy, dib, and the rest) are generated and change without notice, and they can differ between product layouts. Treat the selectors above as a starting template, not a contract. When a field comes back as null, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
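A quick way to notice selector drift is to measure how often each field comes back null across a batch of parsed reviews. The helpers below are an illustrative sketch, not part of the original scraper: nullRates and suspectFields are hypothetical names, and the 0.9 threshold is an arbitrary starting point.

```javascript
// Report the share of null values per field across parsed reviews.
// A field that is null on nearly every card usually means its selector drifted.
function nullRates(reviews, fields = ['reviewer', 'rating', 'title', 'body', 'date']) {
  const rates = {};
  for (const field of fields) {
    const missing = reviews.filter((r) => r[field] == null).length;
    // An empty batch counts as fully missing so it gets flagged too.
    rates[field] = reviews.length ? missing / reviews.length : 1;
  }
  return rates;
}

// Fields at or above the threshold are candidates for re-inspection.
function suspectFields(reviews, threshold = 0.9) {
  return Object.entries(nullRates(reviews))
    .filter(([, rate]) => rate >= threshold)
    .map(([field]) => field);
}
```

Running suspectFields on the output of each scrape gives an early warning before a broken selector silently fills a dataset with nulls. Since titles and dates are legitimately absent on some cards, a high threshold avoids false alarms.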
Step 3: Handle review pagination
One page of reviews is a demo; a real job walks the pagination. Walmart exposes the review page number through a page query parameter on the product URL, so you can build each page URL in a loop, fetch it through the Crawling API, parse it with the same function, and collect the rows. Because every review page shares the same card structure, the parser you already wrote works across all of them without changes.
    // Fetch and parse review pages 1..totalPages, one after another.
    async function scrapeAllReviews(baseUrl, totalPages) {
      const all = [];
      // Use '&' when the URL already has a query string, '?' otherwise.
      const sep = baseUrl.includes('?') ? '&' : '?';

      for (let page = 1; page <= totalPages; page++) {
        const url = `${baseUrl}${sep}page=${page}`;
        const html = await crawl(url);
        if (html) all.push(...parseReviews(html));
      }

      return all;
    }
Each iteration appends page=N to the product URL, fetches the rendered page, and spreads the parsed reviews into one combined array. Keep totalPages conservative while testing so you are not making more requests than you need, and raise it once the output looks right.
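A variation on the loop adds a pause between requests and stops early when a page yields no reviews, which helps keep the request count conservative. This is a sketch under assumptions: scrapePaced, buildPageUrl, and the 2000 ms default delay are illustrative choices, and crawlFn and parseFn stand in for the crawl and parseReviews functions from this guide.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Build a page URL with the URL API so any existing query string is preserved.
function buildPageUrl(baseUrl, page) {
  const url = new URL(baseUrl);
  url.searchParams.set('page', String(page));
  return url.toString();
}

// crawlFn: (url) => Promise<string | null>, parseFn: (html) => array of reviews.
async function scrapePaced(baseUrl, maxPages, crawlFn, parseFn, delayMs = 2000) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const html = await crawlFn(buildPageUrl(baseUrl, page));
    const rows = html ? parseFn(html) : [];
    if (rows.length === 0) break; // stop at the first empty or failed page
    all.push(...rows);
    if (page < maxPages) await sleep(delayMs); // pause between requests
  }
  return all;
}
```

A call might look like scrapePaced(productUrl, 3, crawl, parseReviews). Stopping at the first empty page treats a failed fetch the same as running out of reviews, which is a deliberate simplification; a production run would likely want to tell the two apart.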
Step 4: Export to JSON and CSV
Now wire the fetch, the parse, and the pagination into one runnable script, then write the collected reviews to both a JSON file and a CSV file. JSON is the natural format for feeding a sentiment model; CSV opens straight in a spreadsheet for quick product research.
    const { CrawlingAPI } = require('crawlbase');
    const cheerio = require('cheerio');
    const fs = require('fs');

    const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

    // Fetch one rendered page; returns the HTML string, or null on failure.
    async function crawl(pageUrl) {
      const options = { ajax_wait: 'true', page_wait: 5000 };
      const response = await api.get(pageUrl, options);
      if (response.statusCode === 200) return response.body;
      console.error(`Request failed: ${response.statusCode}`);
      return null;
    }

    // Turn rendered product-page HTML into an array of review records.
    function parseReviews(html) {
      const $ = cheerio.load(html);
      const reviews = [];

      $('#item-review-section li.dib').each((_, el) => {
        const card = $(el);
        const rating = card.find('.w_iUH7').text().trim();
        const body = card.find('.lh-copy').text().trim();

        // Skip placeholder list items that carry neither a rating nor a body.
        if (!rating && !body) return;

        reviews.push({
          reviewer: card.find('.f7.gray').first().text().trim() || null,
          rating: rating || null,
          title: card.find('.w_kV33.w_Sl3f.w_mvVb.f5.b').text().trim() || null,
          body: body || null,
          date: card.find('.f7.gray.mt1').text().trim() || null,
        });
      });

      return reviews;
    }

    // Fetch and parse review pages 1..totalPages, one after another.
    async function scrapeAllReviews(baseUrl, totalPages) {
      const all = [];
      const sep = baseUrl.includes('?') ? '&' : '?';

      for (let page = 1; page <= totalPages; page++) {
        const html = await crawl(`${baseUrl}${sep}page=${page}`);
        if (html) all.push(...parseReviews(html));
      }

      return all;
    }

    // Serialize review records to CSV: quote every value, double inner quotes.
    function toCsv(rows) {
      const headers = ['reviewer', 'rating', 'title', 'body', 'date'];
      const escape = (v) => `"${(v ?? '').replace(/"/g, '""')}"`;
      const lines = [headers.join(',')];

      for (const row of rows) {
        lines.push(headers.map((h) => escape(row[h])).join(','));
      }

      return lines.join('\n');
    }

    async function main() {
      const productUrl = 'https://www.walmart.com/ip/Straight-Talk-Apple-iPhone-14-128GB-Midnight-Prepaid-Smartphone-Locked-to-Straight-Talk/1381920049';
      const reviews = await scrapeAllReviews(productUrl, 3);

      fs.writeFileSync('walmart_reviews.json', JSON.stringify(reviews, null, 2));
      fs.writeFileSync('walmart_reviews.csv', toCsv(reviews));
      console.log(`Saved ${reviews.length} reviews to JSON and CSV`);
    }

    // Surface unexpected errors instead of leaving an unhandled rejection.
    main().catch((err) => {
      console.error(err);
      process.exit(1);
    });
Run the full script with node scraper.js. It walks three review pages, collects every review into one array, and writes walmart_reviews.json and walmart_reviews.csv to the project folder. The CSV escaping wraps each value in quotes and doubles any internal quote, so review bodies with commas or quotation marks do not break the columns.
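To see the escaping rule in isolation, here is a small standalone sketch. It applies the same quote-and-double rule as toCsv to one made-up review; escapeCsv and the sample record are illustrative, and the added String() coercion is an extra so non-string values would not throw.

```javascript
// Same rule as toCsv(): wrap each value in quotes and double any inner quote.
// String() is an addition here so numbers or booleans are handled as well.
const escapeCsv = (value) => `"${String(value ?? '').replace(/"/g, '""')}"`;

// Made-up record with a comma, inner quotes, and a null title.
const sample = {
  reviewer: 'First Time Shopper',
  title: null,
  body: 'Great phone, "action mode" is fun',
};

const line = ['reviewer', 'title', 'body']
  .map((h) => escapeCsv(sample[h]))
  .join(',');

console.log(line);
// "First Time Shopper","","Great phone, ""action mode"" is fun"
```

The comma inside the body stays within its quoted field, and the null title becomes an empty quoted value, so every row keeps the same number of columns.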
What the output looks like
The JSON file holds one object per review, ready to feed a sentiment pipeline or load into a notebook.
    [
      {
        "reviewer": "Optimistic",
        "rating": "5 out of 5 stars review",
        "title": "Great camera and speed",
        "body": "Pictures are coming out great and the speed is what I've been needing",
        "date": "October 2, 2023"
      },
      {
        "reviewer": "First Time Shopper",
        "rating": "5 out of 5 stars review",
        "title": "Awesome phone",
        "body": "Bought this last month and it's been awesome. The camera quality is perfect especially with the action mode.",
        "date": "September 18, 2023"
      }
    ]
The CSV mirrors the same fields with a header row, so a marketing or product team can open it in a spreadsheet, sort by rating, and skim the lowest scores first. With the data in hand you can run sentiment scoring, tally the rating distribution, or track how feedback shifts release over release. For the analysis side, our guide on how to scrape customer reviews covers turning raw review text into structured insight, and the same review pipeline applies to Amazon reviews if you are comparing the two marketplaces.
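Tallying the rating distribution can be sketched in a few lines. This example assumes ratings arrive as strings shaped like "5 out of 5 stars review", as in the sample output; parseStars and ratingDistribution are hypothetical helper names, and any rating that does not match that shape is counted under "unknown".

```javascript
// Pull the numeric score out of strings like "5 out of 5 stars review".
function parseStars(rating) {
  const match = /^(\d+(?:\.\d+)?)\s+out of\s+\d+/i.exec(rating || '');
  return match ? Number(match[1]) : null;
}

// Count reviews per star value; unparseable ratings land under "unknown".
function ratingDistribution(reviews) {
  const counts = {};
  for (const { rating } of reviews) {
    const stars = parseStars(rating);
    const key = stars === null ? 'unknown' : String(stars);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```

Calling ratingDistribution on the parsed JSON array returns an object keyed by star value, which is an aggregate view that never needs the reviewer field at all.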
Staying unblocked
Even with rendering handled, Walmart watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.
- Pace your requests. Hammering review pages in a tight loop is the fastest way to get throttled. Spread requests out and keep totalPages to what you actually need instead of crawling every page at full speed.
- Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
- Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
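Backing off on bad status codes can be sketched as a small wrapper with exponential delays. The names getWithBackoff and requestFn, the retry count, and the 2000 ms base delay are assumptions for illustration; requestFn is expected to resolve to an object with statusCode and body, the response shape the crawl function in this guide reads.

```javascript
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// requestFn is an assumed function: (url) => Promise<{ statusCode, body }>.
async function getWithBackoff(requestFn, url, { retries = 3, baseDelayMs = 2000 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const response = await requestFn(url);
    if (response.statusCode === 200) return response.body;
    if (attempt === retries) break; // out of attempts
    // Double the pause after each failure: 2 s, 4 s, 8 s with the defaults.
    const delay = baseDelayMs * 2 ** attempt;
    console.warn(`Status ${response.statusCode}, backing off ${delay} ms`);
    await wait(delay);
  }
  return null; // give up instead of hammering the target
}
```

Returning null after the last attempt keeps the behavior in line with crawl, so a caller can skip the page and move on rather than crash the whole run.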
For the broader playbook, see how to scrape websites without getting blocked. If you also need product titles, prices, or the search grid rather than reviews, the companion guide on how to scrape Walmart search with Python covers that side, and Walmart is a frequent target for broader ecommerce web scraping work, where the same fetch-then-parse pattern carries across sites.
Is it legal to scrape Walmart reviews?
Whether scraping Walmart reviews is allowed depends on Walmart's terms of service, your jurisdiction, and what you do with the data. Walmart's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Walmart's Terms of Use and its robots.txt at https://www.walmart.com/robots.txt, and treat both as the boundary for what you collect.
A few lines worth holding to. Collect only public review text: the star rating, the review title, the body, the display name as shown on the card, and the posted date that anyone can see without an account. Do not build a profile of an individual reviewer, link their reviews across products or sites, or try to identify the real person behind a display name. A display name and a public review body are public content; treating them as raw material to track a specific shopper is a privacy line you should not cross. Respect Walmart's stated rate expectations and keep your request volume low enough that you are not straining its servers.
This guide is deliberately scoped to public product and review pages because that is the line that keeps the work defensible. It does not cover anything behind a login, account or order data gated by a sign-in, personal data about reviewers beyond the public text they posted, or any attempt to bypass authentication. If your project needs more than public reviews, a formal data agreement with Walmart is the correct path, not a cleverer scraper. When in doubt, prefer aggregate analysis (rating distributions, common themes) over anything that singles out one reviewer.
Key takeaways
- Walmart renders reviews client-side. A plain fetch returns an incomplete page, so you must render it before you parse it.
- You need rendering and a trusted IP together. The Crawling API does both in one call; ajax_wait and page_wait control how long it waits for the review list to load.
- cheerio does the extraction. Select every #item-review-section li.dib card, then map reviewer, rating, title, body, and date to current selectors, and expect those selectors to drift.
- Scale by looping pages, export to JSON and CSV. The page parameter walks the review pages, the same parser works across every page, and both files feed sentiment and product research.
- Stay on public review text. Respect Walmart's ToS and robots.txt, never profile individual reviewers, and keep out of anything behind a login.
Frequently Asked Questions (FAQs)
Why does a plain request return incomplete data from Walmart?
Because Walmart renders the price, the rating summary, and the full review list client-side with JavaScript. The initial HTML is partial until the page's scripts run in a browser, so a raw HTTP request returns status 200 with the review section empty or stubbed out. To get a complete page you have to render it first, which is what the Crawling API handles for you.
Which fields can I extract from a Walmart review?
The reviewer display name, the star rating (Walmart phrases it as "5 out of 5 stars review"), the short review title, the body text, and the posted date when it is present on the card. The scraper in this guide pulls all five from the #item-review-section li.dib review cards using the .w_iUH7 and .lh-copy selectors plus the small wrapper classes for name, title, and date.
How do I scrape every page of reviews, not just the first?
Walmart paginates reviews through a page query parameter on the product URL. The scrapeAllReviews function in this guide loops from page 1 to a page count you set, appends page=N to the URL, fetches each rendered page through the Crawling API, and runs the same parser on every page before combining the results into one array.
My selectors return null. What changed?
Almost certainly Walmart's markup. Its generated class names (w_iUH7, lh-copy, dib, and the wrappers around the name, title, and date) change without notice and can differ between product layouts, so selectors that worked last month can break. Re-inspect a live review page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.
Can I scrape personal data about Walmart reviewers?
No, and this guide does not cover it. A public display name and review body are public content, but profiling a specific reviewer, linking their activity across products or sites, or trying to identify the person behind a display name crosses a privacy line. Keep your analysis at the aggregate level (rating distributions, common themes) and stay out of anything behind a login.
Can I analyze the scraped Walmart reviews?
Yes. Once the reviews are in JSON or CSV you can run sentiment analysis to classify each one as positive, negative, or neutral, tally the rating distribution, surface the most praised and most criticized features, and track how feedback shifts over time. That aggregate view is exactly what makes review data useful for product research and competitive benchmarking.
