Walmart's search results are not a flat list. Near the top, and again partway down the page, sit the sponsored placements: products a seller paid to surface on a keyword. Each one carries the public signals a competitor would kill for: who is bidding on "headphones," where they land, what they charge, and what they call the product. Tracked over time, that is straight ad and competitor intelligence pulled from a page anyone can load.

This guide shows you how to scrape Walmart sponsored ads with JavaScript and Node.js. You build a small, runnable scraper that fetches a rendered Walmart search page through the Crawling API, parses every sponsored placement with cheerio into a clean record (title, price, position, link), and then rolls those records up to answer one question: which products advertise on a given keyword. The whole walkthrough stays scoped to public search-results data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Node.js script that takes a public Walmart search URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every sponsored placement on the results page. We use the search for "headphones" as the running example and pull these fields per ad:

  • Title: the sponsored product name, for example "Bose QuietComfort 45 Headphones Noise Cancelling Over-Ear Wireless Bluetooth Earphones, Black".
  • Price: the listed price as shown on the card, like "$329.00".
  • Position: the rank of the ad among sponsored placements on the page, so you can tell a top-of-search bid from a mid-search one.
  • Link: the URL to the individual product page the ad points to.

Once each ad is a record, a short aggregation step groups them by advertiser so you can see who is buying placement on the keyword, and at what positions. That is the ad-intelligence payoff: the same data Walmart shows every shopper, reorganized for analysis.

Why a plain request fails on Walmart

If you request a Walmart search URL with a bare HTTP client, you do not get the clean results page you see in a browser. Two things work against you. First, Walmart renders the search grid client-side with JavaScript, so the sponsored cards, prices, and product links are not in the initial HTML until the page's scripts run. Second, Walmart flags automated traffic fast: datacenter IPs and request patterns that do not look like a real browser get challenged or blocked before they ever reach the rendered content.

So a working Walmart scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse. It also exposes an autoparse option that returns structured page content as JSON, which is handy for a first look before you write your own selectors.

Sponsored vs organic

Walmart mixes paid placements into the same grid as organic results and tags the paid ones with a small "Sponsored" label inside the card. The parser below reads that label so you can separate ads from organic listings instead of treating the whole grid as advertising. If the label text shifts, that one selector is the field to re-check first.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are newer to Node, our guide on how to build a web scraper with Node.js covers the groundwork this tutorial assumes.

Node.js 16 or later. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.

A Crawlbase account and token. Sign up, open your dashboard, and copy your token from the account docs page. The free tier includes 1,000 requests with no card, which is plenty to follow this tutorial end to end. Treat the token like a password: it authenticates your requests, so keep it out of version control.
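
One way to keep the token out of version control is to read it from an environment variable rather than hard-coding it. The sketch below assumes a variable named CRAWLBASE_TOKEN; that name is chosen for illustration and is not something the Crawlbase client requires.

```javascript
// Read the Crawlbase token from the environment instead of the source file.
// CRAWLBASE_TOKEN is an illustrative variable name, not a Crawlbase convention.
function getToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set CRAWLBASE_TOKEN before running the scraper.');
  }
  return token;
}

// Usage, once the crawlbase package is installed:
// const { CrawlingAPI } = require('crawlbase');
// const api = new CrawlingAPI({ token: getToken() });
```

Failing fast on a missing token gives a clearer error than a rejected API call later in the run. If you keep the value in a local .env file, list that file in .gitignore.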

Set up the project

Create a project folder, initialize it, and install the two libraries the scraper needs.

bash
node --version

mkdir walmart-ads-scraper && cd walmart-ads-scraper
npm init -y

npm install crawlbase cheerio

Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull individual fields out by CSS selector. If selectors are new to you, the primer on XPath and CSS selectors pairs well with this guide.

Step 1: Fetch the rendered search page

Start by getting the finished page. Create a file named scraper.js, import the CrawlingAPI class, initialize it with your token, and request the search URL. Checking the status code before you parse keeps failures loud instead of silent.

javascript
const { CrawlingAPI } = require('crawlbase');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

const walmartPageURL = 'https://www.walmart.com/search?q=headphones';

api
  .get(walmartPageURL)
  .then((response) => {
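    if (response.statusCode === 200) {
      console.log(response.body.slice(0, 500));
    } else {
      // Surface non-200 responses instead of failing silently
      console.error(`Request failed: ${response.statusCode}`);
    }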
  })
  .catch((error) => console.error('API request error:', error));

This requests the Walmart headphones search page through the Crawling API and prints the first 500 characters of the returned HTML. Run it with node scraper.js and you should see real search-grid markup, not a stripped-down shell or a challenge page. That confirms rendering and a trusted IP are working before you write a single selector. If you would rather skip writing selectors at all for a first pass, add the autoparse option, api.get(walmartPageURL, { autoparse: 'true' }), and the API returns the page's main content as JSON instead.
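
Eyeballing 500 characters works once, but a scheduled run benefits from a programmatic check. The sketch below counts occurrences of the product-title hook that this guide's parser relies on and treats a page with none as unrendered. The helper names are hypothetical, and the check is only as stable as that hook.

```javascript
// Count product-title hooks in the returned HTML as a rough rendering check.
// The hook name is the one this guide's parser relies on; it may change.
function countProductTitles(html) {
  const matches = String(html).match(/data-automation-id="product-title"/g);
  return matches ? matches.length : 0;
}

// True when the page contains at least minimumCards product titles.
function looksRendered(html, minimumCards = 1) {
  return countProductTitles(html) >= minimumCards;
}

// Usage inside the .then() handler:
// if (!looksRendered(response.body)) console.error('Page came back without a product grid');
```

A zero count does not say why the grid is missing (an empty shell, a challenge page, or a renamed hook), only that the page is not worth parsing as it stands.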

Crawlbase Walmart Scraper

That first call returned a rendered Walmart grid behind a trusted IP, in one request. The Crawling API runs the page in a real browser, rotates through residential IPs server-side, and hands you finished HTML (or auto-parsed JSON), so you skip running a headless fleet and a proxy pool yourself. Point it at a public search page on the free 1,000-request tier first.

Step 2: Parse the sponsored placements with cheerio

With rendered HTML in hand, load it into cheerio and walk the result grid. Walmart lays each result out in a repeating product container, so you select every card, read the "Sponsored" label, and keep only the cards that carry it. For each sponsored card you pull the title, price, position, and link. Reading every field defensively keeps one missing value from crashing the run.

javascript
const cheerio = require('cheerio');

function parseSponsored(html) {
  const $ = cheerio.load(html);
  const ads = [];
  let position = 0;

  const containers = $('.sans-serif.mid-gray.relative.flex.flex-column.w-100.hide-child-opacity');

  containers.each((index, element) => {
    const card = $(element);

    // Only keep cards tagged "Sponsored"
    const label = card.find('[data-testid^="variant-"] .gray').text().trim();
    if (!/sponsored/i.test(label)) return;

    position += 1;

    const title = card
      .find('[data-automation-id="product-title"]')
      .text()
      .replace(/\s+/g, ' ')
      .trim();

    const priceString = card
      .find('[data-automation-id="product-price"] .w_iUH7')
      .text()
      .trim();
    const priceMatch = priceString.match(/([^\d]+)([\d,\.]+)/);
    const price = priceMatch ? `${priceMatch[1].trim()}${priceMatch[2]}` : '';

    const href = card.find('a[link-identifier]').attr('href') || '';
    const link = href && href.startsWith('/') ? `https://www.walmart.com${href}` : href;

    ads.push({ position, title, price, link });
  });

  return ads;
}

The selectors here come straight from Walmart's search markup. Every result sits in the long utility-class container .sans-serif.mid-gray.relative.flex.flex-column.w-100.hide-child-opacity; the "Sponsored" tag lives in a [data-testid^="variant-"] .gray span; the title is [data-automation-id="product-title"]; and the price string is inside [data-automation-id="product-price"] .w_iUH7, which we split with a regex into the currency symbol and the numeric part. The early return drops any card without the sponsored label, so position counts only paid placements, top of search first, mid-search next. The link is read from the card anchor's href and made absolute when Walmart returns a relative path.
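
The parser keeps price as a display string. For analysis you usually want a number as well, so here is a minimal sketch of a converter. The function name parsePrice is hypothetical, and it assumes US-style formatting (comma as thousands separator, dot as decimal point).

```javascript
// Split a displayed price such as "$1,299.00" into a symbol and a number.
// Assumes US-style formatting: comma thousands separator, dot decimal point.
// Returns null when no digits are found, so callers can skip bad rows.
function parsePrice(priceText) {
  const match = String(priceText).match(/([^\d\s]*)\s*(\d[\d,]*(?:\.\d+)?)/);
  if (!match) return null;
  return {
    currency: match[1],
    amount: Number(match[2].replace(/,/g, '')),
  };
}

console.log(parsePrice('$329.00')); // { currency: '$', amount: 329 }
```

Keeping both the original string and the parsed amount in each record makes it easier to debug a row when the card's price markup changes.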

Selectors drift

Walmart's class names and data-testid values change without notice, and the long utility-class container above is especially brittle since it is generated styling, not a stable hook. Treat these selectors as a starting template, not a contract. When a field comes back empty, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
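
A cheap way to notice drift early is to count empty fields after each run. The sketch below is one possible check, not part of the original scraper: emptyFieldReport is a hypothetical helper that takes the records returned by parseSponsored and flags any field that is empty on every record.

```javascript
// Count how many records have an empty value for each expected field.
// A field that is empty on every record usually means its selector drifted.
function emptyFieldReport(ads, fields = ['title', 'price', 'link']) {
  const report = {};
  for (const field of fields) {
    const empty = ads.filter((ad) => !ad[field]).length;
    report[field] = {
      empty,
      total: ads.length,
      allEmpty: ads.length > 0 && empty === ads.length,
    };
  }
  return report;
}

// Usage:
// const report = emptyFieldReport(parseSponsored(html));
// if (report.price.allEmpty) console.warn('Price selector may have changed');
```

If parseSponsored returns no records at all, the report cannot help; in that case the container or the "Sponsored" label selector is the likelier culprit.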

Step 3: Put it together and analyze the advertisers

Now wire the fetch and the parse into one runnable script, then add the analysis step. Once every sponsored card is a record, labeling each one with an advertiser handle derived from its title tells you which advertisers are buying placement on the keyword and where they land.

javascript
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

async function crawl(pageUrl) {
  const response = await api.get(pageUrl);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

// parseSponsored from Step 2 goes here

function advertiserReport(ads) {
  return ads.map((ad) => ({
    advertiser: ad.title.split(' ').slice(0, 2).join(' '),
    position: ad.position,
    price: ad.price,
    link: ad.link,
  }));
}

async function main() {
  const keyword = 'headphones';
  const url = `https://www.walmart.com/search?q=${encodeURIComponent(keyword)}`;
  const html = await crawl(url);
  if (!html) return;

  const ads = parseSponsored(html);
  console.log(`${ads.length} sponsored placements on "${keyword}"`);
  console.log(JSON.stringify(advertiserReport(ads), null, 2));
}

main().catch((error) => console.error('Scraper failed:', error));

The advertiserReport step is deliberately simple: it derives a short advertiser handle from the first words of each product title and keeps the position, price, and link alongside it. In a real pipeline you would key the advertiser on something firmer (the brand field or the seller resolved from the product page), but even this rough grouping answers the core question, who is paying to appear on this keyword and at what rank. Run several keywords on a schedule and the same records become a competitive feed: new advertisers entering a term, price moves on the placements, and rank shifts over time.
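
advertiserReport labels each row but leaves the rows separate. To see one line per advertiser, you can roll the rows up. The following is a minimal sketch with a hypothetical groupByAdvertiser helper; it assumes rows shaped like the output of advertiserReport and inherits the same rough title-based handle.

```javascript
// Group report rows by advertiser handle and summarize each group.
// Expects rows shaped like advertiserReport's output:
// { advertiser, position, price, link }.
function groupByAdvertiser(rows) {
  const groups = new Map();
  for (const row of rows) {
    const key = row.advertiser || '(unknown)';
    if (!groups.has(key)) {
      groups.set(key, { advertiser: key, placements: 0, positions: [], links: [] });
    }
    const group = groups.get(key);
    group.placements += 1;
    group.positions.push(row.position);
    if (!group.links.includes(row.link)) group.links.push(row.link);
  }
  // Most placements first, then best (lowest) position.
  return [...groups.values()].sort(
    (a, b) =>
      b.placements - a.placements ||
      Math.min(...a.positions) - Math.min(...b.positions)
  );
}

// Usage: console.table(groupByAdvertiser(advertiserReport(ads)));
```

Because the key is only the first two words of the title, two products from one brand can land in different groups. Swapping in a firmer key later does not change the rollup logic.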

What the output looks like

Run the full script with node scraper.js and you get a clean array of sponsored placements, ordered by position, ready to write to JSON, CSV, or a database.

json
[
  {
    "advertiser": "Bose QuietComfort",
    "position": 1,
    "price": "$329.00",
    "link": "https://www.walmart.com/ip/Bose-QuietComfort-45-Headphones/376188834"
  },
  {
    "advertiser": "COWIN E7",
    "position": 2,
    "price": "$35.00",
    "link": "https://www.walmart.com/ip/COWIN-E7-Active-Noise-Cancelling-Headphones/123456789"
  },
  {
    "advertiser": "OneOdio Wired",
    "position": 3,
    "price": "$31.99",
    "link": "https://www.walmart.com/ip/OneOdio-Wired-Over-Ear-Headphones/950096760"
  }
]

Each row is one sponsored placement: who is advertising, where they ranked among the paid slots, what they charge, and the product page the ad points to. Stored with a timestamp per run, this is the raw material for keyword-level ad intelligence.
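
To store rows with a timestamp per run, one simple option is an append-only JSON Lines file. The sketch below is an assumption about storage, not something the guide prescribes: stampRows and appendJsonLines are hypothetical helpers, and the file name in the usage comment is a placeholder.

```javascript
const fs = require('fs');

// Attach the keyword and a run timestamp to every placement.
function stampRows(rows, keyword, now = new Date()) {
  const scrapedAt = now.toISOString();
  return rows.map((row) => ({ keyword, scrapedAt, ...row }));
}

// Append one JSON object per line (JSON Lines), so each run adds to the same file.
function appendJsonLines(filePath, rows) {
  if (rows.length === 0) return;
  const lines = rows.map((row) => JSON.stringify(row)).join('\n') + '\n';
  fs.appendFileSync(filePath, lines);
}

// Usage inside main(), after building the report:
// appendJsonLines('sponsored-ads.jsonl', stampRows(advertiserReport(ads), keyword));
```

One object per line keeps every run in a single file that you can stream or load into a database later, without rewriting earlier runs.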

Scale across keywords and pages

One keyword on one page is a demo; a real job walks a list of keywords and the pagination behind each. Walmart exposes the page number through the page query parameter, so you can build each page URL in a loop, fetch it through the Crawling API, parse it with the same function, and collect the rows. Because every results page shares the same card structure, the parser you already wrote works across all of them without changes.

javascript
async function scrapeKeyword(keyword, totalPages) {
  const all = [];
  for (let page = 1; page <= totalPages; page++) {
    const url =
      `https://www.walmart.com/search?q=${encodeURIComponent(keyword)}&page=${page}`;
    const html = await crawl(url);
    if (html) all.push(...parseSponsored(html));
  }
  return all;
}

scrapeKeyword('headphones', 3).then((rows) => {
  console.log(`Collected ${rows.length} sponsored placements`);
});

Swap the single keyword for an array and loop your target terms to build a competitive map across a whole category. To enrich each ad with full detail (brand, seller, ratings, the complete spec list), take the link from each record and fetch that individual product page through the same crawl function, then write a small parser for the product layout. The pattern is identical: render, then parse. The same fetch-then-parse approach carries straight into broader ecommerce web scraping work, and feeding these prices into a model is a natural fit for price intelligence.
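
Looping an array of keywords can be kept tidy by building the full job list up front. Here is a minimal sketch using a hypothetical buildSearchUrls helper; it reuses the URL pattern and the page parameter from the loop above and nothing else.

```javascript
// Build every search URL for a list of keywords and a page count per keyword.
function buildSearchUrls(keywords, totalPages) {
  const jobs = [];
  for (const keyword of keywords) {
    const query = encodeURIComponent(keyword);
    for (let page = 1; page <= totalPages; page++) {
      jobs.push({
        keyword,
        page,
        url: `https://www.walmart.com/search?q=${query}&page=${page}`,
      });
    }
  }
  return jobs;
}

console.log(buildSearchUrls(['headphones', 'wireless earbuds'], 2).map((job) => job.url));
```

Each job carries its keyword and page number, so rows parsed from job.url can be tagged with where they came from before you merge them.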

Staying unblocked

Even with rendering handled, Walmart watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Hammering pages in a tight loop is the fastest way to get throttled. Spread requests out and vary your keywords instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
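
The pacing and back-off habits above can be sketched as a small retry wrapper. Everything here is illustrative: the helper names are hypothetical, and the default delays and attempt count are placeholder values to tune, not recommendations from Walmart or Crawlbase. It assumes a fetch function that resolves to HTML on success and null on failure, like the crawl function in Step 3.

```javascript
// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Exponential backoff with a cap and random jitter.
// attempt starts at 0; random is injectable so the delay can be tested.
function backoffDelay(attempt, baseMs = 2000, capMs = 60000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.round(ceiling / 2 + random() * (ceiling / 2));
}

// Call fetchPage until it returns HTML or the attempts run out.
// fetchPage must return a promise of HTML (success) or null (failure);
// if it rejects, the rejection propagates to the caller.
async function fetchWithBackoff(fetchPage, maxAttempts = 4, baseMs = 2000) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const html = await fetchPage();
    if (html) return html;
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt, baseMs));
  }
  return null;
}

// Usage: const html = await fetchWithBackoff(() => crawl(url));
```

The jitter keeps retries from landing on a fixed rhythm, and the cap stops the wait from growing without bound. A plain await sleep(...) between successful pages covers the pacing half of the advice.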

For the broader playbook, see how to scrape websites without getting blocked. If you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy gives you the same residential IP rotation as a drop-in proxy endpoint.

Legality and responsible use

Whether scraping Walmart sponsored ads is allowed depends on Walmart's terms of service, your jurisdiction, and what you do with the data. Walmart's terms of use restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Walmart's Terms of Use and its robots.txt, and treat both as the boundary for what you collect.

A few lines worth holding to. Collect only public search-results data: the sponsored title, price, position, and product link that anyone can see without an account. The "Sponsored" label and the ranking are public signals Walmart already shows every shopper, which is what makes ad intelligence on this surface defensible. Be mindful of copyright when it comes to product images and ad copy, which are the seller's and Walmart's intellectual property, so analyze them rather than republishing them as your own. Respect Walmart's stated rate expectations and keep your request volume low enough that you are not straining its servers. Avoid personal data entirely, including anything tied to identifiable buyers or sellers beyond what is publicly listed on a results page. If you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

For volume or commercial use, the right path is an official channel: Walmart runs a Connect advertising platform with its own reporting for advertisers, and the Walmart.io developer program exposes sanctioned APIs. Those are the correct tools when you need large volumes, guaranteed structure, or commercial rights. This guide is deliberately scoped to public search and sponsored-placement data because that is the line that keeps the work defensible. It does not cover anything behind a login, buyer or seller personal data, internal campaign metrics like another advertiser's spend or click-through rate, or any attempt to bypass authentication. If your project needs more than public placements, Walmart's official advertising tools or a data agreement are the correct path, not a cleverer scraper.

Recap

Key takeaways

  • Walmart renders search client-side. A plain fetch returns an incomplete page, so you must render it before you parse it, which the Crawling API does behind a trusted IP in one call.
  • Filter on the "Sponsored" label. Read the [data-testid^="variant-"] .gray tag inside each card and keep only the paid placements, so ads stay separate from organic results.
  • Capture title, price, position, and link. Walmart's product-title and product-price hooks give you the fields; counting only sponsored cards gives you the position.
  • Group by advertiser for intelligence. Rolling the records up by product or brand answers who advertises on a keyword and at what rank, tracked over time as a competitive feed.
  • Stay on public data. Respect Walmart's ToS and robots.txt, prefer the official Walmart Connect or Walmart.io tools for volume or commercial use, and never touch logins, personal data, or another advertiser's private campaign metrics.

Frequently Asked Questions (FAQs)

What are sponsored items on Walmart?

Sponsored items are products an advertiser pays to surface higher in Walmart's search results or category pages. They sit in the same grid as organic listings but carry a small "Sponsored" label, and they are part of Walmart's advertising platform, where sellers bid to put their products in front of shoppers searching a given keyword. Because the placement and the label are visible to every shopper, they are public signals you can read off the page.

Why does a plain request return incomplete data from Walmart?

Because Walmart renders the search grid client-side with JavaScript. The initial HTML is partial until the page's scripts run in a browser, so a raw HTTP request returns a page with the sponsored cards, prices, and links missing or blank, and Walmart often serves a challenge to automated traffic on top of that. Rendering the page through the Crawling API behind a trusted IP gives you the complete grid to parse.

My selectors return empty strings. What changed?

Almost certainly Walmart's markup. The long utility-class container, the data-testid values, and the w_iUH7 price class change without notice, so selectors that worked last month can break. The generated utility-class container is the most brittle of them. Re-inspect a live page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

How do I tell which products advertise on a keyword?

Parse only the cards carrying the "Sponsored" label, capture each one's title, price, position, and product link, then group the records by advertiser. Run the same keyword on a schedule and compare runs to see new advertisers entering the term, price moves on the placements, and changes in rank. That keyword-level rollup is the ad-intelligence output this guide builds toward.
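
Comparing two runs can be sketched as a diff keyed on the product link. The diffRuns helper below is hypothetical, and it assumes the link is stable between runs. If Walmart appends tracking parameters that vary per request, normalize the link first. If a link appears twice in one run, only its last occurrence is compared.

```javascript
// Compare two runs of the same keyword, keyed by product link.
// Each run is an array of { position, title, price, link } records.
function diffRuns(previous, current) {
  const before = new Map(previous.map((ad) => [ad.link, ad]));
  const after = new Map(current.map((ad) => [ad.link, ad]));
  const changes = { entered: [], dropped: [], priceChanged: [], rankChanged: [] };

  for (const [link, ad] of after) {
    const old = before.get(link);
    if (!old) {
      changes.entered.push(ad);
      continue;
    }
    if (old.price !== ad.price) {
      changes.priceChanged.push({ link, from: old.price, to: ad.price });
    }
    if (old.position !== ad.position) {
      changes.rankChanged.push({ link, from: old.position, to: ad.position });
    }
  }
  for (const [link, ad] of before) {
    if (!after.has(link)) changes.dropped.push(ad);
  }
  return changes;
}

// Usage: const changes = diffRuns(yesterdayAds, todayAds);
```

The four lists map onto the questions in the answer above: entered and dropped show who joined or left the term, priceChanged shows price moves, and rankChanged shows shifts in position.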

Can I get another advertiser's spend or click-through rate?

No, and this guide does not cover it. Spend, impressions, and click-through rate are private campaign metrics that live inside an advertiser's own account, not on the public results page. Scraping cannot reach them, and trying to would mean going behind a login. The only sanctioned source for that kind of data is your own Walmart Connect reporting for campaigns you run.

How often should I collect Walmart ad data?

It depends on how fast the keyword moves. For competitive terms where advertisers and prices change frequently, a daily or near-daily run keeps your view current; for slower categories, weekly is often enough. Whatever cadence you pick, keep the per-IP request rate low and rely on rotation so a faster schedule does not get you throttled. Watch the status codes and back off when you start seeing challenges.
