AliExpress is one of the largest marketplaces on the web, and a single search results page packs the data that drives price tracking, product research, and supplier discovery: a product title, its price, a seller rating, an orders or sold count, and a link back to the listing. Pull that across a keyword and you have a structured view of what is selling, at what price, and how well it is rated, all from public listing data.

This guide shows you how to scrape AliExpress search pages with JavaScript and Node.js. You build a small, runnable scraper that turns a keyword into an AliExpress search URL, fetches the rendered results page through the Crawling API, parses each product card with cheerio, walks the pagination, and exports the rows to JSON and CSV. We keep the whole walkthrough scoped to public search and listing data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Node.js script that takes a search keyword, builds the AliExpress search URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for each product on the results page. We pull these fields per card, the same set the legacy AliExpress SERP scraper returned:

  • Title: the product name as listed, for example "Wireless Bluetooth Earbuds Noise Cancelling".
  • Price: the current price as shown on the card, like "$12.96".
  • Rating: the seller or product rating value, for instance "4.9".
  • Orders: the orders or sold count, such as "600 sold".
  • Product URL: the link to the individual item page.

Why a plain request fails on AliExpress

If you request an AliExpress search URL with a bare HTTP client, you get a response with status 200 and very little usable product data in the body. Two things work against you. First, AliExpress builds its search results in the browser with JavaScript, so the initial HTML is a near-empty shell until the page's scripts run and render the product grid. Second, AliExpress flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get challenged, rate-limited, or shown a CAPTCHA before they ever reach the rendered listings.

So a working AliExpress search scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns finished HTML for you to parse.

Why the JS token

Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. AliExpress builds its product grid client-side, so the JS token gives you the most complete page here. Using the normal token can return a shell with no products, leaving you nothing to parse.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are new to Node, the walkthrough on how to build a web scraper with Node.js covers the ground this tutorial assumes.

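A current Node.js LTS release. Node.js 16 has reached end of life, and recent cheerio releases declare a higher minimum version, so use a current LTS build. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.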

A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token from the account docs page. Treat the token like a password: it authenticates your requests, so keep it out of version control.
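
One way to keep the token out of source files is to read it from an environment variable when the script starts. The sketch below assumes a variable named CRAWLBASE_JS_TOKEN; that name is chosen here for illustration and is not something the crawlbase client looks up on its own.

```javascript
// Hypothetical variable name; the crawlbase client does not read it automatically.
const TOKEN_ENV_VAR = 'CRAWLBASE_JS_TOKEN';

// Return the token from the environment, or fail fast with a clear message.
// The env parameter defaults to process.env and exists so the function can be tested.
function getToken(env = process.env) {
  const token = (env[TOKEN_ENV_VAR] || '').trim();
  if (!token) {
    throw new Error(`Set ${TOKEN_ENV_VAR} before running the scraper`);
  }
  return token;
}

// Usage, once the API client is created in Step 2 (not executed here):
//   const api = new CrawlingAPI({ token: getToken() });
```

Failing at startup with a named variable is usually easier to debug than a rejected request several steps later.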

Set up the project

Create a project folder, initialize it, and install the two libraries the scraper needs.

bash
node --version

mkdir aliexpress-search-scraper && cd aliexpress-search-scraper
npm init -y

npm install crawlbase cheerio

Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. If selectors are new to you, the primer on XPath and CSS selectors is a good companion.

Step 1: Build the search URL from a keyword

AliExpress turns a keyword search into a predictable URL. The wholesale search path takes the keyword with spaces replaced by hyphens, which is exactly the transform the legacy scraper used. Wrap that in a small helper so any keyword becomes a valid search URL.

javascript
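function searchUrl(keyword, page = 1) {
  // Collapse any run of whitespace into a single hyphen so double spaces
  // do not produce "--" in the slug.
  const slug = keyword.trim().split(/\s+/).join('-');
  return `https://www.aliexpress.com/w/wholesale-${slug}.html?page=${page}`;
}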

console.log(searchUrl('wireless earbuds'));
// https://www.aliexpress.com/w/wholesale-wireless-earbuds.html?page=1

The page parameter is what you increment later to walk the pagination. For now it stays at 1 so you can get a single page working before scaling out.
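
Real keywords are messier than a clean two-word phrase. The following is a minimal sketch of a stricter builder, not a replacement the guide depends on. It assumes AliExpress accepts percent-encoded words inside the slug, which this guide does not verify, and buildSearchUrl is a hypothetical name.

```javascript
// Hypothetical hardened variant of searchUrl: validates input and encodes each word.
function buildSearchUrl(keyword, page = 1) {
  const words = String(keyword).trim().split(/\s+/).filter(Boolean);
  if (words.length === 0) {
    throw new Error('keyword must not be empty');
  }
  if (!Number.isInteger(page) || page < 1) {
    throw new Error('page must be a positive integer');
  }
  // Assumption: percent-encoded words are acceptable in the wholesale slug.
  const slug = words.map(encodeURIComponent).join('-');
  return `https://www.aliexpress.com/w/wholesale-${slug}.html?page=${page}`;
}

console.log(buildSearchUrl('  usb-c   cable 2m '));
// https://www.aliexpress.com/w/wholesale-usb-c-cable-2m.html?page=1
```

Rejecting an empty keyword or a page below 1 up front keeps a typo from turning into a paid request for a meaningless URL.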

Step 2: Fetch the rendered search page

Next, get the finished page. Import the CrawlingAPI class, initialize it with your JS token, and request the search URL. Checking the status code before you parse keeps failures loud instead of silent.

javascript
const { CrawlingAPI } = require('crawlbase');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

async function crawl(pageUrl) {
  const options = { ajax_wait: 'true', page_wait: 5000 };
  const response = await api.get(pageUrl, options);
  if (response.statusCode === 200) {
    return response.body;
  }
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

crawl(searchUrl('wireless earbuds')).then((html) => {
  console.log(html ? html.slice(0, 500) : 'No HTML returned');
});

The two wait options matter for a client-rendered target like this. ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so the late-rendering product grid appears before the page is captured. Five seconds is a reasonable start; raise it if products come back empty. Save the code so far as scraper.js, run it with node scraper.js, and you should see real product markup, not a stripped-down shell. That confirms rendering works before you write a single selector.
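
The advice to raise page_wait when products come back empty can be automated. The sketch below is one way to do it under stated assumptions: fetchPage(url, options) and countCards(html) are caller-supplied functions, not part of any library. fetchPage resolves to HTML or null (the crawl function above would fit once it accepts an options argument), and countCards returns how many product cards the HTML contains. The wait values are illustrative.

```javascript
// Try progressively longer page_wait values until the page contains product cards.
// fetchPage and countCards are hypothetical, caller-supplied functions.
async function crawlUntilRendered(fetchPage, countCards, pageUrl, waits = [5000, 8000, 12000]) {
  for (const pageWait of waits) {
    const html = await fetchPage(pageUrl, { ajax_wait: 'true', page_wait: pageWait });
    if (html && countCards(html) > 0) {
      return { html, pageWait }; // rendered page plus the wait that worked
    }
  }
  return { html: null, pageWait: null }; // every attempt came back empty
}
```

Each attempt is another API request, so a short list of wait values is a sensible default.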


Step 3: Parse each product with cheerio

With rendered HTML in hand, load it into cheerio and walk the product cards. AliExpress lays each search result out in a repeating card, so you select every card, then read title, price, rating, orders, and the item link from inside it. Reading each field defensively keeps one missing value from crashing the run.

javascript
const cheerio = require('cheerio');

function parseSearch(html) {
  const $ = cheerio.load(html);
  const items = [];

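  // Each search result is an anchor element wrapping one product card.
  $('a.search-card-item').each((_, el) => {
    const card = $(el);
    // The full product name is kept in a title attribute inside the card.
    const title = card.find('[title]').first().attr('title');
    if (!title) return; // skip cards without a title

    // Listing links are often protocol-relative ("//..."); make them absolute.
    const href = card.attr('href') || '';
    const url = href.startsWith('//') ? `https:${href}` : href;

    // Missing elements yield empty strings or undefined, which become null.
    items.push({
      title: title.trim(),
      price: card.find('.multi--price-sale--U-S0jtj').text().trim() || null,
      rating: card.find('.multi--starList--Fh2vqvr').attr('aria-label') || null,
      orders: card.find('.multi--trade--Ktbl2jB').text().trim() || null,
      url: url || null,
    });
  });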

  return items;
}

A couple of details keep this resilient. The title is read from a title attribute inside the card rather than from the visible text, since AliExpress truncates the visible name but keeps the full string in the attribute. The product URL on AliExpress is often protocol-relative (it starts with //), so the parser prepends https: to make it absolute, which mirrors the legacy output where bare https: links were a known rough edge. Each field falls back to null when its element is missing, which is common since not every card shows a rating or an orders count.
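
Protocol-relative links are one case; a card may also carry a root-relative path or a tracking query string. The sketch below handles these with Node's built-in URL class, which resolves an href against a base. The toAbsoluteUrl name and the choice to drop the query string by default are assumptions made for illustration.

```javascript
const BASE = 'https://www.aliexpress.com';

// Resolve an href from a card into an absolute https URL, or null if unusable.
function toAbsoluteUrl(href, { stripQuery = true } = {}) {
  if (!href) return null;
  try {
    // new URL resolves "//host/path", "/path", and full URLs against BASE.
    const url = new URL(href, BASE);
    if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
    url.protocol = 'https:';
    if (stripQuery) {
      url.search = ''; // drop tracking parameters
      url.hash = '';
    }
    return url.toString();
  } catch {
    return null; // href could not be parsed as a URL
  }
}

console.log(toAbsoluteUrl('//www.aliexpress.com/item/1005005690275912.html?spm=abc'));
// https://www.aliexpress.com/item/1005005690275912.html
```

Stripping the query string also gives each listing one stable URL, which helps if you later compare runs.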

Selectors drift

AliExpress hashes its class names (multi--price-sale--U-S0jtj, multi--starList--Fh2vqvr, and the rest), and it regenerates those hashes on deploys, so they change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back as null, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
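
A cheap way to notice drift early is to measure how often each field comes back null across a page of results. The sketch below assumes rows shaped like the parser's output; the function names and the 0.8 threshold are illustrative. The threshold is high on purpose, because ratings and order counts are legitimately absent on some cards.

```javascript
const FIELDS = ['title', 'price', 'rating', 'orders', 'url'];

// Fraction of rows in which each field is null or missing (0 = always present).
// An empty result set counts as fully missing, since it usually means the
// card selector itself stopped matching.
function nullRates(rows) {
  const rates = {};
  for (const field of FIELDS) {
    const missing = rows.filter((row) => row[field] == null).length;
    rates[field] = rows.length ? missing / rows.length : 1;
  }
  return rates;
}

// Fields whose null rate is at or above the threshold: likely broken selectors.
function suspectFields(rows, threshold = 0.8) {
  const rates = nullRates(rows);
  return FIELDS.filter((field) => rates[field] >= threshold);
}
```

Logging suspectFields(items) after each run turns a silent selector break into a visible warning.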

Step 4: Put it together

Now wire the URL builder, the fetch, and the parse into one runnable script. Build the URL, fetch the rendered HTML, hand it to the parser, and print the structured records.

javascript
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

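function searchUrl(keyword, page = 1) {
  // Collapse any run of whitespace into a single hyphen so double spaces
  // do not produce "--" in the slug.
  const slug = keyword.trim().split(/\s+/).join('-');
  return `https://www.aliexpress.com/w/wholesale-${slug}.html?page=${page}`;
}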

async function crawl(pageUrl) {
  const options = { ajax_wait: 'true', page_wait: 5000 };
  const response = await api.get(pageUrl, options);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

function parseSearch(html) {
  const $ = cheerio.load(html);
  const items = [];
  $('a.search-card-item').each((_, el) => {
    const card = $(el);
    const title = card.find('[title]').first().attr('title');
    if (!title) return;
    const href = card.attr('href') || '';
    const url = href.startsWith('//') ? `https:${href}` : href;
    items.push({
      title: title.trim(),
      price: card.find('.multi--price-sale--U-S0jtj').text().trim() || null,
      rating: card.find('.multi--starList--Fh2vqvr').attr('aria-label') || null,
      orders: card.find('.multi--trade--Ktbl2jB').text().trim() || null,
      url: url || null,
    });
  });
  return items;
}

async function main() {
  const html = await crawl(searchUrl('wireless earbuds'));
  if (!html) return;
  const items = parseSearch(html);
  console.log(JSON.stringify(items.slice(0, 3), null, 2));
}

main();

What the output looks like

Run the full script with node scraper.js and it prints the first three records from a clean array, one record per product, ready to write to JSON, CSV, or a database. The sample below is abridged to two records.

json
[
  {
    "title": "Wireless Bluetooth Earbuds Noise Cancelling Touch Control",
    "price": "$12.96",
    "rating": "4.9",
    "orders": "600 sold",
    "url": "https://www.aliexpress.com/item/1005005690275912.html"
  },
  {
    "title": "TWS Gaming Earphones Low Latency Long Battery Life",
    "price": "$8.31",
    "rating": "4.7",
    "orders": "2000 sold",
    "url": "https://www.aliexpress.com/item/1005005123456789.html"
  }
]
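
The fields arrive as display strings, and price tracking usually needs numbers. Here is a minimal sketch of that conversion, assuming the formats in the sample above ("$12.96", "4.9", "600 sold"), plus a comma thousands separator and a trailing plus sign. Other locales, currencies, or abbreviated counts would need their own handling, and all function names are hypothetical.

```javascript
// First number in the string with commas removed, or null if there is none.
function firstNumber(text, pattern) {
  if (!text) return null;
  const match = text.replace(/,/g, '').match(pattern);
  return match ? Number(match[0]) : null;
}

// "$12.96" -> 12.96; "US $1,299.00" -> 1299
const parsePrice = (text) => firstNumber(text, /\d+(?:\.\d+)?/);

// "4.9" -> 4.9
const parseRating = (text) => firstNumber(text, /\d+(?:\.\d+)?/);

// "600 sold" -> 600; "2,000+ sold" -> 2000
const parseOrders = (text) => firstNumber(text, /\d+/);

// Keep the original strings and add numeric companions next to them.
function normalize(row) {
  return {
    ...row,
    priceValue: parsePrice(row.price),
    ratingValue: parseRating(row.rating),
    ordersCount: parseOrders(row.orders),
  };
}
```

Keeping the raw strings alongside the parsed values makes it easy to audit a conversion that looks wrong later.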

Loop through result pages

One page of results is a demo; a real job walks the pagination. AliExpress exposes the page number through the page query parameter, which the searchUrl helper already accepts, so you build each page URL in a loop, fetch it through the Crawling API, parse it with the same function, and collect the rows. Because every results page shares the same card structure, the parser you already wrote works across all of them without changes.

javascript
async function scrapePages(keyword, totalPages) {
  const all = [];
  for (let page = 1; page <= totalPages; page++) {
    const html = await crawl(searchUrl(keyword, page));
    if (html) all.push(...parseSearch(html));
  }
  return all;
}

scrapePages('wireless earbuds', 3).then((rows) => {
  console.log(`Collected ${rows.length} products`);
});
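
If a listing appears on more than one page, the combined array will hold duplicates. The sketch below removes them by item ID, assuming item URLs follow the /item/<digits>.html shape seen in the sample output. The function names are hypothetical.

```javascript
// Extract the numeric item ID from a URL like .../item/1005005690275912.html.
function itemId(url) {
  const match = /\/item\/(\d+)\.html/.exec(url || '');
  return match ? match[1] : null;
}

// Keep the first row seen for each item ID; rows without an ID are kept as-is.
function dedupeByItem(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    const id = itemId(row.url);
    if (id === null) return true;
    if (seen.has(id)) return false;
    seen.add(id);
    return true;
  });
}

// Usage (not executed here):
//   scrapePages('wireless earbuds', 3).then((rows) => dedupeByItem(rows));
```

Matching on the ID rather than the whole URL means two links that differ only in their query string still count as the same product.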

To enrich each row with full detail (every image, the full description, shipping options, and the complete seller profile), take the url from each card and fetch that individual item page through the same crawl function, then write a small parser for the product layout. The pattern is identical: render, then parse. For the product-page version of this work in another language, see how to scrape AliExpress products with Python.

Export to JSON and CSV

Collecting rows in memory is fine for a demo, but you usually want them on disk. Node's built-in fs module writes JSON in one line, and a tiny helper turns the same array into CSV for spreadsheets or a quick import.

javascript
const fs = require('fs');

function toCsv(rows) {
  const headers = ['title', 'price', 'rating', 'orders', 'url'];
  const escape = (v) => `"${(v ?? '').toString().replace(/"/g, '""')}"`;
  const lines = rows.map((r) => headers.map((h) => escape(r[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
}

scrapePages('wireless earbuds', 3).then((rows) => {
  fs.writeFileSync('aliexpress-products.json', JSON.stringify(rows, null, 2));
  fs.writeFileSync('aliexpress-products.csv', toCsv(rows));
  console.log(`Saved ${rows.length} products to JSON and CSV`);
});

The CSV helper quotes every field and doubles any embedded quotes, which keeps product titles with commas in them from breaking the column layout. From here the JSON feeds a database or a notebook, and the CSV opens straight in a spreadsheet for a quick price scan.
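
To see the quoting rule in action, here is the same helper run on a row whose title contains both a comma and a double quote, with a null rating. The sample row is made up for illustration.

```javascript
// Same helper as above, repeated so this example runs on its own.
function toCsv(rows) {
  const headers = ['title', 'price', 'rating', 'orders', 'url'];
  const escape = (v) => `"${(v ?? '').toString().replace(/"/g, '""')}"`;
  const lines = rows.map((r) => headers.map((h) => escape(r[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
}

// Illustrative row: comma and double quote in the title, missing rating.
const sample = [{
  title: 'Earbuds, 5.3" case',
  price: '$12.96',
  rating: null,
  orders: '600 sold',
  url: 'https://www.aliexpress.com/item/1005005690275912.html',
}];

console.log(toCsv(sample));
// title,price,rating,orders,url
// "Earbuds, 5.3"" case","$12.96","","600 sold","https://www.aliexpress.com/item/1005005690275912.html"
```

The comma stays inside the quoted title, the embedded quote is doubled, and the null rating becomes an empty quoted field, so the row still has exactly five columns.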

Staying unblocked

Even with rendering handled, AliExpress watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Hammering pages in a tight loop is the fastest way to get throttled or shown a CAPTCHA. Spread requests out and vary your keywords instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
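
The pacing and back-off habits above can be sketched as a small loop. This is one possible shape, not a tuned policy: the delay range and failure limit are illustrative values rather than documented AliExpress limits, and fetchOne is a caller-supplied function that resolves to HTML or null, such as the crawl function from Step 2.

```javascript
// Pick a delay between baseMs and baseMs + spreadMs; rand is injectable for tests.
function jitteredDelay(baseMs, spreadMs, rand = Math.random) {
  return Math.round(baseMs + rand() * spreadMs);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch urls one at a time, pausing a random interval between requests and
// stopping early after maxFailures consecutive null results.
async function pacedFetch(urls, fetchOne, { baseMs = 2000, spreadMs = 3000, maxFailures = 3 } = {}) {
  const results = [];
  let failures = 0;
  for (const [index, url] of urls.entries()) {
    const html = await fetchOne(url);
    results.push({ url, html });
    failures = html ? 0 : failures + 1;
    if (failures >= maxFailures) break; // back off instead of pushing through
    if (index < urls.length - 1) {
      await sleep(jitteredDelay(baseMs, spreadMs));
    }
  }
  return results;
}
```

Stopping after a run of failures treats the errors as the signal they are, and the random spread keeps the request timing from looking mechanical.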

For the broader playbook, see how to scrape websites without getting blocked. If you would rather route your own traffic through a rotating pool than use the managed API, the Smart AI Proxy gives you the same residential IP rotation as a drop-in proxy endpoint; the proxy-first take on this exact site is covered in AliExpress proxy scraping. AliExpress is also a frequent target for broader ecommerce web scraping work, where the same fetch-then-parse pattern carries across sites, and the price fields you collect here feed straight into price intelligence.

Legal and ethical considerations

Whether scraping AliExpress is allowed depends on AliExpress's terms of service, your jurisdiction, and what you do with the data. AliExpress's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read AliExpress's Terms of Use and its robots.txt, and treat both as the boundary for what you collect.

A few lines worth holding to. Collect only public search data: the product title, price, rating, orders count, and the item link that anyone can see without an account. Respect AliExpress's stated rate expectations and keep your request volume low enough that you are not straining its servers. Avoid personal data, including anything tied to identifiable buyers or sellers beyond the public store name shown on a card, and do not redistribute the product images or descriptions wholesale, since those are the sellers' copyrighted media. If you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

For volume or commercial use, AliExpress offers an official affiliate and open-platform API through its parent Alibaba, and that is the right tool when you need large volumes, guaranteed structure, or commercial rights. This guide is deliberately scoped to public search and listing pages because that is the line that keeps the work defensible. It does not cover anything behind a login, buyer or seller personal data, private messages between users, order or account data gated by a sign-in, or any attempt to bypass authentication. If your project needs more than public listings, the official API or a data agreement is the correct path, not a cleverer scraper.

Recap

Key takeaways

  • AliExpress builds its grid client-side. A plain request returns an empty shell, so you must render the search page before you parse it.
  • You need rendering and a trusted IP together. The Crawling API with a JS token does both in one call; ajax_wait and page_wait control how long it waits for the product grid.
  • cheerio does the extraction. Select every search card, then map title, price, rating, orders, and the product URL to current selectors, and expect AliExpress's hashed class names to drift.
  • Scale by looping the page parameter. The page query parameter walks the result pages, and the same parser works on every one of them. Once the rows are collected, export them to JSON and CSV.
  • Stay on public data. Respect AliExpress's ToS and robots.txt, prefer the official Alibaba API for volume or commercial use, and never touch logins, personal data, or copyrighted media you would redistribute.

Frequently Asked Questions (FAQs)

Why does a plain request return no products from AliExpress?

Because AliExpress builds its search results in the browser with JavaScript. The initial HTML is a near-empty shell until the page's scripts run and render the product grid, so a raw HTTP request returns status 200 with no usable product data. To get a complete page you have to render it first, which is what the Crawling API's JS token handles for you.

Do I need the normal token or the JS token for AliExpress?

Use the JS token. The normal token fetches static HTML, which on AliExpress comes back without products. The JS token renders the page in a real browser before handing back the HTML, so the product cards are present when cheerio parses them.

How do I scrape multiple pages of AliExpress search results?

AliExpress exposes the page number through the page query parameter on the wholesale search URL. Increment it in a loop, fetch each page through the Crawling API, and run the same parser over every page. The card structure is identical across pages, so one parser collects all the rows, which you then write to JSON or CSV.

My selectors return null. What changed?

Almost certainly AliExpress's markup. Its product cards use hashed class names like multi--price-sale--U-S0jtj that the site regenerates on deploys, so selectors that worked last month can break. Re-inspect a live page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

Can I scrape buyer or seller personal data from AliExpress?

No, and this guide does not cover it. Buyer details, private messages, and account data sit behind a login, so they are not public data. The public store name on a product card is fine to record, but scraping login-walled content, personal data, or bypassing authentication to reach it is out of scope here and runs against AliExpress's terms. For sanctioned access the correct route is the official Alibaba API or a licensing agreement.

Should I use the official API or scrape the site?

If you need volume, guaranteed structure, or commercial reuse rights, use the official Alibaba open-platform or affiliate API. It is built for that and keeps you on the right side of AliExpress's terms. Scraping public search pages with the approach in this guide fits smaller, public-data research where no API access is in place, as long as you respect the ToS, robots.txt, and rate limits.
