Expedia is one of the largest online travel marketplaces, and the hotel, flight, and activity data it surfaces is genuinely useful if you are doing price research, monitoring a market, or building a travel product. The catch is that Expedia renders its results client-side and defends hard against bots, so a plain HTTP fetch hands you an empty shell. This guide shows you how to scrape Expedia with JavaScript: a small, runnable Node build that renders the page, parses hotel listings with Cheerio, paginates, and writes the results to CSV.
To keep this honest and defensible, the whole walkthrough is scoped to public data: hotel and flight listings, nightly prices, availability, ratings, and review counts that anyone can see without logging in. It does not touch user accounts, login-walled content, booking actions, or personal data. The ethics and ToS section near the end is not boilerplate, so read it before you point this at production volume.
Why scrape Expedia for travel data
Public travel pricing moves constantly, and a single page view only tells you what a fare looks like right now. Scraping Expedia's public listings lets you track nightly rates and availability over time, which is what price monitoring, competitor benchmarking, and demand research all depend on. For an engineer, the value is structured data: turning a rendered search page into clean rows you can query, chart, or feed into a model.
This is the same shape of problem as any ecommerce web scraping job. The difference is that Expedia's anti-bot stack is more aggressive than a typical store, so the approach has to account for that from the first request.
Understand the target: Expedia's hotel-search URL
Expedia exposes several search surfaces (flights, cars, cruises, things to do), but the hotel search is the cleanest to work with and the one this guide uses. Its results live at a predictable URL whose query parameters map directly to the search form, so you can construct any search programmatically without driving the UI.
Here is a concrete example: two adults, one room, in Dubai, for a four-night stay.
https://www.expedia.com/Hotel-Search?adults=2&rooms=1&destination=Dubai&startDate=2026-07-10&endDate=2026-07-14
The parameters that matter:
- destination: the search location, URL-encoded (a city, region, or property name).
- adults: the number of adult guests.
- rooms: how many rooms to price.
- startDate and endDate: the stay window, in YYYY-MM-DD format.
Build the URL with the parameters you care about and you have a repeatable target. Vary the destination and dates in a loop and you have a price-monitoring job.
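As a minimal sketch of that idea, the helper below assembles the search URL from the parameters listed above. The function name buildHotelSearchUrl and its option names are hypothetical, chosen for illustration; only the query parameter names come from the example URL.

```javascript
// Sketch: build a Hotel-Search URL from the parameters described above.
// buildHotelSearchUrl is a hypothetical helper, not part of any library.
function buildHotelSearchUrl({ destination, startDate, endDate, adults = 2, rooms = 1 }) {
  const params = new URLSearchParams({
    adults: String(adults),
    rooms: String(rooms),
    destination, // URLSearchParams encodes the value (spaces become +)
    startDate, // YYYY-MM-DD
    endDate, // YYYY-MM-DD
  })
  return `https://www.expedia.com/Hotel-Search?${params.toString()}`
}

// One URL per destination for the same stay window
const urls = ['Dubai', 'Paris', 'New York'].map((destination) =>
  buildHotelSearchUrl({ destination, startDate: '2026-07-10', endDate: '2026-07-14' })
)

console.log(urls[0])
// https://www.expedia.com/Hotel-Search?adults=2&rooms=1&destination=Dubai&startDate=2026-07-10&endDate=2026-07-14
```

Letting URLSearchParams handle the encoding avoids hand-escaping destinations that contain spaces or punctuation.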
Why a plain fetch fails here
If you request that URL with a bare HTTP client, you get a response with status 200 and almost no hotel data in the body. Two things are working against you. First, Expedia renders its listings in the browser with JavaScript, so the initial HTML is a shell that gets populated only after the page's scripts run. Second, Expedia flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get challenged or blocked before they ever see the rendered content.
So a working Expedia scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawlbase Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns the finished HTML for you to parse.
Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Expedia is client-side rendered, so you need the JS token here. Using the normal token returns the same empty shell a plain fetch would.
Set up the project
You need Node.js and npm installed. Confirm both, then create a project and install the libraries.
    node --version
    npm --version
    mkdir expedia-scraper && cd expedia-scraper
    npm init -y
    npm install cheerio crawlbase csv-writer
Three dependencies do the work: crawlbase is the client for the Crawling API, cheerio parses the returned HTML with a jQuery-like API on the server, and csv-writer serializes your results. If you want a queryable store instead of flat files, add sqlite3 and write rows to a table; the CSV path below covers the common case.
You also need a Crawlbase account and a JS token, which you get from the dashboard after signing up. Drop it into the code wherever you see YOUR_CRAWLBASE_JS_TOKEN.
Fetch the rendered HTML
Start by getting the finished page. You pass two options that matter for a site like Expedia: ajax_wait tells the API to wait for asynchronous content to load, and page_wait holds for a fixed number of milliseconds after load so late-rendering listings have time to appear. Five seconds is a reasonable starting point; raise it if results come back thin.
    const { CrawlingAPI } = require('crawlbase')

    const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_JS_TOKEN' })

    const options = {
      ajax_wait: true,
      page_wait: 5000,
    }

    const expediaURL =
      'https://www.expedia.com/Hotel-Search?adults=2&rooms=1&destination=Paris&startDate=2026-07-10&endDate=2026-07-14'

    async function fetchExpediaHTML(url) {
      const response = await api.get(url, options)
      return response.body
    }

    fetchExpediaHTML(expediaURL).then((html) => console.log(html))
Save the code as expedia-scraper.js, run it with node expedia-scraper.js, and you should see real markup with hotel cards in it, not the empty shell a plain fetch returns. That confirms rendering is working before you write a single selector.
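Eyeballing a long HTML dump gets tedious, so a rough programmatic check can help. The sketch below looks for a listing-container marker in the response. The helper name looksRendered is hypothetical, and the marker string is an assumption borrowed from the selectors used in this guide, so it may need updating if Expedia's markup changes.

```javascript
// Hypothetical helper: a rough check that the response contains rendered
// listings rather than the unrendered shell.
// Assumption: the results container carries this data-stid attribute.
const LISTING_MARKER = 'data-stid="property-listing-results"'

function looksRendered(html) {
  return typeof html === 'string' && html.includes(LISTING_MARKER)
}

console.log(looksRendered('<div id="app"></div>')) // false
console.log(looksRendered('<div data-stid="property-listing-results"></div>')) // true
```

If the check fails on a real response, raising page_wait is the first thing to try before touching anything else.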
Expedia needs a rendered page behind a trusted IP, in one call. The Crawling API takes a JS token, runs the page in a real browser, rotates through residential IPs server-side, and hands you finished HTML, so you skip running a headless fleet and a proxy pool yourself. Point it at a public hotel search on the free tier first.
Parse hotel listings with Cheerio
With the HTML in hand, load it into Cheerio and walk the property cards. Each card carries the fields you want: hotel name, price per night, total price, rating, and review count. Inspect the live page in your browser's dev tools to find the current selectors, then map each field to one.
    const cheerio = require('cheerio')

    function extractHotelDetails(html) {
      const $ = cheerio.load(html)
      const hotels = []

      $('div[data-stid="property-listing-results"] .uitk-card').each((i, el) => {
        hotels.push({
          name: $(el).find('h3.uitk-heading').text().trim(),
          pricePerNight: $(el).find('[data-test-id="price-summary"] .uitk-text').first().text().trim(),
          totalPrice: $(el).find('[data-test-id="price-summary"] .uitk-text').last().text().trim(),
          rating: $(el).find('.uitk-badge .uitk-badge-base-text').text().trim(),
          reviews: $(el).find('.uitk-text[aria-hidden="false"]').last().text().trim(),
        })
      })

      return hotels
    }
Expedia's class names (the uitk-* prefixes and data-stid attributes) change without notice. Treat the selectors above as a starting template, not a contract. When extraction returns empty strings, re-inspect the live page and update the selectors. This is normal maintenance for any production scraper, not a sign something is broken.
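One way to notice selector drift early is to count empty fields after each run. This is a sketch under the assumption that rows look like the objects built by the extraction function; emptyFieldReport is a hypothetical helper name.

```javascript
// Hypothetical helper: count how many extracted rows have an empty value
// for each field. A field that is empty in every row usually means its
// selector no longer matches the live markup.
function emptyFieldReport(hotels) {
  const report = {}
  for (const hotel of hotels) {
    for (const [field, value] of Object.entries(hotel)) {
      if (!(field in report)) report[field] = 0
      if (String(value ?? '').trim() === '') report[field] += 1
    }
  }
  return report
}

const sample = [
  { name: 'Hotel A', pricePerNight: '$118', rating: '' },
  { name: 'Hotel B', pricePerNight: '', rating: '' },
]

console.log(emptyFieldReport(sample))
// { name: 0, pricePerNight: 1, rating: 2 }
```

Comparing a field's count against the number of rows separates a broken selector (every row empty) from a listing that simply lacks a rating or review count.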
Wire the fetch and the parse together in a main function so you have one runnable script.
    async function main() {
      const html = await fetchExpediaHTML(expediaURL)
      const hotels = extractHotelDetails(html)
      console.log(hotels)
    }

    main().catch((err) => console.error('Scrape failed:', err))
What the output looks like
Run the full script and you get an array of structured hotel objects. A trimmed sample:
    [
      {
        "name": "OKKO Hotels Paris Rueil-Malmaison",
        "pricePerNight": "$118",
        "totalPrice": "$542 total",
        "rating": "9.0",
        "reviews": "286 reviews"
      },
      {
        "name": "Grand Hotel Leveque",
        "pricePerNight": "$174",
        "totalPrice": "$782 total",
        "rating": "8.4",
        "reviews": "1,004 reviews"
      },
      {
        "name": "citizenM Paris Opera",
        "pricePerNight": "$192",
        "totalPrice": "$871 total",
        "rating": "9.6",
        "reviews": "76 reviews"
      }
    ]
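Every value in that output is a display string, which is awkward to sort or chart. A minimal sketch of a normalization step follows. parseNumber and normalizeHotel are hypothetical helpers, and they assume the formats shown in the sample (a currency symbol, comma thousands separators, a trailing label); other locales or currencies would need different handling.

```javascript
// Hypothetical helpers: convert scraped display strings into numbers.
// Assumes formats like "$118", "$542 total", "9.0", and "1,004 reviews".
function parseNumber(text) {
  const match = String(text).replace(/,/g, '').match(/\d+(\.\d+)?/)
  return match ? Number(match[0]) : null // null when no digits are present
}

function normalizeHotel(hotel) {
  return {
    name: hotel.name,
    pricePerNight: parseNumber(hotel.pricePerNight),
    totalPrice: parseNumber(hotel.totalPrice),
    rating: parseNumber(hotel.rating),
    reviews: parseNumber(hotel.reviews),
  }
}

console.log(
  normalizeHotel({
    name: 'Grand Hotel Leveque',
    pricePerNight: '$174',
    totalPrice: '$782 total',
    rating: '8.4',
    reviews: '1,004 reviews',
  })
)
// { name: 'Grand Hotel Leveque', pricePerNight: 174, totalPrice: 782, rating: 8.4, reviews: 1004 }
```

Note that this drops the currency symbol, so keep the raw string alongside the number if your searches can return more than one currency.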
Handle pagination
Expedia loads its first batch of results and then reveals more behind a "Show more results" button rather than a numbered page list. The Crawling API can click that button for you with the css_click_selector option: pass a valid, URL-encoded CSS selector and the API clicks it after the page renders, so the additional listings load before the HTML comes back.
    const options = {
      ajax_wait: true,
      page_wait: 10000,
      css_click_selector: encodeURIComponent('button[data-stid="show-more-results"]'),
    }
Swap that into your options object and the next run returns more hotels than before. Give page_wait a little more room when clicking, since the extra results need time to render after the click fires.
Store the results in CSV
Logging to the console is fine while you iterate, but you want the data on disk. The csv-writer library maps each object key to a column and writes your rows in a few lines.
    const createCsvWriter = require('csv-writer').createObjectCsvWriter

    function saveToCSV(data) {
      const writer = createCsvWriter({
        path: 'hotels.csv',
        header: [
          { id: 'name', title: 'Name' },
          { id: 'pricePerNight', title: 'Price Per Night' },
          { id: 'totalPrice', title: 'Total Price' },
          { id: 'rating', title: 'Rating' },
          { id: 'reviews', title: 'Reviews' },
        ],
      })

      return writer.writeRecords(data).then(() => console.log('Saved hotels.csv'))
    }
Call saveToCSV(hotels) from main instead of logging, and each run writes a tidy hotels.csv you can open in any spreadsheet or load into a pipeline. If you would rather query the data with SQL, write the same rows into a SQLite table with sqlite3 instead; the parsing stays identical.
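For price monitoring, rows from different runs are only comparable if each one records which search produced it and when. The sketch below tags rows before they are saved. withRunMetadata and the extra column names are hypothetical, and the CSV header would need matching entries for the new fields.

```javascript
// Hypothetical helper: tag each row with the search it came from and the
// time it was scraped, so rows from different runs can be compared later.
function withRunMetadata(hotels, search, scrapedAt = new Date()) {
  return hotels.map((hotel) => ({
    ...hotel,
    destination: search.destination,
    startDate: search.startDate,
    endDate: search.endDate,
    scrapedAt: scrapedAt.toISOString(),
  }))
}

const rows = withRunMetadata(
  [{ name: 'citizenM Paris Opera', pricePerNight: '$192' }],
  { destination: 'Paris', startDate: '2026-07-10', endDate: '2026-07-14' },
  new Date('2026-05-01T00:00:00Z')
)

console.log(rows[0].scrapedAt) // 2026-05-01T00:00:00.000Z
```

To accumulate several runs in one file rather than overwrite it, look at csv-writer's append option and check its README for how headers behave in that mode.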
Staying unblocked
Even with rendering handled, Expedia watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.
- Pace your requests. Hammering the same search in a tight loop is the fastest way to get throttled. Spread requests out and vary your search parameters.
- Lean on rotation. A pool of residential proxies spreads requests across many real-user IPs so no single address trips a rate limit. The Crawling API handles this for you; if you roll your own, this is the part to get right.
- Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat proxy status error codes as signal, not noise.
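The pacing and back-off habits above can be sketched in a few lines. sleep, jitteredDelay, withBackoff, and crawlAll are hypothetical helpers, not Crawlbase APIs, and the delay values are illustrative assumptions rather than documented limits.

```javascript
// Sketch: pause between requests and retry failures with exponential backoff.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Random delay in [minMs, maxMs) so requests are not evenly spaced
const jitteredDelay = (minMs, maxMs) => minMs + Math.random() * (maxMs - minMs)

async function withBackoff(task, { retries = 3, baseMs = 2000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task()
    } catch (err) {
      if (attempt >= retries) throw err // out of retries: surface the error
      await sleep(baseMs * 2 ** attempt) // 2s, 4s, 8s with the defaults
    }
  }
}

// Usage sketch: fetchOne stands in for a function like fetchExpediaHTML
async function crawlAll(urls, fetchOne) {
  const pages = []
  for (const url of urls) {
    pages.push(await withBackoff(() => fetchOne(url)))
    await sleep(jitteredDelay(3000, 8000)) // pace the next request
  }
  return pages
}
```

The retry only fires when the task throws, so have your fetch function throw when the response signals a challenge or error instead of returning the body unchecked.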
For the broader playbook on this, see how to scrape websites without getting blocked. If you want to compare this approach with a headless-browser build, web scraping with Python and Selenium walks through that stack.
The honest part: ToS and legality
Scraping a large commercial travel site sits in a legal gray area, and the answer to "is it allowed" depends on the platform's terms of service, your jurisdiction, and what you do with the data. Expedia's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work.
A few lines worth holding to. Collect only public data: listings, prices, availability, and ratings that anyone can see without an account. Respect the site's robots.txt and its stated rate expectations, and keep your request volume low enough that you are not straining anyone's servers. If you plan to reuse the data commercially, get permission or an official data agreement rather than assuming silence is consent. And never collect personal data, including anything tied to individual user accounts.
This guide is deliberately scoped to public listing data because that is the line that keeps the work defensible. It does not cover anything behind a login, account or profile data, reviews tied to identifiable people, or booking actions of any kind. If your project needs more than public listings, the right move is an official API or a data agreement with Expedia, not a cleverer scraper. For background on how managed access differs from raw scraping, what is an API proxy is a useful read.
Key takeaways
- Expedia is client-side rendered. A plain fetch returns an empty shell, so you must render the page before you parse it.
- You need rendering and a trusted IP together. The Crawling API with a JS token does both in one call; ajax_wait and page_wait control how long it waits for content.
- Cheerio does the extraction. Map name, price, rating, and reviews to current selectors, and expect those selectors to drift over time.
- Pagination is a click. Use css_click_selector to trigger "show more results" before the HTML comes back.
- Stay on public data. Respect Expedia's ToS and robots.txt; no accounts, no personal data, no booking actions.
Frequently Asked Questions (FAQs)
Why does a plain fetch return no hotel data from Expedia?
Because Expedia renders its listings client-side with JavaScript. The initial HTML is a shell that only fills in after the page's scripts run in a browser, so a raw HTTP request returns status 200 with the hotel fields blank. To get real data you have to render the page first, which is what the Crawling API's JS token handles for you.
Do I need the normal token or the JS token for Expedia?
The JS token. The normal token fetches static HTML, which on Expedia is the same empty shell a plain fetch returns. The JS token renders the page in a real browser before handing back the HTML, so the listings are present when Cheerio parses them.
My Cheerio selectors return empty strings. What changed?
Almost certainly Expedia's markup. Its uitk-* class names and data-stid attributes change without notice, so selectors that worked last month can break. Re-inspect a live search page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.
How do I get more than the first page of results?
Expedia reveals additional hotels behind a "Show more results" button rather than numbered pages. Pass that button's CSS selector to the Crawling API's css_click_selector option, URL-encoded, and the API clicks it after rendering so the extra listings load before the HTML returns. Give page_wait a bit more time when you do this.
How do I avoid getting blocked while scraping Expedia?
Keep your per-IP request rate low, vary your search parameters instead of looping the same URL, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation and a trusted IP pool for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
Is it legal to scrape Expedia?
It depends on Expedia's terms of service, your jurisdiction, and your purpose, and their terms restrict automated access. Keep strictly to public listing data, respect robots.txt and rate expectations, and never touch accounts, personal data, or booking actions. For commercial reuse, get permission or an official data agreement rather than relying on a scraper.
