Booking.com is one of the largest travel marketplaces on the open web, and its public search-results pages carry a lot of structured signal: hotel and property names, nightly prices, guest review scores, neighbourhoods, and the link to each listing. Price-intelligence teams use it to benchmark rates across cities, researchers study supply and demand by destination, and trip planners compare properties side by side. All of that sits on the public search page in a predictable card layout.
This guide shows you how to scrape Booking.com with JavaScript and Node.js using Cheerio. You build a small, runnable scraper that fetches a public Booking.com search-results page through the Crawling API, parses the property name, price, review score, location, and listing URL for each card, then exports the result as JSON and CSV. The whole walkthrough stays scoped to public hotel listing data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.
What you will build
A Node.js script that takes a public Booking.com search-results URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every property card on the page. We use a San Francisco hotel search as the running example and pull these fields per listing:
- Name: the hotel or property name shown on the card, for example "Hotel Zephyr San Francisco".
- Price: the displayed nightly price, like "US$592".
- Rating: the public guest review score, when Booking.com shows one for that property.
- Location: the neighbourhood and city the property is listed in.
- Link: the URL to the individual property page.
Why a plain request fails on Booking.com
If you request a Booking.com search URL with a bare HTTP client, you rarely get the property cards back. Two things work against you. First, Booking.com renders the results list in the browser with JavaScript, so the initial HTML is a near-empty shell until the page's scripts run. Second, Booking.com watches for automated traffic: datacenter IPs and request patterns that do not look like a real browser get challenged with a CAPTCHA, rate-limited, or blocked before they ever reach the rendered listings.
So a working scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but keeping that stack healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse with Cheerio.
The Crawling API gives you two tokens: a normal one and a JavaScript one. Booking.com needs the page rendered in a real browser, so use your JavaScript token for every request in this guide. The normal token returns the unrendered shell and your selectors will come back empty.
Prerequisites
You need a few things in place before writing any code. None of them take long.
Basic JavaScript and Node.js. You should be comfortable writing and running a Node script, installing packages with npm, and working with asynchronous code. For a fuller walkthrough, our guide to building a web scraper with Node.js covers the basics.
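Node.js 18.17 or later. Node 16 has reached end of life, and recent Cheerio releases declare Node 18.17 as their minimum. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.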
A Crawlbase account and token. Sign up, open your dashboard, and copy your JavaScript token from the account docs page. The free tier gives you 1,000 requests with no card, and you only pay for successful requests. Treat the token like a password: it authenticates your requests, so keep it out of version control.
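One way to keep the token out of version control is to read it from an environment variable at startup. The sketch below assumes a variable named CRAWLBASE_JS_TOKEN, which is a hypothetical name chosen for this example and not something the Crawling API requires.

```javascript
// Read the Crawling API token from the environment instead of hardcoding it.
// CRAWLBASE_JS_TOKEN is a hypothetical variable name chosen for this sketch.
function readToken(env = process.env) {
  const token = (env.CRAWLBASE_JS_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set CRAWLBASE_JS_TOKEN before running the scraper');
  }
  return token;
}
```

With this in place, the client could be created as new CrawlingAPI({ token: readToken() }), and the script fails fast with a readable message when the variable is missing.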
Set up the project
Create a project folder, initialize it, and install the two libraries the scraper needs.
    node --version
    mkdir booking-scraper && cd booking-scraper
    npm init -y
    npm install crawlbase cheerio
Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. Create a file named scraper.js in this folder and add the code from the steps below.
Step 1: Fetch the rendered search page
Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JavaScript token, and request a public Booking.com search-results URL. Checking the status code before you parse keeps failures loud instead of silent.
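    const { CrawlingAPI } = require('crawlbase');

    // Use the JavaScript token so the page is rendered before it is returned
    const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

    const bookingPageURL =
      'https://www.booking.com/searchresults.html?ss=San+Francisco&checkin=2025-12-25&checkout=2025-12-31&group_adults=2&no_rooms=1&group_children=0&selected_currency=USD';

    api
      .get(bookingPageURL)
      .then((response) => {
        if (response.statusCode === 200) {
          // Preview the start of the returned HTML
          console.log(response.body.slice(0, 500));
        } else {
          // Report non-200 responses instead of exiting silently
          console.error(`Request failed with status ${response.statusCode}`);
        }
      })
      .catch((error) => console.error('API request error:', error));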
Run the script with node scraper.js and you should see real Booking.com property markup at the top of the body, not a stripped-down shell. That confirms rendering works before you write a single selector. The Crawling API uses the JavaScript token you supplied to render the page in a real browser, so the property cards are present in the HTML you get back.
That first request just returned a fully rendered Booking.com search page without a headless browser or a proxy on your side. The Crawling API runs the page in a real browser, rotates through residential IPs server-side, and handles the CAPTCHAs Booking.com throws at scrapers, so you get finished HTML from one call. Point it at a public hotel search on the free tier first, then add your parser.
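A 500-character preview shows that a response arrived, but not whether the listings were rendered. A rough check is to count the property-card markers in the returned HTML. This is a minimal sketch that assumes the cards carry data-testid="property-card" with double-quoted attributes, as the parser in Step 2 expects; countPropertyCards is a helper name introduced here for illustration.

```javascript
// Count property-card markers in the returned HTML as a quick rendering check.
function countPropertyCards(html) {
  const matches = String(html).match(/data-testid="property-card"/g);
  return matches ? matches.length : 0;
}

// Stand-in HTML strings for illustration; pass response.body in a real run
const shell = '<html><body><div id="root"></div></body></html>';
const rendered =
  '<div data-testid="property-card"></div><div data-testid="property-card"></div>';

console.log(countPropertyCards(shell)); // 0
console.log(countPropertyCards(rendered)); // 2
```

A count of zero on a 200 response usually means you received the unrendered shell (check that you used the JavaScript token) or that the card marker has been renamed.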
Step 2: Parse each property card with Cheerio
With rendered HTML in hand, load it into Cheerio and walk the property cards. Booking.com wraps each listing in an element marked data-testid="property-card", so you select every card, then read the name, location, review score, price, and listing link from inside it using the page's own data-testid attributes. Reading each field defensively keeps one missing value from crashing the run.
    const cheerio = require('cheerio');

    function parseDataFromHTML(html) {
      const $ = cheerio.load(html);
      const properties = [];
      const cardSelector = '[data-testid="property-card"]';

      $(cardSelector).each((_, card) => {
        const currentCard = $(card);
        // Return trimmed text for a selector, or '' when nothing matches
        const extractText = (selector) => currentCard.find(selector).text().trim();

        const name = extractText('[data-testid="title"]');
        const location = extractText('[data-testid="address"]');
        const rating = extractText(
          '[data-testid="review-score"] div.a3b8729ab1.d86cee9b25',
        );
        const price = extractText(
          'span[data-testid="price-and-discounted-price"]',
        );
        const link = currentCard
          .find('[data-testid="title-link"]')
          .attr('href');

        // Skip cards without a name; fill placeholders for other missing fields
        if (name) {
          properties.push({
            name,
            price: price || 'Price not available',
            rating: rating || 'Rating not available',
            location,
            link: link || '',
          });
        }
      });

      return properties;
    }
A few details keep this faithful to the page. Each listing lives in a [data-testid="property-card"] element. Inside a card, the name comes from [data-testid="title"], the location from [data-testid="address"], the review score from the score block inside [data-testid="review-score"], the price from span[data-testid="price-and-discounted-price"], and the listing URL from the [data-testid="title-link"] anchor's href. Pulling on the stable data-testid hooks first, and only falling back to generated class names where you have to, keeps the parser steadier as the markup shifts.
Booking.com's generated class names (the a3b8729ab1 style suffixes above) change without notice, and even the data-testid hooks get renamed occasionally. Treat the selectors as a starting template, not a contract. When a field comes back empty, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
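To notice selector drift before it quietly degrades a dataset, you can measure how often each field comes back empty. The sketch below is one possible check, not part of the Crawling API or Cheerio; missingFieldRates and PLACEHOLDERS are names introduced here, and the placeholder strings mirror the fallbacks in parseDataFromHTML.

```javascript
// Report what share of records are missing each field, to spot selector drift.
// The placeholder strings match the fallbacks used in parseDataFromHTML.
const PLACEHOLDERS = new Set(['', 'Price not available', 'Rating not available']);

function missingFieldRates(
  records,
  fields = ['name', 'price', 'rating', 'location', 'link'],
) {
  const rates = {};
  for (const field of fields) {
    const missing = records.filter((record) =>
      PLACEHOLDERS.has(String(record[field] ?? '').trim()),
    ).length;
    // Avoid dividing by zero when the page returned no records at all
    rates[field] = records.length ? missing / records.length : 0;
  }
  return rates;
}
```

A field whose rate jumps to 1 across a whole page is a strong hint that its selector needs updating, whereas a modest rate for rating can simply reflect properties that have no review score yet.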
Step 3: Assemble the full script with JSON and CSV export
Now wire the fetch and the parse into one runnable script, then write the records to disk as both JSON and CSV. An earlier version of this guide saved the raw HTML to a file between steps; folding the fetch and parse into a single run keeps the moving parts down.
    const fs = require('fs');
    const { CrawlingAPI } = require('crawlbase');
    const cheerio = require('cheerio');

    const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

    // Fetch one rendered page; return its HTML, or null on a non-200 status
    async function crawl(pageUrl) {
      const response = await api.get(pageUrl);
      if (response.statusCode === 200) return response.body;
      console.error(`Request failed: ${response.statusCode}`);
      return null;
    }

    // Serialize records as CSV, quoting every field and doubling embedded quotes
    function toCsv(rows) {
      const headers = ['name', 'price', 'rating', 'location', 'link'];
      const escape = (value) => `"${String(value).replace(/"/g, '""')}"`;
      const lines = [headers.join(',')];
      for (const row of rows) {
        lines.push(headers.map((h) => escape(row[h])).join(','));
      }
      return lines.join('\n');
    }

    // Paste parseDataFromHTML from Step 2 here

    async function main() {
      const url =
        'https://www.booking.com/searchresults.html?ss=San+Francisco&checkin=2025-12-25&checkout=2025-12-31&group_adults=2&no_rooms=1&group_children=0&selected_currency=USD';
      const html = await crawl(url);
      if (!html) return;

      const properties = parseDataFromHTML(html);
      fs.writeFileSync('booking.json', JSON.stringify(properties, null, 2));
      fs.writeFileSync('booking.csv', toCsv(properties));
      console.log(`Saved ${properties.length} properties to JSON and CSV`);
    }

    // Surface network or file errors instead of leaving a rejected promise unhandled
    main().catch((error) => console.error('Scrape failed:', error));
Paste the parseDataFromHTML function from Step 2 into the same file so main can call it. Run it with node scraper.js and you get two files: booking.json with the full structured records and booking.csv ready to open in a spreadsheet. The toCsv helper quotes every field and doubles any embedded quotes, which matters here because property names and neighbourhood labels frequently contain commas.
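The href values on a live page may be relative or may carry query parameters, which makes the same property look like several different links. If that shows up in your output, a small normalizer can resolve each href against the site origin and keep only the path. This is a sketch under the assumption that the path alone identifies the property page; cleanListingUrl is a hypothetical helper, not something the page or the API provides.

```javascript
// Resolve a card href against the site origin and drop the query string and fragment.
// Assumption: the path alone identifies the property page.
function cleanListingUrl(href, base = 'https://www.booking.com') {
  if (!href) return '';
  try {
    const url = new URL(href, base);
    url.search = '';
    url.hash = '';
    return url.toString();
  } catch {
    return ''; // malformed href
  }
}
```

You could apply it when building each record, for example link: cleanListingUrl(link), which also gives you a stable key for deduplicating properties.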
What the output looks like
The JSON file holds one object per property, each with the name, price, public review score, location, and listing link. The values below are illustrative; your run reflects whatever is live for your search dates.
    [
      {
        "name": "Hotel Zephyr San Francisco",
        "price": "US$592",
        "rating": "8.3",
        "location": "Fisherman's Wharf, San Francisco",
        "link": "https://www.booking.com/hotel/us/zephyr-san-francisco.html"
      },
      {
        "name": "Club Quarters Hotel Embarcadero, San Francisco",
        "price": "US$554",
        "rating": "8.0",
        "location": "Financial District, San Francisco",
        "link": "https://www.booking.com/hotel/us/club-quarters-san-francisco.html"
      }
    ]
The CSV mirrors the same property rows with a header line, so it drops straight into Excel, Google Sheets, or any data pipeline that reads delimited files.
    name,price,rating,location,link
    "Hotel Zephyr San Francisco","US$592","8.3","Fisherman's Wharf, San Francisco","https://www.booking.com/hotel/us/zephyr-san-francisco.html"
    "Club Quarters Hotel Embarcadero, San Francisco","US$554","8.0","Financial District, San Francisco","https://www.booking.com/hotel/us/club-quarters-san-francisco.html"
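Prices arrive as display strings such as "US$592", which sort and compare poorly. For rate benchmarking you will probably want a numeric amount alongside the raw text. The sketch below assumes the currency label precedes the number and that the formatting is US-style (commas group thousands, a dot marks decimals); other locales would need different rules. parsePrice is a name introduced for this example.

```javascript
// Split a displayed price such as "US$592" into a currency label and a number.
// Assumes US-style formatting: commas group thousands, a dot marks decimals.
function parsePrice(text) {
  const match = String(text).match(/^\s*(\D*?)\s*(\d[\d,]*(?:\.\d+)?)\s*$/);
  if (!match) return null; // e.g. "Price not available"
  const amount = Number(match[2].replace(/,/g, ''));
  return Number.isFinite(amount) ? { currency: match[1], amount } : null;
}

console.log(parsePrice('US$592')); // { currency: 'US$', amount: 592 }
console.log(parsePrice('Price not available')); // null
```

Keeping the original string in the record and storing the parsed amount in a separate field makes it easy to audit any value the parser gets wrong.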
Handle pagination
One search page is a demo; a real job pulls every page of results. Booking.com paginates its search URLs with an offset parameter that skips a fixed number of properties per page, so you can loop over offsets, fetch each page through the Crawling API, parse it with the same function, and stop when a page returns no cards. Because every results page shares the same card structure, the parser you already wrote works across all of them without changes.
    async function scrapeAllPages(baseUrl, maxPages) {
      const allProperties = [];
      const perPage = 25;

      for (let page = 0; page < maxPages; page++) {
        // Booking.com skips results with an offset parameter
        const pageUrl = `${baseUrl}&offset=${page * perPage}`;
        const html = await crawl(pageUrl);
        if (!html) break;

        const properties = parseDataFromHTML(html);
        if (properties.length === 0) break; // no more results

        allProperties.push(...properties);
        console.log(`Page ${page + 1}: ${properties.length} properties`);

        // Pace requests so you stay under the rate limit
        await new Promise((r) => setTimeout(r, 2000));
      }

      return allProperties;
    }
The exact page size and parameter can change, so check a couple of real "next page" links in your browser and match the pattern. The important habits carry over to any target: loop until the results run out, and put a short delay between requests so you are not hammering the site. For more on rendered, JavaScript-heavy pages like this one, see our guide to crawling JavaScript websites.
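Appending &offset= by string concatenation works only when the base URL already has a query string and no offset of its own. A slightly sturdier option is to set the parameter through the standard URL API. This sketch keeps the offset parameter name and the page size of 25 from scrapeAllPages, both of which you should confirm against the live site; buildPageUrl is a helper name introduced here.

```javascript
// Build a paginated search URL by setting the offset query parameter.
// perPage = 25 mirrors the value used in scrapeAllPages; confirm it on the live site.
function buildPageUrl(baseUrl, page, perPage = 25) {
  const url = new URL(baseUrl);
  // set() replaces an existing offset instead of appending a duplicate
  url.searchParams.set('offset', String(page * perPage));
  return url.toString();
}
```

Inside the loop, pageUrl would then become buildPageUrl(baseUrl, page, perPage), and a base URL copied from page two of the results no longer produces two conflicting offset values.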
Staying unblocked
Even with rendering handled, Booking.com watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.
- Pace your requests. Introduce a delay between page fetches rather than hammering the search in a tight loop. Spreading requests out is one of the most effective ways to stay under Booking.com's rate limits.
- Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a limit or a CAPTCHA. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
- Read the status codes. A run that starts returning challenges or non-200 responses is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
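Backing off can be sketched as a retry wrapper that waits longer after each failed attempt. The version below is a generic exponential-backoff pattern, not a Crawling API feature; fetchWithBackoff and its options are names introduced here, and it assumes a fetch function shaped like crawl(), which resolves to HTML or null.

```javascript
// Retry an async fetch with exponential backoff when it returns null or throws.
// fetchFn is any function shaped like crawl(): it resolves to HTML or null.
async function fetchWithBackoff(
  fetchFn,
  url,
  { retries = 3, baseDelayMs = 2000 } = {},
) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const html = await fetchFn(url);
      if (html) return html;
    } catch (error) {
      console.error(`Attempt ${attempt + 1} failed: ${error.message}`);
    }
    if (attempt < retries) {
      // Double the wait after every failed attempt: 2s, 4s, 8s with the defaults
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  return null; // give up after the last retry
}
```

In the pagination loop you could call fetchWithBackoff(crawl, pageUrl) in place of crawl(pageUrl). If it still returns null after the last retry, stopping the run is usually a better response than retrying faster.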
For the broader playbook, see how to scrape websites without getting blocked. If you want similar listing data from another travel platform, the same fetch-then-parse pattern carries straight over to scraping Airbnb prices and to scraping Expedia with JavaScript. To compare those rates across sites, our guide to web scraping for price intelligence walks through the analysis side.
Is it legal to scrape Booking.com?
Whether scraping Booking.com is allowed depends on Booking.com's terms of service, your jurisdiction, and what you do with the data. Booking.com's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Booking.com's Terms of Service and its robots.txt, respect any rate expectations they state, and treat both as the boundary for what you collect.
This guide is deliberately scoped to public hotel listing data: the property name, displayed price, aggregate review score, neighbourhood, and listing link that anyone can see on a search page without logging in. That is factual business-listing information about a property, not personal data. Other content is out of scope: individual guest reviews and the people who wrote them are personal data, anything behind a login or a booking flow is off limits, and property photos and descriptions are copyrighted material you should not redistribute wholesale. If any personal data does enter the picture, privacy law such as GDPR and CCPA applies, so keep your collection to the aggregate, non-personal fields above.
If your project needs more than public listings, the right path is a sanctioned one, not a cleverer scraper. Booking.com runs official affiliate and partner programs, including a Demand API, that expose property, availability, and pricing data under clear terms with attribution rules and defined commercial rights. Those are the correct tools when you need large volumes, guaranteed structure, or the right to reuse the data commercially. When you are unsure whether a use is allowed, get permission or a data agreement rather than assuming silence is consent.
Key takeaways
- Booking.com renders listings client-side and blocks hard. A plain request returns an empty shell or a CAPTCHA, so you must render the page behind a trusted IP, using the JavaScript token, before you parse it.
- The Crawling API does both in one call. It renders the page in a real browser, rotates residential IPs, and handles CAPTCHAs, returning finished HTML you parse with Cheerio.
- Cheerio extracts the fields. Select every [data-testid="property-card"], then read name, price, review score, location, and listing link, and expect the generated class names to drift.
- Paginate and export. Loop over Booking.com's offset parameter until results run out, pace your requests, and write structured records to both JSON and CSV.
- Stay on public data. Collect public hotel listings only, treat individual guest reviews and reviewers as personal data, respect ToS and robots.txt, and prefer Booking.com's official partner API for volume or commercial use.
Frequently Asked Questions (FAQs)
Can I build a Booking.com scraper in a language other than JavaScript?
Yes. This guide uses JavaScript with Cheerio, but the same approach works in any language. The Crawling API has libraries and SDKs for several languages, so you fetch the rendered HTML the same way and parse it with whatever HTML parser your stack prefers, such as BeautifulSoup in Python. The selectors and fields stay the same; only the parsing syntax changes.
Why does a plain request return incomplete data from Booking.com?
Because Booking.com renders its property list client-side with JavaScript and challenges automated traffic with CAPTCHAs. A raw HTTP request from a datacenter IP usually returns an empty shell or a block page rather than the property cards. To get a complete page you have to render it behind a trusted IP, which is what the Crawling API handles for you when you use the JavaScript token.
My selectors return empty values. What changed?
Almost certainly Booking.com's markup. Its generated class names like a3b8729ab1 change without notice, and even the data-testid hooks get renamed sometimes, so selectors that worked last month can break. Re-inspect a live page in your browser's dev tools, update the selectors in parseDataFromHTML, and you are back in business. Periodic selector maintenance is normal for any production scraper.
Will I get blocked while scraping Booking.com?
You can, if you send too many requests too fast from one address. The Crawling API reduces that risk by rotating through residential IPs and handling CAPTCHAs for you, but you should still pace your requests, add delays between pages, and watch the status codes so you can back off when challenges appear. Those habits matter on any hard commercial target.
Can I scrape individual guest reviews and reviewer names?
That is out of scope for this guide, and for good reason. Individual reviews and the people who wrote them are personal data, which pulls in privacy law like GDPR and CCPA. Use the aggregate guest review score as a signal about a property, do not build profiles of individual reviewers, and do not republish a person's review tied to their identity. For anything beyond public listings, use Booking.com's official partner program.
Does Booking.com have an official API?
Yes. Booking.com runs official affiliate and partner programs, including a Demand API, that expose property, availability, and pricing data under clear terms, with attribution rules and defined commercial rights. If you need large volumes, guaranteed structure, or the right to reuse the data commercially, that sanctioned route is the correct one. This public-data scraper is best for research, prototyping, and smaller-scale analysis where an official agreement is not warranted.
