Airbnb is one of the largest short-stay platforms on the web, and its public listing pages carry exactly the structured fields that drive market research, price tracking, and travel comparison: the listing title, the price per night, the rating and review count, the location, and the amenities a place advertises. For anyone studying nightly rates across a city or tracking how a market moves, that public listing data is the raw material, and collecting it by hand across dozens of stays is slow and error-prone.

This guide shows you how to scrape Airbnb listings with Python the reliable way. You build a small, runnable scraper that fetches rendered Airbnb search pages through the Crawling API, parses the listing fields you want with BeautifulSoup, handles pagination, and exports clean JSON and CSV. The whole walkthrough stays scoped to public listing data: no host or guest personal information, no individual reviews tied to a named person. The legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Python script that takes a public Airbnb search URL for a location and stay dates, collects the listing cards on each results page, and extracts a structured record per listing. The running example is stays in the United States, but the same approach works for any public search URL. We pull these fields:

  • Title: the listing title shown on the card, such as "Cabin in Woodstock".
  • Price: the price per night as displayed on the listing.
  • Rating: the overall guest rating and the review count next to it.
  • Location: the place name parsed from the listing title.
  • Amenities: the key amenities the card surfaces, such as a pool, wifi, or kitchen.
  • Link: the canonical URL of the listing page.

Every one of those fields is public and non-personal. The scraper never touches a host's name, a guest's profile, private messages, or any review attributed to a named individual.

Why a plain request fails on Airbnb

If you request an Airbnb search URL with a bare HTTP client, you get a response with status 200 and almost none of the listing data in the body. Two things work against you. First, Airbnb renders its search results in the browser through JavaScript, so the initial HTML is a thin shell that fills in only after the page's scripts run. Parse that first response and you capture an empty grid instead of the listing cards. Second, Airbnb flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get rate-limited, IP-blocked, or challenged before they ever reach the rendered content.

So a working Airbnb scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns finished HTML for you to parse.

Why the JS token

Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Airbnb fills its search grid client-side, so you need the JS token here. The normal token returns the same thin shell a plain fetch would, and there is little useful to parse out of it.
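
A quick way to tell the thin shell from a rendered page is to count listing-card containers in whatever HTML came back. The sketch below uses only the standard library and assumes the cards carry itemprop="itemListElement", the container marker this guide uses for its selectors later on. count_cards is a hypothetical helper for illustration, not part of the Crawling API.

```python
from html.parser import HTMLParser


class CardCounter(HTMLParser):
    """Counts start tags whose itemprop attribute marks a listing card."""

    def __init__(self):
        super().__init__()
        self.cards = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if dict(attrs).get("itemprop") == "itemListElement":
            self.cards += 1


def count_cards(html):
    """Return how many listing-card containers appear in the HTML string."""
    counter = CardCounter()
    counter.feed(html)
    return counter.cards


shell = '<html><body><div id="site-content"></div></body></html>'
print(count_cards(shell))  # the empty shell contains no card containers
```

A count of zero on a response that otherwise reports success suggests the page was not rendered, or that the container marker has changed.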

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to the parsing side, the BeautifulSoup guide is a good companion to this tutorial.

Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda, and make sure Python is on your PATH.

A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token from the account docs page. Crawlbase includes 1,000 free requests to start, which is plenty for working through this guide, and you pay only for successful requests. Treat the token like a password: it authenticates your requests, so keep it out of version control.
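
One common way to keep the token out of version control is to read it from an environment variable instead of pasting it into the script. This is a minimal sketch: the variable name CRAWLBASE_JS_TOKEN and the load_token helper are assumptions made for illustration, not names Crawlbase defines.

```python
import os

# Assumed variable name for this example; any name works if you use it consistently.
TOKEN_ENV_VAR = "CRAWLBASE_JS_TOKEN"


def load_token(env=os.environ):
    """Return the JS token from the environment, or raise if it is missing."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before running the scraper")
    return token
```

With that in place, the client in the steps below could be created as CrawlingAPI({"token": load_token()}) rather than with a hard-coded string.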

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the libraries the scraper needs.

bash
python --version

python -m venv airbnb_env
source airbnb_env/bin/activate

pip install crawlbase beautifulsoup4

On Windows, activate the environment with airbnb_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull out individual fields by CSS selector. Both json and csv ship with the standard library, so there is nothing more to install for the export step.

Step 1: Fetch a rendered Airbnb search page

Start by getting a finished page. Import the CrawlingAPI class, initialize it with your JS token, and request an Airbnb search URL. Airbnb loads results asynchronously, so pass ajax_wait and page_wait to hold for the dynamic content before the page is captured. Checking the Crawlbase pc_status before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

OPTIONS = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/122.0",
    "ajax_wait": "true",
    "page_wait": 5000,
}

def crawl(page_url):
    response = api.get(page_url, OPTIONS)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['headers']['pc_status']}")
    return None

if __name__ == "__main__":
    search_url = "https://www.airbnb.com/s/United-States/homes?checkin=2026-07-10&checkout=2026-07-12&adults=2"
    html = crawl(search_url)
    print(html[:500] if html else "No HTML returned")

The two wait options matter for a client-rendered target like Airbnb. ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so late-rendering cards appear before the page is captured. Five seconds is a reasonable start; raise it if the results come back thin. The search URL carries the location, check-in and check-out dates, and the number of adults, just as Airbnb's own search does. Save the script as airbnb_scraper.py, run it with python airbnb_scraper.py, and you should see real Airbnb search markup, not the shell a plain request returns. That confirms rendering works before you write a single selector.
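
If you plan to query several locations or date ranges, it can help to build the search URL from its parts rather than editing the string by hand. The sketch below reproduces the example URL above. build_search_url is a hypothetical helper, and the slug rule (spaces replaced with hyphens) is an assumption inferred from that one example URL, so check it against a URL copied from your browser.

```python
from urllib.parse import quote, urlencode


def build_search_url(location, checkin, checkout, adults=2):
    """Assemble a public search URL from a location name and stay details."""
    # Assumed slug convention: "United States" becomes "United-States".
    slug = quote(location.strip().replace(" ", "-"))
    query = urlencode({"checkin": checkin, "checkout": checkout, "adults": adults})
    return f"https://www.airbnb.com/s/{slug}/homes?{query}"


print(build_search_url("United States", "2026-07-10", "2026-07-12"))
```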

Step 2: Inspect the listing cards and find the selectors

With a finished page in hand, the next step is to find where each field lives. Open the same search URL in your browser, right-click a listing card, and choose Inspect to open the dev tools. Airbnb wraps each result in an item element inside its site content area, and the title, rating, and price each sit in predictable spots within that card.

From the legacy markup, the listing container and its inner fields map to these selectors. They are a starting template: Airbnb's generated class names rotate, so re-inspect a live page whenever a field comes back empty.

  • Listing container: div#site-content div[itemprop="itemListElement"]
  • Title: div[data-testid="listing-card-title"]
  • Rating and reviews: span.r1dxllyb inside the card
  • Price per night: div._i5duul span.a8jt5op
  • Link: the href on the card's anchor, joined to the Airbnb host

Step 3: Parse the listing fields

Load the rendered HTML into BeautifulSoup, iterate over each listing container, and pull the fields with the selectors above. Each lookup is guarded so a missing field returns None instead of crashing the run. The title doubles as a source for the location: Airbnb writes its card titles as "Type in Place", so the text after "in" is the location.

python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

CARD = 'div#site-content div[itemprop="itemListElement"]'

def text_of(node, selector):
    el = node.select_one(selector)
    return el.get_text(strip=True) if el else None

def location_from_title(title):
    if title and " in " in title:
        return title.split(" in ", 1)[1]
    return None

def amenities_of(node):
    spans = node.select('div[data-testid="listing-card-subtitle"] span')
    items = [s.get_text(strip=True) for s in spans]
    return [a for a in items if a]

def parse_card(node):
    title = text_of(node, 'div[data-testid="listing-card-title"]')
    anchor = node.select_one("a")
    href = anchor["href"] if anchor and anchor.get("href") else None
    return {
        "title": title,
        "price": text_of(node, 'div._i5duul span.a8jt5op'),
        "rating": text_of(node, 'span.r1dxllyb'),
        "location": location_from_title(title),
        "amenities": amenities_of(node),
        "link": urljoin("https://www.airbnb.com", href) if href else None,
    }

def scrape_page(html):
    soup = BeautifulSoup(html, "html.parser")
    return [parse_card(node) for node in soup.select(CARD)]

The text_of helper queries one element and returns its stripped text, or None when the element is absent, so a card that omits a field does not break the loop. The rating selector picks up the combined rating and review count Airbnb renders together, for example "4.99 (85)". location_from_title reads the place from the card title, and amenities_of collects the short descriptors Airbnb shows in the card subtitle. The anchor's href is relative, so urljoin turns it into a full listing URL. Notice what is absent: nothing here reads a host name, a host profile, or any guest's review text. The card surfaces only public listing attributes, and that is all the parser collects.
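
Because the rating comes back as one combined string, you may want to split it into a numeric rating and a review count before analysis. The sketch below assumes the "4.99 (85)" shape described above; split_rating is a hypothetical helper, and any string that does not match that shape (a card with no rating yet, for example) returns a pair of None values.

```python
import re

# Matches a decimal rating followed by a review count in parentheses.
RATING_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\((\d[\d,]*)\)")


def split_rating(text):
    """Split '4.99 (85)' into (4.99, 85); return (None, None) if it does not match."""
    match = RATING_RE.match(text or "")
    if not match:
        return None, None
    rating = float(match.group(1))
    reviews = int(match.group(2).replace(",", ""))
    return rating, reviews


print(split_rating("4.99 (85)"))
```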

Selectors drift

Airbnb's generated class names like r1dxllyb and a8jt5op change without notice. Treat the selectors here as a starting template, not a contract. When a field comes back empty, re-inspect the live card in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
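
One way to notice selector drift early is to measure how often each field comes back empty across a batch of parsed records. This is a sketch under assumptions: empty_field_rates and drifted_fields are hypothetical helpers, and the 50% threshold is an arbitrary starting point rather than a recommended value.

```python
def empty_field_rates(records):
    """Return {field: share of records where the field is None or empty}."""
    if not records:
        return {}
    total = len(records)
    return {
        field: sum(1 for r in records if not r.get(field)) / total
        for field in records[0].keys()
    }


def drifted_fields(records, threshold=0.5):
    """List fields that are empty in more than `threshold` of the records."""
    return [f for f, rate in empty_field_rates(records).items() if rate > threshold]


sample = [
    {"title": "Cabin in Woodstock", "price": None},
    {"title": "Farm stay in Kalispell", "price": None},
]
print(drifted_fields(sample))  # price is empty on every record here
```

A field that is empty on most cards is a stronger hint of a changed selector than a field missing on one or two.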

Step 4: Handle pagination across search pages

One search page is a slice of the result set. Airbnb paginates with an items_offset query parameter, advancing the offset by the page size (18 cards per page) to step through results. The code below steps the offset itself and stops at a max_pages ceiling so a large market does not run away; if the pagination nav is present in the rendered page, you can read the next offset from it instead. A small retry wrapper around the fetch keeps a single slow page from ending the run.

python
import time

PAGE_SIZE = 18

def fetch_html(page_url, max_retries=2):
    for attempt in range(max_retries + 1):
        html = crawl(page_url)
        if html:
            return html
        if attempt < max_retries:
            print(f"Retrying ({attempt + 1}/{max_retries})...")
            time.sleep(1)
    print(f"Unable to fetch {page_url}")
    return None

def collect_all_listings(base_url, max_pages):
    records = []
    for page in range(max_pages):
        offset = page * PAGE_SIZE
        sep = "&" if "?" in base_url else "?"
        page_url = f"{base_url}{sep}items_offset={offset}"
        html = fetch_html(page_url)
        if not html:
            break
        page_records = scrape_page(html)
        if not page_records:
            break
        records.extend(page_records)
        time.sleep(2)
    return records

fetch_html retries a failed fetch up to twice with a short pause, returning the HTML on success and None once it gives up. collect_all_listings walks each page by advancing items_offset, caps the crawl at your max_pages ceiling, and stops early when a page returns no cards (the natural end of the results). The time.sleep(2) between pages paces the run so you are not hammering the site.
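
The loop above appends items_offset to the URL as plain text, which works when the base URL does not already carry that parameter. If you paste in a URL copied from page two or three of a search, a safer variant is to replace any existing offset instead of adding a second one. The sketch below does that with the standard library; with_offset is a hypothetical helper, not part of the script above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_offset(url, offset):
    """Return `url` with items_offset set to `offset`, replacing any existing value."""
    parts = urlsplit(url)
    # Keep every other query parameter in its original order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "items_offset"
    ]
    query.append(("items_offset", str(offset)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_offset("https://www.airbnb.com/s/United-States/homes?adults=2&items_offset=18", 36))
```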

Step 5: Assemble the full script

Now wire the pieces into one runnable script: collect listings across pages, then export the records to both JSON and CSV.

python
import csv
import json
import time
from urllib.parse import urljoin
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

OPTIONS = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/122.0",
    "ajax_wait": "true",
    "page_wait": 5000,
}

CARD = 'div#site-content div[itemprop="itemListElement"]'
PAGE_SIZE = 18

def crawl(page_url):
    response = api.get(page_url, OPTIONS)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['headers']['pc_status']}")
    return None

def fetch_html(page_url, max_retries=2):
    for attempt in range(max_retries + 1):
        html = crawl(page_url)
        if html:
            return html
        if attempt < max_retries:
            time.sleep(1)
    return None

def text_of(node, selector):
    el = node.select_one(selector)
    return el.get_text(strip=True) if el else None

def location_from_title(title):
    if title and " in " in title:
        return title.split(" in ", 1)[1]
    return None

def amenities_of(node):
    spans = node.select('div[data-testid="listing-card-subtitle"] span')
    items = [s.get_text(strip=True) for s in spans]
    return [a for a in items if a]

def parse_card(node):
    title = text_of(node, 'div[data-testid="listing-card-title"]')
    anchor = node.select_one("a")
    href = anchor["href"] if anchor and anchor.get("href") else None
    return {
        "title": title,
        "price": text_of(node, 'div._i5duul span.a8jt5op'),
        "rating": text_of(node, 'span.r1dxllyb'),
        "location": location_from_title(title),
        "amenities": amenities_of(node),
        "link": urljoin("https://www.airbnb.com", href) if href else None,
    }

def scrape_page(html):
    soup = BeautifulSoup(html, "html.parser")
    return [parse_card(node) for node in soup.select(CARD)]

def collect_all_listings(base_url, max_pages):
    records = []
    for page in range(max_pages):
        offset = page * PAGE_SIZE
        sep = "&" if "?" in base_url else "?"
        html = fetch_html(f"{base_url}{sep}items_offset={offset}")
        if not html:
            break
        page_records = scrape_page(html)
        if not page_records:
            break
        records.extend(page_records)
        time.sleep(2)
    return records

def save_outputs(records):
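    # Explicit UTF-8 so non-ASCII titles or currency symbols write the same on every OS.
    with open("airbnb_listings.json", "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
    if not records:
        return
    with open("airbnb_listings.csv", "w", newline="", encoding="utf-8") as f: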
        writer = csv.DictWriter(f, fieldnames=records[0].keys())
        writer.writeheader()
        for r in records:
            row = {**r, "amenities": ", ".join(r["amenities"])}
            writer.writerow(row)

def main():
    search_url = "https://www.airbnb.com/s/United-States/homes?checkin=2026-07-10&checkout=2026-07-12&adults=2"
    records = collect_all_listings(search_url, max_pages=2)
    save_outputs(records)
    print(f"Saved {len(records)} listings")

if __name__ == "__main__":
    main()

The script collects listings across up to two search pages, parses each card into a record, and paces the loop with a two-second sleep. save_outputs writes both a JSON file and a CSV; for the CSV it flattens the amenities list into a comma-separated string so the column stays readable. Adjust max_pages and the search URL to fit your target location and dates.
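
If the same listing shows up on more than one results page, the export will contain it twice. Whether that happens depends on the search, so treat this as an optional step: the sketch below drops repeats by listing link before saving. dedupe_by_link is a hypothetical helper, and it assumes a repeated listing has an identical link string each time; links that differ only in their query string would need normalizing first.

```python
def dedupe_by_link(records):
    """Keep the first record for each link; records without a link are all kept."""
    seen = set()
    unique = []
    for record in records:
        link = record.get("link")
        if link is not None:
            if link in seen:
                continue  # already kept a record for this listing
            seen.add(link)
        unique.append(record)
    return unique


rows = [
    {"title": "Cabin in Woodstock", "link": "https://www.airbnb.com/rooms/12345678"},
    {"title": "Cabin in Woodstock", "link": "https://www.airbnb.com/rooms/12345678"},
]
print(len(dedupe_by_link(rows)))
```

In main, this would sit between collect_all_listings and save_outputs.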

What the output looks like

Run the full script with python airbnb_scraper.py and you get a clean structured record per listing, ready for analysis, a database, or a spreadsheet. The titles, ratings, and prices below mirror the shape Airbnb renders on its cards.

json
[
  {
    "title": "Cabin in Woodstock",
    "price": "$70 per night",
    "rating": "4.9 (41)",
    "location": "Woodstock",
    "amenities": ["Wifi", "Kitchen", "Free parking"],
    "link": "https://www.airbnb.com/rooms/12345678"
  },
  {
    "title": "Farm stay in Kalispell",
    "price": "$199 per night",
    "rating": "5.0 (161)",
    "location": "Kalispell",
    "amenities": ["Wifi", "Pool", "Kitchen"],
    "link": "https://www.airbnb.com/rooms/23456789"
  }
]

The matching CSV carries the same columns, one row per listing, which drops straight into pandas or any spreadsheet for filtering by price band, rating, or location. If your goal is rate tracking specifically, the companion guide on scraping Airbnb prices with Python goes deeper on the price field, and price intelligence with web scraping covers what to do with the numbers once you have them.
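
Filtering by price band needs a number, and the scraper stores the price as displayed text. The sketch below pulls the first number out of a string like "$70 per night" and filters records on it. Both helpers are hypothetical, and the parsing assumes US-style formatting (comma as thousands separator, dot as decimal point); other locales would need a different rule.

```python
import re


def price_to_number(text):
    """Pull the first number out of a price string such as '$1,250 per night'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text or "")
    return float(match.group(0).replace(",", "")) if match else None


def in_price_band(records, low, high):
    """Keep records whose parsed price falls within [low, high]."""
    kept = []
    for record in records:
        price = price_to_number(record.get("price"))
        if price is not None and low <= price <= high:
            kept.append(record)
    return kept


listings = [
    {"title": "Cabin in Woodstock", "price": "$70 per night"},
    {"title": "Farm stay in Kalispell", "price": "$199 per night"},
]
print([r["title"] for r in in_price_band(listings, 50, 100)])
```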

Staying unblocked at scale

Even with rendering handled, Airbnb watches for scraper-shaped traffic. A few habits keep a longer run healthy, and they apply to any hard commercial target.

  • Pace your requests. Hammering search pages in a tight loop is the fastest way to get throttled or challenged. The two-second sleeps above are the floor, not the ceiling; widen them for larger jobs and vary your targets instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning non-200 pc_status values is telling you the current rate or IP tier is no longer enough. Treat that as a signal to back off, not noise to ignore.
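
Backing off when failures appear can be sketched as an exponential delay with jitter: each retry waits roughly twice as long as the last, with some randomness so repeated runs do not fall into a fixed rhythm. backoff_delay is a hypothetical helper, and the base and cap values are illustrative assumptions, not Crawlbase recommendations.

```python
import random


def backoff_delay(attempt, base=2.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    ceiling = min(cap, base * (2 ** attempt))
    # "Equal jitter": half the ceiling is fixed, the other half is random.
    return ceiling / 2 + rng() * (ceiling / 2)


for attempt in range(4):
    print(f"retry {attempt}: wait about {backoff_delay(attempt):.1f}s")
```

A retry loop such as fetch_html could pass the result to time.sleep in place of its fixed one-second pause.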

For larger crawls, the async Crawler queues requests and delivers results to a webhook, which suits running many search pages without holding open connections. For the broader playbook, see how to scrape websites without getting blocked. The same two-layer approach carries over to other listing portals, such as scraping Apartments.com.

Is it legal to scrape Airbnb?

Whether scraping Airbnb is allowed depends on Airbnb's terms of service, your jurisdiction, and what you do with the data. Airbnb's Terms of Service restrict automated access, scraping, and the collection of content from the platform, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it only makes the technical part work. Read Airbnb's Terms of Service and its robots.txt, respect the rate limits those imply, and treat both as the boundary for what you collect. Keep your request volume low enough that you are not straining Airbnb's servers.
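
Checking a URL against robots.txt rules can be done with Python's standard urllib.robotparser. The sketch below parses rules from a string so it runs offline; SAMPLE_RULES is made up for illustration and is not Airbnb's actual file, and allowed_by_robots is a hypothetical helper. In practice you would fetch the site's real robots.txt and pass its text in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; fetch the real file from the site.
SAMPLE_RULES = """User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the rules in `robots_txt` permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed_by_robots(SAMPLE_RULES, "https://example.com/private/page"))
print(allowed_by_robots(SAMPLE_RULES, "https://example.com/public/page"))
```

A robots.txt check covers only one of the boundaries named above; the Terms of Service still apply whatever the parser returns.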

The bigger line is personal data. Airbnb listings are user-posted content, which means a page can carry information about real people: hosts and guests. This guide is deliberately scoped to public, non-personal listing fields, the title, price per night, rating, review count, location, amenities, and the listing link, because that is the line that keeps the work defensible. Do not collect host or guest names, profile photos, contact details, or any other personal information, and do not scrape individual reviews tied to a named guest or assemble profiles of hosts. That is personal data, and where it is involved, privacy laws such as the GDPR in the EU and the CCPA in California apply, with their own requirements and penalties. Public availability of a field does not make it free to harvest when it identifies a person.

For anything beyond a small, public, non-personal sample, the right path is an official channel rather than a cleverer scraper. Airbnb runs partner and API programs for permitted integrations, and that is the correct route for commercial or bulk use. When in doubt about a specific use case, get legal advice before you build a product on top of the data. The technical walkthrough above is a way to learn the mechanics on public data, not a license to collect at scale or to touch anything tied to an individual.

Recap

Key takeaways

  • Airbnb is client-side rendered. A plain request returns a thin shell with an empty grid, so you must render the page before you parse it.
  • You need rendering and a trusted IP together. The Crawling API with a JS token does both in one call; ajax_wait and page_wait control how long it waits for content.
  • Parse the card, not the person. Iterate over the itemListElement containers and read title, price, rating with review count, location, amenities, and link, all public and non-personal fields.
  • Paginate and export. Step Airbnb's items_offset parameter up to a ceiling, pace the run with short sleeps, and write the records to JSON and CSV.
  • Stay on public data. Respect Airbnb's ToS and robots.txt, never collect host or guest personal data or named reviews, remember GDPR and CCPA apply to any personal data, and use Airbnb's official or partner API for production.

Frequently Asked Questions (FAQs)

Why does a plain request return an empty Airbnb grid?

Because Airbnb loads its search results client-side with JavaScript. The initial HTML is a shell that fills in only after the page's scripts run in a browser, so a raw HTTP request returns status 200 with no listing cards. To get the full set you have to render the page first, which is what the Crawling API's JS token handles for you.

Do I need the normal token or the JS token for Airbnb?

The JS token. The normal token fetches static HTML, which on Airbnb is the same thin shell a plain fetch returns. The JS token renders the page in a real browser before handing back the HTML, so the listing cards and their fields are present when BeautifulSoup parses them. Credits differ for normal and JavaScript requests, so check your dashboard.

What fields can I scrape from an Airbnb listing?

Public, non-personal listing fields: the listing title, the price per night, the rating and review count, the location, the key amenities, and the listing link. Stay on data that is visible to any visitor without an account, and never collect host or guest names, profiles, contact details, or individual reviews tied to a named person. Those are personal data and fall outside the public-listing scope this guide covers.

My selectors return None. What changed?

Almost certainly Airbnb's markup. Its generated class names (r1dxllyb for the rating, a8jt5op for the price) change without notice, and the listing-card-title test id is not guaranteed to stay stable either, so selectors that worked last month can break. Re-inspect a live card in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

How do I handle pagination across a location's listings?

Airbnb advances results with an items_offset query parameter, stepping by the page size of 18 cards. The collect_all_listings function above increments the offset page by page, caps the crawl at a max_pages ceiling, and stops once a page returns no cards. Keep a short sleep between pages so the run stays polite.

Can I use scraped Airbnb data commercially?

Treat that as a legal question, not a technical one. Airbnb's Terms of Service restrict scraping and reuse, and listings can contain personal data covered by laws like the GDPR and CCPA, so commercial or bulk use generally needs permission. Review the terms, use Airbnb's official or partner API for production, and seek legal advice before building a product on top of the data.
