Homes.com lists property data across the United States, and its search and listing pages carry exactly the structured fields that drive price tracking, market research, and investment analysis: a listing title, the street address, the asking price, beds, baths, square footage, and the link to each property. Pull that into a spreadsheet and you can compare neighborhoods, watch how prices move over time, and spot listings worth a closer look without clicking through hundreds of pages by hand.

This guide shows you how to scrape Homes.com with Python the reliable way. You build a small, runnable scraper that fetches a rendered search page through the Crawling API, parses the fields you want with BeautifulSoup, walks the pagination, and writes the results to JSON and CSV. The whole walkthrough stays scoped to public listing data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Python script that takes a public Homes.com search URL, retrieves the rendered HTML through the Crawling API, walks several result pages, and extracts a structured record per listing. We will use a single city search as the running example and pull these fields:

  • Title: the listing type, like "House for Rent" or "Condo for Rent".
  • Address: the street address of the property.
  • Price: the asking price or monthly rent shown on the card.
  • Beds: the number of bedrooms.
  • Baths: the number of bathrooms.
  • Size: the square footage, where the listing reports it.
  • Link: the absolute URL to the full property page.

Why a plain request fails on Homes.com

If you request a Homes.com search URL with a bare HTTP client, you get a response with status 200 and almost none of the listing data in the body. Two things work against you. First, Homes.com renders much of its content in the browser with JavaScript, so the initial HTML is a thin shell that only fills in after the page's scripts run. Second, the site flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get rate-limited, challenged, or served a captcha before they ever reach the rendered content.

So a working Homes.com scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns finished HTML for you to parse. For more on why dynamic sites need this, see how to crawl JavaScript websites.

Why the JS token

Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Homes.com fills its listing fields client-side, so you need the JS token here. Using the normal token returns the same empty shell a plain fetch would, and there is nothing useful to parse out of it.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to the language, the "scrape a website with Python" walkthrough covers the basics this tutorial assumes.

Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda.

A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token from the account docs page. The first 1,000 requests are free and no card is required. Treat the token like a password: it authenticates your requests, so keep it out of version control.
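One way to keep the token out of version control is to read it from an environment variable instead of pasting it into the script. The sketch below assumes a variable named CRAWLBASE_JS_TOKEN and a helper called load_token; both names are illustrative choices, not something Crawlbase requires.

```python
import os


def load_token(env_var="CRAWLBASE_JS_TOKEN"):
    """Return the token stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable before running the scraper.")
    return token
```

With that in place, you could initialize the client with CrawlingAPI({"token": load_token()}) and set the variable in your shell rather than in the file.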

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs.

bash
python --version

python -m venv homes_scraping_env
source homes_scraping_env/bin/activate

pip install crawlbase beautifulsoup4

On Windows, activate the environment with homes_scraping_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull out individual fields by CSS selector. If you have not used the parser before, the BeautifulSoup guide is a good companion to this tutorial.

Step 1: Fetch the rendered search page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JS token, and request the search URL. The two wait options matter for a client-rendered target: ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so late-rendering elements appear before the page is captured. Checking the status before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI

crawling_api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

options = {
    "ajax_wait": "true",
    "page_wait": 10000,
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
}

def make_crawlbase_request(url):
    response = crawling_api.get(url, options)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Failed to fetch the page. Crawlbase status: {response['headers']['pc_status']}")
    return None

if __name__ == "__main__":
    url = "https://www.homes.com/los-angeles-ca/homes-for-rent/p1/"
    html = make_crawlbase_request(url)
    print(html[:500] if html else "No HTML returned")

The function reads response["headers"]["pc_status"], the per-request status the Crawling API returns alongside the body, and only hands back HTML when it reads "200". Ten seconds of page_wait is a reasonable start for Homes.com; raise it if the listing fields come back empty. Save the script as homes_scraper.py, run it with python homes_scraper.py, and you should see real search markup, not the empty shell a plain fetch returns. That confirms rendering works before you write a single selector.
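Eyeballing the first 500 characters works, but a quick count of listing cards is a less ambiguous check. This is a minimal sketch that assumes the card class Step 2 relies on, div.for-rent-content-container, still matches the live page; count_listing_cards is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Card selector used in Step 2; an assumption that can go stale when the markup changes.
CARD_SELECTOR = "div.for-rent-content-container"


def count_listing_cards(html):
    """Return how many listing cards the HTML contains (0 for an empty shell)."""
    if not html:
        return 0
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.select(CARD_SELECTOR))
```

A count of zero on a page you know has results suggests either the page was not rendered long enough or the selector has drifted.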

Crawlbase Crawling API

That single make_crawlbase_request call is doing the hard part for you. Homes.com needs a rendered page behind a trusted IP, and the Crawling API takes your JS token, runs the page in a real browser, rotates through residential IPs server-side, and hands back finished HTML, so you skip running a headless browser fleet and a proxy pool yourself. Point it at a public search page on the free tier first.

Step 2: Parse the listing cards with BeautifulSoup

With rendered HTML in hand, load it into BeautifulSoup and pull each card. Inspect a Homes.com search page in your browser's dev tools and you will find each listing wrapped in a div with the class for-rent-content-container. Select all of those, then read the individual fields off each one. The title sits in a p.property-name, the address in a p.address, and the price, beds, baths, and size come from the li items inside ul.detailed-info-container, in that order.

python
from bs4 import BeautifulSoup

BASE_URL = "https://www.homes.com"

def parse_listings(html):
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select("div.for-rent-content-container")
    properties = []

    for card in cards:
        title_elem = card.select_one("p.property-name")
        address_elem = card.select_one("p.address")
        info_container = card.select_one("ul.detailed-info-container")
        info = info_container.find_all("li") if info_container else []
        link_elem = card.select_one("a")

        properties.append({
            "title": title_elem.text.strip() if title_elem else "N/A",
            "address": address_elem.text.strip() if address_elem else "N/A",
            "price": info[0].text.strip() if len(info) > 0 else "N/A",
            "beds": info[1].text.strip() if len(info) > 1 else "N/A",
            "baths": info[2].text.strip() if len(info) > 2 else "N/A",
            "size": info[3].text.strip() if len(info) > 3 else "N/A",
            "link": BASE_URL + link_elem["href"] if link_elem and link_elem.get("href") else "N/A",
        })

    return properties

Each guard returns "N/A" instead of throwing when an element is missing, so one absent field does not crash the run. The detail row is positional: Homes.com lays price, beds, baths, and size out as ordered li items, so the code reads them by index and bounds-checks the length first. The link comes off the card's anchor as a relative path, so prefixing BASE_URL gives you an absolute URL you can follow straight to the property page.
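Plain string concatenation assumes every href is a relative path that starts with a slash. If a card ever carries an absolute URL, the prefix would be doubled. A slightly more defensive sketch uses urljoin from the standard library; absolute_link is a hypothetical helper, not part of the original script.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.homes.com"


def absolute_link(href, base=BASE_URL):
    """Resolve an href against the site root; absolute URLs pass through unchanged."""
    if not href:
        return "N/A"
    return urljoin(base, href)
```

You could swap this in for the BASE_URL + link_elem["href"] expression if you see malformed links in your output.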

Selectors drift

Homes.com class names (the for-rent-content-container card, the property-name and address fields, the detailed-info-container rows) change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back as "N/A", re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
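To notice drift without reading every record, you can measure how often each field falls back to the placeholder. The sketch below is one possible check; missing_field_rates is an illustrative name, and the threshold you alert on is up to you.

```python
def missing_field_rates(properties, placeholder="N/A"):
    """Return, per field, the fraction of records where the value is the placeholder."""
    if not properties:
        return {}
    total = len(properties)
    fields = properties[0].keys()
    return {
        field: sum(1 for record in properties if record.get(field, placeholder) == placeholder) / total
        for field in fields
    }
```

A field that suddenly reports a rate near 1.0 across a whole run is a strong hint that its selector no longer matches.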

Step 3: Walk the pagination

One page is a demo; a real job runs over the whole result set. Homes.com appends a page segment to the search path, so the listings for a city live at .../homes-for-rent/p1/, .../p2/, and so on. Loop over a fixed number of pages, fetch each one through the same request function, parse its cards, and collect everything into a single list. A short pause between pages keeps the run from hammering the site.

python
import time

SEARCH_URL = "https://www.homes.com/los-angeles-ca/homes-for-rent"
MAX_PAGES = 3

def scrape_search():
    properties = []
    for page in range(1, MAX_PAGES + 1):
        url = f"{SEARCH_URL}/p{page}/"
        print(f"Scraping page {page}: {url}")
        html = make_crawlbase_request(url)
        if html:
            properties.extend(parse_listings(html))
        time.sleep(2)
    return properties

The time.sleep(2) between pages is deliberate: it paces the run so you are not sending requests in a tight loop, which is one of the simplest habits for staying unblocked. Adjust MAX_PAGES and the city slug in SEARCH_URL to fit your target. To scrape rentals in Chicago instead, swap in chicago-il; to scrape homes for sale, change homes-for-rent to the for-sale path, and re-check the card selector there, since for-rent-content-container is named for rentals and may not match for-sale cards.
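A fixed page count keeps requesting pages even after the results run out. A variation, sketched here, stops at the first page that yields no cards. It takes the fetch and parse functions as arguments so it can be tried without network access; scrape_pages and its parameters are hypothetical names, and treating an empty page as the end of the results is an assumption.

```python
import time


def scrape_pages(search_url, fetch, parse, max_pages=3, delay=2.0):
    """Walk p1, p2, ... and stop early when a page yields no listings."""
    properties = []
    for page in range(1, max_pages + 1):
        url = f"{search_url.rstrip('/')}/p{page}/"
        html = fetch(url)
        cards = parse(html) if html else []
        if not cards:
            # Failed fetch or empty page: assume the result set has ended.
            break
        properties.extend(cards)
        if page < max_pages:
            time.sleep(delay)  # pause only when another request will follow
    return properties
```

With the functions from this guide, the call would look like scrape_pages(SEARCH_URL, make_crawlbase_request, parse_listings, max_pages=10).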

Step 4: Export to JSON and CSV

With a list of records in hand, write it out in whatever shape your downstream work needs. JSON keeps the structure intact for code that reads it back; CSV drops straight into a spreadsheet for sorting and charting. Two small helpers cover both.

python
import json
import csv

def save_to_json(properties, filename="properties.json"):
    with open(filename, "w") as f:
        json.dump(properties, f, indent=4)

def save_to_csv(properties, filename="properties.csv"):
    if not properties:
        return
    with open(filename, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=properties[0].keys())
        writer.writeheader()
        writer.writerows(properties)

The CSV writer reads its column headers from the keys of the first record, so the columns stay in step with the fields you parse in Step 2. If you add or rename a field there, both exports follow automatically.
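Deriving the header from the first record is convenient, but it depends on every record having the same keys. If you would rather pin the column order explicitly, a variant like the one below is an option. The FIELDS list and the save_to_csv_fixed name are choices made for this sketch; restval and extrasaction are standard csv.DictWriter parameters.

```python
import csv

# Explicit column order; mirrors the fields parsed in Step 2.
FIELDS = ["title", "address", "price", "beds", "baths", "size", "link"]


def save_to_csv_fixed(properties, filename="properties.csv"):
    """Write records with a fixed header, filling gaps and ignoring unknown keys."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=FIELDS,
            restval="N/A",          # value written when a record lacks a field
            extrasaction="ignore",  # silently skip keys not listed in FIELDS
        )
        writer.writeheader()
        writer.writerows(properties)
```

The trade-off is that a new field has to be added to FIELDS by hand before it shows up in the CSV.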

Putting it all together

Here is the complete, runnable scraper: fetch each search page through the Crawling API, parse the cards, walk the pagination, and write the results to JSON and CSV. Drop in your token and run it.

python
import json
import csv
import time
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

crawling_api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

BASE_URL = "https://www.homes.com"
SEARCH_URL = "https://www.homes.com/los-angeles-ca/homes-for-rent"
MAX_PAGES = 3

options = {
    "ajax_wait": "true",
    "page_wait": 10000,
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
}

def make_crawlbase_request(url):
    response = crawling_api.get(url, options)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Failed to fetch the page. Crawlbase status: {response['headers']['pc_status']}")
    return None

def parse_listings(html):
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select("div.for-rent-content-container")
    properties = []

    for card in cards:
        title_elem = card.select_one("p.property-name")
        address_elem = card.select_one("p.address")
        info_container = card.select_one("ul.detailed-info-container")
        info = info_container.find_all("li") if info_container else []
        link_elem = card.select_one("a")

        properties.append({
            "title": title_elem.text.strip() if title_elem else "N/A",
            "address": address_elem.text.strip() if address_elem else "N/A",
            "price": info[0].text.strip() if len(info) > 0 else "N/A",
            "beds": info[1].text.strip() if len(info) > 1 else "N/A",
            "baths": info[2].text.strip() if len(info) > 2 else "N/A",
            "size": info[3].text.strip() if len(info) > 3 else "N/A",
            "link": BASE_URL + link_elem["href"] if link_elem and link_elem.get("href") else "N/A",
        })

    return properties

def scrape_search():
    properties = []
    for page in range(1, MAX_PAGES + 1):
        url = f"{SEARCH_URL}/p{page}/"
        print(f"Scraping page {page}: {url}")
        html = make_crawlbase_request(url)
        if html:
            properties.extend(parse_listings(html))
        time.sleep(2)
    return properties

def save_to_json(properties, filename="properties.json"):
    with open(filename, "w") as f:
        json.dump(properties, f, indent=4)

def save_to_csv(properties, filename="properties.csv"):
    if not properties:
        return
    with open(filename, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=properties[0].keys())
        writer.writeheader()
        writer.writerows(properties)

if __name__ == "__main__":
    listings = scrape_search()
    save_to_json(listings)
    save_to_csv(listings)
    print(f"Saved {len(listings)} listings to properties.json and properties.csv")

What the output looks like

Run the full script with python homes_scraper.py and it writes one clean record per listing to properties.json and properties.csv. The JSON file looks like this:

json
[
    {
        "title": "Condo for Rent",
        "address": "3824 Keystone Ave Unit 2, Culver City, CA 90232",
        "price": "$3,300 per month",
        "beds": "2 Beds",
        "baths": "1.5 Baths",
        "size": "1,100 Sq Ft",
        "link": "https://www.homes.com/los-angeles-ca/homes-for-rent/property/3824-keystone-ave-culver-city-ca-unit-2/2er2mwklw8zq6/"
    },
    {
        "title": "House for Rent",
        "address": "3901 Alonzo Ave, Encino, CA 91316",
        "price": "$17,000 per month",
        "beds": "4 Beds",
        "baths": "3.5 Baths",
        "size": "3,400 Sq Ft",
        "link": "https://www.homes.com/los-angeles-ca/homes-for-rent/property/3901-alonzo-ave-encino-ca/879negnf45nee/"
    }
]

The CSV version of the same data has one row per listing with title, address, price, beds, baths, size, and link columns, which opens cleanly in any spreadsheet for sorting by price or filtering by neighborhood.
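The scraped values are display strings such as "$3,300 per month", so sorting them as text will not order them by amount. A small conversion step helps if you want to sort or filter in Python before exporting. This is a rough sketch; to_number is a hypothetical helper and it assumes each string contains a single leading number in the formats shown above.

```python
import re


def to_number(text):
    """Extract the first number from a display string, or return None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text or "")
    if not match:
        return None
    return float(match.group().replace(",", ""))


def sort_by_price(properties):
    """Return records ordered by numeric price; unparseable prices go last."""
    return sorted(
        properties,
        key=lambda record: (to_number(record.get("price")) is None, to_number(record.get("price")) or 0.0),
    )
```

Ranges or "Contact for price" style values would need their own handling, so check a sample of real output before relying on this.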

Scaling to property pages

The search scraper gives you the card-level fields. When you want the full detail of a single property, follow the link you already captured and parse the property page, which carries richer fields like the lot size, a longer description, and the listing agent's public contact line. The property page uses its own selectors: the address sits in div.property-info-address, the price in span#price, beds and baths in span.feature-beds and span.feature-baths, and the lot size in span.property-info-feature.lotsize. Reuse make_crawlbase_request to fetch the page, then map those selectors the same way you did for the cards. Pace each property fetch with a short sleep, just as the search loop does.
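As a starting point, the property-page selectors named above can be collected in one mapping and applied in a loop. This is a sketch under the assumption that those selectors still match the live page; parse_property_page and the field names in the dictionary are illustrative.

```python
from bs4 import BeautifulSoup

# Selectors taken from the paragraph above; expect them to drift like the card selectors.
PROPERTY_SELECTORS = {
    "address": "div.property-info-address",
    "price": "span#price",
    "beds": "span.feature-beds",
    "baths": "span.feature-baths",
    "lot_size": "span.property-info-feature.lotsize",
}


def parse_property_page(html):
    """Return one record for a property page, with "N/A" for any missing field."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field, selector in PROPERTY_SELECTORS.items():
        elem = soup.select_one(selector)
        record[field] = elem.get_text(" ", strip=True) if elem else "N/A"
    return record
```

To use it, pass the HTML returned by make_crawlbase_request for a listing's link, and keep a sleep between property fetches.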

Staying unblocked

Even with rendering handled, Homes.com watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Hammering pages in a tight loop is the fastest way to get throttled or served a captcha. Spread requests out, as the sleep above does, and vary your targets instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
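Backing off can be as simple as retrying a failed fetch with a growing delay. The sketch below wraps any fetch function that returns HTML or None, such as make_crawlbase_request; fetch_with_backoff, the retry count, and the delay values are assumptions for illustration, not recommended settings.

```python
import time


def fetch_with_backoff(fetch, url, retries=3, base_delay=5.0, sleep=time.sleep):
    """Call fetch(url) until it returns HTML, doubling the wait after each failure."""
    for attempt in range(retries + 1):
        html = fetch(url)
        if html:
            return html
        if attempt < retries:
            # Waits base_delay, then 2x, then 4x, and so on between attempts.
            sleep(base_delay * (2 ** attempt))
    return None
```

If a page still fails after the last retry, the function returns None and the caller can skip it or stop the run.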

For the broader playbook, see how to scrape websites without getting blocked. If you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy (also called the AI Proxy) gives you the same residential IP rotation as a drop-in proxy endpoint. The same approach extends to other real-estate sites: see our guides on scraping Zillow, scraping Redfin, and scraping Realtor.com.

Is scraping Homes.com legal?

Whether scraping Homes.com is allowed depends on Homes.com's terms of service, your jurisdiction, and what you do with the data. The terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read the Homes.com Terms of Service and its robots.txt, respect any stated rate limits, and treat both as the boundary for what you collect.
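Python's standard library can evaluate robots.txt rules for you. The sketch below parses rules from a string so it runs offline; the sample rules in the usage note are made up for illustration and are not Homes.com's actual robots.txt, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, with the invented rules "User-agent: *" and "Disallow: /private/", a URL under /private/ is reported as disallowed while other paths are allowed. In a real run you would download the site's robots.txt first and pass its text in. A passing check covers robots.txt only; the terms of service still apply.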

A few lines worth holding to. Collect only public listing data: the title, address, price, beds, baths, square footage, and the link that anyone can see without an account. Keep your request volume low enough that you are not straining the site's servers. Avoid anything tied to identifiable individuals, including the names and contact details of listing agents, owners, or property managers shown on a page. Those are personal data, and gathering or storing them can bring privacy laws like GDPR and the CCPA into play, so leave them out unless you have a clear lawful basis and a real need.

One more point specific to real estate: much of the underlying listing data originates from Multiple Listing Services (MLS), and that data is frequently licensed under terms that restrict redistribution. If your project needs that depth or any commercial reuse in bulk, the correct path is a licensed MLS feed or an official data agreement, not a cleverer scraper. This guide is deliberately scoped to public listing pages because that is the line that keeps the work defensible. It does not cover anything behind a login, account or saved-search data, agent or owner personal details, or any attempt to bypass authentication. Public listing data only.

Recap

Key takeaways

  • Homes.com is client-side rendered. A plain fetch returns an empty shell, so you must render the page before you parse it.
  • You need rendering and a trusted IP together. The Crawling API with a JS token does both in one call; ajax_wait and page_wait control how long it waits for content.
  • BeautifulSoup does the extraction. Map title, address, price, beds, baths, size, and link to the for-rent-content-container card selectors, and expect those selectors to drift.
  • Walk the pagination, then export. Iterate the p{page} segment, collect every card, pace the run with a short sleep, and write the results to JSON and CSV.
  • Stay on public data. Respect Homes.com's ToS and robots.txt, collect only public listing fields, leave out agent and owner personal details, and use a licensed MLS feed for anything commercial or bulk.

Frequently Asked Questions (FAQs)

Why does a plain request return no data from Homes.com?

Because Homes.com renders its listing content client-side with JavaScript. The initial HTML is a shell that only fills in after the page's scripts run in a browser, so a raw HTTP request returns status 200 with the price, beds, baths, and size fields blank. To get real data you have to render the page first, which is what the Crawling API's JS token handles for you.

Do I need the normal token or the JS token for Homes.com?

The JS token. The normal token fetches static HTML, which on Homes.com is the same empty shell a plain request returns. The JS token renders the page in a real browser before handing back the HTML, so the listing fields are present when BeautifulSoup parses them. The ajax_wait and page_wait options tell the renderer how long to wait for that content.

What data can I scrape from a Homes.com listing?

Public listing fields: the title, the street address, the asking price or monthly rent, the number of beds and baths, the square footage where shown, and the link to the property page. Stay on data that is visible to any visitor without an account, and avoid the personal details of agents or owners, which fall outside the public-listing scope this guide covers.

My selectors return "N/A". What changed?

Almost certainly Homes.com's markup. The for-rent-content-container card, the property-name and address fields, and the detailed-info-container rows change without notice, so selectors that worked last month can break. Re-inspect a live page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

How do I handle pagination across a city's listings?

Homes.com appends a p{page} segment to the search path, so you fetch .../p1/, .../p2/, and so on in a loop, parse the cards on each page, and collect them into one list. Keep a short sleep between requests and stop at your chosen page limit. The scrape_search function above shows the full loop.

How do I avoid getting blocked while scraping Homes.com?

Keep your per-IP request rate low, pace requests with a short delay, vary your targets instead of looping one path, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation and a trusted IP pool for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
