Have you ever tried scraping real estate sites, only to hit a wall of CAPTCHA, rate limits, redirects, or IP bans? It’s like doing all the hard work and getting stuck at the finish line.

Whether you’re a founder building a rental data platform or a developer just trying to gather listings into Excel for insights, roadblocks like these can be deal-breakers.

But what if there were a cleaner way to extract accurate, structured property data without the usual headaches, one already trusted by 70,000+ dev teams? Sounds like a fairy tale, right? Not exactly. Meet Crawlbase, the only tool you’ll ever need for AI-powered web scraping.

Step-by-step Guide to Build a Real Estate Data Scraper

This guide shows how to scrape property listings from two real estate websites, Estately and Re/Max, using the Crawlbase Crawling API. We’ll extract data such as price, number of beds, number of baths, square footage, and address, then export it to an Excel sheet.


You don’t need to manage proxies, CAPTCHAs, or JavaScript rendering; Crawlbase handles all of that for you. So, with that out of the way, let’s get started.

1. Prerequisites

Before you begin, make sure you have:

  • Node.js installed
  • An IDE/Code Editor of your choice
  • A Crawlbase Crawling API token → Get it here (1000 free requests)
  • A basic understanding of JavaScript/Node (just enough to read functions)

2. Install Required Packages

Open your terminal and run:

npm init -y
npm install cheerio exceljs crawlbase

These packages handle:

  • Cheerio: To extract content from HTML, like jQuery
  • ExcelJS: To write listings into an Excel file
  • Crawlbase: To bypass CAPTCHA, blocks, and restrictions
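After installation, your package.json should list all three packages under dependencies. The version numbers below are illustrative; yours will vary:

{
  "dependencies": {
    "cheerio": "^1.0.0",
    "crawlbase": "^1.1.0",
    "exceljs": "^4.4.0"
  }
}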

3. Create the Script Files and Import Required Modules

First, create two new files, estately.js and remax.js, one for each script.

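After the install step above, your project folder should look something like this (the root folder name is just an example):

real-estate-scraper/
├── node_modules/
├── estately.js
├── package-lock.json
├── package.json
└── remax.js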

Then, import the required libraries by pasting the following at the top:

const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

These lines of code are required for both scripts, so don’t forget to add them.
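If you’d rather not paste the same imports twice, one option is to move them into a small shared module. This is just an optional sketch; shared.js is a hypothetical file name, not part of the original setup:

// shared.js: re-exports the three libraries both scripts need
const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

module.exports = { cheerio, ExcelJS, CrawlingAPI };

Each script would then start with const { cheerio, ExcelJS, CrawlingAPI } = require('./shared'); instead of the three require lines above.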

Now, Let’s Look at What’s Different

Since we have two separate scripts for two different real estate sites, here’s the plan:

  1. First, we share the complete function code for each script.
  2. Then, you can look at the code as a whole and try it yourself.
  3. Finally, we walk through the Estately script step by step. Since both scripts follow a similar structure, understanding one will make the other a breeze.

1. How to Scrape Estately

What It Does:

  • Visits Estately’s Cape Coral listings
  • Extracts price, beds, baths, sqft, and address
  • Saves everything to Excel

Add This Function:

// Don’t forget to add the Shared Logic above
(async () => {
  try {
    const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
    const url = 'https://www.estately.com/FL/Cape_Coral';
    const response = await api.get(url);
    const html = response?.body;
    const $ = cheerio.load(html);

    const properties = [];

    $('div.result-item-details').each((i, cardEl) => {
      const card = $(cardEl);

      const price = card.find('p.result-price strong').text().trim();
      const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

      const beds = card.find('ul.result-basics-grid li').eq(0).find('b').text().trim();
      const baths = card.find('ul.result-basics-grid li').eq(1).find('b').text().trim();
      const sqft = card.find('ul.result-basics-grid li').eq(2).find('b').text().trim();

      const addressFull = card.find('a.margin-0.small.limit-line-length').text().trim();

      if (price) {
        properties.push({
          price,
          priceValue,
          beds,
          baths,
          sqft,
          address1: addressFull,
        });
      }
    });

    // Sort by price
    properties.sort((a, b) => a.priceValue - b.priceValue);

    // Write to Excel
    const workbook = new ExcelJS.Workbook();
    const sheet = workbook.addWorksheet('Properties');

    sheet.columns = [
      { header: 'Price', key: 'price', width: 15 },
      { header: 'Beds', key: 'beds', width: 10 },
      { header: 'Baths', key: 'baths', width: 10 },
      { header: 'Sqft', key: 'sqft', width: 12 },
      { header: 'Address', key: 'address1', width: 30 },
    ];

    properties.forEach((p) => sheet.addRow(p));

    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const filePath = `estately_property_data_${timestamp}.xlsx`;

    await workbook.xlsx.writeFile(filePath);

    console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  } catch (error) {
    console.error(`❌ Scraping failed: ${error.message}`);
  }
})();

Now, open your terminal and run node estately.js. You should see a success message in the console, and a timestamped Excel file with the scraped listings will appear in your project folder.

2. How to Scrape Re/Max

What It Does:

  • Scrapes Re/Max rental listings in LA
  • Extracts price, beds, baths, sqft, and address
  • Sorts by price and saves to Excel

Add This Function:

// Don’t forget to add the Shared Logic above
(async () => {
  try {
    const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
    const url = 'https://www.remax.com/homes-for-rent/ca/los-angeles/city/0644000';
    const response = await api.get(url);
    const html = response?.body;
    const $ = cheerio.load(html);

    const properties = [];

    $('li[data-testid="d-li"]').each((i, el) => {
      const card = $(el);

      const address = card.find('a.d-listing-card-address h3').text().trim();
      const price = card.find('span.d-listing-card-price-container h4').text().trim();
      const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

      const beds = card.find('p:contains("Beds") strong').first().text().trim();
      const baths = card.find('p:contains("Baths") strong').first().text().trim();
      const sqft = card.find('p:contains("Sq Ft") strong').first().text().trim();

      if (price && address) {
        properties.push({ price, priceValue, beds, baths, sqft, address });
      }
    });

    properties.sort((a, b) => a.priceValue - b.priceValue);

    const workbook = new ExcelJS.Workbook();
    const sheet = workbook.addWorksheet('Properties');

    sheet.columns = [
      { header: 'Price', key: 'price', width: 15 },
      { header: 'Beds', key: 'beds', width: 10 },
      { header: 'Baths', key: 'baths', width: 10 },
      { header: 'Sq Ft', key: 'sqft', width: 12 },
      { header: 'Address', key: 'address', width: 40 },
    ];

    properties.forEach((p) => sheet.addRow(p));

    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const filePath = `remax_property_data_${timestamp}.xlsx`;

    await workbook.xlsx.writeFile(filePath);
    console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  } catch (error) {
    console.error(`❌ Error: ${error.message}`);
  }
})();

Now, open your terminal and run node remax.js. As before, the script logs a success message and saves a timestamped Excel file with the scraped listings.

Step-by-Step Explanation of the Estately Script

1. Import Required Libraries

  • Cheerio: Parses and extracts data from HTML (like jQuery).
  • ExcelJS: Creates and saves Excel files.
  • CrawlingAPI: Comes from Crawlbase to fetch and render pages (especially dynamic ones).
const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

2. Start an Async IIFE

  • An Immediately Invoked Function Expression (IIFE) is used to run asynchronous code right away.
  • try/catch handles any runtime errors.
(async () => {
  try {
    // ...
  } catch (error) {
    // ...
  }
})();

3. Set up Crawlbase API and Fetch Webpage

  • Initializes CrawlingAPI with your API token.
  • Targets the Cape Coral, Florida listings on Estately.
  • response.body contains the HTML content of the page.
const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
const url = 'https://www.estately.com/FL/Cape_Coral';
const response = await api.get(url);
const html = response?.body;
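Before parsing, it’s worth confirming the request actually succeeded. A minimal guard, assuming the response exposes a statusCode field as shown in the Crawlbase Node library’s examples:

if (!response || response.statusCode !== 200 || !response.body) {
  throw new Error(`Request failed (status: ${response?.statusCode})`);
}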

4. Load HTML with Cheerio

  • Loads HTML into Cheerio for DOM traversal:
const $ = cheerio.load(html);

5. Extract Property Data

  • Finds all property cards using div.result-item-details.
  • For each card:
    • Extracts the displayed price, then strips $, commas, and other non-numeric characters to build priceValue for sorting.
    • Gets beds, baths, and sqft from the ul > li structure.
    • Extracts the address.
  • Pushes each structured property object to the properties array.
const properties = [];

$('div.result-item-details').each((i, cardEl) => {
  const card = $(cardEl);

  const price = card.find('p.result-price strong').text().trim();
  const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

  const beds = card.find('ul.result-basics-grid li').eq(0).find('b').text().trim();
  const baths = card.find('ul.result-basics-grid li').eq(1).find('b').text().trim();
  const sqft = card.find('ul.result-basics-grid li').eq(2).find('b').text().trim();

  const addressFull = card.find('a.margin-0.small.limit-line-length').text().trim();

  if (price) {
    properties.push({
      price,
      priceValue,
      beds,
      baths,
      sqft,
      address1: addressFull,
    });
  }
});
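At this point, each entry in properties is a plain object. With purely illustrative values, one entry might look like this (the actual strings depend on the live listings):

{
  price: '$350,000',
  priceValue: 350000,
  beds: '3',
  baths: '2',
  sqft: '1,456',
  address1: '123 SW Example Ave, Cape Coral, FL'
}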

6. Sort Properties by Price (Ascending)

  • Ensures cheaper properties appear first.
properties.sort((a, b) => a.priceValue - b.priceValue);
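One caveat: if a price fails to parse, parseFloat returns NaN, and comparing against NaN makes the sort order unpredictable. A defensive variant (a sketch) pushes unparsed prices to the end:

properties.sort((a, b) => {
  const av = Number.isNaN(a.priceValue) ? Infinity : a.priceValue; // unparsed prices sort last
  const bv = Number.isNaN(b.priceValue) ? Infinity : b.priceValue;
  return av - bv;
});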

7. Create an Excel File and Add Data

  • Initializes a new Excel workbook and worksheet:
const workbook = new ExcelJS.Workbook();
const sheet = workbook.addWorksheet('Properties');
  • Defines column headers and widths for the Excel file:
sheet.columns = [
  { header: 'Price', key: 'price', width: 15 },
  { header: 'Beds', key: 'beds', width: 10 },
  { header: 'Baths', key: 'baths', width: 10 },
  { header: 'Sqft', key: 'sqft', width: 12 },
  { header: 'Address', key: 'address1', width: 30 },
];
  • Adds each property object as a new row:
properties.forEach((p) => sheet.addRow(p));
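Optionally, you can style the header row. ExcelJS supports row-level formatting, so a single line makes the headers bold:

sheet.getRow(1).font = { bold: true };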

8. Save File with Timestamped Name

  • Formats the current timestamp.
  • Saves the file with a unique name.
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const filePath = `estately_property_data_${timestamp}.xlsx`;

await workbook.xlsx.writeFile(filePath);

9. Success & Error Logging

  • Success message:
console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  • Logs any error if scraping fails:
} catch (error) {
  console.error(`❌ Scraping failed: ${error.message}`);
}

Final Thoughts

Scraping real estate data doesn’t have to be painful. With a combination of Crawlbase + Cheerio + ExcelJS, you get an easy, scalable flow that just works.

Instead of playing defense against CAPTCHAs and bans, you can focus on building what you actually want: value-driven tools, smart dashboards, or even simple reports.

If you’re looking for a way to reliably extract data from complex, protected sites, Crawlbase is the only web scraping tool you’ll ever need.