Have you ever tried scraping real estate sites, only to hit a wall of CAPTCHA, rate limits, redirects, or IP bans? It’s like doing all the hard work and getting stuck at the finish line.

Whether you’re a founder building a rental data platform or a developer just trying to gather listings into Excel for insights, roadblocks like these can be deal-breakers.

But what if there were a cleaner way to extract accurate, structured property data without the usual headaches, one already trusted by 70,000+ dev teams? Sounds like a fairy tale, right? Not exactly. Meet Crawlbase, the only tool you’ll ever need for AI-powered web scraping.

Step-by-step Guide to Build a Real Estate Data Scraper

This guide shows how to scrape property listings from two real estate websites, Estately and Re/Max, using the Crawlbase Crawling API. We’ll extract data such as price, number of beds, number of baths, square footage, and address, then export it to an Excel sheet.


You don’t need to manage proxies, CAPTCHAs, or JavaScript rendering; Crawlbase handles all of that for you. So, with that out of the way, let’s get started.

1. Prerequisites

Before you begin, make sure you have:

  • Node.js installed
  • An IDE/Code Editor of your choice
  • A Crawlbase Crawling API token → Get it here (1000 free requests)
  • A basic understanding of JavaScript/Node (just enough to read functions)

2. Install Required Packages

Open your terminal and run:

npm init -y
npm install cheerio exceljs crawlbase

These packages handle:

  • Cheerio: To extract content from HTML, like jQuery
  • ExcelJS: To write listings into an Excel file
  • Crawlbase: To bypass CAPTCHA, blocks, and restrictions
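After installation, your package.json should list all three packages under dependencies. The version numbers below are illustrative; yours will vary:

{
  "dependencies": {
    "cheerio": "^1.0.0",
    "crawlbase": "^1.1.0",
    "exceljs": "^4.4.0"
  }
}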

3. Create the Script Files and Import Required Modules

First, create two new files, estately.js and remax.js, one for each script.

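After the install step above, your project folder should look something like this (the root folder name is just an example):

real-estate-scraper/
├── node_modules/
├── estately.js
├── package-lock.json
├── package.json
└── remax.js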

Then, import the required libraries by pasting the following at the top:

const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

These lines of code are required for both scripts, so don’t forget to add them.
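If you’d rather not paste the same imports twice, one option is to move them into a small shared module. This is just an optional sketch; shared.js is a hypothetical file name, not part of the original setup:

// shared.js: re-exports the three libraries both scripts need
const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

module.exports = { cheerio, ExcelJS, CrawlingAPI };

Each script would then start with const { cheerio, ExcelJS, CrawlingAPI } = require('./shared'); instead of the three require lines above.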

Now, Let’s Look at What’s Different

Since we have two separate scripts for two different real estate sites, here’s the plan:

  1. First, we share the complete function code for each script.
  2. Then, you can look at the code as a whole and try it yourself.
  3. Finally, we walk through the Estately script step by step. Since both scripts follow a similar structure, understanding one will make the other a breeze.

1. How to Scrape Estately

What It Does:

  • Visits Estately’s Cape Coral listings
  • Extracts price, beds, baths, sqft, and address
  • Saves everything to Excel

Add This Function:

// Don’t forget to add the Shared Logic above
(async () => {
  try {
    const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
    const url = 'https://www.estately.com/FL/Cape_Coral';
    const response = await api.get(url);
    const html = response?.body;
    const $ = cheerio.load(html);

    const properties = [];

    $('div.result-item-details').each((i, cardEl) => {
      const card = $(cardEl);

      const price = card.find('p.result-price strong').text().trim();
      const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

      const beds = card.find('ul.result-basics-grid li').eq(0).find('b').text().trim();
      const baths = card.find('ul.result-basics-grid li').eq(1).find('b').text().trim();
      const sqft = card.find('ul.result-basics-grid li').eq(2).find('b').text().trim();

      const addressFull = card.find('a.margin-0.small.limit-line-length').text().trim();

      if (price) {
        properties.push({
          price,
          priceValue,
          beds,
          baths,
          sqft,
          address1: addressFull,
        });
      }
    });

    // Sort by price
    properties.sort((a, b) => a.priceValue - b.priceValue);

    // Write to Excel
    const workbook = new ExcelJS.Workbook();
    const sheet = workbook.addWorksheet('Properties');

    sheet.columns = [
      { header: 'Price', key: 'price', width: 15 },
      { header: 'Beds', key: 'beds', width: 10 },
      { header: 'Baths', key: 'baths', width: 10 },
      { header: 'Sqft', key: 'sqft', width: 12 },
      { header: 'Address', key: 'address1', width: 30 },
    ];

    properties.forEach((p) => sheet.addRow(p));

    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const filePath = `estately_property_data_${timestamp}.xlsx`;

    await workbook.xlsx.writeFile(filePath);

    console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  } catch (error) {
    console.error(`❌ Scraping failed: ${error.message}`);
  }
})();

Now, open your terminal and run node estately.js. You should see a success message in the console, and a timestamped Excel file with the scraped listings will appear in your project folder.

2. How to Scrape Re/Max

What It Does:

  • Scrapes Re/Max rental listings in LA
  • Extracts price, beds, baths, sqft, and address
  • Sorts by price and saves to Excel

Add This Function:

// Don’t forget to add the Shared Logic above
(async () => {
  try {
    const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
    const url = 'https://www.remax.com/homes-for-rent/ca/los-angeles/city/0644000';
    const response = await api.get(url);
    const html = response?.body;
    const $ = cheerio.load(html);

    const properties = [];

    $('li[data-testid="d-li"]').each((i, el) => {
      const card = $(el);

      const address = card.find('a.d-listing-card-address h3').text().trim();
      const price = card.find('span.d-listing-card-price-container h4').text().trim();
      const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

      const beds = card.find('p:contains("Beds") strong').first().text().trim();
      const baths = card.find('p:contains("Baths") strong').first().text().trim();
      const sqft = card.find('p:contains("Sq Ft") strong').first().text().trim();

      if (price && address) {
        properties.push({ price, priceValue, beds, baths, sqft, address });
      }
    });

    properties.sort((a, b) => a.priceValue - b.priceValue);

    const workbook = new ExcelJS.Workbook();
    const sheet = workbook.addWorksheet('Properties');

    sheet.columns = [
      { header: 'Price', key: 'price', width: 15 },
      { header: 'Beds', key: 'beds', width: 10 },
      { header: 'Baths', key: 'baths', width: 10 },
      { header: 'Sq Ft', key: 'sqft', width: 12 },
      { header: 'Address', key: 'address', width: 40 },
    ];

    properties.forEach((p) => sheet.addRow(p));

    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const filePath = `remax_property_data_${timestamp}.xlsx`;

    await workbook.xlsx.writeFile(filePath);
    console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  } catch (error) {
    console.error(`❌ Error: ${error.message}`);
  }
})();

Now, open your terminal and run node remax.js. As before, the script logs a success message and saves a timestamped Excel file with the scraped listings.

Step-by-Step Explanation of the Estately Script

1. Import Required Libraries

  • Cheerio: Parses and extracts data from HTML (like jQuery).
  • ExcelJS: Creates and saves Excel files.
  • CrawlingAPI: Comes from Crawlbase to fetch and render pages (especially dynamic ones).
const cheerio = require('cheerio');
const ExcelJS = require('exceljs');
const { CrawlingAPI } = require('crawlbase');

2. Start an Async IIFE

  • An Immediately Invoked Function Expression (IIFE) is used to run asynchronous code right away.
  • try/catch handles any runtime errors.
(async () => {
  try {
    // ...
  } catch (error) {
    // ...
  }
})();

3. Set up Crawlbase API and Fetch Webpage

  • Initializes CrawlingAPI with your API token.
  • Targets the Cape Coral, Florida listings on Estately.
  • response.body contains the HTML content of the page.
const api = new CrawlingAPI({ token: 'YOUR TOKEN' });
const url = 'https://www.estately.com/FL/Cape_Coral';
const response = await api.get(url);
const html = response?.body;
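Before parsing, it’s worth confirming the request actually succeeded. A minimal guard, assuming the response exposes a statusCode field as shown in the Crawlbase Node library’s examples:

if (!response || response.statusCode !== 200 || !response.body) {
  throw new Error(`Request failed (status: ${response?.statusCode})`);
}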

4. Load HTML with Cheerio

  • Loads HTML into Cheerio for DOM traversal:
const $ = cheerio.load(html);

5. Extract Property Data

  • Finds all property cards using div.result-item-details.
  • For each card:
    • Extracts the displayed price, then strips $, commas, and other non-numeric characters to build priceValue for sorting.
    • Gets beds, baths, and sqft from the ul > li structure.
    • Extracts the address.
  • Pushes each structured property object to the properties array.
const properties = [];

$('div.result-item-details').each((i, cardEl) => {
  const card = $(cardEl);

  const price = card.find('p.result-price strong').text().trim();
  const priceValue = parseFloat(price.replace(/[^0-9.]/g, ''));

  const beds = card.find('ul.result-basics-grid li').eq(0).find('b').text().trim();
  const baths = card.find('ul.result-basics-grid li').eq(1).find('b').text().trim();
  const sqft = card.find('ul.result-basics-grid li').eq(2).find('b').text().trim();

  const addressFull = card.find('a.margin-0.small.limit-line-length').text().trim();

  if (price) {
    properties.push({
      price,
      priceValue,
      beds,
      baths,
      sqft,
      address1: addressFull,
    });
  }
});
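At this point, each entry in properties is a plain object. With purely illustrative values, one entry might look like this (the actual strings depend on the live listings):

{
  price: '$350,000',
  priceValue: 350000,
  beds: '3',
  baths: '2',
  sqft: '1,456',
  address1: '123 SW Example Ave, Cape Coral, FL'
}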

6. Sort Properties by Price (Ascending)

  • Ensures cheaper properties appear first.
properties.sort((a, b) => a.priceValue - b.priceValue);
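One caveat: if a price fails to parse, parseFloat returns NaN, and comparing against NaN makes the sort order unpredictable. A defensive variant (a sketch) pushes unparsed prices to the end:

properties.sort((a, b) => {
  const av = Number.isNaN(a.priceValue) ? Infinity : a.priceValue; // unparsed prices sort last
  const bv = Number.isNaN(b.priceValue) ? Infinity : b.priceValue;
  return av - bv;
});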

7. Create an Excel File and Add Data

  • Initializes a new Excel workbook and worksheet:
const workbook = new ExcelJS.Workbook();
const sheet = workbook.addWorksheet('Properties');
  • Defines column headers and widths for the Excel file:
sheet.columns = [
  { header: 'Price', key: 'price', width: 15 },
  { header: 'Beds', key: 'beds', width: 10 },
  { header: 'Baths', key: 'baths', width: 10 },
  { header: 'Sqft', key: 'sqft', width: 12 },
  { header: 'Address', key: 'address1', width: 30 },
];
  • Adds each property object as a new row:
properties.forEach((p) => sheet.addRow(p));
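Optionally, you can style the header row. ExcelJS supports row-level formatting, so a single line makes the headers bold:

sheet.getRow(1).font = { bold: true };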

8. Save File with Timestamped Name

  • Formats the current timestamp.
  • Saves the file with a unique name.
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const filePath = `estately_property_data_${timestamp}.xlsx`;

await workbook.xlsx.writeFile(filePath);

9. Success & Error Logging

  • Success message:
console.log(`✅ Scraped and saved ${properties.length} properties to ${filePath}`);
  • Logs any error if scraping fails:
} catch (error) {
  console.error(`❌ Scraping failed: ${error.message}`);
}

Final Thoughts

Scraping real estate data doesn’t have to be painful. With a combination of Crawlbase + Cheerio + ExcelJS, you get an easy, scalable flow that just works.

Instead of playing defense against CAPTCHAs and bans, you can focus on building what you actually want: value-driven tools, smart dashboards, or even simple reports.

If you’re looking for a way to reliably extract data from complex, protected sites, Crawlbase is the only web scraping tool you’ll ever need.