Search engine results are a valuable data source for researchers, developers, and analysts who need large volumes of data for insights or applications. Scraping Bing search results gives you programmatic access to the titles, links, and descriptions in Bing’s extensive index.

This guide takes a practical approach to scraping Bing search results using Puppeteer, a JavaScript browser automation library, and the Crawlbase Crawling API. We’ll look at how Puppeteer interacts with Bing’s search pages and how the Crawlbase Crawling API retrieves Bing results while handling common scraping problems such as blocks and CAPTCHAs.

Join us in exploring Bing SERP scraping as we master advanced web scraping techniques to unlock Microsoft Bing’s full potential as a valuable data source.

Table of Contents

I. Understanding Bing’s Search Page Structure

  • Bing SERP Structure
  • Data to Scrape

II. Prerequisites
III. Setting Up Puppeteer

  • Prepare the Coding Environment
  • Scraping Bing SERP using Puppeteer

IV. Setting Up Crawlbase’s Scraper

  • Obtain API Credentials
  • Prepare the Coding Environment
  • Scraping Bing SERP using Scraper

V. Puppeteer vs Crawlbase Scraper

  • Pros and Cons
  • Conclusion

VI. Frequently Asked Questions (FAQ)

I. Understanding Bing’s Search Page Structure

Search engines play a pivotal role in helping users navigate the vast sea of information on the internet. With its distinctive features and growing user base, Microsoft’s Bing is a significant player in web search. As of April 2024, Bing.com reached close to 1.3 billion unique global visitors. Although that is a slight decline from the previous month’s 1.4 billion visitors and far behind Google, Bing remains a relevant source of search results.


Before we start working on our scraper, it’s important to understand the layout of a Bing SERP (Search Engine Results Page), such as the target page for this guide. Bing presents search results as a set of recurring elements, and you can extract information from these elements using web scraping techniques. Here’s an overview of the structure and the data you can scrape:

Bing SERP Structure

1. Search Results Container

  • Bing displays search results in a container, usually in a list format, with each result having a distinct block.

2. Individual Search Result Block

  • Each search result block contains information about a specific webpage, including the title, description, and link.

3. Title

  • The search result’s title is the clickable headline representing the webpage. It helps users judge the relevance of the result quickly.

4. Description

  • The description provides a brief summary or snippet of the content found on the webpage. It offers additional context to users about what to expect from the linked page.

5. Link

  • The link is the URL of the webpage associated with the search result. Clicking on the link directs users to the respective webpage.

6. Result Videos

  • Bing may include video results directly in the search results. These can be videos from various sources like YouTube, Vimeo, or other video platforms.

Data to Scrape:

1. Titles

  • Extract each search result’s title to understand the main topic or theme of the webpage.

2. Descriptions

  • Scrape the descriptions to gather concise information about the content of each webpage. This can be useful for creating summaries or snippets.

3. Links

  • Capture the URLs of the web pages associated with each search result. These links are essential for navigating to the source pages.

We’ll show you how easy it is to use the Crawling API to scrape the data mentioned above. Also, we’ll use the method page.evaluate in Puppeteer to execute a function within the context of the page being controlled by Puppeteer. This function runs in the browser environment and can access the DOM (Document Object Model) and JavaScript variables within the page. Here is an example:

const results = await page.evaluate(() => {
  return Array.from(document.querySelectorAll('li.b_algo')).map((list, index) => ({
    position: index + 1,
    title: list.querySelector('h2 a').textContent,
    url: list.querySelector('h2 a').getAttribute('href'),
    description: list.querySelector('p.b_algoSlug').textContent,
  }));
});
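
Because the callback passed to page.evaluate is serialized and run inside the page, it can help to write it as a standalone function that is easy to test. The sketch below is one way to do that, assuming the `li.b_algo`, `h2 a`, and `p.b_algoSlug` selectors used above still match Bing’s markup. `extractResults` is a hypothetical name, and the null checks are an addition so that a result without a snippet does not throw.

```javascript
// Hypothetical helper: extracts organic results from any DOM-like root.
// When the function is handed to page.evaluate it is called without
// arguments, so the default parameter resolves to the page's `document`.
function extractResults(root = document) {
  return Array.from(root.querySelectorAll('li.b_algo')).map((item, index) => {
    const link = item.querySelector('h2 a');
    const snippet = item.querySelector('p.b_algoSlug');
    return {
      position: index + 1,
      // Fall back to null instead of throwing when an element is missing
      title: link ? link.textContent.trim() : null,
      url: link ? link.getAttribute('href') : null,
      description: snippet ? snippet.textContent.trim() : null,
    };
  });
}

// Usage inside a Puppeteer script (not executed here):
//   const results = await page.evaluate(extractResults);
```

Keeping the extractor free of references to outer variables matters here, since only the function’s own source text reaches the browser.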

Let’s proceed to the main part of our guide, where we’ll walk you through the process of using Puppeteer and Crawling API step by step to scrape Bing SERP data.

II. Prerequisites

Before getting started, ensure that you have the following prerequisites:

  1. Node.js: Make sure Node.js is installed on your machine. You can download it from the official Node.js website.
  2. npm (Node Package Manager): npm is typically included with Node.js installation. Check if it’s available by running the following command in your terminal:
npm -v

If the version is displayed, npm is installed. If not, ensure Node.js is correctly installed, as npm is bundled with it.
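
If you prefer to script this check, the sketch below compares the running Node.js version against a minimum major version. The minimum of 18 is an assumption for illustration only (check the requirements of the Puppeteer release you install), and `meetsMinimumNode` is a hypothetical helper.

```javascript
// Hypothetical helper: true when `version` (e.g. "20.11.1" or "v20.11.1")
// has a major version of at least `minMajor`.
function meetsMinimumNode(version, minMajor) {
  const major = Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Assumed minimum, for illustration only.
const ASSUMED_MIN_MAJOR = 18;

if (!meetsMinimumNode(process.versions.node, ASSUMED_MIN_MAJOR)) {
  console.warn(`Node.js ${process.versions.node} is older than v${ASSUMED_MIN_MAJOR}; consider upgrading.`);
}
```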

Having Node.js and npm installed ensures a smooth experience as you proceed with setting up your environment for web scraping with Puppeteer or Crawling API.

III. Setting Up Puppeteer

Puppeteer is a powerful Node.js library developed by the Chrome team at Google. It provides a high-level API to control headless or full browsers over the DevTools Protocol, making it an excellent choice for tasks such as web scraping and automated testing. Before diving into the project with Puppeteer, let’s set up a Node.js project and install the Puppeteer package.

Prepare the Coding Environment

  1. Create a Node.js Project
    Open your terminal and run the following command to create a basic Node.js project with default settings:
npm init -y

This command generates a package.json file, which includes metadata about your project and its dependencies.

  2. Install Puppeteer:
    Once the project is set up, install the Puppeteer package using the following command:
npm i puppeteer

This command downloads and installs the Puppeteer library, enabling you to control browsers programmatically.

  3. Create an Index File:
    To write your web scraper’s code, create an index.js file. Use the following command to generate the file:
touch index.js

This command creates an empty index.js file where you’ll write the Puppeteer script for scraping Bing SERP data. You have the option to change this to whatever filename you like.

Scraping Bing SERP using Puppeteer

With your Node.js project initialized, Puppeteer installed, and an index.js file ready, you’re all set to harness the capabilities of Puppeteer for web scraping. Copy the code below and save it to your index.js file.

// Import required modules
const puppeteer = require('puppeteer');
const fs = require('fs');

// Define an asynchronous function to scrape Bing search results
async function getBingData(searchString) {
  // Launch a headless browser
  const browser = await puppeteer.launch({
    headless: 'new', // "new" selects Chrome's newer headless mode; no browser window is shown
  });

  try {
    // Create a new page in the browser
    const page = await browser.newPage();

    // Navigate to the Bing search results page; encodeURIComponent also escapes
    // characters such as "&" and "#" that encodeURI would leave in the query
    await page.goto(`https://bing.com/search?q=${encodeURIComponent(searchString)}`);

    // Wait for the selector ".b_pag" to ensure that the search results are loaded
    await page.waitForSelector('.b_pag');

    // Extract relevant data from the search results using page.evaluate
    const results = await page.evaluate(() => {
      // Map over each search result element to create an array of result objects
      return Array.from(document.querySelectorAll('li.b_algo')).map((list, index) => ({
        position: index + 1,
        // Optional chaining yields null instead of throwing when an element is missing
        title: list.querySelector('h2 a')?.textContent ?? null,
        url: list.querySelector('h2 a')?.getAttribute('href') ?? null,
        description: list.querySelector('p.b_algoSlug')?.textContent ?? null,
      }));
    });

    // Log the results to the console
    console.log(results);

    // Write the results to a JSON file for further use
    fs.writeFileSync('bing-serp.json', JSON.stringify({ results }, null, 2));

    // Return the scraped results
    return results;
  } finally {
    // Close the browser whether scraping succeeded or failed
    await browser.close();
  }
}

// Call the function with a sample search string (e.g. "samsung s23 ultra")
getBingData('samsung s23 ultra').catch((error) => {
  console.error('Scraping failed:', error);
  process.exitCode = 1;
});
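
The script above interpolates the query into the URL by hand. As an alternative, the sketch below builds the URL with the standard `URL` and `URLSearchParams` classes, which take care of escaping. `buildBingUrl` is a hypothetical helper, and the `first` offset parameter for later result pages is an assumption about Bing’s URL scheme that you should verify against a real results page.

```javascript
// Hypothetical helper: builds a Bing search URL with a safely encoded query.
// `first` is an ASSUMED offset parameter for later result pages.
function buildBingUrl(query, { first } = {}) {
  const url = new URL('https://www.bing.com/search');
  // searchParams.set escapes spaces, "&", "+", "#" and similar characters
  url.searchParams.set('q', query);
  if (Number.isInteger(first) && first > 1) {
    url.searchParams.set('first', String(first));
  }
  return url.toString();
}

console.log(buildBingUrl('samsung s23 ultra'));
// prints "https://www.bing.com/search?q=samsung+s23+ultra"
```

The returned string can be passed directly to `page.goto`.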

Let’s execute the above code by using a simple command:

node index.js

If successful, you’ll get the result in JSON format as shown below:

{
"results": [
{
"position": 1,
"title": "Samsung Galaxy S23 Ultra | Samsung US - Samsung ...",
"url": "https://www.samsung.com/us/smartphones/galaxy-s23-ultra/",
"description": "WebMeet the latest Galaxy S23 Ultra phone, designed with the planet in mind, equipped with a built-in S Pen, Nightography camera, & powerful chip for epic gaming."
},
{
"position": 2,
"title": "Samsung Galaxy S23 Ultra - Full phone specifications ...",
"url": "https://www.gsmarena.com/samsung_galaxy_s23_ultra-12024.php",
"description": "WebSamsung Galaxy S23 Ultra Android smartphone. Announced Feb 2023. Features 6.8″ display, Snapdragon 8 Gen 2 chipset, 5000 mAh battery, ..."
},
{
"position": 3,
"title": "Samsung Galaxy S23 Ultra | Samsung PK",
"url": "https://www.samsung.com/pk/smartphones/galaxy-s23-ultra/",
"description": "WebMobile. Smartphones. Galaxy S23 Ultra. Meet the new Galaxy S23 Ultra, designed with the planet in mind and equipped with a built-in S Pen, Nightography camera and powerful chip for epic gaming."
},
{
"position": 4,
"title": "Samsung Galaxy S23 Ultra Price in Pakistan 2023",
"url": "https://www.whatmobile.com.pk/Samsung_Galaxy-S23-Ultra",
"description": "WebSamsung Galaxy S23 Ultra - The Ultra Smartphone Of Ultra Company Samsung is launching a new Galaxy S23 which has got the moniker ..."
},
{
"position": 5,
"title": "Samsung Galaxy S23 Ultra 5G - Camera & Specs",
"url": "https://www.samsung.com/ph/smartphones/galaxy-s23-ultra/",
"description": "WebDiscover the new Samsung Galaxy S23 Ultra 5G with refined Nightime Cameras, 6.8\" 120Hz adaptive anti-glare display and epic performance. Skip to content Samsung and Cookies"
},
{
"position": 6,
"title": "Samsung Galaxy S23 Ultra | Samsung Canada",
"url": "https://www.samsung.com/ca/smartphones/galaxy-s23-ultra/",
"description": "WebGalaxy S23 Ultra BUY NOW Ultra Reborn Re-engineered Nightography camera Revolutionary gaming processor Renowned S Pen Expert Review Highlights Introduction ..."
},
{
"position": 7,
"title": "Galaxy S23 Ultra: Official Introduction Film | Samsung - YouTube",
"url": "https://www.youtube.com/watch?v=BSYsXVFzmKA",
"description": "Web1 Feb 2023 · What's new? The new Galaxy S23 Ultra. Share the epic with our most powerful processor yet, a pro-grade camera that boasts epic Nightography, and the mighty e..."
},
{
"position": 8,
"title": "Samsung Galaxy S23 Ultra review | Tom's Guide",
"url": "https://www.tomsguide.com/reviews/samsung-galaxy-s23-ultra",
"description": "Web18 Sep 2023 · The Samsung Galaxy S23 Ultra takes Samsung's flagship to the next level with a whopping 200MP camera and lots of other photography improvements. You also get a Qualcomm Snapdragon 8 ..."
},
{
"position": 9,
"title": "Samsung Galaxy S23 Ultra Price in Pakistan 2024",
"url": "https://priceoye.pk/mobiles/samsung/samsung-galaxy-s23-ultra",
"description": "WebBuy Samsung Galaxy S23 Ultra at the lowest price in Pakistan of Rs. 494,999/-. Check prices from all online stores, compare specs, features and get the latest offers and gift vouchers. See the highlights, specifications, ..."
},
{
"position": 10,
"title": "Samsung Galaxy S23 Ultra: release date, price, specs ...",
"url": "https://www.techradar.com/news/samsung-galaxy-s23-ultra",
"description": "Web1 Feb 2023 · The Samsung galaxy S23 Ultra, as well as the smartphones it launched alongside, will release on Friday, February 17. The devices are available to preorder right now, though if you want to secure a ..."
}
]
}
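
Notice that the descriptions in this sample output begin with a stray `Web` prefix (for example, `WebMeet the latest...`). A minimal post-processing sketch that strips it is shown below, assuming the prefix is always the literal `Web` followed immediately by an uppercase letter or a digit. `cleanDescription` and `cleanResults` are hypothetical helpers. A description that legitimately starts with a word such as `WebMD` would also be trimmed, so treat this as a heuristic.

```javascript
// Hypothetical helper: removes a leading "Web" that is glued to the next
// capitalized word or date, as seen in the sample output.
function cleanDescription(text) {
  if (typeof text !== 'string') return text;
  return text.replace(/^Web(?=[A-Z0-9])/, '').trim();
}

// Returns new result objects with cleaned descriptions; the input is not mutated.
function cleanResults(results) {
  return results.map((result) => ({
    ...result,
    description: cleanDescription(result.description),
  }));
}

// Usage with the file written by the script (not executed here):
//   const { results } = JSON.parse(require('fs').readFileSync('bing-serp.json', 'utf8'));
//   console.log(cleanResults(results));
```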

IV. Setting Up Crawlbase’s Scraper

Now that we’ve covered the steps for Puppeteer, let’s explore the Scraper. Here’s what you need to do if it’s your first time using the Scraper:

Obtain API Credentials:

  1. Sign Up for the Scraper:
  • Begin by signing up on the Crawlbase website to obtain access to the Scraper.
  2. Access API Documentation:
  3. Retrieve API Credentials:
  • Find your API credentials (e.g., API key) either in the documentation or on your account dashboard. These credentials are crucial for authenticating your requests to the Scraper.

Prepare the Coding Environment

To set up the environment for your Crawlbase Scraper project, run the following commands:

  1. Create Project Folder
mkdir bing-serp

This command creates an empty folder named “bing-serp” to organize your scraping project.

  2. Navigate to Project Folder
cd bing-serp

Use this command to enter the newly created directory and prepare for writing your scraping code.

  3. Create JS File
touch index.js

This command generates an index.js file where you’ll write the JavaScript code for your scraper.

  4. Install Crawlbase Package
npm install crawlbase

The Crawlbase Node package is used for interacting with the Crawlbase APIs including the Scraper, allowing you to fetch HTML without getting blocked and scrape content from websites efficiently.

Scraping Bing SERP using Scraper

With the coding environment set up, we can start integrating the Scraper into our script.

Copy the code below and make sure you replace "Crawlbase_TOKEN" with your actual Crawlbase API token for proper authentication.

// Import the Crawlbase Scraper API package
const { ScraperAPI } = require('crawlbase');

// Import the 'fs' module
const fs = require('fs');

// Initialize the Scraper API
const api = new ScraperAPI({ token: 'Crawlbase_TOKEN' }); // Replace with your Crawlbase token

// Bing SERP URL
const bingSerpURL = 'https://www.bing.com/search?q=samsung+s23+ultra';

// Define the javascript parameter to allow proper scraping of the Bing SERP
const options = {
  javascript: true,
};

// Execute the Scraper API GET request
api
  .get(bingSerpURL, options)
  .then((response) => {
    const scrapedData = response.json.body;

    fs.writeFileSync('bing_scraped.json', JSON.stringify({ scrapedData }, null, 2));
  })
  .catch((error) => {
    console.log(error, 'ERROR');
  });
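
Hardcoding the token is fine for a quick test, but a script that is shared or committed usually reads it from the environment instead. The sketch below assumes an environment variable named `CRAWLBASE_TOKEN` (a hypothetical name chosen here, not something the Crawlbase package requires) and fails early when it is missing. `getCrawlbaseToken` is likewise a hypothetical helper.

```javascript
// Hypothetical helper: reads the token from an ASSUMED environment variable
// named CRAWLBASE_TOKEN and throws a clear error when it is missing or blank.
function getCrawlbaseToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token) {
    throw new Error('CRAWLBASE_TOKEN is not set; export it before running the script.');
  }
  return token;
}

// Usage in the script above (not executed here):
//   const api = new ScraperAPI({ token: getCrawlbaseToken() });
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.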

Execute the above code by using a simple command:

node index.js

Result should be in JSON format as shown below:

{
"scrapedData": {
"searchResults": [
{
"position": 1,
"title": "Samsung Galaxy S23 Ultra | Samsung US - Samsung Electronics America",
"url": "https://www.samsung.com/us/smartphones/galaxy-s23-ultra/",
"description": "WebGalaxy S23 Ultra. We've raised the bar with a 200MP camera and our fastest mobile processor ever. ** Special carrier offers. BUY NOW. Ultra evolved. 200MP camera, the highest resolution on a phone. Our fastest mobile processor ever** Advanced Nightography. Built-in S Pen with camera shutter button. See what others are saying."
},
{
"position": 2,
"title": "Specs | Samsung Galaxy S23 Ultra | Samsung US",
"url": "https://www.samsung.com/us/smartphones/galaxy-s23-ultra/specs/",
"description": "WebBUY NOW. SEE IN 360°. *Color availability may vary depending on country or carrier. *Online exclusive colors only available on Samsung.com. Display. Optimized for immersive gaming. 6.8\"* 3088 x 1440 (Edge Quad HD+) Peak Brightness. 1750 nits. HDR. 1200 nits. HBM. 1200 nits. Adaptive Refresh Rate. 1~120Hz. Watch outside with clarity."
},
{
"position": 3,
"title": "Samsung Galaxy S23 Ultra review | Tom's Guide",
"url": "https://www.tomsguide.com/reviews/samsung-galaxy-s23-ultra",
"description": "WebSep 18, 2023 · The Samsung Galaxy S23 Ultra takes Samsung's flagship to the next level with a whopping 200MP camera and lots of other photography improvements. You also get a Qualcomm Snapdragon 8 Gen..."
},
{
"position": 4,
"title": "Samsung Galaxy S23 Ultra | Samsung Canada",
"url": "https://www.samsung.com/ca/smartphones/galaxy-s23-ultra/",
"description": "WebGalaxy S23 Ultra. Meet the new Galaxy S23 Ultra, designed for better sustainability and equipped with a built-in S Pen, Nightography camera and powerful chip for epic gaming."
},
{
"position": 5,
"title": "Galaxy S23 Ultra: Official Introduction Film | Samsung - YouTube",
"url": "https://www.youtube.com/watch?v=BSYsXVFzmKA",
"description": "WebFeb 1, 2023 · 6.55M subscribers. Subscribed. 106K. Share. 25M views 11 months ago #GalaxyS23 #SharetheEpic #Samsung. What's new? The new Galaxy S23 Ultra. Share the epic with our most powerful..."
},
{
"position": 6,
"title": "Samsung Galaxy S23 vs. S23+ vs. S23 Ultra: What's the ...",
"url": "https://www.pcmag.com/news/samsung-galaxy-s23-vs-s23-plus-vs-s23-ultra-whats-the-difference",
"description": "WebFeb 1, 2023 · 5G. Samsung Galaxy S23 vs. S23+ vs. S23 Ultra: What's the Difference? All three models in the Galaxy S23 lineup offer premium power and features, but which one should you buy? We..."
},
{
"position": 7,
"title": "Samsung Galaxy S23 Ultra review: indomitable showman",
"url": "https://www.techradar.com/reviews/samsung-galaxy-s23-ultra",
"description": "WebFeb 13, 2023 · Samsung Galaxy S23 Ultra. No contract. data 2GB. Free. upfront. Monthly $68.18. /mth. Visit Website."
},
{
"position": 8,
"title": "Galaxy S23 Ultra, 512GB (Unlocked) | Samsung US",
"url": "https://www.samsung.com/us/smartphones/galaxy-s23-ultra/buy/galaxy-s23-ultra-512gb-unlocked-sm-s918uzrfxaa/",
"description": "WebLearn about Galaxy S23 Ultra Key Features. Chat with an Expert. Galaxy S23 Ultra Galaxy S23 | S23+ Connectivity. Select your carrier. Out of Stock. Out of Stock. Out of Stock. Storage. 256GB. $1,199.99. 512GB. $1,379.99. 1TB. $1,619.99. ... Log in now to earn up to undefined% back in Samsung Rewards Points."
},
{
"position": 9,
"title": "Samsung - Galaxy S23 Ultra 256GB (Unlocked) - Phantom Black",
"url": "https://www.bestbuy.com/site/samsung-galaxy-s23-ultra-256gb-unlocked-phantom-black/6529723.p",
"description": "WebShop Samsung Galaxy S23 Ultra 256GB (Unlocked) Phantom Black at Best Buy. Find low everyday prices and buy online for delivery or in-store pick-up. Price Match Guarantee."
},
{
"position": 10,
"title": "Samsung Galaxy S23 Ultra - Full phone specifications - GSMArena.com",
"url": "https://www.gsmarena.com/samsung_galaxy_s23_ultra-12024.php",
"description": "WebSamsung Galaxy S23 Ultra. Released 2023, February 17. 234g, 8.9mm thickness. Android 13, up to Android 14, One UI 6. 256GB/512GB/1TB storage, no card slot. 52% 11,347,994 hits. 1485 Become..."
}
],
"videosSearchResults": [],
"relatedSearches": [],
"numberOfResults": 14400000
}
}
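
Once the results are saved, they can be processed like any other JSON. As an illustration, the sketch below counts how many results each hostname contributes. `countByHostname` is a hypothetical helper, and the input is assumed to be an array of objects with the `url` field shown in the output above.

```javascript
// Hypothetical helper: counts search results per hostname.
function countByHostname(results) {
  const counts = {};
  for (const { url } of results) {
    let host;
    try {
      host = new URL(url).hostname;
    } catch {
      continue; // skip entries whose url is missing or malformed
    }
    counts[host] = (counts[host] || 0) + 1;
  }
  return counts;
}

// Usage with the file written by the script (not executed here):
//   const { scrapedData } = JSON.parse(require('fs').readFileSync('bing_scraped.json', 'utf8'));
//   console.log(countByHostname(scrapedData.searchResults));
```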

V. Puppeteer vs Crawlbase Scraper

When deciding between Puppeteer and Crawlbase’s Scraper for scraping Bing Search Engine Results Pages (SERP) in JavaScript, there are several factors to consider. Let’s break down the pros and cons of each option:


Puppeteer:

Pros:

  1. Full Control: Puppeteer is a headless browser automation library that provides full control over the browser, allowing you to interact with web pages just like a user would.
  2. Dynamic Content: Puppeteer is excellent for scraping pages with dynamic content and heavy JavaScript usage, as it renders pages and executes JavaScript.
  3. Customization: You can customize your scraping logic extensively, adapting it to specific website structures and behaviors.
  4. Flexibility: Puppeteer is not limited to scraping. It can also be used for automated testing, taking screenshots, generating PDFs, and more.

Cons:

  1. Learning Curve: Puppeteer might have a steeper learning curve, especially for beginners, as it involves understanding how browsers work and interacting with them programmatically.
  2. Resource Intensive: Running a headless browser can be resource-intensive, consuming more memory and CPU compared to simpler scraping solutions.
  3. Development Time: Creating and maintaining Puppeteer scripts may require more development time, potentially increasing overall project costs.

Crawlbase’s Scraper:

Pros:

  1. Ease of Use: Crawlbase API is designed to be user-friendly, making it easy for developers to get started quickly without the need for extensive coding or browser automation knowledge.
  2. Scalability: Crawlbase API is a cloud-based solution, offering scalability and eliminating the need for you to manage infrastructure concerns.
  3. Proxy Management: Crawlbase API handles proxies and IP rotation automatically, which can be crucial for avoiding IP bans and improving reliability.
  4. Cost-Efficient: Depending on your scraping needs, using a service like the Crawlbase API might be more cost-efficient, especially if you don’t require the extensive capabilities of a headless browser.

Cons:

  1. Limited Customization: Crawlbase API might have limitations in terms of customization compared to Puppeteer. It may not be as flexible if you need highly specialized scraping logic.
  2. Dependency on External Service: Your scraping process relies on an external service, which means you are subject to their service availability and policies.

Conclusion:

Choose Puppeteer if:

  • You need full control and customization over the scraping process.
  • You’re aware that development time may be longer, potentially increasing costs.
  • You are comfortable managing a headless browser and are willing to invest time in learning.

Choose Crawlbase API if:

  • You want a quick and easy-to-use solution without the need for in-depth browser automation knowledge.
  • Scalability and proxy management are crucial for your scraping needs.
  • You prefer a managed service and a simple solution for quick project deployment.
  • You aim for a more cost-efficient solution considering potential development time and resources.

Ultimately, the choice between Puppeteer and Crawlbase API depends on your specific requirements, technical expertise, and preferences in terms of control and ease of use.

If you like this guide, check out other scraping guides from Crawlbase. See our recommended “how-to” guides below:

How to Scrape Flipkart
How to Scrape Yelp
How to Scrape Glassdoor

VI. Frequently Asked Questions (FAQ)

Q. Can I use the Crawlbase API for other websites?

Yes, the Crawlbase API is compatible with other websites, especially popular ones like Amazon, Google, Facebook, LinkedIn, and more. Check the Crawlbase API documentation for the full list.

Q. Is there a free trial for the Crawlbase API?

Yes, the first 1,000 regular requests are free of charge. If you need JavaScript rendering, you can subscribe to any of the paid packages.

Q. Can the Crawlbase API hide my IP address to avoid blocks or IP bans?

Yes. The Crawlbase API routes requests through a pool of millions of proxies to bypass common scraping problems like bot detection, CAPTCHAs, and IP blocks.

If you have other questions or concerns about this guide or the API, our product experts will be glad to assist. Please do not hesitate to contact our support team. Happy Scraping!