Web scraping and data extraction have transformed how we collect information from the vast amount of data on the internet. Search engines like Google are goldmines of knowledge, and the ability to extract useful URLs from their search results can make a big difference for many purposes. Whether you run a business doing market research, love data and want information, or need data for different uses in your job, web scraping can give you the data you need.

In this blog, we will learn how to scrape Google search results, extract useful information, and store it efficiently in an SQLite database.

We’ll use Python and the Crawlbase Crawling API. Together, we’ll go through the complex world of web scraping and data management giving you the skills and know-how to use the power of Google’s search results. Let’s jump in and begin!

  1. The Power of Web Scraping
  • Main Advantages of Web Scraping
  2. Understanding the Significance of Google Search Results Scraping
  • Why Scrape Google Search Results?
  3. Embarking on Your Web Scraping Journey with Crawlbase Crawling API
  • Getting to Know the Crawlbase Crawling API
  • The Benefits of Crawlbase Crawling API
  • Exploring the Crawlbase Python Library
  4. Essential Requirements for a Successful Start
  • Configuring Your Development Environment
  • Installing the Necessary Libraries
  • Creating Your Crawlbase Account
  5. Understanding the Structure of Google Search Results Pages
  • Components of a Google Search Results Page
  6. Mastering Google Search Page Scraping with the Crawling API
  • Getting the Correct Crawlbase Token
  • Setting Up Crawlbase Crawling API
  • Selecting the Ideal Scraper
  • Effortlessly Managing Pagination
  • Saving Data to an SQLite Database
  7. Scrape Google Search Results with Crawlbase
  8. Frequently Asked Questions

1. The Power of Web Scraping

Web scraping is a game-changing technology that pulls data from websites. Think of it as a digital helper that visits websites, gathers information, and organizes it for you to use. Web scraping uses computer programs or scripts to automate data collection from websites. Rather than copying and pasting information from web pages by hand, web scraping tools can do this job automatically and on a large scale. These tools navigate websites, pull out specific data, and save it in an organized format to analyze or store.

Main Advantages of Web Scraping

Benefits of Web Scraping Google Search Results
  1. Productivity: Web scraping makes data collection happen on its own, which saves you time and work. It can handle lots of data and get it right.
  2. Getting Data Right: Scraping pulls data straight from where it comes from, which cuts down on mistakes that can happen when people type in data by hand.
  3. Up-to-Date Info: Web scraping lets you keep an eye on websites and gather the newest info. This is key for jobs like checking prices, seeing what’s in stock, or keeping up with news.
  4. Picking the Data You Want: You can set up web scraping to get just the bits of info you need, like how much things cost, what’s in the news headlines, or facts for research.
  5. Structured Data: Scraped data gets organized in a structured format, which makes it simple to analyze, search, and use in databases or reports.
  6. Competitive Intelligence: Web scraping helps businesses to keep an eye on competitors, follow market trends, and spot new opportunities.
  7. Research and Analysis: Researchers apply web scraping to collect academic or market research data, while analysts gather insights to make business decisions.
  8. Automation: You can set up web scraping to run on a schedule, which ensures that your data stays current.

2. Understanding the Significance of Google Search Results Scraping

Google, as the world’s most popular search engine, plays a central role in this landscape. Scraping Google search pages gives access to a wealth of data, which has many benefits in different areas. Before we explore the details of how to scrape Google search pages, we need to grasp the advantages of web scraping and understand why this method is so important for getting data from the web.

Why Scrape Google Search Results?

Scraping Google search pages has many benefits. It gives you access to a huge and varied set of data, thanks to Google’s top spot as the world’s most used search engine. This data covers many fields, from business to school to research.

Why scrape google search pages

The real strength of scraping is that you can get just the data you want. Google’s search results match what you’re looking for. When you scrape these results, you can get data that fits your search terms, letting you pull out just the info you need. Google Search shows a list of websites about the topic you search for. Scraping these links lets you build a full set of sources that fit what you’re researching or studying.

Companies can use Google search results scraping to study the market. They can get insights about their rivals from search results about their field or products. Looking at these results helps them understand market trends, what buyers think, and what other companies are doing. People who make content and write blogs can use this method to find good articles, blog posts, and news. This gives them a strong base to create their own content. Online marketers and SEO experts get a lot from scraping search pages.

Learning to scrape Google search pages gives you a strong tool to use the internet’s wealth of info. In this blog, we’ll look at the tech side of this process. We’ll use Python and the Crawlbase Crawling API as our tools. Let’s start this journey to learn about the art and science of web scraping for Google search pages.

3. Embarking on Your Web Scraping Journey with Crawlbase Crawling API

Let’s kick off your web scraping adventure with the Crawlbase Crawling API. Whether you’re new to web scraping or you’ve been doing it for years, this API will be your guide through the ins and outs of pulling data from websites. We’ll show you what makes this tool special and give you the lowdown on the Crawlbase Python Library.

Getting to Know the Crawlbase Crawling API

The Crawlbase Crawling API leads the pack in web scraping, giving users a strong and flexible way to pull data from websites. It aims to make the tricky job of web scraping easier by offering a simple interface with powerful tools. With Crawlbase helping you out, you can set up automatic data collection from websites, even from tricky ones like Google’s search pages. This automation saves you lots of time and work that you’d otherwise spend gathering data by hand.

This API lets you tap into Crawlbase’s big crawling setup through a RESTful API. You just call this API, telling it which URLs you want to scrape and any extra details the Crawling API needs. You get back the scraped data in a neat package as HTML or JSON. This smooth back-and-forth lets you zero in on getting useful data while Crawlbase takes care of the hard parts of web scraping.

The Benefits of Crawlbase Crawling API

Why did we pick the Crawlbase Crawling API for our web scraping project when there are so many choices out there? Let’s take a closer look at the thinking behind this choice:

  1. Scalability: Crawlbase has the ability to handle web scraping on a large scale. Your project might cover a few hundred pages or a huge database with millions of entries. Crawlbase adjusts to meet your needs, making sure your scraping projects grow without any hitches.
  2. Reliability: Web scraping can be harsh because websites keep changing. Crawlbase tackles this problem with solid error handling and monitoring. This cuts down the chances of scraping jobs failing or running into unexpected issues.
  3. Proxy Management: Websites often use anti-scraping measures like IP blocking. To deal with this, Crawlbase offers good proxy management. This feature helps you avoid IP bans and makes sure you can still get the data you’re after.
  4. Easy to use: The Crawlbase API takes away the hassle of building and running your own scraper or crawler. It works in the cloud, dealing with the complex tech stuff so you can focus on getting the data you need.
  5. Fresh data: The Crawlbase Crawling API makes sure you get the newest and most current data by crawling in real time. This is key for tasks that need accurate analysis and decision-making.
  6. Money-saving: Setting up and running your web scraping system can be expensive. On the other hand, the Crawlbase Crawling API offers a cheaper option where you pay for what you use.

Exploring the Crawlbase Python Library

The Crawlbase Python library helps you get the most out of the Crawlbase Crawling API. This library serves as your toolkit to add Crawlbase to Python projects. It makes the process easy for developers, no matter their experience level.

Here’s a glimpse of how it works:

  1. Initialization: Begin your journey by initializing the Crawling API class with your Crawlbase token.

from crawlbase import CrawlingAPI

api = CrawlingAPI({ 'token': 'YOUR_CRAWLBASE_TOKEN' })

  2. Scraping URLs: Effortlessly scrape URLs using the get function, specifying the URL and any optional parameters.

response = api.get('https://www.example.com')
if response['status_code'] == 200:
    print(response['body'])

  3. Customization: The Crawlbase Python library has options to adjust your scraping, for example by passing extra parameters to the get call (see the sketch below). You can find more options in the API documentation.
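As a minimal sketch of that idea (using a placeholder token and an example URL), the snippet below passes an options dictionary as the second argument of get, with the 'format' parameter that is covered in detail later in this tutorial:

from crawlbase import CrawlingAPI

# Initialize the API with your token (placeholder shown here)
api = CrawlingAPI({ 'token': 'YOUR_CRAWLBASE_TOKEN' })

# Pass options as the second argument of get();
# 'format': 'json' asks the Crawling API to return a JSON response
options = { 'format': 'json' }
response = api.get('https://www.example.com', options)

if response['status_code'] == 200:
    print(response['body'])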

Now you know about the Crawlbase Crawling API and can use it well. We’re about to dive into Google’s huge search results, uncovering the secrets of getting web data. Let’s get started and explore all the info Google has to offer!

4. Essential Requirements for a Successful Start

Before you start your web scraping journey with the Crawlbase Crawling API, you need to get some essential things ready. This part walks through these must-haves, making sure you’re all set for what’s ahead.

Configuring Your Development Environment

Setting up your coding space is the first thing to do in your web scraping adventure. Here’s what you need to do:

  1. Python Installation: Make sure you have Python on your computer. You can get the newest Python version from their official website. You’ll find easy-to-follow setup guides there too.
  2. Code Editor: Pick a code editor or IDE to write your Python code. Some popular choices are Visual Studio Code, PyCharm, Jupyter Notebook, or even a basic text editor like Sublime Text.
  3. Virtual Environment: Setting up a virtual environment for your project is a smart move. It keeps your project’s required packages separate from what’s installed on your computer’s main Python setup. This helps avoid any clashes between different versions of packages. You can use Python’s built-in venv module or other tools like virtualenv to create these isolated environments.

Installing the Necessary Libraries

To interact with the Crawlbase Crawling API and perform web scraping tasks effectively, you’ll need to install some Python libraries. Here’s a list of the key libraries you’ll require:

  1. Crawlbase: A lightweight, dependency-free Python class that acts as a wrapper for the Crawlbase API. We can use it to send requests to the Crawling API and receive responses. You can install it using pip:

pip install crawlbase

  2. SQLite: SQLite is a lightweight, server-less, and self-contained database engine that we’ll use to store the scraped data. Python comes with built-in support for SQLite, so there’s no need to install it separately. A quick check that both libraries are ready to use is shown below.
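Here is a minimal, optional sanity check you can run once the installation is done; it simply imports both libraries and prints the bundled SQLite engine version:

# Quick sanity check: both libraries should import without errors
import sqlite3
from crawlbase import CrawlingAPI

# sqlite3 ships with Python; this prints the bundled SQLite engine version
print(sqlite3.sqlite_version)
print("crawlbase library imported successfully")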

Creating Your Crawlbase Account

Now, let’s get you set up with a Crawlbase account. Follow these steps:

  1. Visit the Crawlbase Website: Open your web browser and navigate to the Crawlbase website Signup page to begin the registration process.
  2. Provide Your Details: You’ll be asked to provide your email address and create a password for your Crawlbase account. Fill in the required information.
  3. Verification: After submitting your details, you may need to verify your email address. Check your inbox for a verification email from Crawlbase and follow the instructions provided.
  4. Login: Once your account is verified, return to the Crawlbase website and log in using your newly created credentials.
  5. Access Your API Token: You’ll need an API token to use the Crawlbase Crawling API. You can find your tokens here.

With your development environment configured, the necessary libraries installed, and your Crawlbase account created, you’re now equipped with the essentials to dive into the world of web scraping using the Crawlbase Crawling API. In the following sections, we’ll delve deeper into understanding Google’s search page structure and the intricacies of web scraping. So, let’s continue our journey!

5. Understanding the Structure of Google Search Results Pages

To get good at scraping Google search pages, you need to grasp how these pages are put together. Google uses a complex layout that mixes different parts to show search results. In this part, we’ll take apart the main pieces and show you how to spot the valuable data within.

Components of a Google Search Results Page

A typical Google search page comprises several distinct sections, each serving a specific purpose:

Google Search Page
  1. Search Bar: You’ll find the search bar at the top of the page. This is where you type what you’re looking for. Google then looks through its database to show you matching results.
  2. Search Tools: Just above your search results, you’ll see a bunch of options to narrow down what you’re seeing. You can change how the results are sorted, pick a specific date range, or choose the type of content you want. This helps you find what you need.
  3. Ads: Keep an eye out for sponsored content at the beginning and end of your search results. These are ads that companies pay for. They might be related to what you searched for, but sometimes they’re not.
  4. Locations: Google often shows a map at the top of the search results page that relates to what you’re looking for. It also lists the addresses and how to get in touch with the most relevant places.
  5. Search Results: The main part of the page has a list of websites, articles, pictures, or other stuff that matches your search. Each item usually comes with a title, a small preview, and the web address.
  6. People Also Ask: Next to the search results, you’ll often see a “People Also Ask” box. It works like a FAQ section showing questions that are tied to what you searched for.
  7. Related Searches: Google shows a list of related search links based on your query. These links can take you to useful resources that add to your data collection.
  8. Knowledge Graph: On the right side of the page, you might see a Knowledge Graph panel with information about the topic you looked up. This panel often has key facts, images, and related topics.
  9. Pagination: If there are more pages of search results, you’ll find pagination links at the bottom. These let you move through the results.

In the next parts, we’ll explore the nuts and bolts of scraping Google search pages. We’ll cover how to extract key data, deal with pagination, and save information to an SQLite database.

6. Mastering Google Search Page Scraping with the Crawling API

This part will focus on becoming skilled at Google Search page scraping using the Crawlbase Crawling API. We want to use this powerful tool to its full potential to pull information from Google’s search results. We’ll go through the key steps, from getting your Crawlbase token to handling pagination. As an example, we’ll collect important details about search results for the query “data science” on Google.

Getting the Correct Crawlbase Token

Before we embark on our Google Search page scraping journey, we need to secure access to the Crawlbase Crawling API by obtaining a suitable token. Crawlbase provides two types of tokens: the Normal Token (TCP) for static websites and the JavaScript Token (JS) for dynamic pages. For Google Search pages, the Normal Token is the right choice.

from crawlbase import CrawlingAPI

# Initialize the Crawling API with your Crawlbase Normal token
api = CrawlingAPI({ 'token': 'CRAWLBASE_NORMAL_TOKEN' })

You can get your Crawlbase token here after creating an account.

Setting up Crawlbase Crawling API

With our token in hand, let’s proceed to configure the Crawlbase Crawling API for effective data extraction. Crawling API responses can be obtained in two formats: HTML or JSON. By default, the API returns responses in HTML format. However, we can specify the “format” parameter to receive responses in JSON.

HTML response:

Headers:
url: "The URL which was crawled"
original_status: 200
pc_status: 200

Body:
The HTML of the page

JSON Response:

// pass query param "format=json" to receive response in JSON format
{
  "original_status": "200",
  "pc_status": 200,
  "url": "The URL which was crawled",
  "body": "The HTML of the page"
}

You can read more about the Crawling API response here. For this example, we will go with the JSON response. We’ll utilize the initialized API object to make requests. Specify the URL you intend to scrape using the api.get(url, options={}) function.

from crawlbase import CrawlingAPI
import json

# Initialize the Crawling API with your Crawlbase Normal token
api = CrawlingAPI({ 'token': 'CRAWLBASE_NORMAL_TOKEN' })

# URL of the Google search page you want to scrape
google_search_url = 'https://www.google.com/search?q=data+science'

# options for Crawling API
options = {
    'format': 'json'
}

# Make a request to scrape the Google search page with options
response = api.get(google_search_url, options)

# Check if the request was successful
if response['headers']['pc_status'] == '200':
    # Loading JSON from response body after decoding byte data
    response_json = json.loads(response['body'].decode('latin1'))

    # pretty printing response body
    print(json.dumps(response_json, indent=4, sort_keys=True))
else:
    print("Failed to retrieve the page. Status code:", response['status_code'])

In the above code, we have initialized the API, defined the Google search URL, and set up the options for the Crawling API. We are passing the “format” parameter with the value “json” so that we receive the response in JSON. The Crawling API provides many other important parameters. You can read about them here.

Upon successful execution of the code, you will see output like the following.

{
  "body": "Crawled HTML of page",
  "original_status": 200,
  "pc_status": 200,
  "url": "https://www.google.com/search?q=data+science"
}

Selecting the Ideal Scraper

Crawling API provides multiple built-in scrapers for different important websites, including Google. You can read about the available scrapers here. The “scraper” parameter is used to parse the retrieved data according to a specific scraper provided by the Crawlbase API. It’s optional; if not specified, you will receive the full HTML of the page for manual scraping. If you use this parameter, the response will return as JSON containing the information parsed according to the specified scraper.

Example:

# Example using a specific scraper
response = api.get('https://www.google.com/search?q=your_search_query', { 'scraper': 'scraper_name' })

One of the available scrapers is “google-serp”, designed for Google search result pages. It returns an object with details like ads, the “People Also Ask” section, search results, related searches, and more. This includes all the information we want. You can read about the “google-serp” scraper here.

Let’s add this parameter to our example and see what we get in the response:

from crawlbase import CrawlingAPI
import json

# Initialize the Crawling API with your Crawlbase Normal token
api = CrawlingAPI({ 'token': 'CRAWLBASE_NORMAL_TOKEN' })

# URL of the Google search page you want to scrape
google_search_url = 'https://www.google.com/search?q=data+science'

# options for Crawling API
options = {
    'scraper': 'google-serp'
}

# Make a request to scrape the Google search page with options
response = api.get(google_search_url, options)

# Check if the request was successful
if response['status_code'] == 200 and response['headers']['pc_status'] == '200':
    # Loading JSON from response body after decoding byte data
    response_json = json.loads(response['body'].decode('latin1'))

    # pretty printing response body
    print(json.dumps(response_json, indent=4, sort_keys=True))
else:
    print("Failed to retrieve the page. Status code:", response['status_code'])

Output:

{
"body": {
"ads": [],
"numberOfResults": 2520000000,
"peopleAlsoAsk": [
{
"description": "A data scientist uses data to understand and explain the phenomena around them, and help organizations make better decisions. Working as a data scientist can be intellectually challenging, analytically satisfying, and put you at the forefront of new advances in technology.Jun 15, 2023",
"destination": {
"text": "Courserahttps://www.coursera.org \u00e2\u0080\u00ba Coursera Articles \u00e2\u0080\u00ba Data",
"url": "https://www.coursera.org/articles/what-is-a-data-scientist#:~:text=A%20data%20scientist%20uses%20data,of%20new%20advances%20in%20technology."
},
"position": 1,
"title": "What exactly does a data scientist do?",
"url": "https://google.com/search?sca_esv=561439800&q=What+exactly+does+a+data+scientist+do%3F&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQzmd6BAgvEAY"
},
{
"description": "Yes, because it demands a solid foundation in math, statistics, and computer programming, entering a data science degree can be difficult. The abilities and knowledge required to excel in this sector may, however, be acquired by anybody with the right amount of effort and commitment.Aug 11, 2023",
"destination": {
"text": "simplilearn.comhttps://www.simplilearn.com \u00e2\u0080\u00ba is-data-science-hard-article",
"url": "https://www.simplilearn.com/is-data-science-hard-article#:~:text=Yes%2C%20because%20it%20demands%20a,amount%20of%20effort%20and%20commitment."
},
"position": 2,
"title": "Is data science too hard?",
"url": "https://google.com/search?sca_esv=561439800&q=Is+data+science+too+hard%3F&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQzmd6BAgqEAY"
},
{
"description": "Does Data Science Require Coding? Yes, data science needs coding because it uses languages like Python and R to create machine-learning models and deal with large datasets.Jul 28, 2023",
"destination": {
"text": "simplilearn.comhttps://www.simplilearn.com \u00e2\u0080\u00ba what-skills-do-i-need-to-b...",
"url": "https://www.simplilearn.com/what-skills-do-i-need-to-become-a-data-scientist-article#:~:text=Does%20Data%20Science%20Require%20Coding,and%20deal%20with%20large%20datasets."
},
"position": 3,
"title": "Is data science a coding?",
"url": "https://google.com/search?sca_esv=561439800&q=Is+data+science+a+coding%3F&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQzmd6BAgrEAY"
},
{
"description": "Is data science a good career? Data science is a fantastic career with a tonne of potential for future growth. Already, there is a lot of demand, competitive pay, and several benefits. Companies are actively looking for data scientists that can glean valuable information from massive amounts of data.Jun 19, 2023",
"destination": {
"text": "simplilearn.comhttps://www.simplilearn.com \u00e2\u0080\u00ba is-data-science-a-good-car...",
"url": "https://www.simplilearn.com/is-data-science-a-good-career-choice-article#:~:text=View%20More-,Is%20data%20science%20a%20good%20career%3F,from%20massive%20amounts%20of%20data."
},
"position": 4,
"title": "Is data science a good career?",
"url": "https://google.com/search?sca_esv=561439800&q=Is+data+science+a+good+career%3F&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQzmd6BAgsEAY"
}
],
"relatedSearches": [
{
"title": "data science jobs",
"url": "https://google.com/search?sca_esv=561439800&q=Data+science+jobs&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhVEAE"
},
{
"title": "data science salary",
"url": "https://google.com/search?sca_esv=561439800&q=Data+science+salary&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhQEAE"
},
{
"title": "data science degree",
"url": "https://google.com/search?sca_esv=561439800&q=Data+Science+degree&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhREAE"
},
{
"title": "data science - wikipedia",
"url": "https://google.com/search?sca_esv=561439800&q=data+science+-+wikipedia&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhTEAE"
},
{
"title": "data science definition and example",
"url": "https://google.com/search?sca_esv=561439800&q=Data+science+definition+and+example&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhUEAE"
},
{
"title": "data science syllabus",
"url": "https://google.com/search?sca_esv=561439800&q=Data+Science+syllabus&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhSEAE"
},
{
"title": "data science vs data analytics",
"url": "https://google.com/search?sca_esv=561439800&q=Data+science+vs+data+analytics&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhPEAE"
},
{
"title": "what is data science in python",
"url": "https://google.com/search?sca_esv=561439800&q=What+is+Data+Science+in+Python&sa=X&ved=2ahUKEwikkP3WyYWBAxUkkWoFHTxKCSIQ1QJ6BAhNEAE"
}
],
"searchResults": [
{
"description": "Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI), and machine learning with specific subject ...",
"destination": "IBMhttps://www.ibm.com \u00e2\u0080\u00ba topics \u00e2\u0080\u00ba data-science",
"position": 1,
"postDate": "",
"title": "What is Data Science?",
"url": "https://www.ibm.com/topics/data-science"
},
{
"description": "Data scientists examine which questions need answering and where to find the related data. They have business acumen and analytical skills as well as the ...",
"destination": "University of California, Berkeleyhttps://ischoolonline.berkeley.edu \u00e2\u0080\u00ba Data Science",
"position": 2,
"postDate": "",
"title": "What is Data Science? - UC Berkeley Online",
"url": "https://ischoolonline.berkeley.edu/data-science/what-is-data-science/"
},
{
"description": "A data scientist is a professional who creates programming code and combines it with statistical knowledge to create insights from data.",
"destination": "Wikipediahttps://en.wikipedia.org \u00e2\u0080\u00ba wiki \u00e2\u0080\u00ba Data_science",
"position": 3,
"postDate": "",
"title": "Data science",
"url": "https://en.wikipedia.org/wiki/Data_science"
},
{
"description": "A data scientist's duties can include developing strategies for analyzing data, preparing data for analysis, exploring, analyzing, and visualizing data, ...",
"destination": "Oraclehttps://www.oracle.com \u00e2\u0080\u00ba what-is-data-science",
"position": 4,
"postDate": "",
"title": "What is Data Science?",
"url": "https://www.oracle.com/what-is-data-science/"
},
{
"description": "Aug 1, 2023 \u00e2\u0080\u0094 Data science is the domain of study that deals with vast volumes of data using modern tools and techniques to find unseen patterns, derive ...",
"destination": "Simplilearn.comhttps://www.simplilearn.com \u00e2\u0080\u00ba data-science-tutorial",
"position": 5,
"postDate": "",
"title": "What is Data Science? A Simple Explanation and More",
"url": "https://www.simplilearn.com/tutorials/data-science-tutorial/what-is-data-science"
},
{
"description": "Jun 15, 2023 \u00e2\u0080\u0094 A data scientist uses data to understand and explain the phenomena around them, and help organizations make better decisions.",
"destination": "Courserahttps://www.coursera.org \u00e2\u0080\u00ba Coursera Articles \u00e2\u0080\u00ba Data",
"position": 6,
"postDate": "",
"title": "What Is a Data Scientist? Salary, Skills, and How to ...",
"url": "https://www.coursera.org/articles/what-is-a-data-scientist"
},
{
"description": "Data Science is a combination of mathematics, statistics, machine learning, and computer science. Data Science is collecting, analyzing and interpreting data to ...",
"destination": "Great Learninghttps://www.mygreatlearning.com \u00e2\u0080\u00ba blog \u00e2\u0080\u00ba what-is-dat...",
"position": 7,
"postDate": "",
"title": "What is Data Science?: Beginner's Guide",
"url": "https://www.mygreatlearning.com/blog/what-is-data-science/"
},
{
"description": "Data science Specializations and courses teach the fundamentals of interpreting data, performing analyses, and understanding and communicating actionable ...",
"destination": "Courserahttps://www.coursera.org \u00e2\u0080\u00ba browse \u00e2\u0080\u00ba data-science",
"position": 8,
"postDate": "",
"title": "Best Data Science Courses Online [2023]",
"url": "https://www.coursera.org/browse/data-science"
},
{
"description": "Apr 5, 2023 \u00e2\u0080\u0094 Data science is a multidisciplinary field of study that applies techniques and tools to draw meaningful information and actionable insights ...",
"destination": "Built Inhttps://builtin.com \u00e2\u0080\u00ba data-science",
"position": 9,
"postDate": "",
"title": "What Is Data Science? A Complete Guide.",
"url": "https://builtin.com/data-science"
}
],
"snackPack": {
"mapLink": "",
"moreLocationsLink": "",
"results": []
}
},
"original_status": 200,
"pc_status": 200,
"url": "https://www.google.com/search?q=data%20science"
}

The above output shows that the “google-serp” scraper does its job very efficiently. It scrapes all the important information, including nine search results from the Google search page, and gives us a JSON object that we can easily use in our code as needed.
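As a small sketch of what that looks like in practice, the snippet below reuses the response_json variable from the example above and loops over the “searchResults” array, printing the fields returned by the scraper:

# response_json is the parsed JSON from the previous example
search_results = response_json['body']['searchResults']

for result in search_results:
    # Each result carries a position, title, URL, and description
    print(result['position'], result['title'])
    print(result['url'])
    print(result['description'])
    print('-' * 40)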

Effortlessly Managing Pagination

When it comes to scraping Google search pages, mastering pagination is essential for gathering comprehensive data. The Crawlbase “google-serp” scraper provides valuable information in its JSON response: the total number of results, known as “numberOfResults.” This information serves as our guiding star for effective pagination handling.

Your scraper must navigate through the various pages of results concealed within the pagination to capture all the search results. You’ll use the “start” query parameter to do this, mirroring Google’s own pagination. In our example, the “google-serp” scraper returns nine search results per page, creating a consistent gap of nine results between pages.

Determining the next value for the “start” query parameter is a matter of taking the position of the last “searchResults” object in the response and adding it, plus one, to the previous start value. You’ll continue this process until you’ve reached your desired result count or harvested the maximum number of results available. This systematic approach ensures that valuable data is collected, enabling you to extract comprehensive insights from Google’s search pages.
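As a quick illustration of that calculation (reusing the google_search_url and search_results variables from the full example that follows), the next page URL can be derived like this:

# start_value begins at 1 for the first page
start_value = 1

# ... after scraping a page, advance "start" past the last result's position
last_position = search_results[-1]['position']
start_value = start_value + last_position + 1

# Build the URL for the next page of results
next_page_url = f'{google_search_url}&start={start_value}'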

Let’s update the example code to handle pagination and scrape all the search results:

from crawlbase import CrawlingAPI
import json

# Initialize the Crawling API with your Crawlbase Normal token
api = CrawlingAPI({ 'token': 'CRAWLBASE_NORMAL_TOKEN' })

# URL of the Google search page you want to scrape
google_search_url = 'https://www.google.com/search?q=data+science'

# options for Crawling API
options = {
    'scraper': 'google-serp'
}

# List to store the scraped search results
search_results = []

def get_total_results(url):
    # Make a request to scrape the Google search page with options
    response = api.get(url, options)

    # Check if the request was successful
    if response['status_code'] == 200 and response['headers']['pc_status'] == '200':
        # Loading JSON from response body after decoding byte data
        response_json = json.loads(response['body'].decode('latin1'))

        # Getting Scraper Results
        scraper_result = response_json['body']

        # Extract pagination information
        numberOfResults = scraper_result.get("numberOfResults", None)
        return numberOfResults
    else:
        print("Failed to retrieve the page. Status code:", response['status_code'])
        return None

def scrape_search_results(url):
    # Make a request to scrape the Google search page with options
    response = api.get(url, options)

    # Check if the request was successful
    if response['status_code'] == 200 and response['headers']['pc_status'] == '200':
        # Loading JSON from response body after decoding byte data
        response_json = json.loads(response['body'].decode('latin1'))

        # Getting Scraper Results
        scraper_result = response_json['body']

        # Extracting search results from the JSON response
        results = scraper_result.get("searchResults", [])
        search_results.extend(results)
    else:
        print("Failed to retrieve the page. Status code:", response['status_code'])

# Extract pagination information
numberOfResults = get_total_results(google_search_url) or 50
# Initialize starting position for search_results
start_value = 1

# limiting search results to 50 max for the example
# you can increase limit upto numberOfResults to scrape max search results
while start_value < 50:
    if start_value > numberOfResults:
        break
    page_url = f'{google_search_url}&start={start_value}'
    scrape_search_results(page_url)
    start_value = start_value + search_results[-1]['position'] + 1

# Process the collected search results as needed
print(f'Total Search Results: {len(search_results)}')

Example Output:

Total Search Results: 47

As you can see above, we now have 47 search results, which is far more than we had previously. You can update the limit in the code (set to 50 for this example) and scrape any number of search results within the range of available results.

Saving Data to an SQLite Database

Once you’ve successfully scraped Google search results using the Crawlbase API, you might want to persist this data for further analysis or use it in your applications. One efficient way to store structured data like search results is by using an SQLite database, which is lightweight, self-contained, and easy to work with in Python.

Here’s how you can save the URL, title, description, and position of every search result object to an SQLite database:

import sqlite3
from crawlbase import CrawlingAPI
import json

def scrape_google_search():
    # Initialize the Crawling API with your Crawlbase Normal token
    api = CrawlingAPI({'token': 'CRAWLBASE_NORMAL_TOKEN'})

    # URL of the Google search page you want to scrape
    google_search_url = 'https://www.google.com/search?q=data+science'

    # Options for Crawling API
    options = {
        'scraper': 'google-serp'
    }

    # List to store the scraped search results
    search_results = []

    def get_total_results(url):
        # Make a request to scrape the Google search page with options
        response = api.get(url, options)

        # Check if the request was successful
        if response['status_code'] == 200 and response['headers']['pc_status'] == '200':
            # Loading JSON from response body after decoding byte data
            response_json = json.loads(response['body'].decode('latin1'))

            # Getting Scraper Results
            scraper_result = response_json['body']

            # Extract pagination information
            numberOfResults = scraper_result.get("numberOfResults", None)
            return numberOfResults
        else:
            print("Failed to retrieve the page. Status code:", response['status_code'])
            return None

    def scrape_search_results(url):
        # Make a request to scrape the Google search page with options
        response = api.get(url, options)

        # Check if the request was successful
        if response['status_code'] == 200 and response['headers']['pc_status'] == '200':
            # Loading JSON from response body after decoding byte data
            response_json = json.loads(response['body'].decode('latin1'))

            # Getting Scraper Results
            scraper_result = response_json['body']

            # Extracting search results from the JSON response
            results = scraper_result.get("searchResults", [])
            search_results.extend(results)
        else:
            print("Failed to retrieve the page. Status code:", response['status_code'])

    def initialize_database():
        # Create or connect to the SQLite database
        conn = sqlite3.connect('search_results.db')
        cursor = conn.cursor()

        # Create a table to store the search results
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS search_results (
                title TEXT,
                url TEXT,
                description TEXT,
                position INTEGER
            )
        ''')

        # Commit changes and close the database connection
        conn.commit()
        conn.close()

    def insert_search_results(result_list):
        # Create or connect to the SQLite database
        conn = sqlite3.connect('search_results.db')
        cursor = conn.cursor()

        # Iterate through result_list and insert data into the database
        for result in result_list:
            title = result.get('title', '')
            url = result.get('url', '')
            description = result.get('description', '')
            position = result.get('position', None)

            cursor.execute('INSERT INTO search_results VALUES (?, ?, ?, ?)',
                           (title, url, description, position))

        # Commit changes and close the database connection
        conn.commit()
        conn.close()

    # Initializing the database
    initialize_database()

    # Extract pagination information
    numberOfResults = get_total_results(google_search_url) or 50
    # Initialize starting position for search_results
    start_value = 1

    # limiting search results to 50 max for the example
    # you can increase limit upto numberOfResults to scrape max search results
    while start_value < 50:
        if start_value > numberOfResults:
            break
        page_url = f'{google_search_url}&start={start_value}'
        scrape_search_results(page_url)
        start_value = start_value + search_results[-1]['position'] + 1

    # save search_results into database
    insert_search_results(search_results)

if __name__ == "__main__":
    scrape_google_search()

In the above code, the scrape_google_search() function is the entry point. It initializes the Crawlbase API with an authentication token and specifies the Google search URL that will be scraped. It also sets up an empty list called search_results to collect the extracted search results.

The scrape_search_results(url) function takes a URL as input, sends a request to the Crawlbase API to fetch the Google search results page, and extracts relevant information from the response. It then appends this data to the search_results list.

Two other key functions, initialize_database() and insert_search_results(result_list), deal with managing a SQLite database. The initialize_database() function is responsible for creating or connecting to a database file named search_results.db and defining a table structure to store the search results. The insert_search_results(result_list) function inserts the scraped search results into this database table.

The script also handles pagination by continuously making requests for subsequent search result pages. The maximum number of search results is set to 50 for this example. The scraped data, including titles, URLs, descriptions, and positions, is then saved into the SQLite database, which we can use for further analysis.
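Once the script has finished, you can read the stored rows back with a few lines of standard sqlite3 code. Here's a small sketch for inspecting the search_results table created above:

import sqlite3

# Connect to the database created by the scraper
conn = sqlite3.connect('search_results.db')
cursor = conn.cursor()

# Fetch the stored results, ordered by their position on the page
cursor.execute('SELECT position, title, url FROM search_results ORDER BY position')
for position, title, url in cursor.fetchall():
    print(position, title, url)

conn.close()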

search_results database preview:

Database Screenshot

7. Scrape Google Search Results with Crawlbase

Web scraping is a transformative technology that empowers us to extract valuable insights from the vast ocean of information on the internet, with Google search pages being a prime data source. This blog has taken you on a comprehensive journey into the world of web scraping, employing Python and the Crawlbase Crawling API as our trusty companions.

We began by understanding the significance of web scraping, revealing its potential to streamline data collection, enhance efficiency, and inform data-driven decision-making across various domains. We then introduced the Crawlbase Crawling API, a robust and user-friendly tool tailored for web scraping, emphasizing its scalability, reliability, and real-time data access.

We covered essential prerequisites, including configuring your development environment, installing necessary libraries, and creating a Crawlbase account. We learned how to obtain the token, set up the API, select the ideal scraper, and efficiently manage pagination to scrape comprehensive search results.

Now that you know how to do web scraping, you can explore and gather information from Google search results. Whether you’re someone who loves working with data, a market researcher, or a business professional, web scraping is a useful skill. It can give you an advantage and help you gain deeper insights. So, as you start your web scraping journey, I hope you collect a lot of useful data and gain plenty of valuable insights.

8. Frequently Asked Questions

Q. What is the significance of web scraping Google search results page?

Web scraping Google search results is significant because it provides access to a vast amount of data available on the internet. Google is a primary gateway to information, and scraping its search results allows for various applications, including market research, data analysis, competitor analysis, and content aggregation.

Q. What are the main advantages of using the “google-serp” Scraper?

The “google-serp” scraper is specifically designed for scraping Google search result pages. It provides a structured JSON response with essential information such as search results, ads, related searches, and more. This scraper is advantageous because it simplifies the data extraction process, making it easier to work with the data you collect. It also ensures you capture all relevant information from Google’s dynamic search pages.

Q. What are the key components of a Google search page, and why is understanding them important for web scraping?

A Google search page comprises several components: the search bar, search tools, ads, locations, search results, the “People Also Ask” section, related searches, knowledge graph, and pagination. Understanding these components is essential for web scraping as it helps you identify the data you need and navigate through dynamic content effectively.

Q. How can I handle pagination when web scraping Google search results, and why is it necessary?

Handling pagination in web scraping Google search pages involves navigating through multiple result pages to collect comprehensive data. It’s necessary because Google displays search results across multiple pages, and you’ll want to scrape all relevant information. You can use the “start” query parameter and the total number of results to determine the correct URLs for each page and ensure complete data extraction.