Google Shopping is one of the densest e-commerce surfaces on the web. A single results page packs product titles, prices, sellers, ratings, and links from dozens of retailers side by side, which is exactly the shape of data a price-monitoring or market-research job wants. If you sell anything online, the public listings show you what competitors charge and who stocks a given product right now.
This guide shows you how to scrape Google Shopping data with Python the reliable way. You build a small, runnable scraper that fetches a rendered Shopping results page through the Crawling API, parses each listing with BeautifulSoup, handles pagination, and exports the records to JSON and CSV for price tracking. The whole walkthrough stays scoped to public Shopping listings that anyone can see without an account, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.
What you will build
A Python script that takes a public Google Shopping search URL, retrieves the HTML through the Crawling API, and extracts a structured record for every product on the page. We will use the query "louis vuitton bags" as the running example and pull these fields from each listing:
- Title: the product name as shown in the listing.
- Price: the displayed price text, useful as the core signal for price monitoring.
- Seller: the merchant or retailer offering the product.
- Rating: the average customer rating when one is shown.
- Link: the URL to the product's Google Shopping page.
Why a plain request fails on Google Shopping
If you fire a bare HTTP request at a Google Shopping URL from a script, you rarely get the clean page you see in your own browser. Two things work against you. First, Google renders much of the Shopping grid with JavaScript and tailors what it returns based on the requesting IP and region, so a foreign datacenter address can come back with a consent wall, a different currency, or partial content. Second, Google watches for automated traffic: requests that do not look like a real browser get challenged, fed a CAPTCHA, or blocked before they reach the listings.
So a working Google Shopping scraper needs two things in one request: an IP the platform reads as a real visitor, and a browser that renders the page when it leans on scripts. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but keeping those healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it fetches from a trusted residential IP and renders when needed, and it returns finished HTML for you to parse.
Google Shopping shows different products, currencies, and sellers depending on where the request appears to come from. A request from a residential IP in the country you care about looks like an ordinary shopper, while a foreign datacenter address is an immediate tell. The Crawling API rotates through residential addresses server-side and lets you pin a country, so you get the listings a local shopper would see. You can start with 1,000 free requests, no credit card needed.
Prerequisites
You need a few things in place before writing any code. None of them take long.
Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If BeautifulSoup is new to you, our guide to using BeautifulSoup in Python covers the parsing basics this tutorial assumes.
Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda.
A Crawlbase account and token. Sign up, open your dashboard, and copy your request token. Crawlbase issues two token types: a normal token for static pages and a JavaScript token for browser-rendered pages. Google Shopping works with the normal token in most regions. Your first 1,000 requests are free, and adding billing details before you spend them unlocks an extra 9,000 free requests. Treat the token like a password: it authenticates your requests, so keep it out of version control.
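One common way to keep the token out of version control is to read it from an environment variable instead of hard-coding it. The sketch below is a minimal example of that idea; the variable name CRAWLBASE_TOKEN and the helper load_token() are assumptions made for illustration, not something the Crawlbase client requires.

```python
import os


def load_token(env_var="CRAWLBASE_TOKEN"):
    """Return the token stored in the given environment variable.

    The default name CRAWLBASE_TOKEN is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

With this in place, the client in Step 1 could be created with CrawlingAPI({"token": load_token()}) so the secret never appears in the script itself.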
Set up the project
Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs.
```bash
python --version
python -m venv shopping_env
source shopping_env/bin/activate
pip install crawlbase beautifulsoup4
```
On Windows, activate the environment with shopping_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client that sends the request to the Crawling API and returns the rendered HTML, and beautifulsoup4 parses that HTML so you can pull out each field by CSS selector.
Step 1: Fetch the page through the Crawling API
Start by getting the HTML. Initialize the CrawlingAPI client with your token, then write a small fetch_html() function that sends your target URL with a set of options, checks that the underlying page came back with a 200 status, and returns the decoded HTML. The options pin the country, set a realistic user agent, and give the page a few seconds to render before the HTML is captured.

```python
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup
import json

# Initialize CrawlingAPI with your access token
crawling_api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

# Request options: pin the region, look like a real browser, let the page render
options = {
    "country": "US",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
    "page_wait": 5000,
}


def fetch_html(url):
    response = crawling_api.get(url, options)
    if response["headers"]["pc_status"] != "200":
        print(f"Failed to fetch the page. Status: {response['headers']['pc_status']}")
        return None
    return response["body"].decode("utf-8")


if __name__ == "__main__":
    url = "https://www.google.com/search?q=louis+vuitton+bags&tbm=shop&num=20"
    html = fetch_html(url)
    if html:
        print(html[:500])
```
The client returns a response whose headers["pc_status"] is the status Crawlbase saw when it fetched the page, so guarding on it means a block or consent wall surfaces as a clear message instead of feeding garbage into the parser. The search URL carries three things that matter: q is the query, tbm=shop switches Google into Shopping mode, and num=20 asks for twenty results per page. Save the script as shopping.py, run it with python shopping.py, and you should see real Shopping markup in the first 500 characters, which confirms the fetch works before you write a single selector.
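If you plan to search for more than one product, it can help to build the URL from its parts rather than editing the string by hand. The following is a small sketch using only the standard library; build_shopping_url() is a hypothetical helper name, and it assumes the same three parameters described above.

```python
from urllib.parse import urlencode


def build_shopping_url(query, per_page=20):
    """Build a Shopping search URL from a plain-text query."""
    # q is the query, tbm=shop selects Shopping mode, num is results per page
    params = {"q": query, "tbm": "shop", "num": per_page}
    return "https://www.google.com/search?" + urlencode(params)


print(build_shopping_url("louis vuitton bags"))
# prints https://www.google.com/search?q=louis+vuitton+bags&tbm=shop&num=20
```

urlencode takes care of escaping spaces and special characters, so queries with ampersands or quotes do not break the URL.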
That pc_status check reads 200 when the request reaches Google looking like a real shopper. The Crawling API fetches the page from a rotating residential IP in the country you pin, renders the JavaScript-heavy Shopping grid for you, and hands back finished HTML, so you skip running a headless browser fleet and sourcing a residential proxy pool yourself. Point it at a public Shopping URL on the free tier first.
Understanding the Google Shopping results page
Before writing selectors, it helps to know how the results page is laid out. Open a Shopping search in your browser, right-click a product, and choose Inspect to see the structure in dev tools. The page has a few moving parts.
- Product listings. Each card carries a title, an image, a price, a seller or retailer name, and a rating when one exists. This is the bulk of what you scrape.
- Pagination. Results spread across multiple pages, reached by changing the start offset in the URL.
- Filters and sorting. Price range, brand, and category filters change what the page returns, which matters when you want a targeted slice.
- Sponsored listings. Some cards are ads. If you only want organic listings, you need to tell them apart from sponsored ones.
Step 2: Parse the listings with BeautifulSoup
With HTML in hand, load it into BeautifulSoup and pull each product by its selector. Google wraps each grid result in a .sh-dgr__grid-result container, and the individual fields live in nested elements inside it. Inspect the live page in your browser's dev tools to confirm the current class names; the selectors below match the layout at the time of writing.
```python
def parse_products(html):
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.select(".sh-dgr__grid-result"):
        title_el = item.select_one("h3.tAxDx")
        price_el = item.select_one("span.a8Pemb.OFFNJ")
        seller_el = item.select_one(".aULzUe.IuHnof")
        rating_el = item.select_one(".Rsc7Yb")
        link_el = item.select_one("a.Lq5OHe")
        products.append({
            "title": title_el.get_text(strip=True) if title_el else None,
            "price": price_el.get_text(strip=True) if price_el else None,
            "seller": seller_el.get_text(strip=True) if seller_el else None,
            "rating": rating_el.get_text(strip=True) if rating_el else None,
            "link": "https://www.google.com" + link_el["href"] if link_el else None,
        })
    return products
```
Each field reads from its own selector: h3.tAxDx holds the title, span.a8Pemb.OFFNJ the price, .aULzUe.IuHnof the seller or retailer, .Rsc7Yb the rating, and the anchor a.Lq5OHe carries the product link. The href Google stores is relative, so prefixing https://www.google.com turns it into a full URL. Every field uses a guard, ... if el else None, so a missing rating or seller leaves a None in the record instead of raising an exception and killing the whole run.
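String concatenation works while every href is relative. If a listing ever carries an absolute URL, or an anchor has no href at all, a slightly more defensive approach is to resolve the link with urljoin. This is a sketch of that alternative, not what the tutorial's parser does; absolute_link() is a hypothetical helper.

```python
from urllib.parse import urljoin

GOOGLE_BASE = "https://www.google.com"


def absolute_link(href):
    """Resolve a listing href against the Google origin; empty input gives None."""
    if not href:
        return None
    # urljoin leaves absolute URLs untouched and prefixes relative ones
    return urljoin(GOOGLE_BASE, href)
```

Inside parse_products, the link field could then read absolute_link(link_el.get("href")) if link_el else None.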
Google's class names, like tAxDx and a8Pemb, are generated and change when Google redeploys its front end. Treat the selectors above as a starting template, not a contract. When a field comes back None for every product, re-inspect a live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
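One way to notice selector drift early is to measure how often each field is filled after a run. The sketch below assumes the list of dictionaries that parse_products returns; field_fill_rates() and dead_fields() are hypothetical helper names.

```python
def field_fill_rates(products):
    """Return the share of records with a non-empty value for each field."""
    if not products:
        return {}
    total = len(products)
    return {
        field: sum(1 for p in products if p.get(field)) / total
        for field in products[0].keys()
    }


def dead_fields(products):
    """List fields that are empty in every record, a hint that a selector drifted."""
    return [f for f, rate in field_fill_rates(products).items() if rate == 0]
```

A rating that is filled for only some products is expected; a field that shows up in dead_fields() is the one to re-inspect in dev tools.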
Step 3: Handle pagination
One page of twenty products is a demo; price monitoring wants the whole set. Google Shopping paginates with the start query parameter, an offset into the results: start=0 is the first page, start=20 the second, start=40 the third, and so on in steps of twenty. The shape stays the same for every page, so you build each URL, fetch it through the Crawling API, and parse it with the same function. Pacing the loop with a short pause between requests keeps a long run healthy.
```python
import time


def scrape_multiple_pages(base_url, pages=3):
    all_products = []
    for page in range(pages):
        start_index = page * 20
        paginated_url = f"{base_url}&start={start_index}"
        html = fetch_html(paginated_url)
        if html:
            all_products.extend(parse_products(html))
        time.sleep(3)
    return all_products
```
The loop multiplies the page index by twenty to get each start offset, appends it to the base URL, and collects the parsed products into one flat list. The time.sleep(3) between pages is the single most useful habit for a long crawl: spacing requests out is what keeps Google reading you as a shopper rather than a bot.
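Appending &start= works as long as the base URL has no start parameter of its own. A sketch of a slightly more defensive variant, using only the standard library, replaces any existing offset instead of adding a second one. page_url() is a hypothetical helper, and the step of twenty assumes num=20 as in the running example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(base_url, page, per_page=20):
    """Return base_url with start set to the offset for a zero-based page."""
    parts = urlsplit(base_url)
    # Drop any start already present so the offset is never duplicated
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "start"]
    params.append(("start", str(page * per_page)))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Calling page_url(base_url, 2) on the example URL yields the same address with start=40 at the end.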
Step 4: Put it together and export JSON and CSV
Now wire the fetch, the parse, and the pagination into one runnable script, then write the results to both JSON and CSV. JSON keeps the nested structure for code that consumes it; CSV drops straight into a spreadsheet, which is the format most price-monitoring workflows actually open.
```python
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup
import json
import csv
import time

crawling_api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

options = {
    "country": "US",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
    "page_wait": 5000,
}


def fetch_html(url):
    response = crawling_api.get(url, options)
    if response["headers"]["pc_status"] != "200":
        print(f"Failed to fetch the page. Status: {response['headers']['pc_status']}")
        return None
    return response["body"].decode("utf-8")


def parse_products(html):
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.select(".sh-dgr__grid-result"):
        title_el = item.select_one("h3.tAxDx")
        price_el = item.select_one("span.a8Pemb.OFFNJ")
        seller_el = item.select_one(".aULzUe.IuHnof")
        rating_el = item.select_one(".Rsc7Yb")
        link_el = item.select_one("a.Lq5OHe")
        products.append({
            "title": title_el.get_text(strip=True) if title_el else None,
            "price": price_el.get_text(strip=True) if price_el else None,
            "seller": seller_el.get_text(strip=True) if seller_el else None,
            "rating": rating_el.get_text(strip=True) if rating_el else None,
            "link": "https://www.google.com" + link_el["href"] if link_el else None,
        })
    return products


def scrape_multiple_pages(base_url, pages=3):
    all_products = []
    for page in range(pages):
        paginated_url = f"{base_url}&start={page * 20}"
        html = fetch_html(paginated_url)
        if html:
            all_products.extend(parse_products(html))
        time.sleep(3)
    return all_products


def save_json(data, filename="products.json"):
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=4)


def save_csv(data, filename="products.csv"):
    if not data:
        return
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=data[0].keys())
        writer.writeheader()
        writer.writerows(data)


def main():
    base_url = "https://www.google.com/search?q=louis+vuitton+bags&tbm=shop&num=20"
    products = scrape_multiple_pages(base_url, pages=3)
    if products:
        save_json(products)
        save_csv(products)
        print(f"Saved {len(products)} products to products.json and products.csv")


if __name__ == "__main__":
    main()
```
Run the full script with python shopping.py. It walks three pages of the "louis vuitton bags" Shopping results, extracts a record for each product, and writes everything to products.json and products.csv. To track a different product, change the q value in base_url; the rest of the pipeline handles whatever comes back. Run it on a schedule and diff the price column across runs and you have the core of a price-monitoring system.
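The "diff the price column across runs" step can be sketched in a few lines. The example below compares two lists of records in the shape this script produces and reports listings whose price text changed. Using the link as the key is an assumption made for illustration (it presumes a product keeps the same link between runs), and price_changes() is a hypothetical helper.

```python
def price_changes(previous, current):
    """Compare two runs keyed by link and report prices that differ."""
    old_prices = {p["link"]: p["price"] for p in previous if p.get("link")}
    changes = []
    for product in current:
        link = product.get("link")
        if link in old_prices and old_prices[link] != product["price"]:
            changes.append({
                "title": product["title"],
                "link": link,
                "old_price": old_prices[link],
                "new_price": product["price"],
            })
    return changes
```

Load yesterday's products.json with json.load, pass it in as previous alongside today's list, and the result is the set of listings worth a closer look.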
What the output looks like
You get a clean list of product records, each with the title, price, seller, rating, and link, ready to write to JSON, CSV, or a database.
```json
[
    {
        "title": "Louis Vuitton Mini Pochette Accessoires Monogram",
        "price": "$760.00",
        "seller": "louisvuitton.com",
        "rating": "4.5",
        "link": "https://www.google.com/shopping/product/11460745201866483383"
    },
    {
        "title": "Louis Vuitton Onthego Empreinte PM Black",
        "price": "$3,568.00",
        "seller": "StockX",
        "rating": null,
        "link": "https://www.google.com/shopping/product/7199001631589324220"
    }
]
```
The CSV mirror has one row per product with title,price,seller,rating,link columns and a header line, so it opens cleanly in any spreadsheet tool. A missing field comes through as an empty cell in CSV and null in JSON, as with the second listing's rating in the sample above.
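The price field is display text such as "$3,568.00", which sorts and compares as a string. For numeric comparisons you would normally convert it first. The sketch below assumes US-style formatting (comma as thousands separator, dot as decimal point), which matches the country pinned in this tutorial but not every market; parse_price() is a hypothetical helper.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Convert price text such as "$3,568.00" to a Decimal, or None."""
    if not text:
        return None
    # First run of digits, optional thousands commas, optional decimals
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if not match:
        return None
    return Decimal(match.group().replace(",", ""))
```

Decimal avoids the rounding surprises of float when you later compute differences or percentages between runs.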
Staying unblocked at scale
Even with a trusted IP handled for you, Google watches for scraper-shaped traffic, and Shopping is no exception. A few habits keep a run healthy.
- Pace your requests. Hammering results pages in a tight loop is the fastest way to get challenged. The time.sleep between pages is there for a reason; keep it and vary your queries instead of paging one term flat out.
- Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
- Pin the right country. Prices and sellers differ by region. Set the country option to the market you are monitoring so the data matches what a local shopper sees.
- Re-inspect when fields go empty. Google changes its markup periodically. If a field stops parsing, open a live page in dev tools and update the selector.
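The pacing habit extends naturally to failed fetches: rather than retrying immediately, wait longer after each failure. The following is a minimal sketch of exponential backoff with jitter. It is not part of the Crawlbase client; fetch_with_retries() is a hypothetical wrapper that accepts any function returning HTML or None, such as the tutorial's fetch_html, and the delay values are arbitrary starting points.

```python
import random
import time


def fetch_with_retries(fetch, url, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url) until it returns a truthy value or attempts run out."""
    for attempt in range(attempts):
        html = fetch(url)
        if html:
            return html
        if attempt < attempts - 1:
            # Exponential backoff with jitter: base, 2x base, ... plus up to 1s
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None
```

In the pagination loop, fetch_with_retries(fetch_html, paginated_url) could stand in for the direct fetch_html call. The sleep argument exists so the waiting can be swapped out in tests.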
For the broader playbook, see how to scrape websites without getting blocked. If you are building this into a recurring price-tracking job, our guide on web scraping for price intelligence covers how to turn raw listings into a usable signal, and the broader e-commerce web scraping guide extends the same approach to other stores. To scrape standard Google search results rather than the Shopping tab, see how to scrape Google search pages.
Is it legal to scrape Google Shopping?
Whether scraping Google Shopping is allowed depends on Google's terms of service, your jurisdiction, and what you do with the data. Google's terms place limits on automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Google's terms and its robots.txt, and treat both as the boundary for what you collect.
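Reading robots.txt can be done programmatically with Python's standard urllib.robotparser. The sketch below parses a small set of made-up rules for example.com so it runs offline; the rules and the user agent name are hypothetical and say nothing about what Google's actual file contains.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only, not Google's robots.txt
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
Allow: /
""".splitlines()


def is_allowed(rules_lines, user_agent, url):
    """Return True when the parsed rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_lines)
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_RULES, "my-bot", "https://example.com/private/report"))
print(is_allowed(SAMPLE_RULES, "my-bot", "https://example.com/catalog"))
```

Against a live site you would call set_url() with the site's robots.txt address and then read() instead of parse(), and check each target URL before fetching it.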
A few lines worth holding to. Collect only public listing data: the product titles, prices, sellers, ratings, and links that anyone can see on a Shopping results page without an account. Keep your request volume low enough that you are not straining Google's servers, pace your crawl rather than running it flat out, and pin the country to the market you actually need. Do not republish Google's product images or copyrighted media wholesale, and stay away from anything behind a login or any personal data, none of which this tutorial touches.
If you need Shopping data at a scale or in a form that public scraping cannot defensibly support, Google offers official routes. The Content API for Shopping and the merchant and ads APIs are the sanctioned ways to manage and read product data programmatically, and an official agreement is the correct path when a project outgrows modest, public, well-paced collection. A cleverer scraper is not a substitute for a data agreement.
Key takeaways
- Google Shopping is JS-rendered and region-aware. A plain request gets a consent wall or the wrong currency, so you need a rendered fetch from a trusted residential IP in the right country.
- The Crawling API fetches behind a real IP. Send it the Shopping URL with a pinned country, it rotates residential IPs and renders the grid server-side, and returns finished HTML for you to parse.
- BeautifulSoup does the extraction. Select each .sh-dgr__grid-result, then read title, price, seller, rating, and link from it, and expect the generated class names to drift.
- Paginate with the start offset. Increase start in steps of 20 to walk deeper into results, and pace your requests with a sleep between pages.
- Export to JSON and CSV. JSON keeps the structure for code; CSV drops into a spreadsheet, which is what most price-monitoring workflows open. Stay on public data and respect Google's ToS and robots.txt.
Frequently Asked Questions (FAQs)
Why does a plain request fail or return the wrong page on Google Shopping?
Google renders much of the Shopping grid with JavaScript and tailors what it returns based on the requesting IP and region, so a call from a foreign datacenter address can come back with a consent wall, the wrong currency, or partial content instead of the listings you see in your own browser. It also flags traffic that does not look like a real browser. Fetching through the Crawling API, which uses rotating residential IPs and renders the page, makes the request look like an ordinary shopper so you get the real results.
What fields can I extract from Google Shopping?
This tutorial pulls five fields from each product card: the title, the price, the seller or retailer, the rating when one is shown, and the link to the product's Shopping page. You can extend the parser to grab images or shipping text by adding more selectors. Stay within public listing data and avoid anything behind a login.
How do I handle pagination on Google Shopping?
Use the start query parameter, an offset into the results: start=0 is the first page, start=20 the second, start=40 the third, and so on in steps of 20. Build each page URL with the offset, fetch it through the Crawling API, parse it with the same function, and pause a few seconds between requests so you are pacing the crawl rather than hammering it.
Can I use this for price monitoring?
Yes, that is the main use case. Run the script on a schedule, export the results to CSV, and compare the price column across runs to spot increases, drops, and new sellers for the products you track. Pinning the country option keeps prices comparable to the market you care about. Our guide on web scraping for price intelligence goes deeper on turning the raw data into a signal.
My selectors return nothing. What changed?
Almost certainly Google's markup. Class names like tAxDx and a8Pemb are generated and change when Google redeploys its front end, so selectors that worked last month can break. Re-inspect a live Shopping page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.
Does Google Shopping have an official API?
Yes. Google offers the Content API for Shopping along with merchant and ads APIs that let developers manage product listings, advertising campaigns, and performance data programmatically. If your project needs Shopping data at scale or in a sanctioned form, those official endpoints are the right path rather than public scraping.
