Flipkart is one of India's largest e-commerce platforms, and every product page on it carries the kind of structured detail that drives price tracking, competitor monitoring, and catalog research: a product name, the current price, a star rating, a review count, and a block of specifications and highlights. The problem is that Flipkart renders those pages in the browser and defends them against automated traffic, so a plain HTTP request hands you a near-empty shell instead of the fields you came for.

This guide shows you how to scrape Flipkart product pages with Python the reliable way. You build a small, runnable scraper that fetches a rendered product page through the Crawlbase Crawling API, parses the fields you want with BeautifulSoup, and prints a clean structured record. The whole walkthrough stays scoped to public product data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Python script that takes a public Flipkart product URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record of the product. We will target a single product page as the running example and pull these fields:

  • Name: the product title, for example a headphone model and its key spec line.
  • Price: the current selling price shown on the page.
  • Rating: the average star rating, like "4.3".
  • Review count: how many ratings or reviews the product has.
  • Highlights: the bullet list of key specs Flipkart shows near the top of the page.

Why a plain fetch fails on Flipkart

If you request a Flipkart product URL with a bare HTTP client, you get a response that is missing most of the data you can see in a browser. Two things work against you. First, Flipkart builds much of its product content client-side with JavaScript, so the initial HTML is thin and only fills in once the page's scripts run. Second, Flipkart flags automated traffic fast: datacenter IPs and request patterns that do not look like a real browser get challenged or served a stripped-down page before they ever reach the full product detail.

So a working Flipkart scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns finished HTML for you to parse.

Why the JS token

Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Flipkart leans on client-side rendering for product detail, so the JS token is the safe default here. The normal token can return a thinner page where price, rating, or highlights are missing, and there is nothing to parse out of that.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to the language, the official Python docs and any beginner course will get you to the level this tutorial assumes.

Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda.

A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token. Treat the token like a password: it authenticates your requests, so keep it out of version control.
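
One way to keep the token out of version control is to read it from an environment variable at startup. This is a minimal sketch, not something the Crawlbase client requires: the variable name CRAWLBASE_JS_TOKEN and the helper load_token are assumptions made up for this example.

```python
import os


def load_token(env_var="CRAWLBASE_JS_TOKEN"):
    """Return the token stored in an environment variable.

    The variable name is an assumption for this sketch; any name works
    as long as the script and your shell agree on it.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

With this in place, the client setup in the steps below could become CrawlingAPI({"token": load_token()}) in place of a string pasted into the source file.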

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs.

bash
python --version

python -m venv flipkart_env
source flipkart_env/bin/activate

pip install crawlbase beautifulsoup4

On Windows, activate the environment with flipkart_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull out individual fields by CSS selector. If BeautifulSoup is new to you, the BeautifulSoup guide covers the basics this tutorial builds on.

Step 1: Fetch the rendered product page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JS token, and request the product URL. Checking the status code before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})

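def crawl(page_url):
    # ajax_wait waits for async content; page_wait holds 5000 ms before capture.
    options = {"ajax_wait": "true", "page_wait": 5000}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        # The body arrives as bytes, so decode it before parsing.
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['status_code']}")
    return None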

if __name__ == "__main__":
    page_url = "https://www.flipkart.com/oneplus-bullets-wireless-z2-bluetooth-headset/p/itm4c3852314bb61"
    html = crawl(page_url)
    print(html[:500] if html else "No HTML returned")

The two wait options matter for a client-rendered target like this. ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so late-rendering elements appear before the page is captured. Five seconds is a reasonable start; raise it if the product fields come back empty. Save the script as scraper.py, run it with python scraper.py, and you should see real product markup, not a thin shell. That confirms rendering works before you write a single selector.
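
If a fixed five seconds turns out to be too short for some pages, one option is to retry with progressively longer waits. The sketch below is an illustration under assumptions: fetch_until_rendered, the is_rendered check, and the specific wait values are hypothetical, and fetch stands in for any callable that returns the same dictionary shape as api.get above.

```python
def fetch_until_rendered(fetch, url, is_rendered, waits=(5000, 8000, 12000)):
    """Retry with longer page_wait values until the HTML passes a check.

    fetch       -- callable(url, options) returning a dict with "status_code"
                   and a bytes "body", the shape api.get returns in Step 1
    is_rendered -- callable(html) -> bool, your own test that fields are present
    waits       -- page_wait values in milliseconds (illustrative numbers)
    """
    for wait in waits:
        response = fetch(url, {"ajax_wait": "true", "page_wait": wait})
        if response["status_code"] != 200:
            continue  # failed request: move on to the next, longer wait
        html = response["body"].decode("utf-8")
        if is_rendered(html):
            return html
    return None  # every attempt failed or came back without the fields
```

A call might look like fetch_until_rendered(api.get, page_url, lambda html: "B_NuCI" in html), where the marker string is whatever class or text you expect on a fully rendered page. Each attempt is a separate API request, so keep the list of waits short.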

Step 2: Parse the product fields with BeautifulSoup

With rendered HTML in hand, load it into BeautifulSoup and pull each field by its selector. A Flipkart product page lays the core details out in a predictable structure, so you can map name, price, rating, review count, and the highlights list to individual selectors. Wrap the extraction in helpers that return None for a missing field so one absent value does not crash the run.

python
from bs4 import BeautifulSoup
import re

def text_of(soup, selector):
    # Return the stripped text of the first match, or None if nothing matches.
    el = soup.select_one(selector)
    return el.get_text(strip=True) if el else None

def scrape_product(html):
    soup = BeautifulSoup(html, "html.parser")

    price = text_of(soup, "div._30jeq3.x_zBx4")
    # select() returns an empty list when nothing matches,
    # so highlights is never None.
    highlights = [
        li.get_text(strip=True)
        for li in soup.select("div._7eSDEz li._21Ahn-")
    ]

    return {
        "name": text_of(soup, "span.B_NuCI"),
        # Strip every non-digit character (currency symbol, commas).
        "price": re.sub(r"\D", "", price) if price else None,
        "rating": text_of(soup, "div._3LWZlK"),
        "review_count": text_of(soup, "span._2_R_DZ"),
        "highlights": highlights,
    }

The text_of helper does two useful things at once: it queries a single element and returns None when the element is missing, instead of throwing on a .get_text() call against nothing. That keeps the extraction resilient when one field is absent on a given page, which is common since not every product lists a rating or a full highlights block. The price is run through a small regex that removes every non-digit character, which strips the currency symbol and commas and leaves a digit string you can cast to a number later. That works for whole-rupee prices; a decimal point would be stripped along with everything else.
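
If a price can carry a decimal part, stripping every non-digit would turn "₹1,799.50" into "179950". The variant below keeps the decimal point and returns a Decimal. It is a sketch under assumptions: parse_price is a hypothetical helper, not part of the script above, and it expects the raw price text before any cleaning.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Turn a displayed price such as '₹1,799' into a Decimal, or None."""
    if not text:
        return None
    # Keep digits and the decimal point; drop the currency symbol and commas.
    cleaned = re.sub(r"[^\d.]", "", text)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Nothing numeric was left, for example an "Out of stock" label.
        return None
```

Decimal avoids the rounding surprises of float for money, at the cost of needing a conversion (str or float) before the value goes into json.dumps.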

Selectors drift

Flipkart's class names (the hashed tokens like _30jeq3, B_NuCI, and _3LWZlK) are build artifacts and change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back as None, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
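
A cheap way to notice drift early is to check each scraped record for empty fields and log a warning. The sketch below assumes the record shape returned by scrape_product; missing_fields, warn_on_missing, and the logger name are hypothetical names chosen for this example.

```python
import logging

logger = logging.getLogger("flipkart_scraper")


def missing_fields(record):
    """Return the names of fields that came back as None or empty."""
    return [name for name, value in record.items() if value in (None, "", [])]


def warn_on_missing(record, url):
    """Log a warning listing empty fields; return that list for the caller."""
    missing = missing_fields(record)
    if missing:
        logger.warning(
            "Selectors may have drifted on %s: no value for %s",
            url, ", ".join(missing),
        )
    return missing
```

A single missing rating on one product is normal. The same field missing across every page in a run is the stronger hint that a selector needs updating.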

Step 3: Put it together

Now wire the fetch and the parse into one runnable script. Fetch the rendered HTML, hand it to the parser, and print the structured record.

python
import json
import re
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})

def crawl(page_url):
    options = {"ajax_wait": "true", "page_wait": 5000}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['status_code']}")
    return None

def text_of(soup, selector):
    el = soup.select_one(selector)
    return el.get_text(strip=True) if el else None

def scrape_product(html):
    soup = BeautifulSoup(html, "html.parser")
    price = text_of(soup, "div._30jeq3.x_zBx4")
    highlights = [
        li.get_text(strip=True)
        for li in soup.select("div._7eSDEz li._21Ahn-")
    ]
    return {
        "name": text_of(soup, "span.B_NuCI"),
        "price": re.sub(r"\D", "", price) if price else None,
        "rating": text_of(soup, "div._3LWZlK"),
        "review_count": text_of(soup, "span._2_R_DZ"),
        "highlights": highlights,
    }

def main():
    page_url = "https://www.flipkart.com/oneplus-bullets-wireless-z2-bluetooth-headset/p/itm4c3852314bb61"
    html = crawl(page_url)
    if not html:
        return
    data = scrape_product(html)
    print(json.dumps(data, indent=2))

if __name__ == "__main__":
    main()

What the output looks like

Run the full script with python scraper.py and you get a clean structured record for the product, ready to write to JSON, CSV, or a database.

json
{
  "name": "OnePlus Bullets Wireless Z2 Bluetooth Headset",
  "price": "1799",
  "rating": "4.3",
  "review_count": "9,05,873 Ratings & 47,210 Reviews",
  "highlights": [
    "30 Hrs Battery Life",
    "Fast Charge: 10 min for 20 hours",
    "Bluetooth version: 5.0"
  ]
}
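
Every value in that record is a string. If you want numbers for sorting or price tracking, a small normalization pass can convert them. This is a sketch that assumes the exact field names and the "N Ratings & M Reviews" wording shown in the sample output; normalize, to_int, and the two new count keys are hypothetical.

```python
import re


def to_int(text):
    """'9,05,873' -> 905873; returns None when there are no digits."""
    digits = re.sub(r"\D", "", text or "")
    return int(digits) if digits else None


def normalize(record):
    """Return a copy of the scraped record with numeric fields as numbers."""
    counts = record.get("review_count") or ""
    # Assumes the wording from the sample output; adjust if the page differs.
    ratings = re.search(r"([\d,]+)\s*Ratings", counts)
    reviews = re.search(r"([\d,]+)\s*Reviews", counts)
    rating = record.get("rating")
    return {
        "name": record.get("name"),
        "price": to_int(record.get("price")),
        "rating": float(rating) if rating else None,
        "ratings_count": to_int(ratings.group(1)) if ratings else None,
        "reviews_count": to_int(reviews.group(1)) if reviews else None,
        "highlights": record.get("highlights", []),
    }
```

Note that float(rating) raises ValueError if the rating text is not a plain number, so wrap the call if you expect other formats.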

Scaling to many products

One product is a demo; a real job runs over a list of products. The shape stays the same: keep a list of product URLs, fetch each through the Crawling API, parse it with the same function, and collect the rows. Because product pages share the same layout, the parser you already wrote applies to each of them, and any field a page lacks simply comes back as None.

python
import json
import time

# Reuses crawl() and scrape_product() from the full script above.
product_urls = [
    "https://www.flipkart.com/oneplus-bullets-wireless-z2-bluetooth-headset/p/itm4c3852314bb61",
    "https://www.flipkart.com/boat-airdopes-161-bluetooth-headset/p/itm8a7493150ae4a",
]

results = []
for url in product_urls:
    html = crawl(url)
    if html:
        results.append(scrape_product(html))
    time.sleep(2)  # pause between requests so the loop stays paced

with open("products.json", "w") as f:
    json.dump(results, f, indent=2)

To build the URL list at scale, scrape Flipkart's public search pages with the same fetch-then-parse pattern, collect the product links, and then visit each one. The time.sleep(2) between requests is deliberate pacing: it keeps you from hammering the site in a tight loop, which is the fastest way to get throttled. For a broader e-commerce playbook, see the guide on e-commerce web scraping.
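
The loop above writes JSON. If a spreadsheet-friendly file suits the job better, the same rows can go to CSV with the standard library. This is a minimal sketch that assumes the record keys produced by scrape_product; write_csv, the FIELDS list, and the " | " separator for highlights are choices made for this example.

```python
import csv

FIELDS = ["name", "price", "rating", "review_count", "highlights"]


def write_csv(records, fileobj):
    """Write scraped records as CSV rows to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        row = dict(record)
        # A CSV cell holds one string, so join the highlights list.
        row["highlights"] = " | ".join(record.get("highlights") or [])
        writer.writerow(row)
```

Open the output with newline="" as the csv module expects, for example: with open("products.csv", "w", newline="", encoding="utf-8") as f: write_csv(results, f).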

Staying unblocked

Even with rendering handled, Flipkart watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Hammering pages in a tight loop is the fastest way to get throttled. Spread requests out and vary your targets instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
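
The "back off" habit can be sketched as a retry wrapper that waits longer after each failed response. Everything named here is an assumption for illustration: fetch_with_backoff, the attempt count, and the delay values are hypothetical, and fetch stands in for any callable that returns a dictionary with a "status_code" key, as api.get does in the script.

```python
import random
import time


def fetch_with_backoff(fetch, url, options=None, max_attempts=4,
                       base_delay=2.0, sleep=time.sleep):
    """Call fetch(url, options) until it returns 200 or attempts run out.

    Delays double after each failure (2s, 4s, 8s with the defaults) and
    get up to one second of random jitter so retries do not line up.
    The sleep parameter exists so tests can replace the real pause.
    """
    for attempt in range(max_attempts):
        response = fetch(url, options or {})
        if response["status_code"] == 200:
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None  # still failing after max_attempts: stop and investigate
```

Returning None after the last attempt, instead of retrying forever, keeps a blocked run from turning into a tight loop of failing requests.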

For the broader playbook, see how to scrape websites without getting blocked and the deeper dive on how to crawl JavaScript websites. If you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy (also called the AI Proxy) gives you the same residential IP rotation as a drop-in proxy endpoint.

Is it legal to scrape Flipkart?

Whether scraping Flipkart is allowed depends on Flipkart's terms of service, your jurisdiction, and what you do with the data. Flipkart's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read the Flipkart Terms of Use and its robots.txt, and treat both as the boundary for what you collect.
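
Checking a URL against robots.txt rules can be automated with Python's standard urllib.robotparser. The sketch below parses rules from a string so it runs offline. The sample rules are illustrative only and are not Flipkart's actual robots.txt, and the allowed helper is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only; these are NOT Flipkart's actual robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /account/
Disallow: /checkout/
"""


def allowed(robots_text, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

To check the live file instead, RobotFileParser also offers set_url() followed by read(), which downloads and parses the site's robots.txt. A passing check covers robots.txt only; it says nothing about the Terms of Use.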

A few lines are worth holding to. Collect only public product data: the product name, price, rating, review count, and the highlights anyone can see without an account. Respect any crawl limits Flipkart states in its terms or robots.txt, and keep your request volume low enough that you are not straining its servers. Avoid anything tied to identifiable individuals, including reviewer profiles or buyer details beyond the aggregate counts shown publicly. If you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

This guide is deliberately scoped to public product-detail pages because that is the line that keeps the work defensible. It does not cover anything behind a login: your account or order history, seller dashboards, login-walled pages, or any attempt to bypass authentication. If your project needs more than public product data, or involves bulk or commercial use, the correct path is an official Flipkart API or a data agreement, not a cleverer scraper. Public product data only.

Recap

Key takeaways

  • Flipkart renders product detail client-side. A plain fetch returns a thin page, so you must render before you parse.
  • You need rendering and a trusted IP together. The Crawling API with a JS token does both in one call; ajax_wait and page_wait control how long it waits for content.
  • BeautifulSoup does the extraction. Map name, price, rating, review count, and highlights to current selectors, and expect those hashed class names to drift.
  • Scale by looping URLs. The same parser works across every product page, so a real job is just a list of links plus sensible pacing.
  • Stay on public data. Respect Flipkart's ToS and robots.txt, prefer an official API or agreement for bulk or commercial use, and never touch accounts, orders, or login-walled pages.

Frequently Asked Questions (FAQs)

Why does a plain fetch return no product data from Flipkart?

Because Flipkart builds much of its product detail client-side with JavaScript. The initial HTML is thin and only fills in after the page's scripts run in a browser, so a raw HTTP request can come back with the price, rating, or highlights missing. To get real data you have to render the page first, which is what the Crawling API's JS token handles for you.

Do I need the normal token or the JS token for Flipkart?

Use the JS token. The normal token fetches static HTML, which on Flipkart can be a thinner page where key product fields are absent. The JS token renders the page in a real browser before handing back the HTML, so the name, price, rating, and highlights are present when BeautifulSoup parses them.

My selectors return None. What changed?

Almost certainly Flipkart's markup. Its class names are hashed build artifacts like _30jeq3 and B_NuCI, and they change without notice, so selectors that worked last month can break. Re-inspect a live product page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

How do I scrape many Flipkart products at once?

Keep a list of product URLs and loop over it, fetching each through the Crawling API and parsing it with the same function. Build that URL list by scraping Flipkart's public search pages first. Add a short delay between requests so you are not hammering the site, and write the collected rows to JSON or CSV at the end.

Should I use an official Flipkart API or scrape the site?

If you need licensed data, bulk volume, guaranteed structure, or commercial reuse rights, an official API or data agreement is the right tool and keeps you on the right side of Flipkart's terms. Scraping public product pages with the approach in this guide fits smaller, public-data research where no API access is in place, as long as you respect the ToS, robots.txt, and rate limits.

How do I avoid getting blocked while scraping Flipkart?

Keep your per-IP request rate low, vary your targets instead of looping one path, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation and a trusted IP pool for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
