Target.com is one of the largest retail catalogs in the United States, with millions of public product pages spanning electronics, apparel, home goods, and groceries. Each listing carries the data that powers most retail research: a title, a current price, a TCIN identifier, a star rating, and a clear in-stock or out-of-stock signal. Price trackers, competitive analysts, and product researchers watch those fields because together they give one of the cleaner reads on what a mass-market retailer is charging and stocking at any moment.

This guide shows you how to scrape Target product data with Python. You build a small, runnable scraper that fetches a Target search or product page through the Crawling API, parses a clean record for each product, handles pagination across result pages, and exports to JSON and CSV. The whole walkthrough stays scoped to public catalog data: the titles, prices, ratings, and availability anyone can see on Target.com without logging in.

What you will build

A Python script that takes a Target search URL, retrieves the rendered page through the Crawling API, and extracts a structured record per product. We use a "womens sweaters" search as the running example and pull these fields from each product card:

  • Title: the product name shown on the listing card.
  • Price: the current listed price, which can be a single value or a range.
  • TCIN / SKU: Target's own product identifier, read from the product URL.
  • Rating: the average star rating, derived from the rating bar width.
  • Review count: how many reviews back that rating.
  • Availability: whether the item reads as in stock or unavailable.
  • Product URL: the absolute link to the item's own detail page.

Why a plain request fails on Target

If you point a bare HTTP client at a Target search URL, you get an almost empty result. Target renders its search grid client-side: the server ships a lightweight shell, and the product cards fill in only after the page's JavaScript runs. Parse a plain requests.get() response and you will see an empty list, because the products you came for were never in that first HTML payload.

On top of that, Target watches for automated traffic and blocks request patterns that do not look like a real browser. So a working Target scraper needs two things in one request: a browser that renders the page, and an IP that Target reads as a real shopper. You can assemble that yourself with a headless browser and a pool of rotating residential proxies, but keeping that stack healthy is most of the work. The Crawling API folds both into a single call: you send it the search URL, it renders the page behind a trusted residential IP, handles the rotation and CAPTCHA solving, and returns finished HTML for you to parse.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to the language, the getting started with Python scraping guide covers the level this tutorial assumes.

Python 3.8 or later. Confirm your version with python --version (or python3 --version). If you do not have it, install it from python.org and make sure Python is on your system PATH.

A Crawlbase account and token. Sign up for a free account, open your dashboard, and copy your token from the account docs page. Crawlbase issues two tokens: a normal token for static sites and a JavaScript token for dynamic, JS-rendered pages. Target is JS-rendered, so this tutorial uses the JavaScript token. The free tier includes 1,000 requests with no card, which is plenty to build and test this scraper. Treat the token like a password and keep it out of version control.
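
One way to keep the token out of version control is to read it from the environment instead of pasting it into the script. The sketch below is a minimal illustration, not part of the Crawlbase client: the load_token helper and the CRAWLBASE_TOKEN variable name are assumptions chosen for this example.

```python
import os


def load_token(env_var="CRAWLBASE_TOKEN"):
    """Return the token stored in env_var, or raise a clear error if it is unset.

    Both the helper and the variable name are hypothetical; pick any name you like.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your JavaScript token"
        )
    return token
```

With that in place, the client could be created as CrawlingAPI({"token": load_token()}) rather than with a hard-coded string, and the script fails fast with a readable message when the variable is missing.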

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs. crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull each field out of the product cards by CSS selector.

bash
python --version

python -m venv target_env
source target_env/bin/activate

pip install crawlbase beautifulsoup4

On Windows, activate the environment with target_env\Scripts\activate instead of the source line. With both libraries installed, create the script file the rest of the guide builds up:

bash
touch target_scraper.py

Understanding Target's search page

A Target search lives at a stable URL: https://www.target.com/s?searchTerm=womens+sweaters. The page lays out a grid of product cards, one per item, each carrying the same handful of fields: a title, a price, a star rating bar, a review count, and a link into the product's detail page. Target marks most of these with data-test attributes, which are far more durable than its generated class names.

Before writing selectors, open a search page in your browser, right-click a product card, and choose Inspect. Each card sits inside div[data-test="product-grid"], the title link carries data-test="product-title", the current price uses data-test="current-price", the review count uses data-test="rating-count", and the rating itself is a masked bar with data-ref="rating-mask" whose CSS width encodes the score. The TCIN, Target's internal product ID, is embedded in the product URL after the /A- marker, so you read it straight from the link.

Step 1: Fetch the rendered search page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JavaScript token, set the search URL, and request it with the JS rendering options. Checking the status before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI
from urllib.parse import quote

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

def crawl(page_url):
    options = {"ajax_wait": "true", "page_wait": 5000}
    response = api.get(page_url, options)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['headers']['pc_status']}")
    return None

if __name__ == "__main__":
    search_term = "womens sweaters"
    url = f"https://www.target.com/s?searchTerm={quote(search_term)}"
    html = crawl(url)
    print(html[:500] if html else "No HTML returned")

The two wait options matter for a grid that fills in after load. ajax_wait tells the API to wait for asynchronous content to finish, and page_wait holds for 5000 milliseconds after load so the late-rendering cards appear before the page is captured. The Crawling API reports its own outcome in response["headers"]["pc_status"], so check that rather than the raw HTTP code. Run the script and you should see real product markup, not an empty shell. That confirms rendering works before you write a single selector.

Step 2: Parse the product cards with BeautifulSoup

With rendered HTML in hand, load it into BeautifulSoup, find every product card, and pull each field by its selector. The rating needs a small helper: Target draws the star bar as a filled mask, and the score lives in the bar's CSS width as a percentage, so you convert that percentage to a value out of 5.

python
from bs4 import BeautifulSoup

BASE = "https://www.target.com"
GRID = 'div[data-test="product-grid"] section[class^="styles__StyledRowWrapper"] div[class^="styles__StyledCardWrapper"]'

def extract_rating(element):
    style = element.get("style") if element else None
    if not style:
        return None
    for prop in style.split(";"):
        prop = prop.strip()
        if prop.startswith("width:"):
            value = prop[len("width:"):].strip()
            if value.endswith("%"):
                percentage = float(value[:-1])
                return round((percentage / 100) * 5, 2)
    return None

def extract_tcin(href):
    if href and "/A-" in href:
        return href.split("/A-")[1].split("?")[0].split("#")[0]
    return None

def scrape_target_listing(html):
    soup = BeautifulSoup(html, "html.parser")
    products = soup.select(GRID)
    results = []
    for product in products:
        try:
            title_el = product.select_one('a[data-test="product-title"]')
            rating_el = product.select_one('div[data-ref="rating-mask"]')
            reviews_el = product.select_one('span[data-test="rating-count"]')
            price_el = product.select_one('span[data-test="current-price"]')
            sold_out_el = product.select_one('[data-test="soldOut"]')

            href = title_el["href"] if title_el else None
            availability = "Out of stock" if sold_out_el else "In stock"

            results.append({
                "title": title_el.get_text(strip=True) if title_el else None,
                "price": price_el.get_text(strip=True) if price_el else None,
                "tcin": extract_tcin(href),
                "rating": extract_rating(rating_el),
                "review_count": reviews_el.get_text(strip=True) if reviews_el else None,
                "availability": availability,
                "product_url": BASE + href if href else None,
            })
        except Exception as e:
            print(f"Skipped a card: {e}")
    return results

The product grid selector chains three pieces: the data-test="product-grid" container, the row wrapper, and the card wrapper. Inside each card, the title and price come from their data-test selectors, the review count from rating-count, and the rating from the masked bar via extract_rating. The extract_tcin helper reads Target's product identifier out of the URL after /A-, and availability is inferred from whether a sold-out marker is present. Each if el else None guard keeps extraction resilient when a field is absent, which is common since not every card shows a rating or review count.

Selectors drift

Target's generated class names (the styles__StyledCardWrapper prefixes) change without notice, while its data-test and data-ref attributes are far more stable. Lean on the attribute selectors and treat the class-prefix chain as a starting template, not a contract. When a field comes back as None for every card, re-inspect the live search page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper.
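
The "None for every card" symptom can be checked automatically after each run. The following is a rough sketch under the assumption that rows is the list of dictionaries returned by the parser; field_fill_rates and dead_fields are hypothetical helper names, not part of any library.

```python
def field_fill_rates(rows, fields):
    """Return the share of rows (0.0 to 1.0) that have a non-empty value per field."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) not in (None, "")) / total
        for field in fields
    }


def dead_fields(rows, fields):
    """List the fields that came back empty on every row.

    A field that is empty everywhere usually points at a broken selector,
    whereas a field that is empty on only some rows (ratings, for example)
    can be perfectly normal.
    """
    rates = field_fill_rates(rows, fields)
    return [field for field in fields if rates[field] == 0.0]
```

Calling dead_fields(rows, ["title", "price", "rating"]) after a scrape and logging a warning when the list is non-empty gives an early signal that the markup has changed.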

Step 3: Handle pagination across result pages

One page of results is a demo; a real job walks the whole result set. Target paginates its search with the Nao parameter in the URL, which sets the starting offset for each page. Results come in batches of 24, so Nao=1 is the first page, Nao=25 the second, and so on. You advance the offset, fetch each page, and stop when there is no enabled next-page button.

python
PER_PAGE = 24

def has_next_page(html):
    soup = BeautifulSoup(html, "html.parser")
    return bool(soup.select_one('button[data-test="next"]:not([disabled])'))

def scrape_all_pages(base_url, max_pages=5):
    all_products = []
    for page in range(max_pages):
        nao = page * PER_PAGE + 1
        url = f"{base_url}&Nao={nao}"
        html = crawl(url)
        if not html:
            break
        found = scrape_target_listing(html)
        if not found:
            break
        all_products.extend(found)
        print(f"Page {page + 1}: {len(found)} products")
        if not has_next_page(html):
            break
    return all_products

The Nao offset is computed from the page index, so page 0 requests Nao=1, page 1 requests Nao=25, and the loop keeps going until has_next_page finds no enabled button[data-test="next"]. The max_pages cap and the empty-results break both stop you early, so a search with two pages of results never spins through five requests. Capping pages also keeps your free-tier credits in check while you are still testing.
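
Appending &Nao= with an f-string works as long as the base URL already contains a query string, which is true for search URLs. As a hedged alternative, the sketch below sets the parameter with urllib.parse so it also copes with a URL that has no query string or already carries a Nao value. The with_offset helper is a hypothetical name, and the category-style path in the comment is a placeholder, not a real Target URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_offset(base_url, nao):
    """Return base_url with its Nao query parameter set to nao."""
    parts = urlsplit(base_url)
    # Keep every existing parameter except a previous Nao value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "Nao"
    ]
    query.append(("Nao", str(nao)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Search URL: the parameter is appended with "&".
print(with_offset("https://www.target.com/s?searchTerm=womens+sweaters", 25))
# Placeholder URL without a query string: the parameter is introduced with "?".
print(with_offset("https://www.target.com/c/example-category", 1))
```

Inside the pagination loop, url = with_offset(base_url, nao) would stand in for the f-string without changing anything else.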

Step 4: Assemble the script and export JSON and CSV

Now wire the fetch, parse, and pagination into one runnable script, then write the records to both JSON and CSV so you can load them into a notebook or a spreadsheet.

python
import csv
import json
from urllib.parse import quote
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})
BASE = "https://www.target.com"
GRID = 'div[data-test="product-grid"] section[class^="styles__StyledRowWrapper"] div[class^="styles__StyledCardWrapper"]'
PER_PAGE = 24
FIELDS = ["title", "price", "tcin", "rating", "review_count", "availability", "product_url"]

def crawl(page_url):
    options = {"ajax_wait": "true", "page_wait": 5000}
    response = api.get(page_url, options)
    if response["headers"]["pc_status"] == "200":
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['headers']['pc_status']}")
    return None

def extract_rating(element):
    style = element.get("style") if element else None
    if not style:
        return None
    for prop in style.split(";"):
        prop = prop.strip()
        if prop.startswith("width:"):
            value = prop[len("width:"):].strip()
            if value.endswith("%"):
                percentage = float(value[:-1])
                return round((percentage / 100) * 5, 2)
    return None

def extract_tcin(href):
    if href and "/A-" in href:
        return href.split("/A-")[1].split("?")[0].split("#")[0]
    return None

def scrape_target_listing(html):
    soup = BeautifulSoup(html, "html.parser")
    products = soup.select(GRID)
    results = []
    for product in products:
        try:
            title_el = product.select_one('a[data-test="product-title"]')
            rating_el = product.select_one('div[data-ref="rating-mask"]')
            reviews_el = product.select_one('span[data-test="rating-count"]')
            price_el = product.select_one('span[data-test="current-price"]')
            sold_out_el = product.select_one('[data-test="soldOut"]')

            href = title_el["href"] if title_el else None
            availability = "Out of stock" if sold_out_el else "In stock"

            results.append({
                "title": title_el.get_text(strip=True) if title_el else None,
                "price": price_el.get_text(strip=True) if price_el else None,
                "tcin": extract_tcin(href),
                "rating": extract_rating(rating_el),
                "review_count": reviews_el.get_text(strip=True) if reviews_el else None,
                "availability": availability,
                "product_url": BASE + href if href else None,
            })
        except Exception as e:
            print(f"Skipped a card: {e}")
    return results

def has_next_page(html):
    soup = BeautifulSoup(html, "html.parser")
    return bool(soup.select_one('button[data-test="next"]:not([disabled])'))

def scrape_all_pages(base_url, max_pages=5):
    all_products = []
    for page in range(max_pages):
        url = f"{base_url}&Nao={page * PER_PAGE + 1}"
        html = crawl(url)
        if not html:
            break
        found = scrape_target_listing(html)
        if not found:
            break
        all_products.extend(found)
        print(f"Page {page + 1}: {len(found)} products")
        if not has_next_page(html):
            break
    return all_products

def export(rows, name="target_products"):
    with open(f"{name}.json", "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2, ensure_ascii=False)
    with open(f"{name}.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    print(f"Saved {len(rows)} products to {name}.json and {name}.csv")

def main():
    search_term = "womens sweaters"
    base_url = f"https://www.target.com/s?searchTerm={quote(search_term)}"
    rows = scrape_all_pages(base_url, max_pages=3)
    export(rows)

if __name__ == "__main__":
    main()

Run the full script with python target_scraper.py. It walks up to three pages of the search, parses one row per product, and writes both target_products.json and target_products.csv. The shared FIELDS list keeps the CSV column order in step with the dictionary keys, so the two exports never drift apart. To scrape a different category, change search_term to any query Target supports, or point base_url at a category page instead of a search. Because scrape_all_pages appends &Nao= to whatever it is given, a base_url used this way must already contain a query string.

What the output looks like

You get a clean list of product records, in page order, ready to write to JSON, CSV, or a database.

json
[
  {
    "title": "Women's Fine Gauge Crewneck Sweater - A New Day",
    "price": "$20.00",
    "tcin": "88228365",
    "rating": 3.9,
    "review_count": "587",
    "availability": "In stock",
    "product_url": "https://www.target.com/p/women-s-fine-gauge-crewneck-sweater-a-new-day/-/A-88228365"
  },
  {
    "title": "Women's Crew Neck Cashmere-Like Pullover Sweater - Universal Thread",
    "price": "$20.00 - $25.00",
    "tcin": "88062926",
    "rating": 4.2,
    "review_count": "746",
    "availability": "In stock",
    "product_url": "https://www.target.com/p/women-s-crew-neck-cashmere-like-pullover-sweater-universal-thread/-/A-88062926"
  }
]

Note that rating and review_count come back as null for products with no reviews yet, which is expected: the masked rating bar is simply not rendered on those cards. Prices arrive as displayed strings, including ranges like "$20.00 - $25.00", so normalize them to numbers downstream if you plan to compare or chart them. This is exactly the field set you want for a price intelligence workflow.
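
A minimal sketch of that downstream normalization follows. It assumes prices are displayed in dollars, as in the sample output, and that review counts are plain digits with optional thousands separators; parse_price and parse_count are hypothetical helper names. Strings that show more than one amount for other reasons (a sale price next to a regular price, say) would also come back as a low and high pair, so inspect real output before relying on it.

```python
import re

# Matches a dollar amount such as "$20.00" or "$1,299.99".
PRICE_RE = re.compile(r"\$\s*([\d,]+(?:\.\d+)?)")


def parse_price(text):
    """Return (low, high) as floats for a displayed price, or None if no amount is found.

    A single price yields the same value twice; a range yields its two ends.
    """
    if not text:
        return None
    amounts = [float(match.replace(",", "")) for match in PRICE_RE.findall(text)]
    if not amounts:
        return None
    return min(amounts), max(amounts)


def parse_count(text):
    """Return a review count as an int, or None when the text is not a plain number."""
    if not text:
        return None
    cleaned = text.strip()
    if not re.fullmatch(r"\d[\d,]*", cleaned):
        return None
    return int(cleaned.replace(",", ""))
```

Keeping the original string alongside the parsed values makes it easy to audit rows where the parser returned None.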

Scaling across queries and staying unblocked

One search is a demo; a real research job runs across many queries or categories. Keep a map of search terms to scrape, loop over it, and pace the requests. The same field set carries over to a single product detail page, where the title, price, TCIN, rating, and an explicit in-stock or out-of-stock badge all live on one URL. Even with rendering handled, Target watches for scraper-shaped traffic, so a few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Spread requests out with a delay between pages and queries rather than crawling everything at full speed. Schedule heavier jobs during off-peak hours to ease load on Target's servers.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Retain only what you need. Store the product fields your project uses and discard the rest. Re-check your selectors periodically so the scraper keeps pace with markup changes.
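
The pacing habit can be sketched as a small generator that pauses a random interval between items. The paced helper is a hypothetical name, and the 2 to 5 second window is an illustrative assumption rather than a documented threshold; tune it to your own volume.

```python
import random
import time


def paced(items, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Yield items one at a time, sleeping a random interval between them.

    No pause happens before the first item. The sleep argument exists so the
    delay can be replaced in tests.
    """
    for index, item in enumerate(items):
        if index > 0:
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

Looping with for term in paced(["womens sweaters", "mens jackets"]): and building the search URL inside the loop spaces out the queries without changing the scraper itself. The same wrapper can be applied to page offsets inside the pagination loop.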

For the broader playbook on avoiding blocks, see how to scrape websites without getting blocked, and for more on why rendering matters here, see how to crawl JavaScript websites. The same pattern applies to other big-box retailers, as in the guides on scraping Best Buy product data and Walmart search with Python.

Whether scraping Target is allowed depends on Target's terms of service, your jurisdiction, and what you do with the data. Target's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Target's terms of service and its robots.txt, and treat both as the boundary for what you collect. For commercial or competitive use, the legal picture gets more complex, and consulting a legal expert about your specific case is the sensible move.

A few lines worth holding to. Collect only public data: the titles, prices, TCINs, ratings, and listing links that anyone can see on a Target search or product page without an account. Keep your request volume low enough that you are not straining Target's servers, and avoid personal data, including anything tied to identifiable shoppers, reviewers, or store associates beyond what is publicly listed. Do not redistribute copyrighted media such as Target's product photography, and if you plan to reuse the data commercially, get permission or an official agreement rather than assuming silence is consent.

This guide is deliberately scoped to public search and product pages because that is the line that keeps the work defensible. It does not cover anything behind a login, account or order data, payment details, or any attempt to bypass authentication or a CAPTCHA you are not entitled to pass. Target operates an affiliate and partner program and offers official data feeds for approved partners, and that is the right channel when you need large volumes, guaranteed structure, or commercial rights. If your project needs more than public catalog data, an official agreement is the correct path, not a cleverer scraper.

Recap

Key takeaways

  • Target's data is public but JavaScript-rendered. The product grid only exists after the page's scripts run, so a plain request returns an empty list.
  • You need rendering and a trusted IP together. The Crawling API renders the page behind a residential IP in one call, using the JavaScript token plus ajax_wait and page_wait.
  • Lean on data-test attributes. Map title, price, TCIN, rating, review count, and availability from Target's data-test and data-ref markers, and expect generated class names to drift.
  • Paginate with the Nao offset. Target pages results in batches of 24 via the Nao parameter; stop when no enabled next button remains, and export to JSON and CSV from a shared field list.
  • Stay on public data. Respect Target's terms and robots.txt, prefer official partner feeds for licensed or bulk data, and never touch accounts, orders, or personal information.

Frequently Asked Questions (FAQs)

Why does a plain request return no products from Target?

Target renders its search grid client-side: the first HTML payload is a shell, and the product cards only appear after the page's JavaScript runs. A raw requests.get() parses that shell and finds nothing, which is why a do-it-yourself attempt with a bare HTTP client returns an empty list. Rendering the page through the Crawling API behind a trusted IP solves both the JavaScript and the blocking problem, which is why the scraper here routes its request through it.

What is a TCIN and how do I get it?

TCIN is Target's internal product identifier, the equivalent of an SKU on the site. It appears in every product URL after the /A- marker, for example /A-88228365 means TCIN 88228365. The extract_tcin helper reads it straight from the product link, so you get a stable key for each item without an extra request, which is handy for de-duplicating or joining records over time.
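
As a rough illustration of the de-duplication use, the sketch below keeps the first record seen for each TCIN. It assumes rows have the shape produced by the scraper in this guide; dedupe_by_tcin is a hypothetical helper name.

```python
def dedupe_by_tcin(rows):
    """Keep the first record for each TCIN, preserving the original order.

    Records whose tcin is None are kept as-is, since there is no key to compare.
    """
    seen = set()
    unique = []
    for row in rows:
        tcin = row.get("tcin")
        if tcin is not None:
            if tcin in seen:
                continue
            seen.add(tcin)
        unique.append(row)
    return unique
```

This is useful when the same product shows up on more than one result page or under more than one search term.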

How do I scrape a specific Target category?

Point the scraper at the category URL rather than a search URL. The product grid, the data-test selectors, and the Nao pagination parameter all behave the same way on category pages, so the same parser and pagination loop carry over. Swap the base_url in main for the category you want and keep the rest of the script as is, making sure the URL already contains a query string, because the pagination loop appends &Nao= to it.

How do I read product availability?

On search and category cards, an out-of-stock item shows a sold-out marker, which the scraper detects with a [data-test="soldOut"] check and records as "Out of stock", defaulting to "In stock" otherwise. For a precise, per-store availability read, scrape the individual product page, where Target shows a clear in-stock or out-of-stock badge and shipping or pickup options tied to a location.

Why are some ratings null in the output?

A null rating or review count means that product has no reviews yet, so Target does not render the masked rating bar on its card. The extract_rating helper reads the score from the bar's CSS width, and when the bar is absent it returns None. That is expected behavior, not a selector failure; new or low-traffic listings simply have nothing to show.

How do I avoid getting blocked while scraping Target?

Keep your per-IP request rate low, add a delay between pages and queries, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation, a trusted IP pool, and CAPTCHA handling for you; if you build your own stack, that is the part to invest in. Watch the pc_status codes the API returns and back off when you start seeing failures.
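
Backing off on failures can be sketched as a thin retry wrapper. It assumes a fetch function that returns the HTML on success and None on failure, which is how the crawl function in this guide behaves; fetch_with_backoff is a hypothetical name, and the attempt count and delays are illustrative assumptions.

```python
import time


def fetch_with_backoff(fetch, url, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url) until it returns a truthy value, doubling the wait each time.

    Returns the first truthy result, or None if every attempt fails. With the
    defaults, the waits between attempts are 2 and then 4 seconds.
    """
    for attempt in range(attempts):
        result = fetch(url)
        if result:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None
```

Calling html = fetch_with_backoff(crawl, url) in place of crawl(url) inside the pagination loop retries transient failures while still giving up after a bounded number of requests.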
