Etsy is a global marketplace for handmade, vintage, and craft goods, and every search page is a public, structured feed of what sellers are listing right now: the product title, the asking price, the shop behind it, and a star rating. For anyone doing market research, price tracking, or competitive analysis in the handmade space, that listing data is one of the cleanest demand signals you can read without an account. A seller can watch how rivals price a similar item; a buyer can monitor a category for trends; an analyst can chart how a niche moves over time.

This guide shows you how to scrape Etsy product listings with Python. You will build a small, runnable scraper that fetches an Etsy search page through the Crawling API, parses a clean record per listing, handles pagination across result pages, and exports the results to JSON and CSV. The whole walkthrough stays scoped to public listing data: the titles, prices, shop names, ratings, and links anyone can see on an Etsy search page without logging in.

What you will build

A Python script that takes an Etsy search query, retrieves each rendered results page through the Crawling API, and extracts a structured record per listing card. We use the search query clothes as the running example, the same query the legacy walkthrough used, and pull these fields from each card:

  • Title: the product name shown on the listing card.
  • Price: the listed price, as the currency value Etsy renders on the card.
  • Shop: the name of the seller's shop behind the listing.
  • Rating: the average star rating, when the listing shows one.
  • Link: the URL to the listing's own detail page.

Why a plain request fails on Etsy

If you point a bare HTTP client at an Etsy search URL, you rarely get the listings you came for. Two things work against you. First, Etsy leans heavily on JavaScript: it ships a lightweight shell and fills the listing cards in as the page's scripts run, so the initial HTML is often missing most of the grid. Second, Etsy flags automated traffic quickly. Datacenter IP ranges and request patterns that do not look like a real browser get met with a CAPTCHA, an interstitial challenge, or an outright block before you ever reach the results.

So a working Etsy scraper needs two things in one request: a browser that renders the page, and an IP that Etsy reads as a real shopper. You can assemble that yourself with a headless browser and a pool of rotating residential proxies, but keeping that stack healthy is most of the work. The Crawling API folds both into a single call: you send it the search URL, it renders the page behind a trusted residential IP, handles the rotation and CAPTCHA solving, and returns finished HTML for you to parse.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to the language, the official Python docs or any beginner course covers the level this tutorial assumes.

Python 3.8 or later. Confirm your version with python --version (or python3 --version). If you do not have it, install it from python.org and make sure Python is on your system PATH.

A Crawlbase account and token. Sign up for a free account, open your dashboard, and copy your token. Etsy renders its listings with JavaScript, so use the JavaScript token for this scraper. The free tier includes 1,000 requests with no card, which is plenty to build and test it. Treat the token like a password and keep it out of version control.
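One way to keep the token out of version control is to read it from an environment variable rather than hard-coding it. The sketch below is a minimal illustration, not part of the Crawlbase client: the variable name CRAWLBASE_TOKEN and the load_token helper are assumptions made for this example.

```python
import os


def load_token(env=None, var_name="CRAWLBASE_TOKEN"):
    """Return the API token from the environment, or raise if it is unset.

    CRAWLBASE_TOKEN is a hypothetical variable name chosen for this sketch;
    use whatever name fits your own setup.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token
```

With something like this in place, the client could be initialized as CrawlingAPI({"token": load_token()}) after you export the variable in your shell, so the token never appears in the script itself.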

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the three libraries the scraper needs. crawlbase is the official client for the Crawling API, beautifulsoup4 parses the returned HTML so you can pull each field out of the listing cards by CSS selector, and pandas makes the CSV export a one-liner.

bash
python --version

python -m venv etsy_env
source etsy_env/bin/activate

pip install crawlbase beautifulsoup4 pandas

On Windows, activate the environment with etsy_env\Scripts\activate instead of the source line. With the libraries installed, create the script file the rest of the guide builds up:

bash
touch etsy_scraper.py

Understanding the Etsy search page

Etsy's search page lives at a stable URL keyed on the query: a search for clothes is https://www.etsy.com/search?q=clothes, and you move through result pages by appending a &page= parameter. The page lays out a grid of listing cards, one per product, each carrying the same handful of fields: a title, a price, a shop name, a star rating, and a link into the listing's detail page.
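The URL pattern described above can be sketched as a small helper. This is an illustration under the assumption that q and page are the only parameters you need; build_search_url is a hypothetical name, and urlencode from the standard library takes care of encoding multi-word queries.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.etsy.com/search"


def build_search_url(query, page=1):
    """Build an Etsy search URL for a query, adding page only past page 1."""
    params = {"q": query}
    if page > 1:
        params["page"] = page
    # urlencode escapes spaces and special characters in the query text.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("clothes"))
print(build_search_url("ceramic mug", page=2))
```

Encoding the query matters once a search term contains spaces or punctuation, which a bare f-string would pass through unescaped.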

Before writing selectors, open a search page in your browser, right-click a listing card, and choose Inspect. Etsy wraps each result in a list item under a search-results container, and exposes the title, price, and rating inside that card. Those are the elements you target. Etsy's class names change from time to time, so treat the selectors below as a snapshot that worked when this was written, and expect to re-inspect the live page and adjust if a field starts coming back empty.

Step 1: Fetch the rendered search page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your token, set the search URL, and request it. Checking the status code before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})

def crawl(page_url):
    options = {"ajax_wait": "true", "page_wait": 3000}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("latin1")
    print(f"Request failed: {response['status_code']}")
    return None

if __name__ == "__main__":
    search_url = "https://www.etsy.com/search?q=clothes"
    html = crawl(search_url)
    print(html[:500] if html else "No HTML returned")

The two wait options matter for a grid that fills in as the page loads. ajax_wait tells the API to wait for asynchronous content to finish, and page_wait holds for a fixed number of milliseconds after load so the late-rendering cards appear before the page is captured. The body is decoded as latin1 because that codec maps every byte to a character and so never raises a decode error. The trade-off is that non-ASCII characters in a UTF-8 page, such as accented letters or currency symbols, can come out garbled; if you see that in titles or shop names, decode as UTF-8 with errors="replace" instead. Run the script and you should see real listing markup, not a challenge shell. That confirms rendering works before you write a single selector.
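If the decoding step gives you trouble, one option is to try UTF-8 first and only fall back to latin1 when the bytes are not valid UTF-8. The helper below is a sketch of that idea, not something the Crawlbase client provides; decode_body is a hypothetical name.

```python
def decode_body(body):
    """Decode response bytes as UTF-8, falling back to latin1 if that fails.

    latin1 accepts any byte sequence, so the fallback never raises.
    """
    if isinstance(body, str):
        return body  # already text, nothing to decode
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        return body.decode("latin1")


sample = "Café mug".encode("utf-8")
print(decode_body(sample))       # the accented letter survives
print(sample.decode("latin1"))   # the same bytes read as latin1 come out garbled
```

You could swap this in for the .decode("latin1") call inside crawl if non-ASCII titles or shop names look wrong in your output.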

Crawlbase Crawling API

That single crawl call hides the hard part: Etsy needs a rendered page behind a trusted IP, and the two _wait options only help once the request itself gets through. The Crawling API takes your token, runs the search page in a real browser, rotates through residential IPs server-side, and handles the CAPTCHA solving, then hands you finished HTML. You skip running a headless browser fleet and a proxy pool yourself. Point it at an Etsy search URL on the free 1,000-request tier first.

Step 2: Parse the listing cards with BeautifulSoup

With rendered HTML in hand, load it into BeautifulSoup, find every listing card, and pull each field by its selector. Etsy groups results into an ordered list under the search-results container, with one list item per card. Inside each card the title, price, shop, and rating sit in their own elements, and the listing link is the card's anchor. Wrap each card in a try/except so one malformed listing does not crash the run.

python
from bs4 import BeautifulSoup

CONTAINER = "div.search-listings-group div[data-search-results-container] ol li"

def text_of(card, selector):
    el = card.select_one(selector)
    return el.get_text(strip=True) if el else None

def parse_link(card):
    a = card.select_one("a.listing-link[href]")
    return a["href"] if a else None

def scrape_listings(html):
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select(CONTAINER)
    results = []
    for card in cards:
        try:
            title = text_of(card, "div.v2-listing-card__info h3.v2-listing-card__title")
            if not title:
                continue
            results.append({
                "title": title,
                "price": text_of(card, "div.n-listing-card__price p.lc-price span.currency-value"),
                "shop": text_of(card, "div.v2-listing-card__info p.v2-listing-card__shop span"),
                "rating": text_of(card, "div.shop-name-with-rating span.larger_review_stars > div"),
                "link": parse_link(card),
            })
        except Exception as e:
            print(f"Skipped a card: {e}")
    return results

The text_of helper queries one element inside a card and returns None when it is missing, instead of throwing on a .get_text() call against nothing. That keeps extraction resilient when a field is absent, which is common since not every listing shows a rating. The title comes from h3.v2-listing-card__title, the price from the span.currency-value inside the price block, the shop name from the listing card's shop element, and the rating from the review-stars block. The container selector and the title, price, and rating selectors are ported straight from the legacy walkthrough; the shop and link are the two extra fields this version adds.

Selectors drift

Etsy's listing-card class names (the title clamp, the price span, the rating block) get regenerated and change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back as None for every card, re-inspect a live search page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper.
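A quick way to notice drift is to measure how often each field is actually filled in across a scraped batch. The sketch below assumes rows shaped like the dictionaries the parse step returns; field_fill_rates and drifted_fields are hypothetical helper names introduced for this example.

```python
FIELDS = ("title", "price", "shop", "rating", "link")


def field_fill_rates(rows, fields=FIELDS):
    """Return the share of rows (0.0 to 1.0) with a non-empty value per field."""
    if not rows:
        return {field: 0.0 for field in fields}
    total = len(rows)
    return {
        field: sum(1 for row in rows if row.get(field)) / total
        for field in fields
    }


def drifted_fields(rows, fields=FIELDS):
    """List the fields that came back empty for every row in the batch."""
    rates = field_fill_rates(rows, fields)
    return [field for field in fields if rates[field] == 0.0]


rows = [
    {"title": "A", "price": "10.00", "shop": None, "rating": None, "link": "x"},
    {"title": "B", "price": "12.50", "shop": None, "rating": "4.8", "link": "y"},
]
print(field_fill_rates(rows))
print(drifted_fields(rows))
```

A rating that is filled for only some rows is expected, since not every listing shows one. A field at zero across a whole page, or an empty batch, is the signal to re-inspect the live markup.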

Step 3: Handle pagination across result pages

One search page is a sample; the full result set spans many. Etsy splits search results into numbered pages, and you move through them by appending &page=N to the search URL. To gather a complete dataset, read the total page count from the first page, then walk each page in turn. A short delay between requests paces the run so you are not hammering Etsy in a tight loop.

python
import time
from urllib.parse import quote_plus

PAGINATION = 'div[data-appears-component-name="search_pagination"] nav ul.search-pagination li:nth-last-child(3) a'

def get_total_pages(html):
    soup = BeautifulSoup(html, "html.parser")
    el = soup.select_one(PAGINATION)
    try:
        return int(el.get_text(strip=True)) if el else 1
    except ValueError:
        return 1

def scrape_all_pages(query, max_pages=5):
    # URL-encode the query so multi-word searches such as "ceramic mug" form a valid URL.
    search_url = f"https://www.etsy.com/search?q={quote_plus(query)}"
    first = crawl(search_url)
    if not first:
        return []
    total = min(get_total_pages(first), max_pages)
    all_listings = scrape_listings(first)
    print(f"page 1: {len(all_listings)} listings (of {total} pages)")
    for page in range(2, total + 1):
        html = crawl(f"{search_url}&page={page}")
        if not html:
            break
        found = scrape_listings(html)
        all_listings.extend(found)
        print(f"page {page}: {len(found)} listings")
        time.sleep(2)
    return all_listings

get_total_pages reads the page count from the pagination control: the legacy selector targets the third-from-last pagination link, which is the last numbered page before the next and last arrows. It falls back to 1 if the control is missing or the text is not a number, so a single-page result still works. The max_pages cap keeps a test run bounded while you are developing, and the time.sleep(2) between requests paces the crawl so you are not flagged for rapid-fire traffic.

Step 4: Assemble the script and export JSON and CSV

Now wire the fetch, parse, and pagination into one runnable script, then write the records to both JSON and CSV so you can load them into a notebook or a spreadsheet. JSON keeps the structure for code; the CSV, written with pandas, opens straight in any spreadsheet tool.

python
import json
import time
from urllib.parse import quote_plus

import pandas as pd
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_TOKEN"})
CONTAINER = "div.search-listings-group div[data-search-results-container] ol li"
PAGINATION = 'div[data-appears-component-name="search_pagination"] nav ul.search-pagination li:nth-last-child(3) a'

def crawl(page_url):
    # Wait for async content, then hold 3 seconds so late cards render.
    options = {"ajax_wait": "true", "page_wait": 3000}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("latin1")
    print(f"Request failed: {response['status_code']}")
    return None

def text_of(card, selector):
    # Return the stripped text of the first match, or None if it is missing.
    el = card.select_one(selector)
    return el.get_text(strip=True) if el else None

def parse_link(card):
    a = card.select_one("a.listing-link[href]")
    return a["href"] if a else None

def scrape_listings(html):
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for card in soup.select(CONTAINER):
        try:
            title = text_of(card, "div.v2-listing-card__info h3.v2-listing-card__title")
            if not title:
                continue
            results.append({
                "title": title,
                "price": text_of(card, "div.n-listing-card__price p.lc-price span.currency-value"),
                "shop": text_of(card, "div.v2-listing-card__info p.v2-listing-card__shop span"),
                "rating": text_of(card, "div.shop-name-with-rating span.larger_review_stars > div"),
                "link": parse_link(card),
            })
        except Exception as e:
            print(f"Skipped a card: {e}")
    return results

def get_total_pages(html):
    soup = BeautifulSoup(html, "html.parser")
    el = soup.select_one(PAGINATION)
    try:
        return int(el.get_text(strip=True)) if el else 1
    except ValueError:
        return 1

def scrape_all_pages(query, max_pages=5):
    # URL-encode the query so multi-word searches such as "ceramic mug" form a valid URL.
    search_url = f"https://www.etsy.com/search?q={quote_plus(query)}"
    first = crawl(search_url)
    if not first:
        return []
    total = min(get_total_pages(first), max_pages)
    listings = scrape_listings(first)
    print(f"page 1: {len(listings)} listings (of {total} pages)")
    for page in range(2, total + 1):
        html = crawl(f"{search_url}&page={page}")
        if not html:
            break
        found = scrape_listings(html)
        listings.extend(found)
        print(f"page {page}: {len(found)} listings")
        time.sleep(2)
    return listings

def export(rows, name="etsy_listings"):
    with open(f"{name}.json", "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2, ensure_ascii=False)
    pd.DataFrame(rows).to_csv(f"{name}.csv", index=False)
    print(f"Saved {len(rows)} listings to {name}.json and {name}.csv")

def main():
    rows = scrape_all_pages("clothes", max_pages=5)
    if rows:
        export(rows)

if __name__ == "__main__":
    main()

Run the full script with python etsy_scraper.py. It fetches each rendered search page, parses one row per listing, walks the result pages up to max_pages, and writes both etsy_listings.json and etsy_listings.csv. Building the CSV from a pandas DataFrame means the columns follow the dictionary keys automatically, so the two exports never drift apart.
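If you would rather not pull in pandas just for the export, the standard library's csv module can write the same rows. This is a sketch of an alternative, not a replacement the guide requires; rows_to_csv is a hypothetical helper, and it returns the CSV text so you can write it wherever you like.

```python
import csv
import io

FIELDS = ["title", "price", "shop", "rating", "link"]


def rows_to_csv(rows, fields=FIELDS):
    """Serialize listing dictionaries to CSV text with a fixed column order."""
    buffer = io.StringIO(newline="")
    # restval fills missing keys; extrasaction drops keys outside the field list.
    writer = csv.DictWriter(
        buffer, fieldnames=fields, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"title": "Mug, blue", "price": "12.00", "shop": "S", "rating": None, "link": "u"},
]
print(rows_to_csv(rows))
```

To save the result, open the target file with newline="" and encoding="utf-8" and write the returned string. Fixing the column order up front also keeps the header stable even when a batch happens to be missing a field.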

What the output looks like

You get a clean list of listing records, in page order, ready to write to JSON, CSV, or a database.

json
[
  {
    "title": "Women's Christian Sweatshirt, Bible Verse Shirts, Faith Based Tshirts",
    "price": "13.85",
    "shop": "FaithfulThreadsCo",
    "rating": "4.8",
    "link": "https://www.etsy.com/listing/1234567890/womens-christian-sweatshirt"
  },
  {
    "title": "Thankful Super Soft Sweatshirt, Thanksgiving Sweatshirt, Friendsgiving Shirt",
    "price": "29.99",
    "shop": "CozyFallPrints",
    "rating": "4.9",
    "link": "https://www.etsy.com/listing/9876543210/thankful-super-soft-sweatshirt"
  }
]

The price comes through as the bare currency value Etsy renders on the card, so if you want a number for analysis you can cast it to a float after stripping any thousands separators. If you would rather store the data in a queryable format instead of flat files, the same row dictionaries drop straight into a SQLite table with a short sqlite3 insert, which the legacy walkthrough covered as an alternative to CSV.
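Both ideas above, casting the price and storing rows in SQLite, can be sketched with the standard library alone. The table name listings, its column types, and the helper names are assumptions made for this example, and price_to_float assumes a comma is a thousands separator, which will not hold for locales that use a comma as the decimal mark.

```python
import sqlite3


def price_to_float(text):
    """Convert a price string such as "1,299.50" to a float, or None if invalid."""
    if not text:
        return None
    try:
        # Assumes commas are thousands separators (US-style formatting).
        return float(text.replace(",", "").strip())
    except ValueError:
        return None


def save_rows(conn, rows):
    """Insert listing dictionaries into a listings table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings "
        "(title TEXT, price REAL, shop TEXT, rating TEXT, link TEXT)"
    )
    conn.executemany(
        "INSERT INTO listings (title, price, shop, rating, link) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (
                row.get("title"),
                price_to_float(row.get("price")),
                row.get("shop"),
                row.get("rating"),
                row.get("link"),
            )
            for row in rows
        ],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path to persist the data
save_rows(conn, [{"title": "A", "price": "13.85", "shop": "S",
                  "rating": "4.8", "link": "u"}])
print(conn.execute("SELECT title, price FROM listings").fetchall())
```

Parameterized placeholders keep titles with quotes or commas from breaking the insert, and storing the price as REAL lets you sort and aggregate it directly in SQL.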

Scaling across queries

One search query is a demo; a real research job runs across many. Etsy serves a search page for any query string, so to compare niches you keep a list of queries and run the paginated scraper over each, keying the output by query name. Pace the requests so you are not crawling everything at full speed.

python
QUERIES = ["clothes", "ceramic mug", "leather wallet"]

def scrape_queries(queries, max_pages=3):
    everything = {}
    for query in queries:
        print(f"scraping: {query}")
        everything[query] = scrape_all_pages(query, max_pages=max_pages)
        time.sleep(3)
    return everything

Keying the output by query keeps each result set separate, which is what you want when comparing categories. To track trends over time, run the job on a schedule and stamp each export with the date, then diff successive snapshots to see how prices and ratings shifted. For a deeper treatment of turning this kind of feed into a pricing dataset, see how to use web scraping for price intelligence.
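The snapshot idea can be sketched in a few lines: stamp each export with the date, then compare two snapshots by listing link. The helper names below are hypothetical, and matching on the raw link is an assumption; if the links in your data carry tracking parameters, normalize them before comparing.

```python
from datetime import date


def stamped_name(prefix="etsy_listings", day=None):
    """Return a file name stem such as etsy_listings_2024-01-05."""
    day = day or date.today()
    return f"{prefix}_{day.isoformat()}"


def diff_prices(old_rows, new_rows):
    """Report listings present in both snapshots whose price text changed."""
    old_prices = {row["link"]: row.get("price") for row in old_rows if row.get("link")}
    changes = []
    for row in new_rows:
        link = row.get("link")
        if link in old_prices and old_prices[link] != row.get("price"):
            changes.append({
                "link": link,
                "old_price": old_prices[link],
                "new_price": row.get("price"),
            })
    return changes


old = [{"link": "a", "price": "10.00"}, {"link": "b", "price": "5.00"}]
new = [{"link": "a", "price": "12.00"}, {"link": "b", "price": "5.00"}]
print(stamped_name(day=date(2024, 1, 5)))
print(diff_prices(old, new))
```

Passing stamped_name() as the name argument of an export function gives each run its own pair of files, which is what makes the later comparison possible.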

Staying unblocked

Even with rendering handled, Etsy watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.

  • Pace your requests. Spread requests out with a delay between pages and queries rather than crawling everything at full speed. Schedule heavier jobs during off-peak hours to ease load on Etsy's servers.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Retain only what you need. Store the listing fields your project uses and discard the rest. Re-check your selectors periodically so the scraper keeps pace with markup changes.
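The pacing habit above extends naturally to retries: when a request fails, wait longer before each new attempt instead of retrying immediately. The wrapper below is a minimal sketch of exponential backoff; fetch_with_backoff is a hypothetical helper, and it assumes the fetch function returns None on failure, as the crawl function in this guide does.

```python
import time


def fetch_with_backoff(fetch, url, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url) until it returns a non-None result or attempts run out.

    The wait doubles after each failure: base_delay, 2x, 4x, and so on.
    """
    for attempt in range(attempts):
        result = fetch(url)
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None


# Demonstration with a stand-in fetcher that fails twice, then succeeds.
calls = {"count": 0}

def flaky_fetch(url):
    calls["count"] += 1
    return "<html>ok</html>" if calls["count"] >= 3 else None

waits = []
print(fetch_with_backoff(flaky_fetch, "https://www.etsy.com/search?q=clothes",
                         sleep=waits.append))
print(waits)
```

In the scraper you would call fetch_with_backoff(crawl, page_url) in place of crawl(page_url). The injectable sleep argument exists only so the delay can be stubbed out in tests.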

For the broader playbook on avoiding blocks, see how to scrape websites without getting blocked, and for the parsing side, the guide on using BeautifulSoup in Python covers the library in depth. If you are building toward a full research workflow, how to automate ecommerce product research shows where a listing feed like this one fits.

Whether scraping Etsy is allowed depends on Etsy's Terms of Use, your jurisdiction, and what you do with the data. Etsy's terms place limits on automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Etsy's Terms of Use and its robots.txt, and treat both as the boundary for what you collect. For commercial or competitive use, the legal picture gets more complex, and consulting a legal expert about your specific case is the sensible move.
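Reading a robots.txt file can be done programmatically with Python's standard urllib.robotparser module. The sketch below parses a made-up rule set to show the mechanics; SAMPLE_RULES is hypothetical and is not Etsy's actual robots.txt, and a robots.txt check is only one input alongside the Terms of Use, not a substitute for reading them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; fetch the real file for any real check.
SAMPLE_RULES = """User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_RULES, "https://example.com/search?q=clothes"))
print(is_allowed(SAMPLE_RULES, "https://example.com/private/data"))
```

For a live site, RobotFileParser can also load the file itself through set_url and read, after which can_fetch works the same way.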

A few lines worth holding to. Collect only public data: the listing titles, prices, shop names, ratings, and listing links that anyone can see on a search page without an account. Keep your request volume low enough that you are not straining Etsy's servers, and respect sellers' privacy: avoid personal data, including anything tied to identifiable buyers or sellers beyond the public shop name on a listing. Do not redistribute the product photos or descriptions sellers own, and do not reach for anything behind a login or any attempt to bypass authentication or a challenge you are not entitled to pass.

This guide is deliberately scoped to public Etsy search listings because that is the line that keeps the work defensible. For licensed or bulk access, Etsy offers an official Open API for developers, which is the right tool when you need large volumes, guaranteed structure, or commercial rights. If your project needs more than public listing data, that official API or a data agreement is the correct path, not a cleverer scraper.

Recap

Key takeaways

  • Etsy search pages are a public listing feed. Each query's results page carries the title, price, shop, and rating that make it useful for market research and price tracking.
  • You need rendering and a trusted IP together. Etsy fills the listing grid client-side and blocks bot traffic, so the Crawling API renders the page behind a residential IP in one call.
  • BeautifulSoup does the extraction. Loop the search-results list items and map title, price, shop, rating, and link to current selectors, and expect those selectors to drift.
  • Handle pagination and export to JSON and CSV. Read the page count, walk each result page with a delay, and write both files so the data loads into code or a spreadsheet.
  • Stay on public data. Respect Etsy's Terms of Use and robots.txt, prefer the official Etsy Open API for licensed or bulk data, and never touch accounts, personal information, or seller-owned media.

Frequently Asked Questions (FAQs)

Why does a plain request return no listings from Etsy?

Two reasons. Etsy fills much of the listing grid client-side as the page loads, so a raw request often gets a shell missing most of the cards. On top of that, Etsy challenges or blocks traffic that does not look like a real browser. Rendering the page through the Crawling API behind a trusted IP solves both, which is why the scraper here routes its request through it with ajax_wait and page_wait set.

How do I scrape a specific Etsy search query?

Every Etsy search has its own URL keyed on the query, for example https://www.etsy.com/search?q=clothes. Change the q value to the query you want, then move through result pages by appending &page=N. To cover many queries, keep a list of them and loop over the paginated scraper, pacing the requests with a short delay.

Which fields can I extract from an Etsy listing card?

From each search-result card this scraper pulls the product title, the price (the currency value Etsy renders), the shop name, the star rating, and the listing link. The legacy walkthrough also noted product descriptions and images as available on Etsy; you can add those by inspecting the card and adding the matching selectors to the parse step.

How do I handle pagination when scraping Etsy?

Read the total page count from the pagination control on the first results page, then loop from page 2 to that total, appending &page=N to the search URL each time. The scraper here caps the count with a max_pages argument so a test run stays bounded, and sleeps between requests so you are not hammering Etsy in a tight loop.

How do I save the scraped Etsy data?

The script writes both a JSON file, which keeps the structure for code, and a CSV built from a pandas DataFrame, which opens straight in a spreadsheet. The same row dictionaries also drop into a SQLite table with a short sqlite3 insert if you want a queryable store instead of flat files.

How do I avoid getting blocked while scraping Etsy?

Keep your per-IP request rate low, add a delay between pages and queries, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation, a trusted IP pool, and CAPTCHA handling for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
