OpenSea is one of the largest NFT marketplaces, and every collection and listing page carries the kind of structured data an NFT price tracker, a market-research notebook, or a rarity dashboard wants: an item name, the collection it belongs to, the current listing price in ETH, the last sale, the token id, and the image. The catch is that OpenSea is a JavaScript-heavy React application, so a plain HTTP request hands you an almost-empty shell instead of the items you came for.

This guide shows you how to scrape OpenSea data with Python the reliable way. You build a small, runnable scraper that fetches a rendered collection page through the Crawling API with a JavaScript token, parses each item with BeautifulSoup, and prints clean structured records. The whole walkthrough stays scoped to public NFT data that anyone can see on a collection page, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Python script that takes a public OpenSea collection URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every NFT on the page. We will use a public collection as the running example and pull these fields from each item card:

  • Item name: the unique name of the individual NFT, for example "Courtyard #1024".
  • Collection: the collection the item belongs to. The card parser below does not read this field; it comes from the detail-page parser later in the guide.
  • Price (ETH): the current listing price as shown on the card.
  • Last sale: the price the item previously sold for, when the card shows it.
  • Token id: the unique on-chain identifier, useful for tracking a token across platforms.
  • Image URL: the source of the item's thumbnail.
  • Item URL: the link to the individual NFT detail page.
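To make the target shape concrete, here is a minimal sketch of the record as a dataclass. The class name NFTRecord is hypothetical and not part of any library; the field names mirror the list above, and every field defaults to None because any of them can be missing on a given card.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class NFTRecord:
    """One NFT record; every field may be missing on a given card."""
    name: Optional[str] = None
    collection: Optional[str] = None
    price_eth: Optional[str] = None   # raw display string, e.g. "0.018 ETH"
    last_sale: Optional[str] = None   # raw display string, or None
    token_id: Optional[str] = None
    image_url: Optional[str] = None
    item_url: Optional[str] = None


# Hypothetical example values, just to show the shape
record = NFTRecord(name="Courtyard #1024", token_id="1024")
print(asdict(record))
```

The scraper in this guide builds plain dictionaries with the same keys, so a dataclass like this is optional; it mainly helps if you want type hints or a single place to document the fields.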

Why a plain request fails on OpenSea

If you request an OpenSea collection URL with a bare HTTP client, you get a response with status 200 and almost none of the NFT data in the body. Two things work against you. First, OpenSea is a client-side React app that builds its item grid in the browser, so the initial HTML is a frame that only fills in after the page's scripts run and the marketplace data loads over the network. Second, OpenSea flags automated traffic quickly: datacenter IPs and request patterns that do not look like a real browser get challenged or blocked before they ever reach the rendered grid.
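One way to tell an empty shell from a rendered page is to count item links in the HTML. The sketch below uses only the standard library and assumes item links contain the path fragment /assets/, which matches the example URLs in this guide but may not hold for every layout; the function name is hypothetical.

```python
from html.parser import HTMLParser


class _LinkCounter(HTMLParser):
    """Counts <a> tags whose href contains a given path fragment."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.fragment in href:
                self.count += 1


def looks_like_empty_shell(html, fragment="/assets/", min_links=1):
    """Return True when the HTML has fewer item links than expected.

    The "/assets/" fragment is an assumption about how item URLs look.
    """
    counter = _LinkCounter(fragment)
    counter.feed(html)
    return counter.count < min_links
```

A status of 200 alone tells you nothing here, so a check like this is a cheap guard to run before parsing.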

So a working OpenSea scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with a JavaScript token, it renders the page behind a trusted IP, and it returns finished HTML for you to parse. If client-side rendering is new to you, our guide on crawling JavaScript websites explains why rendering matters.

Why the JS token

Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. OpenSea is client-side rendered, so you need the JS token here. Using the normal token returns the same empty frame a plain fetch would, and there is nothing to parse out of it. You can start with 1,000 free requests, no credit card needed.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If BeautifulSoup is new to you, our guide to using BeautifulSoup in Python covers the parsing basics this tutorial assumes.

Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda.

A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token from the account docs page. Treat the token like a password: it authenticates your requests, so keep it out of version control.
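A common way to keep the token out of version control is to read it from an environment variable. This is a minimal sketch; the variable name CRAWLBASE_JS_TOKEN and the helper name are assumptions, so use whatever naming fits your setup.

```python
import os


def load_js_token(env_var="CRAWLBASE_JS_TOKEN"):
    """Read the token from the environment and fail loudly if it is missing.

    The variable name is a convention chosen for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

In the scripts that follow, you could then pass load_js_token() where the placeholder string "YOUR_CRAWLBASE_JS_TOKEN" appears.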

Set up the project

Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs.

bash
python --version

python -m venv opensea_env
source opensea_env/bin/activate

pip install crawlbase beautifulsoup4

On Windows, activate the environment with opensea_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull out individual fields by CSS selector.

Step 1: Fetch the rendered collection page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JS token, and request the collection URL. Checking the status before you parse keeps failures loud instead of silent.

python
from crawlbase import CrawlingAPI

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})

def crawl(page_url):
    options = {"ajax_wait": "true", "page_wait": 5000}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['status_code']}")
    return None

if __name__ == "__main__":
    page_url = "https://opensea.io/collection/courtyard-nft"
    html = crawl(page_url)
    print(html[:500] if html else "No HTML returned")

The two wait options matter for a client-rendered target like this. ajax_wait tells the API to wait for asynchronous content to finish loading, which is what OpenSea uses to populate its item grid, and page_wait holds for a fixed number of milliseconds after load so late-rendering cards appear before the page is captured. Five seconds is a reasonable start; raise it if the cards come back empty. Save the code as scraper.py and run it with python scraper.py. The first 500 characters are mostly the document head, so to confirm you received rendered markup and not the empty frame a plain fetch returns, search the full HTML for an item name you can see on the live page. That confirms rendering works before you write a single selector.
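If the cards do come back empty, one option is to retry with progressively longer waits. The sketch below is an assumption-heavy illustration: fetch is any callable you supply that takes a URL and an options dict and returns HTML text or None (for example, a thin wrapper around api.get), and the default marker string is borrowed from the item-name selector used later in this guide, which may change.

```python
def fetch_until_rendered(fetch, page_url, waits_ms=(5000, 8000, 12000),
                         marker="ItemCardFooter-name"):
    """Try progressively longer page_wait values until the marker appears.

    `fetch(page_url, options)` is a caller-supplied function returning HTML
    text or None. The marker is an assumed substring of rendered item cards.
    """
    for wait in waits_ms:
        options = {"ajax_wait": "true", "page_wait": wait}
        html = fetch(page_url, options)
        if html and marker in html:
            return html, wait
    # No attempt produced the marker
    return None, None
```

Each attempt is a separate API request, so keep the list of waits short.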

Step 2: Parse the item cards with BeautifulSoup

With rendered HTML in hand, load it into BeautifulSoup and pull each NFT by its selector. OpenSea lays its collection out as a grid of repeated item cards, so you select all the cards once and then read the same fields from each one. Inspect the live page in your browser's dev tools to confirm the current attributes; the selectors below match the layout at the time of writing.

python
from bs4 import BeautifulSoup

BASE = "https://opensea.io"

def text_of(card, selector):
    el = card.select_one(selector)
    return el.get_text(strip=True) if el else None

def token_id_from(href):
    # OpenSea item URLs end with the token id, e.g. /assets/.../1024
    return href.rstrip("/").split("/")[-1] if href else None

def parse_collection(html):
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select('article.AssetSearchList--asset')
    items = []

    for card in cards:
        link = card.select_one("a.Asset--anchor")
        # .get returns None instead of raising KeyError when an attribute is missing
        href = link.get("href") if link else None
        img = card.select_one("img")
        items.append({
            "name": text_of(card, 'span[data-testid="ItemCardFooter-name"]'),
            "price_eth": text_of(card, 'div[data-testid="ItemCardPrice"] span[data-id="TextBody"]'),
            "last_sale": text_of(card, 'div[data-testid="ItemCardPrice-secondary"]'),
            "token_id": token_id_from(href),
            "image_url": img.get("src") if img else None,
            "item_url": BASE + href if href else None,
        })

    return items

The text_of helper does two useful things at once: it queries a single element inside the card and returns None when that element is missing, instead of throwing on a .get_text() call against nothing. That keeps the extraction resilient when one field is absent on a given card, which is common since not every NFT shows a last-sale price or a current listing. The item name comes from the data-testid="ItemCardFooter-name" span, the listing price from the nested data-id="TextBody" span inside ItemCardPrice, and the link from the Asset--anchor anchor. The token id is the last path segment of that link, and the full item URL is the base host joined to the relative href.
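The string-based helpers above work when the href is a clean relative path. As a hedged alternative, the standard library's urllib.parse can cope with hrefs that are already absolute or that carry a query string. The function names below are hypothetical, and the sketch still assumes the token id is the last path segment.

```python
from urllib.parse import urljoin, urlparse

BASE = "https://opensea.io"


def absolute_item_url(href, base=BASE):
    """Resolve a relative or already-absolute href against the site root."""
    return urljoin(base, href) if href else None


def token_id_from_url(href):
    """Return the last path segment, ignoring query strings and fragments."""
    if not href:
        return None
    path = urlparse(href).path.rstrip("/")
    return path.split("/")[-1] or None
```

These could stand in for BASE + href and token_id_from if the live hrefs turn out to be messier than the examples in this guide.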

Selectors drift

OpenSea's class names and data-testid attributes change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back as None for every card, re-inspect a live item in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
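A small check over the parsed records can surface selector drift automatically. This sketch reports, per field, the share of records where the value is None, and flags fields that are None on every card. Treating last_sale as optional is an assumption based on the note above that not every item has sold; both function names are hypothetical.

```python
def missing_field_report(items):
    """Return {field: fraction of records where the field is None}."""
    if not items:
        return {}
    total = len(items)
    return {
        field: sum(1 for item in items if item.get(field) is None) / total
        for field in items[0].keys()
    }


def broken_selectors(items, optional=("last_sale",)):
    """List fields that are None on every card, skipping known-optional ones."""
    report = missing_field_report(items)
    return [f for f, frac in report.items() if frac == 1.0 and f not in optional]
```

Running broken_selectors on the output of parse_collection after each crawl gives you an early warning before bad data reaches anything downstream.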

Step 3: Handle scroll-based pagination

A collection page does not load every item up front. OpenSea uses infinite scroll, so more NFTs appear only as you move down the grid. Rather than reverse-engineer the pagination calls, you let the Crawling API scroll the page for you with the scroll and scroll_interval options. The API scrolls for the number of seconds you give it, which loads more cards into the same HTML you then parse.

python
def crawl_with_scroll(page_url):
    options = {
        "ajax_wait": "true",
        "scroll": "true",
        "scroll_interval": "20",  # scroll for 20 seconds, max 60
    }
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['status_code']}")
    return None

Setting scroll to true tells the API to scroll the rendered page, and scroll_interval controls how long, up to 60 seconds. A longer interval loads more items but costs more time per request, so pick a value that matches how deep into the collection you need to go. Note that when you scroll you drop page_wait, because the scroll already keeps the page open long enough for new cards to render.
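If you vary the scroll depth per collection, a small builder keeps the value inside the 60-second ceiling mentioned above. This is a sketch only: the helper name is hypothetical, and the lower bound of one second is an assumption, not a documented limit.

```python
def scroll_options(seconds, max_seconds=60):
    """Build Crawling API options for a scroll of roughly `seconds` seconds.

    Values are clamped to the range 1..max_seconds; the lower bound of 1 is
    an assumption made for this example.
    """
    clamped = max(1, min(int(seconds), max_seconds))
    return {
        "ajax_wait": "true",
        "scroll": "true",
        "scroll_interval": str(clamped),
    }
```

You could then call api.get(page_url, scroll_options(20)) in place of the hard-coded dictionary.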

Step 4: Put it together

Now wire the scroll-enabled fetch and the parser into one runnable script. Fetch the rendered HTML, hand it to the parser, and write the result to JSON so you can load it anywhere later.

python
import json
from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup

api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})
BASE = "https://opensea.io"

def crawl(page_url):
    options = {"ajax_wait": "true", "scroll": "true", "scroll_interval": "20"}
    response = api.get(page_url, options)
    if response["status_code"] == 200:
        return response["body"].decode("utf-8")
    print(f"Request failed: {response['status_code']}")
    return None

def text_of(card, selector):
    el = card.select_one(selector)
    return el.get_text(strip=True) if el else None

def token_id_from(href):
    return href.rstrip("/").split("/")[-1] if href else None

def parse_collection(html):
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select('article.AssetSearchList--asset')
    items = []

    for card in cards:
        link = card.select_one("a.Asset--anchor")
        # .get returns None instead of raising KeyError when an attribute is missing
        href = link.get("href") if link else None
        img = card.select_one("img")
        items.append({
            "name": text_of(card, 'span[data-testid="ItemCardFooter-name"]'),
            "price_eth": text_of(card, 'div[data-testid="ItemCardPrice"] span[data-id="TextBody"]'),
            "last_sale": text_of(card, 'div[data-testid="ItemCardPrice-secondary"]'),
            "token_id": token_id_from(href),
            "image_url": img.get("src") if img else None,
            "item_url": BASE + href if href else None,
        })

    return items

def main():
    page_url = "https://opensea.io/collection/courtyard-nft"
    html = crawl(page_url)
    if not html:
        return
    items = parse_collection(html)
    with open("opensea_data.json", "w") as f:
        json.dump(items, f, indent=2)
    print(f"Saved {len(items)} items")

if __name__ == "__main__":
    main()

What the output looks like

Run the full script with python scraper.py and you get a clean structured record for each NFT, ready to write to JSON, CSV, or a database. The price and last-sale fields come back as the strings OpenSea displays, ETH symbol and all, so normalize them downstream if you need numeric values.

json
[
  {
    "name": "Courtyard #1024",
    "price_eth": "0.018 ETH",
    "last_sale": "Last sale: 0.015 ETH",
    "token_id": "1024",
    "image_url": "https://i.seadn.io/s/raw/files/abc123.png",
    "item_url": "https://opensea.io/assets/matic/0x251be3.../1024"
  },
  {
    "name": "Courtyard #2087",
    "price_eth": "0.021 ETH",
    "last_sale": null,
    "token_id": "2087",
    "image_url": "https://i.seadn.io/s/raw/files/def456.png",
    "item_url": "https://opensea.io/assets/matic/0x251be3.../2087"
  }
]

The second record has a null last sale, which is the helper doing its job: not every item has sold before, so that field is simply absent rather than a crash.
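For the downstream normalization, here is a minimal sketch that pulls the first number out of a display string and returns it as a Decimal. It assumes English-style formatting (comma as thousands separator, dot as decimal point) and does not handle abbreviated values such as "1.2K"; the function name is hypothetical.

```python
import re
from decimal import Decimal

# First run of digits, optionally with "." or "," separated groups
_AMOUNT = re.compile(r"(\d+(?:[.,]\d+)*)")


def eth_amount(text):
    """Extract the first number from a string such as 'Last sale: 0.015 ETH'.

    Assumes commas are thousands separators. Returns None when there is no
    number or the input is None.
    """
    if not text:
        return None
    match = _AMOUNT.search(text)
    if not match:
        return None
    return Decimal(match.group(1).replace(",", ""))
```

Decimal avoids the rounding surprises of float, which matters once you start summing or comparing prices.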

Scraping NFT detail pages

The collection page gives you a card-level summary, but each NFT also has its own detail page with richer fields, including a longer description, full price history, and rarity rank where the collection publishes one. The approach is the same as the collection page: render the URL with the JS token, then read fields by selector. The selectors differ because the detail layout is different, so inspect a live item page before relying on them.

python
def parse_nft_detail(html, url):
    soup = BeautifulSoup(html, "html.parser")
    rank = soup.select_one('[data-testid="rarity-rank"]')
    return {
        "name": text_of(soup, "h1.item--title"),
        "collection": text_of(soup, "a.item--collection-detail"),
        "price_eth": text_of(soup, "div.Price--amount"),
        "rarity_rank": rank.get_text(strip=True) if rank else None,
        "token_id": token_id_from(url),
        "item_url": url,
    }

This reuses the same text_of and token_id_from helpers from the collection scraper, so a detail run is just a different selector set over the same fetch-then-parse loop. The item name comes from the item--title heading, the listing price from Price--amount, and the rarity rank from a rarity-rank test id when the collection exposes one. Where a collection has no rarity ranking, that field stays None, which is correct rather than a bug.
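Once you have both a card-level record and a detail-level record for the same item, you will probably want to combine them. The sketch below is one possible merge policy, not the only one: detail values win when they are present, and card values fill the gaps. The function name is hypothetical.

```python
def merge_records(card, detail):
    """Combine a card-level record with a detail-level one.

    Detail values win when they are not None; card values fill the gaps.
    Keys that exist only in the detail record are kept even when None.
    """
    merged = dict(card)
    for key, value in detail.items():
        if value is not None or key not in merged:
            merged[key] = value
    return merged
```

Because both parsers emit item_url, that field is a natural key for matching a card to its detail record before merging.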

Scaling and staying unblocked

One collection is a demo; a real job runs over several collections. The shape stays the same: keep a list of collection URLs, fetch each through the Crawling API with scroll enabled, parse it with the same function, and collect the rows. Because every collection page shares the same card structure, the parser you already wrote works across all of them without changes. Even with rendering handled, OpenSea watches for scraper-shaped traffic, so a few habits keep a run healthy, and they apply to any hard target.

  • Pace your requests. Hammering collection pages in a tight loop is the fastest way to get throttled. Spread requests out and vary your targets instead of crawling one collection at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
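The habits above can be sketched as a small loop. This is an illustration under stated assumptions rather than a tuned implementation: fetch and parse are callables you supply (for example, the crawl and parse_collection functions from this guide), a None from fetch is treated as a failure signal, and the delay values are arbitrary starting points.

```python
import random
import time


def crawl_collections(urls, fetch, parse, base_delay=3.0, max_backoff=60.0,
                      sleep=time.sleep):
    """Fetch and parse each collection URL with jittered pauses in between.

    `fetch(url)` returns HTML or None; `parse(html)` returns a list of records.
    A failed fetch doubles the pause (up to max_backoff); a success resets it.
    """
    rows, delay = [], base_delay
    for index, url in enumerate(urls):
        if index:  # no pause before the first request
            sleep(delay + random.uniform(0, delay / 2))
        html = fetch(url)
        if html is None:
            delay = min(delay * 2, max_backoff)  # back off after a failure
            continue
        delay = base_delay
        rows.extend(parse(html))
    return rows
```

The sleep parameter exists so the loop can be tested without actually waiting; in normal use you leave it at the default.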

For the broader playbook, see how to scrape websites without getting blocked. And if you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy (also called the AI Proxy) gives you the same residential IP rotation as a drop-in proxy endpoint.

Legality and responsible use

Whether scraping OpenSea is allowed depends on OpenSea's terms of service, your jurisdiction, and what you do with the data. OpenSea's terms place limits on automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read the OpenSea Terms of Service and its robots.txt, and treat both as the boundary for what you collect.
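Checking a URL against robots.txt rules can be done with the standard library's urllib.robotparser. The sketch below parses robots.txt content you have already downloaded. The sample rules and example.com URLs are hypothetical and are not OpenSea's actual rules, so fetch and read the real file yourself.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt content that you have already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only; these are NOT OpenSea's real rules.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /account/
"""

print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/collection/demo"))
print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/account/settings"))
```

A check like this covers robots.txt only; the terms of service still need to be read by a person.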

A few rules are worth holding to. Collect only public NFT data: the item name, collection, listing price, last sale, token id, image, and link that anyone can see on a collection page without an account. Respect OpenSea's stated rate expectations and keep your request volume low enough that you are not straining its servers. Token ownership and transfers are recorded on-chain, and the token contract typically points to the metadata, so for many use cases reading the blockchain directly is both cleaner and less ambiguous. Do not redistribute the artwork or media tied to an NFT as if it were yours; the image URL is fine to reference, but the underlying media is the creator's.

For production or commercial use, OpenSea publishes an official API, and that is the path it intends developers to take for marketplace data. It gives you structured listings, events, and collection stats under clear terms, without rendering pages or maintaining selectors. Scraping fits exploratory research and one-off analysis on public data; if your project needs ongoing, high-volume, or commercial access, the official API or a direct data agreement is the correct route, not a cleverer scraper.

Recap

Key takeaways

  • OpenSea is a client-side React app. A plain fetch returns an empty frame, so you must render the page before you parse it.
  • Use the JS token. The Crawling API with a JavaScript token renders the page behind a trusted IP in one call; ajax_wait and page_wait control how long it waits for content.
  • Scroll handles pagination. OpenSea loads items on infinite scroll, so pass scroll and scroll_interval instead of reverse-engineering its pagination.
  • BeautifulSoup does the extraction. Select all item cards, then read name, price in ETH, last sale, token id, image, and link from each, and expect the selectors to drift.
  • Stay on public data and prefer the official API for production. Respect OpenSea's ToS and robots.txt, scope yourself to public NFT data, and use the official API for commercial or high-volume work.

Frequently Asked Questions (FAQs)

Why does a plain fetch return no NFTs from OpenSea?

Because OpenSea is a client-side React application that builds its item grid in the browser. The initial HTML is a frame that only fills in after the page's scripts run and the marketplace data loads over the network, so a raw HTTP request returns status 200 with the grid empty. To get real data you have to render the page first, which is what the Crawling API's JS token handles for you.

Do I need the normal token or the JS token for OpenSea?

The JS token. The normal token fetches static HTML, which on OpenSea is the same empty frame a plain fetch returns. The JS token renders the page in a real browser before handing back the HTML, so the item cards are present when BeautifulSoup parses them.

How do I load more than the first screen of items?

OpenSea loads items on infinite scroll, so the rest appear only as you move down the grid. Pass scroll set to true and a scroll_interval in seconds to the Crawling API, and it scrolls the rendered page for you before capturing the HTML, so more cards are present when you parse. A longer interval loads more items at the cost of a slower request.

My selectors return None for every card. What changed?

Almost certainly OpenSea's markup. Its class names and data-testid attributes change without notice, so selectors that worked last month can break. Re-inspect a live item in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

Should I scrape OpenSea or use its official API?

For exploratory research and one-off analysis on public data, scraping a collection page is fine and is what this guide covers. For production, commercial, or high-volume use, prefer OpenSea's official API: it returns structured listings, events, and collection stats under clear terms, without rendering pages or chasing selector changes.

Can I scrape account or personal data from OpenSea?

No, and this guide does not cover it. Wallet account details, anything behind a login, and the artwork or media you would redistribute all sit outside public NFT data, so they are not in scope here and run against OpenSea's terms. Stick to the public item, collection, and listing data anyone can see, and read the blockchain directly when you need authoritative token metadata.
