Use with OpenAI
A native plugin that brings Crawlbase MCP into OpenAI Codex. Crawl any URL, extract clean Markdown, take screenshots, and optionally push results to Cloud Storage — all without leaving Codex.
What it does
The Crawlbase Codex plugin wraps Crawlbase MCP as a Codex-native plugin. Once installed, you can ask Codex to crawl a page, extract its content, or capture a screenshot in plain English — Codex picks the right tool, calls Crawlbase, and returns the result.
Powered by Crawlbase's infrastructure: JavaScript rendering, automatic proxy rotation, and built-in anti-bot bypass. Same reliability you use in production, conversational interface in Codex.
The plugin is open source: github.com/crawlbase/crawlbase-codex-plugin. Issues and PRs welcome.
Prerequisites
You need a Crawlbase account and two API tokens:
- Normal token (`CRAWLBASE_TOKEN`): for standard requests
- JavaScript token (`CRAWLBASE_JS_TOKEN`): for JS-rendered pages and screenshots

Grab both from your dashboard. See Authentication for the difference.
Install from Codex Marketplace
- Open Codex and go to Plugins → Browse Marketplace.
- Search for Crawlbase Web Scraper.
- Click Install.
- Add your `CRAWLBASE_TOKEN` and `CRAWLBASE_JS_TOKEN` when prompted.
The marketplace listing is still in review. Use manual installation below in the meantime.
Manual installation
Clone into your Codex plugins directory and set environment variables:
# Clone the plugin into Codex's plugins directory
git clone https://github.com/crawlbase/crawlbase-codex-plugin \
~/.codex/plugins/crawlbase-mcp
# Set your tokens
export CRAWLBASE_TOKEN=YOUR_TOKEN
export CRAWLBASE_JS_TOKEN=YOUR_JS_TOKEN
# Restart Codex — the plugin is auto-discovered on launch

Usage
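Before prompting Codex, you can sanity-check a token from your shell. A minimal sketch, assuming Crawlbase's public Crawling API endpoint at `api.crawlbase.com` (the plugin makes these calls for you; this only builds the request URL to pass to `curl`):

```shell
# Build a Crawling API request URL for a quick token check.
# NOTE: the endpoint shape is an assumption based on Crawlbase's HTTP API;
# the target URL must be percent-encoded before it goes in the query string.
target="https%3A%2F%2Fexample.com"   # URL-encoded https://example.com
check_url="https://api.crawlbase.com/?token=${CRAWLBASE_TOKEN:-YOUR_TOKEN}&url=${target}"
echo "$check_url"   # curl "$check_url" should return the page HTML on a valid token
```

If the request comes back with an authentication error, re-check the exported token before debugging anything inside Codex.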
Once installed, ask Codex naturally. It will pick the right tool and call Crawlbase under the hood.
# Crawling
"Crawl https://example.com and return the HTML"
"Get the markdown content of https://example.com/article"
"Take a screenshot of https://example.com"
# Device emulation
"Fetch the page at https://example.com using a mobile browser"
"Take a full-page screenshot of https://example.com and describe what you see"

Tools exposed
The plugin registers three crawl tools and six storage tools.
Crawl tools
All crawl tools accept `store: true` to push the page to Cloud Storage instead of returning it inline. Screenshot results include a `screenshot_url`; the underlying HTML can be persisted with `store: true`, but the image itself is not stored.

Storage tools
Storage tools identify a stored page by `rid` or `url`. Pass `as: "json"`, `"html"`, or `"markdown"` to choose the response shape. Bulk retrieval supports a `delete_after` flag for fire-and-forget pipelines.

Storage usage examples
"Crawl https://example.com and store it in Crawlbase Cloud Storage"
"List all stored pages in Crawlbase"
"Fetch rid abc123 from storage as markdown"
"Bulk-retrieve these 50 rids and delete them afterward"
"How many pages do I have in Crawlbase storage?"

Per-token storage silos
Storage is partitioned per token. Pages crawled with CRAWLBASE_TOKEN live in a separate silo from pages crawled with CRAWLBASE_JS_TOKEN (which covers JS-rendered pages and all screenshots).
Every crawl response includes a token_type field — "normal" or "js" — that tells you which silo a result landed in. When calling any storage tool, pass use_js_token: true if the item lives in the JS silo. Otherwise omit it.
If storage_get returns a not-found error for a RID you know exists, you're probably querying the wrong silo. Try again with use_js_token: true (or remove it if you had it set).
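The silo split maps directly onto which token authenticates the request. A minimal sketch, assuming Crawlbase's Storage API endpoint and its `rid` parameter (the MCP storage tools wrap this; inside Codex, `use_js_token: true` simply switches which token is sent):

```shell
# One RID, two possible silos: the token picks the silo.
# Endpoint and parameter names are assumptions from Crawlbase's Storage API;
# in Codex you would just toggle use_js_token instead of switching URLs.
rid="abc123"
normal_url="https://api.crawlbase.com/storage?token=${CRAWLBASE_TOKEN:-YOUR_TOKEN}&rid=${rid}"
js_url="https://api.crawlbase.com/storage?token=${CRAWLBASE_JS_TOKEN:-YOUR_JS_TOKEN}&rid=${rid}"
echo "$normal_url"   # not found here if abc123 was crawled with the JS token
echo "$js_url"
```

This is why a "not found" from `storage_get` usually means the wrong silo, not a missing page.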
Related
- Crawlbase MCP Server — the underlying MCP server the plugin wraps
- Cloud Storage — the storage backend
- Prompt patterns — battle-tested prompts you can adapt for Codex

