n8n has earned its following because it lets you automate real work without writing much code. You wire services together on a visual canvas, trigger actions on a schedule or event, and chain them into complete workflows. Pair that with an MCP server that can read the live web, and your AI steps stop guessing from stale training data and start acting on what a page actually says today.
This guide shows you how to connect n8n with Crawlbase Web MCP end to end: run the Web MCP server, add an MCP Client Tool node in n8n, point it at the server, and build a small AI workflow that crawls a page and summarizes it. No custom scraper code, no proxy pool to babysit. By the end you will have a working loop where an AI Agent reaches for a crawl tool on its own and hands you back clean, structured output.
Why connect n8n to an MCP server at all
An AI Agent in n8n is only as good as the data it can reach. On its own it can reason and write, but it cannot see a live product page, a competitor's pricing, or a news article published an hour ago. The Model Context Protocol (MCP) closes that gap: it is a standard interface that lets an AI model call external tools without a bespoke integration for each one. Run an MCP server, and any MCP-aware client, n8n included, can call whatever functions that server exposes.
The Crawlbase Web MCP server exposes crawling as a set of tools. Once it is connected, your n8n AI Agent gains the ability to fetch a URL and get back rendered HTML, clean markdown, or a screenshot, all behind Crawlbase's anti-block infrastructure. This is the same idea covered in introducing Crawlbase MCP: a plug-and-play bridge between LLM agents and the real-time web.
What you need before you start
Get these in place first and the rest of the setup is quick:
- n8n: the desktop app or n8n Cloud both work; the only difference is how n8n reaches your local MCP server.
- A Crawlbase account: sign up to get your free API tokens from the dashboard. You need both the Normal token and the JavaScript token.
- Node.js: required to run the Crawlbase Web MCP server locally.
- ngrok (optional): only for n8n Cloud users who need their local server reachable over the internet.
How n8n talks to Crawlbase through MCP
Instead of writing an HTTP integration with headers and retry logic, you run the MCP server and let n8n call into it. The flow is short:
- The MCP Client Tool node in n8n connects to the running MCP server.
- The Crawlbase Web MCP server exposes the actual crawling functions.
- Your AI Agent picks the right tool, the workflow receives the data, and the next step acts on it.
You never touch API tokens inside n8n, because the tokens live with the server you start in Step 1. n8n just calls the tools and reads the result.
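To make that exchange concrete, here is a minimal sketch of the kind of JSON-RPC 2.0 messages an MCP client sends once the session is set up. The method names `tools/list` and `tools/call` come from the MCP specification; the `url` argument name is an assumption for illustration, not taken from the Crawlbase server's schema, and `mcp_request` is a hypothetical helper. n8n builds and sends these messages for you, after an initial `initialize` handshake that is omitted here.

```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_ids = count(1)


def mcp_request(method, params=None):
    """Build a JSON-RPC 2.0 request body of the kind MCP uses (hypothetical helper)."""
    body = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        body["params"] = params
    return body


# Ask the server which tools it exposes.
list_tools = mcp_request("tools/list")

# Ask the server to run one tool. The "url" argument name is an assumption.
call_tool = mcp_request(
    "tools/call",
    {"name": "crawl_markdown", "arguments": {"url": "https://crawlbase.com/blog"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

The point of the sketch is that the client only ever names a tool and passes arguments; nothing about tokens, proxies, or rendering appears in the request.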
Step 1: Install and start the Crawlbase Web MCP server
The server is what makes the crawling tools available to n8n, so it has to be running before n8n can see anything. Open a terminal and pull down the project.
git clone https://github.com/crawlbase/crawlbase-mcp
cd crawlbase-mcp
npm install
Now start the server in HTTP transport mode, passing your two tokens as environment variables. Replace the placeholders with the real values from your Crawlbase dashboard.
CRAWLBASE_TOKEN=your_token CRAWLBASE_JS_TOKEN=your_js_token npm run start:http
The two tokens do different jobs:
- CRAWLBASE_TOKEN handles standard crawling of static pages.
- CRAWLBASE_JS_TOKEN handles pages that render their content with JavaScript, where the server spins up a real browser before returning the HTML.
If everything installed correctly, the server comes up at http://localhost:3000. Leave this terminal window open. n8n connects to this running process whenever your workflow fires, so closing it breaks the link.
Passing both tokens at startup lets the AI Agent crawl simple pages cheaply with the Normal token and still handle client-side rendered sites with the JavaScript (JS) token, without you switching anything mid-workflow. You can find both tokens in your Crawlbase dashboard, under the account documentation.
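If you export the tokens in your shell instead of passing them inline, a small pre-flight check can catch a missing or blank value before you start the server. This is only a sketch: `missing_tokens` is a hypothetical helper, and the only names taken from the guide are the two environment variables.

```python
import os

# The two environment variables the server is started with in this guide.
REQUIRED = ("CRAWLBASE_TOKEN", "CRAWLBASE_JS_TOKEN")


def missing_tokens(env=None):
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_tokens()
    if missing:
        print("Set these before starting the server:", ", ".join(missing))
    else:
        print("Both Crawlbase tokens are set.")
```

The check treats a whitespace-only value as missing, since a stray space from copy and paste is a common cause of token errors.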
Optional for n8n Cloud users: expose the server with ngrok
n8n Desktop runs on the same machine as the MCP server, so http://localhost:3000 works directly and you can skip this. n8n Cloud runs on n8n's servers, so it needs a public URL to reach the server on your machine. ngrok creates a secure tunnel for exactly that.
ngrok http 3000
ngrok prints a public HTTPS URL that points at your local port 3000. Use that URL as the endpoint in n8n instead of localhost. Because n8n connects to the server without authentication in this setup, anyone who has the URL can call the crawl tools against your Crawlbase credits, so keep the URL and your tokens private.
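The Desktop versus Cloud rule can be written down as a quick sanity check on the endpoint you are about to paste into n8n. This is a sketch under stated assumptions: `endpoint_problems` is a hypothetical helper, and the ngrok hostname in the usage line is a made-up placeholder.

```python
from urllib.parse import urlparse

# Hostnames that only resolve on your own machine.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def endpoint_problems(url, deployment):
    """List reasons the endpoint will not work for 'desktop' or 'cloud' (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return ["not a valid http(s) URL"]
    problems = []
    if deployment == "cloud":
        if parsed.hostname in LOCAL_HOSTS:
            problems.append("n8n Cloud cannot reach localhost; use the ngrok URL")
        if parsed.scheme != "https":
            problems.append("use the HTTPS tunnel URL")
    return problems


print(endpoint_problems("http://localhost:3000", "desktop"))
print(endpoint_problems("http://localhost:3000", "cloud"))
# Placeholder hostname, not a real tunnel.
print(endpoint_problems("https://example.ngrok-free.app", "cloud"))
```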
Step 2: Connect n8n to the Crawlbase Web MCP server
Open n8n and go to your workflow. Add an MCP Client Tool node next to your AI Agent node. This node is what gives the agent access to the Crawlbase crawling tools. Configure it like this:
- Endpoint: use http://localhost:3000 on n8n Desktop, or your ngrok HTTPS URL on n8n Cloud.
- Server Transport: choose HTTP Streamable, which matches the start:http mode you launched.
- Authentication: set to None; your tokens were already supplied to the server in Step 1, so n8n does not need them.
- Tools to Include: choose All so the agent can pick whichever crawl tool fits the task.
When the connection succeeds, the node lists the tools the server exposes:
{
  "tools": [
    { "name": "crawl", "description": "Crawl a URL and return HTML" },
    { "name": "crawl_markdown", "description": "Extract clean markdown from a URL" },
    { "name": "crawl_screenshot", "description": "Take a screenshot of a webpage" }
  ]
}
Seeing crawl, crawl_markdown, and crawl_screenshot confirms the link is live. If the list is empty, the connection did not complete; the troubleshooting section below covers the usual causes.
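If you copy the tool listing out of n8n, the same confirmation can be done mechanically. The sketch below parses a listing in the shape shown above and reports which of the three expected tools are absent; `missing_tools` is a hypothetical helper, and the sample text simply repeats the listing from this guide.

```python
import json

# The three tools this guide expects the server to expose.
EXPECTED = {"crawl", "crawl_markdown", "crawl_screenshot"}


def missing_tools(listing_json):
    """Return the expected tool names not present in a tool listing (hypothetical helper)."""
    listing = json.loads(listing_json)
    found = {tool["name"] for tool in listing.get("tools", [])}
    return sorted(EXPECTED - found)


sample = """
{
  "tools": [
    { "name": "crawl", "description": "Crawl a URL and return HTML" },
    { "name": "crawl_markdown", "description": "Extract clean markdown from a URL" },
    { "name": "crawl_screenshot", "description": "Take a screenshot of a webpage" }
  ]
}
"""

print(missing_tools(sample))  # prints []
```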
The Web MCP server turns Crawlbase's crawling infrastructure into tools any MCP client can call. It renders JavaScript pages in a real browser, rotates through residential IPs server-side, and returns HTML, markdown, or a screenshot, so your n8n agent reads the live web without you running a headless fleet or a proxy pool. Start on the free tier and point it at a public page.
Step 3: Build your first AI scraping workflow
With the tools connected, build a small workflow that pulls a webpage and produces a summary. It has two main parts:
- An AI Agent node that already has the MCP Client Tool wired in as a tool.
- A Chat Model node (OpenAI, Anthropic, or any model n8n supports) that the agent uses to reason and write.
Inside the AI Agent node, give it a prompt that tells it what to do. The agent reads the prompt, decides it needs to crawl, calls the right tool, and feeds the result to the model. A simple example:
Crawl https://crawlbase.com/blog and summarize the latest article in three bullet points.
Click Execute step on the AI Agent node. Within a moment the summary appears in the output. Behind the scenes the agent reached for crawl_markdown, which returns clean markdown that is far easier for a model to read than raw HTML, then passed that text to the chat model for the summary. Open the workflow execution logs to see the whole chain: the crawl request, the markdown the server returned, and the model's final answer.
That is the core loop. The agent decides which tool to use based on your prompt, so asking it to "take a screenshot of" a page steers it to crawl_screenshot instead, with no node changes on your side. If you want to go deeper on letting an agent choose tools and reason over web data, AI data extraction and how it works covers the pattern.
Troubleshooting the connection
Most setup snags fall into a few buckets. Quick checks usually clear them:
n8n cannot connect to the MCP server
- Confirm the server is still running in its terminal window and has not exited.
- Double-check the endpoint: http://localhost:3000 on Desktop, or the current ngrok HTTPS URL on Cloud (ngrok issues a new URL each restart).
- Make sure nothing else on your machine is holding port 3000.
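A quick way to run the first and last of these checks is to test whether anything is accepting connections on port 3000. This is a minimal sketch using Python's standard `socket` module; `port_is_open` is a hypothetical helper. Note that an open port only tells you that some process is listening, not that the process is the MCP server.

```python
import socket


def port_is_open(host="localhost", port=3000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


print("Something is listening on port 3000:", port_is_open())
```

If this prints False while the server terminal looks healthy, the server may be bound to a different port than the one n8n is pointed at.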
No tools appear in the MCP Client Tool node
- Verify Server Transport is set to HTTP Streamable to match the start:http launch mode.
- Refresh the tool list in the node.
- If it is still empty, restart both the MCP server and n8n, then reconnect.
Token or permission errors
- Re-copy both tokens from the dashboard to rule out stray spaces or a missing character.
- Confirm your account has enough credits to run the request.
- If a JavaScript-heavy page comes back empty, make sure CRAWLBASE_JS_TOKEN was set when you started the server.
Output is empty or looks wrong
- Open the execution logs to see exactly what the crawl returned.
- Test with a simpler, static URL first to isolate whether the issue is the page or the prompt.
- Tighten the prompt so it clearly instructs the agent to crawl before summarizing.
Where to take this next
The summary workflow is deliberately small, but the same wiring scales into real automation. A few directions worth trying:
- Crawl competitor pages on a schedule and have n8n post short summaries into Slack.
- Track product prices across several sites and trigger a notification when one changes.
- Pull a handful of articles on one topic and let the model produce a combined brief.
- Collect public business details from directories and route them into a CRM.
- Monitor how key pages change over time for SEO or compliance work.
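For the price-tracking and page-monitoring ideas, one simple approach is to store a fingerprint of the crawled markdown and compare it on the next run. The sketch below is an assumption about how you might do that, not part of Crawlbase or n8n: both function names are hypothetical, and hashing the whole page will also flag changes you may not care about, such as rotating banners.

```python
import hashlib


def fingerprint(markdown):
    """Hash page text after collapsing whitespace, so pure layout changes are ignored."""
    normalized = " ".join(markdown.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, markdown):
    """Compare a stored fingerprint with freshly crawled markdown."""
    return previous_fingerprint != fingerprint(markdown)


stored = fingerprint("Price: $10\n")
print(has_changed(stored, "Price:   $10"))  # whitespace only, prints False
print(has_changed(stored, "Price: $12"))    # prints True
```

In a workflow, the stored fingerprint would live wherever you keep state between runs, and a True result would be the condition that triggers the notification step.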
Because the MCP server runs behind Crawlbase's infrastructure, the hard parts of scraping (rendering, rotation, and avoiding blocks) are handled for you. If you do hit resistance on a tough target, how to scrape websites without getting blocked is the playbook. And if you are weighing how managed access compares to raw proxying, what is an AI proxy is a useful read. For heavier custom builds outside n8n, the Crawling API and Smart AI Proxy give you the same engine to call directly.
The appeal of this setup is that it works without code: n8n drives the automation, Crawlbase does the crawling, and MCP keeps the two in sync. Crawlbase is trusted by over 70,000 developers, so the crawling layer under your workflow is the same one that powers production scrapers at scale.
Key takeaways
- MCP is the bridge. It lets an n8n AI Agent call external tools, and the Crawlbase Web MCP server exposes crawling as those tools.
- The server holds the tokens. Start it with both CRAWLBASE_TOKEN and CRAWLBASE_JS_TOKEN; n8n connects with Authentication set to None.
- Match the transport. Launch with start:http and set the node's Server Transport to HTTP Streamable, pointing at localhost:3000 or an ngrok URL.
- The agent picks the tool. crawl, crawl_markdown, and crawl_screenshot are chosen by the agent based on your prompt.
- No code, no proxy pool. Crawlbase handles rendering, rotation, and blocks, so the workflow stays clean.
Frequently Asked Questions (FAQs)
What is the Crawlbase Web MCP server?
It is a small server that exposes Crawlbase's crawling features as Model Context Protocol tools. Any MCP-aware client, including n8n, Claude Desktop, and Cursor, can connect to it and call functions like crawl, crawl_markdown, and crawl_screenshot to fetch live web data without writing a custom integration.
Do I need both Crawlbase tokens to connect n8n?
Use both for the smoothest setup. CRAWLBASE_TOKEN handles static pages and CRAWLBASE_JS_TOKEN renders JavaScript-heavy pages in a real browser. Passing both when you start the server lets the AI Agent crawl simple and complex pages without you switching anything in the workflow.
What is the difference between n8n Desktop and n8n Cloud for this setup?
n8n Desktop runs on the same machine as the MCP server, so it can reach http://localhost:3000 directly. n8n Cloud runs on n8n's servers, so it needs a public URL; tunnel your local server with ngrok and use that HTTPS address as the endpoint instead.
Why does the MCP Client Tool node show no tools?
The most common cause is a transport mismatch: make sure Server Transport is set to HTTP Streamable to match the start:http launch mode. Also confirm the server is still running, the endpoint is correct, and port 3000 is free. Refreshing the tool list or restarting both the server and n8n usually resolves it.
Do I have to write any code to use this?
No. Beyond a couple of terminal commands to start the server, the whole workflow is built visually in n8n. The AI Agent decides which crawl tool to call based on your prompt, so you wire the nodes once and drive everything with plain instructions.
Can the AI Agent choose which crawl tool to use on its own?
Yes. With Tools to Include set to All, the agent reads your prompt and selects the fitting tool. Ask it to summarize a page and it reaches for crawl_markdown; ask for a screenshot and it uses crawl_screenshot. You steer the choice through the wording of the prompt, not through node settings.

