# Crawler APIs

Use the endpoints below to monitor crawler stats, find or delete jobs, and purge, pause, or unpause a crawler.

Note: For JS crawlers, replace the TCP token with the JS token in all API calls.

# Stats API

Get a summary of your crawlers, including concurrency, queue status, and crawl history (success and failure breakdown):

curl 'https://api.crawlbase.com/crawler/_USER_TOKEN_/stats'

To limit the crawl history to a date range, add the history_from and history_to parameters:

curl 'https://api.crawlbase.com/crawler/_USER_TOKEN_/stats?history_from=yyyy-mm-dd&history_to=yyyy-mm-dd'
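The same stats request can be made from a script. The following is a minimal Python sketch using only the standard library; the function names `stats_url` and `fetch_stats` are hypothetical, and it assumes that either date parameter may be omitted, which the documentation above does not confirm. The response body is returned as raw text because its format is not described here.

```python
from datetime import date
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.crawlbase.com/crawler"


def stats_url(token, history_from=None, history_to=None):
    """Build the Stats API URL, optionally filtered by a date range."""
    url = f"{BASE_URL}/{quote(token, safe='')}/stats"
    params = {}
    # Assumption: each bound is optional on its own; the docs only show both together.
    if history_from is not None:
        params["history_from"] = history_from.isoformat()  # yyyy-mm-dd
    if history_to is not None:
        params["history_to"] = history_to.isoformat()  # yyyy-mm-dd
    return f"{url}?{urlencode(params)}" if params else url


def fetch_stats(token, history_from=None, history_to=None, timeout=30):
    """Send the GET request and return the response body as text."""
    with urlopen(stats_url(token, history_from, history_to), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `fetch_stats("_USER_TOKEN_", date(2024, 1, 1), date(2024, 1, 31))` would request the history for January 2024.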

# Purge API

To purge a specific crawler, make this POST request with the crawler name and token (JS/TCP):

curl -X POST 'https://api.crawlbase.com/crawler/_USER_TOKEN_/YourCrawlerName/purge'

Note: Purging immediately removes all pages from the crawler.

# Delete Job API

To delete a job from a crawler, send this POST request with the request RID, crawler name, and token (JS/TCP):

curl -X POST 'https://api.crawlbase.com/crawler/_USER_TOKEN_/YourCrawlerName/delete_job?rid=RID'
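As a rough illustration, the delete request could be built in Python as shown below. The helper names `delete_job_request` and `delete_job` are hypothetical. The sketch URL-encodes the RID and the crawler name in case they contain reserved characters, and it returns only the HTTP status code because the response body is not documented here.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.crawlbase.com/crawler"


def delete_job_request(token, crawler_name, rid):
    """Build a POST request that deletes the job with the given RID."""
    path = f"{BASE_URL}/{quote(token, safe='')}/{quote(crawler_name, safe='')}/delete_job"
    # Like `curl -X POST` without data, the request carries no body.
    return Request(f"{path}?{urlencode({'rid': rid})}", method="POST")


def delete_job(token, crawler_name, rid, timeout=30):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urlopen(delete_job_request(token, crawler_name, rid), timeout=timeout) as resp:
        return resp.status
```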

# Find Job API

To find a request by RID in your crawler's queue, send this GET request with the request RID, crawler name, and token (JS/TCP):

curl 'https://api.crawlbase.com/crawler/_USER_TOKEN_/YourCrawlerName/find_by_rid/RID'

Responses:

  • If QUEUED:
{
  "status": "QUEUED",
  "request_info": {
    "rid": "YOUR_RID",
    "url": "YOUR_URL",
    "retry": 3,
    "created_at": 1600494969.189415
  }
}
  • If NOT_QUEUED (the request has already been crawled or is not in the queue):
{
  "status": "NOT_QUEUED",
  "request_info": {
    "rid": "YOUR_RID"
  }
}
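The two response shapes above can be handled with a small parser. This is a sketch only: the function name `parse_find_response` is hypothetical, and it assumes that `created_at` is a Unix timestamp in seconds, which the sample value suggests but the documentation does not state.

```python
import json
from datetime import datetime, timezone


def parse_find_response(body):
    """Turn a find_by_rid response body into a flat dictionary."""
    data = json.loads(body)
    info = data.get("request_info", {})
    result = {"queued": data.get("status") == "QUEUED", "rid": info.get("rid")}
    if result["queued"]:
        result["url"] = info.get("url")
        result["retry"] = info.get("retry")
        # Assumption: created_at is seconds since the Unix epoch.
        if "created_at" in info:
            result["created_at"] = datetime.fromtimestamp(
                info["created_at"], tz=timezone.utc
            )
    return result
```

A caller can then branch on `result["queued"]` instead of comparing status strings.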

# Pause API

To pause a crawler, use this POST request with the crawler name and token (JS/TCP):

curl -X POST 'https://api.crawlbase.com/crawler/_USER_TOKEN_/YourCrawlerName/pause'

# Unpause API

To unpause a crawler, make this POST request with the crawler name and token (JS/TCP):

curl -X POST 'https://api.crawlbase.com/crawler/_USER_TOKEN_/YourCrawlerName/unpause'
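Because the pause and unpause endpoints differ only in the last path segment, a single helper can cover both. The sketch below is one possible way to do this in Python; `pause_request` and `set_paused` are hypothetical names, and only the HTTP status code is returned because the response body is not documented here.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://api.crawlbase.com/crawler"


def pause_request(token, crawler_name, paused=True):
    """Build the POST request for the pause (or unpause) endpoint."""
    action = "pause" if paused else "unpause"
    url = f"{BASE_URL}/{quote(token, safe='')}/{quote(crawler_name, safe='')}/{action}"
    return Request(url, method="POST")


def set_paused(token, crawler_name, paused, timeout=30):
    """Pause or unpause the crawler and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urlopen(pause_request(token, crawler_name, paused), timeout=timeout) as resp:
        return resp.status
```

For example, `set_paused("_USER_TOKEN_", "YourCrawlerName", True)` pauses the crawler, and passing `False` resumes it.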