# Response

Every request to Crawlbase returns a response.

The response is either a JSON object or the HTML of the page, depending on the value of the format parameter (the default is html).

# HTML Response

If you selected the html response format (the default), you'll receive the HTML of the page as the response body.

The response parameters will be added to the response headers.

GET 'https://api.crawlbase.com/?token=_USER_TOKEN_&url=https%3A%2F%2Fgithub.com%2Fcrawlbase%3Ftab%3Drepositories&format=html'
Response:
  Headers:
    url: https://github.com/crawlbase?tab=repositories
    original_status: 200
    pc_status: 200
    'X-Domain-Complexity': standard

  Body:
    <!doctype html><html class="a-no-js" data-19ax5a9jf="dingo"><!-- sp:feature:head-start -->
    <head><script>var aPageStart = (new Date()).getTime();</script><meta charset="utf-8">
    ... (all the html of the page)
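The request above could be sketched in Python with only the standard library, roughly as follows. The helper names `build_request_url`, `read_response_parameters`, and `fetch_html` are illustrative and not part of any Crawlbase client; the header names are the ones shown in the example, and error handling is left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://api.crawlbase.com/"


def build_request_url(token, target_url, response_format="html"):
    """Build the API URL; urlencode percent-encodes the target URL."""
    query = urlencode({"token": token, "url": target_url, "format": response_format})
    return f"{API_ENDPOINT}?{query}"


def read_response_parameters(headers):
    """Pick the response parameters out of a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "url": lowered.get("url"),
        "original_status": int(lowered["original_status"]),
        "pc_status": int(lowered["pc_status"]),
        "domain_complexity": lowered.get("x-domain-complexity"),
    }


def fetch_html(token, target_url):
    """Perform the request and return (parameters, html).

    Needs network access and a real token. urlopen raises
    urllib.error.HTTPError for non-2xx HTTP responses; that case is
    not handled in this sketch.
    """
    with urlopen(build_request_url(token, target_url)) as response:
        parameters = read_response_parameters(dict(response.headers.items()))
        html = response.read().decode("utf-8", errors="replace")
    return parameters, html
```

Calling `build_request_url("_USER_TOKEN_", "https://github.com/crawlbase?tab=repositories")` produces the same URL as the GET line in the example.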

# JSON Response

If you selected the json response format, you'll receive a JSON object that you can parse.

This object contains everything returned for the request. See the response parameters in the Headers section for what each field means.

GET 'https://api.crawlbase.com/?token=_USER_TOKEN_&url=https%3A%2F%2Fgithub.com%2Fcrawlbase%3Ftab%3Drepositories&format=json'
Response:
{
  "original_status": "200",
  "pc_status": 200,
  "url": "https://github.com/crawlbase?tab=repositories",
  "domain_complexity": "standard",
  "body": "\u003C!doctype html\u003E\u003Chtml class=\"a-no-js\" data-19ax5a9jf\n... (all the html of the page)"
}
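A minimal sketch of parsing such a response in Python. The function name `parse_json_response` is hypothetical, and the sample is a shortened copy of the example above. Since the example shows original_status as a string and pc_status as a number, the sketch converts both to integers rather than relying on either type.

```python
import json

# Shortened copy of the example response. The raw string keeps the JSON
# escape sequences intact so that json.loads is the one decoding them.
SAMPLE_RESPONSE = r'''{
  "original_status": "200",
  "pc_status": 200,
  "url": "https://github.com/crawlbase?tab=repositories",
  "domain_complexity": "standard",
  "body": "\u003C!doctype html\u003E\u003Chtml class=\"a-no-js\""
}'''


def parse_json_response(text):
    """Decode a json-format response and normalise the status fields to int."""
    data = json.loads(text)
    return {
        "url": data["url"],
        "original_status": int(data["original_status"]),
        "pc_status": int(data["pc_status"]),
        "domain_complexity": data.get("domain_complexity"),
        "body": data["body"],
    }


parsed = parse_json_response(SAMPLE_RESPONSE)
```

After parsing, `parsed["body"]` holds the page HTML as an ordinary string, with the `\u003C` and `\u003E` escapes decoded to `<` and `>`.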

# Headers

As the examples above show, every response carries a set of parameters that tell you what happened with the request. In the json format they are fields of the JSON object; in the html format they are sent as response headers.

# URL

The original URL that was sent in the request or, if Crawlbase followed a redirect, the URL of that redirect.

# original_status

The status code that Crawlbase receives when crawling the URL sent in the request.

It can be any valid HTTP status code.

Please note that Crawlbase only charges for a request when pc_status is 200 and original_status is one of the following: success (200, 201, 204), permanent redirect (301), temporary redirect (302) if the followed redirect returned content, or not found (404, 410). Requests with any other original_status are not charged.
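The charging rule above can be written as a small predicate. This is only a sketch of the rule as worded here, not Crawlbase's billing logic: the function name `is_charged` and the `redirect_returned_content` flag are hypothetical (the response does not expose such a flag), and applying the "returned content" condition to 302 only is an assumption.

```python
# original_status codes that are billable whenever pc_status is 200.
ALWAYS_BILLABLE = frozenset({200, 201, 204, 301, 404, 410})


def is_charged(original_status, pc_status, redirect_returned_content=False):
    """Return True if a request with these statuses would be charged."""
    # A request is never charged unless pc_status is 200.
    if pc_status != 200:
        return False
    # Assumption: a 302 is only billable when the followed redirect
    # returned content.
    if original_status == 302:
        return redirect_returned_content
    return original_status in ALWAYS_BILLABLE
```

For instance, `is_charged(200, 503)` is False because pc_status is not 200, while `is_charged(404, 200)` is True.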

# pc_status

The Crawlbase (pc) status code is the one that determines whether the request succeeded. It can be any status code and may differ from original_status.
For example, a website might return original_status 200 together with a captcha page; in that case, pc_status will be 503.

Non-standard codes such as 601 or 999 are used internally by the engineering team and are exposed only to help you debug problems when contacting support.

Please note that requests that end up with an unsuccessful pc_status (anything other than 200) are not charged.
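One way to act on pc_status in client code is to sort it into three buckets, as in the sketch below. The function name `describe_pc_status` and the bucket labels are made up for illustration, and treating "standard" as "known to Python's `http.HTTPStatus`" is an assumption, not Crawlbase's definition.

```python
from http import HTTPStatus


def describe_pc_status(pc_status):
    """Classify a pc_status as 'success', 'failed', or 'internal'."""
    if pc_status == 200:
        return "success"
    try:
        # HTTPStatus raises ValueError for codes it does not know,
        # such as 601 or 999.
        HTTPStatus(pc_status)
    except ValueError:
        # Non-standard code: quote it when contacting support.
        return "internal"
    return "failed"
```

With this sketch, the captcha case above (`describe_pc_status(503)`) is classified as "failed", while 601 and 999 are classified as "internal".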

# X-Domain-Complexity

The complexity level indicates how difficult it is to crawl or scrape a given domain, and also reflects the associated resource requirements and pricing:

  • standard – Domains that are easy to crawl or scrape, with minimal protection measures. These domains typically have the lowest pricing tier.
  • moderate – Domains with moderate anti-bot protections that require specialized handling. These domains typically have an intermediate pricing tier and are more resource-intensive to process.
  • complex – Domains with advanced protection systems that are challenging to crawl or scrape. These require advanced techniques and specialized resources, reflected in the highest pricing tier.

Understanding the complexity level of different domains helps you estimate potential pricing and technical considerations for your crawling tasks. For specific pricing information based on domain complexity levels, please refer to your subscription plan or contact our sales team through the Contact page.
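If you want to track which complexity levels your crawls hit, the reported values can be tallied as in this sketch. The names `DomainComplexity`, `parse_domain_complexity`, and `count_by_complexity` are hypothetical; only the three level names come from the list above, and no pricing is modelled.

```python
from collections import Counter
from enum import Enum


class DomainComplexity(Enum):
    STANDARD = "standard"
    MODERATE = "moderate"
    COMPLEX = "complex"


def parse_domain_complexity(value):
    """Convert a reported complexity value into a DomainComplexity member.

    Raises ValueError for a value outside the three documented levels.
    """
    return DomainComplexity(value.strip().lower())


def count_by_complexity(values):
    """Count how many responses fall into each complexity level."""
    counts = Counter(parse_domain_complexity(value) for value in values)
    return {level: counts.get(level, 0) for level in DomainComplexity}
```

The input can come from either format: the X-Domain-Complexity header in html mode or the domain_complexity field in json mode.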

# body

This parameter is only available in the json format; in the html format, the content is the body of the response itself.

The content of the page that Crawlbase found as a result of proxy crawling the URL sent in the request.