# Response
Every request to Crawlbase returns a response.
The response is either a JSON object or the HTML of the page, depending on the value you pass in the format parameter (the default is html).
# HTML Response
If you selected the html response format (which is the default), you'll receive the HTML of the page as the response body.
The response parameters are added to the response headers.
GET 'https://api.crawlbase.com/?token=_USER_TOKEN_&url=https%3A%2F%2Fgithub.com%2Fcrawlbase%3Ftab%3Drepositories&format=html'
Response:
Headers:
url: https://github.com/crawlbase?tab=repositories
original_status: 200
pc_status: 200
Body:
<!doctype html><html class="a-no-js" data-19ax5a9jf="dingo"><!-- sp:feature:head-start -->
<head><script>var aPageStart = (new Date()).getTime();</script><meta charset="utf-8">
... (all the html of the page)
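As a minimal sketch of consuming the html format in Python, the helpers below build the request URL used in the example above and read the response parameters from the headers. The names build_request_url and read_response_parameters are hypothetical (they are not part of any Crawlbase client), and the sketch assumes the header values arrive as strings, as in the example.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.crawlbase.com/"


def build_request_url(token, target_url, response_format="html"):
    """Build the GET URL; urlencode percent-encodes the target url."""
    query = urlencode(
        {"token": token, "url": target_url, "format": response_format}
    )
    return f"{API_ENDPOINT}?{query}"


def read_response_parameters(headers):
    """Extract url, original_status and pc_status from a header mapping.

    Hypothetical helper: header names are matched case-insensitively and
    the two status values are converted to integers.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "url": lowered.get("url"),
        "original_status": int(lowered["original_status"]),
        "pc_status": int(lowered["pc_status"]),
    }
```

The headers mapping can come from whichever HTTP client you use; the page HTML itself is simply the response body.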
# JSON Response
If you selected the json response format, you'll receive a JSON object that you can parse.
This object contains the page content together with the response parameters. See the parameter descriptions below for details.
GET 'https://api.crawlbase.com/?token=_USER_TOKEN_&url=https%3A%2F%2Fgithub.com%2Fcrawlbase%3Ftab%3Drepositories&format=json'
Response:
{
"original_status": "200",
"pc_status": 200,
"url": "https%3A%2F%2Fgithub.com%2Fcrawlbase%3Ftab%3Drepositories",
"body": "\u003C!doctype html\u003E\u003Chtml class=\"a-no-js\" data-19ax5a9jf\n... (all the html of the page)"
}
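A JSON response like the one above could be parsed along these lines. This is only a sketch: parse_json_response is a hypothetical helper, and it assumes that the status fields may arrive either as strings or as numbers and that the url field is percent-encoded, as in the example.

```python
import json
from urllib.parse import unquote


def parse_json_response(raw_text):
    """Parse a format=json response into plain Python values.

    Assumptions (taken from the example, not from a formal schema):
    statuses may be strings or numbers, and url is percent-encoded.
    """
    data = json.loads(raw_text)
    return {
        "original_status": int(data["original_status"]),
        "pc_status": int(data["pc_status"]),
        "url": unquote(data["url"]),
        "body": data["body"],  # json.loads already decodes \u003C etc.
    }
```

Converting both status values with int() avoids having to care whether a given field was serialized as "200" or 200.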
# Headers
As the examples above show, every response includes parameters that tell you what happened with the request. They are returned as fields of the JSON object in json format, or as response headers in html format.
# url
The original URL that was sent in the request, or the URL of the redirect that Crawlbase followed.
# original_status
The status code that we (Crawlbase) receive when crawling the URL sent in the request.
It can be any valid HTTP status code.
Please note that Crawlbase only charges for requests whose original_status is a success (200, 201, 204), a permanent redirect (301), a temporary redirect (302) if the followed redirect returned content, or not found (404, 410), and only when pc_status is 200. Requests with any other original_status code are not charged.
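The charging rule can be expressed as a small predicate. This is one reading of the rule, not Crawlbase's actual billing logic: is_charged and its redirect_returned_content flag are hypothetical, and the sketch assumes the "returned content" condition applies to 302 only.

```python
SUCCESS_CODES = {200, 201, 204}
NOT_FOUND_CODES = {404, 410}


def is_charged(original_status, pc_status, redirect_returned_content=False):
    """Return True if a request would be charged under the rule above.

    Assumption: redirect_returned_content is a flag the caller supplies
    for 302 responses; it is not a field of the Crawlbase response.
    """
    if pc_status != 200:
        return False  # only requests with pc_status 200 are charged
    if original_status in SUCCESS_CODES or original_status in NOT_FOUND_CODES:
        return True
    if original_status == 301:
        return True
    if original_status == 302:
        return redirect_returned_content
    return False  # any other original_status is not charged
```

For example, is_charged(200, 503) is False because pc_status is not 200, while is_charged(404, 200) is True.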
# pc_status
The Crawlbase (pc) status code can be any status code, and it is the code that ultimately counts for the request.
For example, a website might return original_status 200 with a captcha; in that case, pc_status will be 503.
Any non-standard code, such as 601 or 999, is used internally by the engineering team and is only exposed to help you debug problems when contacting support.
Please note that requests made to Crawlbase that end up with an unsuccessful pc_status code (anything other than 200) won't be charged.
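When handling responses, it can help to separate the three cases described above. The sketch below is an assumption-laden illustration: classify_pc_status is a hypothetical helper, and it treats a code as "standard" when Python's http.HTTPStatus enum knows it, which may not match Crawlbase's own definition.

```python
from http import HTTPStatus


def classify_pc_status(pc_status):
    """Classify a pc_status value as 'success', 'failed' or 'internal'.

    Assumption: a code is 'standard' if http.HTTPStatus defines it;
    anything else (for example 601 or 999) is treated as internal.
    """
    if pc_status == 200:
        return "success"
    try:
        HTTPStatus(pc_status)  # raises ValueError for unknown codes
    except ValueError:
        return "internal"  # worth quoting when contacting support
    return "failed"
```

A caller might retry on "failed" and log the raw code on "internal", since only "success" responses are charged.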
# body
This parameter is only available in json format; in html format, the content is the body of the response itself.
It holds the content of the page that Crawlbase found as a result of crawling the URL sent in the request.