# Bulk
The `/bulk` endpoint allows clients to retrieve data in bulk using a list of Request IDs (RIDs). This operation supports efficient data retrieval for large datasets and provides an option to automatically delete the fetched items from storage after retrieval.
# Parameters
Send a JSON object with the following properties:
- `rids` (required): An array of RIDs for the data you want to retrieve.
- `auto_delete` (optional): A boolean that, when set to `true`, automatically deletes the fetched items from storage after they are retrieved. The default value is `false`, meaning items are not deleted unless explicitly requested.
# Request
To retrieve and automatically delete data for three RIDs:
curl -X POST 'https://api.crawlbase.com/storage/bulk?token=_USER_TOKEN_' \
-H 'Content-Type: application/json' \
-d '{ "rids": ["RID1","RID2","RID3"], "auto_delete": true }'
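The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_bulk_request` and `fetch_bulk` and the 30-second timeout are assumptions made for this example, while the URL, the `token` query parameter, and the JSON properties come from the curl command above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BULK_URL = "https://api.crawlbase.com/storage/bulk"


def build_bulk_request(token, rids, auto_delete=False):
    """Build (but do not send) a POST request for the /bulk endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    query = urllib.parse.urlencode({"token": token})
    payload = {"rids": list(rids), "auto_delete": bool(auto_delete)}
    return urllib.request.Request(
        f"{BULK_URL}?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_bulk(token, rids, auto_delete=False, timeout=30):
    """Send the request and return the parsed JSON array of stored items."""
    request = build_bulk_request(token, rids, auto_delete)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_bulk("_USER_TOKEN_", ["RID1", "RID2", "RID3"], auto_delete=True)` would mirror the curl command. Splitting request construction from sending makes it possible to inspect the payload before any data is deleted.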
# Response
The response is a JSON array of objects, each representing the data for one RID. Note that the `body` field is base64 encoded and gzip compressed. You will need to base64 decode and then gzip decompress it to retrieve the original content.
[
  {
    "stored_at": "2021-03-01T14:22:58+02:00",
    "original_status": 200,
    "pc_status": 200,
    "rid": "RID1",
    "url": "URL1",
    "body": "BODY1"
  },
  {
    "stored_at": "2021-03-01T14:30:51+02:00",
    "original_status": 200,
    "pc_status": 200,
    "rid": "RID2",
    "url": "URL2",
    "body": "BODY2"
  }
]
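The two decoding steps (base64 decode, then gzip decompress) might look like the following in Python. This is a sketch under stated assumptions: `decode_body` and `decode_items` are hypothetical helper names, and decoding the result as UTF-8 text is an assumption that may not hold for every stored page.

```python
import base64
import gzip


def decode_body(encoded_body):
    """Return the original bytes of a `body` field.

    Step 1: base64 decode. Step 2: gzip decompress.
    """
    compressed = base64.b64decode(encoded_body)
    return gzip.decompress(compressed)


def decode_items(items, encoding="utf-8"):
    """Map each RID in a /bulk response to its decoded content as text.

    Assumption: the content is text; undecodable bytes are replaced
    rather than raising an error.
    """
    return {
        item["rid"]: decode_body(item["body"]).decode(encoding, errors="replace")
        for item in items
    }
```

For binary content, `decode_body` alone returns the raw bytes without assuming a text encoding.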
# Notes
For efficient use of the `/bulk` API, please take note of the following:
- The maximum number of RIDs that can be processed per request is 100. If more than 100 RIDs are sent, only the first 100 will be processed.
- The `auto_delete` feature is particularly useful for maintaining storage efficiency and managing data lifecycle without requiring separate deletion requests. Use this feature judiciously to avoid unintentional data loss.
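Because only the first 100 RIDs in a request are processed, a client holding a longer list would need to split it into batches before sending. A minimal sketch, assuming a hypothetical helper named `chunk_rids`:

```python
def chunk_rids(rids, batch_size=100):
    """Split RIDs into consecutive batches of at most `batch_size` items.

    The default of 100 matches the per-request limit stated above.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rids = list(rids)
    return [rids[i:i + batch_size] for i in range(0, len(rids), batch_size)]
```

Each batch can then be sent as its own `/bulk` request. When `auto_delete` is set to `true`, it may be safer to confirm that a batch's response was stored locally before sending the next one, since the fetched items are removed from storage.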