In this blog, we will show how to use the Crawlbase Smart AI Proxy to extract the ASIN for a selected Amazon product. We will also show how to pass Crawlbase Crawling API parameters to the Smart AI Proxy for more control over how data is crawled. In the end, we will have structured JSON of the Amazon product page for easy consumption. We will also answer a few frequent questions we get about web scraping Amazon and Amazon product pages, also known as ASIN pages.
You can use our Amazon scraper to extract all kinds of data from the platform. Try it now.
Step-by-Step: Extracting Amazon ASIN with Crawlbase Smart AI Proxy
Step 1: Start by creating a free Crawlbase account to access your Smart AI Proxy token.
Step 2: Navigate to the Crawlbase Smart AI Proxy Dashboard to retrieve your free access token, found under the "Connection details" section.

Step 3: Select the Amazon product you wish to crawl. For this example, let's crawl the Amazon product page for the OtterBox iPhone 14 Pro Max (ONLY) Commuter Series Case. The URL is as follows:
https://www.amazon.com/OtterBox-COMMUTER-iPhone-Pro-ONLY/dp/B0B7CH8DMR/
Step 4: To send a request to the Smart AI Proxy, copy the following line and paste it into your terminal:
curl -x "http://[email protected]:8012" -k "https://www.amazon.com/OtterBox-COMMUTER-iPhone-Pro-ONLY/dp/B0B7CH8DMR/"
This curl command can also be found in the Crawlbase Smart AI Proxy Documentation. Remember to replace "USER_TOKEN" with your access token and insert the URL of the product you wish to crawl.
As you can see, the curl command has two options. The -x option (equivalent to --proxy) specifies the proxy host:port together with the proxy authentication. The Crawlbase Smart AI Proxy does not require a password, because proxy usernames are unique and secure; the username (your USER_TOKEN) is enough for proxy authentication. If your web scraping application requires a password, add any string you prefer, such as your company name or simply "Crawlbase".
We also added the -k flag (or --insecure). With -k, curl allows connections to SSL/TLS-protected (HTTPS) sites without verifying the authenticity of the certificate presented by the server. The Smart AI Proxy requires this option: it lets the proxy forward the request to the Crawling API and bypass CAPTCHAs and blocks before sending the request to the originally requested website. The -k (--insecure) flag is mandatory when sending requests to the Smart AI Proxy.
Step 5: If performed correctly, you should receive an HTML response similar to the one shown in this screenshot.

In the above example, we crawled the target Amazon page, and the ASIN we were looking for appears in the HTML as currentAsin:
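Once you have the HTML, the currentAsin value can be pulled out programmatically. The snippet below is a minimal sketch, assuming the value is embedded roughly as "currentAsin" : "B0B7CH8DMR"; the exact markup on a live page may differ, and the helper name extract_current_asin and the sample fragment are illustrative only.

```python
import re
from typing import Optional

# Assumed pattern: the page embeds something like "currentAsin" : "B0B7CH8DMR".
# The regex tolerates optional quotes and whitespace around the separator.
CURRENT_ASIN_RE = re.compile(
    r"""currentAsin["']?\s*[:=]\s*["']([A-Z0-9]{10})["']"""
)


def extract_current_asin(html: str) -> Optional[str]:
    """Return the first currentAsin value found in the HTML, or None."""
    match = CURRENT_ASIN_RE.search(html)
    return match.group(1) if match else None


# Hypothetical fragment for illustration, not a real Amazon response.
sample_html = '<script>var data = {"currentAsin" : "B0B7CH8DMR"};</script>'
print(extract_current_asin(sample_html))  # B0B7CH8DMR
```

If the function returns None, the page layout probably differs from the assumed pattern and the regex needs adjusting.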

Scraping Amazon ASIN using Python and Smart AI Proxy
In the last section, we used curl to make a basic request that returns the scraped data for a product page, from which we extracted the ASIN. For more advanced usage, we will now use Python to automate these requests and parse the response.
For the Python code, we will use only the requests library and create a file named smartproxy_amazon_scraper.py.
import requests
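A minimal sketch of what smartproxy_amazon_scraper.py could contain, mirroring the curl command from the previous section. The proxy host name is an assumption (check the Smart AI Proxy documentation for the exact value), and the helper names build_proxies and fetch_product_html are hypothetical. Only port 8012 and the product URL come from the steps above.

```python
# smartproxy_amazon_scraper.py (sketch)
# pip install requests

# Assumption: the proxy host name. Port 8012 is taken from the curl command.
PROXY_HOST = "smartproxy.crawlbase.com"
PROXY_PORT = 8012
PRODUCT_URL = "https://www.amazon.com/OtterBox-COMMUTER-iPhone-Pro-ONLY/dp/B0B7CH8DMR/"


def build_proxies(token: str) -> dict:
    """Build a requests-style proxies mapping with the token as username."""
    # The trailing colon supplies an empty password so that the username
    # is parsed as proxy credentials.
    proxy_url = f"http://{token}:@{PROXY_HOST}:{PROXY_PORT}"
    return {"http": proxy_url, "https": proxy_url}


def fetch_product_html(token: str, url: str = PRODUCT_URL) -> str:
    """Fetch the page through the proxy and return the response body."""
    import requests  # imported here so build_proxies works without it

    response = requests.get(
        url,
        proxies=build_proxies(token),
        verify=False,  # mirrors curl's -k flag
        timeout=90,
    )
    response.raise_for_status()
    return response.text
```

Adding a final line such as print(fetch_product_html("USER_TOKEN")), with your own token substituted, would print the page HTML when the file is run.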
Then you can simply run the above script in your terminal with python smartproxy_amazon_scraper.py.

This is the successful response you get in your terminal, in the form of HTML. You can parse this response and structure the data, which can then be stored in a database for easy retrieval and analysis.
Customizing Requests with Crawling API Parameters
Let's dive deeper by exploring how to customize Smart AI Proxy requests using Crawlbase's Crawling API parameters. You can pass these parameters to the Smart AI Proxy as a header of the form CrawlbaseAPI-Parameters: ... For example:
Example #1:
In this Python script, we set the CrawlbaseAPI-Parameters to autoparse=true. This API call instructs the Smart AI Proxy to automatically parse the page and return a JSON response. You can then use this structured data as per your requirement.
# pip install requests
After running the above call in the terminal, you'll get the response in JSON format, and you can see that the data looks much more structured now.
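A sketch of how the autoparse=true request might look in full. The header name CrawlbaseAPI-Parameters and the parameter come from the text above; the proxy host name and the helper names api_parameters_header and fetch_autoparsed are assumptions for illustration. Encoding several parameters into one header with "&" is also an assumption, so the example sticks to a single parameter.

```python
# pip install requests
from urllib.parse import urlencode

PROXY_HOST = "smartproxy.crawlbase.com"  # assumption, check the documentation
PROXY_PORT = 8012


def api_parameters_header(params: dict) -> dict:
    """Encode Crawling API parameters into the CrawlbaseAPI-Parameters header."""
    return {"CrawlbaseAPI-Parameters": urlencode(params)}


def fetch_autoparsed(token: str, url: str) -> dict:
    """Request the page with autoparse=true and decode the JSON body."""
    import requests  # imported here so the header helper works without it

    # Empty password after the colon; the token acts as the proxy username.
    proxy_url = f"http://{token}:@{PROXY_HOST}:{PROXY_PORT}"
    response = requests.get(
        url,
        headers=api_parameters_header({"autoparse": "true"}),
        proxies={"http": proxy_url, "https": proxy_url},
        verify=False,  # mirrors curl's -k flag
        timeout=90,
    )
    response.raise_for_status()
    return response.json()


print(api_parameters_header({"autoparse": "true"}))
# {'CrawlbaseAPI-Parameters': 'autoparse=true'}
```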

Example #2:
To route your requests through a particular country, include the "country=" parameter with a two-character country code, such as "country=US". The example below uses "country=GB":
# pip install requests
After running the above call in the terminal, you'll get the response in HTML as shown below:

You can save the output HTML as smartproxy_amazon_scraper.html on your local machine. When you open the HTML file in the browser, you will notice the page says United Kingdom under "Deliver to", which means your request to Amazon was routed from GB as we instructed the API in the code above.
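The two pieces described here, building the country header and saving the returned HTML to disk, can be sketched as small helpers. The function names country_header and save_html are hypothetical; the header name, the two-letter code format, and the file name come from the text above.

```python
from pathlib import Path


def country_header(country_code: str) -> dict:
    """Build the CrawlbaseAPI-Parameters header for a two-letter country code."""
    code = country_code.strip().upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {country_code!r}")
    return {"CrawlbaseAPI-Parameters": f"country={code}"}


def save_html(html: str, path: str = "smartproxy_amazon_scraper.html") -> Path:
    """Write the response body to disk so it can be opened in a browser."""
    target = Path(path)
    target.write_text(html, encoding="utf-8")
    return target


print(country_header("gb"))
# {'CrawlbaseAPI-Parameters': 'country=GB'}
```

Pass the returned dictionary as the headers argument of your request, then hand the response text to save_html.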

In the above two examples, we showed how to crawl a webpage using Crawlbase Smart AI Proxy and how to take advantage of the Crawlbase Crawling API via CrawlbaseAPI-Parameters. Specifically, we introduced the autoparse=true parameter, which provides structured output for easier data processing, and the country=GB parameter (or any valid two-letter country code), which enables targeted geolocation.
Crawlbase Smart AI Proxy Made Redirects Easy!
Usually, proxies don't handle URL redirects, but Crawlbase Smart AI Proxy does. That's why we call it Smart AI Proxy. Smart AI Proxy uses Crawling API features to handle URL redirects by intercepting incoming requests, evaluating redirect rules set by users, and sending appropriate HTTP status codes to clients. It efficiently routes users from the source URL to the target URL based on the specified redirect type (e.g., 301 or 302).
Let's demonstrate one redirect scenario by targeting the same URL as before, but this time without the "www" prefix. The modified URL will trigger a redirect, showcasing how Crawlbase Smart AI Proxy handles this type of redirection. The URL without the "www" prefix looks like this:
https://amazon.com/OtterBox-COMMUTER-iPhone-Pro-ONLY/dp/B0B7CH8DMR/
We will continue using the Python code provided earlier, and the API call for setting up URL redirects will follow the same structure as before. The code snippet will look like this:
# pip install requests
After executing the above API call in the terminal, you will receive the response in JSON format. In the response, you can observe that the "original_status" field has the value "301".
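To check for the redirect in code rather than by eye, you can inspect the original_status field of the decoded JSON. This is a sketch under assumptions: only the field name original_status and the value 301 come from the response described above, while the helper name was_redirected and the sample payload are hypothetical.

```python
import json

# HTTP status codes that indicate a redirect.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def was_redirected(payload: dict) -> bool:
    """Return True when the target site answered with a redirect status."""
    # The value may arrive as a number or a string, so normalise it to int.
    status = int(payload.get("original_status") or 0)
    return status in REDIRECT_STATUSES


# Hypothetical payload: a real response carries many more fields.
sample = json.loads('{"original_status": 301}')
print(was_redirected(sample))  # True
```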

Scrape Amazon ASIN with Smart AI Proxy
Scraping Amazon ASINs on a large scale allows developers to quickly pull important product information. This key data is crucial for studying the market, setting prices, and comparing competition. By using web scraping tools, users can automate the collection of ASINs from large product lists, saving a lot of time and energy.
To summarize, Crawlbase Smart AI Proxy stands as a revolutionary solution offering custom geolocation, unlimited bandwidth, AI-driven crawling, rotating IP addresses, and a high success rate. Its diverse features, including a vast proxy pool, anonymous crawling, and real-time monitoring make it an essential tool for developers, enabling them to thrive in the dynamic realm of web data acquisition. Sign up now and get the benefit of 5000 free requests with Crawlbase Smart AI Proxy!
Frequently Asked Questions
Q: What is an Amazon ASIN?
A: An Amazon ASIN (Amazon Standard Identification Number) is a unique 10-character alphanumeric code assigned to products sold on Amazon's marketplace. It serves as a product identifier and is used to differentiate items in Amazon's vast catalog. For most products it begins with "B0", while books typically use their ISBN-10 as the ASIN.
Q: Is it legal to scrape Amazon?
A: Scraping Amazon data is generally legal when the data is publicly accessible. However, it's crucial to avoid scraping data that requires login credentials and to ensure that collected datasets do not contain any sensitive or copyrighted content.
Q: What is SKU?
A: SKU (Stock Keeping Unit) is a unique code assigned by sellers or retailers to track and manage their inventory. Unlike ASIN, SKU is not specific to Amazon's platform and can be used across multiple sales channels.
Q: Why is it important to scrape ASIN for products listed on Amazon?
- Scraping ASINs for products listed on Amazon is important because ASINs act as unique identifiers for each item in Amazon's vast marketplace.
- By retrieving ASINs through web scraping, developers can gather essential product details, pricing, availability, and customer reviews, empowering them to build custom applications, analyze trends, and compare products across categories.
- Scraping ASINs enables developers to integrate Amazon's product data into their own applications and websites seamlessly.
- By tracking ASINs and monitoring their performance over time, businesses and developers can optimize marketing strategies, manage inventory, and stay competitive in the e-commerce landscape.
Q: What are the key features of Crawlbase Smart AI Proxy?
A: A key feature of the Smart AI Proxy is rotating IP addresses, which maintain anonymity during the crawling process. The pool of rotating IP addresses includes 140 million residential and data center proxies. The Smart AI Proxy also helps bypass CAPTCHA challenges and ensures a 99% success rate for your crawling and scraping. In addition, it offers custom geolocation for region-specific data access.











