Amazon is a price-monitoring and product-research goldmine and one of the most aggressively defended sites on the web. Scrape it from a datacenter IP and you'll see CAPTCHAs, the "Robot Check" wall, rotating prices, or a 503 within a handful of requests. This guide covers what Amazon checks, the proxy strategy that survives it, and the session discipline that keeps prices and Buy Box data accurate.
Datacenter proxies are a dead end on Amazon at any real volume — residential exits with clean ASNs are what sustain it. Just as important: match the exit country to the marketplace you're scraping. Pulling amazon.de prices through a US exit gives you US pricing, currency and Buy Box, which quietly corrupts your dataset. One exit geography per marketplace.
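One way to enforce "one exit geography per marketplace" is to derive the exit country from the URL before a proxy is ever picked. The sketch below is illustrative only: the `MARKETPLACE_COUNTRY` table, the `PROXIES_BY_COUNTRY` pools, and both function names are assumptions, and the country codes your provider expects may differ (for example `UK` instead of `GB`).

```python
import random
from urllib.parse import urlsplit

# Hypothetical table: marketplace host -> exit country to use for it.
MARKETPLACE_COUNTRY = {
    "amazon.com": "US",
    "amazon.de": "DE",
    "amazon.co.uk": "GB",
    "amazon.co.jp": "JP",
}

# Hypothetical per-country pools; the strings are placeholders, not real endpoints.
PROXIES_BY_COUNTRY = {
    "US": ["US_EXIT_1", "US_EXIT_2"],
    "DE": ["DE_EXIT_1"],
    "GB": ["GB_EXIT_1"],
    "JP": ["JP_EXIT_1"],
}

def exit_country_for(url):
    """Return the exit country configured for the marketplace in `url`."""
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if host not in MARKETPLACE_COUNTRY:
        # Fail loudly rather than silently scraping through the wrong geography.
        raise ValueError(f"no exit country configured for {host!r}")
    return MARKETPLACE_COUNTRY[host]

def pick_proxy(url):
    """Pick a proxy whose exit country matches the marketplace being scraped."""
    return random.choice(PROXIES_BY_COUNTRY[exit_country_for(url)])
```

Raising on an unknown host is a deliberate choice here: a missing mapping stops the job instead of producing a dataset with the wrong currency and Buy Box.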
Amazon personalizes by session: add-to-cart, delivery location, and price tests all live in cookies. If you rotate the IP on every request, you reset that context constantly and get inconsistent prices and Buy Box winners. Hold a sticky session for a product's full scrape, then rotate at the product or task boundary:
import itertools, time, random
from curl_cffi import requests

# USERNAME, PASSWORD and PROXY_HOST are placeholders for your provider's sticky-session endpoint
PROXIES = ["socks5h://USERNAME:PASSWORD@PROXY_HOST:913", "..."]
pool = itertools.cycle(PROXIES)

def scrape_product(asin, proxy):
    # Same sticky exit for every page of this product
    s = requests.Session(impersonate="chrome",
                         proxies={"http": proxy, "https": proxy})
    detail = s.get(f"https://www.amazon.com/dp/{asin}")
    offers = s.get(f"https://www.amazon.com/gp/offer-listing/{asin}")
    return detail.text, offers.text

# asins is your list of ASIN strings, defined elsewhere
for asin in asins:
    proxy = next(pool)  # new sticky exit per product
    if "Robot Check" in scrape_product(asin, proxy)[0]:
        # flagged - back off, rotate, retry on a fresh exit
        ...
    time.sleep(random.uniform(2.0, 5.0))
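The "back off, rotate, retry on a fresh exit" step could look something like the following. This is a sketch under stated assumptions, not the article's own implementation: `scrape_with_fresh_exits`, `max_attempts`, and the 30-second `backoff` are hypothetical, and `scrape` stands for any callable with the same `(asin, proxy)` signature as `scrape_product`.

```python
import time

def scrape_with_fresh_exits(asin, pool, scrape, max_attempts=3,
                            backoff=30.0, sleep=time.sleep):
    """Scrape one product, moving to a fresh sticky exit whenever it is flagged.

    `pool` is any iterator of proxy URLs (e.g. itertools.cycle(PROXIES)).
    Returns (proxy, detail_html, offers_html), or None if every attempt was flagged.
    """
    for attempt in range(max_attempts):
        proxy = next(pool)  # one sticky exit for the whole attempt
        detail, offers = scrape(asin, proxy)
        if "Robot Check" not in detail and "Robot Check" not in offers:
            return proxy, detail, offers
        # Flagged: never retry into the same exit. Wait longer each time
        # (assumed linear backoff) before trying the next one.
        if attempt < max_attempts - 1:
            sleep(backoff * (attempt + 1))
    return None
```

Both pages are checked because the offers page can be walled even when the detail page came back clean. Returning `None` lets the caller park the ASIN for later rather than burn more exits on it.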
As with Google SERP scraping, the discipline that keeps you alive is the per-exit rate, not the pool size. Randomize gaps, cap concurrency per IP, and when you see "Robot Check" or a 503, stop hitting that exit and back off — retrying into a flagged IP just lengthens the block.
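A per-exit pacer makes that discipline concrete: each exit carries its own "not before" timestamp, so pool size never hides an overworked IP. The sketch below is a minimal illustration. The `ExitPacer` class, the `looks_blocked` helper, and the cooldown values (60 seconds doubling up to one hour) are assumptions chosen for the example, not measured Amazon thresholds.

```python
import random
import time

BLOCK_MARKERS = ("Robot Check",)  # the wall named above; extend as you observe others

def looks_blocked(status_code, body):
    """True for a 503 or a page containing a known block marker."""
    return status_code == 503 or any(marker in body for marker in BLOCK_MARKERS)

class ExitPacer:
    """Tracks, per exit, the earliest time it may be used again."""

    def __init__(self, min_gap=2.0, max_gap=5.0,
                 base_cooldown=60.0, max_cooldown=3600.0):
        self.min_gap, self.max_gap = min_gap, max_gap
        self.base_cooldown, self.max_cooldown = base_cooldown, max_cooldown
        self._ready_at = {}  # proxy -> timestamp when it is usable again
        self._strikes = {}   # proxy -> consecutive blocks seen

    def ready(self, proxy, now=None):
        now = time.monotonic() if now is None else now
        return now >= self._ready_at.get(proxy, float("-inf"))

    def record_ok(self, proxy, now=None):
        """Clean response: clear strikes and schedule a randomized gap."""
        now = time.monotonic() if now is None else now
        self._strikes[proxy] = 0
        self._ready_at[proxy] = now + random.uniform(self.min_gap, self.max_gap)

    def record_block(self, proxy, now=None):
        """Blocked response: rest the exit, doubling the cooldown per strike."""
        now = time.monotonic() if now is None else now
        strikes = self._strikes.get(proxy, 0) + 1
        self._strikes[proxy] = strikes
        cooldown = min(self.base_cooldown * 2 ** (strikes - 1), self.max_cooldown)
        self._ready_at[proxy] = now + cooldown
        return cooldown
```

In a scrape loop you would skip any exit where `ready(proxy)` is false, then call `record_ok` or `record_block` depending on what `looks_blocked` reports. To cap concurrency per IP at one, have only a single worker hold a given exit at a time.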
A plain Python requests call presents a TLS handshake fingerprint (JA3) that marks you as automation before the IP score even applies. Send a browser fingerprint with curl_cffi (above) or a real browser; see Bypass TLS Fingerprinting with curl_cffi. If Amazon throws a JS CAPTCHA wall, escalate to a real browser engine.