A proxy error is rarely a mystery once you read it as a sentence. The status code tells you who failed (your client, the proxy, or the target) and the body or headers usually tell you why. The trouble is that the same code can mean two different things depending on whether it came from the proxy hop or the origin server, and that ambiguity is what sends people in circles. A 403 from the target is a block; a 407 from the proxy is an auth problem; a 502 or 504 can point at either the proxy or the origin failing. Sort out which layer spoke and the fix is almost always obvious.

This is a reference you can keep open while debugging. It maps the status codes you actually hit when scraping through a proxy server to what they mean in a proxy context and how to clear them, with the proxy-specific cases (auth, rate limits, gateway failures, geo blocks) called out rather than glossed as generic HTTP. Skim the table, jump to your code, apply the fix.

Proxy error codes at a glance

Start here. The table covers the codes that actually stop a scrape; the sections below go deeper on the ones with proxy-specific traps.

Code | What it means in a proxy context | First fix
400 | Malformed request the proxy or origin refused | Fix request syntax, headers, encoding
401 | Target needs auth (not the proxy) | Send the target's credentials or token
403 | Target blocked the request or the exit IP | Rotate IP, fix fingerprint, raise trust tier
407 | The proxy itself needs auth | Send proxy user:pass or whitelist your IP
429 | Rate limited by target or proxy plan | Slow down, rotate IPs, honor Retry-After
451 | Blocked for legal/geo reasons | Exit from an allowed region
502 | Proxy got a bad/empty upstream response | Retry, switch exit, check the proxy is up
503 | Target or proxy overloaded/unavailable | Back off, retry with jitter, rotate
504 | Gateway timeout on the proxy or origin hop | Raise timeout, retry, switch exit
conn refused / reset | No proxy listening, or it dropped the socket | Check host:port, scheme, and credentials

One reading rule before the details: ask which hop produced the code. A 4xx that names "proxy" (407) is the proxy talking; almost every other 4xx is the target talking through a working proxy. A 5xx that arrives instantly is usually the proxy failing to reach upstream; a 5xx after a long pause is usually the origin timing out. That single distinction routes you to the right fix faster than memorizing every code.
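The reading rule can be sketched as a small heuristic. The function name `likely_hop` and the 2-second cutoff for "instantly" are assumptions made for illustration, not values from any provider; tune the cutoff to the latency you normally see.

```python
def likely_hop(status: int, elapsed_s: float, fast_cutoff_s: float = 2.0) -> str:
    """Guess which hop produced a status code. A heuristic, not a guarantee."""
    if status == 407:
        return "proxy"   # the one 4xx that names the proxy
    if 400 <= status < 500:
        return "target"  # usually the origin answering through a working proxy
    if 500 <= status < 600:
        # Fast 5xx: the proxy probably failed to reach upstream.
        # Slow 5xx: the origin probably timed out.
        return "proxy" if elapsed_s < fast_cutoff_s else "target"
    return "unknown"
```

With the `requests` library, `r.elapsed.total_seconds()` is one way to get the elapsed time to pass in.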

4xx: the request reached the target and got refused

These mean a request got through but was rejected. With one exception (407), the proxy did its job and the verdict came from the origin.

403 Forbidden: you got blocked, not banned forever

This is the code scrapers see most. The target accepted the connection, looked at the request, and decided it did not want to answer it. In a proxy context that almost always means the exit IP was flagged (a known datacenter range, an address already rate-limited) or the request fingerprint looked automated (missing or default headers, no realistic User-Agent, a header order no browser sends).

The fix is layered. First, rotate to a fresh exit IP and retry; a single block does not mean the whole pool is burned. If clean IPs still get a 403, the problem is trust, not luck: a hardened target drops datacenter ranges on sight, so move up to residential. See datacenter vs residential proxies for where that line sits. If rotation and a higher trust tier both fail, the block is fingerprint-level, and you need realistic headers and often a rendered page rather than a raw fetch. The broader playbook is in how to scrape without getting blocked.
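The layered fix reads naturally as an escalation ladder. The sketch below only encodes the order described above; the step names and the `next_step_for_403` helper are hypothetical labels, and what each step does depends on your proxy setup.

```python
from typing import Optional, Sequence

# Escalation order from the text: fresh IP, then higher trust tier, then fingerprint.
LADDER = ("rotate_ip", "residential_tier", "fix_fingerprint")

def next_step_for_403(steps_tried: Sequence[str]) -> Optional[str]:
    """Return the next untried step, or None once the ladder is exhausted."""
    for step in LADDER:
        if step not in steps_tried:
            return step
    return None
```

Keeping the order in data rather than nested conditionals makes it easy to log which rung finally cleared the block for a given target.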

407 Proxy Authentication Required: the proxy is asking, not the site

The 407 is the one 4xx that comes from the proxy hop itself, and it is the most misread. The target never saw your request; the proxy refused to forward it because you did not authenticate to the proxy. Two things cause it: wrong or missing credentials, or an IP-whitelist proxy that does not recognize the address you are calling from.

Send credentials in the proxy URL, or whitelist your source IP in the provider dashboard, depending on which auth mode your plan uses. With curl, credentials go in the -x argument, not in the request to the target.

bash
# 407 fix: authenticate TO the proxy in the -x URL.
# user:pass here are the proxy's credentials, not the site's.
curl -x "http://USER:PASS@PROXY_HOST:8080" \
     "https://httpbin.org/ip"

# IP-whitelist plans take no credentials; the proxy
# checks your source IP instead. Confirm it is allowed.

If you authenticate every other request fine and one suddenly throws 407, check whether your outbound IP changed (a new container, a VPN, a CI runner) and re-whitelist it. More curl-and-proxy patterns are in how to use curl with a proxy.
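The same fix in Python, as a minimal sketch: the credentials go in the proxy URL, and any special characters in them should be percent-encoded, since a raw `@` or `:` in a password makes the URL ambiguous. The host `proxy.example.com`, port 8080, and the credentials are placeholders, and `proxy_url` is a hypothetical helper.

```python
from urllib.parse import quote

def proxy_url(user: str, password: str, host: str, port: int) -> str:
    """Build an authenticated proxy URL, percent-encoding the credentials."""
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"

# Placeholder values; substitute what your provider issued.
url = proxy_url("USER", "p@ss:word", "proxy.example.com", 8080)

# The dict shape the requests library accepts in its `proxies=` argument.
proxies = {"http": url, "https": url}
```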

401 Unauthorized: the target wants credentials

Easy to confuse with 407, but it is the opposite layer. A 401 means the proxy forwarded your request and the target replied that the resource needs authentication you did not supply. The proxy is working correctly. Fix it by sending the site's own credentials, API key, or session token, exactly as you would without a proxy. If you see 401 and 407 alternating, you have two separate auth problems stacked: one to the proxy, one to the origin.
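When the two are easy to mix up, the challenge header settles it: a 407 carries `Proxy-Authenticate`, a 401 carries `WWW-Authenticate`. A small sketch of that check follows; `auth_layer` is a hypothetical helper name and the header dicts stand in for whatever your HTTP client returns.

```python
def auth_layer(status: int, headers: dict) -> str:
    """Say which layer is asking for credentials, based on status and headers."""
    names = {name.lower() for name in headers}
    if status == 407 or "proxy-authenticate" in names:
        return "proxy"   # authenticate to the proxy
    if status == 401 or "www-authenticate" in names:
        return "target"  # authenticate to the site
    return "none"
```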

400 Bad Request: the message itself is broken

A 400 means the request was malformed before anyone could evaluate its merits: invalid syntax, a bad URL encoding, duplicated or oversized headers, or a body that does not match its declared content type. Through a proxy, watch for double-encoding when you pass a target URL as a query parameter to an API, and for header injection from string concatenation. Log the exact bytes you send and read them back; the defect is almost always in your request builder, not the network.
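Double-encoding is easy to see in isolation. The sketch below encodes a target URL once (correct) and twice (the bug); the target URL and the `api.example.com` endpoint are made-up placeholders.

```python
from urllib.parse import quote, unquote

target = "https://example.com/search?q=a b&page=2"

once = quote(target, safe="")   # correct: encode the target URL exactly once
twice = quote(once, safe="")    # the bug: an already-encoded value encoded again

# Hypothetical API that takes the target as a query parameter.
api_call = f"https://api.example.com/?url={once}"
```

A `%25` in the outgoing URL is the telltale sign: it is an encoded percent sign, which means something was encoded a second time.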

429 Too Many Requests: you tripped a rate limit

You sent more requests than the target (or your proxy plan) allows in a window. The honest fix is to slow down and spread out: add delay between requests, cap concurrency, and rotate exit IPs so the load lands on many addresses instead of one. Respect the Retry-After header if the response includes it rather than hammering blindly.

429 is a backoff signal, not a retry-immediately signal

The reflex to retry a 429 right away makes it worse: you confirm the abusive pattern and extend the cooldown. Back off exponentially with jitter, honor Retry-After when present, and rotate the exit IP so the next attempt starts from a clean address. Rate limits are usually per-IP, so rotation plus spacing clears most of them.

Rotating IPs is the structural fix here, because a rate limit counted per-IP resets when the IP does. See how to use rotating proxies for the mechanics, and rotating residential proxies when the target also fingerprints the IP's trust.
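Here is one way to compute the wait, as a sketch: honor `Retry-After` when it parses (it may be a number of seconds or an HTTP-date), otherwise fall back to exponential backoff with jitter. The `retry_delay` name and the `base` and `cap` defaults are assumptions, not recommended values.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_delay(retry_after, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next attempt."""
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)  # delay-seconds form
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())
        except (TypeError, ValueError):
            pass  # unparseable header: fall through to backoff
    # Jittered exponential backoff: random wait up to min(cap, base * 2**attempt).
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

The jitter matters when many workers hit the same limit: without it they all wake up and retry at the same instant.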

451 Unavailable For Legal Reasons: a legal or geo block

A 451 means the content is withheld for legal or regional reasons rather than a technical fault. In a proxy context this is usually geo-gating: the exit IP is in a region the target does not serve. The fix is to exit from an allowed country. Pick a proxy with geo-targeting and confirm it actually covers the region you need, then verify the exit location with an echo service before assuming coverage.

5xx: the failure is upstream, but which hop?

These come from a server, not your request, yet through a proxy "the server" is ambiguous: it can be the proxy failing to reach the origin, or the origin failing on its own. Timing tells them apart.

502 Bad Gateway: the proxy got garbage from upstream

A 502 means a gateway (here, your proxy) received an invalid or empty response from the server it tried to reach. It usually arrives fast, because the proxy gave up quickly. Causes split two ways: the proxy itself is unhealthy or misconfigured, or the upstream exit it chose is dead. Retry first, since 502s are often transient; if retries keep failing, switch to a different exit IP, and confirm the proxy endpoint is actually up by sending a request to a known-good echo service through it. If that also 502s, the proxy layer is the problem, not the target.

503 Service Unavailable: overloaded or down

A 503 means the server is temporarily unable to handle the request: overloaded, in maintenance, or shedding load. From a proxy, it can be the target shedding your traffic specifically (a soft block dressed as 503) or genuinely down for everyone. Back off and retry with jitter; if a single exit IP gets 503 while others succeed, that IP is being throttled, so rotate. If every exit gets 503 at the same time, the target is down and waiting is the only move.
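That comparison across exits can be sketched as a small function. It assumes you record the last status seen per exit; the exit labels and the `diagnose_503` name are hypothetical.

```python
def diagnose_503(status_by_exit: dict) -> str:
    """status_by_exit maps an exit label to the last status code seen on it."""
    if not status_by_exit:
        return "no data"
    throttled = sorted(e for e, s in status_by_exit.items() if s == 503)
    if len(throttled) == len(status_by_exit):
        return "target down: wait"  # every exit fails at once
    if throttled:
        return "exit throttled: rotate away from " + ", ".join(throttled)
    return "ok"
```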

504 Gateway Timeout: someone in the chain ran out of time

A 504 is a timeout on a gateway hop: the proxy waited for the origin and the origin never finished, or the proxy itself was too slow to respond. Unlike 502 it arrives after a long pause. Raise your client timeout for genuinely slow pages, retry (timeouts are frequently transient), and switch exits if one route is consistently slow. If pages need a browser to finish rendering, a raw fetch can appear to time out when the real issue is incomplete JavaScript execution, which a rendering layer solves rather than a longer timeout.
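On the client side, it helps to separate the connect timeout from the read timeout, so a slow origin is not confused with an unreachable proxy. A sketch follows; the `timeout_for` helper and its default values are assumptions chosen for illustration.

```python
def timeout_for(attempt: int, connect_s: float = 5.0,
                read_s: float = 30.0, max_read_s: float = 120.0):
    """Return a (connect, read) timeout pair; the read timeout grows per retry."""
    return (connect_s, min(max_read_s, read_s * (attempt + 1)))
```

The `requests` library accepts such a pair directly, for example `requests.get(url, proxies=proxies, timeout=timeout_for(n))`. The connect timeout stays short because a healthy proxy should accept the socket quickly; only the read side needs room for slow pages.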

Connection-level failures: nothing got an HTTP code at all

Some failures never reach the point of an HTTP status. These are socket errors, and they almost always mean the proxy was misconfigured rather than the request rejected.

Connection refused means nothing was listening at the host and port you dialed: wrong port, wrong host, the proxy not running, or a scheme mismatch that points you at a port the endpoint does not serve (dialing the plain-HTTP port of an HTTPS-only endpoint). Verify the exact host:port and protocol your provider gave you.

Connection reset / timeout means the socket opened but was dropped or never answered: a dead exit IP, an aggressive firewall, or the target silently dropping flagged traffic. Retry with a fresh exit; if resets cluster on one IP, it is burned.

TLS / certificate errors through a proxy usually mean something is terminating TLS in the middle. Use CONNECT tunneling for HTTPS so the proxy relays encrypted bytes without inspecting them; details in HTTP vs HTTPS proxies.
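These three cases map onto standard-library exception types, which makes them easy to tell apart in code. The sketch below uses only Python built-ins; `classify_socket_error` is a hypothetical helper. HTTP libraries often wrap these in their own exception classes, so in practice you may need to inspect the wrapped cause.

```python
import socket
import ssl

def classify_socket_error(exc: BaseException) -> str:
    """Map a connection-level exception to the fix suggested above."""
    if isinstance(exc, ssl.SSLError):
        return "tls: something may be terminating TLS; use CONNECT tunneling"
    if isinstance(exc, ConnectionRefusedError):
        return "refused: verify proxy host, port, and scheme"
    if isinstance(exc, ConnectionResetError):
        return "reset: retry with a fresh exit"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout: retry with a fresh exit"
    return "other"
```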

Make whole error classes disappear

Notice the pattern across the table: 403, 429, 502, 503, and most connection resets all share one fix: rotate to a clean exit and retry. Building that yourself means a healthy pool, a rotation policy, backoff with jitter, and per-code retry logic. A managed rotating endpoint does it for you, which is why several error classes simply stop appearing once rotation and retries move server-side.

Crawlbase Smart AI Proxy

Smart AI Proxy is one endpoint that rotates the exit IP per request across datacenter, residential, and mobile pools and retries on blocks server-side, so 403s, 429s, and transient 5xx clear themselves instead of landing in your code. Point your existing client at it and run your real target on the free tier first.

A retry loop that reads the code

Whatever proxy you use, your client should act on the status code instead of treating every failure the same. The minimal version: retry the transient classes with backoff, rotate on a per-IP block, and stop on a real client error you cannot fix by trying again.

python
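import random
import time

import requests

RETRY = {429, 502, 503, 504}  # transient: back off and retry
ROTATE = {403, 429}           # per-IP block: retry from a fresh exit

def fetch(url, get_proxy, tries=5):
    """Fetch url through a rotating proxy; get_proxy() returns a proxies dict."""
    r = None
    for n in range(tries):
        proxy = get_proxy()  # new exit IP each call
        r = requests.get(url, proxies=proxy, timeout=30)
        if r.ok:  # any status below 400
            return r
        if r.status_code not in RETRY | ROTATE:
            r.raise_for_status()  # 400/401/407/451: stop, fix the request
        try:
            # Retry-After in seconds, when the server sends it that way.
            wait = max(0.0, float(r.headers.get("Retry-After", "")))
        except ValueError:
            # Header missing or an HTTP-date: exponential backoff with jitter.
            wait = random.uniform(0, 2 ** n)
        time.sleep(wait)
    if r is None:
        raise ValueError("tries must be at least 1")
    r.raise_for_status()  # retries exhausted: surface the last error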

The shape matters more than the language: classify the code, back off on transient failures, rotate on per-IP blocks, and fail fast on errors that retrying cannot fix. A managed endpoint folds the rotate-and-retry half into itself, leaving your loop to handle only the genuine client errors.

Recap

Key takeaways

  • Read the code as a sentence: which hop spoke? 407 is the proxy; almost every other 4xx is the target talking through a working proxy.
  • 403 is a block, not a permanent ban. Rotate the exit IP first, raise the trust tier next, fix the fingerprint last.
  • 429 means back off, not retry now. Space requests, honor Retry-After, and rotate, since rate limits are usually per-IP.
  • For 5xx, timing tells the hop. A fast 502 is the proxy failing upstream; a slow 504 is the origin timing out.
  • Most error classes share one fix. Rotate to a clean exit and retry with jitter; a managed endpoint removes that whole loop.

Frequently Asked Questions (FAQs)

What is the difference between a 403 and a 407 proxy error?

A 407 comes from the proxy itself: it refused to forward your request because you did not authenticate to it, so the target never saw the request. A 403 comes from the target after the proxy successfully forwarded the request: the site looked at your IP or fingerprint and refused to serve it. Fix 407 with proxy credentials or IP whitelisting; fix 403 by rotating the exit IP or raising the trust tier of your proxy.

How do I fix a 429 Too Many Requests error when scraping?

Slow down and spread out. Add delay between requests, cap concurrency, and honor the Retry-After header if it is present instead of retrying immediately. Because rate limits are typically counted per IP, rotating your exit IP is the structural fix: each new address starts with a fresh quota. Combining spacing with rotation clears the large majority of 429s.

What causes a 502 Bad Gateway through a proxy?

A 502 means your proxy (acting as a gateway) got an invalid or empty response from the server it tried to reach. It usually arrives quickly. The cause is either an unhealthy proxy or a dead upstream exit IP. Retry first, since many 502s are transient; if they persist, switch to a different exit and confirm the proxy endpoint itself is up by sending a request to a known-good echo service through it.

Why do I get connection refused instead of an HTTP error code?

Connection refused is a socket-level failure that happens before any HTTP response can exist: nothing was listening at the host and port you dialed. Check that the proxy host, port, and scheme exactly match what your provider gave you, that the proxy process is running, and that you are not sending plain HTTP to an HTTPS-only endpoint. It is a configuration error, not a rejection of your request.

Can rotating proxies eliminate proxy error codes?

Rotation removes the per-IP error classes, primarily 403 blocks and 429 rate limits, because both reset when the exit IP changes, and it helps with transient 5xx by retrying on a fresh route. It does not fix codes that come from your own request, like 400 or 401, or legal geo blocks like 451. Pair rotation with correct request building and geo-targeting to cover the rest.

Should I retry every proxy error automatically?

No. Retry the transient classes (429, 502, 503, 504) with exponential backoff and jitter, and rotate the exit IP on per-IP blocks (403, 429). Do not blindly retry codes that the same request will hit again (400, 401, 407, 451): the fix there is changing the request, the credentials, or the exit region, not repeating it. Classify the code first, then choose retry, rotate, or stop.
