On Fri, 27 Feb 2004, Adam wrote:

> We have a problem for which I was unable to find an explanation or solution
> via the list archives or FAQ: We are able to access the site
> www.calottery.com (don't ask - we just support the users) unproxied
> (directly through our Pix firewall) but when going through our Squid
> 2.5STABLE3 proxy it takes forever to time out, then gives this error:
> "While trying to retrieve the URL: http://www.calottery.com/
> The following error was encountered:
> Read Error
> The system returned: (131) Connection reset by peer


Please try upgrading to 2.5.STABLE4 as this includes a workaround for a
common major bug in the HTTP inspection module of Cisco PIX and some other
firewalls which could cause symptoms similar to this.

> Checking the archives, most "connection reset by peer" posts resolve with
> "ignore them."


That answer only applies to questions about what the cache.log error
message "sslReadClient: FD 43: read failure: Connection reset by peer"
means, not to questions about why connection reset errors are returned
to the client.

> anything on Squid in a while. The only change we've made since this broke
> Feb 5th is we switched from a Checkpoint Firewall to the Pix firewall (no
> content-engines, just the firewall).


I wonder if it is a coincidence, but most reports about odd connection
resets or unreachable sites involve Cisco PIX in one way or another.

> problem. Then again I am stumped so willing to try anything (we have a DEV
> Squid proxy that is identical to the other, so I am working on that. I
> tried clearing the cache (echo "" > swap.state method) and adding
> calottery.com to the notcached directive (restarting each time) and both
> failed to resolve the problem.


Upgrading may help.

The problem you are seeing is a very low-level network problem and it is
unlikely any changes in squid.conf will make any difference.

What you need to test is whether the site is reachable when running a
browser on the proxy server itself, without going through the Squid proxy.
Squid can only work as well as the network connectivity of the server it
is running on.
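
If no browser is available on the proxy server, the same check can be
sketched in a few lines of Python. This is only an illustration, not
anything shipped with Squid: the function name fetch_direct and the
timeout value are my own assumptions. It fetches the URL with proxies
explicitly disabled, so the request takes the same network path as
Squid's outgoing connections.

```python
import http.client
import urllib.error
import urllib.request


def fetch_direct(url, timeout=30):
    """Fetch url without using any proxy; return (ok, detail).

    Hypothetical helper for illustration only.
    """
    # An empty ProxyHandler overrides proxies taken from environment
    # variables such as http_proxy, so the request goes out directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return True, "HTTP %d" % resp.status
    except urllib.error.HTTPError as e:
        # The origin server answered with an error status, which still
        # proves that the network path to it works.
        return True, "HTTP %d" % e.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as e:
        # Resets, timeouts and unreachable hosts end up here.
        return False, str(e)
```

Running fetch_direct("http://www.calottery.com/") on the proxy server and
getting a failure such as "Connection reset by peer" would point at the
network path (firewall) rather than at Squid.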

The kind of problem you are seeing is almost always caused by misbehaving
firewalls in one way or another. When only a few sites fail, the
misbehaving firewall is usually at the remote site, not yours.

The ECN issue and the Host header workaround in 2.5.STABLE4 are both
examples of having to work around very broken firewalls. A similar problem
was also seen with the TCP timestamp option some years ago. It seems
firewall vendors sometimes forget to proactively test how their
implementations will behave the day TCP/IP gets extended with one more
option (timestamp) or flag (ECN), or they assume that HTTP requests can
always be processed packet by packet rather than as a TCP stream (the
Host: header issue with PIX etc).
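
To illustrate the packet versus stream point, here is a minimal Python
sketch that writes the request line and the Host: header in two separate
sends. It is not how Squid builds its requests, and the function name
send_split_request is just an assumption for the example. TCP_NODELAY
makes it likely, but does not guarantee, that the two writes leave as
separate TCP segments. A correct HTTP implementation reassembles the
stream and answers normally; a device that inspects each packet in
isolation may not.

```python
import socket


def send_split_request(host, port=80, path="/", timeout=30):
    """Send a GET with the Host header in a separate write.

    Hypothetical helper for illustration only. Returns the raw response.
    """
    first = ("GET %s HTTP/1.0\r\n" % path).encode("ascii")
    second = ("Host: %s\r\n\r\n" % host).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as s:
        # Disable Nagle's algorithm so small writes are not coalesced
        # by the sender while waiting for more data.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        s.sendall(first)   # request line only
        s.sendall(second)  # Host header and end of headers
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                # HTTP/1.0 without keep-alive: the server closes the
                # connection when the response is complete.
                break
            chunks.append(data)
    return b"".join(chunks)
```

If this request fails through the firewall while the same request sent
in a single write succeeds, the firewall is a strong suspect.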

Regards
Henrik