[squid-users] Squid stops responding after a number of requests
First, some background. The server squid is installed on is a Compaq
1850R with dual 400MHz processors and 512MB of RAM. All the hardware is
known good, including the network cable. There are two 4.3GB 10K SCSI
drives, set up in a RAID 0 array. Squid is running on Red Hat 9, and we
are using the latest official Red Hat 9 Squid package - 2.5STABLE1-3.9.
Squid is running transparently behind a NATed firewall, but I have added
a parent cache residing on a real address. We have about 350 unique
users and about 25-35 simultaneous users. The users are using the squid
box as a gateway.

The problem is that after running squid for a period of time, it stops
responding. Squid is still running and accepting connections, which are
logged to access.log, but it never answers back. A restart of squid will
clear the problem, but after a number of restarts, the entire system
must be rebooted to correct the problem. It appears the problem is tied
to the DNS. Running "squidclient mgr:idns" shows an increasing number of
DNS queries, but no responses. Also, it shows a queue of DNS queries.
Soon after this starts happening, squid stops responding. Cache.log
shows the following error: "idnsCheckQueue: ID af: giving up after 31
trys and 302.7 seconds". The try count and the seconds both vary. Running "squid -k
debug" after squid stops responding shows more info related to the
idnsCheckQueue error. No other errors are logged. I can still resolve a
hostname with ping or dig from the command line. The external proxy and
DNS are both fully accessible.
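
For anyone who wants to repeat the command-line DNS check one nameserver at a time, here is a minimal sketch. It assumes squid's internal DNS client is using the nameservers listed in /etc/resolv.conf (the default when no nameservers are set in squid.conf) and that dig is installed. The function names and the test hostname www.example.com are made up for illustration; they are not part of squid.

```shell
#!/bin/sh
# Sketch only: function names and the default hostname are hypothetical.

# Print the nameserver addresses listed in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query one nameserver directly with a short timeout and a single try,
# so a dead server is not masked by the resolver falling back to another.
# dig exits non-zero when it gets no reply from the server.
check_nameserver() {
    server=$1
    host=$2
    if dig "@$server" "$host" +time=2 +tries=1 +short >/dev/null 2>&1; then
        echo "$server: answered"
    else
        echo "$server: no answer"
    fi
}

# Check every configured nameserver in turn.
# $1 = resolv.conf-style file (optional), $2 = hostname to look up (optional).
check_all_nameservers() {
    for ns in $(list_nameservers "$1"); do
        check_nameserver "$ns" "${2:-www.example.com}"
    done
}
```

Sourcing the file and running check_all_nameservers while squid is hung, then comparing the result with what "squidclient mgr:idns" reports, should show whether one particular nameserver has stopped answering even though ping and dig with the default resolver still succeed.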
Other things that might be useful: All squid directives are at their
defaults, except for defining the cache, port, cache parent, and the
directives to enable transparency. The ACL is set to allow all since we
are running on an internal network. I've checked the FAQs, the archives,
and Google, which have fixed a few other things, but this problem is
All help most appreciated,