nfs hangs apache
I have a web server farm that mounts its docroot using NFSv3 over TCP
across 2 layers of firewalls (yeah, I know; client request) to an EMC
Celerra NAS box. What's happening is that Apache intermittently hangs
and spikes the CPU on a server. At that point we are unable to
shut down/restart Apache, or even attach via gdb or run strace; it's
hung waiting for some sort of request or I/O. After a while this
happens across all CPUs, and the box is pretty much hosed and has to be
rebooted. This can take anywhere from 1 hour to 1 week.

After doing some troubleshooting, I noticed that all my Apache servers
have a session in netstat in the "syn_sent" state which refuses to time
out, or maybe it's just constantly going back into "syn_sent"; I'm
unable to tell whether it's timing out. This is to port 2049 (NFS) on
the NAS box. Looking through the firewall logs, I'm seeing lots of
dropped packets. The firewall logs have the following message
associated with each dropped packet:
"Syn Packet for established connection"
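
To tell a stuck "syn_sent" session from one that keeps being recreated,
the socket table can be sampled repeatedly instead of eyeballing netstat.
What follows is only a rough sketch, assuming Python is available on the
web servers, that the mounts are IPv4, and that the hosts are
little-endian (x86). The function names are made up for illustration; the
only real interface relied on is the /proc/net/tcp text format, where
state 02 means SYN_SENT.

```python
import socket
import struct

SYN_SENT = 0x02   # TCP state code as printed in /proc/net/tcp
NFS_PORT = 2049


def decode_endpoint(hex_endpoint):
    """Turn '0100007F:0801' into ('127.0.0.1', 2049).

    Assumes a little-endian host, where the kernel prints the IPv4
    address as a host-order 32-bit hex number.
    """
    hex_ip, hex_port = hex_endpoint.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def stuck_nfs_connects(proc_net_tcp_text, port=NFS_PORT):
    """Return (local_port, remote_ip) for each socket in SYN_SENT to `port`."""
    hits = []
    for line in proc_net_tcp_text.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        _, local_port = decode_endpoint(fields[1])
        remote_ip, remote_port = decode_endpoint(fields[2])
        if int(fields[3], 16) == SYN_SENT and remote_port == port:
            hits.append((local_port, remote_ip))
    return hits
```

Feeding it the contents of /proc/net/tcp every few seconds and logging
the result with a timestamp should show whether the same local port sits
in SYN_SENT continuously or drops out and comes back.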
I'm also noticing that the NFS client is using the same source ports
(798, 799, 800) over and over, so basically my firewall (Check Point) is
seeing the client try to create a new session on a connection it
believes is already open, and dropping it. The firewall is stateful.
Does anyone have any idea how I could troubleshoot this? Is there maybe
some way I can have the Linux NFS client behave like Solaris and use
a different source port on each request? Or is NFS behaving as it
should, and my firewall just isn't configured correctly?
If it makes any difference, this is Red Hat AS, kernel 2.4.21-32.0.1.
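
To make my theory about the source ports concrete, here is a toy model of
what I think the firewall is doing. It is purely an assumption for
illustration: the class and its methods are invented, it tracks sessions
by 4-tuple only, and it says nothing about how Check Point actually
implements its state table or timeouts.

```python
class ToyStatefulFirewall:
    """Deliberately simplified model: one state entry per 4-tuple."""

    def __init__(self):
        self.sessions = set()

    def handle_syn(self, src_ip, src_port, dst_ip, dst_port):
        """Accept a SYN for an unknown 4-tuple, drop it for a known one."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self.sessions:
            # Analogous to the logged "Syn Packet for established connection"
            return "drop"
        self.sessions.add(key)
        return "accept"

    def expire(self, src_ip, src_port, dst_ip, dst_port):
        """Forget a session, as a timeout or a seen FIN/RST would."""
        self.sessions.discard((src_ip, src_port, dst_ip, dst_port))


fw = ToyStatefulFirewall()
client, nas = "10.0.0.10", "10.0.0.20"   # made-up addresses

first = fw.handle_syn(client, 798, nas, 2049)      # new session
retry = fw.handle_syn(client, 798, nas, 2049)      # same source port reused
fresh = fw.handle_syn(client, 40001, nas, 2049)    # different source port
```

In this model the reconnect from port 798 is dropped until the old entry
expires, while a connect from a fresh source port goes straight through.
If that matches what is really happening, the fix would be either on the
client (vary the source port) or on the firewall (session timeout or
handling of a SYN on a known connection).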