Volker Lendecke wrote:
> On Mon, Apr 21, 2008 at 09:13:28AM -0500, James A. Dinkel wrote:
>
>> Anyway, the server will be fine and snappy for a week or so, then out of
>> the blue, nobody can connect. Top shows a few smbd processes maxing out
>> the cpu and the load (which is usually < 1.0) gradually climbs up to 10,
>
> I've seen this only when something like connections.tdb
> became corrupt. With CentOS this is not likely, but reiserfs
> did that to me fairly often. What filesystem are your tdbs
> residing on? Maybe some other kernel-level problem like a
> problematic driver in the path to the hard disk?
>
> Volker
>

I have seen this once on a CentOS-4.5-x86_64 box. IIRC, there was an
issue with the Intel e1000 kernel module that caused a high number of
connection resets, but the RSTs never made it back, so the connections
would just time out while the client started a new connection. Then
again, this box was using reiserfs to hold the tdbs, and it might just
have been the fsck on reboot that fixed it, since I rebooted after
applying the kernel module update.

Anyway, what I was seeing was a consistently high number (several
hundred) of queued packets in the Send-Q across a dozen or so
connections, and groups of connections all being reset at the same
time. The load went up slowly for about a day, and then rocketed to
well over 100 when a client was reset with a stuck locked file.
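
If anyone wants to watch for the same symptom, here is a rough sketch of
how the Send-Q check could be scripted. It assumes the usual Linux
`netstat -tn` column layout (Proto, Recv-Q, Send-Q, Local Address,
Foreign Address, State). The function name, the threshold of 100 and
the sample addresses are made up for illustration; this is not what I
actually ran at the time.

```python
def stuck_send_queues(netstat_output, threshold=100):
    """Return (peer, send_q) pairs for TCP rows whose Send-Q exceeds threshold.

    Assumes `netstat -tn` style rows: proto recv-q send-q local peer state.
    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the banner, the column header and anything that is not TCP.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue
        if send_q > threshold:
            stuck.append((fields[4], send_q))
    return stuck


# Hypothetical sample output; the addresses and queue sizes are invented.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0    312 192.168.1.10:445        192.168.1.57:1042       ESTABLISHED
tcp        0      0 192.168.1.10:22         192.168.1.5:51515       ESTABLISHED
tcp        0    487 192.168.1.10:139        192.168.1.58:1107       ESTABLISHED
"""

if __name__ == "__main__":
    for peer, send_q in stuck_send_queues(SAMPLE):
        print(peer, send_q)
```

On a live box you would feed it the real output of `netstat -tn`
instead of SAMPLE and run it every minute or so; a dozen peers that
stay above the threshold across several runs is the pattern I was
seeing.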

FWIW, this was an SMP Xeon box w/ integrated Intel E1000s and the
(mostly) stock 2.6.9-12(?) RHEL kernel. I found that Intel had a patch
for an issue very similar to what I was seeing, and after applying it,
everything was happy again.