On Mon, Nov 05, 2007 at 12:05:08PM -0500, Adam McDougall wrote:

On Mon, Nov 05, 2007 at 10:24:14AM +0100, Kris Kennaway wrote:

Thomas Sparrevohn wrote:
>> On Sunday 04 November 2007 15:00:50 Kris Kennaway wrote:
>>> http://www.freebsd.org/doc/en_US.ISO...rneldebug.html

>> Oh my god - overlooked that ;-) - funny that. It's a bit tricky, as it is
>> not possible to dump a kernel when the swap is on ZFS. I did a test with
>> all debugging enabled and the problem did not show up, which makes it
>> somewhat nasty. I will check if I can reproduce it with only DDB enabled.

You can still hook up a serial console, or at the very least take
photographs of the screen with the relevant DDB information. Or add
another disk and dump on that.
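
For reference, the serial console and extra dump disk suggested above could
be set up roughly as follows. This is only a sketch and not something posted
in the thread: /dev/ad1s1b is a hypothetical swap-type partition on the added
(non-ZFS) disk, and the console line assumes the first serial port is cabled
to another machine.

```sh
# /etc/rc.conf
# Assumption: ad1s1b is a hypothetical partition on the extra disk, large
# enough to hold the dump.
dumpdev="/dev/ad1s1b"
# Directory where savecore(8) stores the dump on the next boot.
dumpdir="/var/crash"

# /boot/loader.conf
# Send the console (and with it the DDB prompt) to the serial port.
console="comconsole"
```

Both files take plain variable assignments, so the device name is the only
part that has to be adapted to the actual hardware.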


I have some screenshots of ps in ddb from one of several zfs hangs I've had
on one amd64 system:


I didn't post every single screenful since I don't have a microsd reader handy,
and emailing the pictures off my phone is painful. If I missed a screenshot of
one or more particular processes that might have a telling state, let me know.

I also have a gzipped kernel + dump from a forced panic when it was in this
state. If a developer is interested in it, please let me know so I can post it
somewhere private, since the system is in NIS and likely has tables cached
in memory.

It is running a kernel from Oct 17. I tried a kernel with WITNESS, INVARIANTS,
etc., but it hung the same way without any panic. I completed a zpool scrub
this morning with no errors. Lately zfs seems to wedge up every single night
when the rsyncs from remote servers run. This is the only amd64 system I have
zfs on; the other two are i386, and the problems on those systems have only
been kmem panics, which so far have been avoidable.

I can help by checking somewhat specific things and running prescribed tests,
but right now I don't have time to tackle this problem on this system and learn
how to debug it entirely on my own starting with nothing more than a DDB guide
from the handbook. It's not that I refuse to; I recognize it's difficult to
join remote skill with local hands for something this technical.

Sorry if I seemed negative or unhelpful. I will try on my own if I have time, but
I'm pretty busy lately. On a hunch from other past emails, I tried turning off ZIL,
and so far it has survived the night; rsync is still running. The only other changes
I made were running the zpool scrub yesterday (no fixes were needed) and applying
the patch to make more of the zfs process states visible in top. I've rebooted
several times (each time after zfs hung), so uptime isn't an issue, but for every
day rsync doesn't finish, the next day's rsync might have more updates because it
missed a day.

On Friday I replaced the motherboard/CPU just as a shot in the dark (since the
system had some strange instability in the past), but this didn't help zfs
(not surprised). When zfs was hung Saturday morning, I tried to reboot it,
but reboot would not even get far enough to stop new ssh connections.
freebsd-current@freebsd.org mailing list
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
