Quoting Kris Kennaway:

> Chris H. wrote:
>> Quoting Kris Kennaway:
>>
>>> Clifton Royston wrote:
>>>> On Tue, Oct 16, 2007 at 01:01:46PM -0700, Chris H. wrote:
>>>>> excerpt from this list titled: NFS == lock && reboot, that I
>>>>> posted follows:
>>>>>
>>>>> ------8<---SNIP---8<-----SNIP-----8<-------
>>>>> # uname -a
>>>>> FreeBSD host.domain.tld 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri
>>>>> Jan 26 16:27:14 PST 2007
>>>>>
>>>>> Greetings,
>>>>> Does anyone know when NFS and friends will be working again? I
>>>>> haven't been able
>>>>> to /safely/ use it from 4.8 on. I remember some talk on the list
>>>>> sometime ago and
>>>>> then it seemed to be resolved, as the discussion ended. So I
>>>>> thought it was
>>>>> fixed. Seems not.
>>>>>
>>>>> My scenario;
>>>>> mount host off root:
>>>>> mount script exec'd follows...
>>>>>
>>>>> #!/bin/sh -
>>>>> mount -t nfs host.domain.tld:/ /host
>>>>> mount -t nfs host.domain.tld:/var /host/var
>>>>>
>>>>> confirm mount...
>>>>>
>>>>> # ls /host
>>>>> .snap COPYRIGHT bin
>>>>> ...
>>>>> usr var tmp
>>>>>
>>>>> OK looks good...
>>>>>
>>>>> # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
>>>>>
>>>>> Fatal double fault
>>>>> eis 0x0blah
>>>>> eiblah blah0x
>>>>> panic double fault
>>>>> no dump device defined
>>>>> rebooting in 15sec...
>>>>>
>>>>> Hmmm... that's not good.
>>>>>
>>>>> ------8<---SNIP---8<-----SNIP-----8<-------
>>>>>
>>>>> My final solution was to change the lines in /etc/rc.conf
>>>>> from:
>>>>> nfs_client_enable="YES"
>>>>> nfs_reserved_port_only="YES"
>>>>> nfs_server_enable="YES"
>>>>> rpc_lockd_enable="YES"
>>>>> rpc_statd_enable="YES"
>>>>> rpcbind_enable="YES"
>>>>>
>>>>> to:
>>>>> nfs_client_enable="YES"
>>>>> nfs_reserved_port_only="YES"
>>>>> nfs_server_enable="YES"
>>>>> #rpc_lockd_enable="YES"
>>>>> #rpc_statd_enable="YES"
>>>>> rpcbind_enable="YES"
>>>>>
>>>>> Making those changes ended the "Fatal double fault && reboot in
>>>>> 15 seconds..."
>>>>
>>>> Thanks for this very timely mention! The cluster of servers I am
>>>> about to upgrade from 4.8 to 6.2 relies heavily on
>>>> NFS to an old Netapp. If I have got to disable rpc_lockd and
>>>> rpc_statd, it's good to know that now!
>>>> Can I ask, can anybody confirm that they're running 6.2 on NFS
>>>> successfully *with* lockd and statd?
>>>
>>> Er, yes, of course it does. The old message he is quoting is bogus
>>> on its own,

>> While I'll grant you that I haven't *yet* found/taken the time to create a
>> dump device, re-enable rpc_lockd && rpc_statd, and cp a 10Mb file to the
>> mount point to produce an *instantaneous* "Fatal double fault", I don't
>> think it's fair to label my original post entirely /bogus/ - especially in
>> light of the recent post I replied to, which seems to have some very
>> common ground.
>> I should probably mention that since my last posting (my original thread),
>> I have some 20+ RELENG_6_2 boxen that *do* have rpc_lockd + rpc_statd
>> enabled, yet none of them produce a "Fatal double fault". They are all
>> Tyan SMP boards with dual onboard fxp's - as opposed to the Nvidia UP
>> board, which has a single onboard nve. They are all inter-connected via NFS.
>> I have a 750Gb drive hanging off the /problematic/ Nvidia board that I
>> had intended to use for NFS back-ups, but given the NFS issue I had with
>> it, it didn't seem to be the best solution. If anyone felt like throwing
>> me a "cheat sheet" for creating a dump device out of that drive and a
>> "quickie" for producing a backtrace, I'm sure I'd be better able to find
>> the time to produce the required information. I'm sorry. It's
>> just that I'm a hundred million miles away from that right now, as I've
>> been building several large web applications and their deadline is fast
>> approaching. FWIW I bounced all the servers today, and therefore have
>> recent /verbose/ dmesg's, should any of the information they provide be
>> of any help/use to anyone.
>>
>> Take care.

>
> http://www.freebsd.org/doc/en_US.ISO...rneldebug.html
>
> It's very unlikely NFS is relevant to the problem (which is what made
> it bogus, together with the lack of debugging) and likely that nve is
> the cause. The above URL explains in detail how to obtain the
> necessary debugging to confirm this.
>
> Kris
>
>

Thank you Kris,
I was recently able to find a small window in my workload, so I decided to
use it to provide the "non-bogus" information needed. After reading:
http://www.freebsd.org/doc/en_US.ISO...rneldebug.html
and:
http://www.freebsd.org/doc/en_US.ISO...debug-gdb.html
a few days ago, I was unclear on only one point in setting up the required
environment. So I posted my question to the list as "dumpdev question
(probably stupid)", which Andrey V. Elsukov immediately responded to.
I'll be creating a crash dump in the next couple of days. So if it's not
already abundantly clear that this is the first time I've attempted to
produce this information - now would be the perfect time to /enlighten/ me
as to anything you can think of that will ensure you get the information
you're looking for.
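
For anyone following along, here is a rough sketch of the /etc/rc.conf
side of the crash dump setup discussed in this thread. It is only a
sketch under assumptions: the device name /dev/ad4s1b is a placeholder
(not taken from the box in question) and stands in for a swap partition,
typically at least as large as RAM. The kgdb lines in the comments assume
a kernel built with debugging symbols, with the path left generic.

```shell
# /etc/rc.conf fragment (sketch only; /dev/ad4s1b is a hypothetical device)
dumpdev="/dev/ad4s1b"     # swap partition the kernel dumps to on panic
dumpdir="/var/crash"      # where savecore(8) saves the dump on next boot

# Re-enabled so the original "Fatal double fault" can be reproduced
rpc_lockd_enable="YES"
rpc_statd_enable="YES"

# After the panic and reboot, the backtrace would be pulled roughly like so:
#   kgdb /path/to/kernel.debug /var/crash/vmcore.0
#   (kgdb) bt
```

The same dump device can also be set on a running system with dumpon(8),
which avoids a reboot before attempting to reproduce the panic.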

Thank you again for your reply.

--Chris

--
panic: kernel trap (ignored)

_______________________________________________
freebsd-stable@freebsd.org mailing list