NFS Problem With SnapServer - NFS


Thread: NFS Problem With SnapServer

  1. NFS Problem With SnapServer

    (The following is from a faculty member in my department).

    I have a snap server operating as an NFS file server.

    I mount the snap partitions on the following machines: a Fedora Core 2 box,
    a Fedora Core 3 box, and a box running RedHat 7.1. The Core 2 and 3 boxes
    are running relatively up-to-date kernels. The RedHat 7.1 box is running
    its original kernel.

    The NFS file system works 99% of the time. However, on the Fedora Core 2 and 3
    boxes I experience errors when linking large programs. When I link a large program
    and write the executable directly to the NFS partition, the executable will
    not run (it core dumps and won't even get past _start()); I'll call this case a.
    If I take the same object files (which reside on the snap server) and
    direct the linker's output to the local hard drive, then
    the executable works fine; I'll call this case b. I can even move
    the executable from case b to the NFS partition and it still works fine.
    If I diff the executables, I see that about 40 bytes (out of 2MB) in the
    case a executable are zero, unlike the corresponding bytes in case b.
    [The overall sizes of the executables from case a and case b are identical.]
    I do not see this problem when using my older RedHat 7.1 box.
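
    The comparison described above can be sketched in Python. This is only
    an illustration, assuming both executables fit in memory; the function
    names and the two paths given on the command line are hypothetical and
    were not part of the original report.

```python
import sys
from typing import List, Tuple


def diff_runs(a: bytes, b: bytes) -> List[Tuple[int, int, bool]]:
    """Return (offset, length, zeroed_in_a) for each run of differing bytes.

    zeroed_in_a is True when every byte of the run is 0 in `a`, which is
    the pattern reported for the case a executable.
    """
    if len(a) != len(b):
        raise ValueError("files differ in size; compare sizes first")
    runs = []
    i, n = 0, len(a)
    while i < n:
        if a[i] == b[i]:
            i += 1
            continue
        start = i
        # Extend the run while the two files keep disagreeing.
        while i < n and a[i] != b[i]:
            i += 1
        runs.append((start, i - start, not any(a[start:i])))
    return runs


def compare_files(path_a: str, path_b: str) -> List[Tuple[int, int, bool]]:
    """Read both files fully and report the differing runs."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return diff_runs(fa.read(), fb.read())


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python bindiff.py case_a.out case_b.out
    for offset, length, zeroed in compare_files(sys.argv[1], sys.argv[2]):
        note = "all zero in first file" if zeroed else "differs"
        print(f"offset {offset:#x}, {length} bytes: {note}")
```

    Knowing the offsets may help: if the zeroed runs sit at or near
    multiples of the wsize value, that might point at how individual
    write requests are being handled, though that is only a guess.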

    I have tried increasing the rsize and wsize parameters in /etc/fstab, as well as
    trying both NFS client versions 2 and 3. The error is present in all cases.
    When I try compiling smaller programs things work fine on all three boxes.

    The snap server is running SnapOS version 4.0.837. The Core 2 box is running
    kernel 2.6.9-1.6_FC2smp, the Core 3 box is running 2.6.10-1.741_FC3, and the RedHat
    box is running kernel 2.4.2-2. The people at Snap will only help if I'm
    running RedHat 9.0 or earlier.

    Any suggestions?

    Cordially,
    Jon Forrest
    Computer Resources Manager
    Civil and Environmental Engineering Dept.
    205 Davis Hall
    Univ. of Calif., Berkeley
    Berkeley, CA 94720-1710
    510-642-0904
    forrest@ce.berkeley.edu

  2. Re: NFS Problem With SnapServer


    Jon Forrest wrote:

    > The NFS file system works 99% of the time. However on the Fedora
    > Core 2 and 3 boxes I experience errors when linking large programs.
    > When I link large programs and write the executable directly to the
    > NFS partition the executable will not run (core dumps, won't even
    > get past _start() ); I'll call this case a. If I
    > take the same object files (which reside) on the snap server and
    > direct the linker's output to the local hard drive, then
    > the executable works fine; I'll call this case b. I can even move
    > the executable from case b to the NFS partition and it still works
    > fine.
    > If I diff the executables I see that about 40 bytes in case a
    > (out of 2MB) in the executable are zero compared to what is present
    > in case b.

    These issues are usually on the client side. nfs@lists.sourceforge.net
    might be your best bet, especially if older Linux kernels don't
    exhibit the corruption.


  3. Re: NFS Problem With SnapServer

    spamisevi1@yahoo.com wrote:
    > Jon Forrest wrote:



    > These issues are usually on the client side. nfs@lists.sourceforge.net
    > might be your best bet, especially if older Linux kernels don't
    > exhibit the corruption.


    The faculty member who submitted the report has told me that
    the problem does not happen when using any NFS server
    other than the SNAP server, so I doubt it's the client.

    Also, isn't it true that an errant client should never
    cause an NFS server to store bad data? (I know that it
    hasn't been proven for sure in this case that the client
    is sending the data correctly, but nevertheless this
    is my understanding of how NFS works.)

    Jon



  4. Re: NFS Problem With SnapServer


    Jon Forrest wrote:
    > spamisevi1@yahoo.com wrote:
    > > Jon Forrest wrote:
    >
    > > These issues are usually on the client side. nfs@lists.sourceforge.net
    > > might be your best bet, especially if older Linux kernels don't
    > > exhibit the corruption.
    >
    > The faculty member who submitted the report has told me that
    > the problem does not happen when using any NFS server
    > other than the SNAP server, so I doubt it's the client.


    You might be right in this case, but in my experience with NFS,
    corruption usually originates with the client.

    NFS servers, especially ones running versions 2 and 3 of the
    protocol, are fairly stupid creatures. It's much easier to
    get it wrong on the client than on the server (of course, if the
    server gets a few things wrong, such as the duplicate request cache,
    corruption via the server is easy to pull off). That you
    see corruption with Linux 2.6, which RedHat as far as I know has no
    Enterprise edition for, and not with Linux 2.4, is telling. Lots of
    code in the client has been revamped.

    > Also, isn't it true that an errant client should never
    > cause an NFS server to store bad data? (I know that it


    No, it isn't true. A correctly written server takes the stuff given
    it by the client and writes it; if the client sends bad data, bad
    data is what gets stored. Linking of executables has historically
    been a common area for clients to exhibit bugs. That's why the
    Connectathon NFS test suite includes a C compilation and link test.
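
    A rough stand-in for that kind of test can be sketched in Python. This
    is not the Connectathon test itself; it only imitates one thing a
    linker tends to do, namely writing a file in non-sequential pieces,
    and then checks what comes back. The function names, the scratch file
    name, and the chunk size are all assumptions made for illustration.

```python
import hashlib
import os
import random
import tempfile


def write_out_of_order(path, size, chunk=4096, seed=0):
    """Write `size` pseudo-random bytes to `path` in shuffled chunk order.

    Returns the SHA-256 hex digest of the data that should be on disk.
    """
    rng = random.Random(seed)
    data = bytes(rng.getrandbits(8) for _ in range(size))
    offsets = list(range(0, size, chunk))
    rng.shuffle(offsets)  # seek around the file instead of streaming it
    with open(path, "wb") as f:
        for off in offsets:
            f.seek(off)
            f.write(data[off:off + chunk])
        f.flush()
        os.fsync(f.fileno())  # push the data out before reading it back
    return hashlib.sha256(data).hexdigest()


def file_digest(path):
    """SHA-256 hex digest of the file's current contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()


def verify_target(directory, size=2 * 1024 * 1024):
    """Return True if a scratch file written into `directory` reads back intact."""
    path = os.path.join(directory, "nfs_write_check.bin")  # hypothetical name
    try:
        expected = write_out_of_order(path, size)
        return file_digest(path) == expected
    finally:
        if os.path.exists(path):
            os.remove(path)
```

    Calling verify_target() on the NFS mount point and again on a local
    directory gives a quick comparison. One caveat: a read-back on the
    same client may be served from that client's cache, so checking the
    file from a second client, or after a remount, is probably more
    convincing.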

    Some ideas: might your NFS server be experiencing file-system-full
    conditions? I can imagine different clients might handle that in
    different ways. Also, if the export options on the
    server limit access by client address or client name, and the client
    has multiple source IP addresses, then things could be getting
    confused.
    Are you mounting soft? If so, don't; it is a guarantee
    of corruption. Are you mounting with TCP? UDP is more prone to data
    corruption. If using UDP, might the server have UDP checksums turned
    off? In the 21st century you'd think not, but it never hurts to check.
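
    Two of the checks above (soft mounts and UDP transport) can be read
    off the client's mount table, and free space can be queried through
    the mount point. A minimal sketch, assuming a Linux client whose
    /proc/mounts lists the NFS options in effect; the function names and
    the sample mount line in the comment are hypothetical.

```python
import os


def risky_nfs_options(mounts_text):
    """Return {mount_point: [warnings]} for NFS entries in /proc/mounts-style text.

    Each line is expected as: device mount_point fstype options dump pass
    """
    findings = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        mount_point, options = fields[1], fields[3].split(",")
        warnings = []
        if "soft" in options:
            warnings.append("soft mount")
        # Older kernels list "udp"; newer ones list "proto=udp".
        if "udp" in options or "proto=udp" in options:
            warnings.append("UDP transport")
        if warnings:
            findings[mount_point] = warnings
    return findings


def free_bytes(path):
    """Bytes available to unprivileged users on the file system holding `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    # Hypothetical line this would flag:
    #   snap:/share /mnt/snap nfs rw,soft,udp,rsize=8192,wsize=8192 0 0
    with open("/proc/mounts") as f:
        for mount_point, warnings in risky_nfs_options(f.read()).items():
            print(mount_point, "->", ", ".join(warnings),
                  "| free bytes:", free_bytes(mount_point))
```

    This only reports what the client believes; whether the server has
    UDP checksums enabled has to be checked on the server itself.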

    > hasn't been proven in this case for sure that the client
    > is sending the data correctly but never the less this
    > is my understanding of how NFS works).


