Help with NFS reliability problem - Networking

  1. Help with NFS reliability problem


    Hi,

    I am having reliability issues with an NFS mount.

    I have a directory hierarchy containing several thousand files and
    directories that is exported from a server to a client.

    The only options I am using in the server's /etc/exports are:
    (rw,no_root_squash)

    I realise this is a potential security risk but this is a private
    network with no external access so at the moment, I am happy with
    that and am more concerned with the reliability issues.

    I use no special options apart from rw in the client's fstab.

    Operating system:

    CentOS 4.1
    rpm -q nfs-utils => nfs-utils-1.0.6-70.EL4


    My problem:

    Occasionally I am unable to access one of the directories or its
    contents. On investigation, "ls -l" gives me the following:

    drwxr-xr-x 4 root root 4096 Jan 31 2007 .
    drwxr-xr-x 30 root root 4096 Jan 31 2007 ..
    drwxr-xr-x 6 root root 4096 Jan 31 2007 37705
    ?--------- ? ? ? ? ? 45644

    when I would have expected a normal listing to look like:

    drwxr-xr-x 4 root root 4096 Jan 31 2007 .
    drwxr-xr-x 30 root root 4096 Jan 31 2007 ..
    drwxr-xr-x 6 root root 4096 Jan 31 2007 37705
    drwxr-xr-x 6 root root 4096 Jan 31 2007 45644

    Does anybody know why this is happening and how to fix it?

    I've read various NFS guides but didn't find a useful answer. I am
    considering remounting with both the "tcp" and "sync" options, and
    potentially exporting with "no_wdelay", in the hope that this would
    solve my reliability issue, but doing so would be a shot in the
    dark. An explanation of the cause of the problem and how to fix it
    would be far preferable.

    The added problem is that I don't have the liberty to try all
    possible option combinations until I find something that works, and
    the problem happens randomly with no known way to reproduce it. It
    has happened twice in 5 days and was temporarily solved by
    unmounting and remounting on the client.

    Any help would be greatly appreciated.

    Thanks

    Yan

  2. Re: Help with NFS reliability problem

    Yannick Tremblay wrote:
    > Hi,
    >
    > I am having reliability issues with an NFS mount.
    >
    > I have a directory hierarchy containing several thousand files and
    > directories that is exported from a server to a client.
    >
    > The only options I am using in the server's /etc/exports are:
    > (rw,no_root_squash)
    >
    > I realise this is a potential security risk but this is a private
    > network with no external access so at the moment, I am happy with
    > that and am more concerned with the reliability issues.
    >
    > I use no special options apart from rw in the client's fstab.
    >
    > Operating system:
    >
    > CentOS 4.1
    > rpm -q nfs-utils => nfs-utils-1.0.6-70.EL4
    >
    >
    > My problem:
    >
    > Occasionally I am unable to access one of the directories or its
    > contents. On investigation, "ls -l" gives me the following:
    >
    > drwxr-xr-x 4 root root 4096 Jan 31 2007 .
    > drwxr-xr-x 30 root root 4096 Jan 31 2007 ..
    > drwxr-xr-x 6 root root 4096 Jan 31 2007 37705
    > ?--------- ? ? ? ? ? 45644
    >
    > when I would have expected a normal listing to look like:
    >
    > drwxr-xr-x 4 root root 4096 Jan 31 2007 .
    > drwxr-xr-x 30 root root 4096 Jan 31 2007 ..
    > drwxr-xr-x 6 root root 4096 Jan 31 2007 37705
    > drwxr-xr-x 6 root root 4096 Jan 31 2007 45644
    >
    > Does anybody know why this is happening and how to fix it?
    >
    > I've read various NFS guides but didn't find a useful answer. I am
    > considering remounting with both the "tcp" and "sync" options, and
    > potentially exporting with "no_wdelay", in the hope that this would
    > solve my reliability issue, but doing so would be a shot in the
    > dark. An explanation of the cause of the problem and how to fix it
    > would be far preferable.
    >
    > The added problem is that I don't have the liberty to try all
    > possible option combinations until I find something that works, and
    > the problem happens randomly with no known way to reproduce it. It
    > has happened twice in 5 days and was temporarily solved by
    > unmounting and remounting on the client.


    Extra options shouldn't be needed. My NFS servers are up to
    354 days of uptime with no issues (not longer simply because we
    usually have one mandatory shutdown during the year).

    I'm currently running SUSE 9.3 on my servers (just NFSv3), and
    we have a few "at risk" accessors running udp instead of
    tcp (because of the age of the Unix OS hitting them).
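
    Since the transport (udp or tcp) and the other mount options are
    part of the question, it may help to confirm what the client is
    actually using before changing anything. The following is only a
    minimal sketch, assuming a Linux client that exposes /proc/mounts
    and has Python 3 available (an assumption; the stock interpreter
    on CentOS 4.1 is older). The names nfs_mounts and report are made
    up for this example, and how the transport appears in the option
    list can vary between kernel versions.

```python
def nfs_mounts(mounts_text):
    """Return (device, mount_point, options) for each NFS entry.

    Expects text in the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        if fstype in ("nfs", "nfs4"):
            found.append((device, mount_point, options.split(",")))
    return found


def report(path="/proc/mounts"):
    """Print the options in effect for every NFS mount listed in path."""
    with open(path) as handle:
        for device, mount_point, options in nfs_mounts(handle.read()):
            print(device, "on", mount_point)
            print("  options:", ", ".join(options))
```

    Calling report() on the client prints one block per NFS mount, so
    the options before and after a remount can be compared.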

    I'm going to guess that your problem could be network
    related, but there could also be a bug in the newer NFS
    code... not sure.
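
    To narrow the guess down, it would help to capture the actual
    error the next time the directory goes bad. The "?" columns in the
    "ls -l" output are what ls prints when it can read an entry's name
    from the directory but the stat call on that entry fails. A stale
    file handle is one possibility, but that is only a guess. Below is
    a minimal sketch, assuming Python 3 on the client; the function
    name unreadable_entries is made up for this example.

```python
import errno
import os


def unreadable_entries(directory):
    """Return (name, errno_name, message) for entries whose lstat fails.

    These are the entries that "ls -l" would show with "?" in place of
    the mode, owner, size and date.
    """
    problems = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            os.lstat(path)
        except OSError as err:
            # Translate the numeric errno into its symbolic name.
            code = errno.errorcode.get(err.errno, str(err.errno))
            problems.append((name, code, err.strerror))
    return problems
```

    Run against the parent of the affected directory, this returns an
    empty list while everything is healthy; when the problem occurs,
    the reported errno name is the detail worth posting here.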

    All I'm saying is that NFS is very reliable in Linux.
    We house hundreds of gigabytes of home directories and
    tons of software development build areas....

    >
    > Any help would be greatly appreciated.
    >
    > Thanks
    >
    > Yan


  3. Re: Help with NFS reliability problem

    In comp.os.linux.networking Yannick Tremblay wrote:

    > I am having reliability issues with a NFS mount.

    [..]

    > CentOS 4.1
    > rpm -q nfs-utils => nfs-utils-1.0.6-70.EL4


    I'd install all patches, including the kernel (there should be
    tons), preferably using 'yum update', then reboot and check
    whether this improves things.

    Good luck

    [..]

    --
    Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
    mail: echo zvpunry@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
    #bofh excuse 63: not properly grounded, please bury computer
