NFS slowdown w. Linux server, Solaris client - NFS




  1. NFS slowdown w. Linux server, Solaris client

    We are experiencing an intermittent NFS performance slowdown on a
    Solaris 8 Ultra-80 client when reading/writing to filesystems mounted
    from a Red Hat 8 NFS server. The performance is acceptable (2 - 5 MB/s)
    initially, but after some period of time (tens of minutes),
    performance slows by a factor of 5 or more, and the Solaris system CPU
    usage jumps to > 30%.

    Strangely, if the target filesystem is accessed in any way on the NFS
    client (e.g. ls -la), the client system CPU drops and the NFS
    performance returns to normal. Usually, though, the slowdown recurs
    after another 5 - 10 minutes, and another "ls -l" is needed to speed
    it up again. Accessing the file on the server has no effect. The output
    files are fairly large (> 6 GBytes) and the writes are purely
    sequential.

    I've dumped some packets with snoop during both the "slow" and "fast"
    phases. The NFS requests are definitely different between the two.
    From a sample of 10,000 packet headers, I found the following
    distribution:


    "fast" phase:
    NFS READ3: 4259 packets
    NFS WRITE3: 0
    NFS GETATTR3: 0
    NFS COMMIT3: 0
    UDP: 5741
    RPC: 0

    "slow" phase:
    NFS READ3: 7039 packets
    NFS WRITE3: 66
    NFS GETATTR3: 772
    NFS COMMIT3: 211
    UDP: 1900
    RPC: 12
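
    For anyone wanting to reproduce this kind of count, here is a minimal
    sketch of the tally in Python. It assumes each snoop summary line
    carries the protocol and procedure as "NFS C READ3 ..." (call) or
    "NFS R READ3 ..." (reply); the sample lines are invented for
    illustration and are not taken from the actual capture, so adjust the
    pattern to whatever your snoop output looks like.

```python
import re
from collections import Counter

# Assumed summary-line shape: "... NFS C GETATTR3 FH=..." for calls and
# "... NFS R GETATTR3 OK" for replies. This is an assumption, not verified
# against the original capture.
NFS_LINE = re.compile(r"\bNFS\s+[CR]\s+([A-Z]+\d?)\b")

def tally(lines):
    """Count packet headers per NFS procedure, falling back to RPC/UDP/other."""
    counts = Counter()
    for line in lines:
        match = NFS_LINE.search(line)
        if match:
            counts["NFS " + match.group(1)] += 1
        elif re.search(r"\bRPC\b", line):
            counts["RPC"] += 1
        elif re.search(r"\bUDP\b", line):
            counts["UDP"] += 1
        else:
            counts["other"] += 1
    return counts

# Hypothetical sample lines, for illustration only.
sample = [
    "client -> server NFS C READ3 FH=0222 at 8192 for 8192",
    "server -> client NFS R READ3 OK (8192 bytes)",
    "client -> server NFS C GETATTR3 FH=0222",
    "server -> client UDP continuation ID=1234",
]

for name, n in sorted(tally(sample).items()):
    print(f"{name}: {n}")
```

    Running the same tally over a capture from each phase gives two
    distributions that can be compared directly.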

    Apparently, the NFS GETATTR3 requests are slowing things down. But
    what is causing the performance to switch between the two modes? Would
    increasing acregmin/acregmax or acdirmin/acdirmax on the Solaris
    client mount help?
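
    To put the two samples side by side, the counts above can be turned
    into percentages. This is only arithmetic on the numbers already
    listed; the dictionary and function names are made up for the sketch.

```python
# Packet-header counts copied from the two 10,000-packet samples above.
fast = {"READ3": 4259, "WRITE3": 0, "GETATTR3": 0,
        "COMMIT3": 0, "UDP": 5741, "RPC": 0}
slow = {"READ3": 7039, "WRITE3": 66, "GETATTR3": 772,
        "COMMIT3": 211, "UDP": 1900, "RPC": 12}

def shares(counts):
    """Return each category's share of the sample, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

for label, counts in (("fast", fast), ("slow", slow)):
    print(label)
    for name, pct in shares(counts).items():
        print(f"  {name:9s} {pct:6.2f}%")
```

    By this count GETATTR3 is under 8% of the headers in the slow phase,
    while the READ3 share goes from roughly 43% to roughly 70%, so the
    extra reads may be as much a part of the picture as the attribute
    traffic.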

    System details are:

    Redhat 8 NFS server:
    Dual Pentium Xeon 2.8 GHz, Redhat 8, kernel: 2.4.20-20.8smp
    nfs-utils-1.0.5-1
    export options: rw,insecure,sync,no_subtree_check,insecure_locks
    Intel Pro/1000 Gigabit Ethernet, e1000 driver vers. 5.2.20.

    Solaris 8 NFS client:
    Ultra-80, Kernel version: SunOS 5.8 Generic 108528-27 Nov 2003
    mount options: vers=3,proto=udp,sec=none,hard,intr,link,symlink,acl,
    rsize=8192,wsize=8192,retrans=5,timeo=11
    Gigabit Ethernet controller.

    The 2.4.20 kernel doesn't support NFS with TCP, hence the proto=udp on
    the mount.

    Thanks,

    Marc Langlois,
    Key Seismic Solutions, Calgary, AB, Canada

  2. Re: NFS slowdown w. Linux server, Solaris client

    marc@keyseismic.com (Marc Langlois) wrote in message news:...

    > Solaris 8 NFS client:
    > Ultra-80, Kernel version: SunOS 5.8 Generic 108528-27 Nov 2003
    > mount options: vers=3,proto=udp,sec=none,hard,intr,link,symlink,acl,
    > rsize=8192,wsize=8192,retrans=5,timeo=11
    > Gigabit Ethernet controller.


    I don't know why an ls would fix the performance for a while.
    However, you may be affected by the "spurious cache invalidation on
    large writes" bug in the NFS client. That you see more reads during
    the slow phase for a workload that you've described as mostly writes
    suggests it might be worth a look.

    The Sun bug id is 4407669. sunsolve.sun.com lists
    a patch for it. Apologies if you already have that patch ... I
    can't tell from the uname output if you do.

  3. Re: NFS slowdown w. Linux server, Solaris client

    Mike Eisler wrote:

    (snip)
    (previously snipped question about slow NFS access
    after some minutes)

    > I don't know why an ls would fix the performance for a while.
    > However, you may be affected by the "spurious cache invalidation on
    > large writes" bug in the NFS client. That you see more reads during
    > the slow phase for a workload that you've described as mostly writes
    > suggests it might be worth a look.


    It did sound like a cache problem to me, though I hadn't
    known what one would do about it. Reading/writing one large
    file should not be doing many GETATTR calls.

    Another slow-NFS problem that I know of comes with very
    large numbers of entries in a directory, thousands or tens
    of thousands. Most Unix implementations don't handle such
    directories very well.

    -- glen

