freebsd client attr+data caches do not always time out correctly - NFS



  1. freebsd client attr+data caches do not always time out correctly

    Hi,

    We are using FreeBSD 4.5/7 on our workstations and servers, and use
    NetApps (ONTAP 6.3.3) for our NFS servers.

    We have experienced a problem a number of times in the past few months
    where a binary file in an NFS-mounted directory is way out of date on
    one of our workstations. In one particular example, a binary
    executable had been recompiled/changed many times over a period of a
    few months, and each successive version had been used on various
    workstations. On one of our workstations we noticed that the output of
    the program was of a much older format than the recent versions. The
    actual version on the file server was:

    -r-xr-xr-x 1 someuser user 1240180 Nov 21 14:06 someprogram

    The version that this particular workstation had cached was:

    -r-xr-xr-x 1 someuser user 2571784 Aug 22 15:16 someprogram

    The version from August 22 was many versions old. At least 5 versions
    of the executable had been compiled and installed with the same name
    in that same directory, and many of our other workstations had run the
    newer versions of that program. The inode number that the rest of our
    workstations were reporting was different from the one this particular
    workstation reported. When we looked at all of the various versions
    contained in our NetApp snapshots, none of the snapshotted versions
    were as old as the one the workstation was seeing. When we 'touch'ed a
    file in that directory from the workstation in question, the cache was
    updated with the recent attribs + data.

    My understanding of NFS caching of attributes and data in FreeBSD is
    that they should time out in a matter of seconds, not months. We have
    not been able to intentionally reproduce this behavior, but we have
    seen it happen a few times on various workstations.

    Has anyone else experienced this, and is there a known fix?

    thanks,

    rob watt

  2. Re: freebsd client attr+data caches do not always time out correctly

    rob@hudson-trading.com (Robert Watt) wrote in message news:...
    > Hi,
    >
    > We are using Freebsd 4.5/7 on our workstations and servers, and use
    > NetApps (ONTAP 6.3.3) for our nfs servers.
    >
    > We have experienced a problem a number of times in the past few months
    > where a binary file in an nfs mounted directory is way out of date on
    > one of our workstations. In one particular example, a binary
    > executable had been recompiled/changed many times over a period of a
    > few months, and each successive version had been used on various
    > workstations. On one of our workstations we noticed that the output of
    > the program was of a much older format than the recent versions. The
    > actual version on the file server was:
    >
    > -r-xr-xr-x 1 someuser user 1240180 Nov 21 14:06 someprogram
    >
    > the version that this particular workstation had cached was:
    >
    > -r-xr-xr-x 1 someuser user 2571784 Aug 22 15:16 someprogram
    >
    > the version from august 22 was many versions old, at least 5 versions
    > of the executable had been compiled and installed with the same name
    > in that same directory, and many of our other workstations had run the
    > newer versions of that program. The inode number that the rest of our
    > workstations were reporting was different than this particular
    > workstation. When we looked at all of the various versions contained
    > in our netapp snapshots, none of the versions snapshotted were as old
    > as the one the workstation was seeing. when we 'touch'ed a file in
    > that directory from the workstation in question then the cache was
    > updated with the recent attribs + data.
    >
    > my understanding of nfs caching of attributes and data in freebsd is
    > that they should time out in a matter of seconds not months. we have


    Correct, but it is much worse than that. At least for the exec*() system
    calls, it does not appear that your client's NFS filesystem is correctly
    implementing close-to-open semantics. By that I mean that on every open
    or exec of the binary, the client should issue a GETATTR over the wire
    and compare the ctime of the file with the cached one. If they differ,
    the cache needs to be flushed.
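
    To make that check concrete, here is a minimal sketch in Python. It is
    a toy in-memory model, not FreeBSD kernel code: FakeServer, Client, and
    their method names are hypothetical, and the integer ctime values are
    made up. It only shows the rule described above: ask the server for the
    ctime on every open, and refetch the data when it differs from the
    cached one.

```python
class FakeServer:
    """In-memory stand-in for the NFS server (hypothetical, illustration only)."""

    def __init__(self):
        self.files = {}          # name -> (ctime, data)
        self.getattr_calls = 0   # counts simulated over-the-wire GETATTRs
        self.read_calls = 0      # counts simulated over-the-wire READs

    def write(self, name, data, ctime):
        # Installing a new version changes the file's ctime.
        self.files[name] = (ctime, data)

    def getattr(self, name):
        self.getattr_calls += 1
        return self.files[name][0]

    def read(self, name):
        self.read_calls += 1
        return self.files[name][1]


class Client:
    """Toy client cache that revalidates on every open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}          # name -> (ctime, data)

    def open(self, name):
        # Always ask the server for the current ctime first.
        ctime = self.server.getattr(name)
        cached = self.cache.get(name)
        if cached is None or cached[0] != ctime:
            # Nothing cached, or ctime differs: flush and refetch the data.
            self.cache[name] = (ctime, self.server.read(name))
        return self.cache[name][1]
```

    In this model a client can never keep serving the August binary once
    the server holds the November one, because each open revalidates the
    cached entry before using it.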

    > not been able to intentionally reproduce this behavior, but we have
    > seen it happen a few times on various workstations.
    >
    > has anyone else experienced this, and is there a known fix?


    Many times on other clients, I'm afraid. I don't know of a fix for
    BSD. If you don't want to change the BSD/NFS source, you could try a
    different approach to managing your binaries, which is to use symlinks.
    Have someprogram linked to someprogram.v1, and then when you want to
    update, create someprogram.v2, and link someprogram to someprogram.v2.
    The side benefit of this is that clients with processes currently
    paging in someprogram aren't affected, whereas simply overwriting an
    existing binary that might be in use will at best cause processes
    to dump core, and at worst corrupt any data sets those processes are
    producing.
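
    A minimal sketch of that symlink switch, in Python. The helper name
    switch_version and the ".tmp.<pid>" naming are my own assumptions, not
    anything from a standard tool. It builds the new link under a temporary
    name and renames it over the old one, so the name someprogram always
    points at some complete version.

```python
import os
import tempfile


def switch_version(link_path, target_name):
    """Point link_path at target_name by renaming a fresh symlink over it.

    target_name is stored as given (e.g. "someprogram.v2"), so it resolves
    relative to the directory that contains the link.
    """
    tmp_path = f"{link_path}.tmp.{os.getpid()}"   # hypothetical temp name
    os.symlink(target_name, tmp_path)             # create the new link first
    os.replace(tmp_path, link_path)               # rename(2) over the old link


if __name__ == "__main__":
    # Demo in a scratch directory with two fake "versions" of the binary.
    workdir = tempfile.mkdtemp()
    for version in ("v1", "v2"):
        with open(os.path.join(workdir, f"someprogram.{version}"), "w") as f:
            f.write(version)
    link = os.path.join(workdir, "someprogram")
    switch_version(link, "someprogram.v1")
    print(os.readlink(link))
    switch_version(link, "someprogram.v2")
    print(os.readlink(link))
```

    The old versioned file is left in place, so processes that already
    have it mapped keep paging from an unchanged file. How quickly each NFS
    client notices the new link target still depends on its own caching.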
    >
    > thanks,
    >
    > rob watt

