Stale NFS file handle - SGI

Thread: Stale NFS file handle

  1. Stale NFS file handle

    Hi,

    I have an NFS problem with the following configuration:
    - one Linux NFS server (NFS2)
    - several IRIX clients
    - automounter (autofs actually)

    It all works nicely, as long as I don't restart the server or change
    its configuration.

    Once I do, any access to any previously mounted directory on the IRIX
    clients triggers the following error message:

    Cannot access /path: Stale NFS file handle

    It seems IRIX cannot detect that the NFS handle has to be renegotiated.
    Other computers on the network (Linux, Solaris) have no problem at all.

    If I try to manually unmount those mount points, I get a "device busy"
    message. I also tried "umount -v -h nfsserver -k", with no success.

    The problem is pretty annoying, since each time I want to add an export
    entry on the NFS server, I need to either:
    - ask all IRIX users to log out, wait until everything is unmounted (and
    hope that no running script or cron job prevents that), make the
    modifications and let the users log in again
    or
    - reboot all IRIX computers after the modification

    Does anyone know anything about this problem (and its solutions)?

    Thanks !

    Olivier

  2. Re: Stale NFS file handle

    >>>>> "OC" == Olivier Croquette writes:

    OC> I have an NFS problem with the following configuration:
    OC> - one Linux NFS server (NFS2)
    OC> - several IRIX clients
    OC> - automounter (autofs actually)

    OC> It all works nicely, as long as I don't restart the server or change
    OC> its configuration.

    OC> Once I do, any access to any previously mounted directory on the
    OC> IRIX clients triggers the following error message:

    OC> Cannot access /path: Stale NFS file handle

    There are two issues here:

    1. Linux does not have a stable filehandle format that can survive
    a reboot or re-export.
    2. If there is any form of access control involved, Linux expects to
    see a MOUNT (RPC 100005) call before it allows access from a
    client.

    You can deal with #1 by using the fsid option for exports, which
    helps somewhat. AFAIK there is no current solution for #2.
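
    As an illustration of the fsid suggestion, here is a minimal sketch
    that builds /etc/exports lines with an explicit fsid= value for each
    export. The paths, client names and fsid numbers are hypothetical
    placeholders, and the other options (rw, sync) are only an assumed
    starting point. The idea is that each export keeps the same fixed
    number across restarts and re-exports.

```python
def export_line(path, clients, fsid, options=("rw", "sync")):
    """Build one /etc/exports line that pins the export to a fixed fsid."""
    opts = ",".join(list(options) + [f"fsid={fsid}"])
    targets = " ".join(f"{client}({opts})" for client in clients)
    return f"{path} {targets}"


# Hypothetical exports: each one gets its own fsid, which must stay the
# same from one edit of the file to the next.
EXPORTS = [
    ("/export/home", ["irix1", "irix2"], 1),
    ("/export/proj", ["irix1"], 2),
]

if __name__ == "__main__":
    for path, clients, fsid in EXPORTS:
        print(export_line(path, clients, fsid))
```

    The printed lines would then be merged into /etc/exports by hand and
    the export table reloaded on the server.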

    OC> It seems IRIX cannot detect that the NFS handle has to be renegotiated.

    There is no "renegotiation" of filehandles: once you have one, you
    have it, and if it becomes stale, it's stale.

    IRIX insists that any data it thinks it has to write must be written.
    If you get a stale filehandle, the data cannot be written and
    therefore the filesystem is busy. There are no processes holding it
    busy, so "umount -k" does not help. You could consider using the
    "wsync" option to force synchronous writes, but then performance
    would suck.
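
    To find out which mount points are affected after a server change,
    a client-side probe along these lines may help. It is only a sketch:
    the function names are made up for illustration, it assumes a
    platform where Python defines errno.ESTALE, and a stat() on a hard
    mount can block if the server does not answer.

```python
import errno
import os


def is_stale(path):
    """Return True if stat() on path fails with ESTALE."""
    try:
        os.stat(path)
    except OSError as exc:
        return exc.errno == errno.ESTALE
    return False


def stale_mounts(paths, probe=is_stale):
    """Return the subset of paths that the probe reports as stale."""
    return [p for p in paths if probe(p)]


if __name__ == "__main__":
    # Hypothetical mount points; replace with the real automounted paths.
    for mount_point in stale_mounts(["/path", "/hosts/nfsserver/export"]):
        print(f"{mount_point}: Stale NFS file handle")
```

    The probe only reports the condition. It does not clear it, since a
    stale handle stays stale until the mount is released.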

    max

  3. Re: Stale NFS file handle

    Max Matveev wrote:

    > 1. Linux does not have a stable filehandle format that can survive
    > a reboot or re-export.


    OK.
    But the other clients (running Solaris and Linux) have no problem at all.


    > 2. If there is any form of access control involved, Linux expects to
    > see a MOUNT (RPC 100005) call before it allows access from a
    > client.
    >
    > You can deal with #1 by using the fsid option for exports, which
    > helps somewhat. AFAIK there is no current solution for #2.


    After some investigation, this is what I understood:
    The fsid is a way to identify the export entry, so that IRIX "can"
    restore the previous "state" and avoid a stale file handle.

    Is this right?

    > OC> It seems IRIX cannot detect that the NFS handle has to be renegotiated.
    >
    > There is no "renegotiation" of filehandles: once you have one, you
    > have it, and if it becomes stale, it's stale.


    What about the other clients? Although "renegotiation" might not be the
    right term, they remount the export(s) without problems.

    Maybe they lose data, but there is no other way. IRIX may not lose data
    (if it stays in the cache), but that data will never reach the right
    file either. It only makes the workstation unusable.

    > IRIX insists that any data it thinks it has to write must be written.
    > If you get a stale filehandle, the data cannot be written and
    > therefore the filesystem is busy.


    But this is a dead end. It is logical, but I find it somewhat stupid.

    Your help has been greatly appreciated.

    Regards

    Olivier

  4. Re: Stale NFS file handle

    Max Matveev wrote in message news:...
    > 1. Linux does not have a stable filehandle format that can survive
    > a reboot or re-export.


    btw, my understanding of NFSv2 and v3 (RFC 1094, RFC 1813) is that all
    file handles are supposed to be non-volatile and T-stable. This means
    that the file handle should continue to work after server restarts,
    reboots...

    The only time a Stale File Handle should be returned is after the file
    has been removed from the server. (T-stable refers to the property that
    it represents that file and that file only for a "long time", including
    a long time after the file is removed.)

    NFSv4 does have support for volatile file handles, but that isn't relevant
    here.

    --> Any NFS file server whose file handles do not survive a restart or
    re-export is badly broken, rick

  5. Re: Stale NFS file handle

    >>>>> "OC" == Olivier Croquette writes:

    OC> After some investigation, this is what I understood:
    OC> The fsid is a way to identify the export entry, so that IRIX "can"
    OC> restore the previous "state" and avoid a stale file handle.

    Any file handle identifies two "things": an export entry and a
    filesystem object "inside" the export entry. I cannot remember off
    the top of my head exactly what Linux did, but it somehow encoded the
    export entry bits in the filehandle in such a way that they would
    change on each export. If you're curious, find a copy of Jeff Ogata's
    nfs client: it lets you see filehandles as hex strings and compare
    them. Then play with exporting things on a non-important server and
    compare the filehandles you get.
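
    The two-part structure described above can be sketched with a toy
    handle layout. This is not the real Linux filehandle format: the
    field widths, the example numbers and the function name are
    assumptions made purely for illustration. It only shows why a
    changing export part invalidates every handle, while a fixed one
    (such as an fsid) keeps them valid.

```python
import struct


def make_handle(export_id, inode, generation):
    """Pack a toy handle: export part first, then the object part."""
    # Big-endian: 4-byte export id, 8-byte inode, 4-byte generation.
    return struct.pack(">IQI", export_id, inode, generation)


INODE, GENERATION = 4711, 3

# Case 1: the export part changes across a re-export, so the handle the
# client cached no longer matches anything the server hands out.
before = make_handle(0x0801, INODE, GENERATION)
after = make_handle(0x0811, INODE, GENERATION)

# Case 2: the export part is pinned to a fixed value, so the same file
# yields the same handle before and after the re-export.
pinned_before = make_handle(1, INODE, GENERATION)
pinned_after = make_handle(1, INODE, GENERATION)

if __name__ == "__main__":
    print(before.hex(), after.hex(), before == after)
    print(pinned_before.hex(), pinned_after.hex(), pinned_before == pinned_after)
```

    Comparing the hex strings side by side is the same exercise as the
    one suggested above with a real client.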

    OC> What about the other clients? Although "renegotiation" might not be
    OC> the right term, they remount the export(s) without problems.

    That depends on whether the filesystem was mounted or not when the
    export changed. There are also a lot of "interesting" games involved
    between NFS/autofs on the client and NFS/mountd on the server, which
    can lead to a lot of problems with /hosts-like maps on IRIX, mostly
    because autofs tries to treat each /hosts map as an atomic entity and
    tries to recover from partial unmounts if one of the gazillion
    mounts fails to unmount.

    OC> Maybe they lose data, but there is no other way. IRIX may not lose
    OC> data (if it stays in the cache), but that data will never reach the
    OC> right file either. It only makes the workstation unusable.

    Normally, if this happens to me, I try to make the export reappear on
    the server (provided I can do it) to make the client happy and allow
    it to finish its stuff. Then I unexport the temporary entries and life
    goes back to normal.

    >> IRIX insists that any data it thinks it has to write must be written.
    >> If you get a stale filehandle, the data cannot be written and
    >> therefore the filesystem is busy.


    OC> But this is a dead end. It is logical, but I find it somewhat stupid.
    Well, it's a trade-off, and some people believe that it's better to
    have a stuck mount than to suffer loss of data: at least you have the
    option of trying to coerce the server into accepting the data somehow,
    versus just silently dropping it on the floor and pretending it was
    never written anyway. As one of the guys here says: "If it hurts then
    don't do it to yourself". Have you ever considered going from
    /hosts-style maps to more controlled maps where you explicitly
    mount only the stuff you need? That could make your life slightly more
    palatable.
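
    As a rough idea of what a more controlled map could look like, the
    sketch below generates entries in the common "key -options
    server:/path" automounter map layout. The keys, server name, paths
    and mount options are hypothetical, and the exact syntax and options
    accepted by IRIX autofs should be checked against its man pages.

```python
def map_entry(key, server, path, options=("rw", "hard", "intr")):
    """Format one indirect-map entry: key, mount options, location."""
    return f"{key}\t-{','.join(options)}\t{server}:{path}"


# Hypothetical list of the only exports the clients actually need.
NEEDED = [
    ("home", "nfsserver", "/export/home"),
    ("proj", "nfsserver", "/export/proj"),
]

if __name__ == "__main__":
    for key, server, path in NEEDED:
        print(map_entry(key, server, path))
```

    With a map like this, each export is mounted and unmounted on its
    own instead of as part of one large /hosts entry.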

    BTW, it might make you feel somewhat better to know that there is a
    bug report open for this problem inside SGI. Alas, there is no easy
    fix for it that is not going to screw someone else.

    max

