Mysterious NFS-related issue - NFS




  1. Mysterious NFS-related issue

    I have a mail server setup that works as follows:

    - MX servers hold a lockfile while writing to a message file. Each
    server opens the file for reading and writing and calls
    lseek(fd, 0, SEEK_END) to position at the end before it starts
    writing. During the write, it must seek back to the offset where
    writing began in order to write a header sequence containing
    information that is only known after the message data has been
    written. Only a small amount of data is written there (less than
    100 bytes). After the message file has been written and closed, a
    request file is written to another NFS location.

    - Index management servers read these request files, and use them to
    update an index to messages within the message file. The message file
    is opened read-only to verify that the message data contains a valid
    header, and the text message header data is read at this point to
    update a separate header database file. No locking is done here as
    the index server is the only one writing to the index file and
    database files.
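
    The MX-side write sequence described above can be sketched roughly
    as follows. This is only an illustration under stated assumptions,
    not the actual server code: HEADER_SIZE, the "LEN=" header layout,
    and the names write_all and append_message are all hypothetical, the
    header space is reserved with null bytes purely for the sketch, and
    the lockfile and request-file steps are left out.

```python
import os

HEADER_SIZE = 64  # hypothetical; the post only says "less than 100 bytes"


def write_all(fd, data):
    """Write every byte of data, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def append_message(path, body):
    """Append one message to path and return the offset where it starts."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Position at the current end of the message file.
        start = os.lseek(fd, 0, os.SEEK_END)
        # Reserve room for the header, then write the message data.
        write_all(fd, b"\0" * HEADER_SIZE)
        write_all(fd, body)
        # The header depends on the data just written (here: its length).
        # Hypothetical layout, space-padded to HEADER_SIZE bytes.
        header = (b"LEN=%d" % len(body)).ljust(HEADER_SIZE)
        # Seek back to where this message began and fill in the header.
        os.lseek(fd, start, os.SEEK_SET)
        write_all(fd, header)
        os.fsync(fd)
    finally:
        os.close(fd)
    return start
```

    The caller is assumed to hold the lockfile for the whole of
    append_message, since the end-of-file offset is only meaningful while
    no other writer can extend the file.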

    The problem is that, on rare occasions, I come across a message
    that is completely nulled out within the message file, yet the index
    and header database files contain valid information for it. That
    means the message was good when the index management server
    examined it, but at some later point (i.e. when a new message came
    in) the data from the previous message became entirely nulls.
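
    To narrow down when a message goes bad, one option is to re-check
    the byte range recorded for each message and flag ranges that are
    entirely zero. A minimal sketch, assuming the index can supply an
    (offset, length) pair per message; the function name is_nulled and
    its parameters are hypothetical:

```python
def is_nulled(path, offset, length, chunk=65536):
    """Return True if every byte in [offset, offset + length) is zero.

    A range that runs past the end of the file is reported as False,
    because the missing bytes are absent rather than nulled.
    """
    if length <= 0:
        return False
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(chunk, remaining))
            if not data:
                return False  # hit end of file before the range ended
            if any(data):
                return False  # found at least one non-zero byte
            remaining -= len(data)
    return True
```

    Running such a check from a host other than the one that wrote the
    message would also show whether the nulls are in the file on the
    server or only in one client's cached view of it.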

    There are over 100 MX servers, running Linux 2.6 and mounting a
    NetApp (Network Appliance) over NFS.

    Has anyone else experienced similar NFS issues, or have any idea how
    to avoid this problem?


  2. Re: Mysterious NFS-related issue

    Never mind. It looks like there were some NFS bugs in the Linux
    2.6.20 kernel (that didn't exist in the 2.6.16 kernel).

