Re: Using multiple file descriptors for the same file - Unix

  1. Re: Using multiple file descriptors for the same file

    On Oct 20, 10:54 am, DJ Dharme wrote:

    > I am writing a multi-threaded application in C++ running on
    > Solaris. I have a file which is updated by a single thread that
    > appends data to it, while at the same time the other threads are
    > reading the content written to the file. Can anybody tell me whether
    > there is a performance or any other gain (apart from the mutex locking)
    > in using a different file descriptor in each thread for the same file
    > rather than a single FD with a mutex (or read-write) lock? Is there
    > an overhead in using multiple FDs for a single file?


    It depends on how you create file descriptors on the same file.

    If you call open() several times with the same file name, each file
    descriptor refers to its own file description, and all of those
    descriptions refer to the same file.

    fd0 -> description0 \
    fd1 -> description1 -> file
    fd2 -> description2 /

    If, on the other hand, you use dup() to get new file descriptors,
    these file descriptors refer to the same file description.

    fd0 \
    fd1 -> description -> file
    fd2 /

    A file description is a kernel structure that stores the file offset
    and access mode, among other things. This structure is protected by a
    mutex (or a spin-lock).

    In the latter (dup()) case the threads contend for the same file
    description whenever they call read() or write().
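
    As a rough illustration of the two diagrams above, the C sketch below
    reads four bytes through one descriptor and then looks at the offset
    seen through an open()ed and a dup()ed descriptor. The function name
    compare_open_and_dup and the scratch-file path are made up for this
    example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads 4 bytes through fd0, then reports the file offset seen through
 * a second open() of the same file and through a dup() of fd0.
 * Returns 0 on success, -1 if any call fails. */
int compare_open_and_dup(off_t *offset_after_open, off_t *offset_after_dup)
{
    char path[] = "/tmp/fd_demo_XXXXXX";    /* hypothetical scratch file */
    char buf[4];
    int rc = -1;

    assert(offset_after_open != NULL && offset_after_dup != NULL);

    int fd0 = mkstemp(path);                /* first file description */
    if (fd0 < 0)
        return -1;

    int fd1 = open(path, O_RDONLY);         /* second, independent description */
    int fd2 = dup(fd0);                     /* shares fd0's description */
    unlink(path);                           /* file lives on until all fds close */

    if (fd1 >= 0 && fd2 >= 0 &&
        write(fd0, "0123456789", 10) == 10 &&
        lseek(fd0, 0, SEEK_SET) == 0 &&
        read(fd0, buf, sizeof buf) == (ssize_t)sizeof buf) {
        /* fd1 has its own offset; fd2 shares the offset that read() moved. */
        *offset_after_open = lseek(fd1, 0, SEEK_CUR);
        *offset_after_dup  = lseek(fd2, 0, SEEK_CUR);
        rc = 0;
    }

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    close(fd0);
    return rc;
}
```

    After a successful call, the open()ed descriptor still reports offset
    0, while the dup()ed one reports 4, because it shares the description
    whose offset the read() advanced.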

    > Pardon me if I am posting this in a wrong group.


    Better use comp.unix.programmer for Unix-specific questions. Replies
    to this message should automatically go there.

    --
    Max

  2. Re: Using multiple file descriptors for the same file

    On Oct 20, 5:06 pm, Maxim Yegorushkin wrote:
    > It depends on how you create file descriptors on the same file.
    >
    > If you use open() with the same file name, you'll get file descriptors
    > referring to a different file description, referring to the same file.
    >
    > If, on the other hand, you use dup() to get new file descriptors,
    > these file descriptors refer to the same file description.

    Thanks, Max. Yes, I was thinking of creating the FDs with the open()
    system call. My question is basically this:

    Even if I create multiple FDs for the same file and use them
    independently in different threads, will the OS serialize the system
    calls and so synchronize the threads when they access the file?
    If the answer is yes, is it better to synchronize the file access with
    a single FD and a read-write lock, rather than having an FD for each
    thread?

    DJD

  3. Re: Using multiple file descriptors for the same file

    On Oct 20, 1:49 pm, DJ Dharme wrote:
    > Thanks, Max. Yes, I was thinking of creating the FDs with the open()
    > system call. My question is basically this:
    >
    > Even if I create multiple FDs for the same file and use them
    > independently in different threads, will the OS serialize the system
    > calls and so synchronize the threads when they access the file?


    The kernel serialises access to its own data structures, so that they
    cannot be corrupted.

    > If the answer is yes, is it better to synchronize the file access with
    > a single FD and a read-write lock, rather than having an FD for each
    > thread?


    If the same file description is used (dup()ed fds), then certainly
    you need external locking, because writing then means lseek()ing to
    the end of the file followed by write(), and reading means
    lseek()ing elsewhere followed by read(). You will need to make
    sure each lseek()-followed-by-read()/write() sequence is atomic.

    You don't need to use external locking if you use different file
    descriptions (when fds are obtained via open()), because this way the
    file descriptions are not shared.
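
    For the shared-description case, one possible way to make the
    lseek()-then-read()/write() pair atomic is to hold a mutex across
    both calls. The sketch below assumes a single process-wide lock; the
    names offset_lock, locked_read_at, locked_append and locked_io_demo
    are invented for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* One lock per shared file description (hypothetical name). */
static pthread_mutex_t offset_lock = PTHREAD_MUTEX_INITIALIZER;

/* Seeks to pos and reads up to len bytes as one atomic step. */
ssize_t locked_read_at(int fd, void *buf, size_t len, off_t pos)
{
    ssize_t n = -1;
    assert(buf != NULL);
    pthread_mutex_lock(&offset_lock);
    if (lseek(fd, pos, SEEK_SET) != (off_t)-1)
        n = read(fd, buf, len);
    pthread_mutex_unlock(&offset_lock);
    return n;
}

/* Seeks to the end of the file and writes len bytes as one atomic step. */
ssize_t locked_append(int fd, const void *buf, size_t len)
{
    ssize_t n = -1;
    assert(buf != NULL);
    pthread_mutex_lock(&offset_lock);
    if (lseek(fd, 0, SEEK_END) != (off_t)-1)
        n = write(fd, buf, len);
    pthread_mutex_unlock(&offset_lock);
    return n;
}

/* Exercises both helpers on a scratch file; returns 0 if the bytes
 * read back match what was appended, -1 otherwise. */
int locked_io_demo(void)
{
    char path[] = "/tmp/locked_demo_XXXXXX";   /* hypothetical scratch file */
    char buf[5];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    int ok = locked_append(fd, "hello", 5) == 5
          && locked_append(fd, "world", 5) == 5
          && locked_read_at(fd, buf, sizeof buf, 5) == 5
          && memcmp(buf, "world", 5) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```

    Every thread that touches the shared description has to go through
    these helpers; a single unlocked lseek() elsewhere would reintroduce
    the race.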

    --
    Max

  4. Re: Using multiple file descriptors for the same file

    On Oct 20, 5:49 am, DJ Dharme wrote:

    > Even if I create multiple FDs for the same file and use them
    > independently in different threads, will the OS serialize the system
    > calls and so synchronize the threads when they access the file?


    Of course. Otherwise there would be chaos.

    > If the answer is yes, is it better to synchronize the file access with
    > a single FD and a read-write lock, rather than having an FD for each
    > thread?


    Of course not. Do you think you can outperform the OS's low-level
    synchronization primitives with your own? I highly doubt that. The OS
    synchronizes only the very lowest-level file system operations,
    whereas your own lock would have to be held across the entire system
    call.

    In any event, you can have the best of both worlds. Use a single file
    handle, but use pread/pwrite, which take an explicit offset and leave
    the descriptor's file offset alone.
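
    A minimal sketch of that idea, assuming POSIX threads: several
    threads read different slices through one shared descriptor with
    pread(), with no lock, and the descriptor's own offset never moves.
    The names slice, read_slice and pread_demo are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

enum { NTHREADS = 4, CHUNK = 8 };

struct slice {              /* hypothetical per-thread work item */
    int     fd;             /* descriptor shared by all threads */
    off_t   pos;            /* where this thread reads from */
    char    data[CHUNK];
    ssize_t got;
};

static void *read_slice(void *arg)
{
    struct slice *s = arg;
    assert(s->fd >= 0);
    /* The offset is an argument, so no shared state is modified. */
    s->got = pread(s->fd, s->data, CHUNK, s->pos);
    return NULL;
}

/* Returns how many threads read their slice correctly, or -1 on a
 * setup failure or if the shared file offset moved. */
int pread_demo(void)
{
    char path[] = "/tmp/pread_demo_XXXXXX";    /* hypothetical scratch file */
    struct slice slices[NTHREADS];
    pthread_t tid[NTHREADS];
    char chunk[CHUNK];
    int started = 0, good = 0;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* Chunk i of the file is CHUNK copies of the letter 'A' + i. */
    for (int i = 0; i < NTHREADS; i++) {
        memset(chunk, 'A' + i, CHUNK);
        if (pwrite(fd, chunk, CHUNK, (off_t)i * CHUNK) != CHUNK) {
            close(fd);
            return -1;
        }
    }

    for (int i = 0; i < NTHREADS; i++) {
        slices[i] = (struct slice){ .fd = fd, .pos = (off_t)i * CHUNK, .got = -1 };
        if (pthread_create(&tid[i], NULL, read_slice, &slices[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        if (slices[i].got == CHUNK &&
            slices[i].data[0] == 'A' + i &&
            slices[i].data[CHUNK - 1] == 'A' + i)
            good++;
    }

    off_t offset = lseek(fd, 0, SEEK_CUR);     /* untouched by pread/pwrite */
    close(fd);
    return offset == 0 ? good : -1;
}
```

    On systems where the thread library is separate, this needs to be
    built with the compiler's thread option (for example -pthread).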

    DS

  5. Re: Using multiple file descriptors for the same file

    On Oct 20, 7:43 pm, David Schwartz wrote:
    > In any event, you can have the best of both worlds. Use a single file
    > handle, but use pread/pwrite.

    Thanks, guys. David, you have answered the question that I was going
    to ask next.

    So I think it is safe to use 2 FDs like this:

    1. One FD writes data to the end of the file using write() (here it
    is convenient that the file offset advances after each call).
    2. The other FD reads data from the file using pread(). (I don't need
    locks when using this FD from multiple threads, since pread() doesn't
    change its file offset.)

    Will this strategy work best for me?

  6. Re: Using multiple file descriptors for the same file

    On Oct 20, 9:58 pm, DJ Dharme wrote:

    > So I think it is safe to use 2 FDs like this.
    >
    > 1. Writes data to the end of file using write function (here it is
    > convenient to have the FD being updated after each call).
    > 2. Reads the data from the file using pread function. (I don't have to
    > use locks when accessing this FD from multiple threads since the pread
    > doesn't update this FD)
    >
    > Will this strategy work best for me?


    So long as you only have one thread writing, that should work fine.
    Otherwise, use 'pwrite'.
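
    The two-descriptor strategy quoted above might look roughly like the
    sketch below: one thread appends with write() on an O_APPEND
    descriptor, and two reader threads share a second descriptor and poll
    it with pread(), each keeping a private cursor. The record layout,
    the 1 ms polling interval and all names (shared_log, writer, reader,
    append_log_demo) are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

enum { RECORDS = 100, REC = 16, MAX_IDLE_POLLS = 5000 };

struct shared_log { int wfd; int rfd; };    /* hypothetical helper type */

/* The only writer: write() on an O_APPEND descriptor lands at end of file. */
static void *writer(void *arg)
{
    struct shared_log *lg = arg;
    char rec[REC];
    for (int i = 0; i < RECORDS; i++) {
        memset(rec, 'a' + i % 26, REC);
        if (write(lg->wfd, rec, REC) != REC)
            return (void *)1;
    }
    return NULL;
}

/* A reader: keeps a private cursor and never moves rfd's file offset. */
static void *reader(void *arg)
{
    struct shared_log *lg = arg;
    const struct timespec pause = { 0, 1000000 };   /* 1 ms between polls */
    char buf[64];
    off_t pos = 0;
    int idle = 0;

    while (pos < (off_t)RECORDS * REC) {
        ssize_t n = pread(lg->rfd, buf, sizeof buf, pos);
        if (n < 0 || idle > MAX_IDLE_POLLS)
            return (void *)1;
        assert(n <= (ssize_t)sizeof buf);
        if (n == 0) {                       /* caught up with the writer */
            idle++;
            nanosleep(&pause, NULL);
            continue;
        }
        for (ssize_t k = 0; k < n; k++)     /* byte p belongs to record p / REC */
            if (buf[k] != 'a' + (int)((pos + k) / REC % 26))
                return (void *)1;
        pos += n;
    }
    return NULL;
}

/* Runs one writer and two readers; returns the number of failures
 * (0 means every reader saw every record), or -1 on a setup failure. */
int append_log_demo(void)
{
    char path[] = "/tmp/append_demo_XXXXXX";    /* hypothetical scratch file */
    void *(*entry[3])(void *) = { writer, reader, reader };
    pthread_t tid[3];
    struct shared_log lg;
    int started = 0, failures = 0;

    int tmp = mkstemp(path);
    if (tmp < 0)
        return -1;
    close(tmp);
    lg.wfd = open(path, O_WRONLY | O_APPEND);   /* descriptor for the writer */
    lg.rfd = open(path, O_RDONLY);              /* descriptor shared by readers */
    unlink(path);
    if (lg.wfd < 0 || lg.rfd < 0) {
        if (lg.wfd >= 0) close(lg.wfd);
        if (lg.rfd >= 0) close(lg.rfd);
        return -1;
    }

    for (; started < 3; started++) {
        if (pthread_create(&tid[started], NULL, entry[started], &lg) != 0) {
            failures++;
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        void *res = NULL;
        pthread_join(tid[i], &res);
        if (res != NULL)
            failures++;
    }
    close(lg.wfd);
    close(lg.rfd);
    return failures;
}
```

    Polling is only the simplest way for readers to notice new data; a
    real application would more likely have the writer signal the
    readers, for example through a condition variable.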

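    Note that you're supposed to be able to open an 'fd' for append and
    still use 'pwrite' to write to any part of the file! (However, that is
    broken on some real-world systems, since it's such a crazy
    capability. Linux, for example, documents that 'pwrite' on an
    O_APPEND descriptor always appends, whatever offset is given.)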

    You can use one FD or two. There should be no real difference.

    DS

  7. Re: Using multiple file descriptors for the same file

    On Oct 21, 8:13 pm, David Schwartz wrote:
    > So long as you only have one thread writing, that should work fine.
    > Otherwise, use 'pwrite'.
    >
    > You can use one FD or two. There should be no real difference.

    Thanks a lot, David. You saved me a lot of time.

    DJD
