TCP send() implementation - TCP-IP

Thread: TCP send() implementation

  1. Re: TCP send() implementation

    On Dec 2, 3:50 am, "Oleh Derevenko" wrote:

    > If application serializes access to the socket itself it is not a
    > multithreaded use. Do you know the old anecdote about Windows 95?


    > - Daddy, why is Windows 95 called a multitasking operating system?
    > - Wait a minute, son. I'll finish formatting my floppy disk and show you.


    Every realistic application has to serialize access to a TCP
    connection. There are few protocols that permit data to be sent in a
    random order.

    In any event, you fail to see that your argument defeats your own
    position. *You* are the one complaining that the OS does *not*
    serialize access to the TCP connection. The OS permits concurrent
    access. So your anecdote makes *my* point.

    DS

  2. Re: TCP send() implementation

    On Dec 2, 4:02 am, "Oleh Derevenko" wrote:

    > Yes, this is what I do to solve the problem, of course. However serializing
    > threads inside of socket library implementation would be more effective and
    > would improve execution parallelism.


    So you are saying that all applications should suffer the overhead of
    extra serialization in the network fast path for the one-in-a-billion
    application that doesn't care about the order its data is sent?
    (Because if it did care, it would have to serialize entry to 'send'
    anyway.)

    Wow, you're really out there on this one.

    DS
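
A minimal sketch of the application-level serialization being discussed here, written in Python for brevity rather than against the C sockets API the thread is about. The names `SerializedSender` and `demo` are made up for illustration, and `socket.socketpair()` stands in for a real TCP connection. The point it tries to show is that the lock has to cover the whole message, not a single send call.

```python
import socket
import threading


class SerializedSender:
    """Hypothetical wrapper: one lock per connection, held per message."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()

    def send_message(self, data):
        # Hold the lock until the entire message has been handed to the
        # kernel, so a partial send by one thread can never be followed
        # by another thread's bytes.
        with self._lock:
            self._sock.sendall(data)


def demo(messages_per_thread=200):
    """Two threads share one socket; return the lines the peer received."""
    left, right = socket.socketpair()  # stand-in for a TCP connection
    sender = SerializedSender(left)

    def worker(tag):
        line = tag + b' LIST "" Mailbox' + tag + b"\r\n"
        for _ in range(messages_per_thread):
            sender.send_message(line)

    threads = [threading.Thread(target=worker, args=(tag,))
               for tag in (b"A", b"B")]
    for thread in threads:
        thread.start()

    # Read in the calling thread so the writers never stall forever
    # on a full send buffer.
    expected = 2 * messages_per_thread
    received = b""
    while received.count(b"\r\n") < expected:
        chunk = right.recv(4096)
        if not chunk:
            break
        received += chunk

    for thread in threads:
        thread.join()
    left.close()
    right.close()
    return received.split(b"\r\n")[:-1]
```

Every line returned by `demo()` should be one of the two complete commands; the order in which the A and B lines arrive is still unspecified, which is all the lock is meant to leave open.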


  3. Re: TCP send() implementation

    On Nov 30, 9:59 pm, Scott Gifford wrote:
    > David Schwartz writes:
    > > On Nov 30, 5:15 pm, Esmond Pitt
    > > wrote:

    >
    > >> I'm sorry to disagree with so many distinguished colleagues but the
    > >> problem is right here. That is not correct. What it should do is buffer
    > >> what can be buffered and return that bytecount to the application. What
    > >> happens next is up to the application: it will usually advance the
    > >> offset and reduce the length, both by the return value, and repeat, and
    > >> *that's* where inter-thread problems can occur, but that's strictly the
    > >> application's problem.

    >
    > > In other words, the existing implementation has no problem, but it
    > > should be changed such that it does. What sense does that make?

    >
    > I believe the problem the OP is reporting is this. Let's use the IMAP
    > protocol, since it's one where it makes sense to have multiple threads
    > writing to the same socket.
    >
    > Thread 1: A LIST "" Mailbox1
    > Thread 2: B LIST "" Mailbox2
    >
    > If the threads are unsynchronized, I would expect it to get the two
    > commands, with no guarantees about order.
    >
    > What the OP is saying is that under some circumstances, it's possible
    > that the server would actually receive the two commands intermixed
    > arbitrarily, possibly resulting in completely garbled commands:
    >
    > A LIST B LIST "" Mailbox2
    > "" Mailbox1
    >
    > Here are the circumstances:
    >
    > Thread 1: runs
    > Thread 1: Sends A LIST
    > Thread 1: send buffer full, blocks
    > Thread 2: runs
    > Thread 2: send buffer has space
    > Thread 2: Sends B LIST "" Mailbox2\r\n
    > Thread 2: send buffer full, blocks
    > Thread 1: runs
    > Thread 1: send buffer has space
    > Thread 1: Sends "" Mailbox1\r\n
    >
    > I think that's very surprising behavior, even if it is allowable.
    >
    > But the solution, synchronizing the threads, seems straightforward
    > enough, even if it is surprising that it's necessary.


    Try to create an even remotely realistic scenario where it matters. In
    this case, it's not remotely realistic because there would be no way
    to know what was a reply to thread A and what was a reply to thread B.
    If we are going to assume the two commands are always the same, then
    there's no point in sending it twice.

    DS
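
The interleaving scenario quoted above can be replayed deterministically with a toy model. This is only a sketch: `ToySocket` is not a real socket, its `free` attribute is a made-up stand-in for the space left in the send buffer, and the two "threads" are plain generators stepped by hand in the order the quoted trace gives. The `writer` loop is the usual pattern of advancing by the return value of send and repeating.

```python
class ToySocket:
    """Toy model of a stream socket whose send() may accept only part."""

    def __init__(self):
        self.free = 0            # bytes of send-buffer space available now
        self.wire = bytearray()  # what the peer will eventually read

    def send(self, data):
        # Accept as many bytes as currently fit and report that count.
        accepted = data[:self.free]
        self.wire += accepted
        self.free -= len(accepted)
        return len(accepted)


def writer(sock, message):
    """Unsynchronized send loop; each yield models blocking on a full buffer."""
    offset = 0
    while offset < len(message):
        offset += sock.send(message[offset:])
        if offset < len(message):
            yield  # buffer full: another thread may run before we resume


def replay():
    sock = ToySocket()
    t1 = writer(sock, b'A LIST "" Mailbox1\r\n')
    t2 = writer(sock, b'B LIST "" Mailbox2\r\n')

    sock.free = 7    # room for only 'A LIST '
    next(t1)         # thread 1 sends 7 bytes, then blocks
    sock.free = 20   # buffer has space again
    next(t2, None)   # thread 2 sends its whole command
    sock.free = 20
    next(t1, None)   # thread 1 resumes and sends the remainder
    return bytes(sock.wire)
```

Under these assumptions `replay()` returns `b'A LIST B LIST "" Mailbox2\r\n"" Mailbox1\r\n'`, the garbled stream described in the quoted post.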



  4. Re: TCP send() implementation

    David Schwartz writes:

    [...]

    >> I believe the problem the OP is reporting is this. Let's use the IMAP
    >> protocol, since it's one where it makes sense to have multiple threads
    >> writing to the same socket.
    >>
    >> Thread 1: A LIST "" Mailbox1
    >> Thread 2: B LIST "" Mailbox2
    >>


    [...]

    > Try to create an even remotely realistic scenario where it matters. In
    > this case, it's not remotely realistic because there would be no way
    > to know what was a reply to thread A and what was a reply to thread B.
    > If we are going to assume the two commands are always the same, then
    > there's no point in sending it twice.


    I chose IMAP for this example because it actually does handle this
    properly. The reply to A will be tagged with A, and the reply to B
    will be tagged with B; that's why the tags are there.

    See RFC 3501 for more details; section 2.2.2 has a good summary of
    this feature:

    The server completion result response indicates the success or
    failure of the operation. It is tagged with the same tag as the
    client command which began the operation. Thus, if more than one
    command is in progress, the tag in a server completion response
    identifies the command to which the response applies.

    ----Scott.

  5. Re: TCP send() implementation

    On Dec 3, 1:58 pm, Scott Gifford wrote:

    > I chose IMAP for this example because it actually does handle this
    > properly. The reply to A will be tagged with A, and the reply to B
    > will be tagged with B; that's why the tags are there.
    >
    > See RFC 3501 for more details; section 2.2.2 has a good summary of
    > this feature:
    >
    > The server completion result response indicates the success or
    > failure of the operation. It is tagged with the same tag as the
    > client command which began the operation. Thus, if more than one
    > command is in progress, the tag in a server completion response
    > identifies the command to which the response applies.
    >
    > ----Scott.


    Nevertheless, it's still not realistic. Threads A and B can't just
    both call 'recv' or 'read' and expect to get the correct output. There
    would have to be some single concentrator that calls 'recv' and
    figures out whether the received data is an entire line and whether it
    should go to thread A or thread B. This is precisely the same logic as
    is needed for send.

    Again, there is no realistic case where this kind of synchronization
    is not needed for other reasons.

    DS
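
A rough sketch of the "single concentrator" described above, in Python. It assumes a simplified protocol in which every response is one CRLF-terminated line beginning with the tag of the command it answers; real IMAP also has untagged and multi-line responses, which this ignores. The function name `concentrator` and the `inboxes` mapping of tag to queue are illustrative choices, not anything from the thread.

```python
import queue
import socket
import threading


def concentrator(sock, inboxes):
    """Sole reader of `sock`: reassemble lines and route each by its tag."""
    pending = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection
            break
        pending += chunk
        # recv() may deliver part of a line or several lines at once,
        # so only complete lines are handed on.
        while b"\r\n" in pending:
            line, pending = pending.split(b"\r\n", 1)
            tag = line.split(b" ", 1)[0]
            if tag in inboxes:
                inboxes[tag].put(line)
            # Lines with an unknown tag are dropped in this sketch.
    for inbox in inboxes.values():
        inbox.put(None)  # tell every waiting thread the stream has ended
```

Each worker thread would then block on its own queue instead of calling recv directly, which is the same kind of per-connection coordination that a lock around send provides on the outgoing side.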

  6. Re: TCP send() implementation

    David Schwartz writes:

    > On Dec 3, 1:58 pm, Scott Gifford wrote:
    >
    >> I chose IMAP for this example because it actually does handle this
    >> properly. The reply to A will be tagged with A, and the reply to B
    >> will be tagged with B; that's why the tags are there.


    [...]

    > Nevertheless, it's still not realistic. Threads A and B can't just
    > both call 'recv' or 'read' and expect to get the correct output. There
    > would have to be some single concentrator that calls 'recv' and
    > figures out whether the received data is an entire line and whether it
    > should go to thread A or thread B.


    That is definitely true for read/recv, but the OP's question was about
    send.

    > This is precisely the same logic as is needed for send.
    >
    > Again, there is no realistic case where this kind of synchronization
    > is not needed for other reasons.


    In this example, the synchronization around send would not be
    necessary if it always sent the entire buffer, or sent nothing.

    I don't mean to harp on this point, but it is very hard not to try to
    find counterexamples when somebody says "there is no realistic case
    where this matters". :-) Perhaps we can just agree that the
    existing behavior is legal, although it can be surprising; and that
    most applications would have to synchronize anyway, so it's not worth
    changing kernel code to accommodate rarely used optimizations for some
    specific application protocols?

    ----Scott.

  7. Re: TCP send() implementation

    On Dec 4, 8:54 pm, Scott Gifford wrote:

    > > Nevertheless, it's still not realistic. Threads A and B can't just
    > > both call 'recv' or 'read' and expect to get the correct output. There
    > > would have to be some single concentrator that calls 'recv' and
    > > figures out whether the received data is an entire line and whether it
    > > should go to thread A or thread B.


    > That is definitely true for read/recv, but the OP's question was about
    > send.


    The reasoning is precisely the same.

    > > This is precisely the same logic as is needed for send.

    >
    > > Again, there is no realistic case where this kind of synchronization
    > > is not needed for other reasons.


    > In this example, the synchronization around send would not be
    > necessary if it always sent the entire buffer, or sent nothing.


    But the same synchronization would be needed for receive, so the
    synchronization would be needed anyway.

    > I don't mean to harp on this point, but it is very hard not to try to
    > find counterexamples when somebody says "there is no realistic case
    > where this matters". :-) Perhaps we can just agree that the
    > existing behavior is legal, although it can be surprising; and that
    > most applications would have to synchronize anyway, so it's not worth
    > changing kernel code to accommodate rarely used optimizations for some
    > specific application protocols?


    It's not that there's no realistic case where this matters. You found
    a realistic case where it matters. The point is, there's no realistic
    case where the added cost of additional synchronization in the kernel
    would be outweighed by some benefit that could be realized elsewhere.

    In this case, the basic claim is that user code could be simpler,
    since it wouldn't need synchronization. But it needs that very same
    synchronization for receiving, so the code has to be written and
    tested anyway.

    DS

  8. Re: TCP send() implementation

    On Nov 30, 7:03 am, Casper H.S. Dik wrote:

    > Depending on the OS the guarantees it makes can be different;
    > on POSIX, the guarantee is that:
    >
    > o Write requests of {PIPE_BUF} bytes or less are
    > guaranteed not to be interleaved with data from
    > other processes doing writes on the same pipe.
    > Writes of greater than {PIPE_BUF} bytes may have
    > data interleaved, on arbitrary boundaries, with
    > writes by other processes, whether or not the
    > O_NONBLOCK or O_NDELAY flags are set.
    >
    > send() and write() are pretty much equivalent in that respect.
    >
    > If the behaviour as described happens for bits of a message <= PIPE_BUF
    > in length on a POSIX style implementation, then I would argue that it
    > *is* in fact a bug. But since PIPE_BUF is typically small (5K), an
    > application with multiple threads writing to the same socket would do
    > well to serialize those writes itself.


    This is simply flat out wrong. This behavior is explicitly only for
    pipes or FIFOs. In fact, keep reading, it says:

    "If fildes refers to a socket, write() shall be equivalent to send()
    with no flags set."

    The documentation for 'send' says *nothing* about PIPE_BUF.

    DS
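
For reference, the PIPE_BUF limit under dispute can be queried at run time. The sketch below is Python on a Unix-like system (an assumption; `os.fpathconf` is not available everywhere), and the helper name `pipe_buf_of_new_pipe` is made up for illustration. It asks about a pipe, since that is the only kind of object for which the quoted guarantee is stated.

```python
import os


def pipe_buf_of_new_pipe():
    """Return PIPE_BUF as reported for a freshly created pipe."""
    read_fd, write_fd = os.pipe()
    try:
        # "PC_PIPE_BUF" is the pathconf name for the largest write that
        # is guaranteed not to be interleaved on a pipe or FIFO.
        return os.fpathconf(write_fd, "PC_PIPE_BUF")
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

POSIX requires the value to be at least 512; Linux typically reports 4096. Nothing in this query says anything about sockets.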

  9. Re: TCP send() implementation

    On Dec 6, 3:36 am, David Schwartz wrote:

    > This is simply flat out wrong. This behavior is explicitly only for
    > pipes or FIFOs. In fact, keep reading, it says:
    >
    > "If fildes refers to a socket, write() shall be equivalent to send()
    > with no flags set."
    >
    > The documentation for 'send' says *nothing* about PIPE_BUF.


    Let me rephrase. Your claim "send() and write() are pretty much
    equivalent in that respect" appears to come from outer space. In fact,
    'send' and 'write' are completely different in that respect.

    I see that you quoted the same section I did, that for a socket,
    'write' acts like 'send'. What I don't get is why you think 'send'
    must respect PIPE_BUF. Because if not, neither must 'write' for a
    socket.

    DS



  10. Re: TCP send() implementation

    On Dec 5, 5:54 am, David Schwartz wrote:
    > On Dec 3, 1:58 pm, Scott Gifford wrote:
    >
    > > I chose IMAP for this example because it actually does handle this
    > > properly. The reply to A will be tagged with A, and the reply to B
    > > will be tagged with B; that's why the tags are there.

    >
    > > See RFC 3501 for more details; section 2.2.2 has a good summary of
    > > this feature:

    >
    > > The server completion result response indicates the success or
    > > failure of the operation. It is tagged with the same tag as the
    > > client command which began the operation. Thus, if more than one
    > > command is in progress, the tag in a server completion response
    > > identifies the command to which the response applies.

    >
    > > ----Scott.

    >
    > Nevertheless, it's still not realistic. Threads A and B can't just
    > both call 'recv' or 'read' and expect to get the correct output. There
    > would have to be some single concentrator that calls 'recv' and
    > figures out whether the received data is an entire line and whether it
    > should go to thread A or thread B. This is precisely the same logic as
    > is needed for send.
    >
    > Again, there is no realistic case where this kind of synchronization
    > is not needed for other reasons.
    >
    > DS


    The fundamentals of UNIX have to be remembered here. In UNIX (and all
    similar OSes these days), an open socket connection is just a file.
    Forget threads. Imagine you create a child process B (while the parent
    A had open file descriptors at the time of fork). Now have both
    processes write to the previously mentioned open file using the
    write() system call: you will find that the file contains interleaved
    data, in whatever order the processes happened to be scheduled. This
    is not the responsibility of the file system. It's the responsibility
    of the applications (processes) to maintain a lock so that there is no
    data corruption. Recall that a child process shares the parent's file
    table entry (and thereby the file offset).

    In your case, multiple threads share the file table entries/socket
    descriptors. TCP is like a file system in this regard: ordering
    concurrent writers is not its business. It's the business of the
    applications. The fundamental thing you need to know when you do
    concurrent programming is that you should use mutual exclusion when
    sharing files/open connections.

    No need to refer to the RFC; you need to teach yourself more about
    threads. It is just that.

    thanks,
    Jijo Chacko
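
The shared file offset mentioned above can be seen without forking. In this Python sketch `os.dup()` is used as a stand-in for fork(), on the assumption that what matters is two descriptors referring to the same open file description; the function name `shared_offset_demo` is made up for illustration, and a Unix-like system is assumed.

```python
import os
import tempfile


def shared_offset_demo():
    """Write through two descriptors that share one file offset."""
    with tempfile.TemporaryFile() as f:
        fd1 = f.fileno()
        fd2 = os.dup(fd1)  # same open file description, as after fork()
        try:
            os.write(fd1, b"AAAA")
            os.write(fd2, b"BB")    # lands after AAAA, not at offset 0
            os.write(fd1, b"CCCC")  # lands after BB
            offset = os.lseek(fd2, 0, os.SEEK_CUR)
            os.lseek(fd1, 0, os.SEEK_SET)
            return offset, os.read(fd1, 100)
        finally:
            os.close(fd2)
```

`shared_offset_demo()` returns `(10, b"AAAABBCCCC")`: the data ends up in whatever order the writes were issued, and nothing in the file layer groups one writer's output together.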

  12. Re: TCP send() implementation

    David Schwartz writes:

    >On Nov 30, 7:03 am, Casper H.S. Dik wrote:


    >> Depending on the OS the guarantees it makes can be different;
    >> on POSIX, the guarantee is that:
    >>
    >> o Write requests of {PIPE_BUF} bytes or less are
    >> guaranteed not to be interleaved with data from
    >> other processes doing writes on the same pipe.
    >> Writes of greater than {PIPE_BUF} bytes may have
    >> data interleaved, on arbitrary boundaries, with
    >> writes by other processes, whether or not the
    >> O_NONBLOCK or O_NDELAY flags are set.
    >>
    >> send() and write() are pretty much equivalent in that respect.
    >>
    >> If the behaviour as described happens for bits of a message <= PIPE_BUF
    >> in length on a POSIX style implementation, then I would argue that it
    >> *is* in fact a bug. But since PIPE_BUF is typically small (5K), an
    >> application with multiple threads writing to the same socket would do
    >> well to serialize those writes itself.


    >This is simply flat out wrong. This behavior is explicitly only for
    >pipes or FIFOs. In fact, keep reading, it says:


    >"If fildes refers to a socket, write() shall be equivalent to send()
    >with no flags set."


    You should have also read the send() manual page:

    "The send() function is identical to sendto() with a null pointer
    dest_len argument, and to write() if no flags are used."

    The definition is at best circular.

    And the write() definition clearly leaves room for implementations
    to interleave the data from multiple send() (e.g., under the heading
    "STREAMS" in write()).

    Also, the standard guarantees non-interleaving *ONLY* for pipes and FIFOs
    with writes <= PIPE_BUF. Not for others.

    There's a vague guarantee for plain files (but you could read it as not
    to exclude concurrent file offset changes from other threads or processes)

    (Only very few implementations appear to have an atomic form of TCP send())

    Casper
    --
    Expressed in this posting are my opinions. They are in no way related
    to opinions held by my employer, Sun Microsystems.
    Statements on Sun products included here are not gospel and may
    be fiction rather than truth.

  13. Re: TCP send() implementation


    Casper H. S. Dik wrote:

    > David Schwartz writes:


    > You should have also read the send() manual page:


    > "The send() function is identical to sendto() with a null pointer
    > dest_len argument, and to write() if no flags are used."


    > The definition is at best circular.


    No, it's really not circular. The behavior for sockets is explained on
    the 'send' page and the behavior for streams and regular files is
    explained on the 'write' page. It's not circular because each
    reference is conditional.

    For example, consider two functions, 'foo' and 'bar', documented as
    such:

    Foo: For negative numbers, 'foo' returns the square of the number. For
    positive numbers or zero, 'foo' behaves the same as 'bar'.

    Bar: For positive numbers or zero, 'bar' returns twice the number. For
    negative numbers, 'bar' behaves the same as 'foo'.

    This is the kind of circularity we have. IMO, it would have been better
    to define 'send', 'sendto', and 'write' on a single page. The
    reason it wasn't done that way is largely historical.
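
The 'foo' and 'bar' example can be written out directly. This sketch reads the 'Bar' entry as describing `bar` for zero and positive inputs and deferring to `foo` for negative ones, which is the reading under which the two descriptions are complementary. Each function refers to the other only for the inputs it does not define itself, so every call settles after at most one hop.

```python
def foo(n):
    # Defined here for negative numbers; otherwise defer to bar.
    if n < 0:
        return n * n
    return bar(n)


def bar(n):
    # Defined here for zero and positive numbers; otherwise defer to foo.
    if n >= 0:
        return 2 * n
    return foo(n)
```

Both `foo(-3)` and `bar(-3)` give 9, and both `foo(4)` and `bar(4)` give 8, so the cross-references are conditional rather than circular.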

    On a modern OS, whether you call 'send', 'sendto', or 'write', you'll
    wind up in the same logic. That logic will dispatch you to one
    function if the target is a socket, one if it's a pipe, and one if
    it's a regular file. So the behavior will depend on what type of
    descriptor you called it on rather than which function you called. Sadly,
    POSIX didn't document it that way, but the documentation allows the
    typical implementation.

    Honoring of PIPE_BUF atomicity guarantees will be in the pipe
    implementation. So you'll likely get that guarantee whether you call
    'send' or 'write'. That atomicity may or may not be in the socket
    implementation. There is no guarantee and POSIX does not require one.
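
To illustrate that the kind of object behind a descriptor is what selects the behavior, here is a small Python sketch for a Unix-like system. It issues the same `os.write()` call on a socket, a pipe, and a regular file, and uses `os.fstat()` to report which kind of object each descriptor refers to. The helper names `descriptor_kind` and `demo` are made up for illustration, and the sketch says nothing about how any particular kernel structures its dispatch internally.

```python
import os
import socket
import stat
import tempfile


def descriptor_kind(fd):
    """Classify a descriptor by the file type fstat() reports for it."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


def demo():
    """Make the same write() call on three different kinds of descriptor."""
    left, right = socket.socketpair()
    read_fd, write_fd = os.pipe()
    kinds = []
    try:
        with tempfile.TemporaryFile() as f:
            for fd in (left.fileno(), write_fd, f.fileno()):
                os.write(fd, b"hi")  # one call, three code paths underneath
                kinds.append(descriptor_kind(fd))
    finally:
        left.close()
        right.close()
        os.close(read_fd)
        os.close(write_fd)
    return kinds
```

On such a system `demo()` returns `["socket", "pipe", "regular file"]`; any atomicity rule would have to come from whichever of those three implementations handles the write.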

    > And the write() definition clearly leaves room for implementations
    > to interleave the data from multiple send() (e.g., under the heading
    > "STREAMS" in write()).


    Yes, just not for pipes or regular files.

    > Also, the standard guarantees non-interleaving *ONLY* for pipes and FIFOs
    > with writes <= PIPE_BUF. Not for others.


    Exactly.

    > There's a vague guarantee for plain files (but you could read it as not
    > to exclude concurrent file offset changes from other threads or processes)
    >
    > (Only very few implementations appear to have an atomic form of TCP send())


    Right.

    DS
