TCP send() implementation - TCP-IP


Results 1 to 20 of 33

Thread: TCP send() implementation

  1. TCP send() implementation

    Hello,

    Could anybody help me find a spec describing how a TCP stack should
    process a case of send-buffer overflow with respect to the SEND command?
    The reason I'm asking is a discussion I'm having with an OS developer
    regarding a TCP layer implementation. In his implementation, if the send
    buffer (SO_SNDBUF) is nearly full and the next SEND request brings more
    data than the space remaining, part of the data is copied to the output
    buffer immediately and the client thread is blocked until some data is
    transmitted and room is made for the rest of the data block. However, if
    more threads with SEND requests arrive in the meantime, then on resume
    the highest-priority thread is allowed to put its data in the buffer
    first, regardless of the fact that the original thread has already
    queued part of its data block, and inserting different data in the
    middle will corrupt the stream structure for the recipient.
    Such an implementation makes sockets totally inapplicable for
    multithreaded use. However, I can't convince him to make changes without
    finding some document which would clearly state whether send() should
    process its data block atomically or not.

    Oleh Derevenko
    -- ICQ: 36361783



  2. Re: TCP send() implementation

    In article ,
    "Oleh Derevenko" wrote:

    > Could anybody help me find a spec describing how a TCP stack should
    > process a case of send-buffer overflow with respect to the SEND
    > command? [snip]
    > I can't convince him to make changes without finding some document
    > which would clearly state whether send() should process its data
    > block atomically or not.


    See the specification of send() in the Single Unix Specification

    http://www.unix.org/single_unix_specification/

    It doesn't say anything about being atomic or thread-safe, so I think
    you need to use your own mutex to implement atomicity.
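    A minimal sketch of that suggestion, assuming POSIX sockets and
    pthreads (`send_locked` and `sock_lock` are illustrative names, not
    part of any API; for brevity it also assumes a blocking send() accepts
    the whole buffer, and a robust version would loop on short writes):

```c
#include <pthread.h>
#include <sys/socket.h>

/* Illustrative wrapper: the mutex, not the kernel, keeps one thread's
 * message from interleaving with another's on the same socket.
 * Assumes a blocking send() queues the whole buffer; a production
 * version would also loop on short writes. */
static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER;

ssize_t send_locked(int fd, const void *buf, size_t len)
{
    pthread_mutex_lock(&sock_lock);
    ssize_t n = send(fd, buf, len, 0);   /* at most one thread in here */
    pthread_mutex_unlock(&sock_lock);
    return n;
}
```

    Every thread that writes to the socket must go through the wrapper; a
    single unguarded send() call defeats the scheme.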
    --
    Barry Margolin
    Arlington, MA

  3. Re: TCP send() implementation

    Hi, Barry

    "Barry Margolin" wrote in message

    >> Could anybody help me find a spec describing how a TCP stack should
    >> process a case of send-buffer overflow with respect to the SEND
    >> command? [snip]
    >> I can't convince him to make changes without finding some document
    >> which would clearly state whether send() should process its data
    >> block atomically or not.

    >
    > See the specification of send() in the Single Unix Specification
    >
    > http://www.unix.org/single_unix_specification/
    >
    > It doesn't say anything about being atomic or thread-safe, so I think
    > you need to use your own mutex to implement atomicity.


    Well, this is just a UNIX man page. I would not expect to find such
    peculiarities described in a manual. There should be an RFC or something
    similar that describes TCP operations. I checked RFC 793, but it is too
    general - there are no details like that in it.

    Oleh Derevenko
    -- ICQ: 36361783



  4. Re: TCP send() implementation

    On Nov 30, 3:52 am, "Oleh Derevenko" wrote:
    > Could anybody help me find a spec describing how a TCP stack should
    > process a case of send-buffer overflow with respect to the SEND
    > command? [snip] Inserting different data in the middle will corrupt
    > the stream structure for the recipient. Such an implementation makes
    > sockets totally inapplicable for multithreaded use. [snip]



    There is no such specification. An OS designer, or more specifically
    the TCP/IP stack designer, will likely set themselves some goals for
    arbitrating fairness or other policies when presented with such
    situations. Many OSs/stacks take the approach that tasks that are
    hogging the available buffers get lower priority for additional use.

    More importantly, there is no such thing as a "stream structure for
    (the) recipient." No such thing exists in TCP. TCP is explicitly
    allowed to split up or merge consecutive writes to the stream (with
    some minor exceptions when it comes to urgent/out-of-band data) into
    segments in any way it sees fit, whatsoever. TCP may merge 1000
    one-byte sends into a single segment, which the receiver might see as
    a single 1000-byte chunk. Or it may split a 1000-byte send into 1000
    separate one-byte segments, each of which might be seen individually
    by the receiver. Or any other combination that the stack designer,
    current circumstances, resources, moon phase, or random number
    generator being used to make those determinations dictates. In fact,
    one of the common features of TCP (Nagle's algorithm) exists
    explicitly to merge segments in a very timing-dependent manner so as
    to increase efficiency.

    TCP provides a stream of bytes. Period. There are no message
    boundaries in TCP. The vague visibility of send boundaries is purely
    an artifact of some implementations, and even then it only works in
    some situations. Any application relying on such boundaries is
    fundamentally broken. In fact, ordinary retransmission in many stacks
    (after packet loss) will cause TCP segments to be repacked - so an
    application depending on them will fail if there's ever a lost
    packet. Also, many circuit-level proxies will flatly break such an
    application.
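    The practical consequence for receivers can be sketched as follows
    (POSIX assumed; `recv_all` is an illustrative helper, not a standard
    call): because TCP preserves no write boundaries, an application-level
    record of N bytes must be read in a loop, since recv() may return any
    prefix of it.

```c
#include <errno.h>
#include <sys/socket.h>

/* Read exactly len bytes, looping because recv() may return any
 * fraction of the record. Returns 0 on success, -1 on error or EOF. */
int recv_all(int fd, char *buf, size_t len)
{
    size_t off = 0;
    while (off < len) {
        ssize_t n = recv(fd, buf + off, len - off, 0);
        if (n == 0)
            return -1;          /* peer closed mid-record */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal; retry */
            return -1;          /* hard error; caller checks errno */
        }
        off += (size_t)n;       /* got a fragment; keep reading */
    }
    return 0;
}
```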

  5. Re: TCP send() implementation

    Hello

    wrote in message

    >> use. However I can't convince him to make changes without finding some
    >> document which would clearly state if send() should process its data
    >> block
    >> as atomic or not.

    >
    >
    > There is no such specification. An OS designer, or more specifically
    > the TCP/IP stack designer, will likely set them selves some goals for
    > arbitrating fairness or other policies when presented with such
    > situations. Many OSs/stacks take the approach that tasks that are
    > hogging the available buffers get lower priority for additional use.
    >
    > More importantly, there is no such things as a "stream structure for
    > (the) recipient." No such thing exists in TCP. TCP is explicitly
    > allowed to split up or merge consecutive writes to the stream (with
    > some minor exceptions when in comes to urgent/out-of-band data) into
    > segments in any way it sees fit, whatsoever. TCP may merge 1000 one

    [skipped]

    I know all that stuff.

    I do not require my blocks to be delivered in a single read. I only
    want block data from several threads not to intermix when those
    threads are sending envelopes over a socket. If two threads invoke
    send() with "AAAAAA" and "BBBB" and the receiver reads "ABBAAABBA",
    it is impossible to parse the data, and it is impossible to use the
    socket from multiple threads at all. Every API should be user-friendly
    and provide convenient services for the user, not just "function for
    its own pleasure".

    Though I partially agree that it could be questionable practice to
    provide such atomicity if a sender can push several megabytes of data
    in a single block.

    Oleh Derevenko
    -- ICQ: 36361783



  6. Re: TCP send() implementation

    On Nov 30, 6:45 am, "Oleh Derevenko" wrote:
    > [snip]
    >
    > I do not require my blocks to be delivered in a single read. I only
    > would like block data not to intermix when several threads are
    > sending envelopes over a socket. Because if two threads invoke
    > send() with "AAAAAA" and "BBBB" and the receiver reads "ABBAAABBA",
    > it is impossible to parse the data and it is impossible to use the
    > socket from multiple threads at all. [snip]
    >
    > Though I partially agree that it could be questionable practice to
    > provide such atomicity if a sender can push several megabytes of
    > data in a single block.



    Basically, the APIs are not mentioned at all in the RFCs, nor are
    threads, so you'll find no answer there.

    While you'd like to think that the entire chunk of data accepted by
    send (remembering that it may be less than the whole chunk requested)
    would be treated atomically, that has *not* historically been the
    case, at least for some systems. In fact it's not so for many of the
    *BSDs, Linux 2.6, and others. An interesting read:

    http://www.almaden.ibm.com/cs/people...h/sendmsg.html

  7. Re: TCP send() implementation

    Hi

    > While you'd like to think that the entire chunk of data accepted by
    > send would be treated atomically, that has *not* historically been
    > the case, at least for some systems. [snip]
    >
    > http://www.almaden.ibm.com/cs/people...h/sendmsg.html


    Thank you very much for the link and clarifications.

    Oleh Derevenko
    -- ICQ: 36361783



  8. Re: TCP send() implementation

    "robertwessel2@yahoo.com" writes:

    >On Nov 30, 3:52 am, "Oleh Derevenko" wrote:
    >> Could anybody help me find a spec describing how a TCP stack should
    >> process a case of send-buffer overflow with respect to the SEND
    >> command? [snip] Inserting different data in the middle will corrupt
    >> the stream structure for the recipient. Such an implementation makes
    >> sockets totally inapplicable for multithreaded use. [snip]



    >There is no such specification. An OS designer, or more specifically
    >the TCP/IP stack designer, will likely set them selves some goals for
    >arbitrating fairness or other policies when presented with such
    >situations. Many OSs/stacks take the approach that tasks that are
    >hogging the available buffers get lower priority for additional use.



    I don't think this is about the TCP specification, but rather about
    the behaviour of a particular system when there are multiple writers
    to a single TCP stream.

    His complaint is that two sends from two threads may get their data
    intermixed on the stream.

    Not much to do with TCP, as TCP doesn't come into play until AFTER
    the bytes have been queued.

    Depending on the OS, the guarantees it makes can be different;
    on POSIX, the guarantee is that:

    o Write requests of {PIPE_BUF} bytes or less are
    guaranteed not to be interleaved with data from
    other processes doing writes on the same pipe.
    Writes of greater than {PIPE_BUF} bytes may have
    data interleaved, on arbitrary boundaries, with
    writes by other processes, whether or not the
    O_NONBLOCK or O_NDELAY flags are set.

    send() and write() are pretty much equivalent in that respect.

    If the behaviour as described happens for pieces of a message <=
    PIPE_BUF in length on a POSIX-style implementation, then I would
    argue that it *is* in fact a bug. But since PIPE_BUF is typically
    small (5K), an application with multiple threads writing to the same
    socket would do well to serialize those writes itself.
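    The {PIPE_BUF} limit mentioned above can be inspected at runtime with
    fpathconf(). A small sketch, assuming a POSIX system
    (`pipe_atomic_limit` is an illustrative name; note the guarantee is
    stated for pipes and FIFOs, not TCP sockets):

```c
#include <unistd.h>

/* Query the per-descriptor atomic-write limit for a pipe.
 * POSIX guarantees this is at least 512 bytes (_POSIX_PIPE_BUF);
 * Linux, for example, reports 4096. Returns -1 on failure. */
long pipe_atomic_limit(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    long lim = fpathconf(fds[0], _PC_PIPE_BUF);
    close(fds[0]);
    close(fds[1]);
    return lim;
}
```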

    Casper
    --
    Expressed in this posting are my opinions. They are in no way related
    to opinions held by my employer, Sun Microsystems.
    Statements on Sun products included here are not gospel and may
    be fiction rather than truth.

  9. Re: TCP send() implementation

    On Nov 30, 1:52 am, "Oleh Derevenko" wrote:

    > Such implementation makes sockets totally inapplicable for multithreaded
    > use.


    No, it doesn't. There are few realistic scenarios in which it matters.

    Think about TCP for a second. If we want to send "How are you?" and
    one thread is going to send "How are" and the other is going to send "
    you?", we need to make sure the first thread *calls* 'send' before the
    second thread does. This will, whether we mean it to or not, also make
    sure the first thread *returns* from 'send' before the second thread
    calls 'send', automatically.

    DS

  10. Re: TCP send() implementation

    Hi

    "David Schwartz" wrote in message

    >> Such implementation makes sockets totally inapplicable for multithreaded
    >> use.

    >
    > No, it doesn't. There are few realistic scenarios in which it matters.
    >
    > Think about TCP for a second. If we want to send "How are you?" and
    > one thread is going to send "How are" and the other is going to send "
    > you?" we need to make sure the first thread *calls* 'send' before the
    > second thread does. This will, whether we mean it to or not, also make
    > sure the first thread *returns* from 'send' before the first thread
    > calls 'send' automatically.


    Sorry, but I do not understand this. This is complete nonsense. How do
    you intend to indicate to the second thread that the first one has
    entered the send() call before it returns?

    Oleh Derevenko
    -- ICQ: 36361783



  11. Re: TCP send() implementation

    On Nov 30, 10:53 am, "Oleh Derevenko" wrote:

    > > No, it doesn't. There are few realistic scenarios in which it matters.

    >
    > > Think about TCP for a second. If we want to send "How are you?" and
    > > one thread is going to send "How are" and the other is going to send "
    > > you?" we need to make sure the first thread *calls* 'send' before the
    > > second thread does. This will, whether we mean it to or not, also make
    > > sure the first thread *returns* from 'send' before the second thread
    > > calls 'send' automatically.


    > Sorry, but I do not understand this. This is a comple nonsense. How do you
    > intend to indicate to second thread that the first one has entered send()
    > call before it returns?


    The first thread must make sure that it calls 'send' before the second
    thread does. Otherwise, the other end will get " you?How are" instead
    of "How are you?" So its code must look something like this:

    1) Make sure the second thread doesn't call 'send' by acquiring a
    lock, setting a flag, or whatever.
    2) Call 'send'.
    3) Allow the second thread to call 'send'

    As a result, the second thread won't be able to call 'send' until the
    first thread returns *automatically*.

    Since applications need to control the order in which threads start
    sending, they wind up preventing two threads from sending on the same
    socket at the same time.

    So in the vast majority of realistic applications, this doesn't cause
    a problem.

    DS

  12. Re: TCP send() implementation

    Oleh Derevenko wrote:

    > In his implementation if send buffer
    > (SO_SNDBUF) is nearly full and next SEND request brings more data than the
    > space remaining, part of the data is copied to output buffer immediately and
    > the client thread is blocked until some data is transmitted and the room for
    > the rest of the data block is made.


    I'm sorry to disagree with so many distinguished colleagues but the
    problem is right here. That is not correct. What it should do is buffer
    what can be buffered and return that bytecount to the application. What
    happens next is up to the application: it will usually advance the
    offset and reduce the length, both by the return value, and repeat, and
    *that's* where inter-thread problems can occur, but that's strictly the
    application's problem.

    send() should only block if there is *zero* room in the send buffer, and
    it should only block until the buffer goes below the highwater mark. The
    whole send() operation has *nothing to do with* how much data the
    application specified in the send() call, unless it is zero, when there
    is an obvious shortcut.

    It should be easy to convince your colleague of this: apart from
    agreeing with every implementation of send() I am aware of, it is much
    easier for a start, and absolves the kernel of *any* responsibility for
    inter-thread semantics.
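    The caller-side pattern described above (advance the offset and reduce
    the length, both by the return value, and repeat) might look like this
    in POSIX C; `send_whole` is an illustrative name, not a standard call:

```c
#include <errno.h>
#include <sys/socket.h>

/* Queue an entire block even if send() accepts it piecemeal: after each
 * call, advance the buffer pointer and shrink the remaining length by
 * the return value, then repeat. Returns 0 on success, -1 on error. */
int send_whole(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal; retry */
            return -1;          /* hard error; caller checks errno */
        }
        buf += n;               /* advance offset by bytes accepted */
        len -= (size_t)n;       /* reduce remaining length */
    }
    return 0;
}
```

    With this pattern in the application, the kernel's only obligation is
    to accept *some* bytes per call, which is exactly the behaviour being
    argued for here.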

  13. Re: TCP send() implementation

    On Nov 30, 7:15 pm, Esmond Pitt
    wrote:
    > I'm sorry to disagree with so many distinguished colleagues but the
    > problem is right here. That is not correct. What it should do is buffer
    > what can be buffered and return that bytecount to the application. What
    > happens next is up to the application: it will usually advance the
    > offset and reduce the length, both by the return value, and repeat, and
    > *that's* where inter-thread problems can occur, but that's strictly the
    > application's problem.
    >
    > send() should only block if there is *zero* room in the send buffer, and
    > it should only block until the buffer goes below the highwater mark. The
    > whole send() operation has *nothing to do with* how much data the
    > application specified in the send() call, unless it is zero, when there
    > is an obvious shortcut.



    That's only true if the socket is in non-blocking mode. In blocking
    mode, the default, and obviously what the OP wants to use, send will
    block until the entire message is sent (or it fails).

  14. Re: TCP send() implementation

    In article ,
    "Oleh Derevenko" wrote:

    > Well, this is just a UNIX man page. I would not expect to find such
    > peculiarities described in a manual. There should be an RFC or something
    > similar that describes TCP operations. I checked RFC 793, but it is too
    > general - there are no details like that in it.


    send() is part of the Unix sockets API, not the network protocol, so it
    doesn't belong in an RFC.
    --
    Barry Margolin
    Arlington, MA

  15. Re: TCP send() implementation

    On Nov 30, 5:15 pm, Esmond Pitt
    wrote:

    > I'm sorry to disagree with so many distinguished colleagues but the
    > problem is right here. That is not correct. What it should do is buffer
    > what can be buffered and return that bytecount to the application. What
    > happens next is up to the application: it will usually advance the
    > offset and reduce the length, both by the return value, and repeat, and
    > *that's* where inter-thread problems can occur, but that's strictly the
    > application's problem.


    In other words, the existing implementation has no problem, but it
    should be changed such that it does. What sense does that make?

    > send() should only block if there is *zero* room in the send buffer, and
    > it should only block until the buffer goes below the highwater mark. The
    > whole send() operation has *nothing to do with* how much data the
    > application specified in the send() call, unless it is zero, when there
    > is an obvious shortcut.


    That would require the application to repeat the 'send' operation.

    > It should be easy to convince your colleague of this: apart from
    > agreeing with every implementation of send() I am aware of, it is much
    > easier for a start, and absolves the kernel of *any* responsibility for
    > inter-thread semantics.


    Huh? Every implementation of 'send' I know of provides blocking
    semantics by default and blocks until all data can be sent (unless
    there is an error, the connection is shut down, or the operation is
    interrupted).

    DS

  16. Re: TCP send() implementation

    David Schwartz wrote:
    > Huh? Every implementation of 'send' I know of provides blocking
    > semantics by default


    It blocks until there is room for at least 1 byte, or until below the
    highwater mark if implemented.

    > and blocks until all data can be sent (unless
    > there is an error, the connection is shutdown, or the operation is
    > interrupted).


    That's not what it says in Stevens, or man send(2). See for example
    http://www.informatik.uni-frankfurt....x/send.2.html:

    'If there is not enough space in the buffer to write out the
    entire request, send() completes successfully, having written
    as much data as possible, and returns the number of bytes it
    was able to write'.

    And that's why send(2) returns the count of bytes actually written.

    It's also how Java SocketChannels behave in blocking mode. That wouldn't
    be possible unless the system calls behaved that way too.

    An implementation of send(2) is free to block until the whole thing is
    sent, but it isn't obliged to do that, and not doing so moves the entire
    concurrency problem back into the application where it can be handled
    however the application likes, rather than imposing a kernel policy.
    Which is the original problem here.

  17. Re: TCP send() implementation

    robertwessel2@yahoo.com wrote:
    > That's only true if the socket is in non-blocking mode.


    No it's not. In non-blocking mode there is no blocking at all. In
    blocking mode it blocks until there is room for *something* in the buffer.

  18. Re: TCP send() implementation

    David Schwartz writes:

    > On Nov 30, 5:15 pm, Esmond Pitt
    > wrote:
    >
    >> I'm sorry to disagree with so many distinguished colleagues but the
    >> problem is right here. That is not correct. What it should do is buffer
    >> what can be buffered and return that bytecount to the application. What
    >> happens next is up to the application: it will usually advance the
    >> offset and reduce the length, both by the return value, and repeat, and
    >> *that's* where inter-thread problems can occur, but that's strictly the
    >> application's problem.

    >
    > In other words, the existing implementation has no problem, but it
    > should be changed such that it does. What sense does that make?


    I believe the problem the OP is reporting is this. Let's use the IMAP
    protocol, since it's one where it makes sense to have multiple threads
    writing to the same socket.

    Thread 1: A LIST "" Mailbox1
    Thread 2: B LIST "" Mailbox2

    If the threads are unsynchronized, I would expect it to get the two
    commands, with no guarantees about order.

    What the OP is saying is that under some circumstances, it's possible
    that the server would actually receive the two commands intermixed
    arbitrarily, possibly resulting in completely garbled commands:

    A LIST B LIST "" Mailbox2
    "" Mailbox1

    Here are the circumstances:

    Thread 1: runs
    Thread 1: Sends A LIST
    Thread 1: send buffer full, blocks
    Thread 2: runs
    Thread 2: send buffer has space
    Thread 2: Sends B LIST "" Mailbox2\r\n
    Thread 2: send buffer full, blocks
    Thread 1: runs
    Thread 1: send buffer has space
    Thread 1: Sends "" Mailbox1\r\n

    I think that's very surprising behavior, even if it is allowable.

    But the solution, synchronizing the threads, seems straightforward
    enough, even if it is surprising that it's necessary.

    ---Scott.

  19. Re: TCP send() implementation

    Hello

    "David Schwartz" wrote in message ...

    > The first thread must make sure that it calls 'send' before the second
    > thread does. Otherwise, the other end will get " you?How are" instead
    > of "How are you?" So its code must look something like this:
    >
    > 1) Make sure the second thread doesn't call 'send' by acquiring a
    > lock, setting a flag, or whatever.
    > 2) Call 'send'.
    > 3) Allow the second thread to call 'send'
    >
    > As a result, the second thread won't be able to call 'send' until the
    > first thread returns *automatically*.
    >
    > Since applications need to control the order in which threads start
    > sending, they wind up preventing two threads from sending on the same
    > socket at the same time.
    >
    > So in the vast majority of realistic applications, this doesn't cause
    > a problem.


    If the application serializes access to the socket itself, it is not
    multithreaded use. Do you know the old anecdote about Windows 95?

    - Daddy, why is Windows 95 called a multitasking operating system?
    - Wait a minute, son. I'll finish formatting my floppy disk and show you.

    Oleh Derevenko
    -- ICQ: 36361783



  20. Re: TCP send() implementation

    Hello

    "Scott Gifford" wrote in message

    > But the solution, synchronizing the threads, seems straightforward
    > enough, even if it is surprising that it's necessary.


    Yes, this is what I do to solve the problem, of course. However,
    serializing threads inside the socket library implementation would be
    more efficient and would improve execution parallelism.

    Oleh Derevenko
    -- ICQ: 36361783


