writev returns less than expected - Unix



  1. writev returns less than expected

    Hope this is the right group to post to, if not please direct me as needed.

    I have an application which uses non-blocking sockets. It sends a
    response using the writev function (I am testing on Fedora Core 6,
    "Linux starfury 2.6.22.1-32.fc6 #1 SMP Wed Aug 1 14:10:08 EDT 2007
    i686 i686 i386 GNU/Linux"). In some circumstances writev does not
    write all of the data to the socket. When this happens the client
    only gets part of the response and then sits there waiting for more
    data. It is not clear to me from the docs why this might happen.

    This seems to be an issue depending on the amount of data being sent
    back. For example if I am sending 3000 bytes it works fine, but when I
    try to send 30K bytes I run into the problem.

    This is the code I am using:

    ssize_t senddata(int socket, const char *buffer, int length)
    {
        ssize_t ret = 0;
        CCIMBool shouldRetry = CCIMFalse;
        int retVal = -1;
        struct iovec iov[] = { {NULL, 0} };
        iov[0].iov_base = (char*)buffer;
        iov[0].iov_len = length;
        do {
            ret = writev(socket, iov, 1);
            if (ret == -1) {
                .... handle error or need to retry
            }
        } while (shouldRetry == CCIMTrue);

    In the case where I run into the problem, the buffer length is 36333
    (according to strlen) and writev returns only 11340. Using Wireshark I
    can see that only part of the message is sent to the client.

    errno is not set, but then again writev is not returning -1.

    Any help appreciated.

    Thanks
    -Jim

  2. Re: writev returns less than expected

    On Thu, 18 Oct 2007 15:30:24 -0400, Jim Marshall wrote:

    > Hope this is the right group to post to, if not please direct me as
    > needed.
    >
    > I have an application which is using NON-blocking sockets. My
    > application attempts to send a response using the writev function (I am
    > testing on Fedora Core 6, "Linux starfury 2.6.22.1-32.fc6 #1 SMP Wed Aug
    > 1 14:10:08 EDT 2007 i686 i686 i386 GNU/Linux"), in some circumstances
    > writev does not write all of the data to the socket. When this happens
    > the client only gets part of the request and then sits there waiting for
    > more data.
    > It is not clear to me why this might happen from the docs.


    man writev:

    RETURN VALUE
    On success, the readv() function returns the number of bytes read; the
    writev() function returns the number of bytes written. On error, -1 is
    returned, and errno is set appropriately.

    This means that writev(), just like write(), MAY write fewer bytes than
    the total length you asked for. You have to take care of sending the
    rest of the data in your buffers in subsequent calls.
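
    A minimal sketch of such a retry loop for a single buffer, using plain
    write(). The name write_all is hypothetical, not a library function. It
    returns how many bytes were accepted in total; if that is less than len,
    errno says why the loop stopped (for example EAGAIN on a non-blocking
    socket whose buffer is full), and the caller can resume from that offset.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Keep calling write() until all len bytes are accepted or an error stops us.
 * Returns the total number of bytes written; a result smaller than len means
 * write() failed and errno holds the reason.
 * write_all is a hypothetical helper name used only for this sketch.
 */
size_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            break;              /* real error, or EAGAIN on a full buffer */
        }
        done += (size_t)n;      /* skip the bytes the kernel accepted */
    }
    return done;
}
```

    A caller would compare the result with len and, on a short count, either
    report the error or wait until the socket is writable and call again with
    buf + result.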



    >
    > This seems to be an issue depending on the amount of data being sent
    > back. For example if I am sending 3000 bytes it works fine, but when I
    > try to send 30K bytes I run into the problem.
    >
    > This is the code I am using:
    >
    > ssize_t senddata(int socket, const char *buffer, int length) {
    > ssize_t ret = 0;
    > CCIMBool shouldRetry = CCIMFalse;
    > int retVal = -1;
    > struct iovec iov[] = { {NULL, 0} };
    > iov[0].iov_base = (char*)buffer;
    > iov[0].iov_len = length;
    > do {
    > ret = writev(socket, iov, 1);
    > if (ret == -1) {
    > ... handle error or need to retry
    > }
    > } while (shouldRetry == CCIMTrue);
    >
    > In the case where I run into the problem, the buffer length is 36333
    > (according to strlen) and writev only returns 11340. Using WireShark I


    Forget strlen(). It only counts the bytes before the first NUL in a
    string. Do you WANT to send NULs?

    > can see that only part of the message is being sent to the client.
    >
    > errno is not set - but then again writev is not returning '-1'


    errno is only set on errors (the -1 return). It is meaningless otherwise.

    > Any help appreciated.
    >


    HTH,
    AvK

  3. Re: writev returns less than expected

    moi wrote:
    > On Thu, 18 Oct 2007 15:30:24 -0400, Jim Marshall wrote:
    >
    >> Hope this is the right group to post to, if not please direct me as
    >> needed.
    >>
    >> I have an application which is using NON-blocking sockets. My
    >> application attempts to send a response using the writev function (I am
    >> testing on Fedora Core 6, "Linux starfury 2.6.22.1-32.fc6 #1 SMP Wed Aug
    >> 1 14:10:08 EDT 2007 i686 i686 i386 GNU/Linux"), in some circumstances
    >> writev does not write all of the data to the socket. When this happens
    >> the client only gets part of the request and then sits there waiting for
    >> more data.
    >> It is not clear to me why this might happen from the docs.

    >
    > man writev:
    >
    > RETURN VALUE
    > On success, the readv() function returns the number of bytes read; the
    > writev() function returns the number of bytes written. On error, -1 is
    > returned, and errno is set appropriately.
    >
    > , which MEANS: writev() (,just like write()) MAY write less than your
    > 'length' argument. You'll have to take care of sending the rest of the
    > data in your buffers in subsequent calls.

    I was under the impression that writev was atomic, my bad. However, this
    confuses me now: if you had more than one item in the vector, it would be
    very difficult to figure out which part of the vector wasn't sent. I'm
    not sure I follow the point of this function.

    Guess I'll have to play with it some more.

    Thanks


  4. Re: writev returns less than expected

    Jim Marshall wrote:
    > Hope this is the right group to post to, if not please direct me as needed.


    > I have an application which is using NON-blocking sockets. My
    > application attempts to send a response using the writev function (I
    > am testing on Fedora Core 6, "Linux starfury 2.6.22.1-32.fc6 #1 SMP
    > Wed Aug 1 14:10:08 EDT 2007 i686 i686 i386 GNU/Linux"), in some
    > circumstances writev does not write all of the data to the
    > socket. When this happens the client only gets part of the request
    > and then sits there waiting for more data. It is not clear to me
    > why this might happen from the docs.


    > This seems to be an issue depending on the amount of data being sent
    > back. For example if I am sending 3000 bytes it works fine, but when I
    > try to send 30K bytes I run into the problem.


    Since the socket is non-blocking, your writev will do a partial write
    whenever there is less free space in the socket send buffer than you
    are trying to write.

    If you *know* you will "never" try to write more than N bytes into a
    socket at one time, and can "know" when the socket buffer has been
    emptied (say, by the receipt of data from the remote indicating it has
    everything you sent previously), you could use setsockopt() with
    SO_SNDBUF to make the socket buffer larger than your largest writev()
    call. Then, even if the socket is non-blocking, writev() should always
    write everything, assuming it can allocate space: socket buffer sizes
    are limits, not preallocations. After your setsockopt() call, you
    should make a getsockopt() call to make sure you got at least the size
    you wanted.
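
    A rough sketch of that set-then-verify step. The helper name
    request_send_buffer is made up for illustration. Note that the value read
    back need not equal the value requested: on Linux the kernel doubles the
    requested size for bookkeeping and caps it at a system-wide limit, so the
    caller should compare the result against what it actually needs.

```c
#include <sys/socket.h>

/*
 * Ask for a send buffer of `wanted` bytes, then read back what the kernel
 * actually granted. Returns the granted size, or -1 if either call fails.
 * request_send_buffer is a hypothetical helper name.
 */
int request_send_buffer(int sock, int wanted)
{
    int granted = 0;
    socklen_t len = sizeof granted;

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &wanted, sizeof wanted) == -1)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &granted, &len) == -1)
        return -1;
    return granted;     /* may be larger or smaller than `wanted` */
}
```

    If the returned size is smaller than the largest message, the partial-write
    handling is still needed.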

    One or more of the texts by Stevens, Fenner, and Rudoff, or by Stallings,
    might be good to add to your bookshelf. They cover all sorts of stuff
    like this.

    I trust you already know that even if you did put all N KB into the
    socket at once, your receiver may well still get it in smaller chunks,
    right? That is, TCP is a byte-stream service and message boundaries
    are not preserved.

    rick jones
    --
    a wide gulf separates "what if" from "if only"
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  5. Re: writev returns less than expected

    Jim Marshall wrote:


    >> , which MEANS: writev() (,just like write()) MAY write less than your
    >> 'length' argument. You'll have to take care of sending the rest of the
    >> data in your buffers in subsequent calls.

    >I was under the impression that writev was atomic, my bad. However this
    >confuses me now as if you had more then 1 item in the vector it would be
    >very difficult to figure out which part of the vector wasn't sent. I'm
    >not sure I follow the point of this function.


    My guess is you're running up against the amount of TCP socket buffer
    space in whichever OS you're using. Since your socket is non-blocking,
    when those buffers fill up, your write() or writev() returns
    immediately. Otherwise, your call would block until the client ACKs
    enough data to free up the buffer space.

    The only point of writev() versus write() is that you don't have to
    marshal your data into one buffer yourself. You are correct, though,
    that in your application, it may be a bit of work to calculate where
    the writev() left off in order to send the rest of your message.

    There is probably some tuning you can do to increase the TCP buffer
    sizes. What OS is this on?


  6. Re: writev returns less than expected

    On Oct 18, 1:31 pm, Jim Marshall wrote:
    > moi wrote:
    > > On Thu, 18 Oct 2007 15:30:24 -0400, Jim Marshall wrote:

    >
    > >> Hope this is the right group to post to, if not please direct me as
    > >> needed.

    >
    > >> I have an application which is using NON-blocking sockets. My
    > >> application attempts to send a response using the writev function (I am
    > >> testing on Fedora Core 6, "Linux starfury 2.6.22.1-32.fc6 #1 SMP Wed Aug
    > >> 1 14:10:08 EDT 2007 i686 i686 i386 GNU/Linux"), in some circumstances
    > >> writev does not write all of the data to the socket. When this happens
    > >> the client only gets part of the request and then sits there waiting for
    > >> more data.
    > >> It is not clear to me why this might happen from the docs.

    >
    > > man writev:

    >
    > > RETURN VALUE
    > > On success, the readv() function returns the number of bytes read; the
    > > writev() function returns the number of bytes written. On error, -1 is
    > > returned, and errno is set appropriately.

    >
    > > , which MEANS: writev() (,just like write()) MAY write less than your
    > > 'length' argument. You'll have to take care of sending the rest of the
    > > data in your buffers in subsequent calls.

    >
    > I was under the impression that writev was atomic, my bad.


    No. In fact, such a thing is impossible under normal circumstances.
    What if it sends one packet and then the link fails?

    My man page for writev even says:

    When using non-blocking I/O on objects such as sockets that are
    subject to flow control, write() and writev() may write fewer bytes
    than requested; the return value must be noted, and the remainder of
    the operation should be retried when possible.

    If you think about it, it's entirely possible that you write more data
    than there is buffer space for. You've asked for non-blocking I/O, so
    writev isn't allowed to wait until the first chunk is sent and then
    send some more. All it can do is return, and tell you how much it did
    write.

    Even when blocking I/O is being used, write/writev can still write
    less than requested if some error is encountered, or a signal
    arrives. You always have to be prepared to deal with that
    possibility.

    > However this
    > confuses me now as if you had more then 1 item in the vector it would be
    > very difficult to figure out which part of the vector wasn't sent.


    It isn't that hard, because they're sent in order. For instance, if
    you had chunks of size 30, 17, and 24, and writev returned 36, you
    know that it wrote all of the first chunk and the first 6 bytes of the
    second, so that is where you should start from.
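
    The arithmetic in that example can be sketched as a small helper. The name
    iov_locate and its out-parameters are invented for this illustration. For
    chunks of 30, 17, and 24 bytes and a return value of 36, it reports index 1
    and offset 6.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Given the byte count returned by writev(), find where to resume:
 * *index is the first element not fully written and *offset is how many
 * of its bytes already went out. Returns 1 if data remains, 0 if the
 * whole vector was written.
 * iov_locate is a hypothetical helper name.
 */
int iov_locate(const struct iovec *iov, int iovcnt, size_t written,
               int *index, size_t *offset)
{
    int i = 0;

    while (i < iovcnt && written >= iov[i].iov_len) {
        written -= iov[i].iov_len;  /* this element was sent completely */
        i++;
    }
    *index = i;
    *offset = written;
    return i < iovcnt;
}
```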

    > I'm
    > not sure I follow the point of this function.


    Sometimes it can be more efficient. For example, imagine an HTTP
    server. You've constructed the header of the response in one buffer,
    and the body is somewhere else. Without writev(), you would either
    have to copy them both into a single buffer, or make two calls to
    write() which involves a little more overhead.

    But unless you know that the overhead is a serious problem for you,
    it's usually more convenient just to use write(). For instance, it's
    easier to compute where you should restart a partial write.
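
    As an illustration of the header-plus-body case, here is a minimal sketch
    that hands both buffers to the kernel in one call. The function name
    send_response and its parameters are assumptions made for the example; the
    return value still has to be checked for a partial write, exactly as
    discussed above.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Pass a response header and a separately stored body to writev() in one
 * call, without first copying them into a combined buffer.
 * Returns what writev() returns: the number of bytes written (possibly
 * fewer than the total) or -1 on error.
 * send_response is a hypothetical helper name.
 */
ssize_t send_response(int fd, const char *header,
                      const void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)header;   /* iov_base is not const-qualified */
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = body_len;

    return writev(fd, iov, 2);
}
```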


  7. Re: writev returns less than expected

    Jim Marshall wrote:
    > I was under the impression that writev was atomic, my bad. However this
    > confuses me now as if you had more then 1 item in the vector it would be
    > very difficult to figure out which part of the vector wasn't sent.


    I think "very difficult" might be a little bit of an overstatement. As far
    as I can tell, you'd just loop through the vector and skip items as long as
    the lengths haven't added up to what writev() returned. It's non-trivial,
    but it's also not super hard.

    In particular, if it's OK to stomp on your iovec structs as you go, I think
    this should do the trick of sending everything in a list of iovecs:

    struct iovec *first_remaining_iovec = original_iovec_array;
    int remaining_count = original_count;

    while (remaining_count > 0) {
        /* fd is assumed to hold the socket descriptor */
        ssize_t bytes_written = writev(fd, first_remaining_iovec, remaining_count);
        if (bytes_written == -1) { /* .... handle error .... */ }

        size_t bytes_to_consume = bytes_written;
        while (bytes_to_consume > 0) {
            if (bytes_to_consume >= first_remaining_iovec->iov_len) {
                /* consume entire vector element */
                bytes_to_consume -= first_remaining_iovec->iov_len;
                remaining_count--;
                first_remaining_iovec++;
            } else {
                /* consume partial vector element */
                first_remaining_iovec->iov_len -= bytes_to_consume;
                first_remaining_iovec->iov_base += bytes_to_consume;
                bytes_to_consume = 0;
            }
        }
    }

    > I'm
    > not sure I follow the point of this function.


    Efficiency. Specifically, avoiding unnecessary copies whilst at the same time
    avoiding unnecessary system calls (and unnecessary extra packets or other
    physical I/O, i.e. following the principle of "never delay in queuing up
    everything you already have ready to send").

    Note that for maximum efficiency, you could potentially want to extend the
    vector if only a partial vector has already been sent. That is, if I call
    writev() with 3 items and it returns a count that tells me it wrote the
    first 2 and part of the 3rd, I might want to pass a new vector with the
    remainder of the 3rd plus a 4th and 5th with some new data that has become
    ready in the intervening time.

    So I guess writev(), in essence, allows you to maintain a queue of chunks
    and ask the system to process as much from the head of that queue of chunks
    as it can conveniently do right now, then return.

    - Logan

  8. Re: writev returns less than expected

    Rick Jones wrote:
    > Jim Marshall wrote:
    >> Hope this is the right group to post to, if not please direct me as needed.

    >
    >> I have an application which is using NON-blocking sockets. My
    >> application attempts to send a response using the writev function (I
    >> am testing on Fedora Core 6, "Linux starfury 2.6.22.1-32.fc6 #1 SMP
    >> Wed Aug 1 14:10:08 EDT 2007 i686 i686 i386 GNU/Linux"), in some
    >> circumstances writev does not write all of the data to the
    >> socket. When this happens the client only gets part of the request
    >> and then sits there waiting for more data. It is not clear to me
    >> why this might happen from the docs.

    >
    >> This seems to be an issue depending on the amount of data being sent
    >> back. For example if I am sending 3000 bytes it works fine, but when I
    >> try to send 30K bytes I run into the problem.

    >
    > Since the socket is non-blocking, your writev will do a partial write
    > whenever there is less space in the socket than you are trying to
    > write with writev.

    I obviously misread something somewhere (I can't find it now, isn't
    that always the case...). It makes sense that this would happen with
    non-blocking sockets, but given my misunderstanding it was confusing to me.

    Thanks to everyone for setting me on course.

    >
    > If you *know* you will "never" try to write more than N bytes into a
    > socket at one time and can "know" when that has been emptied - say by
    > the receipt of data from the remote indicating it has all you sent
    > previously - you could use setsockopt(SO_SNDBUF) to set the socket
    > buffer to be larger than your largest writev() call, and even if the
    > socket is non-blocking writev() should always write everything - well
    > assuming it can allocate space - socket buffer sizes are limits not
    > preallocations.... after your setsockopt() call, you should make a
    > getsockopt() call to make sure you got at least the size you wanted...

    In my application the amount of data we send back can vary from 1 KB to
    100 MB, so I will just code the function to work as it should.
    >
    > One or more of the texts of Stevens Fenner and Rudoff, or Stallings
    > might be good to add to your bookshelf - they cover all sorts of stuff
    > like this.

    Thanks, I will look these up.
    >
    > I trust you already know that even if you did put all N KB into the
    > socket at once that your reciever will still get it in smaller chunks
    > right? Ie that TCP is a byte-stream service and that message
    > boundaries are not preserved...

    Yes. My (mis)understanding was that writev was "atomic" in that it
    wouldn't return until all the data is written - probably applies to
    blocking sockets but not non-blocking ones.

    Again thanks to everyone who replied.
    >
    > rick jones


  9. Re: writev returns less than expected

    Logan Shaw writes:
    > Jim Marshall wrote:
    >> I was under the impression that writev was atomic, my bad. However
    >> this confuses me now as if you had more then 1 item in the vector it
    >> would be very difficult to figure out which part of the vector
    >> wasn't sent.

    >
    > I think "very difficult" might be a little bit of an overstatement. As far
    > as I can tell, you'd just loop through the vector and skip items as long as
    > the lengths haven't added up to what writev() returned. It's non-trivial,
    > but it's also not super hard.
    >
    > In particular, if it's OK to stomp on your iovec structs as you go, I think
    > this should do the trick of sending everything in a list of iovecs:
    >
    > struct iovec *first_remaining_iovec = original_iovec_array;
    > int remaining_count = original_count;
    >
    > while (remaining_count > 0) {
    > ssize_t bytes_written = writev(first_remaining_iovec, remaining_count);
    > if (bytes_written == -1) { /* .... handle error .... */ }
    >
    > size_t bytes_to_consume = bytes_written;
    > while (bytes_to_consume > 0) {
    > if (bytes_to_consume >= first_remaining_iovec->iov_len) {
    > /* consume entire vector element */
    > bytes_to_consume -= first_remaining_iovec->iov_len;
    > remaining_count--;
    > first_remaining_iovec++;
    > } else {
    > /* consume partial vector element */
    > first_remaining_iovec->iov_len -= bytes_to_consume;
    > first_remaining_iovec->iov_base += bytes_to_consume;
    > bytes_to_consume = 0;
    > }
    > }
    > }


    The condition tested inside the loop is always true except for the
    last iteration. This means a better way to write this would be

    /* the n_iovs check keeps iovs from running past the end of the array */
    while (n_iovs > 0 && nr >= iovs->iov_len) {
        nr -= iovs->iov_len;
        --n_iovs;
        ++iovs;
    }

    if (nr) {
        iovs->iov_len -= nr;
        iovs->iov_base = (char *)iovs->iov_base + nr;
    }

    Additionally, the iov_base member is a void *, and ISO C does not
    define arithmetic on void pointers. Treating void * like char * in this
    respect is a gcc extension. IMO it is better not to get into the
    habit of using it, because there is an easy way to get bitten by it:

    void *p;
    struct something *ps0, *ps1;

    p = allocate_two_somethings();
    if (!p) ...

    ps0 = p;
    ps1 = p + 1; /* Ouch. Should have been ps0 + 1 */

    With default settings, gcc will not catch this, but there is an
    optional warning (-Wpointer-arith) which can be used to detect
    places where this error may lurk. (I have made this mistake often
    enough myself that I prefer to enable the warning.)
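
    Pulling the ideas in this subthread together, a self-contained helper might
    look roughly like the sketch below. The name writev_all is hypothetical. It
    assumes the caller does not mind the iovec array being modified, and it
    simply returns -1 on EAGAIN, so on a non-blocking socket the caller would
    still need its own way to wait for writability and resume.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Write every byte described by iov[0..iovcnt-1] to fd, resuming after
 * partial writes. The iovec array is modified in place.
 * Returns 0 on success, -1 on error with errno set by writev().
 * writev_all is a hypothetical helper name.
 */
int writev_all(int fd, struct iovec *iov, int iovcnt)
{
    while (iovcnt > 0) {
        ssize_t written = writev(fd, iov, iovcnt);
        if (written == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* EAGAIN and real errors go to the caller */
        }

        size_t nr = (size_t)written;

        /* Skip the elements that were written completely. */
        while (iovcnt > 0 && nr >= iov->iov_len) {
            nr -= iov->iov_len;
            --iovcnt;
            ++iov;
        }

        /* Trim the element that was written only partially, if any. */
        if (nr > 0) {
            iov->iov_base = (char *)iov->iov_base + nr;
            iov->iov_len -= nr;
        }
    }
    return 0;
}
```

    The cast to char * before the addition keeps the pointer arithmetic within
    ISO C, so the code also compiles cleanly with -Wpointer-arith.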

  10. Re: writev returns less than expected

    Jim Marshall wrote:

    > Yes. My (mis)understanding was that writev was "atomic" in that it
    > wouldn't return until all the data is written - probably applies to
    > blocking sockets but not non-blocking ones.


    I suspect there is some verbiage about atomicity of write/writev
    somewhere. Usually that is in the context of multiple writers to the
    same _file_ and the granularity at which those writes will be
    interleaved. When you switch from calling write/writev against a file
    to write/writev against a socket, you have switched contexts.

    If you are only ever writing against a socket, you might want to
    consider sendmsg and friends. It's not a big deal, but it may avoid a
    tiny bit of mapping from writev to the socket code under the covers.
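
    For comparison, a sketch of the sendmsg() equivalent of a writev() call.
    The wrapper name send_iov is invented for the example. The same
    partial-write rules apply to the return value; the main practical
    difference is the extra flags argument, which accepts socket-specific
    options such as MSG_NOSIGNAL.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Socket-only counterpart of writev(): send an iovec array with sendmsg().
 * Returns the number of bytes sent (possibly fewer than the total) or -1
 * on error.
 * send_iov is a hypothetical helper name.
 */
ssize_t send_iov(int sock, struct iovec *iov, int iovcnt, int flags)
{
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);    /* no destination address, no control data */
    msg.msg_iov = iov;
    msg.msg_iovlen = iovcnt;

    return sendmsg(sock, &msg, flags);
}
```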

    rick jones
    --
    denial, anger, bargaining, depression, acceptance, rebirth...
    where do you want to be today?
    these opinions are mine, all mine; HP might not want them anyway...
    vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvvvvvvvv
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^

  11. Re: writev returns less than expected

    On Fri, 19 Oct 2007 01:34:14 -0400 Jim Marshall wrote:
    > Yes. My (mis)understanding was that writev was "atomic" in that it
    > wouldn't return until all the data is written - probably applies to
    > blocking sockets but not non-blocking ones.


    writev() is atomic, in the sense that data from multiple iov's
    will not be interleaved with data from other write()/writev()'s to
    the same file descriptor. That is, if a writev() returns N (bytes
    written), it is guaranteed that those N bytes are not interleaved
    with any bytes from another write()/writev(). This is notable
    because of the multiple iov's that can be passed ... the atomicity
    guarantee dictates that writev() cannot allow other data to be
    interleaved as it handles each subsequent iov.

    -frank

  12. Re: writev returns less than expected

    On Fri, 19 Oct 2007 11:31:59 -0700 Frank Cusack wrote:
    > On Fri, 19 Oct 2007 01:34:14 -0400 Jim Marshall wrote:
    >> Yes. My (mis)understanding was that writev was "atomic" in that it
    >> wouldn't return until all the data is written - probably applies to
    >> blocking sockets but not non-blocking ones.

    >
    > writev() is atomic, in the sense that that data from multiple iov's
    > will not be interleaved with data from other write()/writev()'s to
    > the same file descriptor.


    Sorry, I meant to the same file.

    -frank
