select() question, avoiding blocks on read/write - Unix



Thread: select() question, avoiding blocks on read/write

  1. select() question, avoiding blocks on read/write

    I'm sure the answer to this is out there somewhere but the key words are
    so common I can't find it. Also I'm pretty sure I've misunderstood some
    key point.

    Anyway, the question is this: select()/FD_ISSET/(etc.) are used to
    determine which file descriptor has data available, or in some instances
    if the file has closed, but it does not explicitly return the number of
    bytes available. Imagine an input stream contains data in highly
    variable amounts, at a highly variable rate. select() and FD_ISSET show
    that data is available on fd. The amount of that data, which is not
    revealed, is 100 bytes. Then a read is attempted, since we don't know
    how many bytes are available a buffer size is used which is likely to
    be larger than that number, for instance:

    ret = read(fd,buffer,256);

    This is where it gets confusing. There is less data than the buffer
    will hold. If the file or socket isn't closed right after this 100 byte
    block what keeps read() from waiting around for the other 156 bytes?
    This isn't much of an issue if the data is coming in at a regular rate,
    but here it might be 20 minutes before the remaining 156 bytes arrive,
    which is a long time to sit in read(). If non blocking IO is selected
    presumably it will return immediately with whatever was in the buffer
    (presumably because the man page for read() on linux doesn't actually
    say this explicitly). If non blocking IO isn't used, I don't see how
    select() lets you avoid blocked IO, unless you read one byte at a time
    and run select()/FD_ISSET between each read, which is crazy. But I
    thought the purpose of select()/FD_ISSET /etc. was specifically to avoid
    IO blocking.

    The same issue comes up for write(). Here select()/etc. indicate that
    an FD will be in a nonblocking state, but it doesn't guarantee that it
    will STAY in that state. So if one tries to write too much data
    following select write() will also block. As with read(), there seems
    to be no way to determine the number of bytes which will result in a
    blocked IO operation prior to initiating that operation.

    Thanks,

    David Mathog

  2. Re: select() question, avoiding blocks on read/write

    David Mathog writes:
    >

    [...]

    > select()/FD_ISSET/(etc.) are used to
    > determine which file descriptor has data available,

    [...]

    > The amount of that data,


    [...]

    > is 100 bytes. Then a read is attempted, since
    > we don't know how many bytes are available a buffer size is used which
    > is likely to be larger than that number, for instance:
    >
    > ret = read(fd,buffer,256);
    >
    > This is where it gets confusing. There is less data than the buffer
    > will hold. If the file or socket isn't closed right after this 100
    > byte block what keeps read() from waiting around for the other 156
    > bytes?


    For stream sockets:

    For stream-based sockets, such as SOCK_STREAM, message
    boundaries shall be ignored. In this case, data shall be
    returned to the user as soon as it becomes available,
    (SUS, recv)

    For other types of descriptors, read 'may' return a short count if
    only some amount of data is available.
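    A minimal sketch of the short-count behaviour described above,
    assuming a Linux-like system and using a local AF_UNIX stream socket
    pair as a stand-in for a real connection. The name short_read_demo is
    made up for illustration; it queues 100 bytes and then asks a
    blocking read() for 256.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns the byte count reported by a blocking
 * read() of up to 256 bytes when only 100 bytes are queued, or -1 if
 * the setup fails. */
ssize_t short_read_demo(void)
{
    int sv[2];
    char out[100], in[256];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    memset(out, 'x', sizeof out);
    if (write(sv[0], out, sizeof out) != (ssize_t)sizeof out) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Blocking descriptor, 256 bytes requested, 100 queued: the call
     * hands back what is there instead of waiting for the other 156. */
    n = read(sv[1], in, sizeof in);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

    On a stream socket the call should come back at once with the 100
    queued bytes, which matches the SUS text quoted above.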

    [...]

    > If non blocking IO is selected presumably it will return immediately
    > with whatever was in the buffer (presumably because the man page for
    > read() on linux doesn't actually say this explicitly).


    Since a non-blocking read from some descriptor must not block, there
    is no other option.

    [...]

    > But I thought the purpose of select()/FD_ISSET /etc. was specifically to
    > avoid IO blocking.


    A lot of people appear to think this for some unknown reason[*], but
    it is wrong. The purpose of select/ poll/ etc is to wait for
    I/O-readiness of multiple descriptors at the same time, so that a
    single thread of execution can be 'multiplexed' across more than one
    descriptor. The only reliable way to avoid blocking during an I/O
    operation on a particular descriptor (in absence of more specific
    information) is to set O_NONBLOCK.
    [*] For some equally unknown reason, a lot of code is written
    to block in select/ poll on a single descriptor and call read
    after the other call has returned instead of blocking in read.
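    For reference, a minimal sketch of setting O_NONBLOCK with fcntl().
    The helper names set_nonblocking and set_nonblocking_demo are
    assumptions made for this example, not part of any library.

```c
#include <fcntl.h>
#include <unistd.h>

/* Adds O_NONBLOCK to the descriptor's status flags and keeps the
 * others. Returns 0 on success, -1 on error (errno set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical check: returns 1 if the read end of a fresh pipe
 * reports O_NONBLOCK after the call, 0 if not, -1 if pipe() fails. */
int set_nonblocking_demo(void)
{
    int p[2], ok;

    if (pipe(p) == -1)
        return -1;
    ok = set_nonblocking(p[0]) == 0 &&
         (fcntl(p[0], F_GETFL, 0) & O_NONBLOCK) != 0;
    close(p[0]);
    close(p[1]);
    return ok;
}
```

    Reading the flags first matters: a bare F_SETFL with only O_NONBLOCK
    would clear other status flags such as O_APPEND.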

  3. Re: select() question, avoiding blocks on read/write

    Rainer Weikusat wrote:
    > David Mathog writes:


    >>But I thought the purpose of select()/FD_ISSET /etc. was specifically to
    >>avoid IO blocking.


    > A lot of people appear to think this for some unknown reason[*], but
    > it is wrong.


    I think you're incorrect. The SUS/POSIX definition of select()
    indicating that a file descriptor is readable is that a subsequent
    read() will not block.

    > The only reliable way to avoid blocking during an I/O
    > operation on a particular descriptor (in absence of more specific
    > information) is to set O_NONBLOCK.


    In practice this is a useful technique, as some broken implementations
    will in fact block on a read() even if select() indicated that the
    socket was readable. Some versions of linux behave this way for udp
    packets with invalid checksums, for instance.

    > [*] For some equally unknown reason, a lot of code is written
    > to block in select/ poll on a single descriptor and call read
    > after the other call has returned instead of blocking in read.


    This can be a useful technique if you want to do some time-based
    operation as well as wait on a single socket. For the simple case it's
    definitely overkill.
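    A rough sketch of that single-descriptor case: select() with a
    timeout, so the caller gets control back for time-based work even if
    nothing arrives. The names wait_readable and wait_readable_demo are
    hypothetical, and EINTR is simply reported as an error here.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc == -1)
        return -1;
    return rc > 0 && FD_ISSET(fd, &rfds);
}

/* Hypothetical demo on a pipe: returns 1 if the wait timed out while
 * the pipe was empty and reported readable once a byte was queued. */
int wait_readable_demo(void)
{
    int p[2], idle, ready = -1;

    if (pipe(p) == -1)
        return -1;
    idle = wait_readable(p[0], 50);        /* nothing written yet */
    if (write(p[1], "x", 1) == 1)
        ready = wait_readable(p[0], 50);   /* one byte queued */
    close(p[0]);
    close(p[1]);
    return idle == 0 && ready == 1;
}
```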

    Chris

  4. Re: select() question, avoiding blocks on read/write

    Chris Friesen wrote:
    > Rainer Weikusat wrote:
    >> David Mathog writes:

    >
    >>> But I thought the purpose of select()/FD_ISSET /etc. was specifically to
    >>> avoid IO blocking.

    >
    >> A lot of people appear to think this for some unknown reason[*], but
    >> it is wrong.

    >
    > I think you're incorrect. The SUS/POSIX definition of select()
    > indicating that a file descriptor is readable is that a subsequent
    > read() will not block.


    Things are a little clearer with recv() which has flags to
    control blocking behavior. Even there, one has to work backwards
    from the flags to figure out what the default behavior is.
    The key flags are:

    MSG_DONTWAIT
        Enables non-blocking operation; if the operation would block,
        EAGAIN is returned (this can also be enabled using the
        O_NONBLOCK with the F_SETFL fcntl(2)).
    MSG_WAITALL
        This flag requests that the operation block until the full
        request is satisfied. However, the call may still return less
        data than requested if a signal is caught, an error or
        disconnect occurs, or the next data to be received is of a
        different type than that returned.
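    A small sketch of the per-call flag, assuming Linux (where the
    quoted man page text comes from) and a local AF_UNIX stream socket
    pair. The socket itself stays in blocking mode; only the individual
    recv() calls are made non-blocking. The name dontwait_demo is
    invented for this example.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if recv(..., MSG_DONTWAIT) failed with
 * EAGAIN/EWOULDBLOCK on an empty socket and later returned exactly the
 * 5 queued bytes; 0 otherwise; -1 if the socket pair cannot be made. */
int dontwait_demo(void)
{
    int sv[2], empty_ok, data_ok = 0;
    char buf[256];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Nothing queued: a plain recv() would sleep here. */
    n = recv(sv[1], buf, sizeof buf, MSG_DONTWAIT);
    empty_ok = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    /* Five bytes queued, 256 requested, MSG_WAITALL not set. */
    if (send(sv[0], "hello", 5, 0) == 5) {
        n = recv(sv[1], buf, sizeof buf, MSG_DONTWAIT);
        data_ok = (n == 5);
    }

    close(sv[0]);
    close(sv[1]);
    return empty_ok && data_ok;
}
```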

    Presumably MSG_DONTWAIT is not needed if select/FD_ISSET shows the
    file descriptor has data ready. The existence of the MSG_WAITALL
    option indicates that, when it is absent, recv() would return
    immediately with whatever data is on hand and which will fit in the
    supplied buffer. This is generally the desired behavior when
    integrating input from various sources. The read() documentation is
    ambiguous on this (key) point - perhaps some implementations
    of read() wait for a full buffer before returning, and others don't?
    That documentation makes me nuts, for instance:

    EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and no data
    ^^^^^^^^^^^
    was immediately available for reading.
    ^^^^^^^^^^^^^^^^^^^^^^^^^

    Fine, so what does it do when SOME data is available? Does it return
    it immediately or does it block? The implication is that it returns it
    immediately but the man page doesn't actually say that anywhere.

    Regards,

    David Mathog

  5. Re: select() question, avoiding blocks on read/write

    On Dec 4, 11:15 am, Chris Friesen wrote:

    > I think you're incorrect. The SUS/POSIX definition of select()
    > indicating that a file descriptor is readable is that a subsequent
    > read() will not block.


    Wrong. Not a *subsequent* read but a hypothetical concurrent read.

    This is the same as every other status reporting function. For
    example, 'access' doesn't tell you that a subsequent operation will be
    permitted but that if that had been an operation, it would not have
    blocked. That is, it tells you that a hypothetical concurrent
    operation would have been allowed, but provides no guarantees about
    the future.

    > > The only reliable way to avoid blocking during an I/O
    > > operation on a particular descriptor (in absence of more specific
    > > information) is to set O_NONBLOCK.

    >
    > In practice this is a useful technique, as some broken implementations
    > will in fact block on a read() even if select() indicated that the
    > socket was readable. Some versions of linux behave this way for udp
    > packets with invalid checksums, for instance.


    You have it backwards. That's the problem if you *don't* set the
    descriptor to O_NONBLOCK and proves *his* point.

    DS

  6. Re: select() question, avoiding blocks on read/write

    On Dec 4, 9:05 am, David Mathog wrote:

    > Anyway, the question is this: select()/FD_ISSET/(etc.) are used to
    > determine which file descriptor has data available, or in some instances
    > if the file has closed, but it does not explicitly return the number of
    > bytes available. Imagine an input stream contains data in highly
    > variable amounts, at a highly variable rate. select() and FD_ISSET show
    > that data is available on fd. The amount of that data, which is not
    > revealed, is 100 bytes. Then a read is attempted, since we don't know
    > how many bytes are available a buffer size is used which is likely to
    > be larger than that number, for instance:
    >
    > ret = read(fd,buffer,256);
    >
    > This is where it gets confusing. There is less data than the buffer
    > will hold. If the file or socket isn't closed right after this 100 byte
    > block what keeps read() from waiting around for the other 156 bytes?


    If the application wants the socket to never block, it must set the
    socket non-blocking. This assures the 'read' will not ever wait for
    anything.

    > This isn't much of an issue if the data is coming in at a regular rate,
    > but here it might be 20 minutes before the remaining 156 bytes arrive,
    > which is a long time to sit in read(). If non blocking IO is selected
    > presumably it will return immediately with whatever was in the buffer
    > (presumably because the man page for read() on linux doesn't actually
    > say this explicitly). If non blocking IO isn't used, I don't see how
    > select() lets you avoid blocked IO, unless you read one byte at a time
    > and run select()/FD_ISSET between each read, which is crazy. But I
    > thought the purpose of select()/FD_ISSET /etc. was specifically to avoid
    > IO blocking.


    The select/FD_ISSET functions should always be used with non-blocking
    sockets. They don't let you avoid blocking by themselves, but that's
    not the problem. Simply setting the socket non-blocking assures you
    will never block -- the problem is knowing *when* to try (or re-try)
    an operation. That's what 'select' does.
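    The division of labour described above can be sketched as follows:
    select() says when to try, and non-blocking read() calls do the work
    until they report EAGAIN. Pipes stand in for sockets to keep the
    example short, and the names set_nonblocking, drain and
    select_drain_demo are assumptions of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags == -1 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Reads whatever is queued on a non-blocking fd; returns the total. */
static long drain(int fd)
{
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) { total += n; continue; }
        if (n == -1 && errno == EINTR) continue;
        break;  /* 0 is EOF; -1 is EAGAIN (empty for now) or an error */
    }
    return total;
}

/* Hypothetical demo: queues 100 bytes on one pipe and 300 on another,
 * waits in select(), drains every descriptor reported ready. Returns
 * the number of bytes read, or -1 on a setup failure. */
long select_drain_demo(void)
{
    int a[2], b[2], maxfd;
    char data[300];
    fd_set rfds;
    long total = -1;

    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]); close(a[1]);
        return -1;
    }
    memset(data, 'x', sizeof data);

    if (set_nonblocking(a[0]) != -1 && set_nonblocking(b[0]) != -1 &&
        write(a[1], data, 100) == 100 && write(b[1], data, 300) == 300) {
        FD_ZERO(&rfds);
        FD_SET(a[0], &rfds);
        FD_SET(b[0], &rfds);
        maxfd = a[0] > b[0] ? a[0] : b[0];

        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) > 0) {
            total = 0;
            if (FD_ISSET(a[0], &rfds)) total += drain(a[0]);
            if (FD_ISSET(b[0], &rfds)) total += drain(b[0]);
        }
    }
    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return total;
}
```

    A real event loop would rebuild the fd_set and call select() again
    after each pass instead of returning.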

    > The same issue comes up for write(). Here select()/etc. indicate that
    > an FD will be in a nonblocking state, but it doesn't guarantee that it
    > will STAY in that state. So if one tries to write too much data
    > following select write() will also block. As with read(), there seems
    > to be no way to determine the number of bytes which will result in a
    > blocked IO operation prior to initiating that operation.


    Right. That's why you set the socket non-blocking.

    DS

  7. Re: select() question, avoiding blocks on read/write

    David Schwartz wrote:
    > On Dec 4, 11:15 am, Chris Friesen wrote:


    >>I think you're incorrect. The SUS/POSIX definition of select()
    >>indicating that a file descriptor is readable is that a subsequent
    >>read() will not block.


    > Wrong. Not a *subsequent* read but a hypothetical concurrent read.
    >
    > This is the same as every other status reporting function. For
    > example, 'access' doesn't tell you that a subsequent operation will be
    > permitted but that if that had been an operation, it would not have
    > blocked. That is, it tells you that a hypothetical concurrent
    > operation would have been allowed, but provides no guarantees about
    > the future.


    If no other operation is done (by any thread of execution) on the file
    descriptor in question after it has been determined to be readable, then
    in my opinion a subsequent read should not block. The kernel shouldn't
    remove data from a socket buffer once it has received it.

    I think you were part of the UDP bad checksum discussion, no? The end
    result of that was to special-case it so that the checksum was done at
    select() time...so the above view is at least plausible enough for the
    linux kernel developers to implement it.

    In any case, the best solution is to use nonblocking sockets or else to
    use nonblocking reads. This will work regardless of system implementation.

    Chris

  8. Re: select() question, avoiding blocks on read/write

    On Dec 4, 9:13 pm, Chris Friesen wrote:

    > > Wrong. Not a *subsequent* read but a hypothetical concurrent read.


    > > This is the same as every other status reporting function. For
    > > example, 'access' doesn't tell you that a subsequent operation will be
    > > permitted but that if that had been an operation, it would not have
    > > blocked. That is, it tells you that a hypothetical concurrent
    > > operation would have been allowed, but provides no guarantees about
    > > the future.


    > If no other operation is done (by any thread of execution) on the file
    > descriptor in question after it has been determined to be readable, then
    > in my opinion a subsequent read should not block. The kernel shouldn't
    > remove data from a socket buffer once it has received it.


    Suppose another thread reads the data using a different file
    descriptor that refers to the same connection. Are you still insisting
    the data should be there?
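    A sketch of that scenario, assuming a local AF_UNIX stream socket
    pair and using dup() within a single thread to play the part of the
    "other thread". The name stale_readiness_demo is made up. The
    descriptor is set non-blocking so that the final read() can report
    the stale readiness instead of hanging.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if select() reported the descriptor
 * readable and a later read() on it still found nothing because the
 * data was consumed through a duplicate descriptor in between. */
int stale_readiness_demo(void)
{
    int sv[2], alias, flags, was_ready, result = -1;
    char buf[256];
    fd_set rfds;
    struct timeval tv = {0, 0};
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    alias = dup(sv[1]);               /* second fd, same connection */
    flags = fcntl(sv[1], F_GETFL, 0);
    memset(buf, 'x', 100);

    if (alias != -1 && flags != -1 &&
        fcntl(sv[1], F_SETFL, flags | O_NONBLOCK) != -1 &&
        write(sv[0], buf, 100) == 100) {
        FD_ZERO(&rfds);
        FD_SET(sv[1], &rfds);
        was_ready = select(sv[1] + 1, &rfds, NULL, NULL, &tv) == 1;

        /* Stands in for another thread using the other descriptor. */
        n = read(alias, buf, sizeof buf);

        /* The readiness report is now stale. Without O_NONBLOCK this
         * call would sleep until the peer sent something new. */
        if (n == 100) {
            n = read(sv[1], buf, sizeof buf);
            result = was_ready && n == -1 &&
                     (errno == EAGAIN || errno == EWOULDBLOCK);
        }
    }
    if (alias != -1)
        close(alias);
    close(sv[0]);
    close(sv[1]);
    return result;
}
```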

    Now, if you amend your claim to "no other operation is done on the
    same connection", you might have a point. But a connection has two
    ends, and the other end might do things no matter what your thread
    does.

    > I think you were part of the UDP bad checksum discussion, no? The end
    > result of that was to special-case it so that the checksum was done at
    > select() time...so the above view is at least plausible enough for the
    > linux kernel developers to implement it.


    You are mis-stating the view as your claim above is incoherent. I
    think you'll find that it's not possible to fix it either.

    > In any case, the best solution is to use nonblocking sockets or else to
    > use nonblocking reads. This will work regardless of system implementation.


    Yep. I don't know of any status-reporting function that makes future
    guarantees. The 'select' function is no different from 'access',
    'stat', and dozens of other functions.

    DS

  9. Re: select() question, avoiding blocks on read/write

    David Schwartz wrote:
    > On Dec 4, 9:05 am, David Mathog wrote:
    >
    >> Anyway, the question is this: select()/FD_ISSET/(etc.) are used to
    >> determine which file descriptor has data available, or in some instances
    >> if the file has closed, but it does not explicitly return the number of
    >> bytes available. Imagine an input stream contains data in highly
    >> variable amounts, at a highly variable rate. select() and FD_ISSET show
    >> that data is available on fd. The amount of that data, which is not
    >> revealed, is 100 bytes. Then a read is attempted, since we don't know
    >> how many bytes are available a buffer size is used which is likely to
    >> be larger than that number, for instance:
    >>
    >> ret = read(fd,buffer,256);
    >>
    >> This is where it gets confusing. There is less data than the buffer
    >> will hold. If the file or socket isn't closed right after this 100 byte
    >> block what keeps read() from waiting around for the other 156 bytes?

    >
    > If the application wants the socket to never block, it must set the
    > socket non-blocking. This assures the 'read' will not ever wait for
    > anything.
    >


    So in the hypothetical case above with O_NONBLOCK you're saying that
    POSIX or some other standard states that read will always return with
    ret = 100 and the 100 bytes of data in buffer?

    I'd like to believe you but I just can't find the relevant standards
    sections anywhere, just lots of man pages. For instance:

    http://www.opengroup.org/onlinepubs/...ions/read.html

    says:

    A read() from a STREAMS file can read data in three different modes:
    byte-stream mode, message-nondiscard mode, and message-discard mode.
    The default shall be byte-stream mode.

    and

    In STREAMS message-nondiscard mode, read() shall retrieve data until
    as many bytes as were requested are transferred, or until a message
    boundary is reached.

    which agrees with your statement. But it then goes on to say:

    In byte-stream mode, read() shall accept data until it has read nbyte
    bytes, or until there is no more data to read, or until a zero-byte
    message block is encountered.

    which would appear to disagree, unless there is something about this
    type of "file" which ensures that zero-byte message blocks are
    interspersed between each nonzero-byte message, or there is another
    case involving O_NONBLOCK which they don't document.

    It's also worth noting that the documentation for write() in linux says
    something, which is to my mind, equally confusing:

    EAGAIN Non-blocking I/O has been selected using O_NONBLOCK
    and the write would block.

    So if the output would accept 100 bytes, but would block at 101 or
    higher, and we send it 256, does it return -1 with EAGAIN or 100? This
    is separate from the issue of transmit buffer size, which if that is
    1024, and we send 2048, it will return 1024 since it can't write more
    than that at a time. Another way to phrase this is, if we fill the
    output buffer with a single write, will we be able to write() even a
    single byte to that file descriptor until the write() mechanism has
    completely emptied that buffer? If the data is going out across a
    network the packet size could be much smaller than the transmission
    buffer, and so we could very easily get into a situation where the
    transmission buffer is 99% empty but still has data waiting to go out.
    In that state, will a write of one byte return -1/EAGAIN or 1?

    Regards,

    David Mathog

  10. Re: select() question, avoiding blocks on read/write

    On Dec 5, 8:41 am, David Mathog wrote:

    > > If the application wants the socket to never block, it must set the
    > > socket non-blocking. This assures the 'read' will not ever wait for
    > > anything.


    > So in the hypothetical case above with O_NONBLOCK you're saying that
    > POSIX or some other standard states that read will always return with
    > ret = 100 and the 100 bytes of data in buffer?


    Huh? No, the opposite. With O_NONBLOCK the call will do as much work
    as it can that second but will not ever block. If you meant "without
    O_NONBLOCK", then still no. But whatever it does, it might block.

    > I'd like to believe you but I just can't find the relevant standards
    > sections anywhere, just lots of man pages. For instance:
    >
    > http://www.opengroup.org/onlinepubs/...ions/read.html
    >
    > says:
    >
    > A read() from a STREAMS file can read data in three different modes:
    > byte-stream mode, message-nondiscard mode, and message-discard mode.
    > The default shall be byte-stream mode.
    >
    > and
    >
    > In STREAMS message-nondiscard mode, read() shall retrieve data until
    > as many bytes as were requested are transferred, or until a message
    > boundary is reached.
    >
    > which agrees with your statement. But it then goes on to say:


    Which statement?! I wasn't talking about regular files but about
    sockets.

    > In byte-stream mode, read() shall accept data until it has read nbyte
    > bytes, or until there is no more data to read, or until a zero-byte
    > message block is encountered.
    >
    > which would appear to disagree, unless there is something about this
    > type of "file" which ensures that zero-byte message blocks are
    > interspersed between each nonzero-byte message, or there is another
    > case involving O_NONBLOCK which they don't document.
    >
    > It's also worth noting that the documentation for write() in linux says
    > something, which is to my mind, equally confusing:
    >
    > EAGAIN Non-blocking I/O has been selected using O_NONBLOCK
    > and the write would block.
    >
    > So if the output would accept 100 bytes, but would block at 101 or
    > higher, and we send it 256, does it return -1 with EAGAIN or 100?


    It returns 100.

    > This
    > is separate from the issue of transmit buffer size, which if that is
    > 1024, and we send 2048, it will return 1024 since it can't write more
    > than that at a time. Another way to phrase this is, if we fill the
    > output buffer with a single write, will we be able to write() even a
    > single byte to that file descriptor until the write() mechanism has
    > completely emptied that buffer? If the data is going out across a
    > network the packet size could be much smaller than the transmission
    > buffer, and so we could very easily get into a situation where the
    > transmission buffer is 99% empty but still has data waiting to go out.
    > In that state, will a write of one byte return -1/EAGAIN or 1?


    Most likely, it will return 1.
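    A sketch of the write side, assuming Linux and a local AF_UNIX
    stream socket pair whose other end is never read, so the send buffer
    has to fill up. The name fill_until_eagain is hypothetical. Each
    non-blocking write() either accepts some bytes (possibly fewer than
    requested) or fails with EAGAIN; it never sleeps.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: keeps offering `chunk` bytes to a non-blocking
 * socket until write() reports EAGAIN. Returns the number of bytes
 * accepted before that point, or -1 on any other failure. */
long fill_until_eagain(size_t chunk)
{
    int sv[2], flags;
    long accepted = 0;
    char *data;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    data = calloc(1, chunk);
    flags = fcntl(sv[0], F_GETFL, 0);

    if (data == NULL || flags == -1 ||
        fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        accepted = -1;
    } else {
        for (;;) {
            ssize_t n = write(sv[0], data, chunk);
            if (n > 0) {
                accepted += n;    /* may be fewer than chunk bytes */
                continue;
            }
            if (n == -1 && errno == EINTR)
                continue;
            if (!(n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)))
                accepted = -1;    /* unexpected failure */
            break;                /* buffer full: stop, do not sleep */
        }
    }
    free(data);
    close(sv[0]);
    close(sv[1]);
    return accepted;
}
```

    Calling fill_until_eagain(1 << 20) should return a positive count
    that reflects how much the kernel was willing to buffer. The exact
    figure depends on the system's socket buffer settings, so callers
    have to act on the returned count, not on a guess.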

    I think you are confusing two totally different issues:

    1) What is the typical behavior of blocking and non-blocking read and
    write operations?

    2) what is the guaranteed behavior of blocking and non-blocking read
    and write operations?

    The difference is critical. If 'select' gives us a read hit on a TCP
    socket, we would expect that a subsequent read of any number of bytes
    will not block, but this behavior is *not* guaranteed (no standard
    prohibits the implementation from discarding TCP data if it hasn't
    acknowledged it). So if blocking is a disaster, we still must set the
    socket non-blocking.

    DS

  11. Re: select() question, avoiding blocks on read/write

    Chris Friesen writes:
    > Rainer Weikusat wrote:
    >> David Mathog writes:

    >
    >>>But I thought the purpose of select()/FD_ISSET /etc. was specifically to
    >>>avoid IO blocking.

    >
    >> A lot of people appear to think this for some unknown reason[*], but
    >> it is wrong.

    >
    > I think you're incorrect. The SUS/POSIX definition of select()
    > indicating that a file descriptor is readable is that a subsequent
    > read() will not block.


    The ***-definition of select has somewhat unclear wording on this:

    A descriptor shall be considered ready for reading when a call
    to an input function with O_NONBLOCK clear would not block,

    This does not demand that this condition should be persistent.
    The poll-definition is more explicit in this respect:

    POLLIN
    Data other than high-priority data may be read without
    blocking.
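    For comparison, a minimal sketch of the same readiness query
    expressed with poll() and POLLIN. The names poll_readable and
    poll_readable_demo are assumptions made for this example.

```c
#include <poll.h>
#include <unistd.h>

/* Waits up to timeout_ms for POLLIN on fd.
 * Returns 1 if POLLIN was reported, 0 otherwise, -1 on error. */
int poll_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc == -1)
        return -1;
    return rc > 0 && (pfd.revents & POLLIN) != 0;
}

/* Hypothetical demo on a pipe: returns 1 if POLLIN was absent while
 * the pipe was empty and present once a byte had been queued. */
int poll_readable_demo(void)
{
    int p[2], idle, ready = -1;

    if (pipe(p) == -1)
        return -1;
    idle = poll_readable(p[0], 50);
    if (write(p[1], "x", 1) == 1)
        ready = poll_readable(p[0], 50);
    close(p[0]);
    close(p[1]);
    return idle == 0 && ready == 1;
}
```

    As with select(), the answer describes the moment of the call and
    says nothing about a read() issued later.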

    But you chopped my statement in half and omitted the more important
    part of it:

    The purpose of select/ poll/ etc is to wait for I/O-readiness
    of multiple descriptors at the same time, so that a single
    thread of execution can be 'multiplexed' across more than one
    descriptor.

    The sentence you quoted was supposed to mean 'the purpose of select is
    not specifically to avoid I/O blocking, although a lot of people
    appear to [...], but to wait for [...]'.

    [...]

    >> [*] For some equally unknown reason, a lot of code is written
    >> to block in select/ poll on a single descriptor and call read
    >> after the other call has returned instead of blocking in read.

    >
    > This can be a useful technique if you want to do some time-based
    > operation as well as wait on a single socket.


    The usual 'time-based operation' accomplished this way is to cause a
    gratuitous program failure after some random interval the code author
    dreamed up[*] has passed. For 'serious' time-dependent operations,
    using an interval timer, combined with a suitable mechanism to process
    both timer and I/O events from the same event waiting loop (eg
    self-pipe trick or O_ASYNC [on Linux]) is IMO a better choice.
    [*] eg the cross sum of the bust size of his girl friend,
    multiplied by the square root of PI.

  12. Re: select() question, avoiding blocks on read/write

    David Schwartz wrote:
    > On Dec 4, 9:13 pm, Chris Friesen wrote:


    >>If no other operation is done (by any thread of execution) on the file
    >>descriptor in question after it has been determined to be readable, then
    >>in my opinion a subsequent read should not block. The kernel shouldn't
    >>remove data from a socket buffer once it has received it.

    >
    > Suppose another thread reads the data using a different file
    > descriptor that refers to the same connection. Are you still insisting
    > the data should be there?


    Interesting point. I hadn't considered the case of duped file descriptors.

    > Now, if you amend your claim to "no other operation is done on the
    > same connection", you might have a point. But a connection has two
    > ends, and the other end might do things no matter what your thread
    > does.


    It might be more accurate to amend it to "no other operation is done on
    the underlying kernel data structure corresponding to the socket
    buffer", but that gets pretty verbose.

    Also, if the other end of a connection-oriented socket shuts down,
    doesn't the read return with an errno of EPIPE? I believe this is the
    case for tcp, at least. I don't think there's any way for the other
    end to "suck back" data that has already been received...

    Chris

  13. Re: select() question, avoiding blocks on read/write

    David Mathog writes:

    [recv/ MSG_WAITALL ]

    > The read() documentation is ambiguous on this (key) point - perhaps
    > some implementations of read() wait for a full buffer before
    > returning, and others don't?


    The standard demands that

    If fildes refers to a socket, read() shall be equivalent to
    recv() with no flags set.
    (SUS/ read)

    > That documentation makes me nuts, for
    > instance:
    >
    > EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and no data
    > ^^^^^^^^^^^
    > was immediately available for reading.
    > ^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > Fine, so what does it do when SOME data is available?


    According to the text you quoted, it will certainly not fail with
    errno == EAGAIN (assuming the documentation matches the behaviour of
    the actual implementation).

    > Does it return it immediately or does it block?


    Does the read documentation you are referring to contain an exception
    clause for reading from descriptors with the O_NONBLOCK flag set
    stating that the elsewhere defined meaning of this flag would not be
    applicable to read? If not, why do you think such an exception would
    be possible?

    The general definition of 'Non-Blocking' (SUS 3.240) is

    A property of an open file description that causes function calls
    involving it to return without delay when it is detected that the
    requested action associated with the function call cannot be completed
    without unknown delay.

    For sockets, this is made even more explicit (SUS 2.10.7):

    When the O_NONBLOCK flag is set, functions that would normally
    block until they are complete shall either return immediately
    with an error, or shall complete asynchronously to the
    execution of the calling process. Data transfer operations
    (the read(), write(), send(), and recv() functions) shall
    complete immediately, transfer only as much as is available,
    and then return without blocking, or return an error
    indicating that no transfer could be made without blocking.
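    The three cases in that passage can be sketched on a local AF_UNIX
    stream socket pair (an assumption of this example, as are the names
    would_block and nonblocking_read_demo): no data, some data but less
    than requested, and no data again.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static int would_block(ssize_t n)
{
    return n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}

/* Hypothetical demo: returns the count from a non-blocking read() of
 * up to 256 bytes when 100 are queued, provided the reads before and
 * after it (on an empty socket) failed with EAGAIN. Otherwise -1. */
ssize_t nonblocking_read_demo(void)
{
    int sv[2], flags;
    char out[100], in[256];
    ssize_t got = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    memset(out, 'x', sizeof out);
    flags = fcntl(sv[1], F_GETFL, 0);

    if (flags != -1 &&
        fcntl(sv[1], F_SETFL, flags | O_NONBLOCK) != -1 &&
        would_block(read(sv[1], in, sizeof in)) &&      /* no data */
        write(sv[0], out, sizeof out) == (ssize_t)sizeof out) {
        got = read(sv[1], in, sizeof in);               /* some data */
        if (!would_block(read(sv[1], in, sizeof in)))   /* none left */
            got = -1;
    }
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

    With 100 bytes queued the middle read() should return 100 at once:
    "transfer only as much as is available, and then return".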



  14. Re: select() question, avoiding blocks on read/write

    On Dec 5, 9:22 am, Chris Friesen wrote:

    > It might be more accurate to amend it to "no other operation is done on
    > the underlying kernel data structure corresponding to the socket
    > buffer", but that gets pretty verbose.


    Except that can be done from the other end as well. Try as you might,
    you won't be able to get a guarantee that a blocking socket won't
    block.

    > Also, if the other end of a connection-oriented socket shuts down,
    > doesn't the read return with an errno of EPIPE? I believe this is the
    > case for tcp, at least. I don't think there's any way for the other
    > end to "suck back" data that has already been received...


    Sure there is, it simply issues a "suck back" request using the "suck
    back" protocol. The semantics of 'select', blocking and non-blocking
    reads, and the like aren't TCP-specific. They would apply precisely
    the same to some other stream oriented protocol that had a "suck back"
    option.

    If you insist on a TCP example, consider an implementation that fired
    'select' when it gets a TCP data packet (because had a read been
    blocking at that instant, it would have been satisfied) but then
    doesn't have enough memory to stash the data at that time and so
    discards it and doesn't ACK it. A subsequent 'read' might block until
    the other end retransmits.

    Now no existing implementation actually does this, AFAIK, because it's
    kind of silly. But it's not prohibited by any standard.

    DS

  15. Re: select() question, avoiding blocks on read/write

    On Dec 5, 9:02 am, Rainer Weikusat wrote:

    > This does not demand that this condition should be persistent.
    > The poll-definition is more explicit in this respect:
    >
    > POLLIN
    > Data other than high-priority data may be read without
    > blocking.


    This doesn't mean the condition is persistent. Similar wording could
    be (and in some cases is) used to describe functions like 'access' and
    'stat' and nobody argues that they report conditions that must somehow
    be kept persistent.

    For example, one could perfectly well explain 'access' like this:

    R_OK: The file may be opened for reading.

    For example, my man page for 'access' says:

    Only access bits are checked, not the file type or contents.
    Therefore, if a directory is found to be "writable," it probably
    means that files can be created in the directory, and not that the
    directory can be written as a file. Similarly, a DOS file may be
    found to be "executable," but the execve(2) call will still fail.

    Clearly the "can be created in the directory" means they could have
    been at some point in-between when you called 'access' and when it
    returned.

    To provide an actual future guarantee, something almost unheard of,
    would require very explicit wording. In fact, I can't think of any
    function that provides that kind of guarantee.

    Much real-world code has broken by assuming that functions such as
    'select' and 'access' provided future guarantees. They simply don't.
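
    A minimal sketch of the 'access' case, assuming POSIX. The function
    names open_after_access and open_for_reading are made up for
    illustration. The first relies on a past status report; the second
    attempts the operation and reports what actually happened.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fragile pattern: the answer from access() describes the past. The file
 * can be removed or have its mode changed before open() runs, so the
 * result of open() has to be checked anyway. */
int open_after_access(const char *path)
{
    if (access(path, R_OK) != 0)
        return -1;
    return open(path, O_RDONLY);   /* can still fail */
}

/* Sturdier pattern: attempt the operation and report its outcome.
 * Returns a file descriptor, or -1 with *err set to errno from open(). */
int open_for_reading(const char *path, int *err)
{
    int fd = open(path, O_RDONLY);
    *err = (fd == -1) ? errno : 0;
    return fd;
}
```

    If the file is unlinked between a successful access(path, R_OK) and
    the open, open_for_reading returns -1 with ENOENT, which is the only
    answer that matters to the caller.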

    DS

  16. Re: select() question, avoiding blocks on read/write

    David Schwartz writes:
    > On Dec 5, 9:02 am, Rainer Weikusat wrote:
    >
    >> This does not demand that this condition should be persistent.
    >> The poll-definition is more explicit in this respect:
    >>
    >> POLLIN
    >> Data other than high-priority data may be read without
    >> blocking.

    >
    > This doesn't mean the condition is persistent.


    Actually, even the select statement means the condition is supposed
    to be persistent. It's just easier to confuse 'a read occurring after
    select indicated readable would not block' with 'a read having occurred
    instead of the select would have completed at the same time the select
    had returned' (any correction of the tenses I used would be greatly
    appreciated).

    In both cases, interpreting the statement as not supposed to indicate
    persistence requires additions to the written text which are basically
    claimed to be justified because they fill a semantic void. But outside
    of mathematics, adding suitable definitions for everything not
    otherwise specified generally doesn't make logical sense, because
    adding something to a text changes its meaning.

    > Similar wording could (and in some cases is) used to describe
    > functions like 'access' and 'stat' and nobody argues that they
    > report conditions that must somehow be kept persistent.


    But similar wording is not used to describe access and stat in SUS,
    and the fact that somebody could use similar wording to describe two
    different things proves neither that this description would be correct
    nor that differences outside the scope of the description don't exist.

    > For example, one could perfectly well explain 'access' like this:
    >
    > R_OK: The file may be opened for reading.


    This is an example of an incorrect description. A correct one would be
    "The 'reading allowed' bit was set in the permission bit set checked
    by access".

    > For example, my man page for 'access' says:
    >
    > Only access bits are checked, not the file type or contents.
    > Therefore, if a directory is found to be "writable," it probably
    > means that files can be created in the directory, and not that the
    > directory can be written as a file. Similarly, a DOS file may be
    > found to be "executable," but the execve(2) call will still fail.
    >
    > Clearly the "can be created in the directory" means they could have
    > been at some point in-between when you called 'access' and when it
    > returned.


    This clearly means that the text is incorrect and should be
    'could have been created in the directory'.

    [...]

    > Much real-world code has broken by assuming that functions such as
    > 'select' and 'access' provided future guarantees. They simply don't.


    These are 'simply' two entirely different cases: Access tests a set of
    bits stored in a certain i-node (or filesystem-equivalent of an
    i-node) and changing these bits is part of the normal use of the
    system, which provides the required (standardized) APIs to do
    so. There is no API for emptying some in-kernel data buffer associated
    with 'some I/O channel' except using a suitable input routine with a
    suitable file descriptor. Which processes can have access to a
    suitable file descriptor depends on the capabilities of the system down
    to the device driver level, on interprocess relationships and
    generally on 'the sequence of events which took place since the
    referred-to kernel object was created'. Taking this into account,
    using O_NONBLOCK to avoid blocking would IMO be prudent, except
    under very controlled circumstances, but not necessarily a technical
    requirement.


  17. Re: select() question, avoiding blocks on read/write

    David Schwartz wrote:

    > Much real-world code has broken by assuming that functions such as
    > 'select' and 'access' provided future guarantees. They simply don't.


    As this thread has shown. Apparently in terms of avoiding blocking
    select() is at best an advisory function, indicating those file
    descriptors not likely to block. One still has to put O_NONBLOCK
    on everything to actually avoid blocking. Code could just as well
    poll all the file descriptors and check for EAGAIN and never bother
    calling select() at all. It's less efficient but it would work.
    On the other hand, select() does have the nice ability to "wait here
    until some IO can (probably) be done and then tell us which file
    descriptors to use". That's separate from the blocking issue and
    quite valuable.
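
    A minimal sketch of the O_NONBLOCK approach, assuming POSIX. The
    helper names set_nonblocking and read_nowait, and the -2 return
    code, are invented for illustration. The point is that EAGAIN after
    a "readable" report is treated as an ordinary outcome, not an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt one read on a non-blocking fd.
 * Returns the number of bytes read (> 0), 0 at end of file, -1 on a
 * real error, or -2 if no data is available right now, meaning any
 * readiness reported earlier no longer holds. */
ssize_t read_nowait(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted */

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

    A loop can wait in select() and then call read_nowait(fd, buf,
    sizeof buf) on each reported descriptor; a result of -2 just means
    going back to select(). On pipes and stream sockets the read returns
    whatever is available, so a 256-byte request with 100 bytes pending
    yields 100.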

    Regards,

    David Mathog

  18. Re: select() question, avoiding blocks on read/write

    David Mathog writes:

    [...]

    > Apparently in terms of avoiding blocking select() is at best an
    > advisory function, indicating those file descriptors not likely to
    > block.


    [...]

    > On the other hand, select() does have the nice ability to "wait here
    > until some IO can (probably) be done and then tell us which file
    > descriptors to use".


    [rw@fever]~/build/krb/arm-build/src $whatis select
    select (3) [FD_CLR] - synchronous I/O multiplexing
    select (2) - synchronous I/O multiplexing

    It is somewhat absurd to claim to be able to avoid blocking until some
    I/O-task can be accomplished by means of a subroutine which blocks
    until some I/O task can be accomplished.

  19. Re: select() question, avoiding blocks on read/write

    Rainer Weikusat wrote:

    > It is somewhat absurd to claim to be able to avoid blocking until some
    > I/O-task can be accomplished by means of a subroutine which blocks
    > until some I/O task can be accomplished.


    Based on your previous posts you know better than this and you're just
    trying to pick a fight.

    The claim is that we can avoid blocking until some __specific__ I/O-task
    can be accomplished by means of a subroutine which blocks (or doesn't,
    if you set a timeout of zero) until some I/O task __from within a
    potentially large set__ can be accomplished.
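
    A rough sketch of both modes, assuming POSIX select(). The function
    name check_readable and its 'block' flag are made up for
    illustration; it watches a set of descriptors and either waits or,
    with a zero timeout, returns at once.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: ask select() which of fds[0..n-1] are readable.
 * block != 0: wait until at least one descriptor is reported.
 * block == 0: zero timeout, return at once with the current status.
 * ready[i] is set to 1 if fds[i] was reported, else 0.
 * Returns the number reported, or -1 on error. Each fd must be
 * below FD_SETSIZE. */
int check_readable(const int *fds, int n, int *ready, int block)
{
    fd_set rset;
    struct timeval zero;
    int i, maxfd, rc;

    do {
        /* select() may modify its arguments, so rebuild them each try. */
        FD_ZERO(&rset);
        maxfd = -1;
        for (i = 0; i < n; i++) {
            FD_SET(fds[i], &rset);
            if (fds[i] > maxfd)
                maxfd = fds[i];
        }
        zero.tv_sec = 0;
        zero.tv_usec = 0;
        rc = select(maxfd + 1, &rset, NULL, NULL, block ? NULL : &zero);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        return -1;
    for (i = 0; i < n; i++)
        ready[i] = FD_ISSET(fds[i], &rset) ? 1 : 0;
    return rc;
}
```

    The ready[] flags are a snapshot taken when select() returned, so a
    later read() on a flagged descriptor should still be prepared for
    EAGAIN on a non-blocking descriptor.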

    Chris

  20. Re: select() question, avoiding blocks on read/write

    On Dec 6, 2:09 pm, Chris Friesen wrote:


    > The claim is that we can avoid blocking until some __specific__ I/O-task
    > can be accomplished by means of a subroutine which blocks (or doesn't,
    > if you set a timeout of zero) until some I/O task __from within a
    > potentially large set__ can be accomplished.


    Actually, at best, could have been accomplished. The problem is
    whether a past status report provides a future guarantee.

    DS
