non-blocking SSL_read() API problem - OpenSSL



Thread: non-blocking SSL_read() API problem

  1. non-blocking SSL_read() API problem

    I think I've discovered another problem with the current non-blocking API.

    I have an application which reads data into fixed-size buffers which it
    maintains per session. It uses non-blocking IO and select() when a read
    returns SSL_ERROR_WANT_{READ,WRITE}.

    To conserve memory I reduced the buffer size from 16384 to 8192 and saw
    sessions suddenly hang. A coworker diagnosed this as follows:

    1) The peer sends an SSL record larger than the buffer size.

    2) We receive the SSL record. The socket selects as ready to read.

    3) We call SSL_read with our 8k buffer. The received data does not fit,
    so OpenSSL buffers it internally and returns 8K with SSL_ERROR_WANT_READ.

    4) We call select again for read on the socket (see attached quotation from
    SSL_read manual page!) but it never comes up ready, because OpenSSL has
    internally consumed the data in order to decrypt the SSL record!

    The problem (again! this pervades the non-blocking "API"!) is that a
    single error code is used to indicate two different errors which require
    *different* application behavior. If SSL_ERROR_WANT_READ was returned
    because the application did not supply a buffer of sufficient size, then
    the application must immediately call SSL_read() again -- contradicting
    the manual page description of the API. But if SSL_ERROR_WANT_READ was
    returned because the underlying file descriptor indicated not ready for
    read, the application must immediately call select() or poll() again.

    This can be determined heuristically but it would be far better to return
    a different error code in each case. At the very least, the manual page
    needs to be revised to alert API users to this bug and suggest a workaround
    (I *think* it may be sufficient to always call SSL_read() again if we
    actually got any data but had SSL_ERROR_WANT_READ returned).

    One possible workaround (which is gross, but feasible, I think) is to push
    one byte back onto the socket so select() will DTRT. *Shudder*.
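
    For concreteness, here is a minimal sketch of the read path in question
    (handle_data(), wait_readable() and wait_writable() are hypothetical
    helpers; error handling elided). It is the WANT_READ -> select()
    transition, step 4 above, that hangs once OpenSSL holds the data
    internally:

    #include <openssl/ssl.h>

    void handle_data(const char *buf, int n);   /* hypothetical */
    void wait_readable(int fd);                 /* e.g. select()/poll() */
    void wait_writable(int fd);

    static void read_loop(SSL *ssl, int fd)
    {
        char buf[8192];
        for (;;) {
            int n = SSL_read(ssl, buf, sizeof(buf));
            if (n > 0) {
                handle_data(buf, n);
                continue;
            }
            switch (SSL_get_error(ssl, n)) {
            case SSL_ERROR_WANT_READ:
                wait_readable(fd);   /* step 4: may sleep forever */
                break;
            case SSL_ERROR_WANT_WRITE:
                wait_writable(fd);   /* renegotiation may need a write */
                break;
            default:
                return;              /* error or clean shutdown */
            }
        }
    }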

    Here is the manual page text which seems relevant:

    If the underlying BIO is non-blocking, SSL_read() will also return when
    the underlying BIO could not satisfy the needs of SSL_read() to
    continue the operation. In this case a call to SSL_get_error(3) with the
    return value of SSL_read() will yield SSL_ERROR_WANT_READ or
    SSL_ERROR_WANT_WRITE. As at any time a re-negotiation is possible, a
    call to SSL_read() can also cause write operations! The calling process
    then must repeat the call after taking appropriate action to satisfy
    the needs of SSL_read(). The action depends on the underlying BIO. When
    using a non-blocking socket, nothing is to be done, but select() can be
    used to check for the required condition. When using a buffering BIO,
    like a BIO pair, data must be written into or retrieved out of the BIO
    before being able to continue.

    Thor


  2. RE: non-blocking SSL_read() API problem


    > 3) We call SSL_read with our 8k buffer. The received data does not fit,
    > so OpenSSL buffers it internally and returns 8K with
    > SSL_ERROR_WANT_READ.


    How can it both succeed (returning 8K) and fail (returning
    SSL_ERROR_WANT_READ)?

    DS




  3. Re: non-blocking SSL_read() API problem

    On Thu, Jul 31, 2008 at 11:49:05AM -0700, David Schwartz wrote:
    >
    > > 3) We call SSL_read with our 8k buffer. The received data does not fit,
    > > so OpenSSL buffers it internally and returns 8K with
    > > SSL_ERROR_WANT_READ.

    >
    > How can it both succeed (returning 8K) and fail (returning
    > SSL_ERROR_WANT_READ)?


    Let me trace through the application. Looking at the source code, it
    appears that the application may have a bug (checking the SSL error stack
    via SSL_get_error() when SSL_read() returned > 0) but that what is actually
    happening here is:

    1) SSL_read() is returning < 0, SSL_get_error() is returning WANT_READ

    *but*

    2) Internally, SSL_read has taken the bytes from the socket buffer, so
    calling select() on the fd for read will still never work. There seems
    to be no way for the application to know that what it *really* needs to
    do is retry the call with a larger buffer, that nothing else will suffice.

    In other words, by code inspection, it seems the bug's actually worse than
    I thought. But I'll trace through it too to double-check.

    Thor


  4. RE: non-blocking SSL_read() API problem


    > Let me trace through the application. Looking at the source code, it
    > appears that the application may have a bug (checking the SSL error stack
    > via SSL_get_error() when SSL_read() returned > 0) but that what
    > is actually
    > happening here is:
    >
    > 1) SSL_read() is returning < 0, SSL_get_error() is returning WANT_READ
    >
    > *but*
    >
    > 2) Internally, SSL_read has taken the bytes from the socket buffer, so
    > calling select() on the fd for read will still never work. There seems
    > to be no way for the application to know that what it *really* needs to
    > do is retry the call with a larger buffer, that nothing else
    > will suffice.


    If this is really what's happening, it's a bug in OpenSSL. The application
    should be able to pass a 1-byte buffer to OpenSSL and get 1 byte of
    decrypted data.

    > In other words, by code inspection, it seems the bug's actually worse than
    > I thought. But I'll trace through it too to double-check.


    If that's true, then it's much worse than you thought. OpenSSL is not
    allocating or using sufficient internal buffer space and is returning a
    WANT_READ indication in a case other than the one in which the socket cannot
    supply the data it needs.

    But I suspect you have an application bug. You are manufacturing the
    WANT_READ indication yourself, I strongly suspect.

    DS




  5. Re: non-blocking SSL_read() API problem

    On Thu, Jul 31, 2008 at 01:02:16PM -0700, David Schwartz wrote:
    >
    > > Let me trace through the application. Looking at the source code, it
    > > appears that the application may have a bug (checking the SSL error stack
    > > via SSL_get_error() when SSL_read() returned > 0) but that what
    > > is actually
    > > happening here is:
    > >
    > > 1) SSL_read() is returning < 0, SSL_get_error() is returning WANT_READ
    > >
    > > *but*
    > >
    > > 2) Internally, SSL_read has taken the bytes from the socket buffer, so
    > > calling select() on the fd for read will still never work. There seems
    > > to be no way for the application to know that what it *really* needs to
    > > do is retry the call with a larger buffer, that nothing else
    > > will suffice.

    >
    > If this is really what's happening, it's a bug in OpenSSL. The application
    > should be able to pass a 1-byte buffer to OpenSSL and get 1 byte of
    > decrypted data.


    Consider that when running with the current non-blocking API, once OpenSSL
    takes the data out of the socket buffer, both library and application
    programmer are basically stuck. The socket will never come up selectable
    for read again, but there is no other way for the application to find out
    that there is further data pending internally in OpenSSL.

    If the intended semantics are "select, then loop reading until WANT_READ
    is returned", this is:

    A) Another significant difference between how read(2) and
    SSL_read() operate for nonblocking sockets. Basically all
    these differences cause problems. If SSL_select() or SSL_poll()
    were provided, it could address this by looking at the internal
    buffering; but I think we've argued about the advisability of
    providing SSL_select() in the past.

    B) Still problematic even if code is adapted to this API, because
    you can't use the arrival of data at the socket to enforce
    fairness between peers. Or, rather, you probably can, but only
    at the expense of looping over all sockets you haven't yet
    seen WANT_READ on -- in the worst case (the case where the peer
    writes exactly the application's read buffer size) this at least
    doubles the number of calls to SSL_read() for a given workload.

    Thor


  6. RE: non-blocking SSL_read() API problem


    > > If this is really what's happening, it's a bug in OpenSSL. The
    > > application
    > > should be able to pass a 1-byte buffer to OpenSSL and get 1 byte of
    > > decrypted data.


    > Consider that when running with the current non-blocking API, once OpenSSL
    > takes the data out of the socket buffer, both library and application
    > programmer are basically stuck.


    No, you are seriously mistaken.

    > The socket will never come up selectable
    > for read again, but there is no other way for the application to find out
    > that there is further data pending internally in OpenSSL.


    The application should assume that data is pending unless it knows for a
    fact that no data is pending.

    > If the intended semantics are "select, then loop reading until WANT_READ
    > is returned", this is:


    No. The semantics are:

    1) If you want to read data, call SSL_read.

    2) If SSL_read returns WANT_READ, then select.

    If you don't understand this at a fundamental level, then you are totally
    misusing the OpenSSL non-blocking API.
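
    A minimal sketch of exactly those two rules (ssl, fd, buf and
    wait_readable() assumed in scope):

    int n = SSL_read(ssl, buf, sizeof(buf));              /* rule 1 */
    if (n <= 0 && SSL_get_error(ssl, n) == SSL_ERROR_WANT_READ)
        wait_readable(fd);     /* rule 2: only now select() on the fd */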

    > A) Another significant difference between how read(2) and
    > SSL_read() operate for nonblocking sockets. Basically all
    > these differences cause problems. If SSL_select() or SSL_poll()
    > were provided, it could address this by looking at the internal
    > buffering; but I think we've argued about the advisability of
    > providing SSL_select() in the past.


    Yes, this is a difference between how 'read' and 'SSL_read' work. But it's a
    thoroughly-documented and well-understood one. You may only wait for socket
    data if the OpenSSL API tells you to.

    > B) Still problematic even if code is adapted to this API, because
    > you can't use the arrival of data at the socket to enforce
    > fairness between peers. Or, rather, you probably can, but only
    > at the expense of looping over all sockets you haven't yet
    > seen WANT_READ on -- in the worst case (the case where the peer
    > writes exactly the application's read buffer size) this at least
    > doubles the number of calls to SSL_read() for a given workload.


    I don't follow you at all. You probably have some kind of architecture in
    mind and you've argued that it's broken. Well, then don't use that
    architecture, since it's broken. If you're arguing that non-broken
    architectures are not possible, well, that's just not true.

    DS




  7. RE: non-blocking SSL_read() API problem


    Let me just state this one more time, another way, to help people wrap their
    brains around it. The OpenSSL library is a black box. You are not supposed to
    look inside the black box.

    If you want to get decrypted plaintext from the black box, the black box may
    or may not need to read data from the socket to get it. You don't know, and
    you're not supposed to know. When you want to read data, you're supposed to
    call SSL_read.

    Now one of the things that might happen when you call SSL_read is that the
    black box has no data for you. But you have no way to know this until you
    ask it. If it has no data for you, it will tell you why. Maybe it needs to
    read from the socket. Maybe it needs to write to the socket.

    But until it tells you, you have no idea.

    Yes, you really do know that OpenSSL typically has to read encrypted data
    from the socket to give you unencrypted data. But this secret knowledge of
    the internals of SSL is not supposed to be in your code. Your code is
    supposed to be agnostic. All it knows is that OpenSSL gives it decrypted
    data.

    Your code should be just as prepared for SSL_read to return WANT_WRITE as
    WANT_READ. Why? Because OpenSSL is a black box that sometimes needs to read
    and sometimes needs to write. You should not ever assume that waiting for
    data to read on the socket means plaintext will arrive. It might, but your
    knowledge that it will is knowledge of SSL internals that your code should
    *not* have.

    So when you say:

    >Consider that when running with the current non-blocking API, once OpenSSL
    >takes the data out of the socket buffer, both library and application
    >programmer are basically stuck. The socket will never come up selectable
    >for read again, but there is no other way for the application to find out
    >that there is further data pending internally in OpenSSL.


    The answer is -- of course there is. The application simply asks OpenSSL if
    there is further data pending. If OpenSSL cannot make further forward
    progress without reading from the socket, it will tell the application.
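
    (One concrete way for the application to ask, as a sketch: SSL_pending()
    reports plaintext already decrypted and buffered inside the SSL object.)

    if (SSL_pending(ssl) > 0) {
        /* plaintext is already buffered: call SSL_read() again now,
           do not sleep in select() */
    } else {
        /* nothing buffered: if the last SSL_read() reported WANT_READ,
           it is safe to wait on the socket */
    }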

    DS




  8. Re: non-blocking SSL_read() API problem

    On Thu, Jul 31, 2008 at 05:14:09PM -0700, David Schwartz wrote:
    >
    > Let me just state this one more time one other way to help people wrap their
    > brain around it. The OpenSSL library is a black box. You are not supposed to
    > look inside the black box.
    >
    > If you want to get decrypted plaintext from the black box, the black box may
    > or may not need to read data from the socket to get it. You don't know, and
    > you're not supposed to know. When you want to read data, you're supposed to
    > call SSL_read.


    Perhaps if you intended to have an API that didn't act like Unix read()
    and write() it should not have used those names...

    ...and yes, I'm aware that it works like read() and write() until you
    set non-blocking I/O. Then the semantics are (as you've just pointed out)
    different.

    Care to explain why you can't discuss how the API might or might not work
    without throwing around gratuitous insults? This last message to which
    I'm responding is merely condescending; the previous was downright
    insulting and offensive. I can't see how that helps anyone. And after
    all, it's not like the current mess of a non-blocking API is your design
    nor your code, at least not as far as I can tell.

    The problem with "read until there's no more" semantics is that you can't
    really use them to do fair I/O in a traditional Unix single-threaded
    event-driven model. If I have no way to find out whether there might be
    more I/O available to drain except to try to drain it, I have to either:

    1) Service each client in turn whether or not they're ready, since I can't
    tell whether they're ready without paying the whole cost of the read()
    and the decryption

    or

    2) Fully drain everything each client might have cared to write me each
    time I find that he's ready at all. This allows a single client to
    consume as much of a server's time as it cares to, to the detriment
    of the others.

    The traditional Unix non-blocking semantics where you can stop reading
    whenever you like and sleep (via select or poll on _many_ streams at once)
    to find out who's got more for you don't have this problem. So many, many
    event-driven Unix applications are written to do precisely that.

    If I'm hearing you correctly you are saying that not only cannot one do
    that with OpenSSL, one ought not want to do such a thing. I do not
    grasp why.

    Incidentally, if one's not intended to peek under the hood, again, I ask
    why OpenSSL _encourages_ this by providing no sleep-for-IO mechanism
    which does not, in fact, _require_ peeking under the hood.

    Furthermore, the SSL_read() documentation, which I was so foolish as to
    use as my guidance to this portion of the API, explicitly says to use
    select() to find out if I/O's available after receiving a WANT_READ or
    WANT_WRITE error. You _cannot do that_ without peeking under the hood,
    because you of course must break the abstraction and get the BIO's
    file descriptor to feed to select!
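
    For illustration, that abstraction-breaking step looks roughly like this
    (a sketch; error checks elided, ssl assumed in scope):

    #include <openssl/ssl.h>
    #include <sys/select.h>

    int fd = SSL_get_fd(ssl);        /* reach under the hood for the fd;
                                        BIO_get_fd(SSL_get_rbio(ssl), NULL)
                                        is the BIO-level equivalent */
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    select(fd + 1, &rfds, NULL, NULL, NULL);   /* sleep until readable */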

    I can think of a number of ways to support applications that would like
    to do something like typical Unix event-driven multiplexed I/O by
    augmenting, rather than altering, the existing API. But if they're
    just going to be met by a barrage of condescension or insults, I'm not
    sure it's worth bothering to discuss...

    Thor


  9. Re: non-blocking SSL_read() API problem

    Thor Lancelot Simon wrote:
    > I think I've discovered another problem with the current non-blocking API.
    >
    > I have an application which reads data into fixed-size buffers which it
    > maintains per session. It uses non-blocking IO and select() when a read
    > returns SSL_ERROR_WANT_{READ,WRITE}.
    >
    > To conserve memory I reduced the buffer size from 16384 to 8192 and saw
    > sessions suddenly hang. A coworker diagnosed this as follows:
    >
    > 1) The peer sends an SSL record larger than the buffer size.
    >
    > 2) We receive the SSL record. The socket selects as ready to read.
    >
    > 3) We call SSL_read with our 8k buffer. The received data does not fit,
    > so OpenSSL buffers it internally and returns 8K with SSL_ERROR_WANT_READ.
    >

    The record size of the SSL record is predetermined by the sender with
    16k being the maximum size specified by the protocol.

    In order to return the (decrypted and authenticated) data to the
    application, the full record must have been received, as the MAC
    (Message Authentication Code) is at the end of the record and
    checking it requires calculating the hash over the complete record
    anyway. Hence, SSL_read() will only return and provide data once
    the full record has been received from the underlying socket.

    As SSL communication, just like TCP, is stream oriented, there is no
    way for an application to know the size of a record, whether
    data is split over several records or sent in one, or whether several
    pieces of information inside the stream were combined into just one record.

    Hence you have to read from the stream with SSL_read() until there is
    no more data to be retrieved. As long as there are bytes available,
    SSL_read() will return the number of bytes written to the buffer.
    A following call to SSL_get_error(ssl, ret) will return
    SSL_ERROR_NONE. The logic here is quite simple: the first test in
    ssl/ssl_lib.c:SSL_get_error() is:
    if (i > 0) return(SSL_ERROR_NONE);

    Hence the scenario you describe here (returned 8k and
    SSL_ERROR_WANT_READ at the same time) is technically impossible as
    long as you did call SSL_get_error() with the correct return value of
    SSL_read().

    To find out whether there actually is decrypted data available to be read
    you may use SSL_pending() before calling SSL_read().
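
    A minimal sketch of that drain pattern (consume() is a hypothetical
    handler; error handling elided):

    int n;
    do {
        char buf[8192];
        n = SSL_read(ssl, buf, sizeof(buf));
        if (n > 0)
            consume(buf, n);                  /* hypothetical */
    } while (n > 0 && SSL_pending(ssl) > 0);  /* drain buffered plaintext */
    /* only once SSL_pending() is 0 may the application sleep in select() */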

    I have written quite a number of applications using non-blocking I/O,
    and while the shutdown behavior in particular is questionable to say the
    least, the actual state machine never gave me any trouble.

    Note: I did not invent the API, I just wrote the manual pages :-)

    Best regards,
    Lutz


  10. Re: non-blocking SSL_read() API problem

    On Fri, Aug 01, 2008 at 03:49:01PM +0200, Lutz Jaenicke wrote:
    > Thor Lancelot Simon wrote:
    >
    > The record size of the SSL record is predetermined by the sender with
    > 16k being the maximum size specified by the protocol.


    32K for SSLv2, no?

    > In order to return the (decrypted and authenticated) data to the
    > application, the full record must have been received, as the MAC
    > (Message Authentication Code) is at the end of the record and
    > checking it requires calculating the hash over the complete record
    > anyway. Hence, SSL_read() will only return and provide data once
    > the full record has been received from the underlying socket.


    Yes, I understand this. The problem is that since the API doesn't include
    SSL_select() or SSL_poll(), there's no way for an application to sleep
    once SSL_read() consumes the data out of the socket buffer. This means
    SSL_read() can't work quite like read(2) here -- it requires the "read to
    completion" behavior you mention.

    This leads to another problem, actually:

    A malicious peer which sends data as fast as it can, can get _more_ data
    into the socket buffer while the application is trying to "read to
    completion". This can deny service to _other_ peers.

    Basically an event-driven application which had an event loop like this
    (which worked with the Unix model):

    while (1)
    {
        select(....)

        for (selected writable) {
            write(it all);
        }
        for (selected readable) {
            read(fixed size for fairness);
        }
    }

    Now has to do something like this:

    while (1)
    {
        select(....)

        for (selected writable) {
            SSL_write(it all);
        }
        for (SSL_pending() was true after last read) {
            SSL_read(another fixed size chunk);
            if (SSL_pending(this SSL)) {
                flag as "more coming" in private datastructure;
            }
        }
        for (selected readable) {
            SSL_read(fixed size for fairness);
            if (SSL_pending(this SSL)) {
                flag as "more coming" in private datastructure;
            }
        }
    }
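
    A more concrete rendering of the read side of that second loop (a
    sketch; the session struct, its more_pending flag, and consume() are
    hypothetical):

    #include <openssl/ssl.h>

    struct session {
        SSL *ssl;
        int  fd;
        int  more_pending;   /* plaintext still buffered inside OpenSSL */
    };

    void consume(struct session *s, const char *buf, int n);  /* hypothetical */

    /* Called for every session on each pass through the event loop. */
    void service_read(struct session *s, int selected_readable)
    {
        if (!selected_readable && !s->more_pending)
            return;                            /* nothing to do */

        char buf[8192];                        /* fixed size for fairness */
        int n = SSL_read(s->ssl, buf, sizeof(buf));
        if (n > 0)
            consume(s, buf, n);
        /* select() never fires for data OpenSSL has already consumed,
           so remember that this session must be read again next pass */
        s->more_pending = (SSL_pending(s->ssl) > 0);
    }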

    This will work, but it will require restructuring the event loops of
    many applications written to expect the Unix way (which is not great, but
    so long as it's documented, it's better).

    The SSL_read() manual page should at least mention that it's unsafe to
    call select again if SSL_pending() comes true. At present, it doesn't
    mention SSL_pending() at all.

    And this will work only as long as we're guaranteed SSL_pending() will
    never actually read from the socket buffer, which someone might want to
    make a note of somewhere!

    > Hence the scenario you describe here (returned 8k and
    > SSL_ERROR_WANT_READ at the same time) is technically impossible as
    > long as you did call SSL_get_error() with the correct return value of
    > SSL_read().


    Yes. That's not what occurred, as I determined when I traced through a
    run of the application. Rather, the application put the fd in its select
    set for read without SSL_get_error() returning WANT_READ, because it was
    trying to enforce fairness among clients (see above). It looks like this
    can be done via use of SSL_pending().

    > Note: I did not invent the API, I just wrote the manual pages :-)


    So noted! And thanks for taking the time to explain this to me!

    Thor


  11. Re: non-blocking SSL_read() API problem

    Thor Lancelot Simon wrote:
    > On Fri, Aug 01, 2008 at 03:49:01PM +0200, Lutz Jaenicke wrote:
    >
    >> Thor Lancelot Simon wrote:
    >>
    >> The record size of the SSL record is predetermined by the sender with
    >> 16k being the maximum size specified by the protocol.
    >>

    >
    > 32K for SSLv2, no?
    >

    I stopped caring for SSLv2 quite some time ago.

    >> In order to return the (decrypted and authenticated) data to the
    >> application, the full record must have been received, as the MAC
    >> (Message Authentication Code) is at the end of the record and
    >> checking it requires calculating the hash over the complete record
    >> anyway. Hence, SSL_read() will only return and provide data once
    >> the full record has been received from the underlying socket.
    >>

    >
    > Yes, I understand this. The problem is that since the API doesn't include
    > SSL_select() or SSL_poll(), there's no way for an application to sleep
    > once SSL_read() consumes the data out of the socket buffer. This means
    > SSL_read() can't work quite like read(2) here -- it requires the "read to
    > completion" behavior you mention.
    >

    Yes.

    > This leads to another problem, actually:
    >
    > A malicious peer which sends data as fast as it can, can get _more_ data
    > into the socket buffer while the application is trying to "read to
    > completion". This can deny service to _other_ peers.
    >


    This type of fairness has to be implemented by the application.
    This will include modifying the event handling.

    > Basically an event-driven application which had an event loop like this
    > (which worked with the Unix model):
    >
    > while (1)
    > {
    >     select(....)
    >
    >     for (selected writable) {
    >         write(it all);
    >     }
    >     for (selected readable) {
    >         read(fixed size for fairness);
    >     }
    > }
    >
    > Now has to do something like this:
    >
    > while (1)
    > {
    >     select(....)
    >
    >     for (selected writable) {
    >         SSL_write(it all);
    >     }
    >     for (SSL_pending() was true after last read) {
    >         SSL_read(another fixed size chunk);
    >         if (SSL_pending(this SSL)) {
    >             flag as "more coming" in private datastructure;
    >         }
    >     }
    >     for (selected readable) {
    >         SSL_read(fixed size for fairness);
    >         if (SSL_pending(this SSL)) {
    >             flag as "more coming" in private datastructure;
    >         }
    >     }
    > }
    >
    > This will work, but it will require restructuring the event loops of
    > many applications written to expect the Unix way (which is not great, but
    > so long as it's documented, it's better).
    >

    What kind of application are you talking about?
    So far I have not seen any application that collects data from different
    peers in an "amount of data sent" round-robin fashion. More or
    less every application tends to have a higher-level protocol with
    handshake etc. that is actually responsible for dealing with the peers'
    data.

    Even though the interface might seem read()/write() compatible at
    first glance, it ultimately is not and probably never can be, as the protocol
    uses bidirectional traffic for both read and write, such that a simple
    "select" model cannot be used.

    > The SSL_read() manual page should at least mention that it's unsafe to
    > call select again if SSL_pending() comes true. At present, it doesn't
    > mention SSL_pending() at all.
    >

    That is true indeed.

    > And this will work only as long as we're guaranteed SSL_pending() will
    > never actually read from the socket buffer, which someone might want to
    > make a note of somewhere!
    >

    It had better not, because SSL_pending() should be
    usable in the blocking case as well, which would not hold if it
    actually tried to go down the chain.

    Best regards,
    Lutz


  12. RE: non-blocking SSL_read() API problem


    > Care to explain why you can't discuss how the API might or might not work
    > without throwing around gratuitous insults?


    They are warnings, not insults. I'm sorry you see them that way.

    > This last message to which
    > I'm responding is merely condescending; the previous was downright
    > insulting and offensive. I can't see how that helps anyone. And after
    > all, it's not like the current mess of a non-blocking API is your design
    > nor your code, at least not as far as I can tell.


    You are reporting bugs in an API you do not understand. You are using an API
    that you fundamentally do not understand. I'm simply pointing this out and
    warning you that this is not a good thing to do.

    > The problem with "read until there's no more" semantics is that you can't
    > really use them to do fair I/O in a traditional Unix single-threaded
    > event-driven model.


    Well, you can't be fair in a traditional Unix single-threaded event-driven
    model anyway. If one client, for example, triggers a rare unusual condition
    that your code has not handled before, and the code to handle it needs to
    fault in, all clients have to wait while that happens. If the local disk is
    busy, well, they all sit there.

    > If I have no way to find out whether there might be
    > more I/O available to drain except to try to drain it, I have to either:
    >
    > 1) Service each client in turn whether or not they're ready, since I can't
    > tell whether they're ready without paying the whole cost of the read()
    > and the decryption
    >
    > or
    >
    > 2) Fully drain everything each client might have cared to write me each
    > time I find that he's ready at all. This allows a single client to
    > consume as much of a server's time as it cares to, to the detriment
    > of the others.


    If you want a limit on how much data you read from a single client, just
    read only up to that limit. You'd have the same issue with 'select'. When
    'select' tells you that there's data from a client, you have no idea how
    much, and the only way to tell is to read it.

    > The traditional Unix non-blocking semantics where you can stop reading
    > whenever you like and sleep (via select or poll on _many_ streams at once)
    > to find out who's got more for you don't have this problem. So many, many
    > event-driven Unix applications are written to do precisely that.


    I still don't see what the problem is. In either case, if you don't fully
    drain the connection, you have to come back and read the rest later. If you
    want to call 'poll' or 'select' to discover more sockets, you can.

    The only difference is that in one case, you can use 'poll' or 'select' to
    rediscover the sockets you already discovered and in the other case you have
    to keep track. But this is as simple as an extra 'or' clause in the 'if' of
    your poll/select loop.

    Instead of 'if the poll/select discovered this socket' then read, it's 'if
    the poll/select discovered this socket or I wasn't waiting for it to be
    discovered' then read.
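
    As a sketch (the waiting_for_select flag is hypothetical per-session
    state recording whether the last SSL_read() returned WANT_READ):

    if (FD_ISSET(s->fd, &readfds) || !s->waiting_for_select)
        service_read(s);     /* read even though select() did not fire */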

    > If I'm hearing you correctly you are saying that not only cannot one do
    > that with OpenSSL, one ought not want to do such a thing. I do not
    > grasp why.


    Because single-threaded, poll/select loop applications are one of the
    poorest design architectures there can be. If any line of code anywhere
    unexpectedly blocks, the entire server is toast. This means that not only
    the 20% of the code that's really performance critical has to be designed
    carefully, but even the parts that shouldn't be performance critical
    *SURPRISE* are, because any unexpected blocking kills the server. (This is
    why most IRC servers are so bursty, by the way.)

    > Incidentally, if one's not intended to peek under the hood, again, I ask
    > why OpenSSL _encourages_ this by providing no sleep-for-IO mechanism
    > which does not, in fact, _require_ peeking under the hood.


    OpenSSL tells you when to sleep for I/O. You should not sleep for I/O unless
    told to, because you have no way to know whether or not sleeping for I/O is
    appropriate.

    > Furthermore, the SSL_read() documentation, which I was so foolish as to
    > use as my guidance to this portion of the API, explicitly says to use
    > select() to find out if I/O's available after receiving a WANT_READ or
    > WANT_WRITE error. You _cannot do that_ without peeking under the hood,
    > because you of course must break the abstraction and get the BIO's
    > file descriptor to feed to select!


    This is where OpenSSL explicitly tells you -- you must peek under the hood.
    The point is that you cannot know when to do this. OpenSSL has to tell you.
    You are right though, that's the wrinkle.

    > I can think of a number of ways to support applications that would like
    > to do something like typical Unix event-driven multiplexed I/O by
    > augmenting, rather than altering, the existing API. But if they're
    > just going to be met by a barrage of condescension or insults, I'm not
    > sure it's worth bothering to discuss...


    The problem with some kind of SSL_poll or SSL_select is that it doesn't
    actually solve the problem. Suppose instead of just using OpenSSL, you were
    also using OpenFOO. Now your thread can't block in SSL_select because it
    needs to block in FOO_select to also properly handle FOO protocol sockets.

    So what you'd wind up doing is writing your own code that checks all the SSL
    sockets to see which it is appropriate to 'select' on, checks all the FOO
    sockets to see which it is appropriate to 'select' on, and then calls
    'select' on the mixed combo for which it's appropriate, then sends a
    combined report.

    But that's exactly what you should, and can, do now. This is the case where
    you have to peek under the hood, and you can't make that go away. At least,
    not any way that I can think of.

    DS




  13. Re: non-blocking SSL_read() API problem

    Thor Lancelot Simon wrote:
    > I have an application which reads data into fixed-size buffers which it
    > maintains per session. It uses non-blocking IO and select() when a read
    > returns SSL_ERROR_WANT_{READ,WRITE}.
    >
    > To conserve memory I reduced the buffer size from 16384 to 8192 and saw
    > sessions suddenly hang. A coworker diagnosed this as follows:


    To be clear, which buffer?

    Case 1) The internal one inside libssl.so used to decode a full packet?
    (which, as others have commented, is set by the protocol standard)

    Case 2) The application-space buffer used to receive decrypted
    application data, which is usually passed to SSL_read()?



    If you chose "case 1" then the issue is you may be violating the protocol
    spec (maybe there is a negotiation option/setting to lower the max
    packet size).


    If you chose "case 2" then I see no reason why SSL should hang; if your
    application chooses to read 1 byte of application data at a time with
    SSL_read(), that should work 100%.

    What OpenSSL does internally is decode the full packet and maintain
    pointers to the data left over from the previous packet; once there is no
    more data left to convey to the application, it attempts to read more data
    in from the lower layers (from the socket) until it gets another full packet.

    Yes, all data is double buffered (technically triple buffered, since
    there is encrypted data, then unencrypted data, then application space);
    maybe a new OpenSSL API could allow requesting a read-only pointer
    and length for new data, to remove the extra buffering overhead, since
    SSL_read() does a full copy (from unencrypted data to application space).

    Sorry for not taking the time to read every email in this thread today.


    Darryl


  14. Re: non-blocking SSL_read() API problem

    Lutz Jaenicke wrote:
    > Thor Lancelot Simon wrote:
    >> On Fri, Aug 01, 2008 at 03:49:01PM +0200, Lutz Jaenicke wrote:
    >> This leads to another problem, actually:
    >>
    >> A malicious peer which sends data as fast as it can, can get _more_ data
    >> into the socket buffer while the application is trying to "read to
    >> completion". This can deny service to _other_ peers.
    >>

    >
    > This type of fairness has to be implemented by the application.
    > This will include modifying the event handling.


    Exactly: simply process only so many I/Os per SSL handle at any one time.


    I do this for multi-threaded server-like applications; there are 2 basic
    cases:

    * You received a unit-of-work (i.e. your application called SSL_read()
    and got everything it needed to carry out some further processing): you
    execute that unit-of-work and always force yourself to move on to the
    next stream of data, even if there is more data to look at on the one
    you just processed the unit-of-work for. Sometimes you might want to
    ensure you flush any data that was written due to the unit-of-work
    processing (before looking for the next unit of work); this stops write
    request-response deadlocks.

    * You called SSL_read() 3 times but you were not able to decode a
    unit-of-work, i.e. there is still more data needed to assemble a valid
    unit-of-work, so again you force yourself to move on to the next
    stream of data, even if there is more data to look at to assemble the
    unit of work. This stops someone starving others with very small packet
    sizes and huge units of work (a rough sketch follows).
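
    A rough sketch of that per-pass budget (try_read_unit(), sessions and
    nsessions are hypothetical):

    for (int i = 0; i < nsessions; i++) {
        int budget = 3;                  /* e.g. at most 3 reads per pass */
        while (budget-- > 0 && try_read_unit(&sessions[i]))
            ;                            /* then move on to the next peer */
    }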


    YMMV

    Darryl

