Multiplexing timer expirations and packet reception - Linux


  1. Multiplexing timer expirations and packet reception

    Hello everyone,

    (I run Linux 2.6.18)

    Consider sock = socket(PF_INET, SOCK_DGRAM, 0) bound to a specific port,
    and a POSIX real-time timer created with timer_create().

    cf.
    https://linuxlink.timesys.com/resour...time/posix.htm
    http://www.opengroup.org/onlinepubs/...er_create.html
    http://www.opengroup.org/onlinepubs/...etoverrun.html

    In my app, I wait for two events in an infinite loop:

    1) system has received a packet which is available through 'sock'

    2) timer has expired

    When the timer expires, the kernel sends a signal to my process.
    (I could have configured things differently though).

    In order to avoid race conditions, I decided it would be simpler to
    block the signal, and handle it synchronously.

    Pseudo-code:

    Create timer.
    Block signal.
    while (1)
    {
        wait for timer expiration or packet reception (??)
        if (timer expiration)
            do_something;
        else if (packet reception)
            do_another_thing;
        else
            why am I here?
    }

    Which system call lets me wait for either a signal (timer expiration)
    or a packet on a socket? I don't think I can use poll() or select().

    If all my events are signals, I can use sigwaitinfo()

    I know I can ask the kernel to send me a signal (typically SIGIO) when a
    file descriptor is ready. (In Linux, I can ask for a different signal.)

    I request a real-time signal, so that the kernel can queue several
    instances of the same signal. Does the kernel send a signal every time a
    packet is received? I've written a small test program which seems to
    suggest this is the case, but my larger app fails for mysterious
    reasons, and I'm stabbing in the dark at this point :-(
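
    For what it's worth, a minimal sketch of this all-signals approach. It
    assumes Linux's F_SETSIG extension; the signal numbers, the helper names
    (setup_events, wait_event), and the 50 ms one-shot timer are illustrative,
    not taken from the actual app, and error handling is omitted:

```c
/* Both events arrive as blocked realtime signals and are collected
 * synchronously with sigwaitinfo(). F_SETSIG is Linux-specific. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/socket.h>

#define SIG_TIMER  (SIGRTMIN + 0)   /* arbitrary realtime signal choices */
#define SIG_SOCKET (SIGRTMIN + 1)

/* Block both signals, route socket readiness to SIG_SOCKET, and arm a
 * one-shot POSIX timer `ms` milliseconds out (ms < 1000 assumed). */
void setup_events(int sock, sigset_t *set, long ms)
{
    sigemptyset(set);
    sigaddset(set, SIG_TIMER);
    sigaddset(set, SIG_SOCKET);
    sigprocmask(SIG_BLOCK, set, NULL);      /* handle them synchronously */

    fcntl(sock, F_SETOWN, getpid());
    fcntl(sock, F_SETSIG, SIG_SOCKET);      /* queued realtime signal */
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL) | O_ASYNC | O_NONBLOCK);

    timer_t timer;
    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo  = SIG_TIMER;
    timer_create(CLOCK_MONOTONIC, &sev, &timer);

    struct itimerspec its;
    memset(&its, 0, sizeof its);
    its.it_value.tv_nsec = ms * 1000000L;
    timer_settime(timer, 0, &its, NULL);
}

/* The (??) step of the pseudo-code: returns 0 for a timer expiration,
 * 1 for a received packet. */
int wait_event(const sigset_t *set)
{
    siginfo_t info;
    while (sigwaitinfo(set, &info) < 0)
        ;                                   /* retry on EINTR */
    return info.si_signo == SIG_SOCKET;
}
```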

    I'm open to all suggestions.

    Regards.

  2. Re: Multiplexing timer expirations and packet reception

    Spoon wrote:

    > In my app, I wait for two events in an infinite loop:
    >
    > 1) system has received a packet which is available through 'sock'
    >
    > 2) timer has expired
    >
    > When the timer expires, the kernel sends a signal to my process.
    > (I could have configured things differently though).
    >
    > In order to avoid race conditions, I decided it would be simpler to
    > block the signal, and handle it synchronously.


    You could look into using pselect().

    Alternately, one method I've used successfully is to leave the signal
    handler async, but the only thing the signal handler does is write the
    number of the received signal (and possibly other information if you're
    getting fancy with sigaction()) to a pipe. Then you simply monitor the
    pipe and your socket via select().

    Chris

  3. Re: Multiplexing timer expirations and packet reception

    Chris Friesen wrote:

    > Spoon wrote:
    >
    >> In my app, I wait for two events in an infinite loop:
    >>
    >> 1) system has received a packet which is available through 'sock'
    >>
    >> 2) timer has expired
    >>
    >> When the timer expires, the kernel sends a signal to my process.
    >> (I could have configured things differently though).
    >>
    >> In order to avoid race conditions, I decided it would be simpler to
    >> block the signal, and handle it synchronously.

    >
    > You could look into using pselect().


    So far, I've avoided pselect() like the plague.

    The Linux man page states:
    Since version 2.1, glibc has provided an emulation of pselect() that is
    implemented using sigprocmask() and select(). This implementation
    remains vulnerable to the very race condition that pselect() was
    designed to prevent. On systems that lack pselect, reliable (and more
    portable) signal trapping can be achieved using the self-pipe trick
    (where a signal handler writes a byte to a pipe whose other end is
    monitored by select() in the main program.)

    However, it also states:
    pselect() was added to Linux in kernel 2.6.16. Prior to this, pselect()
    was emulated in glibc (but see BUGS).

    It also states:
    Under Linux, select() may report a socket file descriptor as "ready for
    reading", while nevertheless a subsequent read blocks. This could for
    example happen when data has arrived but upon examination has wrong
    checksum and is discarded. There may be other circumstances in which a
    file descriptor is spuriously reported as ready. Thus it may be safer to
    use O_NONBLOCK on sockets that should not block.

    I didn't know a file descriptor could be "incorrectly" marked as ready.
    Can this happen with the epoll infrastructure?

    > Alternately, one method I've used successfully is to leave the signal
    > handler async, but the only thing the signal handler does is write the
    > number of the received signal (and possibly other information if you're
    > getting fancy with sigaction()) to a pipe. Then you simply monitor the
    > pipe and your socket via select().


    I had totally forgotten about the self-pipe trick!
    Thanks for reminding me.
    It seems a bit wasteful though.

    Timer expiration => context switch to signal handler code => write()
    (round-trip to kernel space) => exit from signal handler => back to
    kernel space => finally context switch back into normal execution flow.

    "Catching" the signal with sigwaitinfo seemed like a leaner solution.
    But perhaps I have misplaced misconceptions?

    Anyway, thanks a lot for the suggestion!

  4. Re: Multiplexing timer expirations and packet reception

    On Fri, 26 Jan 2007 12:26:44 -0600, Chris Friesen wrote:
    > Spoon wrote:


    > Alternately, one method I've used successfully is to leave the signal
    > handler async, but the only thing the signal handler does is write the
    > number of the received signal (and possibly other information if you're
    > getting fancy with sigaction()) to a pipe. Then you simply monitor the
    > pipe and your socket via select().
    >


    This is broken. You must deal with the pipe becoming full. The signal
    handler deadlocks if it attempts a write to a full pipe, unless you use
    the pipe in non-blocking mode. However, in that scenario you lose
    atomicity of writes (and I'm not even sure the _reads_ could ever have
    been guaranteed atomic in the first place).

    Using the pipe trick is very common, but you typically use a single byte,
    and often with a fixed nonce since you can't even fit a signal number in
    one byte (unless you used a coding scheme).

    - Bill

  5. Re: Multiplexing timer expirations and packet reception

    On Fri, 26 Jan 2007 23:07:01 +0100, Spoon wrote:

    > It also states:
    > Under Linux, select() may report a socket file descriptor as "ready for
    > reading", while nevertheless a subsequent read blocks. This could for
    > example happen when data has arrived but upon examination has wrong
    > checksum and is discarded. There may be other circumstances in which a
    > file descriptor is spuriously reported as ready. Thus it may be safer to
    > use O_NONBLOCK on sockets that should not block.
    >
    > I didn't know a file descriptor could be "incorrectly" marked as ready.
    > Can this happen with the epoll infrastructure?


    Yes. For the same reason. Also, I've experienced this first-hand
    on Linux 2.4 with UDP sockets signaling read readiness.


    > I had totally forgotten about the self-pipe trick!
    > Thanks for reminding me.
    > It seems a bit wasteful though.
    >
    > Timer expiration => context switch to signal handler code => write()
    > (round-trip to kernel space) => exit from signal handler => back to
    > kernel space => finally context switch back into normal execution flow.
    >
    > "Catching" the signal with sigwaitinfo seemed like a leaner solution.
    > But perhaps I have misplaced misconceptions?


    If signals don't happen all that often, it's not really wasteful. Consider
    what a typical POSIX or SysV mutex has to do. I remember reading an IBM
    Developerworks article comparing Linux and Windows pipes; IIRC, the
    performance of sending small chunks of data over a pipe in Linux was
    comparable to signaling via a Windows thread mutex. Maybe my memory is
    exaggerating a little (I can only find reference to the article, not the
    article itself; at least I tried to check).

  6. Re: Multiplexing timer expirations and packet reception

    William Ahern wrote:

    > Using the pipe trick is very common, but you typically use a single byte,
    > and often with a fixed nonce since you can't even fit a signal number in
    > one byte (unless you used a coding scheme).


    I was working under Linux in particular, where writes of up to one page
    of data are implementation-defined to be atomic.

    Chris

  7. Re: Multiplexing timer expirations and packet reception

    Spoon wrote:

    > I didn't know a file descriptor could be "incorrectly" marked as ready.
    > Can this happen with the epoll infrastructure?


    Certainly. The canonical instance of this is when you have a UDP packet
    with a bad checksum. For performance reasons, the checksum is performed
    while copying the data into the user's buffer (when the cache will be
    hot). This means that at select/poll/epoll time we do not know whether
    the checksum is good or not, only that there is data present. At
    recv/recvmsg/read time, the checksum is found to be bad, and the packet
    is discarded.

    This can do bad things if you are using blocking sockets. Nonblocking
    sockets are unaffected.
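
    A small defensive-read sketch for the spurious-readiness case described
    above (the function name is illustrative): with O_NONBLOCK set, a wakeup
    caused by a discarded bad-checksum datagram yields EAGAIN instead of a
    blocked read.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Returns the datagram length, 0 when the readiness was spurious (or the
 * datagram was zero-length; a real application may want to distinguish
 * the two), or -1 on a genuine error. */
ssize_t recv_if_ready(int sock, void *buf, size_t len)
{
    ssize_t n = recv(sock, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* spurious readiness: go back to select()/epoll */
    return n;
}
```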

    There was a big long email thread on the issue a while back. My
    suggestion was to keep the existing behaviour for nonblocking sockets
    but verify the checksum at select/poll/epoll time for blocking sockets.
    I haven't checked that code lately to see whether they did this or
    left the original behaviour.

    > I had totally forgotten about the self-pipe trick!
    > Thanks for reminding me.
    > It seems a bit wasteful though.


    Pipes are very fast in Linux.

    Chris

  8. Re: Multiplexing timer expirations and packet reception

    William Ahern wrote:

    > Spoon wrote:
    >
    >> I had totally forgotten about the self-pipe trick!
    >> Thanks for reminding me.
    >> It seems a bit wasteful though.
    >>
    >> Timer expiration => context switch to signal handler code => write()
    >> (round-trip to kernel space) => exit from signal handler => back to
    >> kernel space => finally context switch back into normal execution flow.
    >>
    >> "Catching" the signal with sigwaitinfo seemed like a leaner solution.
    >> But perhaps I have misplaced misconceptions?

    >
    > If signals don't happen all that often, it's not really wasteful.


    In my app, I have to arm a timer for each packet received. Therefore, if
    I receive 1000 packets per second (a typical rate for my app), I need to
    handle 1000 signals per second.

    I think I'll give pselect() a try. I wasn't aware that Linux has had a
    real pselect() system call since 2.6.16.

    Regards.

  9. Re: Multiplexing timer expirations and packet reception

    On Sat, 27 Jan 2007 09:21:57 -0600, Chris Friesen wrote:

    > William Ahern wrote:
    >
    >> Using the pipe trick is very common, but you typically use a single byte,
    >> and often with a fixed nonce since you can't even fit a signal number in
    >> one byte (unless you used a coding scheme).

    >
    > I was working under linux in particular, where writes of up to one page
    > of data are implementation-defined to be atomic.
    >


    Your pipe is full of signals of type X, and so your write for signal Y
    fails. What do you do with that signal? You're forced to discard it,
    which could deadlock some task unless you have a [costly] fail-safe
    mechanism. If you didn't, then in order to keep the signal information
    you'd need to manage a possibly huge ring buffer or some other store...
    I'm not even sure that's practically feasible from a signal handler.
    But even in that case, you're still passing the signal state outside of
    the pipe, and using the pipe only as a sort of semaphore.

    libevent, for example, sets a flag in a signal array before writing a nonce
    to the pipe. That works well, because the maximum (and fixed) memory
    requirement is simply N * sizeof sig_atomic_t, as opposed to being tied to
    the number of caught signals, which is potentially unbounded.
    Granted, they're not entirely equivalent in terms of what they do; but only
    one can guarantee that a signal is always acted upon, at least as far
    as user-land is capable (the rest is up to the kernel, but technically the
    kernel can use the same tricks for old-style signal handlers).

    This hints at one of the problems with SIGIO, and it shows up in heavily
    loaded applications. That's why almost all documentation tells SIGIO
    applications to enable a fallback mechanism that is triggered when the
    signal queue overflows.

  10. Re: Multiplexing timer expirations and packet reception

    William Ahern writes:
    > On Fri, 26 Jan 2007 12:26:44 -0600, Chris Friesen wrote:
    >> Spoon wrote:

    >
    >> Alternately, one method I've used successfully is to leave the signal
    >> handler async, but the only thing the signal handler does is write the
    >> number of the received signal (and possibly other information if you're
    >> getting fancy with sigaction()) to a pipe. Then you simply monitor the
    >> pipe and your socket via select().
    >>

    >
    > This is broken. You must deal w/ the pipe becoming full.


    Applications only need to deal with situations that can actually
    happen. The pipe can only become 'full' if more than PIPE_BUF /
    sizeof(information) signals may arrive before another part of the
    program consumes the data.

    > You deadlock if the signal handler attempts a write to the pipe
    > unless you use it in non-blocking mode. However, if you do this in
    > the same scenario you lose atomicity of writes (and I'm not even
    > sure the _reads_ could ever have been guaranteed atomic in the first
    > place).


    This isn't true:

    Reading or writing pipe data is atomic if the size of data
    written is less than PIPE_BUF. This means that the data
    transfer seems to be an instantaneous unit, in that nothing
    else in the system can observe a state in which it is
    partially complete. Atomic I/O may not begin right away (it
    may need to wait for buffer space or for data), but once it
    does begin, it finishes immediately.

    This quote is from the glibc documentation, but APUE contains this in
    various places as well (e.g. on page 430), and SUS mandates it.
