debugging or tracing a multi-threaded program - Unix

  1. debugging or tracing a multi-threaded program


    Hi all,

    I am writing a multi threaded app (basically a small HTTP server that
    uses threadpools) and stress testing it with 20-30 simultaneous client
    connections (sort of like apachebench). (This is on AIX 5.3.)

    The basic design is around a main thread (that accepts client conns
    and select()'s over them to detect client activity) and a bunch of
    worker threads (that process one transaction and return to the pool to
    idle). I am experimenting with this and other multiprocess/multithread
    models.

    I'm facing issues around mutex locking and pthread_cond_wait, and in
    some cases the server just stops, possibly due to deadlock or maybe
    something else. To try and get to the bottom of this, I've used
    printf()s, then went to using write()s to a debug file, but I feel
    both are too slow or otherwise unreliable. I guess my basic problem is
    to trace through the thread activity to see which threads get woken up
    when, and what they do.

    My question is: is there a good (i.e. reliable and efficient) way to
    trace the execution of the program, i.e. when threads become active
    and when they return? In my case, threads are woken up by signals and
    they restore the mask when they go back to idling. I have googled and
    read the manuals for GDB, dbx and others, but they only go as far as
    indicating when threads are created or exit, e.g. GDB's "set print
    thread-events". Also, debuggers give a snapshot view instead of an
    execution view: typing "threads" or "lwps" into dbx shows snapshots
    of what all the threads are doing, and it's hard to piece these
    snapshots together to find out what's happening from start to end.

    I am guessing there are probably 3 alternatives:
    1. write a debugger script that takes these snapshots at high speed
    and dumps them somewhere; but will that perturb the actual application
    too much?
    2. fast write() in the app, but how to synchronize across threads? yet
    another lock, causing more perturbation?
    3. app writes out debug messages to a socket that is read by another
    reader app; but will the OS (AIX 5.3) guarantee that concurrent
    write()s by different threads will be delivered in order?

    Any other suggestions or hints would be appreciated.

    thanks in advance,
    -SNS

  2. Re: debugging or tracing a multi-threaded program

    On Aug 21, 11:20 pm, sam.n.seab...@gmail.com wrote:
    > Hi all,
    >
    > I am writing a multi threaded app (basically a small HTTP server that
    > uses threadpools) and stress testing it with 20-30 simultaneous client
    > connections (sort of like apachebench). (This is on AIX 5.3.)
    >
    > The basic design is around a main thread (that accepts client conns
    > and select()'s over them to detect client activity) and a bunch of
    > worker threads (that process one transaction and return to the pool to
    > idle). I am experimenting with this and other multiprocess/multithread
    > models.
    >
    > I'm facing issues around mutex locking, pthread_cond_wait, and in some
    > cases the server just stops, possibly due to either deadlock or maybe
    > something else. To try and get to the bottom of this, I've used
    > printf()s, then went to using write()s to a debug file, but I feel
    > both are too slow or otherwise unreliable. I guess my basic problem is
    > to trace through the thread activity to see which threads get woken up
    > when, and what they do.
    >
    > My question is: is there a good (i.e. reliable and efficient) way to
    > trace the execution of the program, i.e. when threads become active
    > and when they return? In my case, threads are woken up by signals and
    > they restore the mask when they go back to idling. I have googled and
    > read the manuals for GDB, dbx and others, but they only go as far as
    > indicating when threads are created or exit, e.g. GDB's "set print
    > thread-events". Also, debuggers give a snapshot view instead of an
    > execution view: typing "threads" or "lwps" into dbx shows snapshots
    > of what all the threads are doing, and it's hard to piece these
    > snapshots together to find out what's happening from start to end.
    >
    > I am guessing there are probably 3 alternatives:
    > 1. write a debugger script that takes these snapshots at high speed
    > and dumps them somewhere; but will that perturb the actual application
    > too much?
    > 2. fast write() in the app, but how to synchronize across threads? yet
    > another lock, causing more perturbation?
    > 3. app writes out debug messages to a socket that is read by another
    > reader app; but will the OS (AIX 5.3) guarantee that concurrent
    > write()s by different threads will be delivered in order?
    >
    > Any other suggestions or hints would be appreciated.
    >
    > thanks in advance,
    > -SNS


    Hi,
    this may not be the answer you are looking for, but have you ever used
    static analysis tools? I recently found a tool called FindBugs that
    looks over your Java code and points out anything sketchy you may be
    doing. A friend told me about some similar tools they were using, a
    few of which supported multithreaded applications. This post reminded
    me of them. Maybe you could google for threaded static analysis tools
    to find something to help you?

  3. Re: debugging or tracing a multi-threaded program

    hello,

    On Aug 22, 8:20 am, sam.n.seab...@gmail.com wrote:

    > My question is: is there a good (i.e. reliable and efficient) way to
    > trace the execution of the program, i.e. when threads become active and


    Perhaps you could give the IBM Rational Purify/Quantify package a try.
    It has functionality like this, and I was generally happy with it.

    > 2. fast write() in the app, but how to synchronize across threads? yet
    > another lock, causing more perturbation?


    This is a common myth. The effect is in fact the opposite of
    perturbation: the debug logging serializes the flow, so most programs
    run quite smoothly with debugging on, at least from the user's point
    of view.
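
    A minimal sketch of the "one more lock" approach from option 2, in C
    with POSIX threads. The names trace_event, trace_fd and trace_lock are
    made up for illustration and are not from the original poster's code.
    The message is formatted outside the lock, so the mutex is held only
    for the timestamp, the sequence number and a single write() call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical trace facility: one mutex, one write() per record. */
static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER;
static int trace_fd = 2;           /* stderr by default; may be a file or socket */
static unsigned long trace_seq;    /* record counter, guarded by trace_lock */

/* Emits "seq sec.usec w<worker> message\n". Returns 0 on success, -1 on a
 * failed or short write. */
int trace_event(int worker, const char *fmt, ...)
{
    char msg[200], rec[300];
    struct timeval tv;
    va_list ap;
    ssize_t written;
    int len;

    assert(fmt != NULL);

    /* Format the caller's message before taking the lock. */
    va_start(ap, fmt);
    vsnprintf(msg, sizeof msg, fmt, ap);
    va_end(ap);

    pthread_mutex_lock(&trace_lock);
    /* Timestamp and sequence number are taken under the lock so that
     * their order matches the order of the records in the output. */
    gettimeofday(&tv, NULL);
    len = snprintf(rec, sizeof rec, "%lu %ld.%06ld w%d %s\n",
                   ++trace_seq, (long)tv.tv_sec, (long)tv.tv_usec,
                   worker, msg);
    if (len < 0)
        len = 0;
    if (len >= (int)sizeof rec)
        len = (int)sizeof rec - 1;     /* record was truncated */
    written = write(trace_fd, rec, (size_t)len);
    pthread_mutex_unlock(&trace_lock);

    return written == (ssize_t)len ? 0 : -1;
}
```

    A worker would call something like trace_event(id, "woke up, fd=%d", fd)
    right after pthread_cond_wait() returns and again just before it goes
    idle. The sequence number gives a total order of records even when
    two timestamps are equal.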

    > 3. app writes out debug messages to a socket that is read by another
    > reader app; but will the OS (AIX 5.3) guarantee that concurrent
    > write()s by different threads will be delivered in order?


    Well, I solved this recently by creating a low-priority sender thread
    that writes to the socket. Each log entry gets a buffer from a manager
    that keeps two lock-free queues of buffers (one for empty and one for
    full buffers), fills in the information, and then returns the buffer
    to the manager, which inserts it into the full queue. That queue is
    emptied in the background by the sender thread, and the buffers are
    returned to the pool.
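
    The buffer manager described above could look roughly like this. It
    is a sketch under assumptions, not mojmir's actual code: all names
    (log_init, log_msg, log_drain, NBUF, BUFLEN) are invented, and it
    assumes a compiler with C11 <stdatomic.h>, which an older toolchain
    may not have. The queue is a mutex-free bounded queue of buffer
    indices that uses per-cell sequence numbers and compare-and-swap
    (the design usually attributed to Dmitry Vyukov). Strictly speaking
    a producer that stalls halfway through a push can delay the consumer
    at that cell.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NBUF   64     /* number of log buffers; must be a power of two */
#define BUFLEN 128    /* bytes per log record */

struct cell  { atomic_size_t seq; int idx; };
struct queue { struct cell cells[NBUF]; atomic_size_t head, tail; };

static char bufs[NBUF][BUFLEN];
static struct queue empty_q, full_q;

static void q_init(struct queue *q)
{
    for (size_t i = 0; i < NBUF; i++)
        atomic_init(&q->cells[i].seq, i);
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Returns 1 on success, 0 if the queue is full. */
static int q_push(struct queue *q, int idx)
{
    size_t pos = atomic_load(&q->tail);
    for (;;) {
        struct cell *c = &q->cells[pos & (NBUF - 1)];
        size_t seq = atomic_load(&c->seq);
        if (seq == pos) {
            /* Cell is free for this position: try to claim it. */
            if (atomic_compare_exchange_weak(&q->tail, &pos, pos + 1)) {
                c->idx = idx;
                atomic_store(&c->seq, pos + 1);   /* publish to consumers */
                return 1;
            }                                     /* failed CAS reloaded pos */
        } else if ((ptrdiff_t)(seq - pos) < 0) {
            return 0;
        } else {
            pos = atomic_load(&q->tail);
        }
    }
}

/* Returns 1 and stores a buffer index in *idx, or 0 if the queue is empty. */
static int q_pop(struct queue *q, int *idx)
{
    size_t pos = atomic_load(&q->head);
    for (;;) {
        struct cell *c = &q->cells[pos & (NBUF - 1)];
        size_t seq = atomic_load(&c->seq);
        if (seq == pos + 1) {
            if (atomic_compare_exchange_weak(&q->head, &pos, pos + 1)) {
                *idx = c->idx;
                atomic_store(&c->seq, pos + NBUF); /* free cell for next lap */
                return 1;
            }
        } else if ((ptrdiff_t)(seq - (pos + 1)) < 0) {
            return 0;
        } else {
            pos = atomic_load(&q->head);
        }
    }
}

/* Put every buffer on the empty queue. Call once before starting threads. */
void log_init(void)
{
    q_init(&empty_q);
    q_init(&full_q);
    for (int i = 0; i < NBUF; i++) {
        int ok = q_push(&empty_q, i);
        assert(ok);
        (void)ok;
    }
}

/* Called by workers. Never blocks; returns 0 (message dropped) if no
 * buffer is free. */
int log_msg(const char *msg)
{
    int i;
    if (!q_pop(&empty_q, &i))
        return 0;
    snprintf(bufs[i], BUFLEN, "%s\n", msg);
    q_push(&full_q, i);   /* cannot fail: only NBUF indices exist */
    return 1;
}

/* One pass of the sender thread: write out all full buffers to fd and
 * recycle them. Returns the number of records written. */
int log_drain(int fd)
{
    int i, n = 0;
    while (q_pop(&full_q, &i)) {
        ssize_t r = write(fd, bufs[i], strlen(bufs[i]));
        (void)r;          /* error handling omitted in this sketch */
        q_push(&empty_q, i);
        n++;
    }
    return n;
}
```

    The sender thread would simply loop over log_drain(sock) and sleep
    briefly whenever it returns 0. Dropping messages when the pool is
    exhausted is a design choice of this sketch; it keeps the workers
    from ever waiting on the logger.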

    Another option would be to have a separate file for each thread and
    assemble the log with some external script, thus avoiding the need
    for synchronization.
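
    A sketch of the file-per-thread idea, assuming POSIX threads and
    thread-specific data. The function names and the trace.<id>.log file
    naming are hypothetical, and the caller is assumed to pass a small
    integer id that is unique per thread. Each thread writes only to its
    own stream, so threads never contend for a shared log lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/time.h>

static pthread_key_t  log_key;
static pthread_once_t log_once = PTHREAD_ONCE_INIT;

/* Destructor: closes the thread's file when the thread exits. */
static void log_close(void *fp)
{
    if (fp != NULL)
        fclose((FILE *)fp);
}

static void log_make_key(void)
{
    pthread_key_create(&log_key, log_close);
}

/* Returns the calling thread's private stream, opening trace.<id>.log
 * on first use. Returns NULL if the file cannot be opened. */
static FILE *thread_log(int id)
{
    FILE *fp;

    pthread_once(&log_once, log_make_key);
    fp = pthread_getspecific(log_key);
    if (fp == NULL) {
        char name[64];
        snprintf(name, sizeof name, "trace.%d.log", id);
        fp = fopen(name, "w");
        if (fp != NULL)
            pthread_setspecific(log_key, fp);
    }
    return fp;
}

/* Appends "sec.usec id message\n" to the calling thread's own file.
 * Returns 0 on success, non-zero on failure. */
int thread_trace(int id, const char *fmt, ...)
{
    FILE *fp = thread_log(id);
    struct timeval tv;
    va_list ap;

    assert(fmt != NULL);
    if (fp == NULL)
        return -1;

    gettimeofday(&tv, NULL);
    fprintf(fp, "%ld.%06ld %d ", (long)tv.tv_sec, (long)tv.tv_usec, id);
    va_start(ap, fmt);
    vfprintf(fp, fmt, ap);
    va_end(ap);
    fputc('\n', fp);
    return fflush(fp);
}
```

    Because every line starts with a timestamp, the per-thread files can
    be assembled afterwards with something as simple as
    "sort -n trace.*.log". Events that fall within the same microsecond
    on different threads cannot be ordered this way.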

    hope it helps,
    mojmir

  4. Re: debugging or tracing a multi-threaded program

    sam.n.seaborn@gmail.com writes:

    [...]

    > The basic design is around a main thread (that accepts client conns
    > and select()'s over them to detect client activity) and a bunch of
    > worker threads (that process one transaction and return to the pool to
    > idle). I am experimenting with this and other multiprocess/multithread
    > models.
    >
    > I'm facing issues around mutex locking, pthread_cond_wait, and in some
    > cases the server just stops, possibly due to either deadlock or maybe
    > something else. To try and get to the bottom of this, I've used
    > printf()s, then went to using write()s to a debug file, but I feel
    > both are too slow or otherwise unreliable.


    [...]

    > I am guessing there are probably 3 alternatives:
    > 1. write a debugger script that takes these snapshots at high speed
    > and dumps them somewhere; but will that perturb the actual application
    > too much?
    > 2. fast write() in the app, but how to synchronize across threads? yet
    > another lock, causing more perturbation?
    > 3. app writes out debug messages to a socket that is read by another
    > reader app; but will the OS (AIX 5.3) guarantee that concurrent
    > write()s by different threads will be delivered in order?
    >
    > Any other suggestions or hints would be appreciated.


    Force a core dump when the server has stopped working (e.g. by sending a
    SIGQUIT to it) and use a debugger to determine what your threads are
    doing (or what they are waiting for). At least deadlocks should be
    easily detected this way (I have used this technique to find and
    eliminate quite a few of them from a fairly 'massively' multi-threaded
    server running on a small multi-core [x86, Linux] system).
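
    If the hang is hard to catch by hand, the SIGQUIT can also be sent
    by the process itself. The following is only a sketch of that idea
    and not part of the setup described above: heartbeat_tick,
    heartbeat_stalled and watchdog are invented names, and it assumes
    the main loop calls heartbeat_tick() on every pass (including
    select() timeouts) so that an idle server is not mistaken for a hung
    one. A core file is only written if the core size limit allows it.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static pthread_mutex_t hb_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long hb_count;     /* guarded by hb_lock */

/* Called by the code being watched each time it makes progress. */
void heartbeat_tick(void)
{
    pthread_mutex_lock(&hb_lock);
    hb_count++;
    pthread_mutex_unlock(&hb_lock);
}

/* Returns 1 if no tick has happened since *last_seen was recorded,
 * 0 otherwise. Updates *last_seen to the current count. */
int heartbeat_stalled(unsigned long *last_seen)
{
    unsigned long now;
    int stalled;

    pthread_mutex_lock(&hb_lock);
    now = hb_count;
    pthread_mutex_unlock(&hb_lock);

    stalled = (now == *last_seen);
    *last_seen = now;
    return stalled;
}

/* Watchdog thread body. arg points to the check interval in seconds.
 * If a whole interval passes without a tick, the process sends itself
 * SIGQUIT, whose default action is to terminate with a core dump. */
void *watchdog(void *arg)
{
    unsigned interval;
    unsigned long last = 0;

    assert(arg != NULL);
    interval = *(unsigned *)arg;

    heartbeat_stalled(&last);      /* record the starting count */
    for (;;) {
        sleep(interval);
        if (heartbeat_stalled(&last))
            kill(getpid(), SIGQUIT);
    }
    return NULL;
}
```

    The resulting core can then be opened in the debugger to see what
    each thread was waiting for. Note that this only detects a stall of
    whatever calls heartbeat_tick(); workers that deadlock while the
    main loop keeps running would need their own progress counter.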



  5. Re: debugging or tracing a multi-threaded program

    sam.n.seaborn@gmail.com wrote:
    > Hi all,
    >
    > I am writing a multi threaded app (basically a small HTTP server that
    > uses threadpools) and stress testing it with 20-30 simultaneous client
    > connections (sort of like apachebench). (This is on AIX 5.3.)
    >
    > The basic design is around a main thread (that accepts client conns
    > and select()'s over them to detect client activity) and a bunch of
    > worker threads (that process one transaction and return to the pool to
    > idle). I am experimenting with this and other multiprocess/multithread
    > models.
    >
    > I'm facing issues around mutex locking, pthread_cond_wait, and in some
    > cases the server just stops, possibly due to either deadlock or maybe
    > something else. To try and get to the bottom of this, I've used
    > printf()s, then went to using write()s to a debug file, but I feel
    > both are too slow or otherwise unreliable. I guess my basic problem is
    > to trace through the thread activity to see which threads get woken up
    > when, and what they do.
    >
    > My question is: is there a good (i.e. reliable and efficient) way to
    > trace the execution of the program, i.e. when threads become active
    > and when they return?


    That depends on your platform and tools.

    Sun Studio (for Solaris and Linux) has an excellent profiler that traces
    active threads. It also has an excellent analysis tool called locklint
    (see http://developers.sun.com/solaris/ar.../locklint.html).

    --
    Ian Collins.
