how to figure out where a process hung - Embedded


  1. how to figure out where a process hung

    I've got a problem...

    I have a process run from cron that hangs in an uninterruptible sleep.

    It reads data from a webcam, and writes to a tmpfs partition.

    This process runs every 15 minutes; and most of the time it will run just
    fine, but once in a while, it hangs. Then cron runs it again, and it
    hangs again. Pretty soon I have a bunch of hung processes that consume
    all resources, and for all practical purposes my little system is dead.

    The frustrating thing is that it happens rarely; the process runs every 15
    minutes and sometimes it will run for days just fine, and then it will
    start hanging.

    Is there some way to find out where the process is hung after it is hung
    up?

    The program is spcacat, a very simple snapshot util for webcams using the
    spca driver: . Anyone have any suggestions? I
    have 3 weeks to get this up and running, and that doesn't give me much
    time....

    --Yan

    --
    o__
    ,>/'_ o__
    (_)\(_) ,>/'_ o__
    Yan Seiner, PE (_)\(_) ,>/'_ o__
    Certified Personal Trainer (_)\(_) ,>/'_ o__
    Licensed Professional Engineer (_)\(_) ,>/'_
    Who says engineers have to be pencil necked geeks? (_)\(_)


  2. Re: how to figure out where a process hung

    > hangs in an uninterruptible sleep.

    What does this mean? A sleep() call needs to specify a time, so it
    can't "hang".

    Moreover, AFAIK, a userland process can only do an uninterruptible sleep
    (a very short nanosleep()) if it is assigned very special attributes.

    Is it possible that the process is waiting for some hardware event that
    does not occur due to defective hardware?

    -Michael

  3. Re: how to figure out where a process hung

    Captain Dondo wrote:
    > The frustrating thing is that it happens rarely; the process runs every 15
    > minutes and sometimes it will run for days just fine, and then it will
    > start hanging.
    >
    > Is there some way to find out where the process is hung after it is hung
    > up?


    Try "strace". When the process hangs, attach to it with "strace -p <pid>"
    and you will see where it hangs (if it hangs in a system call). If this
    does not help, try "ltrace" instead.
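
    For a process blocked inside the kernel, it can also help to ask the
    kernel directly where it is waiting. The following is a minimal Python
    sketch, assuming a Linux kernel that exposes /proc/<pid>/wchan,
    /proc/<pid>/syscall and /proc/<pid>/stack (the last two depend on kernel
    version and configuration, and reading stack usually requires root). The
    function names are illustrative and not part of any tool mentioned in
    this thread.

```python
def read_proc(pid, name):
    """Return the stripped contents of /proc/<pid>/<name>, or None if unreadable."""
    try:
        with open(f"/proc/{pid}/{name}") as f:
            return f.read().strip()
    except OSError:
        # File missing (older kernel, non-Linux), process gone, or no permission.
        return None


def parse_syscall(text):
    """Return the syscall number from /proc/<pid>/syscall contents.

    Returns None when the file is unavailable or the task is "running";
    returns -1 when the task is blocked but not inside a system call.
    """
    if not text:
        return None
    fields = text.split()
    if fields[0] == "running":
        return None
    return int(fields[0])


def describe(pid):
    """Collect the kernel's view of where the given process is waiting."""
    return {
        "wchan": read_proc(pid, "wchan"),      # kernel function the task sleeps in
        "syscall": parse_syscall(read_proc(pid, "syscall")),
        "stack": read_proc(pid, "stack"),      # kernel stack trace, if permitted
    }
```

    Calling describe() with the PID of a hung instance and printing the
    result shows which fields the running kernel actually provides; any
    field that cannot be read comes back as None.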

    Hope it helps
    Juergen

  4. Re: how to figure out where a process hung

    Hello,

    > This process runs every 15 minutes; and most of the time it will run just
    > fine, but once in a while, it hangs. Then cron runs it again, and it
    > hangs again. Pretty soon I have a bunch of hung processes that consume
    > all resources, and for all practical purposes my little system is dead.


    Your cron job could kill all running instances before starting a new one.
    That way the resources would stay free and the system would not run into
    trouble. It is not a clean solution, but at least it keeps your resources
    free. You could also log whenever an instance had to be killed (instead
    of exiting cleanly) and maybe find some event that causes the program to
    hang.
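
    A variation on this idea is to bound each run instead of cleaning up in
    the next one: cron starts a small wrapper, and the wrapper kills the
    program and logs the event if it runs too long. Below is a minimal
    Python sketch of that wrapper. The function name and the timeout values
    are assumptions for illustration. Note that SIGKILL only takes effect
    once the process leaves an uninterruptible sleep, so the sketch logs
    that case separately instead of waiting forever.

```python
import logging
import subprocess
import sys

# Log to stderr so cron can mail or redirect the messages.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")


def run_with_timeout(cmd, timeout_s, kill_grace_s=5):
    """Run cmd (a list of arguments) and enforce a time limit.

    Returns the exit code, or None if the process had to be killed.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        logging.warning("pid %d (%s) exceeded %s s, sending SIGKILL",
                        proc.pid, cmd[0], timeout_s)
        proc.kill()
        try:
            proc.wait(timeout=kill_grace_s)
        except subprocess.TimeoutExpired:
            # The kill has not taken effect yet, e.g. uninterruptible sleep.
            logging.error("pid %d still present after SIGKILL", proc.pid)
        return None
```

    The cron entry would then call the wrapper with the real command line
    of the snapshot program and a limit well below the 15-minute interval.
    The timestamps of the logged kills can be compared with other system
    logs to look for a triggering event.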

    > have 3 weeks to get this up and running, and that doesn't give me much
    > time....


    That's just an idea, and it will only help against the resource leak
    from hung processes, not against the underlying problem.

    I'd guess the program waits for something (a camera's event?), but never
    gets it.

    Regards,
    Sebastian



  5. Re: how to figure out where a process hung

    On Tue, 01 Aug 2006 10:19:57 +0200, Michael Schnell wrote:

    >> hangs in an uninterruptible sleep.

    >
    > What does this mean? A sleep() call needs to specify a time, so it
    > can't "hang".
    >
    > Moreover, AFAIK, a userland process can only do an uninterruptible sleep
    > (a very short nanosleep()) if it is assigned very special attributes.
    >
    > Is it possible that the process is waiting for some hardware event that
    > does not occur due to defective hardware?
    >
    > -Michael


    From 'man ps':

    PROCESS STATE CODES
        Here are the different values that the s, stat and state output
        specifiers (header "STAT" or "S") will display to describe the
        state of a process.

        D    Uninterruptible sleep (usually IO)

    These processes show up as 'D', which means they are blocked inside the
    kernel and cannot be killed, not even with SIGKILL, until they wake up.
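
    The same state letter can be read without ps, which makes it easy to
    count or log stuck instances from a script. A minimal Python sketch,
    assuming a Linux /proc filesystem; the function names are made up for
    this example:

```python
import os


def state_of(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is in parentheses and may contain spaces or ')',
    # so split after the LAST closing parenthesis.
    return stat_text.rsplit(")", 1)[1].split()[0]


def name_of(stat_text):
    """Return the command name from the contents of /proc/<pid>/stat."""
    return stat_text[stat_text.index("(") + 1:stat_text.rindex(")")]


def processes_in_state(wanted="D", proc_root="/proc"):
    """Return a list of (pid, name) for all processes in the given state."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no /proc available
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                text = f.read()
        except OSError:
            continue  # process exited while we were scanning
        if state_of(text) == wanted:
            found.append((int(entry), name_of(text)))
    return found
```

    Running processes_in_state("D") from cron before each snapshot and
    logging the result would show when the first instance got stuck.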

    I am guessing that these processes are waiting for some camera event that
    never occurs, but I have not figured out why it happens only sometimes....

    The camera shares the USB bus with a GPS, which is being polled almost
    continuously. I suspect there is some bus contention which triggers this,
    but I have no idea where to start looking; all of the code I've looked at
    looks OK so far.

    --Yan


