debugging a hung WebSphere JVM - Websphere



Thread: debugging a hung WebSphere JVM

  1. debugging a hung WebSphere JVM

    Hi,
    I'm looking for some help with debugging hung JVMs. We have run into this issue a couple of times recently and were forced to restart the server to recover.

    We are using WebSphere 6.1.0.13 ND on Solaris 10.

    In both cases, thread dump analysis showed one thread stuck here:
    "WebContainer : 89" daemon prio=10 tid=0x01683c58 runnable (0x73f7d000..0x73f7faf0)
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:260)
    at com.ibm.ejs.ras.WrappingFileOutputStream.write(WrappingFileOutputStream.java:364)
    - locked (0x97ff0230) (a com.ibm.ejs.ras.WrappingFileOutputStream)
    at java.io.PrintStream.write(PrintStream.java:412)

    During this call the thread had also acquired a lock (0x9945bbf8) on an org.apache.log4j.spi.RootCategory.
    All other threads were waiting to lock that same monitor (0x9945bbf8), and kept waiting forever.
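
    The "one holder, many waiters" picture above can be pulled out of a dump mechanically. Below is a rough sketch in Python (my choice of language, nothing WebSphere-specific). It assumes the usual Sun JVM dump layout: a quoted thread name on the header line, then "- locked" and "- waiting to lock" lines carrying the monitor address. The function name lock_summary is made up for illustration.

```python
import re
from collections import defaultdict

# Thread header lines start with the quoted thread name.
THREAD_RE = re.compile(r'^\s*"([^"]+)"')
# Monitor lines. The JVM normally prints <0x...>; the dump pasted in this
# thread shows (0x...), so both bracket styles are accepted (assumption).
LOCK_RE = re.compile(r'-\s+(locked|waiting to lock)\s+[<(](0x[0-9a-fA-F]+)[>)]')

def lock_summary(dump_text):
    """Return {monitor address: {"holders": [...], "waiters": [...]}}."""
    summary = defaultdict(lambda: {"holders": [], "waiters": []})
    current = None
    for line in dump_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = header.group(1)
            continue
        lock = LOCK_RE.search(line)
        if lock and current is not None:
            role = "holders" if lock.group(1) == "locked" else "waiters"
            names = summary[lock.group(2)][role]
            if current not in names:  # a monitor can be re-entered
                names.append(current)
    return dict(summary)

# Usage idea: lock_summary(open("threaddump.txt").read())
# ("threaddump.txt" is a hypothetical file holding one saved dump)
```

    A monitor with one holder and a long list of waiters, where the holder is itself runnable inside a native write, matches the situation described here.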

    Is there any way to figure out why the thread ("WebContainer : 89") is not able to finish writeBytes (a native method)?
    How can I debug this further from the Solaris side? Is there a system interrupt or signal I can send in this situation that would help? I was going through an article in which someone suggested sending SIGWAITING (for Solaris 8). Will that help on Solaris 10, or is there an equivalent or some other way to overcome this issue?

    Any pointers or help would be highly appreciated.

    Thanks
    Niraj Nath

  2. Re: debugging a hung WebSphere JVM

    On Wed, 17 Dec 2008 13:21:54 -0500, niraj_nath wrote:

    > We are using WebSphere 6.1.0.13 ND on Solaris 10.
    >
    > In both the cases after doing some thread dump analysis, we saw that one
    > thread was stuck here -
    > "WebContainer : 89" daemon prio=10 tid=0x01683c58 runnable (0x73f7d000..0x73f7faf0)
    > at java.io.FileOutputStream.writeBytes(Native Method)
    > at java.io.FileOutputStream.write(FileOutputStream.java:260)
    > at com.ibm.ejs.ras.WrappingFileOutputStream.write(WrappingFileOutputStream.java:364)
    > - locked (0x97ff0230) (a com.ibm.ejs.ras.WrappingFileOutputStream)
    > at java.io.PrintStream.write(PrintStream.java:412)
    >
    > This thread also acquired a lock on - locked (0x9945bbf8) (a
    > org.apache.log4j.spi.RootCategory) during this process. All other
    > threads were waiting to lock this ((0x9945bbf8)) & waiting for ever.


    I don't have a Solaris box at hand here at home, but you can try to analyze
    the native (Solaris) stack trace (pstack <pid>) and map the thread id from
    the thread dump (tid=0x01683c58) to the Solaris LWP: one of the bottom
    calls in the native stack of a JVM thread is a function that takes this
    tid as one of its arguments.

    Perhaps pmap <pid> / pfiles <pid> might reveal which file is actually
    being written to (my guess would be the activity.log file).
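
    For what it's worth, the tid-to-LWP lookup can be scripted once the pstack output is saved to a file. The sketch below is Python and rests on two assumptions about the pstack layout that you should check against your own output: each LWP section starts with a header line containing "lwp# N", and frame arguments are printed as bare hex numbers without the 0x prefix. The function name lwps_mentioning is invented for this example.

```python
import re

# Assumed section header, e.g. "-----  lwp# 12 / thread# 12  -----"
LWP_RE = re.compile(r'lwp#\s*(\d+)')

def lwps_mentioning(pstack_text, tid):
    """Return the LWP ids whose frames mention `tid` as a whole hex token."""
    # Normalise "0x01683c58" to "1683c58" so leading zeros do not matter.
    needle = format(int(tid, 16), 'x')
    token_re = re.compile(r'\b(?:0x)?0*' + needle + r'\b', re.IGNORECASE)
    hits, current = [], None
    for line in pstack_text.splitlines():
        header = LWP_RE.search(line)
        if header:
            current = int(header.group(1))
        elif current is not None and current not in hits and token_re.search(line):
            hits.append(current)
    return hits

# Usage idea: lwps_mentioning(open("pstack.out").read(), "0x01683c58")
# ("pstack.out" is a hypothetical file holding the saved pstack output)
```

    If this returns exactly one LWP, the frames of that LWP in the pstack output show where the thread sits inside the native write.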

    Wkr,
    Sven Vermeulen

  3. Re: debugging a hung WebSphere JVM

    Is this case different from your previous post of Nov 19?

    It's worth looking at Log4J in the first place anyway.

    E.g. https://issues.apache.org/bugzilla/s...i?id=41214#c14

    You could also try to reduce the number (or complexity) of appenders to see
    if that changes the issue.


  4. Re: debugging a hung WebSphere JVM

    There may be some thread deadlocking going on here. Take multiple thread dumps and compare them to see whether the same thread is hanging in each one. As for memory, make sure you don't run out of heap while the problem occurs. Also check the process size (using the "ps" command), which includes heap plus native memory. If the process size is gradually increasing, you may be leaking native memory.
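
    Comparing dumps by eye gets tedious with a large WebContainer pool, so here is a rough Python sketch of the comparison. It assumes each dump is plain text with a quoted thread name on the header line and "at ..." frame lines below it. The function names are made up for illustration.

```python
import re

# Thread header lines start with the quoted thread name.
THREAD_RE = re.compile(r'^\s*"([^"]+)"')

def stacks_by_thread(dump_text):
    """Map thread name -> tuple of its 'at ...' frame lines."""
    stacks, current = {}, None
    for line in dump_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = header.group(1)
            stacks[current] = []
        elif current is not None and line.strip().startswith("at "):
            stacks[current].append(line.strip())
    return {name: tuple(frames) for name, frames in stacks.items()}

def unchanged_threads(dumps):
    """Names of threads whose non-empty stack is identical in every dump."""
    parsed = [stacks_by_thread(d) for d in dumps]
    if not parsed:
        return []
    first = parsed[0]
    return sorted(
        name for name, frames in first.items()
        if frames and all(p.get(name) == frames for p in parsed[1:])
    )
```

    Idle pool threads parked in a wait also show identical stacks across dumps, so the result is a list of candidates to inspect, not a verdict.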

    Were you able to find the file it is trying to write? Check the file size and make sure the OS does not impose a hard limit (2 GB on some file systems).
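
    A quick way to check the size against the 2 GB mark, sketched in Python. The default limit of 2**31 - 1 bytes is the largest size a signed 32-bit file offset can address. Whether your file system or process is actually subject to it is an assumption to verify. The function name headroom is invented for this example.

```python
import os

# Largest size addressable with a signed 32-bit offset (just under 2 GB).
TWO_GB_LIMIT = 2**31 - 1

def headroom(path, limit=TWO_GB_LIMIT):
    """Bytes left before `path` reaches `limit`; zero or less means it is hit."""
    return limit - os.path.getsize(path)

# Usage idea: headroom("/path/to/the/log/file")  (hypothetical path)
```

    A per-process file size limit (what "ulimit -f" reports in the shell that starts the server) is separate from the file system limit and worth checking too.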

    - Karun

  5. Re: debugging a hung WebSphere JVM

    I guess this has something to do with some resources not being closed. Check whether all open resources have been closed.
