High load average - Aix



Thread: High load average

  1. High load average

    Hi.

    I have AIX 6.1 (6100-01) running on top of LPAR with 24 CPUs and 116 GB
    of RAM. System runs several Oracle instances. Although system runs fine,
    i.e. there are no complaints on host performance, today I've noticed this:

    # uptime
    12:45PM up 10 days, 12:45, 0 users, load average: 389.41, 389.59, 389.69

    I can't figure out what's causing this huge load average. CPU load
    is constantly around 20%, there is plenty of free memory (practically
    no page space is used) and I don't think(*) there is a significant
    network load (there are two NICs in EtherChannel with one backup adapter).

    Any hints on what could be the problem or what should I look for next?

    (*) How do I monitor current network I/O?

  2. Re: High load average

    On Sep 22, 12:51 pm, Igor Pozgaj wrote:
    > Hi.
    >
    > I have AIX 6.1 (6100-01) running on top of LPAR with 24 CPUs and 116 GB
    > of RAM. System runs several Oracle instances. Although system runs fine,
    > i.e. there are no complaints on host performance, today I've noticed this:
    >
    > # uptime
    > 12:45PM up 10 days, 12:45, 0 users, load average: 389.41, 389.59,
    > 389.69
    >
    > I can't figure it out what's causing this huge load average. CPU load
    > is constantly around 20%, there is plenty of free memory (practically
    > no page space is used) and I don't think(*) there is a significant
    > network load (there are two NICs in EtherChannel with one backup adapter).
    >
    > Any hints on what could be the problem or what should I look for next?
    >
    > (*) How do I monitor current network I/O?


    The program nmon gives a good overview about the current system load.
    Processes can be sorted by CPU usage, size and I/O load.

    To analyse the network load netpmon is a nice tool as well.

    hth
    Hajo


  3. Re: High load average

    On Mon, 22 Sep 2008 05:59:58 -0700, Hajo Ehlers wrote:

    > The program nmon gives a good overview about the current system load.
    > Processes can be sorted by CPU usage, size and I/O load.


    Yes, I use nmon, but as I said, neither CPU nor disk I/O seems to be the
    cause of the very high load average.

    > To analyse the network load netpmon is a nice tool as well.


    Thanks for this!

    --
    Igor Pozgaj | ipozgaj at fly.srk.fer.hr
    ICQ: 126002505 | IRC: @thunder (#linux@IdolNet)
    PGP: 0xEF36A092 | http://fly.srk.fer.hr/~ipozgaj
    http://ipozgaj.blogspot.com (/atom.xml RSS feed)

  4. Re: High load average

    Igor Pozgaj wrote:
    > On Mon, 22 Sep 2008 05:59:58 -0700, Hajo Ehlers wrote:
    >
    >> The program nmon gives a good overview about the current system load.
    >> Processes can be sorted by CPU usage, size and I/O load.

    >
    > Yes, I use nmon, but as I said, neither CPU or disk I/O seem to be the
    > cause for very high load average.
    >
    >> To analyse the network load netpmon is a nice tool as well.

    >
    > Tnx for this!
    >


    "System load" on Unix is defined as the average number of runnable
    processes or threads, i.e. those running or waiting to run on a CPU.
    Run ps -ef and look for an unusually large number of processes.
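
    One way to spot unusually numerous processes is to count them per
    user and command. The following is only a sketch: the function name
    count_procs is made up for illustration, and it reads "user command"
    pairs produced by the POSIX ps options -e and -o (rather than ps -ef,
    whose STIME column can contain a space and shift the fields).

```shell
#!/bin/sh
# count_procs [N]: read "user command" pairs on stdin and print the N most
# frequent pairs (default 10), most frequent first, as "count user command".
# Only the first word of the command is used.
count_procs() {
    awk '{ n[$1 " " $2]++ } END { for (k in n) print n[k], k }' |
        sort -rn |
        head -n "${1:-10}"
}

# Example invocation (output depends on the system):
#   ps -e -o user= -o comm= | count_procs 10
```

    A pair with hundreds of instances would be the first thing to look at.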

  5. Re: High load average

    On Sep 22, 5:41 pm, Igor Pozgaj wrote:
    > On Mon, 22 Sep 2008 05:59:58 -0700, Hajo Ehlers wrote:
    > > The program nmon gives a good overview about the current system load.
    > > Processes can be sorted by CPU usage, size and I/O load.

    >
    > Yes, I use nmon, but as I said, neither CPU or disk I/O seem to be the
    > cause for very high load average.


    btw: uptime on AIX reports the average number of runnable processes
    over a given time frame, so it should match the first column (r) of
    "vmstat 60".
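
    To compare the two numbers, the r column can be averaged over a few
    samples. This is a rough sketch, not an official procedure: avg_runq
    is a hypothetical helper name, and it treats every line whose first
    two fields are plain integers as a sample row.

```shell
#!/bin/sh
# avg_runq: read vmstat output on stdin and print the mean of the first
# column ("r", the run queue) over all sample rows. Header and
# "System configuration" lines are skipped because their first two fields
# are not plain integers.
avg_runq() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { sum += $1; n++ }
         END { if (n) printf "%.2f\n", sum / n; else print "no samples" }'
}

# Example invocation (five samples, 60 seconds apart):
#   vmstat 60 5 | avg_runq
```

    If the result is far below the load average printed by uptime, the
    two counters disagree and the load average itself is suspect.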

    To find the currently running threads,
    $ ps -me -o THREAD | grep -w R
    gives at least a starting point for tracing a running thread back to
    its parent process.
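
    The grep above only shows the thread rows, so the owning process
    still has to be found by hand. A possible refinement, sketched under
    the assumption that the THREAD listing has the columns
    USER PID PPID TID ST ... with "-" in the PID column on thread rows
    (running_threads is a made-up name):

```shell
#!/bin/sh
# running_threads: read "ps -me -o THREAD" output on stdin and print
# "count pid" for every process that owns at least one thread in state R.
# Assumed layout: field 2 is the PID ("-" on thread rows) and field 5 is
# the state (ST).
running_threads() {
    awk '
        NR == 1   { next }             # skip the header line
        $2 != "-" { pid = $2; next }   # process row: remember the owner
        $5 == "R" { run[pid]++ }       # thread row in state R
        END       { for (p in run) print run[p], p }
    ' | sort -rn
}

# Example invocation:
#   ps -me -o THREAD | running_threads
```

    The PIDs it prints can then be passed to ps -fp to see the full
    command line.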

    better solutions are welcome
    Hajo

  6. Re: High load average

    On 2008-09-22, Hajo Ehlers wrote:
    > btw: The uptime on AIX gives an average of the amount of runnable
    > processes within a certain time frame.
    > So it should fit to the first colum of an "vmstat 60"


    Yes, by definition it should display the average number of runnable
    processes over the last 1, 5, and 15 minutes, but check this:

    % vmstat 60

    System configuration: lcpu=24 mem=118784MB

    kthr  memory        page                    faults            cpu
    ----- ------------- ----------------------- ----------------- ---------------
     r  b     avm   fre  re  pi  po  fr  sr  cy   in     sy    cs  us  sy  id  wa
     1  1 7309737 27390   0   0   0   0   0   0   45   2175   678   0   0  99   0
     1  1 7307435 28411   0   0   0   0   0   0   55   8232   952   1   0  98   0
     1  1 7310011 24474   0   0   0   0   0   0  100  28964  1393   3   2  95   0
     1  1 7303989 30370   0   0   0   0   0   0   59   6183   631   0   1  99   0
     1  1 7302414 31634   0   0   0   0   0   0   79   2168   738   0   0  99   0

    % uptime
    09:09AM up 11 days, 9:09, 0 users, load average: 389.08, 389.37, 389.61
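
    To put the mismatch in numbers, the load average can be divided by
    the logical CPU count. A small sketch (load_per_cpu is a hypothetical
    helper, and it assumes the uptime line ends with
    "load average: a, b, c" using a dot as the decimal separator):

```shell
#!/bin/sh
# load_per_cpu NCPU: read one uptime line on stdin and print the 1, 5 and
# 15 minute load averages divided by NCPU.
load_per_cpu() {
    ncpu=${1:?usage: load_per_cpu NCPU}
    sed 's/.*load average[s]*: *//' |
        awk -F'[, ]+' -v n="$ncpu" \
            '{ printf "%.2f %.2f %.2f\n", $1 / n, $2 / n, $3 / n }'
}

# Example invocation with the 24 logical CPUs reported by vmstat:
#   uptime | load_per_cpu 24
```

    With the figures above this gives roughly 16 runnable threads per
    logical CPU, while vmstat reports a run queue of 1 for the whole
    system.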

    --
    Igor Pozgaj | ipozgaj at fly.srk.fer.hr
    ICQ: 126002505 | IRC: @thunder (#linux@IdolNet)
    PGP: 0xEF36A092 | http://fly.srk.fer.hr/~ipozgaj
    http://ipozgaj.blogspot.com (/atom.xml RSS feed)

  7. Re: High load average

    On Sep 23, 9:13 am, Igor Pozgaj wrote:
    > On 2008-09-22, Hajo Ehlers wrote:
    >
    > > btw: The uptime on AIX gives an average of the amount of runnable
    > > processes within a certain time frame.
    > > So it should fit to the first colum of an "vmstat 60"

    >
    > Yes, by definition it should display number of runnable processes in
    > last 1, 5, and 15 minutes, but check this:
    >
    > % vmstat 60
    >
    > System configuration: lcpu=24 mem=118784MB
    >
    > kthr  memory        page                    faults            cpu
    > ----- ------------- ----------------------- ----------------- ---------------
    >  r  b     avm   fre  re  pi  po  fr  sr  cy   in     sy    cs  us  sy  id  wa
    >  1  1 7309737 27390   0   0   0   0   0   0   45   2175   678   0   0  99   0
    >  1  1 7307435 28411   0   0   0   0   0   0   55   8232   952   1   0  98   0
    >  1  1 7310011 24474   0   0   0   0   0   0  100  28964  1393   3   2  95   0
    >  1  1 7303989 30370   0   0   0   0   0   0   59   6183   631   0   1  99   0
    >  1  1 7302414 31634   0   0   0   0   0   0   79   2168   738   0   0  99   0
    >
    > % uptime
    > 09:09AM up 11 days, 9:09, 0 users, load average: 389.08, 389.37, 389.61
    >



    Then I assume you are hitting the following bug:
    HIGH OR INCORRECT LOAD AVERAGE DUE TO UNINITIALIZED KPROCS
    http://www-01.ibm.com/support/docvie...id=isg1IY98543

    hth
    Hajo
