Q: HP-UX 11.31 slow performance without load - HP UX

  1. Q: HP-UX 11.31 slow performance without load

    Hi,

    does anybody else have performance issues with HP-UX 11.31 on IA64? We have
    one single-CPU machine (rx2620) running the March 2008 Base-OE plus PHKL_38243
    (among others). Before the patch the kernel used to hang or crash. Recently I
    was experiencing severe performance problems when reading inodes (e.g. "find /
    -fsonly vxfs -user 102"). The system load was 0.0something, and no process
    really seemed to be busy. I have no good ideas.

    On another machine (rx3600) with one HT-dual-core CPU (i.e. 4 logical CPUs)
    running HP-UX 11.31 December OE plus March update bundles (but not the OE
    update, and not PHKL_38243 or PHKL_38174) I saw similar problems: A "find"
    that should complete in about 5 seconds took 20 minutes. A real-life
    snapshot taken by crashinfo showed that find was blocked on "pgdata". Looking
    further I found that the usage of filecache_max was close to 100%.

    Up to that time HT was disabled in the kernel, and as you can enable it at
    runtime, just out of despair and curiosity I enabled HT (which online-added
    two CPUs). Then the find process (and others) became active.

    What I'm wondering: PHKL_38243 was expected to have an effect only on a
    single-CPU machine. I'm planning to install the patch on the other machine as
    well, but I'm unsure whether that will cure the symptom. The patch README
    refers to PHKL_38275 which in turn mentions the filecache_max problem.

    Any experiences (same problem, or a solution)?

    Regards,
    Ulrich

  2. Re: Q: HP-UX 11.31 slow performance without load

    Ulrich Windl wrote:
    > Hi,
    >
    > does anybody else have performance issues with HP-UX 11.31 on IA64? We have
    > one single-CPU machine (rx2620) running the March 2008 Base-OE plus PHKL_38243
    > (among others). Before the patch the kernel used to hang or crash. Recently I
    > was experiencing severe performance problems when reading inodes (e.g. "find /
    > -fsonly vxfs -user 102"). The system load was 0.0something, and no process
    > really seemed to be busy. I have no good ideas.
    >
    > On another machine (rx3600) with one HT-dual-core CPU (i.e. 4 logical CPUs)
    > running HP-UX 11.31 December OE plus March update bundles (but not the OE
    > update, and not PHKL_38243 or PHKL_38174) I saw similar problems: A "find"
    > that should complete in about 5 seconds took 20 minutes. A real-life
    > snapshot taken by crashinfo showed that find was blocked on "pgdata". Looking
    > further I found that the usage of filecache_max was close to 100%.


    Sounds rather like QXCR1000593755. That wait channel means the thread is
    sleeping in the physical allocator. That should be very, very rare -- if
    the system is low on memory, there's a wait channel higher up that would
    have been first.

    Which means that we're likely to have physical memory -- just not the
    particular type this allocation wants at the moment (there's something
    special here). Being inode related... I can reasonably suspect that the
    "special" bit about it is that this allocation wants 16KB and no less --
    but the physical memory on the system is fragmented such that only 4KB
    [base page size] or 8KB ranges are available. Hence the thread is
    sleeping until someone frees memory such that a 16KB range is formed
    (either by a free of 16KB or greater, or a piece that will coalesce with
    other free memory to form the size we need). The key issue addressed
    in the CR is that the allocation doesn't actually need to be made the
    way it is. The inode does want 16KB, but that can be part of an
    even larger, already translated piece of kernel dynamic memory. It
    is being allocated as a new unique kernel allocation -- and hence
    bypassing caching layers that are there to avoid this type of
    fragmentation / waiting in the physical layer.
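
    To make the "plenty of memory, but not the right shape" point concrete,
    here is a toy Python sketch. It is not HP-UX kernel code: it assumes a
    4KB base page, naturally aligned allocations, and a hypothetical layout
    in which one page out of every four is pinned. The names can_allocate
    and pinned_layout are made up for this illustration.

```python
PAGE_KB = 4  # base page size, as mentioned in the post


def can_allocate(free_pages, total_pages, size_kb):
    """Return True if a contiguous, naturally aligned free range of size_kb exists.

    Natural alignment is an assumption of this toy model.
    """
    need = size_kb // PAGE_KB
    for start in range(0, total_pages - need + 1, need):
        if all(p in free_pages for p in range(start, start + need)):
            return True
    return False


def pinned_layout(total_pages):
    """Hypothetical layout: page 0 of every 4-page group is held, the rest are free."""
    return {p for p in range(total_pages) if p % 4 != 0}


total = 1024                      # 4MB of toy "physical memory"
free = pinned_layout(total)
free_kb = len(free) * PAGE_KB     # three quarters of memory is free

print("free KB:", free_kb)
print("4KB ok: ", can_allocate(free, total, 4))
print("8KB ok: ", can_allocate(free, total, 8))
print("16KB ok:", can_allocate(free, total, 16))   # fails despite ample free memory

# One well-placed free lets neighbouring pages coalesce into a 16KB range.
free.add(0)
print("16KB ok after one free:", can_allocate(free, total, 16))
```

    In this model the 16KB request stays blocked with three quarters of
    memory free, and a single free of the right page is enough to unblock it,
    which is the kind of wait the "pgdata" channel would reflect.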

    There's probably a contributing factor fixed as part of a wider effort
    in QXCR1000802246 -- some filecache metadata was using base page
    allocations when it didn't need to, and hence you'd get physical
    allocation patterns like single page for metadata, a few pages for
    data... then the data pages come back (as the cache hits max and
    cycles), but metadata is typically cached longer. When this type of
    allocation needs to break up a larger physical range to satisfy the
    request -- you end up with more fragmentation due to being unable to
    coalesce. (The correction here was to remove the unneeded restriction,
    so these allocations will come from kernel dynamic memory caching
    layers in most cases -- which refill at a much larger size, reducing
    fragmentation).
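
    A rough way to see why the allocation pattern matters, again as a toy
    Python model rather than anything from the kernel: compare the largest
    free run left behind once the data pages are freed, first with metadata
    pages interleaved among them, then with the same number of metadata
    pages packed together (standing in for a caching layer that refills in
    large chunks). The 1-metadata-to-3-data ratio and the function names are
    assumptions chosen only for illustration.

```python
def largest_free_run(used, total_pages):
    """Length (in pages) of the longest contiguous run of pages not in `used`."""
    best = run = 0
    for p in range(total_pages):
        run = 0 if p in used else run + 1
        best = max(best, run)
    return best


def interleaved_metadata(total_pages, data_pages=3):
    """One metadata page, then `data_pages` data pages, repeated.

    Only the metadata pages are returned: the data pages are assumed to
    have been freed already as the cache cycled.
    """
    return set(range(0, total_pages, data_pages + 1))


def pooled_metadata(total_pages, data_pages=3):
    """Same number of metadata pages, but packed together at the start."""
    count = len(interleaved_metadata(total_pages, data_pages))
    return set(range(count))


total = 1024
print("interleaved:", largest_free_run(interleaved_metadata(total), total), "pages")
print("pooled:     ", largest_free_run(pooled_metadata(total), total), "pages")
```

    With the interleaved layout the longest free run never exceeds the three
    freed data pages between two metadata pages; with the pooled layout all
    freed data pages coalesce into one large range.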

    Both CRs are addressed in the upcoming September 2008 Core Kernel VM
    patch. [There's still a chance the exact number can be changed, so I'm
    not going to quote it here].

    >
    > Up to that time HT was disabled in the kernel, and as you can enable it at
    > runtime, just out of despair and curiosity I enabled HT (which online-added
    > two CPUs). Then the find process (and others) became active.


    Interesting... shouldn't have had any effect on the core problem per se.
    It suggests you caused some memory to get flushed down to the allocator
    as things moved around (or providing more contexts for the inode caching
    layer up above this resulted in some of these being cleaned up and those
    frees unblocked the allocation... did vxfsd run more on the OLA'd
    contexts before the find unblocked by chance?)

    >
    > What I'm wondering: PHKL_38243 was expected to have an effect only on a
    > single-CPU machine. I'm planning to install the patch on the other machine as
    > well, but I'm unsure whether that will cure the symptom. The patch README
    > refers to PHKL_38275 which in turn mentions the filecache_max problem.


    Actually, 38243's key CR (QXCR1000795727) has nothing to do with the
    number of processors. That was all about executable header data caching
    management in the file cache. As you point out, though, it is PHKL_38275's
    QXCR1000790313 that was dealing with vhand monopolizing a processor.
    (That's a problem in general -- but on a single processor system that
    changes from bad vhand performance and utilization to "hang".)
    (PHKL_38449 is really where you want to go until the September release
    for that kind of thing... since that's got a couple spots 38174 didn't
    address as well).

    If this is JFS (as I rather suspect it is), a better workaround as I
    understand some other threads of conversation is to set the tunable
    vxfs_ifree_timelag to -1. (This has the effect of making the JFS inode
    cache static instead of doing a bunch of kernel memory alloc/dealloc
    operations. Reduces the fragmentation and precludes the alloc/free
    paths that are where the find gets hung up). Tuning the inode cache
    layer directly should also help, see:
    http://docs.hp.com/en/5992-0732/5992-0732.pdf
    if you haven't already read over this document.

    JFS inode tuning is still going to be something to think about even
    with the September patch, of course -- the patch addresses incorrect
    behavior, but you can still have a lot of memory eaten by that cache
    after find, etc.

    Barring that, or if this isn't JFS -- all I can think of other than
    a reboot would be to try to kick some pages back down to the allocator
    to unlock it. Easiest ways I can think of to do that would be:

    1) Shrink the filecache, causing it to flush pages back to the physical
    allocator level. Once things unblock, you can raise it back.

    2) Similarly -- you could run something to eat a bunch of physical
    memory to cause global memory pressure, triggering vhand... which
    would flush pages back to the physical allocator, etc.

    Obviously, that's a bit of a gamble... hence the JFS tuning if
    appropriate is the way to go... plus the patch when it goes out.

    Don
    --
    kernel, n:
    A part of an operating system that preserves the medieval traditions
    of sorcery and black art.
