error 9 - HP UX

  1. error 9

    Given: HP Superdome, HP-UX 11.23 ia64

    Problem: after migrating to a bigger Superdome, the application
    (a collection of ksh93 and other scripts plus C code) started
    misbehaving. The symptoms are strange: processes die for no apparent
    reason, i.e. a process starts and then dies. No error message is
    printed. The one consistent thing is that the parent sees
    exit code = 9 from the dead child. The errors are infrequent, occur
    in different places, and are impossible to reproduce in a controlled
    environment.

    Question: what could this be? Is it possible to intercept such a
    condition? Is there a special system monitoring tool or something
    similar?

    The most illustrative STDERR example, from one of the scripts that
    uses gmake:
    ....
    gmake: *** [SOME_DATA_FILE] Error 9
    ....
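
    On the question of intercepting the condition: an exit status of 9
    and death by signal 9 (SIGKILL) can look alike in logs, but
    waitpid() reports them differently, so a parent written in C can
    tell them apart. The following is only a minimal sketch; the helper
    names run_child and classify_status are hypothetical and not part
    of the application described above.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that either exits with `code`
 * (when sig == 0) or kills itself with signal `sig`. The raw wait
 * status is stored in *status. Returns 0 on success, -1 on failure. */
int run_child(int code, int sig, int *status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0) {
            kill(getpid(), sig);
            pause();          /* only reached if the signal did not end us */
        }
        _exit(code);
    }
    if (waitpid(pid, status, 0) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: classify a wait status.
 * Returns 'E' for a normal exit (exit code in *value),
 *         'S' for death by signal (signal number in *value),
 *         '?' for anything else. */
char classify_status(int status, int *value)
{
    if (WIFEXITED(status)) {
        *value = WEXITSTATUS(status);
        return 'E';
    }
    if (WIFSIGNALED(status)) {
        *value = WTERMSIG(status);
        return 'S';
    }
    *value = 0;
    return '?';
}
```

    Logging the result of classify_status() in the parent for every
    failed child would show whether the children really call exit(9)
    themselves or are being killed from outside.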


  2. Re: error 9

    den wrote:
    > [...] What's consistent is the parent detects exit code = 9 from
    > the dead child. [...]
    >
    > gmake: *** [SOME_DATA_FILE] Error 9

    Assuming that the exit code is derived from one of the stock
    error codes (and given that you seem to get it on file operations),
    errno 9 is EBADF, which means "Bad file number".
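
    To illustrate what EBADF means in practice, here is a small sketch
    that provokes it by writing to a descriptor that has already been
    closed. The helper name errno_from_closed_fd is hypothetical. Note
    that errno only becomes an exit code if the program itself passes
    it to exit(), which is the assumption made above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open /dev/null, close it, then write to the
 * stale descriptor. Returns the errno from the failing write(),
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int errno_from_closed_fd(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;
    close(fd);

    errno = 0;
    if (write(fd, "x", 1) == -1)
        return errno;         /* expected: EBADF */
    return 0;
}
```

    Passing the returned value to strerror() gives the human-readable
    message for the platform.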

    My gut feeling is that nfile(5), maxfiles_lim(5) or maxfiles(5) is
    lower on the new Superdome (or the extra memory/CPUs are enabling
    more open files, so you're hitting the limits) and you've run out
    of file descriptors, so new open operations fail, etc. I'm not 100%
    sure about this, but given that these are dynamic tunables on 11.23
    (except for maxfiles, which is static), it would be worth checking
    their values and trying to raise them.
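
    One way to test the descriptor-exhaustion theory from inside a C
    program is to look at the errno of the failing open(). Under POSIX,
    open() sets EMFILE when the per-process limit is reached and ENFILE
    when the system-wide file table is full, so seeing either of those
    would support this theory more directly than EBADF does. The sketch
    below lowers the soft RLIMIT_NOFILE and opens /dev/null until it
    fails; the helper name errno_when_fds_exhausted and the cap of 256
    descriptors are assumptions made for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: lower the soft RLIMIT_NOFILE to `soft_limit`,
 * open /dev/null until open() fails, then close everything and
 * restore the old limit. Returns the errno of the failing open(),
 * or -1 if setup failed or no open() failed within 256 attempts. */
int errno_when_fds_exhausted(rlim_t soft_limit)
{
    struct rlimit old, lowered;
    int fds[256];
    int n = 0;
    int err = -1;

    if (getrlimit(RLIMIT_NOFILE, &old) != 0)
        return -1;
    lowered = old;
    lowered.rlim_cur = soft_limit;
    if (setrlimit(RLIMIT_NOFILE, &lowered) != 0)
        return -1;

    while (n < 256) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd < 0) {
            err = errno;      /* save before close() can change it */
            break;
        }
        fds[n++] = fd;
    }

    while (n > 0)
        close(fds[--n]);
    setrlimit(RLIMIT_NOFILE, &old);
    return err;
}
```

    Calling errno_when_fds_exhausted(32) in a process with only the
    standard descriptors open should report EMFILE. The same
    getrlimit() call can also be used on its own to log the limit a
    misbehaving process actually inherited.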

    Don

  3. Re: error 9

    That doesn't seem to be the cause. According to kcusage, the number
    of open files is large, but still nowhere near the maximum allowed
    values. Thanks for the idea, though.

    On 22 อมา, 11:32, Don Morris wrote:
    > Assuming that the exit code is derived from one of the stock
    > error codes (and given that you seem to get it on file operations),
    > errno 9 is EBADF, which means "Bad file number".
    >
    > My gut feeling is that nfile(5), maxfiles_lim(5) or maxfiles(5) is
    > lower on the new Superdome (or the extra memory/CPUs are enabling
    > more open files, so you're hitting the limits) and you've run out
    > of file descriptors, so new open operations fail, etc. I'm not 100%
    > sure about this, but given that these are dynamic tunables on 11.23
    > (except for maxfiles, which is static), it would be worth checking
    > their values and trying to raise them.
    >
    > Don



