NB 5.1 MP5 - bpbkar Memory Hog, causing system crash on Solaris 10 - Veritas Net Backup

  1. NB 5.1 MP5 - bpbkar Memory Hog, causing system crash on Solaris 10

    Hi All,

    I had an interesting night last night. My previously flawless NB
    5.1 MP5 Master Server crashed after running out of memory, for the
    first time since its installation in May last year (2005).

    Platform: Sun V240, 4GB RAM, Solaris 10 - latest patches (Using Update
    Connection, so it is bleeding edge - almost).

    Tape SAN: 2 x L700 tape libraries containing 5 STK 9940B drives (3 in
    the primary L700, 2 in the secondary), all SSO.

    15 Media Servers - 12 are SAN Media Servers for their own SAN disk, so
    three are used by the clients (56 at last count), with more to follow
    as I replace/migrate an old NB 3.4 implementation.

    Anyway, back to the issues:

    1. Memory Hog - Sun analyzed a crash dump and pointed the finger at
    bpbkar: at the time of the crash it had been allocated 3 GB of the
    available RAM, with more requests queued pending allocation. According
    to the README for MP5, a similar bpbkar issue is supposedly fixed.

    2. Upon restart - None of the Master Server's tape drives will fire up.
    Jobs will queue and go "Active", allocate a drive but never use the
    drive! bpbkar never fires. This is verified by iostat and the activity
    log for the Active jobs on the Master.

    Do I have a Device Catalog corruption?

    Remedial action:

    I will stop the Master once all the Media server backups complete and
    rebuild the device configuration on the master to see if this helps the
    tape use issue.

    Awaiting a Symantec call-back re: bpbkar. I've uploaded an nbsupport
    capture for them to review.

    Any ideas people?

    Thanks in advance & regards,

    Andrew R-C.







  2. Re: NB 5.1 MP5 - bpbkar Memory Hog, causing system crash on Solaris 10

    Hi All,

    Some follow up on my issue.

    Root Cause:

    Solaris 10 Kernel Patch ID: 118833-17

    There is a registered Solaris 10 bug covering the memory issue that
    NetBackup experiences when running at this patch level.

    The fix/workaround:

    Remove this patch from your system if it has been applied.
    Do not install this patch level until a fix is released.
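
    For anyone wanting to check their own servers, here is a rough sketch
    of how the patch check could be scripted. It assumes the usual
    Solaris 10 `showrev -p` output, where each installed patch is listed
    on a line beginning "Patch: <id>-<rev>". The function names are mine,
    purely for illustration, and nothing here comes from Sun or Symantec.

```python
import re
import subprocess

# Kernel patch revision named in this thread.
SUSPECT_PATCH = "118833-17"


def installed_patches(showrev_output):
    """Return the set of patch IDs found in `showrev -p` style output.

    Assumption: each installed patch is reported on a line that starts
    with "Patch: <6 digits>-<2 digits>". IDs that only appear later on a
    line (for example under "Obsoletes:") are deliberately ignored.
    """
    pattern = r"^Patch:\s+(\d{6}-\d{2})\b"
    return set(re.findall(pattern, showrev_output, re.MULTILINE))


def has_patch(showrev_output, patch_id=SUSPECT_PATCH):
    """True if exactly this patch revision is reported as installed."""
    return patch_id in installed_patches(showrev_output)


def check_this_host():
    """Run `showrev -p` locally and report on the suspect patch.

    Only meaningful on a Solaris host; returns None elsewhere.
    """
    try:
        result = subprocess.run(
            ["showrev", "-p"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # showrev is not available on this system
    return has_patch(result.stdout)
```

    Calling check_this_host() on the master and each media server gives
    True where 118833-17 is present. Backing it out is then normally done
    with `patchrm 118833-17`; kernel patches are usually removed with the
    system quiesced, so check the patch README before doing so.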


