TOPSVCS will not start, HACMP (clstrmgr) crashes node on start - Aix

  1. TOPSVCS will not start, HACMP (clstrmgr) crashes node on start

    Hi all,

    Strange problem here (also IBM support is looking for a solution)
    2 Node HACMP cluster with AIX 5.2.7, HACMP 5.2 (ES), RSCT 2.3.9.2
    mutual take-over.

    When HACMP is started, topsvcs fails to start and writes the message
    below at the end of cluster.log. We have synchronized from the (only)
    active node. The strange thing is that both nodes have instance number
    82 (odmget HACMPtopsvcs), yet the log shows:

    ::TS_DIFF_INST_NUM_ER Message received with different instance number
    Topology Services being shutdown to avoid false network partition.
    Message instance number 78 Recipient's instance number 80
    Originator of message

    The other node has a similar error, but it reports different
    numbers: recipient 80 and originator 82!

    I'm lost! Any hint is appreciated.

    Marcel


  2. Re: TOPSVCS will not start, HACMP (clstrmgr) crashes node on start

    Marcel,

    Some considerations:

    1. Read (or paste to the list) the most recent contents of
    /var/ha/log/grpsvcs and /var/ha/log/topsvcs. There you can find more
    details about group/topology services;

    2. Are you using disk heartbeat? These two services (topsvcs and
    grpsvcs) are mostly used in disk-heartbeat-based environments
    (I suppose :P);

    3. The cluster documentation lists some probable causes for this
    problem:
    3.1. an unsuccessful DARE operation;
    3.2. the HACMP configuration was not correctly updated on all cluster
    nodes;
    3.3. one node was down during a topology DARE operation;
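
    To check cause 3.2 you could save the output of "odmget HACMPtopsvcs"
    from each node and compare the two copies offline. The Python sketch
    below is only an illustration: the sample stanzas and the field names
    in them (hbInterval, instanceNum) are assumptions made for the example,
    not output taken from this cluster, and the helper names are
    hypothetical.

```python
import re

def parse_odm_stanza(text):
    """Collect 'name = value' lines from odmget-style stanza text into a dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r'\s*(\w+)\s*=\s*(.+?)\s*$', line)
        if match:
            # ODM prints string values in double quotes; strip them if present.
            fields[match.group(1)] = match.group(2).strip('"')
    return fields

def diff_stanzas(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Hypothetical samples: paste the real output from each node here instead.
node_a = """HACMPtopsvcs:
        hbInterval = 1
        instanceNum = 82
"""
node_b = """HACMPtopsvcs:
        hbInterval = 1
        instanceNum = 80
"""

differences = diff_stanzas(parse_odm_stanza(node_a), parse_odm_stanza(node_b))
print(differences)
```

    An empty result would mean the two stanzas agree, which points away
    from the ODM copy and toward what the running daemons hold in memory.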

    You can try these tips:

    i) Check the topology services daemon (open a support call with IBM);
    ii) Check the group services daemon (open a support call with IBM);

    Remember: this could be a group services problem (not a topology
    services problem).
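
    When comparing the logs from both nodes, it may help to pull the
    instance numbers out of every TS_DIFF_INST_NUM_ER entry so the
    mismatches can be seen side by side. A minimal Python sketch,
    assuming the entry text looks like the excerpt posted in this thread
    (the regular expression and the function name are my own, not part
    of any IBM tool):

```python
import re

# Matches the two numbers in the message text quoted in the thread.
PATTERN = re.compile(
    r"Message instance number (\d+) Recipient's instance number (\d+)"
)

def instance_numbers(log_text):
    """Return (message_instance, recipient_instance) pairs found in log_text."""
    # The log wraps entries across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(int(msg), int(rcp)) for msg, rcp in PATTERN.findall(flat)]

# Excerpt as posted in the thread (line wrapping preserved).
sample = """::TS_DIFF_INST_NUM_ER Message received with different instance nu
mber Topology Services being shutdown to avoid false network partition.
Message
instance number 78 Recipient's instance number 80 Originator of message
"""

pairs = instance_numbers(sample)
print(pairs)
```

    Running the same function over the log from the other node should
    show which instance number each daemon believes it has.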

    Regards,
    Luciano Martins.



