
Thread: IPMP group failure for (seemingly) no reason.

  1. IPMP group failure for (seemingly) no reason.

    Hi all. I've got a Netra 440 with IPMP set up (config below). Every so
    often I get messages that the interfaces fail and the group fails. I
    only get the messages from IPMP; I don't get any messages from the
    physical layer saying anything is wrong with the connection. Also, I
    have other connections on this server going to the same switch and
    those appear to be fine. I am doing something I think is a bit
    unorthodox: the multipathing is over a 16-bit (/16) subnet with no
    router. The two individual addresses are 10.20.10.61 and 10.20.11.61,
    and they share 10.20.10.60. Could that be the cause of my problem?
    I've never seen that done before, but I don't really know why it
    couldn't work. I'm running Solaris 8. I've read that patch 108528 is
    needed for an issue similar to this, and I'm currently running
    108528-22.

    Thanks.

    ce1: flags=78040843 mtu 1500
        inet 10.20.10.61 netmask ffff0000 broadcast 10.20.255.255
        groupname ipmp
        ether 0:14:4f:25:dd:f8
    ce5: flags=18040843 mtu 1500
        inet 10.20.11.61 netmask ffff0000 broadcast 10.20.255.255
        groupname ipmp
        ether 0:14:4f:25:dc:c4
    ce5:1: flags=10000843 mtu 1500
        inet 10.20.10.60 netmask ffff0000 broadcast 10.20.255.255
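
    The interfaces come up from /etc/hostname.ce* files, roughly along
    these lines (recreating them from memory, so the exact contents may
    differ a bit):

        # /etc/hostname.ce1 (from memory)
        10.20.10.61 netmask + broadcast + group ipmp up

        # /etc/hostname.ce5 (from memory; the shared 10.20.10.60 rides on ce5:1 via addif)
        10.20.11.61 netmask + broadcast + group ipmp up addif 10.20.10.60 netmask + broadcast + up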


  2. Re: IPMP group failure for (seemingly) no reason.

    On Apr 17, 4:08 pm, bozothedeathmachine wrote:
    > [original post and ifconfig output quoted in full; snipped]


    Probe-based IPMP needs probe targets (ping partners); by default
    in.mpathd probes the default router. A probe target must reply
    consistently to the ICMP echo requests in.mpathd sends; after a run
    of consecutive probes goes unanswered, the interface is declared
    failed. With no router on the subnet, you can specify probe targets
    explicitly by defining static routes to hosts that answer ping.
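
    For example (a sketch only, syntax from memory; 10.20.10.50 here is
    just a stand-in for any host on the 10.20.0.0/16 segment that
    reliably answers ICMP):

        # static host route so in.mpathd picks this host as a probe target
        route add -host 10.20.10.50 10.20.10.50

    Add a couple of these if you can; in.mpathd behaves better when it
    has more than one target to probe.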


  3. Re: IPMP group failure for (seemingly) no reason.

    On Apr 18, 12:04 am, Adam Sanders wrote:
    > [original post snipped]
    >
    > Probe-based IPMP needs probe targets (ping partners); by default
    > in.mpathd probes the default router. [...] you can specify probe
    > targets explicitly by defining static routes to hosts that answer
    > ping.


    I did a "route add net 10.20.0.0/16 10.20.10.50" on both machines.
    Let's see if it works. Thanks.
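
    I'll keep an eye on it with something along these lines (rough
    sketch):

        netstat -rn | grep 10.20                    # confirm the new route is in the table
        tail -f /var/adm/messages | grep in.mpathd  # watch for further failure/repair messages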


  4. Re: IPMP group failure for (seemingly) no reason.

    I'm getting all sorts of errors in one of the servers' messages log
    (below). I'm not quite sure whether adding the static route worked.

    Could those errors be indicative of a hardware problem?

    Thanks again,
    Ben..

    Apr 23 11:56:58 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 23646 ms
    Apr 23 11:56:59 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 11823 ms
    Apr 23 11:56:59 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 10000 ms
    Apr 23 12:01:59 utspptslee1 in.mpathd[38]: [ID 398532 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet ce1) new failure detection time is 63042 ms
    Apr 23 12:03:00 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 31521 ms
    Apr 23 12:03:00 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 15760 ms
    Apr 23 12:03:00 utspptslee1 in.mpathd[38]: [ID 122137 daemon.error] Improved failure detection time 10000 ms
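
    For what it's worth, the 10000 ms the daemon keeps trying to get
    back to is the FAILURE_DETECTION_TIME from /etc/default/mpathd,
    which as far as I know is still at the shipped defaults here
    (roughly the sketch below; I gather raising it only hides the
    symptom if probes really are being dropped, and that in.mpathd
    rereads the file on SIGHUP):

        # /etc/default/mpathd -- in.mpathd tunables (values shown are the shipped defaults)
        FAILURE_DETECTION_TIME=10000
        FAILBACK=yes
        TRACK_INTERFACES_ONLY_WITH_GROUPS=yes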


  5. Re: IPMP group failure for (seemingly) no reason.

    In article <1177352237.982077.310430@y5g2000hsa.googlegroups.com>,
    bozothedeathmachine wrote:
    >I'm getting all sorts of errors in one of the servers' messages log
    >(below). I'm not quite sure whether adding the static route worked.
    >
    >Could those errors be indicative of a hardware problem?
    >
    >[in.mpathd log snipped]


    Silly question, but: does the router you added respond to pings, and
    does any device on your segment respond to broadcast pings? Is the
    system you're having issues with relatively network-idle? I've mostly
    seen issues like the ones you describe on systems that push very
    little traffic over a given IPMP group.
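
    A quick way to check from the Netra itself (rough sketch; whether
    anything answers broadcast or multicast echo depends on how the
    neighbours are configured):

        ping -s 10.20.10.50     # does the route target answer steadily?
        ping -s 10.20.255.255   # broadcast ping: do any neighbours answer?
        ping -s 224.0.0.1       # all-hosts multicast, which in.mpathd falls
                                # back to when it has no usable routes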

    -tom

    --

    "You can only be -so- accurate with a claw-hammer." --me
