IP Failover - Veritas Cluster Server
-
IP Failover
Hi!
I have a two-node cluster on which I have configured IP failover.
That is, I have a resource group called IPFail. Under this group I have two
resources: IPFail_IP (of type IP) and IPFail_NIC (of type NIC). IPFail_IP
is online on one system (system A) and offline on the other (system B).
When system A crashes, IPFail_IP is automatically brought online on system B
and system B assumes the IP address of A. This part is cool.
However, when system A recovers, resource IPFail_IP on A is brought
online, causing a concurrency violation (since IPFail_IP was brought online on
B after A crashed). I have added system A to the autostart list.
This problem can be solved if I remove system A from the autostart
list, but since A is the primary server, whenever A comes online I
want IPFail_IP to be online on A and offline on B (B is the standby).
Is there a way to automate this? I hope my explanation is clear. Any
help will be appreciated.
Thanks & Regards
Manik
-
Re: IP Failover
I have a similar situation. My understanding is that the service
remains on the 'secondary' system, until it is 'forced' to fail back
over to the 'primary' system.
My guess is it would be possible to set up a cron job or something on the
'secondary' system, to check the health of the 'primary', and if
the 'primary' was healthy, AND the 'secondary' is running 'primary'
services, it could force an artificial 'failure' and have VCS
move the 'primary' services back to the 'primary' system.
The logic is there, just a little out of my league.
.... Jack
-
Re: IP Failover
The two of you have different situations.
Jack -
*** NEVER USE CRON TO CONTROL AN HA PRODUCT! ***
This can cause very bad things to happen.
I had a customer set up a cron job to run "hastop -all -force" and "hastart"
every 15 minutes because Oracle kept failing. The real problem was that Oracle was
testing and updating a record and ran out of user licenses. So he fixed VCS,
NOT Oracle, and you can only guess how VCS responded.
An "auto-fail-back" is not available in VCS out of the box. I would try to
use the triggers to do something that causes a "hagrp -switch" to happen. Sorry, I
can't write the agent for you, but I hope this helps some.
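A rough sketch of what such a trigger-driven fail-back could look like, in case it helps. Everything here is an assumption for illustration: the function names are made up, and the script only defines helpers that an event trigger script could call. Which trigger is the right hook depends on your VCS version, so that part is left open. The output format of "hagrp -state" can also differ between releases, so the state is normalized before comparing.

```shell
#!/bin/sh
# Hypothetical fail-back helpers, meant to be called from a VCS event
# trigger script (not from cron). Function names are illustrative only.

# Decide whether the group should be switched back to the primary.
#   $1 = state of the primary system as reported by VCS (e.g. RUNNING)
#   $2 = state of the group on the standby (e.g. ONLINE or OFFLINE)
# Prints "switch" or "skip".
decide_failback() {
    # Strip pipes and spaces in case the state is printed as |ONLINE|.
    sys_state=$(printf '%s' "$1" | tr -d '| ')
    grp_state=$(printf '%s' "$2" | tr -d '| ')
    if [ "$sys_state" = "RUNNING" ] && [ "$grp_state" = "ONLINE" ]; then
        echo switch
    else
        echo skip
    fi
}

# Query VCS and switch the group back only if the decision says so.
#   $1 = group name, $2 = primary system, $3 = standby system
# HASYS and HAGRP can be overridden for dry runs (assumed convenience).
fail_back() {
    group=$1
    primary=$2
    standby=$3
    primary_state=$(${HASYS:-hasys} -state "$primary") || return 1
    standby_grp_state=$(${HAGRP:-hagrp} -state "$group" -sys "$standby") || return 1
    if [ "$(decide_failback "$primary_state" "$standby_grp_state")" = "switch" ]; then
        ${HAGRP:-hagrp} -switch "$group" -to "$primary"
    fi
}
```

From a trigger script on the cluster you would then call something like "fail_back IPFail A B", with your real group and system names in place of these placeholders. Test it on a non-production cluster first; an automatic switch means a second short outage every time the primary comes back.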
Manik -
I have to ask: is the IP address used in the IP resource the same as the address
of the interface on the first node? If it is, change it to a VIP (virtual IP) that
is not bound to an interface on a node. You may need to change your front-end
server(s), client(s), and application(s) to start using the VIP and not the main
interface of a node.
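A quick way to check for that overlap is sketched below. This is only an illustration: the function name is made up and the addresses in the usage note are documentation placeholders, not anything from your cluster.

```shell
#!/bin/sh
# Hypothetical check: does the address on the IP resource collide with
# one of the nodes' own (base) interface addresses?
#   $1     = address configured on the IP resource
#   $2...  = base addresses of the node interfaces
# Returns 0 if there is a collision, 1 otherwise.
vip_collides_with_base() {
    vip=$1
    shift
    for base in "$@"; do
        if [ "$vip" = "$base" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, with node addresses 192.0.2.11 and 192.0.2.12 you could run: vip_collides_with_base "$(hares -value IPFail_IP Address)" 192.0.2.11 192.0.2.12 && echo "use a separate VIP". If it reports a collision, pick a free address on the same subnet for the IP resource and point clients at that one.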