VCS on AIX - failover problem
I did not find a VCS-related group, so I am asking here :)
I am using Veritas Cluster Server on AIX LPARs (AIX 5.3).
The cluster has 2 nodes, node1 and node2. There are 4 volume
groups with several filesystems each: 3 of them are 'normal'
VGs and one is a so-called 'scalable' VG with more than 130
disks on the SAN and a total size of over 5 TB. This VG (oravg)
contains 4 filesystems of 1.2 TB each.
I am close to handing this cluster over to production, but
during testing I still have an issue. When failing over from
node1 to node2, oravg (the big VG mentioned above) does not
fail over to node2. The error found in the VCS log file is:
VCS ERROR V-16-2-13066, Agent is calling clean for resource
(oravg) because the resource is not up even after online
Interestingly, after this unsuccessful failover oravg is not
varied on on any node, but I can vary it on manually on node2,
and it then comes online. Also interesting is that failover from
node2 to node1 is always successful (complicated, I know).
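
To compare the failing direction (node1 to node2) with the working one, it may help to pull every log entry that mentions oravg out of the engine log on both nodes. Below is a rough Python sketch of such a filter, not a tested tool. The line layout it expects (timestamp, "VCS", severity, message ID, node name in parentheses, then the text) and the helper names parse_line, find_events, count_by_node and scan_file are my own assumptions, so adjust the pattern if your log lines look different.

```python
import re
from collections import Counter

# Assumed engine log line layout (an assumption, not taken from the post):
#   2009/01/01 10:00:00 VCS ERROR V-16-2-13066 (node2) Agent is calling clean ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+VCS\s+"
    r"(?P<sev>[A-Z]+)\s+(?P<msgid>V-\d+-\d+-\d+)\s+"
    r"(?:\((?P<node>[^)]+)\)\s+)?(?P<text>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def find_events(lines, resource, msgid=None):
    """Collect parsed lines whose text mentions `resource` (substring match).

    If `msgid` is given, only lines with exactly that message ID are kept.
    """
    events = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip lines that do not follow the assumed layout
        if resource not in rec["text"]:
            continue
        if msgid is not None and rec["msgid"] != msgid:
            continue
        events.append(rec)
    return events


def count_by_node(events):
    """Count matching events per node name ('unknown' if the line had none)."""
    return Counter(e["node"] or "unknown" for e in events)


def scan_file(path, resource, msgid=None):
    """Run find_events over a log file on disk."""
    with open(path, errors="replace") as fh:
        return find_events(fh, resource, msgid)
```

For example, scan_file("/var/VRTSvcs/log/engine_A.log", "oravg", "V-16-2-13066") should list only the clean calls for oravg, and leaving out the message ID should show what was logged for the resource just before each of them. The path is the usual engine log location, so change it if your installation logs elsewhere.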
So far I have:
- increased timeouts wherever I could - no change
- re-imported oravg on node2 - no change
- opened a call with Symantec - no answer yet
- found a similar issue, with a patch, on the Veritas website,
but it relates to Solaris only:
I am still trying to figure out what is wrong, but so far I
cannot solve it myself :) Any advice is appreciated.