SFRAC - 5.0 configuration problem
Steps taken:
------------
SFRAC was installed on 2 nodes (using installsfrac).
SFRAC was configured on 2 nodes (installsfrac -configure).
Fencing was configured and the cluster nodes were rebooted.
Problem:
--------
1) gabconfig -a shows that two ports (f, the CFS port, and w, the vxconfigd
port) are not running on both nodes. How can I bring these two ports up on both nodes?
2) The "vxdctl -c mode" output is correct on the MASTER node but differs
on the slave node. How can I make both nodes part of the cluster?
Output In Node A
----------------
# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 5afe03 membership 01
Port b gen 5afe05 membership 01
Port d gen 5afe0a membership 01
Port f gen 5afe0e membership ;1
Port f gen 5afe0e visible 0
Port h gen 5afe07 membership 01
Port o gen 5afe08 membership 01
Port v gen 5afe0b membership 01
Port w gen 5afe0c membership ;1
Port w gen 5afe0c visible 0
#
# vxdctl -c mode
mode: enabled: cluster active - MASTER
master: invicta
reconfig: master selection
#
Output In Node B
----------------
# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 5afe03 membership 01
Port b gen 5afe05 membership 01
Port d gen 5afe0a membership 01
Port h gen 5afe07 membership 01
Port o gen 5afe08 membership 01
Port v gen 5afe0b membership 01
#
# vxdctl -c mode
mode: enabled: cluster inactive
#
Verified:
---------
a. Ran vxdctl -k stop, restarted vxconfigd (CVMVxconfigd), and checked the
gabconfig -a output.
b. Rebooted the hosts, ran hastart, and checked the gabconfig -a output.
Please pass on any hints or suggestions for solving this problem.
Regards
~Saravanan
Re: SFRAC - 5.0 configuration problem
port f = CFS (the vxfsckd daemon)
So check whether vxfsckd is running on the second node, and start it (through
VCS if needed).
port w = vxconfigd
Again, this is started as part of VCS.
I would start by looking at "hastatus -sum" and at the
/var/VRTSvcs/log/engine_A.log file for clues as to why it did not start.
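The port check described above can be sketched in a few lines of Python. This is only an illustration, not a Veritas tool: the function names are made up here, and it assumes a cluster of fewer than ten nodes, so that every digit in the membership field is a node ID and ';' is just a placeholder for a node that is not in that state.

```python
import re

# Matches lines such as "Port f gen 5afe0e membership ;1".
PORT_LINE = re.compile(r"^Port\s+(\w)\s+gen\s+\w+\s+(\w+)\s+(\S+)$")


def parse_gab_ports(text):
    """Return {port: {state: set_of_node_ids}} from `gabconfig -a` output."""
    ports = {}
    for line in text.splitlines():
        m = PORT_LINE.match(line.strip())
        if m is None:
            continue  # header, separator, or shell prompt
        port, state, nodes = m.groups()
        # Assumption: fewer than ten nodes, so each digit is one node ID.
        ids = {int(ch) for ch in nodes if ch in "0123456789"}
        ports.setdefault(port, {})[state] = ids
    return ports


def incomplete_ports(text, required_ports, expected_nodes):
    """List required ports whose membership lacks any of the expected nodes."""
    ports = parse_gab_ports(text)
    expected = set(expected_nodes)
    return [
        p for p in required_ports
        if not expected <= ports.get(p, {}).get("membership", set())
    ]


# Output pasted from the original post.
NODE_A = """\
GAB Port Memberships
Port a gen 5afe03 membership 01
Port b gen 5afe05 membership 01
Port d gen 5afe0a membership 01
Port f gen 5afe0e membership ;1
Port f gen 5afe0e visible 0
Port h gen 5afe07 membership 01
Port o gen 5afe08 membership 01
Port v gen 5afe0b membership 01
Port w gen 5afe0c membership ;1
Port w gen 5afe0c visible 0
"""

NODE_B = """\
GAB Port Memberships
Port a gen 5afe03 membership 01
Port b gen 5afe05 membership 01
Port d gen 5afe0a membership 01
Port h gen 5afe07 membership 01
Port o gen 5afe08 membership 01
Port v gen 5afe0b membership 01
"""

if __name__ == "__main__":
    print(incomplete_ports(NODE_A, "abdfhovw", [0, 1]))  # ['f', 'w']
    print(incomplete_ports(NODE_B, "abdfhovw", [0, 1]))  # ['f', 'w']
```

On both pasted outputs it singles out ports f and w, which matches what was reported.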
Re: SFRAC - 5.0 configuration problem
Hi,
Yes, after starting vxfsckd, port f came up, so there are no more issues with port f.
But port w (vxconfigd) still has a problem. Looking at hastatus -summary,
hares -display, and the log /var/VRTSvcs/log/engine_A.log, I see that the
VCSweb application has faulted on one node. I suspect this might be the cause
of the port w problem.
Even when I try to run vcsweb, it gives an error: the vcs*.war file is not found.
Any ideas or hints on how to solve this? Please find the log excerpt from
engine_A.log below.
grep -i "fault" /var/VRTSvcs/log/engine_A.log
2006/10/18 11:28:24 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, Group: ClusterService) is FAULTED (timed out) on sys invicta
2006/10/18 11:28:24 VCS INFO V-16-6-15004 (invicta) hatrigger:Failed to send trigger for resfault; script doesn't exist
2006/10/18 11:45:16 VCS ERROR V-16-1-10303 Resource cvm_clus (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys invicta
2006/10/18 11:45:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system invicta
2006/10/18 11:45:17 VCS INFO V-16-6-15004 (invicta) hatrigger:Failed to send trigger for resfault; script doesn't exist
2006/10/18 14:45:53 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, Group: ClusterService) is FAULTED (timed out) on sys invicta
2006/10/18 14:45:53 VCS INFO V-16-6-15004 (invicta) hatrigger:Failed to send trigger for resfault; script doesn't exist
2006/10/18 14:49:06 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, Group: ClusterService) is FAULTED (timed out) on sys charger
2006/10/18 14:49:06 VCS INFO V-16-6-15004 (charger) hatrigger:Failed to send trigger for resfault; script doesn't exist
2006/10/18 14:57:34 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, Group: ClusterService) is FAULTED (timed out) on sys invicta
2006/10/18 14:57:34 VCS INFO V-16-6-15004 (invicta) hatrigger:Failed to send trigger for resfault; script doesn't exist
2006/10/18 15:14:46 VCS ERROR V-16-1-10303 Resource cvm_clus (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys charger
2006/10/18 15:14:46 VCS ERROR V-16-1-10205 Group cvm is faulted on system charger
2006/10/18 15:14:46 VCS INFO V-16-6-15004 (charger) hatrigger:Failed to send trigger for resfault; script doesn't exist
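To see at a glance which resource faulted on which system, the excerpt can be tallied with a small script. This is a rough sketch only: the function name is made up for illustration, and it assumes each log entry sits on a single line, as it does in the log file itself (the forum wraps long lines when pasting).

```python
import re
from collections import Counter

# Matches the V-16-1-10303 style entries, e.g.
# "Resource cvm_clus (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys charger"
FAULT = re.compile(
    r"Resource (\S+) \(Owner: [^,]*, Group: ([^)]+)\) "
    r"is FAULTED \(([^)]*)\) on sys (\S+)"
)


def count_faults(lines):
    """Count FAULTED entries per (resource, group, system)."""
    counts = Counter()
    for line in lines:
        m = FAULT.search(line)
        if m:
            resource, group, _reason, system = m.groups()
            counts[(resource, group, system)] += 1
    return counts


# A few entries from the excerpt, one per line.
SAMPLE = [
    "2006/10/18 11:28:24 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, "
    "Group: ClusterService) is FAULTED (timed out) on sys invicta",
    "2006/10/18 11:45:16 VCS ERROR V-16-1-10303 Resource cvm_clus (Owner: unknown, "
    "Group: cvm) is FAULTED (timed out) on sys invicta",
    "2006/10/18 11:45:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system invicta",
    "2006/10/18 14:45:53 VCS ERROR V-16-1-10303 Resource VCSweb (Owner: unknown, "
    "Group: ClusterService) is FAULTED (timed out) on sys invicta",
    "2006/10/18 15:14:46 VCS ERROR V-16-1-10303 Resource cvm_clus (Owner: unknown, "
    "Group: cvm) is FAULTED (timed out) on sys charger",
]

if __name__ == "__main__":
    # On a cluster node one could pass open("/var/VRTSvcs/log/engine_A.log") instead.
    for key, n in sorted(count_faults(SAMPLE).items()):
        print(n, *key)
```

The point of the tally is that cvm_clus has timed out on both invicta and charger, separately from the VCSweb faults.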
Thanks
saravanan
Re: SFRAC - 5.0 configuration problem
For the VCSweb resource, the VCS web application must be installed properly. You
should see something like "?.war" under /opt/VRTSweb/VERITAS. It sounds like
your system is missing this file.
The port w problem may be related to your cvm group. Check at the VxVM level: do
you see the vxconfigd daemon running?
Yu
Re: SFRAC - 5.0 configuration problem
Here is the problem:
2006/10/18 15:14:46 VCS ERROR V-16-1-10303 Resource cvm_clus (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys charger
2006/10/18 15:14:46 VCS ERROR V-16-1-10205 Group cvm is faulted on system charger
So you can see that port "w" (the VxVM part of the cluster) will not join.
A timeout like this would mean one of the following:
1. The OnlineTimeout (300 seconds) has been reached: it took more than
300 seconds to bring port "w" online.
2. The disks are slow to respond. As part of port "w" joining, the disks are
compared between the joining node and the master node.
3. The CVMTimeout in main.cf is too low.
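For the third cause, the current value can be read straight out of main.cf. The snippet below is a minimal sketch: the function name and the sample resource block (cluster name, node IDs) are made up for illustration, and it assumes the attribute appears in the file as a plain "CVMTimeout = <number>" line. If the line is absent the function returns None, in which case the default applies.

```python
import re


def cvm_timeout(main_cf_text):
    """Return the CVMTimeout value found in main.cf text, or None if not set."""
    m = re.search(r"^\s*CVMTimeout\s*=\s*(\d+)", main_cf_text, re.M)
    return int(m.group(1)) if m else None


# Hypothetical CVMCluster resource block; the names and IDs are placeholders.
SAMPLE_MAIN_CF = """\
CVMCluster cvm_clus (
    CVMClustName = example_cluster
    CVMNodeId = { invicta = 0, charger = 1 }
    CVMTransport = gab
    CVMTimeout = 200
    )
"""

if __name__ == "__main__":
    # On a cluster node one could read /etc/VRTSvcs/conf/config/main.cf instead.
    print(cvm_timeout(SAMPLE_MAIN_CF))
```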
OK, how to fix it:
1. On the node that does not want to join port "w", run:
vxclustadm -m vcs -t gab startnode
Check for any messages on the console, in the /var/adm/messages file,
and in the /var/VRTSvcs/log/engine_A.log file.
While this is running, do "gabconfig -a". You should see at least port "u"
joining, and then perhaps "w" and "v" joining (perhaps before closing again).
2. If it still does not want to join, try increasing the CVMTimeout. If
the other node in the cluster is still running, you will have to use "ha"
commands to do this. The default is 200; make it 500.
3. The last option is to see what vxconfigd (the part of Volume Manager that
does not want to join) is doing while joining:
vxconfigd -k -x 9 -x log /var/tmp/vxconfigd.log
Then try the command in step 1 again to start the cluster on the node.
If there is still no joy, run "vxconfigd -k" and then either post the log file here
or open a case with support (giving them the log file).