Re: Intermittent RWSCS state - VMS


  1. Re: Intermittent RWSCS state

    Marty Kuhrt wrote on 09/09/2008 12:38:20 PM:

    > Richard Brodie wrote:
    > > wrote in message news:OF1D8E67E9.7DD1DEB5-ON852574BF.00508FDB-
    > >> Okay, thanks so now we know what they are, but not if things are
    > >> good or bad.
    > >
    > > I don't think the CR_WAITS in the VMS$VAXcluster sysap are good news.
    > > That's sort of locky rather than bulk traffic. LOCKDIRWT too high on
    > > an old VAX? Beyond that I would grab one of Keith Parris' presentations
    > > on cluster/lock manager performance.
    >
    > I ran into something similar once that was triggered by my cat slightly
    > pulling the cat-5 cable (cats and cat5 don't mix). Not so much that the
    > network connection was completely lost, but this VAX, in a cluster of
    > Alphas, was joining and leaving the cluster as fast as it could.
    > Everybody else in the cluster was stalling waiting for the transition to
    > finish, which would restart immediately. When I finally got logged into
    > a machine console via a VT and did a reply/enable to see OPCOM stuff, I
    > saw the endless stream of joining/leaving messages from the culprit
    > VAX. Pulled the network cable, reseated it, and all was well when the
    > storm subsided.
    >
    > I've also seen something like this as well when a hub/switch went "kinda
    > bad".
    >
    > Since the VAX in OP's question is probably only talking to the cluster
    > via its 10M network cable, it might be as simple as that.

    No such luck. Talking on FDDI for SCS traffic. 10MB network for other.

    > Bad card,
    > bad cable, bad hub/switch, etc. Babble, babble, babble, and the next
    > thing you know you're knee deep in RWSCS.
    >
    > Since this is a home cluster I use for porting and development work, I
    > don't normally have OPCOM enabled, or do much logging. VMS machines
    > just run right, right? ;^)



  2. Re: Intermittent RWSCS state

    norm.raphael@metso.com wrote:
    >
    > Marty Kuhrt wrote on 09/09/2008 12:38:20 PM:
    >
    > > Richard Brodie wrote:
    > > > wrote in message news:OF1D8E67E9.7DD1DEB5-ON852574BF.00508FDB-
    > > >> Okay, thanks so now we know what they are, but not if things are
    > > >> good or bad.
    > > >
    > > > I don't think the CR_WAITS in the VMS$VAXcluster sysap are good news.
    > > > That's sort of locky rather than bulk traffic. LOCKDIRWT too high on
    > > > an old VAX? Beyond that I would grab one of Keith Parris' presentations
    > > > on cluster/lock manager performance.
    > >
    > > I ran into something similar once that was triggered by my cat slightly
    > > pulling the cat-5 cable (cats and cat5 don't mix). Not so much that the
    > > network connection was completely lost, but this VAX, in a cluster of
    > > Alphas, was joining and leaving the cluster as fast as it could.
    > > Everybody else in the cluster was stalling waiting for the transition to
    > > finish, which would restart immediately. When I finally got logged into
    > > a machine console via a VT and did a reply/enable to see OPCOM stuff, I
    > > saw the endless stream of joining/leaving messages from the culprit
    > > VAX. Pulled the network cable, reseated it, and all was well when the
    > > storm subsided.
    > >
    > > I've also seen something like this as well when a hub/switch went "kinda
    > > bad".
    > >
    > > Since the VAX in OP's question is probably only talking to the cluster
    > > via its 10M network cable, it might be as simple as that.
    >
    > No such luck. Talking on FDDI for SCS traffic. 10MB network for other.


    Are you certain? IIRC, the default cluster configuration is to enable all
    SCS-capable circuits, and normally all the traffic would end up on the
    fastest one (FDDI), but if there was a momentary failure or excessive
    congestion on the FDDI, it might have failed over to the Ethernet, thus
    hitting the VAX's 10Mb bottleneck, and then never failed back. I
    think the SHOW CLUSTER circuit counters should reveal if this has
    happened. (I think the 2nd example shows circuit counters by circuit,
    but not circuit names, so I can't tell which is which, though possibly a
    cluster expert could.)

    There is a way to force it to use *only* the FDDI, and I think there's
    a way to force it to fail back to FDDI if for some reason it has failed
    over to the Ethernet.

    HTH.
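
The "failed over and never failed back" scenario can be pictured with a toy model. This is only a sketch of the behaviour being speculated about in this post, not the cluster software's actual path-selection logic: the `Circuit` class, the `select_circuit` function and its `fail_back` flag are all hypothetical names invented for the illustration, and the "sticky" policy is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    name: str
    mbps: int
    up: bool = True


def select_circuit(circuits, current=None, fail_back=False):
    """Return the circuit that carries traffic next.

    Hypothetical policy: stay on the current circuit while it is up,
    unless fail_back is set, in which case always prefer the fastest
    circuit that is up.
    """
    if current is not None and current.up and not fail_back:
        return current
    usable = [c for c in circuits if c.up]
    return max(usable, key=lambda c: c.mbps) if usable else None


fddi = Circuit("FDDI", 100)
ethernet = Circuit("Ethernet", 10)
circuits = [fddi, ethernet]

active = select_circuit(circuits)            # both up: the fastest one is chosen
fddi.up = False                              # momentary FDDI failure
active = select_circuit(circuits, active)    # traffic moves to the Ethernet
fddi.up = True                               # FDDI recovers
stuck = select_circuit(circuits, active)     # sticky policy stays where it is
forced = select_circuit(circuits, active, fail_back=True)
print(stuck.name, forced.name)
```

Under these assumptions the traffic stays on the slower circuit after the FDDI recovers until something explicitly forces the choice to be re-evaluated, which is the situation the circuit counters would help confirm or rule out.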

    >
    > > Bad card,
    > > bad cable, bad hub/switch, etc. Babble, babble, babble, and the next
    > > thing you know you're knee deep in RWSCS.
    > >
    > > Since this is a home cluster I use for porting and development work, I
    > > don't normally have OPCOM enabled, or do much logging. VMS machines
    > > just run right, right? ;^)



    --
    John Santos
    Evans Griffiths & Hart, Inc.
    781-861-0670 ext 539

  3. Re: Intermittent RWSCS state

    "John Santos" wrote in message
    news:shYxk.16$ia.0@nwrddc02.gnilink.net...
    > There is a way to force it to use *only* the FDDI, and I think there's
    > a way to force it to fail back to FDDI if for some reason it has failed
    > over to the Ethernet.
    > --
    > John Santos
    > Evans Griffiths & Hart, Inc.
    > 781-861-0670 ext 539


    Look for SYS$EXAMPLES:LAVC$STOP_BUS


  4. Re: Intermittent RWSCS state

    John Santos wrote:
    > norm.raphael@metso.com wrote:
    >>
    >> Marty Kuhrt wrote on 09/09/2008 12:38:20 PM:
    >>
    >> > Since the VAX in OP's question is probably only talking to the cluster
    >> > via its 10M network cable, it might be as simple as that.
    >>
    >> No such luck. Talking on FDDI for SCS traffic. 10MB network for other.
    >
    > Are you certain? IIRC, the default cluster configuration is to enable all
    > SCS-capable circuits, and normally all the traffic would end up on the
    > fastest one (FDDI), but if there was a momentary failure or excessive
    > congestion on the FDDI, it might have failed over to the Ethernet, thus
    > hitting the VAX's 10Mb bottleneck, and then never failed back. I
    > think the SHOW CLUSTER circuit counters should reveal if this has
    > happened. (I think the 2nd example shows circuit counters by circuit,
    > but not circuit names, so I can't tell which is which, though possibly a
    > cluster expert could.)
    >
    > There is a way to force it to use *only* the FDDI, and I think there's
    > a way to force it to fail back to FDDI if for some reason it has failed
    > over to the Ethernet.
    >
    > HTH.


    Now that I think on it, was there a FDDI interconnect for VAXen? I
    vaguely remember that Nemonix was making an after market one, but I
    don't remember a "native" one. Of course, that doesn't mean too much,
    since I occasionally forget I have my glasses on my head. ;^)

  5. Re: Intermittent RWSCS state


    "Marty Kuhrt" wrote in message
    news:r8Gdnf9Z3fma2FTVnZ2dnUVZ_q_inZ2d@speakeasy.net...

    > Now that I think on it, was there a FDDI interconnect for VAXen?


    There certainly were; there was the DEFQA. It was 100Mb
    Ethernet that Digital didn't sell for VAXen. Admittedly, running
    lock traffic over Q-bus with disk I/O over Fibre Channel seems
    a bit of a mismatch.

    Assuming it's not big iron, that is. It could be a BI or XMI FDDI
    adapter.



  6. Re: Intermittent RWSCS state

    On Sep 11, 5:19 pm, Marty Kuhrt wrote:
    > John Santos wrote:
    > > norm.raph...@metso.com wrote:
    >
    > >> Marty Kuhrt wrote on 09/09/2008 12:38:20 PM:
    >
    > >> > Since the VAX in OP's question is probably only talking to the cluster
    > >> > via its 10M network cable, it might be as simple as that.
    >
    > >> No such luck. Talking on FDDI for SCS traffic. 10MB network for other.
    >
    > > Are you certain? IIRC, the default cluster configuration is to enable all
    > > SCS-capable circuits, and normally all the traffic would end up on the
    > > fastest one (FDDI), but if there was a momentary failure or excessive
    > > congestion on the FDDI, it might have failed over to the Ethernet, thus
    > > hitting the VAX's 10Mb bottleneck, and then never failed back. I
    > > think the SHOW CLUSTER circuit counters should reveal if this has
    > > happened. (I think the 2nd example shows circuit counters by circuit,
    > > but not circuit names, so I can't tell which is which, though possibly a
    > > cluster expert could.)
    >
    > > There is a way to force it to use *only* the FDDI, and I think there's
    > > a way to force it to fail back to FDDI if for some reason it has failed
    > > over to the Ethernet.
    >
    > > HTH.
    >
    > Now that I think on it, was there a FDDI interconnect for VAXen? I
    > vaguely remember that Nemonix was making an after market one, but I
    > don't remember a "native" one. Of course, that doesn't mean too much,
    > since I occasionally forget I have my glasses on my head. ;^)


    There was FDDI from DEC for TURBOCHANNEL (the DEFTA). And there were
    VAXes (e.g. VAXstation 4000s?) with TURBOCHANNEL. I'm pretty sure there
    was DEFTA support, on VAX, in at least some versions of VMS, though I
    don't recall actually ever seeing that combination (whereas I knew of
    lots of DEC 3000s with FDDI, especially where resilience was/is of
    interest). A more definitive answer would be the VAX/VMS SPDs
    themselves.

  7. Re: Intermittent RWSCS state

    Marty Kuhrt writes:

    >Now that I think on it, was there a FDDI interconnect for VAXen? I
    >vaguely remember that Nemonix was making an after market one, but I
    >don't remember a "native" one. Of course, that doesn't mean too much,
    >since I occasionally forget I have my glasses on my head. ;^)


    Here's an oddball.

    Around 1990, there was actually an FDDI interconnect being developed that
    sat on a SCSI bus, intended for VAXstation 3100 series systems. My
    baptism by fire in VMS drivers was to write a driver for this thing! I
    got it to run in a LAN Cluster, and shortly thereafter the project was
    cancelled.

  8. Re: Intermittent RWSCS state

    Marty Kuhrt wrote:
    > John Santos wrote:
    >> norm.raphael@metso.com wrote:
    >>>
    >>> Marty Kuhrt wrote on 09/09/2008 12:38:20 PM:
    >>>
    >>> > Since the VAX in OP's question is probably only talking to the cluster
    >>> > via its 10M network cable, it might be as simple as that.
    >>> No such luck. Talking on FDDI for SCS traffic. 10MB network for other.
    >>
    >> Are you certain? IIRC, the default cluster configuration is to enable all
    >> SCS-capable circuits, and normally all the traffic would end up on the
    >> fastest one (FDDI), but if there was a momentary failure or excessive
    >> congestion on the FDDI, it might have failed over to the Ethernet, thus
    >> hitting the VAX's 10Mb bottleneck, and then never failed back. I
    >> think the SHOW CLUSTER circuit counters should reveal if this has
    >> happened. (I think the 2nd example shows circuit counters by circuit,
    >> but not circuit names, so I can't tell which is which, though possibly a
    >> cluster expert could.)
    >>
    >> There is a way to force it to use *only* the FDDI, and I think there's
    >> a way to force it to fail back to FDDI if for some reason it has failed
    >> over to the Ethernet.
    >>
    >> HTH.
    >
    > Now that I think on it, was there a FDDI interconnect for VAXen? I
    > vaguely remember that Nemonix was making an after market one, but I
    > don't remember a "native" one. Of course, that doesn't mean too much,
    > since I occasionally forget I have my glasses on my head. ;^)


    There was at least one FDDI controller for the Q-bus from DEC. Can't remember the
    name of it.
    (I wonder if anyone ever tried using that on a PDP-11... :-) )

    Johnny

    --
    Johnny Billquist       || "I'm on a bus
                           ||  on a psychedelic trip
    email: bqt@softjar.se  ||  Reading murder books
    pdp is alive!          ||  tryin' to stay hip" - B. Idol

  9. Re: Intermittent RWSCS state

    Michael Moroney wrote:
    > Marty Kuhrt writes:
    >
    >
    >>Now that I think on it, was there a FDDI interconnect for VAXen? I
    >>vaguely remember that Nemonix was making an after market one, but I
    >>don't remember a "native" one. Of course, that doesn't mean too much,
    >>since I occasionally forget I have my glasses on my head. ;^)

    >
    >
    > Here's an oddball.
    >
    > Around 1990, there was actually an FDDI interconnect being developed that
    > sat on a SCSI bus, intended for VAXstation 3100 series systems. My
    > baptism by fire in VMS drivers was to write a driver for this thing! I
    > got it to run in a LAN Cluster, and shortly thereafter the project was
    > cancelled.


    One of my customers had some VAX 6000-series systems with FDDI, (XMI-based
    I think, not BI), so it was definitely real and supported.


    --
    John Santos
    Evans Griffiths & Hart, Inc.
    781-861-0670 ext 539

  10. Re: Intermittent RWSCS state

    Am having a problem with the IMAP server.

    This runs on a node that has direct access to all the necessary disks.
    So no clustering features should be needed, right?

    SHOW SYS reveals it is in RWSCS state (every time I do SHOW SYS).

    But SHOW PROC/CONT never shows it in RWSCS. It does show it in MWAIT as
    well as normal COM/HIB/LEF states.

    This is Alpha VMS 8.3.

    The process serves about 200 message headers then dies. The client then
    restarts at message 1. Perhaps some message at the end causes IMAP
    to crash, and the client then restarts to download the messages database
    from scratch.

    However, it is interesting to see the discrepancy between SHOW SYS and
    SHOW PROC/CONT in terms of process status.

  11. Re: Intermittent RWSCS state


    "JF Mezei" wrote in message
    news:48cad299$0$12384$c3e8da3@news.astraweb.com...

    > This runs on a node that has direct access to all the necessary disks.
    > So no clustering features should be needed, right?


    You still need to co-ordinate lock operations for file open/close etc.

    > But SHOW PROC/CONT never shows it in RWSCS. It does show it in MWAIT as
    > well as normal COM/HIB/LEF states.


    RWSCS is a subtype of MWAIT.
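
One way to picture why the two displays disagree is the toy model below. It is a sketch only: it assumes, based on the behaviour reported in this thread, that one display names the resource being waited for while the other reports just the scheduling state. The function names and the way the wait is represented are hypothetical and are not real VMS data structures.

```python
# Toy model only: the representation of a waiting process here is invented
# for illustration and does not mirror any real VMS structure.

def coarse_state(state, resource=None):
    """What a display that reports only the scheduling state would show."""
    return state


def detailed_state(state, resource=None):
    """What a display that names the awaited resource would show."""
    if state == "MWAIT" and resource is not None:
        return "RW" + resource      # e.g. waiting on SCS is shown as RWSCS
    return state


# The same waiting process, seen through the two displays.
print(detailed_state("MWAIT", "SCS"), coarse_state("MWAIT", "SCS"))
# A process in an ordinary state looks the same in both.
print(detailed_state("HIB"), coarse_state("HIB"))
```

In this model both displays describe the same wait; they differ only in how much detail they report, so seeing RWSCS in one and MWAIT in the other is not a contradiction.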



  12. Re: Intermittent RWSCS state

    In article , "Richard Brodie" writes:
    >
    > "JF Mezei" wrote in message
    > news:48cad299$0$12384$c3e8da3@news.astraweb.com...
    >
    >> This runs on a node that has direct access to all the necessary disks.
    >> So no clustering features should be needed, right?

    >
    > You still need to co-ordinate lock operations for file open/close etc.
    >


    I had a fellow set up two HP-UX systems with their sendmail accessing
    a shared data store via NFS mount (i.e. no locking). Lasted about
    20 minutes until someone broadcast an email that both those systems
    saw.
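
The general point, that writers sharing one store need to coordinate, can be sketched as follows. This is a minimal local illustration using a POSIX advisory lock (`fcntl.flock`) on a temporary file; it does not model NFS, sendmail, or the VMS distributed lock manager, and the `deliver` and `simulate` functions and the mailbox layout are hypothetical. It requires a Unix-like system.

```python
import fcntl
import os
import tempfile
import threading


def deliver(mailbox_path, sender, body):
    """Append one two-line message while holding an exclusive lock."""
    with open(mailbox_path, "a") as mbox:
        fcntl.flock(mbox, fcntl.LOCK_EX)    # block until no other writer holds it
        try:
            mbox.write(f"From {sender}\n")  # header and body are separate writes,
            mbox.flush()                    # so an unlocked writer could cut in here
            mbox.write(f"{body}\n")
            mbox.flush()
        finally:
            fcntl.flock(mbox, fcntl.LOCK_UN)


def simulate(writers=4, messages=25):
    """Run several concurrent writers and return the mailbox lines."""
    fd, path = tempfile.mkstemp()
    os.close(fd)

    def work(w):
        for i in range(messages):
            deliver(path, f"host{w}", f"body-{w}-{i}")

    try:
        threads = [threading.Thread(target=work, args=(w,)) for w in range(writers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        with open(path) as mbox:
            return mbox.read().splitlines()
    finally:
        os.remove(path)


lines = simulate()
# Every header line should be followed by a body from the same writer.
intact = all(
    lines[k + 1].startswith("body-" + lines[k].split()[1][4:] + "-")
    for k in range(0, len(lines), 2)
)
print(len(lines), intact)
```

With the lock held across both writes, each message stays contiguous. Removing the `flock` calls allows headers and bodies from different writers to interleave, which is a small-scale version of the corruption described above.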

