-
why VMS at uni?
because
"It appears that the Machine Room has suffered a major
power failure. As a result Bluecrystal is currently unavailable.
We are working on the problem and hope to have the system back
up and running by Monday.
Sorry for the inconvenience."
from Sunday's sysadmin's notification about
our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).
It is still down on Mon morning.
We could certainly do with some RAS (or RASS)...
But to be honest, I wonder how much administrator skills have to
do with RAS, be it a VMS or other cluster.
--
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 928 8233
Fax: +44 (0)117 929 4423
-
RE: why VMS at uni?
> -----Original Message-----[color=blue]
> From: Anton Shterenlikht [mailto:mexas@bristol.ac.uk]
> Sent: April 21, 2008 4:57 AM
> To: [email]Info-VAX@Mvb.Saic.Com[/email]
> Subject: why VMS at uni?
>
> because
>
> "It appears that the Machine Room has suffered a major
> power failure. As a result Bluecrystal is currently
> unavailable.
>
> We are working on the problem and hope to have the system back
> up and running by Monday.
>
> Sorry for the inconvenience."
>
> from Sunday's sysadmin's notification about
> our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).
> It is still down on Mon morning.
>
> We could certainly do with some RAS (or RASS)...
>
> But to be honest, I wonder how much administrator skills have to
> do with RAS, be it a VMS or other cluster.
>
> --
> Anton Shterenlikht
> Room 2.6, Queen's Building
> Mech Eng Dept
> Bristol University
> University Walk, Bristol BS8 1TR, UK
> Tel: +44 (0)117 928 8233
> Fax: +44 (0)117 929 4423[/color]
RASS is all about the skills of the local administrator to implement the
technology available to him/her in order to protect the business he/she
is supporting and to meet defined SLAs. Some platforms have proven
technology that makes it easy to do this, while others have loads of
scripts, third-party utilities, etc. that need to be put in place.
Also, if the business says a 2 day downtime is ok, then it does not make
sense to spend excessive $'s on technology that will make it available
in hours.
Course, the challenge is that the business will often forget that they
originally said 2 days of downtime is an acceptable risk and will want
the app back up 10 minutes after it goes down.
:-)
Regards
Kerry Main
Senior Consultant
HP Services Canada
Voice: 613-254-8911
Fax: 613-591-4477
kerryDOTmainAThpDOTcom
(remove the DOT's and AT)
OpenVMS - the secure, multi-site OS that just works.
-
Re: why VMS at uni?
Anton Shterenlikht wrote:[color=blue]
> because
>
> "It appears that the Machine Room has suffered a major
> power failure. As a result Bluecrystal is currently unavailable.
>
> We are working on the problem and hope to have the system back
> up and running by Monday.
>
> Sorry for the inconvenience."
>
> from Sunday's sysadmin's notification about
> our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).
> It is still down on Mon morning.
>
> We could certainly do with some RAS (or RASS)...
>
> But to be honest, I wonder how much administrator skills have to
> do with RAS, be it a VMS or other cluster.[/color]
The two most important factors are:
- the willingness to pay for redundancy
- the sysadmin's ability to implement it correctly
I doubt that many supercomputers have a standby system ....
Arne
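
For what it's worth, the effect of a standby system can be put into rough numbers. The sketch below assumes the redundant copies fail independently of each other, which is exactly what does not hold when a whole machine room loses power, so treat the result as an upper bound. The function name and the figures are illustrative only and do not describe any real site.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant systems, any one of which is enough.

    Assumes failures are independent (an assumption, not a fact about any
    real installation): the service is down only when every copy is down.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a fraction between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    # Illustrative figure: one system that is up 97% of the time.
    for n in (1, 2, 3):
        print(f"{n} copies -> {combined_availability(0.97, n):.6f}")
```

A shared dependency such as one power feed or one chiller puts a ceiling on the result no matter how many copies are added, which is why the willingness to pay has to extend to the infrastructure too.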
-
Re: why VMS at uni?
Arne Vajhøj wrote:[color=blue]
> Anton Shterenlikht wrote:[color=green]
>> because
>>
>> "It appears that the Machine Room has suffered a major
>> power failure. As a result Bluecrystal is currently unavailable.
>>
>> We are working on the problem and hope to have the system back
>> up and running by Monday.
>>
>> Sorry for the inconvenience."
>>
>> from Sunday's sysadmin's notification about
>> our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).
>> It is still down on Mon morning.
>>
>> We could certainly do with some RAS (or RASS)...
>>
>> But to be honest, I wonder how much administrator skills have to
>> do with RAS, be it a VMS or other cluster.[/color]
>
> The two most important factors are:
> - the willingness to pay for redundancy
> - the sysadmin's ability to implement it correctly
>
> I doubt that many supercomputers have a standby system ....
>
> Arne[/color]
100% uptime is EXPENSIVE. It generally requires redundant computers,
storage, air conditioners, electric power supply, chilled water, etc.
Costing out the necessary capital investment is generally enough to make
most people concede that 97% uptime will do just fine! There are cases
where nothing but 100% will do and that's when you see N+1 or Dual
Redundancy for everything essential to the operation of the data center.
It's generally not done unless lives, or many megabucks, are at stake!
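
To put the percentages in perspective, here is a minimal sketch that converts an uptime percentage into allowed downtime per year. It assumes a 365-day year; the function name and the list of percentages are illustrative, not taken from any SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a system may be down at the given uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (97.0, 99.0, 99.9, 99.99):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime -> {hours:.1f} hours down per year")
```

At 97% that works out to roughly 263 hours, or about 11 days a year, so a weekend-long outage fits comfortably inside that budget.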
-
Re: why VMS at uni?
In article <20080421085650.GB33626@mech-aslap33.men.bris.ac.uk>,
Anton Shterenlikht <mexas@bristol.ac.uk> writes:[color=blue]
> because
>
> "It appears that the Machine Room has suffered a major
> power failure. As a result Bluecrystal is currently unavailable.
>
> We are working on the problem and hope to have the system back
> up and running by Monday.
>
> Sorry for the inconvenience."
>
> from Sunday's sysadmin's notification about
> our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).[/color]
Somehow I doubt even VMS could continue to run when the power went down.
My servers can ride out outages of up to about 2 hours, but we don't
really plan on it. The UPS's are more for power conditioning and to
allow for a controlled shutdown in the event of a power failure.
[color=blue]
> It is still down on Mon morning.
>
> We could certainly do with some RAS (or RASS)...
>
> But to be honest, I wonder how much administrator skills have to
> do with RAS, be it a VMS or other cluster.[/color]
If your Uni is anything like mine, VMS wouldn't help this either.
While I am willing to make the trip in to reset things outside of
normal work time, the U's data center tends to not be as dedicated.
We have even had network outages that cut off the whole campus from
the Internet that were knowingly allowed to wait overnight or thru
the rest of a weekend rather than have someone called in to fix it.
It's the nature of the (academic) beast. And probably why so many
people don't take Universities seriously when discussing professional
IT installations.
bill
--
Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
[email]billg999@cs.scranton.edu[/email] | and a sheep voting on what's for dinner.
University of Scranton |
Scranton, Pennsylvania | #include <std.disclaimer.h>
-
Re: why VMS at uni?
On 21 Apr, 14:04, Arne Vajhøj <a...@vajhoej.dk> wrote:[color=blue]
> Anton Shterenlikht wrote:[color=green]
> > because[/color]
>[color=green]
> > "It appears that the Machine Room has suffered a major
> > power failure. As a result Bluecrystal is currently unavailable.[/color]
>[color=green]
> > We are working on the problem and hope to have the system back
> > up and running by Monday.[/color]
>[color=green]
> > Sorry for the inconvenience."[/color]
>[color=green]
> > from Sunday's sysadmin's notification about
> > our supercomp ([url]http://www.acrc.bris.ac.uk/acrc/hpc.htm[/url]).
> > It is still down on Mon morning.[/color]
>[color=green]
> > We could certainly do with some RAS (or RASS)...[/color]
>[color=green]
> > But to be honest, I wonder how much administrator skills have to
> > do with RAS, be it a VMS or other cluster.[/color]
>
> The two most important factors are:
> - the willingness to pay for redundancy
> - the sysadmin's ability to implement it correctly
>
> I doubt that many supercomputers have a standby system ....
>
> Arne[/color]
And I thought sys admins normally had no choice in redundancy - they
got it because management said they had to go... :os