maintenance downtime on our supercomputer - VMS
-
maintenance downtime on our supercomputer
Below is a message from the administrator of our supercomputer, www.bris.ac.uk.
Clearly, VMS cluster rolling update features would be very useful here.
Although I understand that cluster shutdown is also necessary in some
cases.
[...]
You have received this email because you have an account on bluecrystal,
a beowulf cluster managed by the Advanced Computing Research Centre,
Bristol University.
Greetings,
It is necessary to schedule some maintenance downtime on bluecrystal
to update the parallel filesystem software, amongst other things.
The system will be shut down on Wednesday 26th September (obviously
all running jobs will be killed when this happens). Hopefully this
work will be completed on the same day.
We apologise for any inconvenience caused by this essential
maintenance work.
[...]
--
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 928 8233
Fax: +44 (0)117 929 4423
-
Re: maintenance downtime on our supercomputer
On 09/13/07 10:31, Anton Shterenlikht wrote:
> Below is a message from the administrator of our supercomputer, www.bris.ac.uk.
> Clearly, VMS cluster rolling update features would be very useful here.
> Although I understand that cluster shutdown is also necessary in some
> cases.
>
> [...]
>
> You have received this email because you have an account on bluecrystal,
> a beowulf cluster managed by the Advanced Computing Research Centre,
> Bristol University.
>
> Greetings,
>
> It is necessary to schedule some maintenance downtime on bluecrystal
> to update the parallel filesystem software, amongst other things.
How do Beowulf clusters sync file accesses? Thru the master node?
> The system will be shut down on Wednesday 26th September (obviously
> all running jobs will be killed when this happens). Hopefully this
> work will be completed on the same day.
>
> We apologise for any inconvenience caused by this essential
> maintenance work.
--
Ron Johnson, Jr.
Jefferson LA USA
Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
-
Re: maintenance downtime on our supercomputer
Anton Shterenlikht wrote:
> Below is a message from the administrator of our supercomputer, www.bris.ac.uk.
Oh, that is one system Mr Vaxman probably wouldn't want to get near :-)
:-) :-) :-) :-) :-)
-
Re: maintenance downtime on our supercomputer
On Thu, Sep 13, 2007 at 11:43:31AM -0500, Ron Johnson wrote:
> On 09/13/07 10:31, Anton Shterenlikht wrote:
> >
> > It is necessary to schedule some maintenance downtime on bluecrystal
> > to update the parallel filesystem software, amongst other things.
>
> How do Beowulf clusters sync file accesses? Thru the master node?
You mean several processes accessing the same file? That is not easy,
as far as I understand. Or rather, it is discouraged because of the
significant overhead. Therefore one typically has a separate file
for each copy of the program (one per core). Where synchronised file
access is required, it will probably be done by the master node.
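
A minimal sketch of the one-file-per-process pattern described above,
assuming each copy of the program knows its own integer rank. The file
naming scheme and the helper names (rank_file, write_partial, merge) are
hypothetical and only for illustration; they are not taken from
bluecrystal or from any particular library.

```python
import os


def rank_file(directory, rank):
    """Path of the file used exclusively by one process (hypothetical naming)."""
    return os.path.join(directory, f"out.{rank:04d}.dat")


def write_partial(directory, rank, values):
    """Each process writes only to its own file, so no file locking is needed."""
    path = rank_file(directory, rank)
    with open(path, "w") as f:
        for v in values:
            f.write(f"{v}\n")
    return path


def merge(directory, nprocs):
    """The master reads the per-process files back in rank order."""
    merged = []
    for rank in range(nprocs):
        with open(rank_file(directory, rank)) as f:
            merged.extend(float(line) for line in f)
    return merged
```

Because no two processes ever open the same file, the filesystem never
has to arbitrate concurrent writes; the cost is one extra merge step at
the end.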
One example is a parallel matrix operation. At the start the master
node reads the matrix data from a file and splits it into N chunks,
according to the number of cores used in the analysis. The code executed
on each node might, if necessary, create a temp file used exclusively
by that node to store intermediate data. (Obviously it is best to keep
all data in RAM, or even better in cache if it fits.) When the computation
is complete, all slave nodes pass their data to the master, which then
reassembles the matrix and writes it to the file.
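
The read, split, compute, combine and write steps above can be sketched
roughly as follows. This is only an illustration under stated
assumptions: threads stand in for the cluster nodes (a real Beowulf job
would more likely use message passing between machines), the matrix is
stored as whitespace-separated text with one row per line, and the
per-chunk computation is a placeholder that doubles every element. All
function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def read_matrix(path):
    """Master step: read a whitespace-separated matrix, one row per line."""
    with open(path) as f:
        return [[float(x) for x in line.split()] for line in f if line.strip()]


def split_rows(matrix, n):
    """Split the rows into n contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(matrix), n)
    chunks, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        chunks.append(matrix[start:start + size])
        start += size
    return chunks


def worker(chunk):
    """Placeholder for the real per-node computation: double every element."""
    return [[2.0 * x for x in row] for row in chunk]


def run(in_path, out_path, n_workers):
    """Master reads and splits, workers compute, master combines and writes."""
    chunks = split_rows(read_matrix(in_path), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = list(pool.map(worker, chunks))
    combined = [row for part in results for row in part]
    with open(out_path, "w") as f:
        for row in combined:
            f.write(" ".join(str(x) for x in row) + "\n")
    return combined
```

Only the master touches the input and output files here, which matches
the point that synchronised file access is left to the master node.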
However, I might be wrong. You probably know this area better than me.
--
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 928 8233
Fax: +44 (0)117 929 4423