-
Re: [9fans] 9grid
> Hi Eric,
> I check lib/ndb/auth in the file server, this is what i have:
> hostid=bootes
> uid
> dm uid
i assume poor spelling and cut-n-paste failure. :-)
the key thing here is if the only hostid in /lib/ndb/auth
is bootes, then the cpu server's hostowner must be
bootes. is it? can you run commands from the cpu
server's console?
you're doing the right thing, you've just got a
configuration error.
- erik
-
Re: 9grid
Ok, i just ran some local commands from the cpu server, and it is ok. i'm
gonna use the cpu servers only as compute nodes. From the cpu server i
wanted to see responses, so i tried this and got:
    cpus# ssh terminal name
    ssh: dialing terminal name: connection refused
    cpus# ssh file server name
    ssh: reading server version: unexpected EOF
the second error i also got when i tried from either the terminal or the
file server.
yes, bootes is the cpu server's hostowner, but on the terminal i log in as
Armando.
what do you mean by "configuration error"? where could it be?
is ssh the right way to launch a task from the terminal?
thanks again
Armando
> i assume poor spelling and cut-n-paste failure. :-)
> the key thing here is if the only hostid in /lib/ndb/auth
> is bootes, then the cpu server's hostowner must be
> bootes. is it? can you run commands from the cpu
> server's console?
>
> you're doing the right thing, you've just got a
> configuration error.
>
> - erik
-
Re: [9fans] 9grid
> Ok, i just ran some local commands from cpu server, and it is ok, i'm
> gonna use the cpu servers only like a compute nodes. From cpu server i
> wanted to see responses , so i did and got this:
>     cpus# ssh terminal name
>     ssh: dialing terminal name: connection refused
>     cpus# ssh file server name
>     ssh: reading server version: unexpected EOF
> the second error, i got also when i tried either from terminal or file
> server.
> yes, bootes is the cpu server's hostowner,but on terminal i log as
> Armando.
> what do you mean with "configuration error"?where could it be?
> is ssh right to launch a task from terminal?
>
> thanks again
>
> Armando
>
You want to be sitting at a terminal, and start a command on a cpu
server, right?
    cpu -h <cpuserver> -c <command> <args>
That will execute the command on the cpu server and leave you at the
terminal prompt when you are done.
I have no idea why you are trying to ssh from your cpu server to your
terminal or to the fileserver. Forget ssh. If you have a Plan 9
network, ssh is 100% wrong for you.
John
-
Re: 9grid
On 11 Nov, 15:46, j...@csplan9.rit.edu wrote:
> > Ok, i just ran some local commands from cpu server, and it is ok, i'm
> > gonna use the cpu servers only like a compute nodes. From cpu server i
> > wanted to see responses , so i did and got this:
> >     cpus# ssh terminal name
> >     ssh: dialing terminal name: connection refused
> >     cpus# ssh file server name
> >     ssh: reading server version: unexpected EOF
> > the second error, i got also when i tried either from terminal or file
> > server.
> > yes, bootes is the cpu server's hostowner,but on terminal i log as
> > Armando.
> > what do you mean with "configuration error"?where could it be?
> > is ssh right to launch a task from terminal?
>
> > thanks again
>
> > Armando
Thanks john, i would like to send simple programs (jobs) to the nodes
(diskless cpu server) of a 9grid from terminal, and get responses from
them. How can i do it?
> You want to be sitting at a terminal, and start a command on a cpu
> server, right?
>
>     cpu -h <cpuserver> -c <command> <args>
>
> That will execute the command on the cpu server and leave you at the
> terminal prompt when you are done.
>
> I have no idea why you are trying to ssh from your cpu server to your
> terminal or to the fileserver. Forget ssh. If you have a Plan 9
> network, ssh is 100% wrong for you.
>
> John
-
Re: [9fans] 9grid
What is a '9grid'?
uriel
On Tue, Nov 11, 2008 at 4:12 PM, <lupin636@gmail.com> wrote:
> On 11 Nov, 15:46, j...@csplan9.rit.edu wrote:
>> > Ok, i just ran some local commands from cpu server, and it is ok, i'm
>> > gonna use the cpu servers only like a compute nodes. From cpu server i
>> > wanted to see responses , so i did and got this:
>> >     cpus# ssh terminal name
>> >     ssh: dialing terminal name: connection refused
>> >     cpus# ssh file server name
>> >     ssh: reading server version: unexpected EOF
>> > the second error, i got also when i tried either from terminal or file
>> > server.
>> > yes, bootes is the cpu server's hostowner,but on terminal i log as
>> > Armando.
>> > what do you mean with "configuration error"?where could it be?
>> > is ssh right to launch a task from terminal?
>>
>> > thanks again
>>
>> > Armando
> Thanks john, i would like to send simple programs (jobs) to the nodes
> (diskless cpu server) of a 9grid from terminal, and get responses from
> them. How can i do it?
>
>> You want to be sitting at a terminal, and start a command on a cpu
>> server, right?
>>
>>     cpu -h <cpuserver> -c <command> <args>
>>
>> That will execute the command on the cpu server and leave you at the
>> terminal prompt when you are done.
>>
>> I have no idea why you are trying to ssh from your cpu server to your
>> terminal or to the fileserver. Forget ssh. If you have a Plan 9
>> network, ssh is 100% wrong for you.
>>
>> John
>
>
-
Re: 9grid
9grid is a distributed computing project, which features prominently
the Plan 9 from Bell Labs operating system
Armando
On 11 Nov, 16:43, urie...@gmail.com (Uriel) wrote:
> What is a '9grid'?
>
> uriel
-
Re: [9fans] 9grid
How cool! Tell me more....
Your ideas intrigue me and I wish to subscribe to your newsletter.
uriel
On Tue, Nov 11, 2008 at 5:32 PM, <lupin636@gmail.com> wrote:
> 9grid is a distributed computing project, which features prominently
> the Plan 9 from Bell Labs operating system
>
> Armando
>
> On 11 Nov, 16:43, urie...@gmail.com (Uriel) wrote:
>> What is a '9grid'?
>>
>> uriel
>
>
-
Re: [9fans] 9grid
On Tue, Nov 11, 2008 at 7:12 AM, <lupin636@gmail.com> wrote:
> Thanks john, i would like to send simple programs (jobs) to the nodes
> (diskless cpu server) of a 9grid from terminal, and get responses from
> them. How can i do it?
>
suppose you have a list of nodes
    cpu% NODES=(a b c d)
    cpu% echo $NODES
    a b c d
    cpu% for (i in $NODES) {
        cpu -h $i -c some-command&
    }
Go ahead. Try it!
    for (i in $NODES) {
        cpu -h $i -c date&
    }
OK, now suppose you have what in the high end business is still called
an 'input deck'. It's in a weird place. You get to it by saying
    some-command -i input-file
    for (i in $NODES) {
        cpu -h $i -c some-command -i your-file&
    }
This will work whether there is a mount on those nodes for your home
directory or not. Comes free with cpu.
What if, for whatever reason, you want a ps to show all the processes on
all the nodes you're running on?
    for (i in $NODES) {
        import -a $i .com /proc /proc
    }
Your /proc is now the unified /proc of all your nodes. (I used to do
this all the time with my plan 9 minicluster)
That way, if you want to kill all the some-commands running on ALL your nodes:
    slay some-command | rc
The point being that you only need to run this command on the
front-end, not on each node.
You just can't even try to do this sort of thing with ssh.
ron
-
Re: [9fans] 9grid
> What if you for whatever reason want a ps to show all the proces on
> all the nodes you're running on.
>
> for (i in $NODES) {
>     import -a $i .com /proc /proc
> }
what's the .com for?
> Your /proc is now the unified /proc of all your nodes. (I used to do
> this all the time with my plan 9 minicluster)
does ps not mind if several processes have the same pid?
- erik
-
Re: [9fans] 9grid
On Tue, Nov 11, 2008 at 4:11 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>> What if you for whatever reason want a ps to show all the proces on
>> all the nodes you're running on.
>>
>> for (i in $NODES) {
>>     import -a $i .com /proc /proc
>> }
>
> what's the .com for?
>
it's when I forgot to take part of the test :-)
>> Your /proc is now the unified /proc of all your nodes. (I used to do
>> this all the time with my plan 9 minicluster)
>
> does ps not mind if several processes have the same pid?
>
It never seemed to.
But of course if you have procs with same pid, the collisions are obvious.
So, do the easy thing:
for all nodes, mount them at
    /proc/localhost
    /proc/hostname/whatever
Then modify ps (takes about 5 minutes) so it iterates over /proc/*
where * is a set of host names.
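
A rough sketch of that per-host layout, for illustration only; it is not the modified ps mentioned above. It is written in Python rather than C or rc, and it assumes each node's process tree is visible at root/<host>/<pid>/status and that the first whitespace-separated field of status is the process name. The function names list_procs and select are hypothetical.

```python
import fnmatch
import os


def list_procs(root):
    """Return sorted (host, pid, name) tuples found under root/<host>/<pid>/status."""
    procs = []
    for host in os.listdir(root):
        hostdir = os.path.join(root, host)
        if not os.path.isdir(hostdir):
            continue
        for pid in os.listdir(hostdir):
            if not pid.isdigit():
                continue  # skip entries that are not process directories
            try:
                with open(os.path.join(hostdir, pid, "status")) as f:
                    fields = f.read().split()
            except OSError:
                continue  # the process may have exited in the meantime
            if fields:
                # Assumption: the first field of status is the process name.
                procs.append((host, int(pid), fields[0]))
    return sorted(procs)


def select(root, pattern):
    """Match a 'hostglob/name' pattern such as 'node*/mpirun'."""
    hostglob, name = pattern.split("/", 1)
    return [(host, pid) for host, pid, pname in list_procs(root)
            if fnmatch.fnmatchcase(host, hostglob) and pname == name]
```

Because the host name is part of the path, two nodes can both have a pid 12 without colliding, and a pattern like node*/mpirun can pick out the same program on every node.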
now you can do fun stuff
    slay node8/mpirun | rc
    slay node*/mpirun | rc
There's a lot of good stuff in there if you want to use it ... I
actually implemented all this a few years back when Vic did his first
xcpu code. It was really nice.
ron
-
Re: [9fans] 9grid
> It never seemed to.
>
> But of course if you have procs with same pid, the collisions are obvious.
>
> So, do the easy thing:
>
> for all nodes, mount them at
>     /proc/localhost
>     /proc/hostname/whatever
>
> Then modify ps (takes about 5 minutes) so it iterates over /proc/*
> where * is a set of host names.
>
> now you can do fun stuff
>     slay node8/mpirun | rc
>     slay node*/mpirun | rc
>
> There's a lot of good stuff in there if you want to use it ... I
> actually implemented all this a few years back when Vic did hist first
> xcpu code. It was really nice.
the trivial solution on your hardware would be to partition
the pid space, wouldn't it. just have 64bit pids? let each
machine start at a 1<<32 boundary?
four billion machines with four billion processes each
ought to be enough for anyone.
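
To make the arithmetic concrete, here is a small sketch of the partitioning idea, assuming a 64-bit pid whose upper 32 bits hold a machine number and whose lower 32 bits hold the machine-local pid. It is written in Python for illustration; global_pid and split_pid are hypothetical names, not kernel interfaces.

```python
PID_BITS = 32                      # bits reserved for the machine-local pid
LOCAL_MASK = (1 << PID_BITS) - 1   # 0xffffffff


def global_pid(machine, local):
    """Combine a machine number and a local pid into one 64-bit pid."""
    if not 0 <= machine <= LOCAL_MASK:
        raise ValueError("machine number does not fit in 32 bits")
    if not 0 <= local <= LOCAL_MASK:
        raise ValueError("local pid does not fit in 32 bits")
    # Machine n owns the range starting at n * (1 << 32).
    return (machine << PID_BITS) | local


def split_pid(gpid):
    """Recover (machine, local pid) from a combined pid."""
    return gpid >> PID_BITS, gpid & LOCAL_MASK
```

Under this scheme pid 12 on machine 2 and pid 12 on machine 3 map to different values, so a unioned /proc would see no collisions.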
- erik
-
Re: [9fans] 9grid
On Tue, Nov 11, 2008 at 4:36 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> the trivial solution on your hardware would be to partition
> the pid space, wouldn't it. just have 64bit pids? let each
> machine start at a 1<<32 boundary?
Sure. But you have to change the pid type in the kernel and and and and and
The point here is that with fairly trivial mods to a few programs you
can build a cluster management suite that unix or windows based
cluster tools can not really touch.
But you don't have gcc. That's an issue. Not kidding here. Don't have
a good fortran compiler either. This is where binary support is very
useful.
ron