Re: [9fans] 9grid - Plan9


  1. Re: [9fans] 9grid

    > Hi Eric,
    > I check lib/ndb/auth in the file server, this is what i have:
    > hostid°otes
    > uid
    > dm uid


    i assume poor spelling and cut-n-paste failure. :-)
    the key thing here is if the only hostid in /lib/ndb/auth
    is bootes, then the cpu server's hostowner must be
    bootes. is it? can you run commands from the cpu
    server's console?

    you're doing the right thing, you've just got a
    configuration error.

    - erik


  2. Re: 9grid

    Ok, i just ran some local commands from the cpu server, and it is ok. i'm
    gonna use the cpu servers only as compute nodes. From the cpu server i
    wanted to see responses, so i did this and got:
        cpus# ssh terminal name
        ssh: dialing terminal name: connection refused
        cpus# ssh file server name
        ssh: reading server version: unexpected EOF
    i also got the second error when i tried from either the terminal or the
    file server.
    yes, bootes is the cpu server's hostowner, but on the terminal i log in as
    Armando.
    what do you mean by "configuration error"? where could it be?
    is ssh the right way to launch a task from the terminal?

    thanks again

    Armando

    > i assume poor spelling and cut-n-paste failure. :-)
    > the key thing here is if the only hostid in /lib/ndb/auth
    > is bootes, then the cpu server's hostowner must be
    > bootes. is it? can you run commands from the cpu
    > server's console?
    >
    > you're doing the right thing, you've just got a
    > configuration error.
    >
    > - erik


  3. Re: [9fans] 9grid

    > Ok, i just ran some local commands from cpu server, and it is ok, i'm
    > gonna use the cpu servers only like a compute nodes. From cpu server i
    > wanted to see responses , so i did and got this:
    > cpus# ssh terminal name
    > ssh: dialing terminal name: connection refused
    > cpus# ssh file server name
    > ssh: reading server version: unexpected EOF
    > the second error, i got also when i tried either from terminal or file
    > server.
    > yes, bootes is the cpu server's hostowner,but on terminal i log as
    > Armando.
    > what do you mean with "configuration error"?where could it be?
    > is ssh right to launch a task from terminal?
    >
    > thanks again
    >
    > Armando
    >


    You want to be sitting at a terminal, and start a command on a cpu
    server, right?

        cpu -h cpu-server -c command

    That will execute the command on the cpu server and leave you at the
    terminal prompt when you are done.

    I have no idea why you are trying to ssh from your cpu server to your
    terminal or to the fileserver. Forget ssh. If you have a Plan 9
    network, ssh is 100% wrong for you.


    John



  4. Re: 9grid

    On 11 Nov, 15:46, j...@csplan9.rit.edu wrote:
    > > Ok, i just ran some local commands from cpu server, and it is ok, i'm
    > > gonna use the cpu servers only like a compute nodes. From cpu server i
    > > wanted to see responses , so i did and got this:
    > > cpus# ssh terminal name
    > > ssh: dialing terminal name: connection refused
    > > cpus# ssh file server name
    > > ssh: reading server version: unexpected EOF
    > > the second error, i got also when i tried either from terminal or file
    > > server.
    > > yes, bootes is the cpu server's hostowner,but on terminal i log as
    > > Armando.
    > > what do you mean with "configuration error"?where could it be?
    > > is ssh right to launch a task from terminal?
    > >
    > > thanks again
    > >
    > > Armando

    Thanks john, i would like to send simple programs (jobs) from a terminal
    to the nodes (diskless cpu servers) of a 9grid, and get responses from
    them. How can i do it?

    > You want to be sitting at a terminal, and start a command on a cpu
    > server, right?
    >
    > cpu -h cpu-server -c command
    >
    > That will execute the command on the cpu server and leave you at the
    > terminal prompt when you are done.
    >
    > I have no idea why you are trying to ssh from your cpu server to your
    > terminal or to the fileserver. Forget ssh. If you have a Plan 9
    > network, ssh is 100% wrong for you.
    >
    > John


  5. Re: [9fans] 9grid

    What is a '9grid'?

    uriel

    On Tue, Nov 11, 2008 at 4:12 PM, wrote:
    > On 11 Nov, 15:46, j...@csplan9.rit.edu wrote:
    >> > Ok, i just ran some local commands from cpu server, and it is ok, i'm
    >> > gonna use the cpu servers only like a compute nodes. From cpu server i
    >> > wanted to see responses , so i did and got this:
    >> > cpus# ssh terminal name
    >> > ssh: dialing terminal name: connection refused
    >> > cpus# ssh file server name
    >> > ssh: reading server version: unexpected EOF
    >> > the second error, i got also when i tried either from terminal or file
    >> > server.
    >> > yes, bootes is the cpu server's hostowner,but on terminal i log as
    >> > Armando.
    >> > what do you mean with "configuration error"?where could it be?
    >> > is ssh right to launch a task from terminal?
    >> >
    >> > thanks again
    >> >
    >> > Armando
    >
    > Thanks john, i would like to send simple programs (jobs) to the nodes
    > (diskless cpu server) of a 9grid from terminal, and get responses from
    > them. How can i do it?
    >
    >> You want to be sitting at a terminal, and start a command on a cpu
    >> server, right?
    >>
    >> cpu -h cpu-server -c command
    >>
    >> That will execute the command on the cpu server and leave you at the
    >> terminal prompt when you are done.
    >>
    >> I have no idea why you are trying to ssh from your cpu server to your
    >> terminal or to the fileserver. Forget ssh. If you have a Plan 9
    >> network, ssh is 100% wrong for you.
    >>
    >> John

    >
    >



  6. Re: 9grid

    9grid is a distributed computing project that prominently features
    the Plan 9 from Bell Labs operating system.

    Armando

    On 11 Nov, 16:43, urie...@gmail.com (Uriel) wrote:
    > What is a '9grid'?
    >
    > uriel


  7. Re: [9fans] 9grid

    How cool! Tell me more....

    Your ideas intrigue me and I wish to subscribe to your newsletter.

    uriel

    On Tue, Nov 11, 2008 at 5:32 PM, wrote:
    > 9grid is a distributed computing project, which features prominently
    > the Plan 9 from Bell Labs operating system
    >
    > Armando
    >
    > On 11 Nov, 16:43, urie...@gmail.com (Uriel) wrote:
    >> What is a '9grid'?
    >>
    >> uriel

    >
    >



  8. Re: [9fans] 9grid

    On Tue, Nov 11, 2008 at 7:12 AM, wrote:

    > Thanks john, i would like to send simple programs (jobs) to the nodes
    > (diskless cpu server) of a 9grid from terminal, and get responses from
    > them. How can i do it?
    >


    suppose you have a list of nodes

    cpu% NODES=(a b c d)
    cpu% echo $NODES
    a b c d
    cpu% for (i in $NODES) {
        cpu -h $i -c some-command&
    }

    Go ahead. Try it!
    for (i in $NODES) {
        cpu -h $i -c date&
    }

    OK, now suppose you have what in the high end business is still called
    an 'input deck'. It's in a weird place. You get to it by saying
        some-command -i input-file

    for (i in $NODES) {
        cpu -h $i -c some-command -i your-file&
    }

    This will work whether there is a mount on those nodes for your home
    directory or not. Comes free with cpu.

    What if you, for whatever reason, want a ps to show all the processes on
    all the nodes you're running on?

    for (i in $NODES) {
        import -a $i .com /proc /proc
    }

    Your /proc is now the unified /proc of all your nodes. (I used to do
    this all the time with my plan 9 minicluster)

    That way, if you want to kill all the some-commands running on ALL your nodes:
        slay some-command | rc

    The point being that you only need to run this command on the
    front-end, not on each node.

    You just can't even try to do this sort of thing with ssh.

    ron


  9. Re: [9fans] 9grid

    > What if you for whatever reason want a ps to show all the processes on
    > all the nodes you're running on.
    >
    > for (i in $NODES) {
    > import -a $i .com /proc /proc
    > }


    what's the .com for?

    > Your /proc is now the unified /proc of all your nodes. (I used to do
    > this all the time with my plan 9 minicluster)


    does ps not mind if several processes have the same pid?

    - erik



  10. Re: [9fans] 9grid

    On Tue, Nov 11, 2008 at 4:11 PM, erik quanstrom wrote:
    >> What if you for whatever reason want a ps to show all the processes on
    >> all the nodes you're running on.
    >>
    >> for (i in $NODES) {
    >> import -a $i .com /proc /proc
    >> }

    >
    > what's the .com for?
    >


    it's from when I forgot to take out part of the test :-)

    >> Your /proc is now the unified /proc of all your nodes. (I used to do
    >> this all the time with my plan 9 minicluster)

    >
    > does ps not mind if several processes have the same pid?
    >


    It never seemed to.

    But of course if you have procs with same pid, the collisions are obvious.

    So, do the easy thing:

    for all nodes, mount them at
        /proc/localhost
        /proc/hostname/whatever

    Then modify ps (takes about 5 minutes) so it iterates over /proc/*
    where * is a set of host names.

    now you can do fun stuff
        slay node8/mpirun | rc
        slay node*/mpirun | rc

    There's a lot of good stuff in there if you want to use it ... I
    actually implemented all this a few years back when Vic did his first
    xcpu code. It was really nice.

    ron
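
Editor's sketch of the per-host layout ron describes above. This is not ron's modified ps and it does not run on Plan 9; it is a small Python illustration under stated assumptions: each node's /proc is mounted under a common root as `<root>/<hostname>/<pid>/status`, and the first whitespace-separated field of `status` is the process name. The names `list_procs` and `make_fake_tree` are hypothetical helpers invented for the example.

```python
import os
import tempfile


def list_procs(root):
    """Return (host, pid, name) for every root/<host>/<pid>/status found.

    Assumption: the first whitespace-separated field of each status
    file is the process name.
    """
    rows = []
    for host in sorted(os.listdir(root)):
        hostdir = os.path.join(root, host)
        if not os.path.isdir(hostdir):
            continue
        # Keep only numeric directory names (pids) and sort them numerically.
        pids = sorted((p for p in os.listdir(hostdir) if p.isdigit()), key=int)
        for pid in pids:
            try:
                with open(os.path.join(hostdir, pid, "status")) as f:
                    fields = f.read().split()
            except OSError:
                continue  # process went away between listing and reading
            if fields:
                rows.append((host, int(pid), fields[0]))
    return rows


def make_fake_tree(root, layout):
    """Build a demo tree; layout maps host -> {pid: process name}."""
    for host, procs in layout.items():
        for pid, name in procs.items():
            d = os.path.join(root, host, str(pid))
            os.makedirs(d)
            with open(os.path.join(d, "status"), "w") as f:
                f.write(f"{name} glenda Sleep\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_fake_tree(root, {"node1": {1: "init", 42: "mpirun"},
                              "node2": {1: "init"}})
        for host, pid, name in list_procs(root):
            print(f"{host}/{pid}\t{name}")
```

Because the host name is part of the path, two nodes can both have a pid 1 without colliding, which is what makes patterns like node*/mpirun possible.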


  11. Re: [9fans] 9grid

    > It never seemed to.
    >
    > But of course if you have procs with same pid, the collisions are obvious.
    >
    > So, do the easy thing:
    >
    > for all nodes, mount them at
    > /proc/localhost
    > /proc/hostname/whatever
    >
    > Then modify ps (takes about 5 minutes) so it iterates over /proc/*
    > where * is a set of host names.
    >
    > now you can do fun stuff
    > slay node8/mpirun | rc
    > slay node*/mpirun | rc
    >
    > There's a lot of good stuff in there if you want to use it ... I
    > actually implemented all this a few years back when Vic did his first
    > xcpu code. It was really nice.


    the trivial solution on your hardware would be to partition
    the pid space, wouldn't it? just have 64-bit pids and let each
    machine start at a 1<<32 boundary?

    four billion machines with four billion processes each
    ought to be enough for anyone.

    - erik
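
Editor's sketch of the pid partitioning erik suggests above, written in Python only to show the arithmetic. It assumes a 64-bit cluster-wide pid whose upper 32 bits hold a machine number and whose lower 32 bits hold the machine-local pid; `global_pid` and `split_pid` are hypothetical names, not anything in the Plan 9 kernel.

```python
HOST_SHIFT = 32                       # each machine starts at a 1<<32 boundary
LOCAL_MASK = (1 << HOST_SHIFT) - 1    # low 32 bits: machine-local pid


def global_pid(machine, local_pid):
    """Combine a machine number and a local pid into one 64-bit pid."""
    if not 0 <= machine <= LOCAL_MASK:
        raise ValueError("machine number must fit in 32 bits")
    if not 0 <= local_pid <= LOCAL_MASK:
        raise ValueError("local pid must fit in 32 bits")
    return (machine << HOST_SHIFT) | local_pid


def split_pid(gpid):
    """Recover (machine, local_pid) from a 64-bit cluster-wide pid."""
    return gpid >> HOST_SHIFT, gpid & LOCAL_MASK


if __name__ == "__main__":
    # pid 7 on machine 0 and pid 7 on machine 1 no longer collide
    print(global_pid(0, 7), global_pid(1, 7))
    print(split_pid(global_pid(3, 1)))
```

With this encoding a unified /proc could hold every node's processes in one flat directory, at the cost ron points out in his reply: the kernel's pid type has to change.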



  12. Re: [9fans] 9grid

    On Tue, Nov 11, 2008 at 4:36 PM, erik quanstrom wrote:

    > the trivial solution on your hardware would be to partition
    > the pid space, wouldn't it. just have 64bit pids? let each
    > machine start at a 1<<32 boundary?


    Sure. But you have to change the pid type in the kernel and and and and and

    The point here is that with fairly trivial mods to a few programs you
    can build a cluster management suite that unix or windows based
    cluster tools can not really touch.

    But you don't have gcc. That's an issue. Not kidding here. Don't have
    a good fortran compiler either. This is where binary support is very
    useful.

    ron

