-
State of execution
Hi,
Does anyone know of a project that saves the current execution state to disk?
If there isn't one, could you give me a clue about where to start
developing this feature?
I suppose that the only things that need to be saved are the state of the
CPUs and a memory dump.
I think this would be interesting, for example, on HPC machines to protect
long-running calculations. That way they could save months of work in case of
a power failure.
-
Re: State of execution
> Does anyone know of a project that saves the current execution state to disk?
> If there isn't one, could you give me a clue about where to start
> developing this feature?
For a whole system and all its processes, this feature is "suspend to disk",
which is commonly used on portable (laptop) systems. On some hardware
it is problematic due to poor documentation by the manufacturer.
> I suppose that the only things that need to be saved are the state of the
> CPUs and a memory dump.
For an individual process, there are things such as 'unexec', used by Emacs.
In general, it requires cooperation from the process to manage open
network connections and other ephemeral state.
> I think this would be interesting, for example, on HPC machines to protect
> long-running calculations. That way they could save months of work in case of
> a power failure.
Checkpoint and restart facilities are built into any long-running
HPC project. On *NIX systems it may be advantageous to use fork()
to make a [virtual] (copy-on-write) copy of the address space, so that
the copy can be written out while the original keeps computing. Use
Advanced Search at groups.google.com for 'checkpoint' in group:*comp* .
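To make the fork() suggestion concrete, here is a minimal Python sketch (not from the original post). The function name `checkpoint_via_fork`, the JSON format and the file names are assumptions chosen for illustration. The child sees the address space as it was at the moment of the fork, so the parent can keep changing its data while the snapshot is written.

```python
import json
import os
import tempfile


def checkpoint_via_fork(state, path):
    """Write `state` to `path` from a forked child; return the child's pid.

    Hypothetical helper: the child works on a copy-on-write snapshot of the
    parent's memory taken at fork time, so the parent may continue at once.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # atomic rename: never a half-written file
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup
    return pid


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
    state = {"step": 0, "total": 0}
    pid = checkpoint_via_fork(state, path)
    state["step"] = 999  # the parent keeps working; the snapshot is unaffected
    os.waitpid(pid, 0)
    with open(path) as f:
        print(json.load(f))
```

This only covers data the program chooses to serialize. It does not capture open files, sockets or other kernel-side state, and it assumes a Unix-like system where `os.fork()` is available.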
--
-
Re: State of execution
Some programs, such as Gaussian, have it, but there are still some that
don't.
I've been reading about unexec; it seems to be a system call, so my
process would have to call it voluntarily. It would be nice to have a utility
that forces a copy of a process to disk so it could be restored later (after
a reboot).
I think Cray had this feature: the entire system could sleep to
disk and be restored later at boot time.
John Reiser wrote:
> > Does anyone know of a project that saves the current execution state to disk?
> > If there isn't one, could you give me a clue about where to start
> > developing this feature?
>
> For a whole system and all its processes, this feature is "suspend to disk",
> which is commonly used on portable (laptop) systems. On some hardware
> it is problematic due to poor documentation by the manufacturer.
>
> > I suppose that the only things that need to be saved are the state of the
> > CPUs and a memory dump.
>
> For an individual process, there are things such as 'unexec' used by emacs.
> In general, it requires cooperation from the process to manage open
> network connections and other ephemeral state.
>
> > I think this would be interesting, for example, on HPC machines to protect
> > long-running calculations. That way they could save months of work in case of
> > a power failure.
>
> Checkpoint and restart facilities are built into any long-running
> HPC project. On *NIX systems it may be advantageous to use fork()
> to make a [virtual] copy of the address space. Use Advanced Search
> at groups.google.com for 'checkpoint' in group:*comp* .
-
Re: State of execution
jordi.prats@gmail.com wrote:
> Hi,
> Does anyone know of a project that saves the current execution state to disk?
> If there isn't one, could you give me a clue about where to start
> developing this feature?
>
> I suppose that the only things that need to be saved are the state of the
> CPUs and a memory dump.
No, that is a very simplistic view of a process's state. There is a lot of
additional information kept by the kernel: open files, memory layout,
network connections, user and group IDs, etc.
Also, file contents and the states of other processes may
influence the further execution of the process.
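As a rough illustration of how much state lives outside "registers plus memory", the following Python sketch (an addition, not part of the original reply) lists a few of the items named above for the current process. The function name `kernel_side_state` is made up for this example, and the `/proc/self/fd` and `/proc/self/maps` paths are Linux-specific; on systems without them the corresponding fields simply stay empty.

```python
import os


def kernel_side_state():
    """Collect a few pieces of per-process state that the kernel keeps."""
    info = {
        "pid": os.getpid(),
        "uid": os.getuid(),
        "gid": os.getgid(),
        "cwd": os.getcwd(),
        "open_fds": {},   # descriptor number -> what it refers to
        "mappings": [],   # [address range, permissions] per mapped region
    }
    fd_dir = "/proc/self/fd"  # Linux-specific view of open descriptors
    if os.path.isdir(fd_dir):
        for name in os.listdir(fd_dir):
            try:
                target = os.readlink(os.path.join(fd_dir, name))
            except OSError:
                continue  # e.g. the descriptor used for the listing itself
            info["open_fds"][int(name)] = target
    try:
        with open("/proc/self/maps") as f:  # Linux-specific memory layout
            info["mappings"] = [line.split()[:2] for line in f]
    except OSError:
        pass
    return info


if __name__ == "__main__":
    state = kernel_side_state()
    print(state["uid"], state["gid"], state["cwd"])
    print(len(state["open_fds"]), "open descriptors,",
          len(state["mappings"]), "memory mappings")
```

Even this list is incomplete: it says nothing about socket peers, file offsets, pending signals, or the other processes the program may be talking to, all of which a restore would have to recreate.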
--
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize
-- T. Pratchett
-
Re: State of execution
jordi.prats@gmail.com writes:

> Hi,
> Does anyone know of a project that saves the current execution state to disk?
> If there isn't one, could you give me a clue about where to start
> developing this feature?
>
> I suppose that the only things that need to be saved are the state of the
> CPUs and a memory dump.
The basic problem is that "state" is something you define. Yes, what
you suggest is a possibility.

> I think this would be interesting, for example, on HPC machines to protect
> long-running calculations. That way they could save months of work in case of
> a power failure.
So... you decide what variables contain stuff that you want to hang
on to, and what can be regenerated in a hurry. Periodically save the
ones that can't be regenerated in a hurry.
The database guys, and the HPC guys, have devoted a lot of effort to
this. I expect (without checking) that googling for "checkpoint"
would turn up a lot.
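A minimal Python sketch of that approach (added for illustration, not taken from the post): the program decides that only the loop counter and the running total are worth keeping, saves them every so often, and picks up from the last saved values when restarted. The names `save_checkpoint`, `load_checkpoint` and `run`, the JSON format and the toy sum-of-squares computation are all assumptions.

```python
import json
import os


def save_checkpoint(path, state):
    """Write state to a temporary file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic: a crash leaves the old or the new file


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, n_steps, every=100):
    """Sum the squares of 0..n_steps-1, resuming from `path` if present."""
    state = load_checkpoint(path, {"next_step": 0, "total": 0})
    for step in range(state["next_step"], n_steps):
        state["total"] += step * step
        state["next_step"] = step + 1
        if state["next_step"] % every == 0:
            save_checkpoint(path, state)  # periodic save
    save_checkpoint(path, state)
    return state["total"]
```

If the machine loses power, at most `every` steps of work are lost; calling `run()` again with the same path continues from the last saved step. Anything that can be recomputed quickly is deliberately left out of the saved state.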
--
Joseph J. Pfeiffer, Jr., Ph.D. Phone -- (505) 646-1605
Department of Computer Science FAX -- (505) 646-1002
New Mexico State University http://www.cs.nmsu.edu/~pfeiffer