State of execution - Linux


  1. State of execution

    Hi,
    Does anyone know of a project to save the current execution state to disk?
    If there isn't any, could you give me a clue about where to start to
    develop this feature?

    I suppose that the only thing that should be saved is the state of the
    CPUs and a memory dump.

    I think this would be interesting, for example, in HPC machines to save
    long-term calculations. This way they could save months in case of
    power failure.


  2. Re: State of execution

    > Does anyone know of a project to save the current execution state to disk?
    > If there isn't any, could you give me a clue about where to start to
    > develop this feature?


    For a whole system and all its processes, this feature is "suspend to disk",
    which is commonly used on portable (laptop) systems. On some hardware
    it is problematic due to poor documentation by the manufacturer.

    > I suppose that the only thing that should be saved is the state of the
    > CPUs and a memory dump.


    For an individual process, there are things such as 'unexec' used by emacs.
    In general, it requires cooperation from the process to manage open
    network connections and other ephemeral state.

    > I think this would be interesting, for example, in HPC machines to save
    > long-term calculations. This way they could save months in case of
    > power failure.


    Checkpoint and restart facilities are built into any long-running
    HPC project. On *NIX systems it may be advantageous to use fork()
    to make a [virtual] copy of the address space. Use Advanced Search
    at groups.google.com for 'checkpoint' in group:*comp* .
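
To illustrate the fork() idea, here is a minimal sketch in Python. It assumes a POSIX system and that the interesting state is picklable. The names `snapshot_with_fork` and `load_snapshot` are made up for this example and do not come from any existing project. The child gets a copy-on-write view of the parent's memory as it was at the moment of the fork, so the parent can keep computing while the child writes the snapshot to disk.

```python
import os
import pickle


def snapshot_with_fork(state, path):
    """Write `state` to `path` from a forked child and return the child's pid.

    The child works on a copy-on-write copy of the parent's address space,
    so later changes made by the parent do not appear in the snapshot.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: serialize the frozen copy, then leave immediately
        # without running the parent's exit handlers.
        status = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # atomic rename on POSIX file systems
            status = 0
        finally:
            os._exit(status)
    return pid  # parent process: continue the calculation


def load_snapshot(path):
    """Read back a snapshot written by snapshot_with_fork."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The parent should eventually call `os.waitpid(pid, 0)` to reap the child and check its exit status. Note that this saves only data the program chooses to save; it does not capture open files or network connections.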

    --

  3. Re: State of execution

    Some programs, such as Gaussian, have this, but there are still some
    that don't.

    I've been reading about unexec; it seems to be a routine that a program
    calls on itself, so my process would have to call it voluntarily. It'd
    be nice to have some utility to force a copy of a process to disk so it
    could be restored later (after a reboot).

    I think that Cray had this feature: the entire system could sleep to
    disk, so it could be restored later at boot time.


    John Reiser wrote:
    > > Does anyone know of a project to save the current execution state to disk?
    > > If there isn't any, could you give me a clue about where to start to
    > > develop this feature?

    >
    > For a whole system and all its processes, this feature is "suspend to disk",
    > which is commonly used on portable (laptop) systems. On some hardware
    > it is problematic due to poor documentation by the manufacturer.
    >
    > > I suppose that the only thing that should be saved is the state of the
    > > CPUs and a memory dump.

    >
    > For an individual process, there are things such as 'unexec' used by emacs.
    > In general, it requires cooperation from the process to manage open
    > network connections and other ephemeral state.
    >
    > > I think this would be interesting, for example, in HPC machines to save
    > > long-term calculations. This way they could save months in case of
    > > power failure.

    >
    > Checkpoint and restart facilities are built into any long-running
    > HPC project. On *NIX systems it may be advantageous to use fork()
    > to make a [virtual] copy of the address space. Use Advanced Search
    > at groups.google.com for 'checkpoint' in group:*comp* .
    >
    > --



  4. Re: State of execution

    jordi.prats@gmail.com wrote:
    > Hi,
    > Does anyone know of a project to save the current execution state to disk?
    > If there isn't any, could you give me a clue about where to start to
    > develop this feature?
    >
    > I suppose that the only thing that should be saved is the state of the
    > CPUs and a memory dump.


    No, that is a very simplistic view of a process's state. There is a lot
    of additional information kept by the kernel: open files, memory layout,
    network connections, user and group IDs, etc.
    Also, there may be file contents and states of other processes that have
    influence on the further execution of the process.
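
To make that list concrete, here is a rough sketch in Python that collects some of this kernel-side state for the current process. It assumes a Linux system with /proc mounted (it returns empty collections otherwise), and the function name `kernel_side_state` is invented for this example. It is far from a complete inventory: signal handlers, socket state, timers and much more are not covered.

```python
import os


def kernel_side_state():
    """Return a small sample of the per-process state the kernel keeps."""
    state = {
        "uid": os.getuid(),
        "gid": os.getgid(),
        "cwd": os.getcwd(),
        "open_files": {},   # file descriptor number -> what it refers to
        "memory_map": [],   # address ranges of the mapped regions
    }

    fd_dir = "/proc/self/fd"
    if os.path.isdir(fd_dir):
        for name in os.listdir(fd_dir):
            try:
                target = os.readlink(os.path.join(fd_dir, name))
            except OSError:
                continue  # descriptor was closed while we were looking
            state["open_files"][int(name)] = target

    maps_file = "/proc/self/maps"
    if os.path.isfile(maps_file):
        with open(maps_file) as f:
            # The first field of each line is the "start-end" address range.
            state["memory_map"] = [line.split()[0] for line in f if line.strip()]

    return state
```

A memory dump plus CPU registers would capture none of the `open_files` entries, which is why restoring a process needs either kernel support or cooperation from the process itself.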

    --
    Josef Möllers (Pinguinpfleger bei FSC)
    If failure had no penalty success would not be a prize
    -- T. Pratchett


  5. Re: State of execution

    jordi.prats@gmail.com writes:

    > Hi,
    > Does anyone know of a project to save the current execution state to disk?
    > If there isn't any, could you give me a clue about where to start to
    > develop this feature?
    >
    > I suppose that the only thing that should be saved is the state of the
    > CPUs and a memory dump.


    The basic problem is that "state" is something you define. Yes, what
    you suggest is a possibility.

    > I think this would be interesting, for example, in HPC machines to save
    > long-term calculations. This way they could save months in case of
    > power failure.


    So... you decide what variables contain stuff that you want to hang
    on to, and what can be regenerated in a hurry. Periodically save the
    ones that can't be regenerated in a hurry.
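
As a sketch of that approach in Python: the toy calculation below (a sum of squares) keeps only two variables that cannot be regenerated quickly, the step counter and the accumulator, and saves them every so many steps. All names here (`save_checkpoint`, `load_checkpoint`, `run`, and the `stop_after` parameter that simulates a power failure) are invented for the example. The checkpoint is written to a temporary file and then renamed, so a crash in the middle of a save leaves the previous checkpoint intact.

```python
import json
import os


def save_checkpoint(path, state):
    """Write the checkpoint atomically: temp file first, then rename."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, every=100, stop_after=None):
    """Sum i*i for i in range(total_steps), resuming from `path` if present.

    `stop_after` simulates a failure after that many steps in this run.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated power failure: unsaved work is lost
        state["acc"] += state["step"] ** 2
        state["step"] += 1
        done_this_run += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["acc"]
```

After a failure, calling `run` again with the same path continues from the last saved step instead of from zero. The checkpoint interval is a trade-off between the cost of saving and the amount of work lost on a crash.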

    The database guys, and the HPC guys, have devoted a lot of effort to
    this. I expect (without checking) that googling for "checkpoint"
    would turn up a lot.
    --
    Joseph J. Pfeiffer, Jr., Ph.D. Phone -- (505) 646-1605
    Department of Computer Science FAX -- (505) 646-1002
    New Mexico State University http://www.cs.nmsu.edu/~pfeiffer
