[RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work


  1. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    On Fri, 2008-10-10 at 09:04 -0500, Serge E. Hallyn wrote:
    > Remember a part of Ingo's motivation is to push c/r developers to
    > address the lacking features that users use most, earlier. So the
    > warnings and subsequent email complaints are what we're after. Hence a
    > single 'checkpointable or not' flag.
    >
    > Given the single flag, how do you know at sys_mq_unlink() whether the
    > process also has an open socket?
    >
    > Rather than make this tracking facility more complicated and intrusive,
    > if people complain that they couldn't checkpoint bc of a warning about
    > aio, then we implement aio c/r! We don't just try and reduce the amount
    > of time that you can't checkpoint bc of lack of aio c/r support
    >
    > -serge


    Serge,

    It's exactly what I meant before, the tracking facility would be awfully
    complicated. It cannot be done that way.
    But there's also something awkward with the flag thing: can you provide
    right now an exhaustive list of all the places where you must raise it?

    I'd rather do some heavy checking at checkpoint time.

    Greg.


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    Quoting Greg Kurz (gkurz@fr.ibm.com):
    > On Fri, 2008-10-10 at 09:04 -0500, Serge E. Hallyn wrote:
    > > Remember a part of Ingo's motivation is to push c/r developers to
    > > address the lacking features that users use most, earlier. So the
    > > warnings and subsequent email complaints are what we're after. Hence a
    > > single 'checkpointable or not' flag.
    > >
    > > Given the single flag, how do you know at sys_mq_unlink() whether the
    > > process also has an open socket?
    > >
    > > Rather than make this tracking facility more complicated and intrusive,
    > > if people complain that they couldn't checkpoint bc of a warning about
    > > aio, then we implement aio c/r! We don't just try and reduce the amount
    > > of time that you can't checkpoint bc of lack of aio c/r support
    > >
    > > -serge

    >
    > Serge,
    >
    > It's exactly what I meant before, the tracking facility would be awfully
    > complicated. It cannot be done that way.
    > But there's also something awkward with the flag thing: can you provide
    > right now an exhaustive list of all the places where you must raise it?
    >
    > I'd rather do some heavy checking at checkpoint time.


    No one is saying that we are not going to do that.

    -serge

  3. Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

    Greg Kurz wrote:

    > This flag is weak... testing it gives absolutely no hint whether the
    > checkpoint may succeed or not. As it is designed now, a user can only be
    > aware that checkpoint is *forever* denied. I agree that it's only useful
    > as a "flexible CR todo list".


    I don't think it's true that it gives "absolutely no hint".

    If the flag is not set, then checkpoint will succeed, right? Whereas if
    the flag is set, then it's an indication that checkpoint could fail (but
    may still succeed if whatever condition caused the flag to be set is no
    longer true).

    Chris
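
Editor's sketch, not taken from the patch: the single one-way flag being argued about can be modeled in a few lines of userspace C. Only `process_deny_checkpoint()` is named in the thread; `struct task`, `task_init()`, `deny_checkpoint()` and `checkpoint_expected_to_work()` below are hypothetical stand-ins, and the real kernel signature may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not the kernel's task_struct. */
struct task {
    bool may_checkpoint;      /* true until the first unsupported operation */
    const char *deny_reason;  /* first reason recorded, NULL if none */
};

static void task_init(struct task *t)
{
    assert(t != NULL);
    t->may_checkpoint = true;
    t->deny_reason = NULL;
}

/*
 * Models the role of process_deny_checkpoint() as described in the thread:
 * a one-way switch. Nothing in this sketch ever turns the flag back on.
 */
static void deny_checkpoint(struct task *t, const char *reason)
{
    assert(t != NULL);
    if (t->may_checkpoint) {
        t->may_checkpoint = false;
        t->deny_reason = reason;  /* keep only the first reason */
    }
}

static bool checkpoint_expected_to_work(const struct task *t)
{
    return t->may_checkpoint;
}
```

Because nothing re-arms the flag in this model, a task that touched an unsupported feature once stays marked even after it stops using it. That is the "forever denied" behaviour Greg objects to, and also what makes the flag usable as a todo list.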


  4. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    On Fri, 2008-10-10 at 18:45 +0200, Greg Kurz wrote:
    > It's exactly what I meant before, the tracking facility would be awfully
    > complicated. It cannot be done that way.
    > But there's also something awkward with the flag thing: can you provide
    > right now an exhaustive list of all the places where you must raise it?


    Greg, that's just pure FUD. We don't say that spinlocks are a bad thing
    because we can't come up with an exhaustive list of places where we need
    locking.

    We'll do plenty of checks at checkpoint time.

    We'll do plenty of checks at runtime.

    Neither will work completely on its own, and neither will be exhaustive.
    The way this will work is just as Serge said: in true Linux style, we'll
    add more users of process_deny_checkpoint() incrementally as we
    find them and as people complain. We'll also be incrementally removing
    them as we add functionality.

    -- Dave
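
Editor's sketch of how the two kinds of check described above could coexist, under stated assumptions: the resource kinds, the rule that only files and pipes are supported, and every name below are invented for illustration and do not come from the patches. The runtime flag acts as a sticky warning, while the checkpoint-time walk looks at what the task holds at that moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RES 8

enum res_kind { RES_FILE, RES_PIPE, RES_SOCKET, RES_AIO_CTX };

struct task {
    bool warned;                 /* runtime flag: something unsupported was used */
    enum res_kind res[MAX_RES];  /* resources currently held */
    size_t nres;
};

/* Assumed for illustration: only plain files and pipes can be checkpointed. */
static bool kind_supported(enum res_kind k)
{
    return k == RES_FILE || k == RES_PIPE;
}

/* Runtime check: a call site marks the task when an unsupported kind appears. */
static int task_acquire(struct task *t, enum res_kind k)
{
    assert(t != NULL);
    if (t->nres == MAX_RES)
        return -1;
    t->res[t->nres++] = k;
    if (!kind_supported(k))
        t->warned = true;
    return 0;
}

/* Drop the most recently acquired resource; the runtime flag stays set. */
static void task_release_last(struct task *t)
{
    assert(t != NULL);
    if (t->nres > 0)
        t->nres--;
}

/* Checkpoint-time check: inspect what the task holds right now. */
static bool checkpoint_time_check(const struct task *t)
{
    for (size_t i = 0; i < t->nres; i++)
        if (!kind_supported(t->res[i]))
            return false;
    return true;
}
```

In this model a task that opened and then closed a socket is still `warned`, yet passes `checkpoint_time_check()`. Adding support for a new kind means extending `kind_supported()`, which removes a runtime marking and a checkpoint-time refusal in one place.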


  5. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    On Friday, 10 of October 2008, Ingo Molnar wrote:
    >
    > * Rafael J. Wysocki wrote:
    >
    > > > In the long run, could we expect a (experimental) version of
    > > > hibernation that would just use this checkpointing facility to
    > > > hibernate?

    > >
    > > Surely not ACPI-compliant.

    >
    > what do you mean?


    The ACPI spec says quite specifically what should be done while entering
    hibernation and during resume from hibernation. We're not following that
    in the current code, but we can (gradually) update the code to become
    ACPI-compliant in that respect. However, if we go the checkpointing route, I
    don't think that will be possible any more.

    [In short, the problem is that ACPI regards the S4 state corresponding to
    hibernation as a sleep state of the system which is therefore fundamentally
    different from the soft power-off state and requires special handling.]

    This may be a theory etc. (I don't want to start the entire discussion about
    that once again), but clearly there's a choice to be made here. I'd prefer
    hibernation to be ACPI-compliant, but if people don't want that, I won't fight
    for it.

    Thanks,
    Rafael

  6. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()


    * Rafael J. Wysocki wrote:

    > > > Surely not ACPI-compliant.

    > >
    > > what do you mean?

    >
    > The ACPI spec says quite specifically what should be done while
    > entering hibernation and during resume from hibernation. We're not
    > following that in the current code, but we can (gradually) update the
    > code to become ACPI-compliant in that respect. However, if we go the
    > checkpointing route, I don't think that will be possible any more.


    ah, i see. I did not mean to utilize any ACPI paths but simple powerdown
    or reboot.

    If we checkpoint all apps to persistent disk areas (which the checkpoint
    patches in this thread are about), then we can just reboot the kernel
    and forget all its state.

    That capability can be used to build a really robust hibernation
    implementation IMO: we could "hibernate/kexec" over between different
    kernel versions transparently. (only a small delay will be noticed by
    the user - if we do it smartly with in-kernel modesetting then not even
    the screen contents will be changed over this.)

    Ingo

  7. Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

    Dave Hansen wrote:
    > On Fri, 2008-10-10 at 18:34 +0200, Greg Kurz wrote:
    >> This flag is weak... testing it gives absolutely no hint whether the
    >> checkpoint may succeed or not. As it is designed now, a user can only be
    >> aware that checkpoint is *forever* denied. I agree that it's only useful
    >> as a "flexible CR todo list".

    >
    > Cool, so everyone agrees the patch is useful!


    Yes, definitely. It is a good idea.

  8. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()



    On Fri, 10 Oct 2008, Ingo Molnar wrote:

    >
    > * Rafael J. Wysocki wrote:
    >
    > > > > Surely not ACPI-compliant.
    > > >
    > > > what do you mean?

    > >
    > > The ACPI spec says quite specifically what should be done while
    > > entering hibernation and during resume from hibernation. We're not
    > > following that in the current code, but we can (gradually) update the
    > > code to become ACPI-compliant in that respect. However, if we go the
    > > checkpointing route, I don't think that will be possible any more.

    >
    > ah, i see. I did not mean to utilize any ACPI paths but simple powerdown
    > or reboot.


    If we don't enter ACPI S4, and instead poweroff,
    then we'll lose the capability to wake the system from
    devices that are capable of waking S4, but incapable of waking S5.

    ie. The power button will still work, but others may not.

    cheers,
    -Len
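
Editor's sketch of the S4 versus S5 wake-up point, under stated assumptions: each device is reduced to the deepest sleep state it can wake the system from, and a device counts as a wake source for a state only if that state is no deeper than its limit. `struct wake_dev`, `can_wake_from()` and `count_wakers()` are hypothetical names, and real platforms describe wake capability through ACPI tables rather than a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ACPI sleep states relevant here: S4 (hibernation) and S5 (soft off). */
enum sleep_state { STATE_S4 = 4, STATE_S5 = 5 };

/* Hypothetical device record: the deepest state it can wake the system from. */
struct wake_dev {
    const char *name;
    enum sleep_state deepest_wake;
};

static bool can_wake_from(const struct wake_dev *d, enum sleep_state s)
{
    assert(d != NULL);
    return s <= d->deepest_wake;
}

/* Count the devices that could still wake the machine from state s. */
static size_t count_wakers(const struct wake_dev *devs, size_t n,
                           enum sleep_state s)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (can_wake_from(&devs[i], s))
            count++;
    return count;
}
```

With a made-up table in which only the power button has `deepest_wake == STATE_S5`, `count_wakers()` reports fewer wake sources for `STATE_S5` than for `STATE_S4`, which is the capability lost by powering off instead of entering S4.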

    > If we checkpoint all apps to persistent disk areas (which the checkpoint
    > patches in this thread are about), then we can just reboot the kernel
    > and forget all its state.
    >
    > That capability can be used to build a really robust hibernation
    > implementation IMO: we could "hibernate/kexec" over between different
    > kernel versions transparently. (only a small delay will be noticed by
    > the user - if we do it smartly with in-kernel modesetting then not even
    > the screen contents will be changed over this.)



  9. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    On Friday, 10 of October 2008, Ingo Molnar wrote:
    >
    > * Rafael J. Wysocki wrote:
    >
    > > > > Surely not ACPI-compliant.
    > > >
    > > > what do you mean?

    > >
    > > The ACPI spec says quite specifically what should be done while
    > > entering hibernation and during resume from hibernation. We're not
    > > following that in the current code, but we can (gradually) update the
    > > code to become ACPI-compliant in that respect. However, if we go the
    > > checkpointing route, I don't think that will be possible any more.

    >
    > ah, i see. I did not mean to utilize any ACPI paths but simple powerdown
    > or reboot.
    >
    > If we checkpoint all apps to persistent disk areas (which the checkpoint
    > patches in this thread are about), then we can just reboot the kernel
    > and forget all its state.
    >
    > That capability can be used to build a really robust hibernation
    > implementation IMO: we could "hibernate/kexec" over between different
    > kernel versions transparently. (only a small delay will be noticed by
    > the user - if we do it smartly with in-kernel modesetting then not even
    > the screen contents will be changed over this.)


    That actually should be called a migration of VM IMO and would be a useful
    functionality. Sure.

    Hibernation, however, generally involves the restoration of the hardware and
    most importantly _platform_ state which IMO is impossible without the ACPI
    functionality, as well as wake-up, which may depend on ACPI too.

    Thanks,
    Rafael

  10. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    Hi!

    > > > Hmm, I don't know too much about aio, but is it possible to succeed with
    > > > io_getevents if we didn't first do a submit? It looks like the contexts
    > > > are looked up out of current->mm, so I don't think we need this call
    > > > here.
    > > >
    > > > Otherwise, this is neat.

    > >
    > > Good question. I know nothing, either.
    > >
    > > My thought was that any process *trying* to do aio stuff of any kind
    > > is going to be really confused if it gets checkpointed. Or, it might
    > > try to submit an aio right after it checks the list of them. I
    > > thought it best to be cautious and say, if you screw with aio, no
    > > checkpointing for you!

    >
    > as long as there's total transparency and the transition from CR-capable
    > to CR-disabled state is absolutely safe and race-free, that should be
    > fine.
    >
    > I expect users to quickly cause enough pressure to reduce the NOCR areas
    > of the kernel significantly ;-)
    >
    > In the long run, could we expect a (experimental) version of hibernation
    > that would just use this checkpointing facility to hibernate? That would
    > be way cool for users and for testing: we could do transparent kernel
    > upgrades/downgrades via this form of hibernation, between CR-compatible
    > kernels (!).


    Well, if we could do that, I guess we could also use CR to 'hibernate'
    your desktop then continue on your notebook. And yes that sounds cool.

    > Pie in the sky for sure, but way cool: it could propel Linux kernel
    > testing to completely new areas - new kernels could be tried
    > non-intrusively. (as long as a new kernel does not corrupt the CR data
    > structures - so some good consistency and redundancy checking would be
    > nice in the format!)


    Well, for simple apps, it should not be that hard...
    Pavel
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  11. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()


    * Pavel Machek wrote:

    > > In the long run, could we expect a (experimental) version of
    > > hibernation that would just use this checkpointing facility to
    > > hibernate? That would be way cool for users and for testing: we
    > > could do transparent kernel upgrades/downgrades via this form of
    > > hibernation, between CR-compatible kernels (!).

    >
    > Well, if we could do that, I guess we could also use CR to 'hibernate'
    > your desktop then continue on your notebook. And yes that sounds cool.


    yes.

    > > Pie in the sky for sure, but way cool: it could propel Linux kernel
    > > testing to completely new areas - new kernels could be tried
    > > non-intrusively. (as long as a new kernel does not corrupt the CR
    > > data structures - so some good consistency and redundancy checking
    > > would be nice in the format!)

    >
    > Well, for simple apps, it should not be that hard...


    Generally, if something works for simple apps already (in a robust,
    compatible and supportable way) and users find it "very cool", then
    support for more complex apps is not far in the future.

    but if you want to support more complex apps straight away, it takes
    forever and gets ugly.

    Ingo

  12. Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

    On Fri, 2008-10-10 at 11:18 -0600, Chris Friesen wrote:
    > Greg Kurz wrote:
    >
    > > This flag is weak... testing it gives absolutely no hint whether the
    > > checkpoint may succeed or not. As it is designed now, a user can only be
    > > aware that checkpoint is *forever* denied. I agree that it's only useful
    > > as a "flexible CR todo list".

    >
    > I don't think it's true that it gives "absolutely no hint".
    >
    > If the flag is not set, then checkpoint will succeed, right? Whereas if


    Wrong. Unless you test_and_checkpoint atomically, the flag doesn't help.

    > the flag is set, then it's an indication that checkpoint could fail (but
    > may still succeed if whatever condition caused the flag to be set is no
    > longer true).
    >
    > Chris
    >

    --
    Gregory Kurz gkurz@fr.ibm.com
    Software Engineer @ IBM/Meiosys http://www.ibm.com
    Tel +33 (0)534 638 479 Fax +33 (0)561 400 420

    "Anarchy is about taking complete responsibility for yourself."
    Alan Moore.


  13. Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint()

    On Fri, 2008-10-10 at 10:28 -0700, Dave Hansen wrote:
    > On Fri, 2008-10-10 at 18:45 +0200, Greg Kurz wrote:
    > > It's exactly what I meant before, the tracking facility would be awfully
    > > complicated. It cannot be done that way.
    > > But there's also something awkward with the flag thing: can you provide
    > > right now an exhaustive list of all the places where you must raise it?

    >
    > Greg, that's just pure FUD. We don't say that spinlocks are a bad thing
    > because we can't come up with an exhaustive list of places where we need
    > locking.
    >
    > We'll do plenty of checks at checkpoint time.
    >
    > We'll do plenty of checks at runtime.
    >
    > Neither will work completely on its own, and neither will be exhaustive.
    > The way this will work is just as Serge said: in true Linux style, we'll
    > add more users of process_deny_checkpoint() incrementally as we
    > find them and as people complain. We'll also be incrementally removing
    > them as we add functionality.
    >
    > -- Dave
    >


    Well then I misunderstood the purpose of your initial postings. Sorry.

    --
    Gregory Kurz gkurz@fr.ibm.com
    Software Engineer @ IBM/Meiosys http://www.ibm.com
    Tel +33 (0)534 638 479 Fax +33 (0)561 400 420

    "Anarchy is about taking complete responsibility for yourself."
    Alan Moore.


  14. Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work

    Quoting Greg Kurz (gkurz@fr.ibm.com):
    > On Fri, 2008-10-10 at 11:18 -0600, Chris Friesen wrote:
    > > Greg Kurz wrote:
    > >
    > > > This flag is weak... testing it gives absolutely no hint whether the
    > > > checkpoint may succeed or not. As it is designed now, a user can only be
    > > > aware that checkpoint is *forever* denied. I agree that it's only useful
    > > > as a "flexible CR todo list".

    > >
    > > I don't think it's true that it gives "absolutely no hint".
    > >
    > > If the flag is not set, then checkpoint will succeed, right? Whereas if

    >
    > Wrong. Unless you test_and_checkpoint atomically, the flag doesn't help.


    Atomically wrt what? Presumably you test and checkpoint while the
    container is frozen...
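
Editor's sketch of the "test and checkpoint while frozen" idea, under stated assumptions: a frozen container's tasks make no progress, so nothing can flip the flag between the test and the dump. `struct container`, `use_unsupported_feature()` and `checkpoint_container()` are hypothetical, and the `racer` callback merely simulates a task that tries to run in that window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a container; the names are not from the patches. */
struct container {
    bool frozen;     /* tasks make no progress while this is set */
    bool deny_flag;  /* set when a task uses an unsupported feature */
};

/* A task trying to use an unsupported feature. Frozen tasks cannot run. */
static int use_unsupported_feature(struct container *c)
{
    assert(c != NULL);
    if (c->frozen)
        return -1;
    c->deny_flag = true;
    return 0;
}

/*
 * Freeze, test the flag, dump, thaw. `racer` stands in for a task that
 * tries to run between the test and the dump; it may be NULL.
 * Returns 0 if the checkpoint was taken, -1 if it was refused.
 */
static int checkpoint_container(struct container *c,
                                int (*racer)(struct container *))
{
    int ret = -1;

    assert(c != NULL);
    c->frozen = true;
    if (!c->deny_flag) {
        if (racer)
            (void)racer(c);           /* cannot change the flag while frozen */
        ret = c->deny_flag ? -1 : 0;  /* state still matches the test */
    }
    c->frozen = false;
    return ret;
}
```

Without the freeze, the `racer` call could set `deny_flag` after the test had already passed, which is the race Greg's "test_and_checkpoint atomically" remark points at.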

    > > the flag is set, then it's an indication that checkpoint could fail (but
    > > may still succeed if whatever condition caused the flag to be set is no
    > > longer true).
    > >
    > > Chris
    > >


