Thread: CONFIG_NUMA breaks hibernation on x86-32 with PAE

  1. CONFIG_NUMA breaks hibernation on x86-32 with PAE


    ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    acpi_hibernation_leave, then SLAB corruption is detected and the machine
    ends in a series of oopses.

    Any ideas how to debug that?
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE


    * Pavel Machek wrote:

    > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > ends in series of oops.
    >
    > Any ideas how to debug that?


    do you get any serial log or USB key output, so that it's debuggable
    directly?

    Ingo

  3. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    >
    > * Pavel Machek wrote:
    >
    > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > ends in series of oops.
    > >
    > > Any ideas how to debug that?

    >
    > do you get any serial log or USB key output, so that it's debuggable
    > directly?


    Well, I can transcribe the BUG() from a picture, I guess, but it does
    not seem to contain much useful info: SLAB corruption was detected and
    backtrace is not quite important at that point...

    Serial console is probably possible, but would take a few days to set up.
    Pavel
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  4. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE


    * Pavel Machek wrote:

    > >
    > > * Pavel Machek wrote:
    > >
    > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > ends in series of oops.
    > > >
    > > > Any ideas how to debug that?

    > >
    > > do you get any serial log or USB key output, so that it's debuggable
    > > directly?

    >
    > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > not seem to contain much useful info: SLAB corruption was detected and
    > backtrace is not quite important at that point...
    >
    > Serial console is probably possible, but would take few days to setup.


    No good ideas - the bug description gives me the impression of memory
    maps save/restore hiccup in the hibernation code - and memory maps are
    pretty much the only thing that is significantly different on NUMA.
    (amongst the things that would matter to hibernation - there's a lot
    more other NUMA details)

    In any case, could you send the .config that fails please, so that
    this is documented better?

    Ingo

  5. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE


    * KAMEZAWA Hiroyuki wrote:

    > On Mon, 10 Nov 2008 08:55:29 +0100
    > Pavel Machek wrote:
    >
    > > >
    > > > * Pavel Machek wrote:
    > > >
    > > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > > ends in series of oops.
    > > > >
    > > > > Any ideas how to debug that?
    > > >
    > > > do you get any serial log or USB key output, so that it's debuggable
    > > > directly?

    > >
    > > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > > not seem to contain much useful info: SLAB corruption was detected and
    > > backtrace is not quite important at that point...
    > >
    > > Serial console is probably possible, but would take few days to setup.
    > > Pavel

    >
    > How about this patch ?
    > ==
    > http://marc.info/?l=linux-mm-commits...4388629106&w=2
    > ==


    note, that fix of Rafael's is now upstream as:

    c5d7124: Fix __pfn_to_page(pfn) for CONFIG_DISCONTIGMEM=y

    and included in v2.6.28-rc4.

    But yes, this area of code (the save/restore of memory maps) would be
    the main suspect to investigate.

    Ingo

  6. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon, 10 Nov 2008 08:55:29 +0100
    Pavel Machek wrote:

    > >
    > > * Pavel Machek wrote:
    > >
    > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > ends in series of oops.
    > > >
    > > > Any ideas how to debug that?

    > >
    > > do you get any serial log or USB key output, so that it's debuggable
    > > directly?

    >
    > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > not seem to contain much useful info: SLAB corruption was detected and
    > backtrace is not quite important at that point...
    >
    > Serial console is probably possible, but would take few days to setup.
    > Pavel


    How about this patch ?
    ==
    http://marc.info/?l=linux-mm-commits...4388629106&w=2
    ==

    Thanks,
    -Kame



  7. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon 2008-11-10 09:04:36, Ingo Molnar wrote:
    >
    > * Pavel Machek wrote:
    >
    > > >
    > > > * Pavel Machek wrote:
    > > >
    > > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > > ends in series of oops.
    > > > >
    > > > > Any ideas how to debug that?
    > > >
    > > > do you get any serial log or USB key output, so that it's debuggable
    > > > directly?

    > >
    > > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > > not seem to contain much useful info: SLAB corruption was detected and
    > > backtrace is not quite important at that point...
    > >
    > > Serial console is probably possible, but would take few days to setup.

    >
    > No good ideas - the bug description gives me the impression of memory
    > maps save/restore hiccup in the hibernation code - and memory maps are
    > pretty much the only thing that is significantly different on NUMA.
    > (amongst the things that would matter to hibernation - there's a lot
    > more other NUMA details)
    >
    > In any case, could you send the .config that fails please, so that
    > this is documented better?


    Attached... but exact config does not seem to matter.

    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html


  8. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon 2008-11-10 17:24:07, KAMEZAWA Hiroyuki wrote:
    > On Mon, 10 Nov 2008 08:55:29 +0100
    > Pavel Machek wrote:
    >
    > > >
    > > > * Pavel Machek wrote:
    > > >
    > > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > > ends in series of oops.
    > > > >
    > > > > Any ideas how to debug that?
    > > >
    > > > do you get any serial log or USB key output, so that it's debuggable
    > > > directly?

    > >
    > > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > > not seem to contain much useful info: SLAB corruption was detected and
    > > backtrace is not quite important at that point...
    > >
    > > Serial console is probably possible, but would take few days to setup.

    >
    > How about this patch ?
    > ==
    > http://marc.info/?l=linux-mm-commits...4388629106&w=2
    > ==


    Yep, that one is necessary to get that far... yes, I have it applied.

    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  9. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    Ingo Molnar writes:
    >
    > No good ideas - the bug description gives me the impression of memory
    > maps save/restore hiccup in the hibernation code - and memory maps are
    > pretty much the only thing that is significantly different on NUMA.


    I assume the problem happened on a single node system.
    On single node the memory map should be actually quite similar
    to the UMA case.

    One possibility would be to bisect if it ever worked?

    -Andi

    --
    ak@linux.intel.com

  10. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE


    * Pavel Machek wrote:

    > > In any case, could you send the .config that fails please, so that
    > > this is documented better?

    >
    > Attached... but exact config does not seem to matter.


    there are the physical memory enumeration model options:

    # CONFIG_FLATMEM_MANUAL is not set
    # CONFIG_DISCONTIGMEM_MANUAL is not set
    CONFIG_SPARSEMEM_MANUAL=y
    CONFIG_SPARSEMEM=y
    CONFIG_NEED_MULTIPLE_NODES=y
    CONFIG_HAVE_MEMORY_PRESENT=y
    CONFIG_SPARSEMEM_STATIC=y

    i can only suggest some stupid tweaking of the above options - for
    example, does the bug scenario change in any way if you switch it over
    to CONFIG_DISCONTIGMEM=y ?
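
    For readers unfamiliar with these options, here is a rough userspace
    sketch of how a flat and a sparse memory model might turn a page frame
    number (pfn) into a page descriptor. It is an illustration under stated
    assumptions only: model_page, flat_pfn_to_page(), sparse_pfn_to_page(),
    sparse_mark_present() and the section sizes are hypothetical names and
    values, not the kernel's definitions, and the real kernel encodes its
    section maps differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct model_page { unsigned long flags; };

/* Assumed toy geometry: 4 sections of 4 pages each. */
#define MODEL_SECTION_SHIFT     2
#define MODEL_PAGES_PER_SECTION (1UL << MODEL_SECTION_SHIFT)
#define MODEL_NR_SECTIONS       4
#define MODEL_NR_PAGES          (MODEL_NR_SECTIONS * MODEL_PAGES_PER_SECTION)

/* Flat model: one contiguous descriptor array covering every pfn. */
static struct model_page flat_map[MODEL_NR_PAGES];

/* Sparse model: one descriptor array per section; NULL marks a hole. */
static struct model_page section_storage[MODEL_NR_SECTIONS][MODEL_PAGES_PER_SECTION];
static struct model_page *section_map[MODEL_NR_SECTIONS];

/* Flat lookup is a plain array index. */
static struct model_page *flat_pfn_to_page(unsigned long pfn)
{
    return pfn < MODEL_NR_PAGES ? &flat_map[pfn] : NULL;
}

/* Mark a section as backed by memory (loosely analogous to memory_present()). */
static void sparse_mark_present(unsigned long section)
{
    assert(section < MODEL_NR_SECTIONS);
    section_map[section] = section_storage[section];
}

/* Sparse lookup: find the section first, then index within it. */
static struct model_page *sparse_pfn_to_page(unsigned long pfn)
{
    unsigned long section = pfn >> MODEL_SECTION_SHIFT;

    if (section >= MODEL_NR_SECTIONS || section_map[section] == NULL)
        return NULL; /* pfn falls into a hole or out of range */
    return &section_map[section][pfn & (MODEL_PAGES_PER_SECTION - 1)];
}
```

    The point of the sketch is only that the two models reach a descriptor
    by different routes, so code that walks pfns (as hibernation does) has
    to cope with holes in the sparse case.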

    Ingo

  11. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    Ingo Molnar writes:

    > i can only suggest some stupid tweaking of the above options - for
    > example, does the bug scenario change in any way if you switch it over
    > to CONFIG_DISCONTIGMEM=y ?


    iirc we removed that some time ago so it would probably need some
    source hackery. There is only vmemmap or flat left.

    -Andi

    --
    ak@linux.intel.com

  12. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Monday, 10 of November 2008, Ingo Molnar wrote:
    >
    > * KAMEZAWA Hiroyuki wrote:
    >
    > > On Mon, 10 Nov 2008 08:55:29 +0100
    > > Pavel Machek wrote:
    > >
    > > > >
    > > > > * Pavel Machek wrote:
    > > > >
    > > > > > ...at least on 2.6.27 and 2.6.28-rc3. Resume gets to
    > > > > > acpi_hibernation_leave, then SLAB corruption is detected and machine
    > > > > > ends in series of oops.
    > > > > >
    > > > > > Any ideas how to debug that?
    > > > >
    > > > > do you get any serial log or USB key output, so that it's debuggable
    > > > > directly?
    > > >
    > > > Well, I can transcribe the BUG() from a picture, I guess, but it does
    > > > not seem to contain much useful info: SLAB corruption was detected and
    > > > backtrace is not quite important at that point...
    > > >
    > > > Serial console is probably possible, but would take few days to setup.
    > > > Pavel

    > >
    > > How about this patch ?
    > > ==
    > > http://marc.info/?l=linux-mm-commits...4388629106&w=2
    > > ==

    >
    > note, that fix of Rafael's is now upstream as:
    >
    > c5d7124: Fix __pfn_to_page(pfn) for CONFIG_DISCONTIGMEM=y
    >
    > and included in v2.6.28-rc4.
    >
    > But yes, this area of code (the save/restore of memory maps) would be
    > the main suspect to investigate.


    It surely is, but the root cause need not be there, actually.

    The problem only happens with CONFIG_NUMA enabled, only on 32-bit systems
    and it _doesn't_ happen if the kernel is booted with highmem=0.

    Evidently, CONFIG_NUMA is sufficient for the breakage to appear, even if
    CONFIG_DISCONTIGMEM is not set (CONFIG_SPARSEMEM is set in this case, but
    hibernation works with CONFIG_SPARSEMEM and without CONFIG_NUMA, both for
    CONFIG_HIGHMEM64G=y).

    I've been investigating this issue for quite some time now and it's starting
    to look like a problem with kmap_atomic() or something along these lines.

    Thanks,
    Rafael

  13. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Monday, 10 of November 2008, Andi Kleen wrote:
    > Ingo Molnar writes:
    > >
    > > No good ideas - the bug description gives me the impression of memory
    > > maps save/restore hiccup in the hibernation code - and memory maps are
    > > pretty much the only thing that is significantly different on NUMA.

    >
    > I assume the problem happened on a single node system.
    > On single node the memory map should be actually quite similar
    > to the UMA case.


    It is. However, the problem is 100% reproducible on any 32-bit single-node
    system with CONFIG_NUMA set, from what I can tell.

    It doesn't happen if the kernel is booted with highmem=0, so it looks like
    the code that saves highmem causes the problem to happen. However, this
    same code works well for all of the !CONFIG_NUMA cases and practically
    the only non-open-coded thing it uses is kmap_atomic().

    > One possibility would be to bisect if it ever worked?


    Not sure it did, probably not. :-(

    BTW, can you please tell me why HIGHMEM64G is requisite for NUMA on 32-bit?

    Thanks,
    Rafael

  14. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Monday, 10 of November 2008, Andi Kleen wrote:
    > Ingo Molnar writes:
    >
    > > i can only suggest some stupid tweaking of the above options - for
    > > example, does the bug scenario change in any way if you switch it over
    > > to CONFIG_DISCONTIGMEM=y ?

    >
    > iirc we removed that some time ago so it would probably need some
    > source hackery. There is only vmemmap or flat left.


    No, it's there. :-)

    Thanks,
    Rafael

  15. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon, Nov 10, 2008 at 07:33:25PM +0100, Rafael J. Wysocki wrote:
    > On Monday, 10 of November 2008, Andi Kleen wrote:
    > > Ingo Molnar writes:
    > >
    > > > i can only suggest some stupid tweaking of the above options - for
    > > > example, does the bug scenario change in any way if you switch it over
    > > > to CONFIG_DISCONTIGMEM=y ?

    > >
    > > iirc we removed that some time ago so it would probably need some
    > > source hackery. There is only vmemmap or flat left.

    >
    > No, it's there. :-)


    You're right, i was off in 64bit only land again.

    -Andi

    --
    ak@linux.intel.com

  16. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon 2008-11-10 19:28:03, Rafael J. Wysocki wrote:
    > On Monday, 10 of November 2008, Andi Kleen wrote:
    > > Ingo Molnar writes:
    > > >
    > > > No good ideas - the bug description gives me the impression of memory
    > > > maps save/restore hiccup in the hibernation code - and memory maps are
    > > > pretty much the only thing that is significantly different on NUMA.

    > >
    > > I assume the problem happened on a single node system.
    > > On single node the memory map should be actually quite similar
    > > to the UMA case.

    >
    > It is. However, the problem is 100% reproducible on any 32-bit single-node
    > system with CONFIG_NUMA set, from what I can tell.
    >
    > It doesn't happen if the kernel is booted with highmem=0, so it looks like
    > the code that saves highmem causes the problem to happen. However, this
    > same code works well for all of the !CONFIG_NUMA cases and practically
    > the only non-open-coded thing it uses is kmap_atomic().
    >
    > > One possibility would be to bisect if it ever worked?

    >
    > Not sure it did, probably not. :-(


    Well, interesting point would be just before this commit:


    commit 8357376d3df21b7d6f857931a57ac50da9c66e26
    tree daf2c369e9b79d24c1666323b3ae75189e482a4a
    parent bf73bae6ba0dc4bd4f1e570feb34a06b72725af6
    author Rafael J. Wysocki Wed, 06 Dec 2006 20:34:18 -0800
    committer Linus Torvalds Thu, 07 Dec 2006 08:39:27 -0800

    [PATCH] swsusp: Improve handling of highmem

    Currently swsusp saves the contents of highmem pages by copying them to
    the normal zone which is quite inefficient (eg. it requires two normal
    pages to be used for saving one highmem page). This may be improved by
    using highmem for saving the contents of saveable highmem pages.

    ...highmem handling was way simpler in those good old days ;-)
    Pavel
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  17. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Mon 2008-11-10 19:28:03, Rafael J. Wysocki wrote:
    > On Monday, 10 of November 2008, Andi Kleen wrote:
    > > Ingo Molnar writes:
    > > >
    > > > No good ideas - the bug description gives me the impression of memory
    > > > maps save/restore hiccup in the hibernation code - and memory maps are
    > > > pretty much the only thing that is significantly different on NUMA.

    > >
    > > I assume the problem happened on a single node system.
    > > On single node the memory map should be actually quite similar
    > > to the UMA case.

    >
    > It is. However, the problem is 100% reproducible on any 32-bit single-node
    > system with CONFIG_NUMA set, from what I can tell.
    >
    > It doesn't happen if the kernel is booted with highmem=0, so it looks like
    > the code that saves highmem causes the problem to happen. However, this
    > same code works well for all of the !CONFIG_NUMA cases and practically
    > the only non-open-coded thing it uses is kmap_atomic().


    kmap_atomic() and kernel_map_pages(), AFAICT.

    Can kmap_atomic() modify pages in highmem, too?

    Wait... are we putting kernel pagetables in the high memory?

    What about this cleanup (warning: untested)

    diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
    index 5d2ab83..f1d8336 100644
    --- a/kernel/power/snapshot.c
    +++ b/kernel/power/snapshot.c
    @@ -966,7 +966,7 @@ static void copy_data_page(unsigned long
                  * data modified by kmap_atomic()
                  */
                 safe_copy_page(buffer, s_page);
    -            dst = kmap_atomic(pfn_to_page(dst_pfn), KM_USER0);
    +            dst = kmap_atomic(d_page, KM_USER0);
                 memcpy(dst, buffer, PAGE_SIZE);
                 kunmap_atomic(dst, KM_USER0);
             } else {

    Pavel
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  18. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Tuesday, 11 of November 2008, Pavel Machek wrote:
    > On Mon 2008-11-10 19:28:03, Rafael J. Wysocki wrote:
    > > On Monday, 10 of November 2008, Andi Kleen wrote:
    > > > Ingo Molnar writes:
    > > > >
    > > > > No good ideas - the bug description gives me the impression of memory
    > > > > maps save/restore hiccup in the hibernation code - and memory maps are
    > > > > pretty much the only thing that is significantly different on NUMA.
    > > >
    > > > I assume the problem happened on a single node system.
    > > > On single node the memory map should be actually quite similar
    > > > to the UMA case.

    > >
    > > It is. However, the problem is 100% reproducible on any 32-bit single-node
    > > system with CONFIG_NUMA set, from what I can tell.
    > >
    > > It doesn't happen if the kernel is booted with highmem=0, so it looks like
    > > the code that saves highmem causes the problem to happen. However, this
    > > same code works well for all of the !CONFIG_NUMA cases and practically
    > > the only non-open-coded thing it uses is kmap_atomic().

    >
    > kmap_atomic() and kernel_map_pages(), AFAICT.
    >
    > Can kmap_atomic() modify pages in highmem, too?
    >
    > Wait... are we putting kernel pagetables in the high memory?
    >
    > What about this cleanup (warning: untested)


    Yes, I have this queued up and actually tested.

    Thanks,
    Rafael


    > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
    > index 5d2ab83..f1d8336 100644
    > --- a/kernel/power/snapshot.c
    > +++ b/kernel/power/snapshot.c
    > @@ -966,7 +966,7 @@ static void copy_data_page(unsigned long
    > * data modified by kmap_atomic()
    > */
    > safe_copy_page(buffer, s_page);
    > - dst = kmap_atomic(pfn_to_page(dst_pfn), KM_USER0);
    > + dst = kmap_atomic(d_page, KM_USER0);
    > memcpy(dst, buffer, PAGE_SIZE);
    > kunmap_atomic(dst, KM_USER0);
    > } else {
    >
    > Pavel




    --
    Everyone knows that debugging is twice as hard as writing a program
    in the first place. So if you're as clever as you can be when you write it,
    how will you ever debug it? --- Brian Kernighan

  19. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE

    On Tuesday, 11 of November 2008, Pavel Machek wrote:
    > On Mon 2008-11-10 19:28:03, Rafael J. Wysocki wrote:
    > > On Monday, 10 of November 2008, Andi Kleen wrote:
    > > > Ingo Molnar writes:
    > > > >
    > > > > No good ideas - the bug description gives me the impression of memory
    > > > > maps save/restore hiccup in the hibernation code - and memory maps are
    > > > > pretty much the only thing that is significantly different on NUMA.
    > > >
    > > > I assume the problem happened on a single node system.
    > > > On single node the memory map should be actually quite similar
    > > > to the UMA case.

    > >
    > > It is. However, the problem is 100% reproducible on any 32-bit single-node
    > > system with CONFIG_NUMA set, from what I can tell.
    > >
    > > It doesn't happen if the kernel is booted with highmem=0, so it looks like
    > > the code that saves highmem causes the problem to happen. However, this
    > > same code works well for all of the !CONFIG_NUMA cases and practically
    > > the only non-open-coded thing it uses is kmap_atomic().
    > >
    > > > One possibility would be to bisect if it ever worked?

    > >
    > > Not sure it did, probably not. :-(

    >
    > Well, interesting point would be just before this commit:
    >
    >
    > commit 8357376d3df21b7d6f857931a57ac50da9c66e26
    > tree daf2c369e9b79d24c1666323b3ae75189e482a4a
    > parent bf73bae6ba0dc4bd4f1e570feb34a06b72725af6
    > author Rafael J. Wysocki Wed, 06 Dec 2006 20:34:18 -0800
    > committer Linus Torvalds Thu, 07 Dec 2006 08:39:27 -0800
    >
    > [PATCH] swsusp: Improve handling of highmem
    >
    > Currently swsusp saves the contents of highmem pages by copying them to
    > the normal zone which is quite inefficient (eg. it requires two normal
    > pages to be used for saving one highmem page). This may be improved by
    > using highmem for saving the contents of saveable highmem pages.
    >
    > ...highmem handling was way simpler in those good old days ;-)


    Please stop kidding, this is a serious issue.

    The hibernation code _works_ with all kinds of highmem when CONFIG_NUMA is
    unset.

    Thanks,
    Rafael

  20. Re: CONFIG_NUMA breaks hibernation on x86-32 with PAE


    > > > It is. However, the problem is 100% reproducible on any 32-bit single-node
    > > > system with CONFIG_NUMA set, from what I can tell.
    > > >
    > > > It doesn't happen if the kernel is booted with highmem=0, so it looks like
    > > > the code that saves highmem causes the problem to happen. However, this
    > > > same code works well for all of the !CONFIG_NUMA cases and practically
    > > > the only non-open-coded thing it uses is kmap_atomic().
    > > >
    > > > > One possibility would be to bisect if it ever worked?
    > > >
    > > > Not sure it did, probably not. :-(

    > >
    > > Well, interesting point would be just before this commit:
    > >
    > >
    > > commit 8357376d3df21b7d6f857931a57ac50da9c66e26
    > > tree daf2c369e9b79d24c1666323b3ae75189e482a4a
    > > parent bf73bae6ba0dc4bd4f1e570feb34a06b72725af6
    > > author Rafael J. Wysocki Wed, 06 Dec 2006 20:34:18 -0800
    > > committer Linus Torvalds Thu, 07 Dec 2006 08:39:27 -0800
    > >
    > > [PATCH] swsusp: Improve handling of highmem
    > >
    > > Currently swsusp saves the contents of highmem pages by copying them to
    > > the normal zone which is quite inefficient (eg. it requires two normal
    > > pages to be used for saving one highmem page). This may be improved by
    > > using highmem for saving the contents of saveable highmem pages.
    > >
    > > ...highmem handling was way simpler in those good old days ;-)

    >
    > Please stop kidding, this is a serious issue.
    >
    > The hibernation code _works_ with all kinds of highmem when CONFIG_NUMA is
    > unset.


    And it does not work with a single highmem page when NUMA is set... I
    went through the highmem saving code, and it depends on highmem not
    changing from under it (right?) and is generally quite tricky ('if
    they are both in highmem do this, else if one of them is do that, else
    do something else') and it changes page protections on the fly, etc.

    I'm not saying the bug is in that code, but before that commit we had
    very stupid -- but very robust -- code. I'll check whether that one works
    with CONFIG_NUMA; perhaps we can get some debug info that way.
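
    The three-way branching described above could look roughly like the
    following userspace model. This is a sketch under assumptions, not the
    code in kernel/power/snapshot.c: model_page, model_copy_data_page(),
    the copy_path values and the bounce buffer are hypothetical stand-ins,
    plain memcpy() replaces the kmap_atomic()/kunmap_atomic() pairs, and
    the exact branch conditions are a guess based on the diff posted
    earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a page: a highmem flag plus its contents. */
struct model_page {
    bool highmem;
    unsigned char data[MODEL_PAGE_SIZE];
};

/* Which of the three paths a copy took; returned so callers can check it. */
enum copy_path { COPY_BOTH_MAPPED, COPY_VIA_BOUNCE, COPY_DIRECT };

/* Models the lowmem "buffer" that appears in the posted diff. */
static unsigned char bounce[MODEL_PAGE_SIZE];

static enum copy_path model_copy_data_page(struct model_page *dst,
                                           const struct model_page *src)
{
    assert(dst != NULL && src != NULL);

    if (src->highmem) {
        /* Assumed case 1: source is in highmem. The kernel would
         * temporarily map both pages before copying. */
        memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
        return COPY_BOTH_MAPPED;
    }
    if (dst->highmem) {
        /* Assumed case 2: lowmem source, highmem destination. Mapping
         * the destination might itself modify the source page, so the
         * source is snapshotted into a bounce buffer first. */
        memcpy(bounce, src->data, MODEL_PAGE_SIZE);
        memcpy(dst->data, bounce, MODEL_PAGE_SIZE);
        return COPY_VIA_BOUNCE;
    }
    /* Case 3: both pages are directly addressable. */
    memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
    return COPY_DIRECT;
}
```

    Even in this toy form the second case shows why the code is fragile:
    it is only correct if nothing touches the source page between the two
    copies.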

    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html