Linux 2.6.28-rc1 - Kernel


Thread: Linux 2.6.28-rc1

  1. Re: Linux 2.6.28-rc1

    On Fri, 2008-10-24 at 16:01 -0700, Arjan van de Ven wrote:
    > I suspect these are totally innocent; the reason I think this is that
    > select/poll only get used once you hit userspace... and you're hanging
    > way before that.


    Entirely correct. It seems commit
    4403b406d4369a275d483ece6ddee0088cc0d592 by Linus just fixed it for me.
    My boot hang is gone.

    Linux prometheus 2.6.28-rc1-00005-g23cf24c #1 SMP Sun Oct 26 13:12:55
    GMT 2008 x86_64 Quad-Core AMD Opteron(tm) Processor 2354 AuthenticAMD
    GNU/Linux

    Regards,
    Tony V.





  2. 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Thu, Oct 23, 2008 at 09:10:29PM -0700, Linus Torvalds wrote:
    >
    > It's been two weeks, so it's time to close the merge window. A 2.6.28-rc1
    > is out there, and it's hopefully all good.
    >


    I first encountered this problem in SLES 11 Beta 2 but now I see it
    affects 2.6.28-rc1 too.

    On some ppc64 machines, NVRAM is being corrupted very early in boot (before
    console is initialised). The machine reboots and then fails to find yaboot
    printing the error "PReP-BOOT: Unable to load PRep image". It's nowhere near
    as serious as the ftrace+e1000 problem as the machine is not bricked but it's
    fairly scary looking, the machine cannot boot and the fix is non-obvious. To
    "fix" the machine;

    1. Go to OpenFirmware prompt
    2. type dev nvram
    3. type wipe-nvram

    The machine will reboot, reconstruct the NVRAM using some magic, and yaboot
    works again, allowing an older kernel to be used. I bisected the problem down
    to this commit.

    From 91a00302959545a9ae423e99732b1e46eb19e877 Mon Sep 17 00:00:00 2001
    From: Paul Mackerras
    Date: Wed, 8 Oct 2008 14:03:29 +0000
    Subject: [PATCH] powerpc: Sync RPA note in zImage with kernel's RPA note

    Commit 9b09c6d909dfd8de96b99b9b9c808b94b0a71614 ("powerpc: Change the
    default link address for pSeries zImage kernels") changed the
    real-base value in the CHRP note added by the addnote program from
    12MB to 32MB to give more space for Open Firmware to load the zImage.
    (The real-base value says where we want OF to position itself in
    memory.) However, this change was ineffective on most pSeries
    machines, because the RPA note added by addnote has the "ignore me"
    flag set to 1. This was intended to tell OF to ignore just the RPA
    note, but has the side effect of also making OF ignore the CHRP note
    (at least on most pSeries machines).

    To solve this we have to set the "ignore me" flag to 0 in the RPA
    note. (We can't just omit the RPA note because that is equivalent to
    having an RPA note with default values, and the default values are not
    what we want.) However, then we have to make sure the values in the
    zImage's RPA note match up with the values that the kernel supplies
    later in prom_init.c with either the ibm,client-architecture-support
    call or the process-elf-header call in prom_send_capabilities().

    So this sets the "ignore me" flag in the RPA note in addnote to 0, and
    adjusts the RPA note values in addnote.c and in prom_init.c to be
    consistent with each other and with the values in ibm_architecture_vec.

    However, since the wrapper is independent of the kernel, this doesn't
    ensure that the notes will stay consistent. To ensure that, this adds
    code to addnote.c so that it can extract the kernel's RPA note from
    the kernel binary and put that in the zImage. To that end, we put the
    kernel's fake ELF header (which contains the kernel's RPA note) into
    its own section, and arrange for wrapper to pull out that section with
    objcopy and pass it to addnote, which then extracts the RPA note from
    it and transfers it to the zImage.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Benjamin Herrenschmidt

    diff --git a/arch/powerpc/boot/addnote.c b/arch/powerpc/boot/addnote.c
    index b1e5611..dcc9ab2 100644
    --- a/arch/powerpc/boot/addnote.c
    +++ b/arch/powerpc/boot/addnote.c
    @@ -11,7 +11,12 @@
    * as published by the Free Software Foundation; either version
    * 2 of the License, or (at your option) any later version.
    *
    - * Usage: addnote zImage
    + * Usage: addnote zImage [note.elf]
    + *
    + * If note.elf is supplied, it is the name of an ELF file that contains
    + * an RPA note to use instead of the built-in one. Alternatively, the
    + * note.elf file may be empty, in which case the built-in RPA note is
    + * used (this is to simplify how this is invoked from the wrapper script).
    */
    #include
    #include
    @@ -43,27 +48,29 @@ char rpaname[] = "IBM,RPA-Client-Config";
    */
    #define N_RPA_DESCR 8
    unsigned int rpanote[N_RPA_DESCR] = {
    - 0, /* lparaffinity */
    - 64, /* min_rmo_size */
    + 1, /* lparaffinity */
    + 128, /* min_rmo_size */
    0, /* min_rmo_percent */
    - 40, /* max_pft_size */
    + 46, /* max_pft_size */
    1, /* splpar */
    -1, /* min_load */
    - 0, /* new_mem_def */
    - 1, /* ignore_my_client_config */
    + 1, /* new_mem_def */
    + 0, /* ignore_my_client_config */
    };

    #define ROUNDUP(len) (((len) + 3) & ~3)

    unsigned char buf[512];
    +unsigned char notebuf[512];

    -#define GET_16BE(off) ((buf[off] << 8) + (buf[(off)+1]))
    -#define GET_32BE(off) ((GET_16BE(off) << 16) + GET_16BE((off)+2))
    +#define GET_16BE(b, off) (((b)[off] << 8) + ((b)[(off)+1]))
    +#define GET_32BE(b, off) ((GET_16BE((b), (off)) << 16) + \
    + GET_16BE((b), (off)+2))

    -#define PUT_16BE(off, v) (buf[off] = ((v) >> 8) & 0xff, \
    - buf[(off) + 1] = (v) & 0xff)
    -#define PUT_32BE(off, v) (PUT_16BE((off), (v) >> 16), \
    - PUT_16BE((off) + 2, (v)))
    +#define PUT_16BE(b, off, v) ((b)[off] = ((v) >> 8) & 0xff, \
    + (b)[(off) + 1] = (v) & 0xff)
    +#define PUT_32BE(b, off, v) (PUT_16BE((b), (off), (v) >> 16), \
    + PUT_16BE((b), (off) + 2, (v)))

    /* Structure of an ELF file */
    #define E_IDENT 0 /* ELF header */
    @@ -88,15 +95,71 @@ unsigned char buf[512];

    unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

    +unsigned char *read_rpanote(const char *fname, int *nnp)
    +{
    + int notefd, nr, i;
    + int ph, ps, np;
    + int note, notesize;
    +
    + notefd = open(fname, O_RDONLY);
    + if (notefd < 0) {
    + perror(fname);
    + exit(1);
    + }
    + nr = read(notefd, notebuf, sizeof(notebuf));
    + if (nr < 0) {
    + perror("read note");
    + exit(1);
    + }
    + if (nr == 0) /* empty file */
    + return NULL;
    + if (nr < E_HSIZE ||
    + memcmp(&notebuf[E_IDENT+EI_MAGIC], elf_magic, 4) != 0 ||
    + notebuf[E_IDENT+EI_CLASS] != ELFCLASS32 ||
    + notebuf[E_IDENT+EI_DATA] != ELFDATA2MSB)
    + goto notelf;
    + close(notefd);
    +
    + /* now look for the RPA-note */
    + ph = GET_32BE(notebuf, E_PHOFF);
    + ps = GET_16BE(notebuf, E_PHENTSIZE);
    + np = GET_16BE(notebuf, E_PHNUM);
    + if (ph < E_HSIZE || ps < PH_HSIZE || np < 1)
    + goto notelf;
    +
    + for (i = 0; i < np; ++i, ph += ps) {
    + if (GET_32BE(notebuf, ph + PH_TYPE) != PT_NOTE)
    + continue;
    + note = GET_32BE(notebuf, ph + PH_OFFSET);
    + notesize = GET_32BE(notebuf, ph + PH_FILESZ);
    + if (notesize < 34 || note + notesize > nr)
    + continue;
    + if (GET_32BE(notebuf, note) != strlen(rpaname) + 1 ||
    + GET_32BE(notebuf, note + 8) != 0x12759999 ||
    + strcmp((char *)&notebuf[note + 12], rpaname) != 0)
    + continue;
    + /* looks like an RPA note, return it */
    + *nnp = notesize;
    + return &notebuf[note];
    + }
    + /* no RPA note found */
    + return NULL;
    +
    + notelf:
    + fprintf(stderr, "%s is not a big-endian 32-bit ELF image\n", fname);
    + exit(1);
    +}
    +
    int
    main(int ac, char **av)
    {
    int fd, n, i;
    int ph, ps, np;
    int nnote, nnote2, ns;
    + unsigned char *rpap;

    - if (ac != 2) {
    - fprintf(stderr, "Usage: %s elf-file\n", av[0]);
    + if (ac != 2 && ac != 3) {
    + fprintf(stderr, "Usage: %s elf-file [rpanote.elf]\n", av[0]);
    exit(1);
    }
    fd = open(av[1], O_RDWR);
    @@ -107,6 +170,7 @@ main(int ac, char **av)

    nnote = 12 + ROUNDUP(strlen(arch) + 1) + sizeof(descr);
    nnote2 = 12 + ROUNDUP(strlen(rpaname) + 1) + sizeof(rpanote);
    + rpap = NULL;

    n = read(fd, buf, sizeof(buf));
    if (n < 0) {
    @@ -124,16 +188,19 @@ main(int ac, char **av)
    exit(1);
    }

    - ph = GET_32BE(E_PHOFF);
    - ps = GET_16BE(E_PHENTSIZE);
    - np = GET_16BE(E_PHNUM);
    + if (ac == 3)
    + rpap = read_rpanote(av[2], &nnote2);
    +
    + ph = GET_32BE(buf, E_PHOFF);
    + ps = GET_16BE(buf, E_PHENTSIZE);
    + np = GET_16BE(buf, E_PHNUM);
    if (ph < E_HSIZE || ps < PH_HSIZE || np < 1)
    goto notelf;
    if (ph + (np + 2) * ps + nnote + nnote2 > n)
    goto nospace;

    for (i = 0; i < np; ++i) {
    - if (GET_32BE(ph + PH_TYPE) == PT_NOTE) {
    + if (GET_32BE(buf, ph + PH_TYPE) == PT_NOTE) {
    fprintf(stderr, "%s already has a note entry\n",
    av[1]);
    exit(0);
    @@ -148,37 +215,42 @@ main(int ac, char **av)

    /* fill in the program header entry */
    ns = ph + 2 * ps;
    - PUT_32BE(ph + PH_TYPE, PT_NOTE);
    - PUT_32BE(ph + PH_OFFSET, ns);
    - PUT_32BE(ph + PH_FILESZ, nnote);
    + PUT_32BE(buf, ph + PH_TYPE, PT_NOTE);
    + PUT_32BE(buf, ph + PH_OFFSET, ns);
    + PUT_32BE(buf, ph + PH_FILESZ, nnote);

    /* fill in the note area we point to */
    /* XXX we should probably make this a proper section */
    - PUT_32BE(ns, strlen(arch) + 1);
    - PUT_32BE(ns + 4, N_DESCR * 4);
    - PUT_32BE(ns + 8, 0x1275);
    + PUT_32BE(buf, ns, strlen(arch) + 1);
    + PUT_32BE(buf, ns + 4, N_DESCR * 4);
    + PUT_32BE(buf, ns + 8, 0x1275);
    strcpy((char *) &buf[ns + 12], arch);
    ns += 12 + strlen(arch) + 1;
    for (i = 0; i < N_DESCR; ++i, ns += 4)
    - PUT_32BE(ns, descr[i]);
    + PUT_32BE(buf, ns, descr[i]);

    /* fill in the second program header entry and the RPA note area */
    ph += ps;
    - PUT_32BE(ph + PH_TYPE, PT_NOTE);
    - PUT_32BE(ph + PH_OFFSET, ns);
    - PUT_32BE(ph + PH_FILESZ, nnote2);
    + PUT_32BE(buf, ph + PH_TYPE, PT_NOTE);
    + PUT_32BE(buf, ph + PH_OFFSET, ns);
    + PUT_32BE(buf, ph + PH_FILESZ, nnote2);

    /* fill in the note area we point to */
    - PUT_32BE(ns, strlen(rpaname) + 1);
    - PUT_32BE(ns + 4, sizeof(rpanote));
    - PUT_32BE(ns + 8, 0x12759999);
    - strcpy((char *) &buf[ns + 12], rpaname);
    - ns += 12 + ROUNDUP(strlen(rpaname) + 1);
    - for (i = 0; i < N_RPA_DESCR; ++i, ns += 4)
    - PUT_32BE(ns, rpanote[i]);
    + if (rpap) {
    + /* RPA note supplied in file, just copy the whole thing over */
    + memcpy(buf + ns, rpap, nnote2);
    + } else {
    + PUT_32BE(buf, ns, strlen(rpaname) + 1);
    + PUT_32BE(buf, ns + 4, sizeof(rpanote));
    + PUT_32BE(buf, ns + 8, 0x12759999);
    + strcpy((char *) &buf[ns + 12], rpaname);
    + ns += 12 + ROUNDUP(strlen(rpaname) + 1);
    + for (i = 0; i < N_RPA_DESCR; ++i, ns += 4)
    + PUT_32BE(buf, ns, rpanote[i]);
    + }

    /* Update the number of program headers */
    - PUT_16BE(E_PHNUM, np + 2);
    + PUT_16BE(buf, E_PHNUM, np + 2);

    /* write back */
    lseek(fd, (long) 0, SEEK_SET);
    diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
    index 965c237..ee0dc41 100755
    --- a/arch/powerpc/boot/wrapper
    +++ b/arch/powerpc/boot/wrapper
    @@ -307,7 +307,9 @@ fi
    # post-processing needed for some platforms
    case "$platform" in
    pseries|chrp)
    - $objbin/addnote "$ofile"
    + ${CROSS}objcopy -O binary -j .fakeelf "$kernel" "$ofile".rpanote
    + $objbin/addnote "$ofile" "$ofile".rpanote
    + rm -r "$ofile".rpanote
    ;;
    coff)
    ${CROSS}objcopy -O aixcoff-rs6000 --set-start "$entry" "$ofile"
    diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
    index 7cf274a..2fdbc18 100644
    --- a/arch/powerpc/kernel/prom_init.c
    +++ b/arch/powerpc/kernel/prom_init.c
    @@ -732,7 +732,7 @@ static struct fake_elf {
    u32 ignore_me;
    } rpadesc;
    } rpanote;
    -} fake_elf = {
    +} fake_elf __section(.fakeelf) = {
    .elfhdr = {
    .e_ident = { 0x7f, 'E', 'L', 'F',
    ELFCLASS32, ELFDATA2MSB, EV_CURRENT },
    @@ -774,13 +774,13 @@ static struct fake_elf {
    .type = 0x12759999,
    .name = "IBM,RPA-Client-Config",
    .rpadesc = {
    - .lpar_affinity = 0,
    - .min_rmo_size = 64, /* in megabytes */
    + .lpar_affinity = 1,
    + .min_rmo_size = 128, /* in megabytes */
    .min_rmo_percent = 0,
    - .max_pft_size = 48, /* 2^48 bytes max PFT size */
    + .max_pft_size = 46, /* 2^46 bytes max PFT size */
    .splpar = 1,
    .min_load = ~0U,
    - .new_mem_def = 0
    + .new_mem_def = 1
    }
    }
    };
    diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
    index e6927fb..b39c27e 100644
    --- a/arch/powerpc/kernel/vmlinux.lds.S
    +++ b/arch/powerpc/kernel/vmlinux.lds.S
    @@ -203,6 +203,9 @@ SECTIONS
    *(.rela*)
    }

    + /* Fake ELF header containing RPA note; for addnote */
    + .fakeelf : AT(ADDR(.fakeelf) - LOAD_OFFSET) { *(.fakeelf) }
    +
    /* freed after init ends here */
    . = ALIGN(PAGE_SIZE);
    __init_end = .;
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/
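
For readers following the addnote.c changes in the patch above, here is a minimal standalone sketch of the big-endian accessors and the RPA-note check that read_rpanote() performs. It is not the kernel's code: the function names (get16be, get32be, put32be, looks_like_rpa_note) are hypothetical, and fixed-width unsigned types are used in place of the patch's int-based macros. The note name, the type value 0x12759999 and the 12-byte header layout are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RPA_NOTE_NAME "IBM,RPA-Client-Config"
#define RPA_NOTE_TYPE 0x12759999u

/* Read a 16-bit big-endian value at byte offset off. */
uint32_t get16be(const unsigned char *b, size_t off)
{
    return ((uint32_t)b[off] << 8) | b[off + 1];
}

/* Read a 32-bit big-endian value at byte offset off. */
uint32_t get32be(const unsigned char *b, size_t off)
{
    return (get16be(b, off) << 16) | get16be(b, off + 2);
}

/* Store a 32-bit value in big-endian byte order at byte offset off. */
void put32be(unsigned char *b, size_t off, uint32_t v)
{
    b[off]     = (v >> 24) & 0xff;
    b[off + 1] = (v >> 16) & 0xff;
    b[off + 2] = (v >> 8) & 0xff;
    b[off + 3] = v & 0xff;
}

/*
 * Return 1 if the len bytes at note start with an RPA note header:
 * namesz (4 bytes), descsz (4 bytes), type (4 bytes), then the
 * NUL-terminated name. Return 0 otherwise.
 */
int looks_like_rpa_note(const unsigned char *note, size_t len)
{
    size_t namesz = strlen(RPA_NOTE_NAME) + 1;   /* includes the NUL */

    assert(note != NULL);
    if (len < 12 + namesz)
        return 0;
    return get32be(note, 0) == namesz &&
           get32be(note, 8) == RPA_NOTE_TYPE &&
           memcmp(note + 12, RPA_NOTE_NAME, namesz) == 0;
}
```

The length check mirrors the patch's "notesize < 34" test: 12 bytes of header plus the 22-byte name including its terminator.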

  3. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    Mel Gorman writes:

    > On some ppc64 machines, NVRAM is being corrupted very early in boot (before
    > console is initialised). The machine reboots and then fails to find yaboot
    > printing the error "PReP-BOOT: Unable to load PRep image". It's nowhere near
    > as serious as the ftrace+e1000 problem as the machine is not bricked but it's
    > fairly scary looking, the machine cannot boot and the fix is non-obvious. To
    > "fix" the machine;
    >
    > 1. Go to OpenFirmware prompt
    > 2. type dev nvram
    > 3. type wipe-nvram
    >
    > The machine will reboot, reconstruct the NVRAM using some magic, and yaboot
    > works again, allowing an older kernel to be used. I bisected the problem down
    > to this commit.


    Eek!

    Which ppc64 machines has this been seen on, and how were they being
    booted (netboot, yaboot, etc.)?

    Is it just the Powerstations with their SLOF-based firmware, or is it
    IBM pSeries machines as well?

    Paul.

  4. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, Oct 31, 2008 at 07:52:02AM +1100, Paul Mackerras wrote:
    >Mel Gorman writes:
    >
    >> On some ppc64 machines, NVRAM is being corrupted very early in boot (before
    >> console is initialised). The machine reboots and then fails to find yaboot
    >> printing the error "PReP-BOOT: Unable to load PRep image". It's nowhere near
    >> as serious as the ftrace+e1000 problem as the machine is not bricked but it's
    >> fairly scary looking, the machine cannot boot and the fix is non-obvious. To
    >> "fix" the machine;
    >>
    >> 1. Go to OpenFirmware prompt
    >> 2. type dev nvram
    >> 3. type wipe-nvram
    >>
    >> The machine will reboot, reconstruct the NVRAM using some magic, and yaboot
    >> works again, allowing an older kernel to be used. I bisected the problem down
    >> to this commit.

    >
    >Eek!
    >
    >Which ppc64 machines has this been seen on, and how were they being
    >booted (netboot, yaboot, etc.)?
    >
    >Is it just the Powerstations with their SLOF-based firmware, or is it
    >IBM pSeries machines as well?


    I'm pretty sure it was with pSeries machines. I saw reports of POWER5
    being affected (p520 and p710). I believe one of them resolved the
    issue by upgrading firmware on the machine.

    josh

  5. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Thu, 2008-10-30 at 17:05 -0400, Josh Boyer wrote:
    > On Fri, Oct 31, 2008 at 07:52:02AM +1100, Paul Mackerras wrote:
    > >Mel Gorman writes:
    > >
    > >> On some ppc64 machines, NVRAM is being corrupted very early in boot (before
    > >> console is initialised). The machine reboots and then fails to find yaboot
    > >> printing the error "PReP-BOOT: Unable to load PRep image".

    ....
    > >Eek!
    > >
    > >Which ppc64 machines has this been seen on, and how were they being
    > >booted (netboot, yaboot, etc.)?
    > >
    > >Is it just the Powerstations with their SLOF-based firmware, or is it
    > >IBM pSeries machines as well?

    >
    > I'm pretty sure it was with pSeries machines. I saw reports of POWER5
    > being affected (p520 and p710). I believe one of them resolved the
    > issue by upgrading firmware on the machine.


    This is true of a p720 (CHRP IBM,9124-720) that I was testing on. With
    upgraded firmware, the problem is gone.

    --
    David Kleikamp
    IBM Linux Technology Center


  6. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, Oct 31, 2008 at 07:52:02AM +1100, Paul Mackerras wrote:
    > Mel Gorman writes:
    >
    > > On some ppc64 machines, NVRAM is being corrupted very early in boot (before
    > > console is initialised). The machine reboots and then fails to find yaboot
    > > printing the error "PReP-BOOT: Unable to load PRep image". It's nowhere near
    > > as serious as the ftrace+e1000 problem as the machine is not bricked but it's
    > > fairly scary looking, the machine cannot boot and the fix is non-obvious. To
    > > "fix" the machine;
    > >
    > > 1. Go to OpenFirmware prompt
    > > 2. type dev nvram
    > > 3. type wipe-nvram
    > >
    > > The machine will reboot, reconstruct the NVRAM using some magic, and yaboot
    > > works again, allowing an older kernel to be used. I bisected the problem down
    > > to this commit.

    >
    > Eek!
    >
    > Which ppc64 machines has this been seen on, and how were they being
    > booted (netboot, yaboot, etc.)?
    >


    Yaboot in my case and I've heard it affected a DVD installation. I don't
    know for sure if it affects netboot but as I think it's something the
    kernel is doing, it probably doesn't matter how it gets loaded?

    > Is it just the Powerstations with their SLOF-based firmware, or is it
    > IBM pSeries machines as well?
    >


    To be honest, I haven't been brave enough to try this on a Powerstation yet
    as I only have the one and I don't know if it's a) affected or b) fixable
    with the same workaround. It was an IBM pSeries that was affected in my case
    and a few people have hit the problem on pSeries AFAIK.

    It's been pointed out that it can be "fixed" by upgrading the firmware but
    surely we can avoid breaking the machine in the first place?

    --
    Mel Gorman
    Part-time Phd Student Linux Technology Center
    University of Limerick IBM Dublin Software Lab

  7. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    Mel Gorman writes:

    > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > know for sure if it affects netboot but as I think it's something the
    > kernel is doing, it probably doesn't matter how it gets loaded?


    I do need to know whether it was the vmlinux or the zImage.pseries
    that you were loading with yaboot. That commit you identified affects
    the contents of an ELF note in the zImage.pseries that firmware looks
    at, as well as a structure in the kernel itself that gets passed as an
    argument to a call to firmware. If you were loading a vmlinux with
    yaboot when you saw the corruption occur then that narrows things down
    a bit.

    Paul.

  8. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    Mel Gorman writes:

    > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > know for sure if it affects netboot but as I think it's something the
    > kernel is doing, it probably doesn't matter how it gets loaded?


    What changed in that commit was the contents of a couple of structures
    that the firmware looks at to see what the kernel wants from
    firmware. Specifically the change was to say that the kernel (or
    really the zImage wrapper) would like the firmware to be based at the
    32MB point (which is what AIX uses) rather than 12MB (which was the
    default on older machines).

    So, as I understand it, it's not anything the kernel is actively
    doing, it's how the firmware is reacting to what the kernel says it
    wants. And since we are requesting the same value as AIX (as far as I
    know) I'm really surprised it caused problems.

    We can revert that commit, but I still need to solve the problem that
    the distros are facing, namely that their installer kernel + initramfs
    images are now bigger than 12MB and can't be loaded if the firmware is
    based at 12MB. That's why I really want to understand the problem in
    more detail.

    > It's been pointed out that it can be "fixed" by upgrading the firmware but
    > surely we can avoid breaking the machine in the first place?


    Have you upgraded the firmware on the machine you saw this problem on?
    If not, would you be willing to run some tests for me?

    Paul.
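
The size constraint described in this message comes down to simple arithmetic. As a rough sketch, and a deliberate simplification: assume the image occupies one contiguous range that has to end at or below the address where the firmware is based. The helper name image_fits_below is hypothetical, and this is neither kernel nor firmware code.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024ull * 1024ull)

/*
 * Return 1 if an image of image_size bytes loaded at load_addr ends at or
 * below real_base (the address the firmware is based at), 0 otherwise.
 * Assumption: a single contiguous image and no other reserved ranges.
 */
int image_fits_below(uint64_t load_addr, uint64_t image_size,
                     uint64_t real_base)
{
    assert(load_addr <= real_base);
    return image_size <= real_base - load_addr;
}
```

With made-up numbers, a 10MB image loaded at 4MB gives image_fits_below(4 * MB, 10 * MB, 12 * MB) == 0, while the same image with the firmware based at 32MB gives image_fits_below(4 * MB, 10 * MB, 32 * MB) == 1.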

  9. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, Oct 31, 2008 at 10:18:38PM +1100, Paul Mackerras wrote:
    > Mel Gorman writes:
    >
    > > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > > know for sure if it affects netboot but as I think it's something the
    > > kernel is doing, it probably doesn't matter how it gets loaded?

    >
    > I do need to know whether it was the vmlinux or the zImage.pseries
    > that you were loading with yaboot. That commit you identified affects
    > the contents of an ELF note in the zImage.pseries that firmware looks
    > at, as well as a structure in the kernel itself that gets passed as an
    > argument to a call to firmware. If you were loading a vmlinux with
    > yaboot when you saw the corruption occur then that narrows things down
    > a bit.
    >


    It's the vmlinux file I am seeing problems with.

    --
    Mel Gorman
    Part-time Phd Student Linux Technology Center
    University of Limerick IBM Dublin Software Lab

  10. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, 2008-10-31 at 22:18 +1100, Paul Mackerras wrote:
    > Mel Gorman writes:
    >
    > > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > > know for sure if it affects netboot but as I think it's something the
    > > kernel is doing, it probably doesn't matter how it gets loaded?

    >
    > I do need to know whether it was the vmlinux or the zImage.pseries
    > that you were loading with yaboot. That commit you identified affects
    > the contents of an ELF note in the zImage.pseries that firmware looks
    > at, as well as a structure in the kernel itself that gets passed as an
    > argument to a call to firmware. If you were loading a vmlinux with
    > yaboot when you saw the corruption occur then that narrows things down
    > a bit.


    Unless I missed something, I think it's narrowed already. When loaded from
    yaboot, there is no relevant difference between zImage and vmlinux here.
    I.e. yaboot parses the ELF header of the zImage itself and ignores the
    special notes anyway, so only the CAS firmware call is relevant in both
    cases, no?

    Cheers,
    Ben.
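
The "special notes" mentioned here are PT_NOTE program header entries in the zImage. As an illustration only, the sketch below counts such entries in a big-endian 32-bit ELF image held in memory, using the same header checks as read_rpanote() in the patch earlier in the thread. It says nothing about how yaboot itself is implemented. The function and constant names are hypothetical; the field offsets are the standard ELF32 ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF32 header offsets and constants, per the ELF specification. */
enum {
    EI_CLASS_OFF = 4, EI_DATA_OFF = 5,
    ELFCLASS32_V = 1, ELFDATA2MSB_V = 2,
    E_PHOFF_OFF = 28, E_PHENTSIZE_OFF = 42, E_PHNUM_OFF = 44,
    ELF32_EHDR_SIZE = 52, ELF32_PHDR_SIZE = 32,
    PT_NOTE_V = 4
};

static uint32_t be16(const unsigned char *b, size_t off)
{
    return ((uint32_t)b[off] << 8) | b[off + 1];
}

static uint32_t be32(const unsigned char *b, size_t off)
{
    return (be16(b, off) << 16) | be16(b, off + 2);
}

/*
 * Count PT_NOTE program headers in the big-endian ELF32 image img[0..len).
 * Returns -1 if the buffer is not such an image or if the program header
 * table does not fit inside the buffer.
 */
int count_pt_note(const unsigned char *img, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    uint32_t ph, ps, np, i;
    int count = 0;

    assert(img != NULL);
    if (len < ELF32_EHDR_SIZE || memcmp(img, magic, 4) != 0 ||
        img[EI_CLASS_OFF] != ELFCLASS32_V ||
        img[EI_DATA_OFF] != ELFDATA2MSB_V)
        return -1;

    ph = be32(img, E_PHOFF_OFF);
    ps = be16(img, E_PHENTSIZE_OFF);
    np = be16(img, E_PHNUM_OFF);
    if (ph < ELF32_EHDR_SIZE || ps < ELF32_PHDR_SIZE ||
        ph > len || (uint64_t)np * ps > len - ph)
        return -1;

    for (i = 0; i < np; i++, ph += ps)
        if (be32(img, ph) == PT_NOTE_V)   /* p_type is the first field */
            count++;
    return count;
}
```

Unlike the loop in addnote.c, this version also checks that the whole program header table lies inside the buffer before reading it.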



  11. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, Oct 31, 2008 at 10:10:55PM +1100, Paul Mackerras wrote:
    > Mel Gorman writes:
    >
    > > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > > know for sure if it affects netboot but as I think it's something the
    > > kernel is doing, it probably doesn't matter how it gets loaded?

    >
    > What changed in that commit was the contents of a couple of structures
    > that the firmware looks at to see what the kernel wants from
    > firmware. Specifically the change was to say that the kernel (or
    > really the zImage wrapper) would like the firmware to be based at the
    > 32MB point (which is what AIX uses) rather than 12MB (which was the
    > default on older machines).
    >
    > So, as I understand it, it's not anything the kernel is actively
    > doing, it's how the firmware is reacting to what the kernel says it
    > wants. And since we are requesting the same value as AIX (as far as I
    > know) I'm really surprised it caused problems.
    >


    Same here, it sounds like an innocent change. While it is possible that AIX
    could not work on this machine, it seems a bit unlikely.

    > We can revert that commit, but I still need to solve the problem that
    > the distros are facing, namely that their installer kernel + initramfs
    > images are now bigger than 12MB and can't be loaded if the firmware is
    > based at 12MB. That's why I really want to understand the problem in
    > more detail.
    >
    > > It's been pointed out that it can be "fixed" by upgrading the firmware but
    > > surely we can avoid breaking the machine in the first place?

    >
    > Have you upgraded the firmware on the machine you saw this problem on?


    No. Luckily for us, it was scheduled to be upgraded but it got delayed.
    I've asked the guy to go somewhere else for a while so I should be able
    to keep the machine in the state it's currently in.

    > If not, would you be willing to run some tests for me?
    >


    Of course.

    --
    Mel Gorman
    Part-time Phd Student Linux Technology Center
    University of Limerick IBM Dublin Software Lab

  12. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    Benjamin Herrenschmidt writes:

    > Unless I missed something, I think it's narrowed already. When loaded from
    > yaboot, there is no relevant difference between zImage and vmlinux here.
    > I.e. yaboot parses the ELF header of the zImage itself and ignores the
    > special notes anyway, so only the CAS firmware call is relevant in both
    > cases, no?


    Good point. However, it would be the process-elf-header firmware call,
    rather than the CAS firmware call, since 91a00302 modified the
    fake_elf structure (to make it consistent with the CAS structure) but
    not the CAS structure.

    Paul.

  13. Re: 2.6.28-rc1: NVRAM being corrupted on ppc64 preventing boot (bisected)

    On Fri, Oct 31, 2008 at 10:10:55PM +1100, Paul Mackerras wrote:
    > Mel Gorman writes:
    >
    > > Yaboot in my case and I've heard it affected a DVD installation. I don't
    > > know for sure if it affects netboot but as I think it's something the
    > > kernel is doing, it probably doesn't matter how it gets loaded?

    >
    > What changed in that commit was the contents of a couple of structures
    > that the firmware looks at to see what the kernel wants from
    > firmware. Specifically the change was to say that the kernel (or
    > really the zImage wrapper) would like the firmware to be based at the
    > 32MB point (which is what AIX uses) rather than 12MB (which was the
    > default on older machines).
    >
    > So, as I understand it, it's not anything the kernel is actively
    > doing, it's how the firmware is reacting to what the kernel says it
    > wants. And since we are requesting the same value as AIX (as far as I
    > know) I'm really surprised it caused problems.
    >
    > We can revert that commit, but I still need to solve the problem that
    > the distros are facing, namely that their installer kernel + initramfs
    > images are now bigger than 12MB and can't be loaded if the firmware is
    > based at 12MB. That's why I really want to understand the problem in
    > more detail.
    >
    > > It's been pointed out that it can be "fixed" by upgrading the firmware but
    > > surely we can avoid breaking the machine in the first place?

    >
    > Have you upgraded the firmware on the machine you saw this problem on?
    > If not, would you be willing to run some tests for me?
    >


    As per an off-line suggestion, I was able to get past the NVRAM problem
    using the following patch. The machine still fails to fully boot but it's
    due to some modules problem and unrelated to this issue.

    From 7e54016ce29eb80026d7ff9a8310cf9c3a7e17a9 Mon Sep 17 00:00:00 2001
    From: Mel Gorman
    Date: Fri, 31 Oct 2008 17:12:46 +0000
    Subject: [PATCH] Partial revert of 91a00302, set new_mem_def back to 0

    On the suggestion of Paul Mackerras, I tried the following patch. It partially
    reverts a change made by commit 91a00302 by setting new_mem_def back to 0.
    Once applied, IBM pSeries with old firmware do not corrupt their NVRAM early
    in boot.

    I do not know why this change fixes the problem. A structure like this is
    also in arch/powerpc/boot/addnote.c but it's not clear if it needs to be
    similarly changed or not. Paul?

    Signed-off-by: Mel Gorman
    ---
    arch/powerpc/kernel/prom_init.c | 2 +-
    1 file changed, 1 insertion(+), 1 deletion(-)

    diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
    index 23e0db2..d6c8128 100644
    --- a/arch/powerpc/kernel/prom_init.c
    +++ b/arch/powerpc/kernel/prom_init.c
    @@ -719,7 +719,7 @@ static struct fake_elf {
    .max_pft_size = 46, /* 2^46 bytes max PFT size */
    .splpar = 1,
    .min_load = ~0U,
    - .new_mem_def = 1
    + .new_mem_def = 0
    }
    }
    };
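
To see at a glance what the partial revert leaves in place, the sketch below records the rpadesc values from prom_init.c in three states (before 91a00302, after it, and with new_mem_def set back to 0) and counts the fields that differ. The values and field names are copied from the two patches in this thread. The struct here is a simplified stand-in that omits the ignore_me member, whose initialiser is not shown, and rpadesc_diff_count is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the rpadesc member of struct fake_elf. */
struct rpadesc {
    uint32_t lpar_affinity;
    uint32_t min_rmo_size;      /* in megabytes */
    uint32_t min_rmo_percent;
    uint32_t max_pft_size;      /* log2 of the max PFT size in bytes */
    uint32_t splpar;
    uint32_t min_load;
    uint32_t new_mem_def;
};

/* Before 91a00302, after it, and with the partial revert applied. */
const struct rpadesc rpa_before  = { 0,  64, 0, 48, 1, ~0U, 0 };
const struct rpadesc rpa_after   = { 1, 128, 0, 46, 1, ~0U, 1 };
const struct rpadesc rpa_partial = { 1, 128, 0, 46, 1, ~0U, 0 };

/* Count the fields whose values differ between two descriptors. */
int rpadesc_diff_count(const struct rpadesc *a, const struct rpadesc *b)
{
    assert(a != NULL && b != NULL);
    return (a->lpar_affinity   != b->lpar_affinity) +
           (a->min_rmo_size    != b->min_rmo_size) +
           (a->min_rmo_percent != b->min_rmo_percent) +
           (a->max_pft_size    != b->max_pft_size) +
           (a->splpar          != b->splpar) +
           (a->min_load        != b->min_load) +
           (a->new_mem_def     != b->new_mem_def);
}
```

Comparing rpa_before with rpa_partial shows that three of the four changes made by 91a00302 (lpar_affinity, min_rmo_size and max_pft_size) remain in effect with this patch applied.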

