weird, boot CPU (#0) not listed by the BIOS. - Linux

  1. weird, boot CPU (#0) not listed by the BIOS.

    Newly built machine. Tyan S2927 mainboard. A pair of dual-core AMD Opteron
    model 2220 processors. BIOS and kernel both reported the expected count of
    4 processors. Then things get weird. There are what appear to be strange
    CPU numbers involved. These strange numbers do not get a processor response
    so nothing more is activated and the system runs with just the initial CPU
    indicated as #0. Kernel is 2.6.23.12 compiled with max CPUS at 4.

    The message buffer has these interesting parts:

    [ 0.000000] Intel MultiProcessor Specification v1.1
    [ 0.000000] Virtual Wire compatibility mode.
    [ 0.000000] OEM ID: TEMPLATE Product ID: ETEMPLATE APIC at: 0xFEE00000
    [ 0.000000] Processor #67 15:1 APIC version 16
    [ 0.000000] Processor #68 15:1 APIC version 16
    [ 0.000000] Processor #69 15:1 APIC version 16
    [ 0.000000] Processor #70 15:1 APIC version 16
    [ 0.000000] Enabling APIC mode: Flat. Using 0 I/O APICs
    [ 0.000000] Processors: 4

    ....

    [ 56.272441] Freeing SMP alternatives: 16k freed
    [ 56.272519] CPU0: AMD Dual-Core AMD Opteron(tm) Processor 2220 stepping 03
    [ 56.272672] weird, boot CPU (#0) not listed by the BIOS.
    [ 56.272736] Booting processor 1/67 eip 2000
    [ 56.272796] APIC error on CPU0: 00(04)
    [ 56.282784] APIC error on CPU0: 00(04)
    [ 56.283286] APIC error on CPU0: 00(04)
    [ 61.277842] Not responding.
    [ 61.277893] Inquiring remote APIC #67...
    [ 61.277946] ... APIC #67 ID: failed
    [ 61.278129] ... APIC #67 VERSION: failed
    [ 61.278312] ... APIC #67 SPIV: failed
    [ 61.278496] CPU #67 not responding - cannot use it.

    ....

    Then the last 10 messages repeat 3 more times in the context of CPU numbers
    68, 69, and 70. Why these numbers? Corrupt APIC data? BIOS error?
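
One way to look at the numbers is to pull them out of the log and print them in a few bases. The sketch below is a minimal illustration that parses only the "Processor #N" lines quoted above; the function names are made up for this example. It shows that 67 through 70 are 0x43 through 0x46, which are the ASCII codes for "C", "D", "E", "F". That pattern, together with the "TEMPLATE" OEM ID, is at least consistent with the table holding placeholder or leftover data, but that is a guess and not a diagnosis.

```python
import re

# Excerpt copied from the boot log quoted in this post.
DMESG = """\
[ 0.000000] Processor #67 15:1 APIC version 16
[ 0.000000] Processor #68 15:1 APIC version 16
[ 0.000000] Processor #69 15:1 APIC version 16
[ 0.000000] Processor #70 15:1 APIC version 16
[ 0.000000] Processors: 4
"""

# Matches lines such as "Processor #67 15:1 APIC version 16".
PROCESSOR_LINE = re.compile(r"Processor #(\d+) (\d+):(\d+) APIC version (\d+)")


def reported_apic_ids(log_text):
    """Return the IDs from 'Processor #N ...' lines, in log order."""
    return [int(m.group(1)) for m in PROCESSOR_LINE.finditer(log_text)]


def describe(ids):
    """Return (decimal, hex, ASCII character or '.') for each ID."""
    rows = []
    for n in ids:
        ch = chr(n) if 32 <= n < 127 else "."
        rows.append((n, hex(n), ch))
    return rows


ids = reported_apic_ids(DMESG)
for row in describe(ids):
    print(row)
```

On a four-CPU board one would more usually expect small IDs near zero, so a run of printable ASCII codes is worth mentioning in a support ticket.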

    I'll look into the kernel code handling this tomorrow to see what it might
    be expecting. Maybe I could make a patch and fake it for this particular
    machine for now, assuming the correct numbers should be 0, 1, 2, and 3.
    Might that work? Or is this something to contact Tyan tech support on?

    The full dmesg buffer is at:
    http://phil.ipal.org/usenet/colds/20...ird-cpus-0.txt

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2008-01-02-0029@ipal.net |
    |------------------------------------/-------------------------------------|

  2. Re: weird, boot CPU (#0) not listed by the BIOS.

    phil-news-nospam@ipal.net burped up warm pablum in
    news:flfcvr02v22@news1.newsguy.com:

    > Newly built machine. Tyan S2927 mainboard. A pair of dual-core AMD Opteron
    > model 2220 processors. BIOS and kernel both reported the expected count of
    > 4 processors. Then things get weird. There are what appear to be strange
    > CPU numbers involved. These strange numbers do not get a processor response
    > so nothing more is activated and the system runs with just the initial CPU
    > indicated as #0. Kernel is 2.6.23.12 compiled with max CPUS at 4.


    ....

    > Then the last 10 messages repeat 3 more times in the context of CPU numbers
    > 68, 69, and 70. Why these numbers? Corrupt APIC data? BIOS error?


    Tyan boards seem to have trouble with multiple CPUs while booting. Try this google search:
    http://www.google.ca/search?num=30&h...ic+%2367+TYAN&
    btnG=Search&meta=

    --
    Tris Orendorff
    [ Anyone naming their child should spend a few minutes checking rhyming slang and dodgy
    sounding names. Brad and Angelina failed to do this when naming their kid Shiloh Pitt. At some
    point, someone at school is going to spoonerise her name.
    Craig Stark ]


  3. Re: weird, boot CPU (#0) not listed by the BIOS.

    On Wed, 02 Jan 2008 16:26:46 GMT Tris Orendorff wrote:
    | phil-news-nospam@ipal.net burped up warm pablum in
    | news:flfcvr02v22@news1.newsguy.com:
    |
    |> Newly built machine. Tyan S2927 mainboard. A pair of dual-core AMD Opteron
    |> model 2220 processors. BIOS and kernel both reported the expected count of
    |> 4 processors. Then things get weird. There are what appear to be strange
    |> CPU numbers involved. These strange numbers do not get a processor response
    |> so nothing more is activated and the system runs with just the initial CPU
    |> indicated as #0. Kernel is 2.6.23.12 compiled with max CPUS at 4.
    |
    | ...
    |
    |> Then the last 10 messages repeat 3 more times in the context of CPU numbers
    |> 68, 69, and 70. Why these numbers? Corrupt APIC data? BIOS error?
    |
    | Tyan boards seem to have trouble with multiple CPUs while booting. Try this google search:
    | http://www.google.ca/search?num=30&h...ic+%2367+TYAN&
    | btnG=Search&meta=

    Nice. My post comes up first.

    So basically, does this mean Tyan and Linux are incompatible? I saw a few
    complaints about various related issues in that search, but no answers. I
    did post a ticket with Tyan support.

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2008-01-02-1831@ipal.net |
    |------------------------------------/-------------------------------------|

  4. Re: weird, boot CPU (#0) not listed by the BIOS.

    On Wed, 02 Jan 2008 16:26:46 GMT Tris Orendorff wrote:
    | phil-news-nospam@ipal.net burped up warm pablum in
    | news:flfcvr02v22@news1.newsguy.com:
    |
    |> Newly built machine. Tyan S2927 mainboard. A pair of dual-core AMD Opteron
    |> model 2220 processors. BIOS and kernel both reported the expected count of
    |> 4 processors. Then things get weird. There are what appear to be strange
    |> CPU numbers involved. These strange numbers do not get a processor response
    |> so nothing more is activated and the system runs with just the initial CPU
    |> indicated as #0. Kernel is 2.6.23.12 compiled with max CPUS at 4.
    |
    | ...
    |
    |> Then the last 10 messages repeat 3 more times in the context of CPU numbers
    |> 68, 69, and 70. Why these numbers? Corrupt APIC data? BIOS error?
    |
    | Tyan boards seem to have trouble with multiple CPUs while booting. Try this google search:
    | http://www.google.ca/search?num=30&h...ic+%2367+TYAN&
    | btnG=Search&meta=

    Well, that search was not helpful.

    However, I did, at someone's suggestion, try booting some other distros.
    Fedora 8 hangs during kernel probes. Ubuntu 6.06 does come up if I use
    the "noapic" option, and it recognizes all 4 CPUs correctly! But if I
    use the "noapic" option with my 2.6.23.12 kernel, it still has the same
    problem where it gets the wrong CPU numbers. Ubuntu 6.06 has kernel 2.6.15.

    So it seems one of two things might be the issue:

    1. Ubuntu built their kernel with the magic "don't get the CPU numbers
    wrong" option (whatever that might be).

    2. Somewhere between 2.6.15 and 2.6.23.12 the kernel broke the ability
    to see the correct CPU numbers.

    Any idea which it is? Whatever it is, it might be something Fedora did
    not do, or did wrong.
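
One way to test the first possibility is to compare the two kernels' build configurations option by option. The sketch below is a minimal illustration: the helper names are made up, and the two config fragments are invented sample values, not the real Ubuntu or 2.6.23.12 configurations from this thread.

```python
def parse_kernel_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(a, b):
    """Return {name: (value_in_a, value_in_b)} for every option that differs."""
    names = sorted(set(a) | set(b))
    return {
        name: (a.get(name, "absent"), b.get(name, "absent"))
        for name in names
        if a.get(name, "absent") != b.get(name, "absent")
    }


# Invented fragments for illustration only (hypothetical values).
WORKING = "CONFIG_SMP=y\nCONFIG_NR_CPUS=8\nCONFIG_ACPI=y\n"
FAILING = "CONFIG_SMP=y\nCONFIG_NR_CPUS=4\n# CONFIG_ACPI is not set\n"

differences = diff_configs(parse_kernel_config(WORKING), parse_kernel_config(FAILING))
for name, (working, failing) in differences.items():
    print(f"{name}: working={working} failing={failing}")
```

Distro kernels commonly ship their configuration as a file under /boot, so the working side of the comparison can often be read from the live CD or installed system, with the failing side taken from the .config in the source tree.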

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------

  5. Solved - Was: weird, boot CPU (#0) not listed by the BIOS.

    On 2 Jan 2008 07:03:23 GMT phil-news-nospam@ipal.net wrote:

    | Newly built machine. Tyan S2927 mainboard. A pair of dual-core AMD Opteron
    | model 2220 processors. BIOS and kernel both reported the expected count of
    | 4 processors. Then things get weird. There are what appear to be strange
    | CPU numbers involved. These strange numbers do not get a processor response
    | so nothing more is activated and the system runs with just the initial CPU
    | indicated as #0. Kernel is 2.6.23.12 compiled with max CPUS at 4.
    |
    | The message buffer has these interesting parts:
    |
    | [ 0.000000] Intel MultiProcessor Specification v1.1
    | [ 0.000000] Virtual Wire compatibility mode.
    | [ 0.000000] OEM ID: TEMPLATE Product ID: ETEMPLATE APIC at: 0xFEE00000
    | [ 0.000000] Processor #67 15:1 APIC version 16
    | [ 0.000000] Processor #68 15:1 APIC version 16
    | [ 0.000000] Processor #69 15:1 APIC version 16
    | [ 0.000000] Processor #70 15:1 APIC version 16
    | [ 0.000000] Enabling APIC mode: Flat. Using 0 I/O APICs
    | [ 0.000000] Processors: 4
    |
    | ...
    |
    | [ 56.272441] Freeing SMP alternatives: 16k freed
    | [ 56.272519] CPU0: AMD Dual-Core AMD Opteron(tm) Processor 2220 stepping 03
    | [ 56.272672] weird, boot CPU (#0) not listed by the BIOS.
    | [ 56.272736] Booting processor 1/67 eip 2000
    | [ 56.272796] APIC error on CPU0: 00(04)
    | [ 56.282784] APIC error on CPU0: 00(04)
    | [ 56.283286] APIC error on CPU0: 00(04)
    | [ 61.277842] Not responding.
    | [ 61.277893] Inquiring remote APIC #67...
    | [ 61.277946] ... APIC #67 ID: failed
    | [ 61.278129] ... APIC #67 VERSION: failed
    | [ 61.278312] ... APIC #67 SPIV: failed
    | [ 61.278496] CPU #67 not responding - cannot use it.
    |
    | ...
    |
    | Then the last 10 messages repeat 3 more times in the context of CPU numbers
    | 68, 69, and 70. Why these numbers? Corrupt APIC data? BIOS error?
    |
    | I'll look into the kernel code handling this tomorrow to see what it might
    | be expecting. Maybe I could make a patch and fake it for this particular
    | machine for now, assuming the correct numbers should be 0, 1, 2, and 3.
    | Might that work? Or is this something to contact Tyan tech support on?
    |
    | The full dmesg buffer is at:
    | http://phil.ipal.org/usenet/colds/20...ird-cpus-0.txt

    It turns out the problem is that SMP, at least on an ACPI machine, requires
    that ACPI be enabled under power management in the source tree configuration.
    Having seen a few computers that don't play well with Linux where ACPI is
    involved, I usually leave ACPI disabled. IMHO, ACPI is one of the many ways
    to totally bastardize the "PC architecture". It is overly complicated for
    what it provides, and is too unstable. Nevertheless, it actually works in
    the case of the Tyan S2927 mainboard. So my guess is the SMP code detected
    the machine was ACPI, and attempted to look at the ACPI information, which
    had probably not been gathered since ACPI was not enabled, and merely picked
    up garbage left over from other uses, or used an invalid pointer that happened
    not to crash things.

    If it makes sense on some (non-ACPI) machines to have SMP enabled (SMP did
    exist before ACPI, so this must be a yes), then perhaps the SMP code itself
    needs to be made to correctly detect whether ACPI is truly enabled in the
    kernel (don't assume so just because the hardware/BIOS has it), not attempt
    to use ACPI if it is not, and revert to using older methods to detect CPUs
    (like guessing the CPU numbers sequentially, stopping when one doesn't work,
    and not even trying for pluggable CPU sets).
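
The sequential-guess fallback suggested above can be sketched as a simulation. This is not kernel code: the `responds` callback stands in for whatever would actually wake a processor, and the sets of present IDs are hypothetical.

```python
from typing import Callable, List


def probe_cpus_sequentially(responds: Callable[[int], bool], max_cpus: int) -> List[int]:
    """Try IDs 0, 1, 2, ... and stop at the first one that does not respond."""
    found = []
    for cpu_id in range(max_cpus):
        if not responds(cpu_id):
            break  # first silent ID ends the scan
        found.append(cpu_id)
    return found


# Simulated hardware where IDs 0..3 answer (hypothetical).
contiguous = {0, 1, 2, 3}
print(probe_cpus_sequentially(lambda i: i in contiguous, max_cpus=4))

# Simulated hardware with a gap in the numbering (hypothetical).
gapped = {0, 1, 4, 5}
print(probe_cpus_sequentially(lambda i: i in gapped, max_cpus=8))
```

The second case shows the cost of the approach: processor IDs are not guaranteed to be contiguous, so stopping at the first failure can miss processors that sit past a gap.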

    Better documentation in the kernel would also help.

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------
