Interpreting SRR1 and OOPS - Powerpc

  1. Interpreting SRR1 and OOPS

    I am getting the OOPS message that follows and have been having a very
    difficult time determining what is causing it. According to "PowerPC
    Microprocessor Family: The Programming Environments for 32-Bit
    Microprocessors", "When an exception occurs, bits 1-4 and 10-15 of SRR1
    are loaded with exception specific information."

    SRR1 is 00089032, so bits 1-4 are 0000 and bits 10-15 are 001000.
    Unfortunately, I cannot find anywhere what the "exception specific
    information" contained in these bits is.
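
    For reference, the bit ranges above use PowerPC numbering, where bit 0
    is the most significant bit of the 32-bit register. A minimal sketch of
    that extraction follows; ppc_bits() is a hypothetical helper written for
    this illustration, not a kernel function.

    ```c
    #include <stdint.h>

    /*
     * Extract bits first..last (inclusive) of a 32-bit register using
     * PowerPC bit numbering: bit 0 is the MSB, bit 31 is the LSB.
     * Hypothetical helper for illustration only.
     */
    static uint32_t ppc_bits(uint32_t value, unsigned first, unsigned last)
    {
        unsigned width = last - first + 1;
        uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

        /* Shift the last (least significant) requested bit down to bit 31. */
        return (value >> (31 - last)) & mask;
    }
    ```

    With SRR1 = 0x00089032, ppc_bits(srr1, 1, 4) gives 0 and
    ppc_bits(srr1, 10, 15) gives 0x08 (binary 001000), so the only
    exception-specific bit that is set is bit 12.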

    Any information on this exception or interpreting an OOPS message in
    general on PPC would be greatly appreciated.



    Eclipse # Machine check in kernel mode.
    Caused by SRR0=0xC0005D28
    Caused by (from SRR1=89032): Machine check signal
    Oops: machine check, sig: 7
    NIP: C3095218 XER: 00000000 LR: C30951BC SP: C015E240 REGS: c015e190
    TRAP: 0200 Not tainted
    MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
    TASK = c015c470[0] 'swapper' Last syscall: 120
    last math c1db4000 last altivec 00000000
    GPR00: 00000000 C015E240 C015C470 C32E6EB8 00001032 000000C6 0000008C 00000000
    GPR08: C3110000 C36EF000 C310FA94 C0269600 00000175 1010E944 01FFD000 00000001
    GPR16: FFFFFFFF 00000000 00000000 01FF7A0C 00001032 00000002 00000002 C3110000
    GPR24: 00000001 C01B0000 C0140000 C0140000 00000002 00000002 00000000 00010000
    Call backtrace:
    C30951BC C30A81BC C001D25C C001D008 C0006D0C C0005B20 C00071D0
    C00071EC C0003948 C01705D8 000035F0
    Kernel panic: Aiee, killing interrupt handler!
    In interrupt handler - not syncing


  2. Re: Interpreting SRR1 and OOPS

    Please specify which PowerPC processor is involved.
    For instance, if it is an MPC603e (or G2) core, then SRR1 bit 12 indicates
    "Machine check signal caused exception" for vector 0x200, which is the
    exception in your case.

    David Gabbay
    DoGav Systems


  3. Re: Interpreting SRR1 and OOPS

    On Mon, 23 Oct 2006 10:31:05 -0700, Bill wrote:

    > I am getting the OOPS message that follows and have been having a very
    > difficult time determining what is causing it. According to "PowerPC
    > Microprocessor Family: The Programming Environments for 32-Bit
    > Microprocessors", "When an exception occurs, bits 1-4 and 10-15 of SRR1
    > are loaded with exception specific information."
    >
    > SRR1 is 00089032, so bits 1-4 are 0000 and bits 10-15 are 001000.
    > Unfortunately, I cannot find anywhere what the "exception specific
    > information" contained in these bits is.


    See the chapter on exception processing, chapter 6.

    >
    > Any information on this exception or interpreting an OOPS message in
    > general on PPC would be greatly appreciated.
    >


    Machine check exception is described in 6.4.2 in my copy:


    SRR1 Bit 30 is loaded from MSR[RI] if the processor is in a recoverable
    state. Otherwise cleared. The setting of all other SRR1 bits is
    implementation-dependent.


    So you may need to look at the user manual of your CPU.


    Rob


  4. Re: Interpreting SRR1 and OOPS

    MPC8248.


  5. Re: Interpreting SRR1 and OOPS

    I looked at section 6.4.2 but did not find it very helpful. My
    register settings do not match those listed. I have:

    POW 0 FP 0 BE 0 DR 1
    ILE 0 ME 1 FE1 0 RI 1
    EE 1 FE0 0 IP 0 LE 0
    PR 0 SE 0 IR 1
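
    The field values listed above can be reproduced from MSR = 0x00089032
    with a few masks. This is only a sketch: the EX_MSR_* names and
    msr_flag() are hypothetical, and the bit positions are those of the
    32-bit PowerPC MSR layout (bit 0 is the MSB).

    ```c
    #include <stdint.h>

    /* Bit n in PowerPC numbering corresponds to the mask 1 << (31 - n). */
    #define EX_MSR_BIT(n) (1u << (31 - (n)))

    /* Hypothetical names for the 32-bit MSR fields listed in the post. */
    #define EX_MSR_POW EX_MSR_BIT(13)
    #define EX_MSR_ILE EX_MSR_BIT(15)
    #define EX_MSR_EE  EX_MSR_BIT(16)
    #define EX_MSR_PR  EX_MSR_BIT(17)
    #define EX_MSR_FP  EX_MSR_BIT(18)
    #define EX_MSR_ME  EX_MSR_BIT(19)
    #define EX_MSR_FE0 EX_MSR_BIT(20)
    #define EX_MSR_SE  EX_MSR_BIT(21)
    #define EX_MSR_BE  EX_MSR_BIT(22)
    #define EX_MSR_FE1 EX_MSR_BIT(23)
    #define EX_MSR_IP  EX_MSR_BIT(25)
    #define EX_MSR_IR  EX_MSR_BIT(26)
    #define EX_MSR_DR  EX_MSR_BIT(27)
    #define EX_MSR_RI  EX_MSR_BIT(30)
    #define EX_MSR_LE  EX_MSR_BIT(31)

    /* Return 1 if the given field is set in msr, otherwise 0. */
    static unsigned msr_flag(uint32_t msr, uint32_t mask)
    {
        return (msr & mask) != 0;
    }
    ```

    For 0x00089032 this reports EE, ME, IR, DR and RI as 1 and every other
    field listed as 0. The remaining set bit, 0x00080000, is bit 12, which
    is not one of the MSR fields in the list.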


  6. Re: Interpreting SRR1 and OOPS

    From the MPC603e user's manual (the core in your case), the SRR1 bits
    after a machine check are:
    0-11 Cleared
    12 core_mcp: Machine check signal caused exception
    Check the SIU's TESCR1 register (offset 0x10040) for the specific
    cause.
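
    A rough sketch of how this might be checked in code. It assumes the
    0x10040 offset is relative to the base of the chip's internal register
    map (IMMR), and immr_base is a board-specific value that must come from
    your own configuration. The macro and function names are hypothetical.

    ```c
    #include <stdint.h>

    #define EX_SRR1_MCP_SIGNAL (1u << (31 - 12)) /* SRR1 bit 12, MSB is bit 0 */
    #define EX_TESCR1_OFFSET   0x10040u          /* offset quoted in this post */

    /* Return 1 if SRR1 reports that the machine check signal caused the exception. */
    static int srr1_is_mcp_signal(uint32_t srr1)
    {
        return (srr1 & EX_SRR1_MCP_SIGNAL) != 0;
    }

    /*
     * Physical address of TESCR1, ASSUMING the offset is relative to the
     * internal memory map base. immr_base is board specific.
     */
    static uint32_t tescr1_phys_addr(uint32_t immr_base)
    {
        return immr_base + EX_TESCR1_OFFSET;
    }
    ```

    In a kernel the resulting address would still have to be mapped before
    it can be read; that part is left out here because it depends on the
    kernel version and board support code.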

    David Gabbay
    DoGav Systems


  7. Re: Interpreting SRR1 and OOPS

    Should I add printing the value of this register to the OOPS message?
    Is there a better way to read that register before a crash?


    dg@dogav.net wrote:
    > From mpc603e UM (the core in your case):
    > 0-11 Cleared
    > 12 core_mcp-Machine check signal caused exception
    > Check the SIU's register TESCR1 (offset 0x10040) for the specific
    > cause.
    >
    > David Gabbay
    > DoGav Systems



  8. Re: Interpreting SRR1 and OOPS

    I would print it
    David


  9. Re: Interpreting SRR1 and OOPS

    Reading the TESCR1 revealed a PCI machine check. Then, reading the ESR
    showed that there was a PCI read data parity error, which had gone
    undetected because the parity error response bit in the PCI Bus Command
    Register was set to 0. Once this bit was set to 1, the presence of the
    parity error was confirmed.
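
    For anyone hitting the same thing: in the standard PCI command register
    the Parity Error Response bit is bit 6 (mask 0x0040, counting from the
    LSB as the PCI specification does). A minimal sketch of setting it on a
    command value that has already been read; how the register is read and
    written on a given bridge is not shown, and the names below are
    hypothetical.

    ```c
    #include <stdint.h>

    /* Parity Error Response: bit 6 of the 16-bit PCI command register. */
    #define EX_PCI_CMD_PARITY_RESPONSE 0x0040u

    /* Return the command register value with parity error response enabled. */
    static uint16_t pci_cmd_enable_parity_response(uint16_t cmd)
    {
        return (uint16_t)(cmd | EX_PCI_CMD_PARITY_RESPONSE);
    }

    /* Return 1 if parity error response is already enabled in cmd. */
    static int pci_cmd_parity_response_enabled(uint16_t cmd)
    {
        return (cmd & EX_PCI_CMD_PARITY_RESPONSE) != 0;
    }
    ```

    The new value then has to be written back to the command register for
    the change to take effect.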

    Thank you very much. Now we know what is causing the oops and can go
    about fixing it.



    dg@dogav.net wrote:
    > I would print it
    > David



  10. Re: Interpreting SRR1 and OOPS

    dg@dogav.net wrote:
    >
    > I would print it


    This is totally meaningless. Google is not usenet - it is only a
    poor imitation of an interface to the system. Read the links in my
    sig. below.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at:
    Also see


