MIPS IntLock - VxWorks


  1. MIPS IntLock

    I see that Wind River fixed an intLock() bug in the MIPS architecture
    in VxWorks 6.3. We seem to have a very similar problem in VxWorks
    5.5.1. I've sent off a TSR, but was wondering if anyone has details
    on the actual fix for this issue. We have the source for the
    intAlib...etc... Any help would be appreciated.

    Rick


  2. Re: MIPS IntLock

    On Sep 5, 3:07 pm, webber1998 wrote:

    > I see that Wind River fixed an intLock() bug in the MIPS architecture
    > in VxWorks 6.3. We seem to have a very similar problem in VxWorks
    > 5.5.1. I've sent off a TSR, but was wondering if anyone has details
    > on the actual fix for this issue. We have the source for the
    > intAlib...etc... Any help would be appreciated.
    >
    > Rick


    The code changed a lot between 5.5.1 and 6.x, so it's hard to say
    whether the bug was present there. The actual issue has to do with the
    way interrupts are masked and unmasked on the MIPS arch. Doing it
    requires a read-modify-write operation to set or clear the IE bit in
    the SR register in the MIPS core. One problem is that this sequence
    isn't atomic; another is that the instruction that writes to the
    register may take several machine cycles to complete. The intLock()
    routine therefore has to execute several NOP instructions after
    writing the change to the CPU, before it returns to the caller, to be
    sure the change has taken effect. These NOPs are known as hazard
    cycles.
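
    To make the delayed write concrete, here is a small host-side model in
    C. It is only a sketch, not the VxWorks code: the sim_* names and the
    three-cycle latency are made up for illustration, and the only
    hardware fact it relies on is that IE is bit 0 of the MIPS Status
    register. The key returned by sim_int_lock() is simply the old SR
    value, which is an assumption of this model.

```c
#include <stdint.h>
#include <stdbool.h>

#define SR_IE          0x1u  /* interrupt-enable bit: bit 0 of the MIPS Status register */
#define HAZARD_CYCLES  3     /* assumed write latency for this toy model only */

/* Toy core whose SR write takes effect a few cycles after it is issued. */
typedef struct {
    uint32_t sr;          /* value the interrupt logic currently sees */
    uint32_t pending_sr;  /* value written but not yet in effect */
    int      countdown;   /* cycles until pending_sr lands; 0 means nothing pending */
} sim_cpu;

/* Advance one machine cycle (stands in for one NOP). */
void sim_tick(sim_cpu *cpu)
{
    if (cpu->countdown > 0 && --cpu->countdown == 0)
        cpu->sr = cpu->pending_sr;
}

/* Issue an SR write; it only becomes visible after HAZARD_CYCLES ticks. */
void sim_write_sr(sim_cpu *cpu, uint32_t value)
{
    cpu->pending_sr = value;
    cpu->countdown  = HAZARD_CYCLES;
}

bool sim_interrupts_enabled(const sim_cpu *cpu)
{
    return (cpu->sr & SR_IE) != 0;
}

/* intLock()-style sequence: read, modify, write, then wait out the hazard. */
uint32_t sim_int_lock(sim_cpu *cpu)
{
    uint32_t old = cpu->sr;               /* read */
    sim_write_sr(cpu, old & ~SR_IE);      /* modify + write */
    for (int i = 0; i < HAZARD_CYCLES; i++)
        sim_tick(cpu);                    /* hazard cycles (the NOPs) */
    return old;                           /* previous SR, used as the key in this model */
}
```

    In this model, calling sim_write_sr() and then checking
    sim_interrupts_enabled() before all three ticks have run still reports
    interrupts as enabled, which is exactly why the NOPs are there.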

    The problem is that there's a small window between the time the
    register write is performed and the last hazard cycle executes during
    which interrupts may still fire. To deal with this, the interrupt
    handler code is supposed to first execute a few hazard cycles itself
    (to give any updates a chance to complete), save the current state of
    the IE bit, and then restore it before exiting.

    The bug in 6.3 was that the code had been incorrectly modified such
    that it always set the IE bit during intExit instead of preserving its
    original value, based on the flawed assumption that the interrupt
    handler couldn't possibly have been entered if interrupts weren't on
    in the first place. The effect of this change is that if someone
    called intLock(), and an interrupt occurred during the hazard window,
    intLock() would return as though it had succeeded even though
    interrupts were still enabled. This in turn could lead to all kinds of
    race conditions in other critical sections of code, particularly in
    cases where the device is fielding many interrupts. (I tripped over it
    while stress testing an Ethernet driver, which caused a very high
    interrupt load on the CPU and provoked the race condition very
    frequently.)

    The fix was to remove the code that unconditionally set the IE bit in
    the SR register during intExit, since it wasn't really needed anyway.
    (I believe the patch actually includes fixes for some other issues
    too.)
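
    The difference between the two exit behaviours can be sketched with a
    host-side simulation in C. This is an illustration under assumptions,
    not the real intExit code: the sim_* names, the exit_policy enum and
    the three-cycle latency are all hypothetical, and the handler's own SR
    update is applied immediately to keep the model short.

```c
#include <stdint.h>
#include <stdbool.h>

#define SR_IE          0x1u  /* interrupt-enable bit: bit 0 of the MIPS Status register */
#define HAZARD_CYCLES  3     /* assumed write latency for this toy model only */

typedef struct {
    uint32_t sr;          /* value the interrupt logic currently sees */
    uint32_t pending_sr;  /* value written but not yet in effect */
    int      countdown;   /* cycles until pending_sr lands; 0 means nothing pending */
} sim_cpu;

typedef enum { EXIT_PRESERVE_IE, EXIT_FORCE_IE } exit_policy;

/* Advance one machine cycle (stands in for one NOP). */
void sim_tick(sim_cpu *cpu)
{
    if (cpu->countdown > 0 && --cpu->countdown == 0)
        cpu->sr = cpu->pending_sr;
}

/* Issue an SR write; it only becomes visible after HAZARD_CYCLES ticks. */
void sim_write_sr(sim_cpu *cpu, uint32_t value)
{
    cpu->pending_sr = value;
    cpu->countdown  = HAZARD_CYCLES;
}

/* Simulated interrupt entry and exit around an (empty) handler body. */
void sim_interrupt(sim_cpu *cpu, exit_policy policy)
{
    /* Entry: burn hazard cycles so any in-flight SR write completes. */
    for (int i = 0; i < HAZARD_CYCLES; i++)
        sim_tick(cpu);
    uint32_t saved_ie = cpu->sr & SR_IE;   /* IE as it really is now */

    /* ... the handler body would run here ... */

    /* Exit: applied immediately, a simplification of this model. */
    if (policy == EXIT_FORCE_IE)
        cpu->sr |= SR_IE;                          /* buggy: assumes IE was set */
    else
        cpu->sr = (cpu->sr & ~SR_IE) | saved_ie;   /* restore what was saved */
}

/* intLock()-style sequence with an optional interrupt in the hazard window.
 * Returns true when interrupts are really masked on return. */
bool sim_int_lock_masked(sim_cpu *cpu, bool irq_in_window, exit_policy policy)
{
    sim_write_sr(cpu, cpu->sr & ~SR_IE);
    sim_tick(cpu);                                 /* first hazard cycle */
    if (irq_in_window && (cpu->sr & SR_IE))
        sim_interrupt(cpu, policy);                /* write not in effect yet */
    for (int i = 1; i < HAZARD_CYCLES; i++)
        sim_tick(cpu);                             /* remaining hazard cycles */
    return (cpu->sr & SR_IE) == 0;
}
```

    With EXIT_PRESERVE_IE the lock holds even when the interrupt lands in
    the window; with EXIT_FORCE_IE the same sequence returns with IE set,
    which is the failure described above.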

    If you're a VxWorks source customer, one way to check whether you're
    being bitten by this problem is to add some code to the end of
    intLock(), immediately after the hazard cycles, that reads the SR
    register, tests that the IE bit has in fact been cleared, and emits a
    warning if it hasn't. If you ever reach the "j ra" instruction in
    intLock() and interrupts are not in fact masked off, then you're
    hosed.
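
    The logic of that check might look roughly like the C below. Treat it
    as a sketch: the real intLock() is assembly, so the test would be
    written there, and int_lock_check(), int_lock_failures and fake_sr are
    names invented for this example. The mfc0 read of CP0 register 12
    (Status) assumes GCC inline assembly on a MIPS32 target running in
    kernel mode; on any other host the code falls back to a fake register
    so it can be exercised off-target.

```c
#include <stdint.h>

#define SR_IE 0x1u  /* interrupt-enable bit: bit 0 of the MIPS Status register */

#if defined(__mips__)
/* Read CP0 register 12 (Status) on a real MIPS target. */
static inline uint32_t read_sr(void)
{
    uint32_t sr;
    __asm__ volatile ("mfc0 %0, $12" : "=r" (sr));
    return sr;
}
#else
/* Host fallback: a fake register, so the check can be tried off-target. */
static uint32_t fake_sr;
static inline uint32_t read_sr(void) { return fake_sr; }
#endif

/* Number of times the lock was found not to have taken effect. */
static unsigned long int_lock_failures;

/* Call right after the hazard cycles, just before returning to the caller.
 * Returns 1 if interrupts are still enabled, 0 if they are masked. */
int int_lock_check(void)
{
    if (read_sr() & SR_IE) {
        int_lock_failures++;   /* record it; print or log from a safe context later */
        return 1;
    }
    return 0;
}
```

    Counting failures instead of printing keeps the check cheap enough to
    leave in during a stress test; the counter can be inspected from the
    shell afterwards.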

    -Bill

