Re: questions about x86: mtrr cleanup for converting continuous to discrete layout - Kernel




  1. Re: questions about x86: mtrr cleanup for converting continuous to discrete layout

    I think a workaround in the kernel is absolutely necessary. A lot of
    newer motherboards have this issue, where a whole section of memory
    will be marked as write-back, and write-combining can't be
    embedded/nested.
    As far as I'm aware, changing MTRRs won't make a system unstable,
    especially if done so early on, when the kernel is starting up. All
    it does is change the behavior on how the CPU will cache write
    requests to memory. All system memory should be marked as write-back,
    how many MTRRs are used to do this...I'm not sure if it exactly
    matters. You can set MTRR_SPARE_REG_NR and control how many MTRR
    slots the code will use.

    Is it legal to mark a write-combining range within a write-back range?

    Ideally, maybe adding a minimal amount of MTRRs might be best, as D.
    Hugh Redelmeier's userspace app does, but I think a fix for this in
    the _kernel_ is an absolute must for 2.6.27. Whether a range that has
    to be marked for write-combining is just "uncovered", or whether
    ranges are entirely automatically generated in chunks, either should
    work, but Hugh's suggestion might save MTRR entries in practice?

    I'm no kernel dev, I code a bit here and there, but I spent a LOT of
    time researching this when I ran into the problem myself on my new PC.
    There are a lot of posts about it too in the Intel bug tracker for
    people with newer boards and the G45 chipset. Most users shouldn't
    have to worry about this, and it should "just work".

    I don't think this should be pulled unless a different fix is in place
    in the kernel.

    Thanks!


    Here's what the BIOS does with my MTRRs; write-combining can't be set up
    for my video card:
    reg00: base=0x1b0000000 (6912MB), size= 256MB: uncachable, count=1
    reg01: base=0x1c0000000 (7168MB), size=1024MB: uncachable, count=1
    reg02: base=0x00000000 ( 0MB), size=8192MB: write-back, count=1
    reg03: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
    reg04: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
    reg05: base=0xc7e00000 (3198MB), size= 2MB: uncachable, count=1
    reg06: base=0xc8000000 (3200MB), size= 128MB: uncachable, count=1

    and with Yinghai Lu's patches in git tip, with working write-combining mark
    reg00: base=0x00000000 ( 0MB), size=2048MB: write-back, count=1
    reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
    reg02: base=0xc0000000 (3072MB), size= 128MB: write-back, count=1
    reg03: base=0xc7e00000 (3198MB), size= 2MB: uncachable, count=1
    reg04: base=0x100000000 (4096MB), size=2048MB: write-back, count=1
    reg05: base=0x180000000 (6144MB), size= 512MB: write-back, count=1
    reg06: base=0x1a0000000 (6656MB), size= 256MB: write-back, count=1
    reg07: base=0xd0000000 (3328MB), size= 256MB: write-combining, count=1
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: questions about x86: mtrr cleanup for converting continuous to discrete layout

    | From: Dylan Taft

    Thanks for your reply.

    | To: linux-kernel@vger.kernel.org, yinghai@kernel.org, hugh@mimosa.com
    | Subject: Re: questions about x86: mtrr cleanup for converting continuous to
    | discrete layout
    |
    | I think a workaround in the kernel is absolutely necessary. A lot of
    | newer motherboards have this issue,

    I agree.

    | where a whole section of memory
    | will be marked as write-back, and write-combining can't be
    | embedded/nested.

    To be more clear:

    Uncachable can be nested within write-back but write-combining cannot
    be nested within write-back. These newer BIOSes, when they see 4GiB
    or more of RAM, nest an uncachable MTRR for a video buffer inside a
    larger write-back region.

    The video driver cannot simply change the type of the inner MTRR
    because write-combining cannot be nested within write-back.
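For illustration, the overlap rule behind this can be written down in a few lines of Python. This is only a sketch of the precedence rule as described in Intel's architecture manuals (identical types agree, UC wins over any other type, write-through wins over write-back, every other mix is undefined). The function name and the use of /proc/mtrr type strings are assumptions made for the example; nothing here is taken from the kernel or from mtrr-uncover.

```python
def effective_type(types):
    """Memory type for an address matched by variable MTRRs of these types.

    Returns None when the combination is architecturally undefined.
    """
    kinds = set(types)
    if not kinds:
        return "default"            # no MTRR matches: the default type applies
    if len(kinds) == 1:
        return next(iter(kinds))    # identical types: that type is used
    if "uncachable" in kinds:
        return "uncachable"         # UC overrides every other type
    if kinds == {"write-through", "write-back"}:
        return "write-through"      # WT overrides WB
    return None                     # e.g. WC inside WB: undefined


print(effective_type(["write-back", "uncachable"]))       # uncachable
print(effective_type(["write-back", "write-combining"]))  # None
```

The second call is the case in this thread: a write-combining register placed inside a write-back register has no defined result, which is why the write-back cover has to be removed first.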

    | As far as I'm aware, changing MTRRs won't make a system unstable,
    | especially if done so early on, when the kernel is starting up. All
    | it does is change the behavior on how the CPU will cache write
    | requests to memory.

    Two kinds of stability issues:

    - if the MTRRs are being changed while other things are going on, it
    may be the case that memory accesses are performed with an improper
    configuration.

    This is quite possible if the changes are from a userland program,
    like mine. It might happen in a kernel-based version if
    insufficient locking is performed.

    - wise people have said that SMM code may make assumptions about MTRR
    settings. Here are a couple of random messages that touch on this:

    http://lkml.org/lkml/2008/4/28/201
    http://lkml.org/lkml/2008/4/29/522

    | All system memory should be marked as write-back,
    | how many MTRRs are used to do this...I'm not sure if it exactly
    | matters. You can set MTRR_SPARE_REG_NR and control how many MTRR
    | slots the code will use.

    There are only 8 MTRRs on current hardware, as far as I know. You
    cannot use more. If you use 8 or fewer, the number probably doesn't
    matter.
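One reason the count can still become a constraint: in practice (and in every /proc/mtrr listing in this thread) each variable MTRR describes a naturally aligned power-of-two block, so a range with an awkward size needs several registers. The sketch below is a plain greedy split with a hypothetical function name; it is not Yinghai Lu's cleanup code and not mtrr-uncover's algorithm.

```python
def aligned_blocks(start, end):
    """Split the inclusive range [start, end] into naturally aligned
    power-of-two blocks, returned as (base, size) pairs."""
    blocks = []
    while start <= end:
        remaining = end - start + 1
        # Largest power of two that fits in what is left of the range.
        size = 1 << (remaining.bit_length() - 1)
        # Shrink until the base address is a multiple of the block size.
        while start % size:
            size >>= 1
        blocks.append((start, size))
        start += size
    return blocks


for lo, hi in [(0x000000000, 0x0c7ffffff), (0x100000000, 0x1afffffff)]:
    for base, size in aligned_blocks(lo, hi):
        print(f"base=0x{base:09x} size={size >> 20}MB")
```

Covering the two write-back ranges used as input here takes six registers (2048MB + 1024MB + 128MB and 2048MB + 512MB + 256MB), which leaves only two of the eight for everything else.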

    Clearly Yinghai Lu thinks the number of unused registers matters or he
    would not have implemented MTRR_SPARE_REG_NR. I don't know why.

    | Is it legal to mark a write-combining range within a write-back range?

    No.

    | Ideally, maybe adding a minimal amount of MTRRs might be best, as D.
    | Hugh Redelmeier's userspace app does,

    My program aims to minimize MTRRs used in the hope that no
    approximation need be used.

    | I'm no kernel dev, I code a bit here and there, but I spent a LOT of
    | time researching this when I ran into the problem myself on my new PC.

    Hear, hear! This is a dark and ill-documented corner of the world
    with nasty things lurking there to bite you. Both of us are here because
    we got bit.

    I'm a bit disappointed that my messages to LKML haven't provoked more
    reaction.

    | There are a lot of posts about it too in the Intel bug tracker for
    | people with newer boards and the G45 chipset.

    Could you point me towards them? I'd like to see if mtrr-uncover
    works for their problems.

    | Most users shouldn't
    | have to worry about this, and it should "just work".

    Yes.

    | I don't think this should be pulled unless a different fix is in place
    | in the kernel.

    I agree. But if it introduces new mysterious problems, then things
    are not necessarily better.

    That is why it defaults to being off. At least I think it does.

    I think/suspect/hope that my algorithm is safer. I'm not advocating
    userland code -- that's just a prototype.

    | Here's what the BIOS does with my MTRRs; write-combining can't be set up
    | for my video card:
    | reg00: base=0x1b0000000 (6912MB), size= 256MB: uncachable, count=1
    | reg01: base=0x1c0000000 (7168MB), size=1024MB: uncachable, count=1
    | reg02: base=0x00000000 ( 0MB), size=8192MB: write-back, count=1
    | reg03: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
    | reg04: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
    | reg05: base=0xc7e00000 (3198MB), size= 2MB: uncachable, count=1
    | reg06: base=0xc8000000 (3200MB), size= 128MB: uncachable, count=1

    Hmm. That's not what I saw in
    http://bugs.freedesktop.org/show_bug.cgi?id=17782

    A more readable presentation of the same information:
    2 0x000000000-0x1ffffffff write-back
    5 0x0c7e00000-0x0c7ffffff uncachable
    6 0x0c8000000-0x0cfffffff uncachable
    3 0x0d0000000-0x0dfffffff uncachable
    4 0x0e0000000-0x0ffffffff uncachable
    0 0x1b0000000-0x1bfffffff uncachable
    1 0x1c0000000-0x1ffffffff uncachable
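The conversion from the /proc/mtrr text to this start-end form is mechanical. A minimal Python sketch follows, assuming the exact line format quoted in this thread (sizes always in MB); the regular expression and the `parse` helper are made up for the example and are not part of mtrr-uncover.

```python
import re

# Matches lines such as:
#   reg00: base=0x1b0000000 (6912MB), size= 256MB: uncachable, count=1
LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)MB: ([a-z-]+), count=\d+"
)


def parse(text):
    """Return (start, end, register, type) tuples sorted by start address."""
    rows = []
    for line in text.splitlines():
        m = LINE.search(line)
        if m is None:
            continue                      # skip anything that is not a register line
        start = int(m.group(2), 16)
        size = int(m.group(3)) << 20      # MB to bytes
        rows.append((start, start + size - 1, int(m.group(1)), m.group(4)))
    return sorted(rows)


BIOS = """\
reg00: base=0x1b0000000 (6912MB), size= 256MB: uncachable, count=1
reg01: base=0x1c0000000 (7168MB), size=1024MB: uncachable, count=1
reg02: base=0x00000000 ( 0MB), size=8192MB: write-back, count=1
reg03: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
reg04: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
reg05: base=0xc7e00000 (3198MB), size= 2MB: uncachable, count=1
reg06: base=0xc8000000 (3200MB), size= 128MB: uncachable, count=1
"""

for start, end, reg, kind in parse(BIOS):
    print(f"{reg} 0x{start:09x}-0x{end:09x} {kind}")
```

Sorting by start address puts the registers in the order 2, 5, 6, 3, 4, 0, 1, the same order as the listing above.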

    Today's version of mtrr-uncover comes up with the following precise
    solution:

    2' 0x000000000-0x07fffffff write-back
    51' 0x080000000-0x0bfffffff write-back
    52' 0x0c0000000-0x0c7ffffff write-back
    5 0x0c7e00000-0x0c7ffffff uncachable
    3T 0x0d0000000-0x0dfffffff uncachable
    50 0x100000000-0x1ffffffff write-back
    0 0x1b0000000-0x1bfffffff uncachable
    1 0x1c0000000-0x1ffffffff uncachable
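One way to check that a rearranged layout really is precise is to compute the effective type of every address under both layouts and compare the results. The Python sketch below does that for the BIOS layout and the layout just listed. It rests on two assumptions that this thread does not state: addresses covered by no MTRR get the default type, taken here to be uncachable, and where registers overlap, uncachable wins. The helper name is hypothetical.

```python
UC, WB = "uncachable", "write-back"

BIOS = [
    (0x000000000, 0x1ffffffff, WB), (0x0c7e00000, 0x0c7ffffff, UC),
    (0x0c8000000, 0x0cfffffff, UC), (0x0d0000000, 0x0dfffffff, UC),
    (0x0e0000000, 0x0ffffffff, UC), (0x1b0000000, 0x1bfffffff, UC),
    (0x1c0000000, 0x1ffffffff, UC),
]

UNCOVERED = [
    (0x000000000, 0x07fffffff, WB), (0x080000000, 0x0bfffffff, WB),
    (0x0c0000000, 0x0c7ffffff, WB), (0x0c7e00000, 0x0c7ffffff, UC),
    (0x0d0000000, 0x0dfffffff, UC), (0x100000000, 0x1ffffffff, WB),
    (0x1b0000000, 0x1bfffffff, UC), (0x1c0000000, 0x1ffffffff, UC),
]


def effective_map(regs, default=UC):
    """Collapse overlapping (start, end, type) registers into a sorted list
    of non-overlapping (start, end, type) segments."""
    # Addresses at which the set of matching registers can change.
    cuts = sorted({0} | {s for s, _, _ in regs} | {e + 1 for _, e, _ in regs})
    segments = []
    for lo, nxt in zip(cuts, cuts[1:]):
        hi = nxt - 1
        kinds = {t for s, e, t in regs if s <= lo and hi <= e}
        if not kinds:
            kind = default                 # assumed default type
        elif UC in kinds:
            kind = UC                      # UC overrides overlapping types
        elif len(kinds) == 1:
            kind = next(iter(kinds))
        else:
            raise ValueError(f"undefined overlap at 0x{lo:x}: {sorted(kinds)}")
        if segments and segments[-1][2] == kind:
            segments[-1] = (segments[-1][0], hi, kind)   # merge with neighbour
        else:
            segments.append((lo, hi, kind))
    return segments


for start, end, kind in effective_map(UNCOVERED):
    print(f"0x{start:09x}-0x{end:09x} {kind}")
print(effective_map(BIOS) == effective_map(UNCOVERED))
```

Under those assumptions the two maps are identical, so only the register assignment differs. The check is for the layout before the 3T register is switched to write-combining.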

    I made some changes to mtrr-uncover today:

    - it now makes sure that there is a distinct uncovered MTRR
    corresponding to each range the user specified. This makes changing
    the region to WC easier.

    Before, it often optimized away the target MTRR. This generally was
    not a problem, but it could be if there were no free MTRR registers.

    - I added another optimization, prompted by this example
    configuration (thanks!). Without this optimization, the
    program could not fit a solution to this example in 8 MTRRs.

    ftp://ftp.cs.utoronto.ca/pub/hugh/mt...2008sept30.tgz

    | and with Yinghai Lu's patches in git tip, with working write-combining mark
    | reg00: base=0x00000000 ( 0MB), size=2048MB: write-back, count=1
    | reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
    | reg02: base=0xc0000000 (3072MB), size= 128MB: write-back, count=1
    | reg03: base=0xc7e00000 (3198MB), size= 2MB: uncachable, count=1
    | reg04: base=0x100000000 (4096MB), size=2048MB: write-back, count=1
    | reg05: base=0x180000000 (6144MB), size= 512MB: write-back, count=1
    | reg06: base=0x1a0000000 (6656MB), size= 256MB: write-back, count=1
    | reg07: base=0xd0000000 (3328MB), size= 256MB: write-combining, count=1

    Interesting. In this case, reg03 is nested within reg02. I didn't
    realize that Yinghai Lu's code allowed nesting.

    This is an approximation:
    0x1b0000000-0x1ffffffff is now UC but was WB

    I would claim that mtrr-uncover's solution is therefore superior.
