Re: [Bug 10328] New: [regression] performance drop for glx - Kernel

This is a discussion on Re: [Bug 10328] New: [regression] performance drop for glx - Kernel ; (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Tue, 25 Mar 2008 15:11:15 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=10328 > > Summary: [regression] performance drop for glx > Product: Memory Management > Version: ...

+ Reply to Thread
Results 1 to 9 of 9

Thread: Re: [Bug 10328] New: [regression] performance drop for glx

  1. Re: [Bug 10328] New: [regression] performance drop for glx

    (switched to email. Please respond via emailed reply-to-all, not via the
    bugzilla web interface).

    On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    bugme-daemon@bugzilla.kernel.org wrote:

    > http://bugzilla.kernel.org/show_bug.cgi?id=10328
    >
    > Summary: [regression] performance drop for glx
    > Product: Memory Management
    > Version: 2.5
    > KernelVersion: 2.6.25-rc6
    > Platform: All
    > OS/Version: Linux
    > Tree: Mainline
    > Status: NEW
    > Severity: normal
    > Priority: P1
    > Component: Other
    > AssignedTo: akpm@osdl.org
    > ReportedBy: bug-track@fisher-privat.net
    >
    >
    > after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
    > related perfomance issues.
    >
    > I used glxgears for test.
    > before this patch: 1281.005 FPS
    > and after: 765.000 FPS


    It nearly halved.

    > latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
    > still hase bad performance.
    >
    > I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
    >


    That's

    : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
    : Author: Siddha, Suresh B
    : Date: Wed Jan 30 13:33:43 2008 +0100
    :
    : x86: set strong uncacheable where UC is really desired
    :
    : Also use _PAGE_PWT for all the mappings which need uncache mapping.
    : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
    : we now use PAT3 which is strong uncacheable.
    :
    : This makes it consistent with pgprot_noncached()
    :
    : Signed-off-by: Suresh Siddha
    : Signed-off-by: Ingo Molnar
    : Signed-off-by: Thomas Gleixner
    :


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [Bug 10328] New: [regression] performance drop for glx

    On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
    > (switched to email. Please respond via emailed reply-to-all, not via the
    > bugzilla web interface).
    >
    > On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    > bugme-daemon@bugzilla.kernel.org wrote:
    >
    > > http://bugzilla.kernel.org/show_bug.cgi?id=10328
    > >
    > > Summary: [regression] performance drop for glx
    > >
    > > after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
    > > related perfomance issues.
    > >
    > > I used glxgears for test.
    > > before this patch: 1281.005 FPS
    > > and after: 765.000 FPS

    >
    > It nearly halved.
    >
    > > latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
    > > still hase bad performance.
    > >
    > > I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
    > >

    >
    > That's
    >
    > : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
    > : Author: Siddha, Suresh B
    > : Date: Wed Jan 30 13:33:43 2008 +0100
    > :
    > : x86: set strong uncacheable where UC is really desired
    > :
    > : Also use _PAGE_PWT for all the mappings which need uncache mapping.
    > : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
    > : we now use PAT3 which is strong uncacheable.
    > :
    > : This makes it consistent with pgprot_noncached()


    Alexey, Can you please try the appended patch?

    Andrew, can you please push the appended patch for 2.6.25? Thanks.
    ---

    fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
    WC attribute. Recent changes in page attribute code made both
    ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
    breaks the graphics performance, as the effective memory type is UC instead
    of expected WC.

    The correct way to fix this is to add ioremap_wc() (which uses UC- in the
    absence of PAT kernel support and WC with PAT) and change all the
    fb drivers to use this new ioremap_wc() API.

    We can take this correct and longer route for post 2.6.25. For now,
    revert back to the UC- behavior for ioremap/ioremap_nocache.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    Cc: Arjan van de Ven
    ---

    diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
    index 4afaba0..794895c 100644
    --- a/arch/x86/mm/ioremap.c
    +++ b/arch/x86/mm/ioremap.c
    @@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
    switch (mode) {
    case IOR_MODE_UNCACHED:
    default:
    - prot = PAGE_KERNEL_NOCACHE;
    + /*
    + * FIXME: we will use UC MINUS for now, as video fb drivers
    + * depend on it. Upcoming ioremap_wc() will fix this behavior.
    + */
    + prot = PAGE_KERNEL_UC_MINUS;
    break;
    case IOR_MODE_CACHED:
    prot = PAGE_KERNEL;
    diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
    index 174b877..9cf472a 100644
    --- a/include/asm-x86/pgtable.h
    +++ b/include/asm-x86/pgtable.h
    @@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
    #define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
    #define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
    #define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
    +#define __PAGE_KERNEL_UC_MINUS (__PAGE_KERNEL | _PAGE_PCD)
    #define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
    #define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
    #define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
    @@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
    #define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
    #define PAGE_KERNEL_RX MAKE_GLOBAL(__PAGE_KERNEL_RX)
    #define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
    +#define PAGE_KERNEL_UC_MINUS MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
    #define PAGE_KERNEL_EXEC_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
    #define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
    #define PAGE_KERNEL_LARGE_EXEC MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [Bug 10328] New: [regression] performance drop for glx

    Suresh Siddha wrote:
    > On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
    >> (switched to email. Please respond via emailed reply-to-all, not via the
    >> bugzilla web interface).
    >>
    >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    >> bugme-daemon@bugzilla.kernel.org wrote:
    >>
    >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
    >>>
    >>> Summary: [regression] performance drop for glx
    >>>
    >>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
    >>> related perfomance issues.
    >>>
    >>> I used glxgears for test.
    >>> before this patch: 1281.005 FPS
    >>> and after: 765.000 FPS

    >> It nearly halved.
    >>
    >>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
    >>> still hase bad performance.
    >>>
    >>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
    >>>

    >> That's
    >>
    >> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
    >> : Author: Siddha, Suresh B
    >> : Date: Wed Jan 30 13:33:43 2008 +0100
    >> :
    >> : x86: set strong uncacheable where UC is really desired
    >> :
    >> : Also use _PAGE_PWT for all the mappings which need uncache mapping.
    >> : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
    >> : we now use PAT3 which is strong uncacheable.
    >> :
    >> : This makes it consistent with pgprot_noncached()

    >
    > Alexey, Can you please try the appended patch?
    >
    > Andrew, can you please push the appended patch for 2.6.25? Thanks.
    > ---
    >
    > fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
    > WC attribute. Recent changes in page attribute code made both
    > ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
    > breaks the graphics performance, as the effective memory type is UC instead
    > of expected WC.
    >
    > The correct way to fix this is to add ioremap_wc() (which uses UC- in the
    > absence of PAT kernel support and WC with PAT) and change all the
    > fb drivers to use this new ioremap_wc() API.
    >
    > We can take this correct and longer route for post 2.6.25. For now,
    > revert back to the UC- behavior for ioremap/ioremap_nocache.


    I would still like to add an ioremap_wc() even in 2.6.25; even if it's for now
    identical to ioremap_nocache(). Better get the right API in place as soon as possible

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [Bug 10328] New: [regression] performance drop for glx


    * Suresh Siddha wrote:

    > fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add
    > with WC attribute. Recent changes in page attribute code made both
    > ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-).
    > This breaks the graphics performance, as the effective memory type is
    > UC instead of expected WC.
    >
    > The correct way to fix this is to add ioremap_wc() (which uses UC- in
    > the absence of PAT kernel support and WC with PAT) and change all the
    > fb drivers to use this new ioremap_wc() API.
    >
    > We can take this correct and longer route for post 2.6.25. For now,
    > revert back to the UC- behavior for ioremap/ioremap_nocache.


    thanks Suresh, applied.

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [Bug 10328] New: [regression] performance drop for glx

    On Wed, Mar 26, 2008 at 06:29:11AM +0100, Ingo Molnar wrote:
    >
    > * Suresh Siddha wrote:
    >
    > > fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add
    > > with WC attribute. Recent changes in page attribute code made both
    > > ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-).
    > > This breaks the graphics performance, as the effective memory type is
    > > UC instead of expected WC.
    > >
    > > The correct way to fix this is to add ioremap_wc() (which uses UC- in
    > > the absence of PAT kernel support and WC with PAT) and change all the
    > > fb drivers to use this new ioremap_wc() API.
    > >
    > > We can take this correct and longer route for post 2.6.25. For now,
    > > revert back to the UC- behavior for ioremap/ioremap_nocache.

    >
    > thanks Suresh, applied.


    Well, we need to take care of set_memory_uc() aswell, as the previous version
    didn't fix Alexey's issue.

    Alexey, can you please test and ack this, before Andrew/Ingo can push relevant
    bits to their trees. Thanks.
    ---

    fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
    WC attribute. Recent changes in page attribute code made both
    ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
    breaks the graphics performance, as the effective memory type is UC instead
    of expected WC.

    The correct way to fix this is to add ioremap_wc() (which uses UC- in the
    absence of PAT kernel support and WC with PAT) and change all the
    drivers to use this new ioremap_wc() API.

    We can take this correct and longer route for post 2.6.25. For now,
    revert back to the UC- behavior for ioremap/ioremap_nocache.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    Cc: Arjan van de Ven
    Cc: Alexey Fisher
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    ---

    diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
    index 4afaba0..794895c 100644
    --- a/arch/x86/mm/ioremap.c
    +++ b/arch/x86/mm/ioremap.c
    @@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
    switch (mode) {
    case IOR_MODE_UNCACHED:
    default:
    - prot = PAGE_KERNEL_NOCACHE;
    + /*
    + * FIXME: we will use UC MINUS for now, as video fb drivers
    + * depend on it. Upcoming ioremap_wc() will fix this behavior.
    + */
    + prot = PAGE_KERNEL_UC_MINUS;
    break;
    case IOR_MODE_CACHED:
    prot = PAGE_KERNEL;
    diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
    index 14e48b5..7b79f6b 100644
    --- a/arch/x86/mm/pageattr.c
    +++ b/arch/x86/mm/pageattr.c
    @@ -771,7 +771,7 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,
    int set_memory_uc(unsigned long addr, int numpages)
    {
    return change_page_attr_set(addr, numpages,
    - __pgprot(_PAGE_PCD | _PAGE_PWT));
    + __pgprot(_PAGE_PCD));
    }
    EXPORT_SYMBOL(set_memory_uc);

    diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
    index 174b877..9cf472a 100644
    --- a/include/asm-x86/pgtable.h
    +++ b/include/asm-x86/pgtable.h
    @@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
    #define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
    #define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
    #define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
    +#define __PAGE_KERNEL_UC_MINUS (__PAGE_KERNEL | _PAGE_PCD)
    #define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
    #define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
    #define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
    @@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
    #define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
    #define PAGE_KERNEL_RX MAKE_GLOBAL(__PAGE_KERNEL_RX)
    #define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
    +#define PAGE_KERNEL_UC_MINUS MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
    #define PAGE_KERNEL_EXEC_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
    #define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
    #define PAGE_KERNEL_LARGE_EXEC MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [Bug 10328] New: [regression] performance drop for glx

    Am Dienstag, den 25.03.2008, 21:42 -0700 schrieb Arjan van de Ven:
    > Suresh Siddha wrote:
    > > On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
    > >> (switched to email. Please respond via emailed reply-to-all, not via the
    > >> bugzilla web interface).
    > >>
    > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    > >> bugme-daemon@bugzilla.kernel.org wrote:
    > >>
    > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
    > >>>
    > >>> Summary: [regression] performance drop for glx
    > >>>
    > >>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
    > >>> related perfomance issues.
    > >>>
    > >>> I used glxgears for test.
    > >>> before this patch: 1281.005 FPS
    > >>> and after: 765.000 FPS
    > >> It nearly halved.
    > >>
    > >>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
    > >>> still hase bad performance.
    > >>>
    > >>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
    > >>>
    > >> That's
    > >>
    > >> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
    > >> : Author: Siddha, Suresh B
    > >> : Date: Wed Jan 30 13:33:43 2008 +0100
    > >> :
    > >> : x86: set strong uncacheable where UC is really desired
    > >> :
    > >> : Also use _PAGE_PWT for all the mappings which need uncache mapping.
    > >> : Instead of existing PAT2 which is UC- (and can be overwritten byMTRRs),
    > >> : we now use PAT3 which is strong uncacheable.
    > >> :
    > >> : This makes it consistent with pgprot_noncached()

    > >
    > > Alexey, Can you please try the appended patch?
    > >
    > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
    > > ---
    > >



    This issue come again with commit:

    commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
    Author: venkatesh.pallipadi@intel.com
    Date: Tue Mar 18 17:00:14 2008 -0700

    x86: PAT infrastructure patch

    Sets up pat_init() infrastructure.

    PAT MSR has following setting.
    PAT
    |PCD
    ||PWT
    |||
    000 WB _PAGE_CACHE_WB
    001 WC _PAGE_CACHE_WC
    010 UC- _PAGE_CACHE_UC_MINUS
    011 UC _PAGE_CACHE_UC

    We are effectively changing WT from boot time setting to WC.
    UC_MINUS is used to provide backward compatibility to existing /dev/mem
    users(X).

    reserve_memtype and free_memtype are new interfaces for maintaining
    alias-free
    mapping. It is currently implemented in a simple way with a linked listand
    not optimized. reserve and free tracks the effective memory type, as a
    result
    of PAT and MTRR setting rather than what is actually requested in PAT.

    pat_init piggy backs on mtrr_init as the rules for setting both pat and
    mtrr
    are same.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIEW2bVVCoNUmKuAcRAgorAKDFgfR5jRTPiyzohmGVkg CTU3ocYwCfS/06
    WapHyXdZ+s219mSum4Szykg=
    =5//2
    -----END PGP SIGNATURE-----


  7. Re: [Bug 10328] New: [regression] performance drop for glx

    On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
    > > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    > > >> bugme-daemon@bugzilla.kernel.org wrote:
    > > >>
    > > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
    > > >>>
    > > >>> Summary: [regression] performance drop for glx
    > > >
    > > > Alexey, Can you please try the appended patch?
    > > >
    > > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
    > > > ---
    > > >

    >
    >
    > This issue come again with commit:
    >
    > commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
    > Author: venkatesh.pallipadi@intel.com
    > Date: Tue Mar 18 17:00:14 2008 -0700
    >
    > x86: PAT infrastructure patch
    >
    > Sets up pat_init() infrastructure.
    >


    Alexey, Sorry. I missed posting a fix in the post PAT series for this issue. Can you
    please try the appended patch and Ack?

    copying X folks get their attention and their Ack on the patch and next steps outlined.

    Thanks.
    ---

    [patch] use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()

    Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
    Once all the X drivers move to ioremap_wc(), we can go back to strong
    UC semantics for ioremap() and ioremap_nocache().

    To avoid attribute aliasing issues, pci_mmap_page_range() will also
    use UC_MINUS for default non write-combining mapping request.

    Next steps:
    a) change all the video drivers using ioremap() or ioremap_nocache()
    and adding WC MTTR using mttr_add() to ioremap_wc()

    b) for strict usage, we can go back to strong uc semantics
    for ioremap() and ioremap_nocache() after some grace period for
    completing step-a.

    c) user level X server needs to use the appropriate method for setting
    up WC mapping (like using resourceX_wc sysfs file instead of
    adding MTRR for WC and using /dev/mem or resourceX under /sys)

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    ---

    diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
    index 3a4baf9..c053977 100644
    --- a/arch/x86/mm/ioremap.c
    +++ b/arch/x86/mm/ioremap.c
    @@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
    /*
    * Do not fallback to certain memory types with certain
    * requested type:
    - * - request is uncached, return cannot be write-back
    - * - request is uncached, return cannot be write-combine
    + * - request is uc-, return cannot be write-back
    + * - request is uc-, return cannot be write-combine
    * - request is write-combine, return cannot be write-back
    */
    - if ((prot_val == _PAGE_CACHE_UC &&
    + if ((prot_val == _PAGE_CACHE_UC_MINUS &&
    (new_prot_val == _PAGE_CACHE_WB ||
    new_prot_val == _PAGE_CACHE_WC)) ||
    (prot_val == _PAGE_CACHE_WC &&
    @@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
    default:
    prot = PAGE_KERNEL_NOCACHE;
    break;
    + case _PAGE_CACHE_UC_MINUS:
    + prot = PAGE_KERNEL_UC_MINUS;
    + break;
    case _PAGE_CACHE_WC:
    prot = PAGE_KERNEL_WC;
    break;
    @@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
    */
    void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
    {
    - return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
    + /*
    + * Ideally, this should be:
    + * pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
    + *
    + * Till we fix all X drivers to use ioremap_wc(), we will use
    + * UC MINUS.
    + */
    + unsigned long val = _PAGE_CACHE_UC_MINUS;
    +
    + return __ioremap(phys_addr, size, val);
    }
    EXPORT_SYMBOL(ioremap_nocache);

    diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
    index c29ebd0..ad3202c 100644
    --- a/arch/x86/mm/pageattr.c
    +++ b/arch/x86/mm/pageattr.c
    @@ -779,14 +779,20 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,

    int _set_memory_uc(unsigned long addr, int numpages)
    {
    + /*
    + * for now UC MINUS. see comments in ioremap_nocache()
    + */
    return change_page_attr_set(addr, numpages,
    - __pgprot(_PAGE_CACHE_UC));
    + __pgprot(_PAGE_CACHE_UC_MINUS));
    }

    int set_memory_uc(unsigned long addr, int numpages)
    {
    + /*
    + * for now UC MINUS. see comments in ioremap_nocache()
    + */
    if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
    - _PAGE_CACHE_UC, NULL))
    + _PAGE_CACHE_UC_MINUS, NULL))
    return -EINVAL;

    return _set_memory_uc(addr, numpages);
    diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
    index 94f6c73..8af0f0b 100644
    --- a/arch/x86/pci/i386.c
    +++ b/arch/x86/pci/i386.c
    @@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
    prot = pgprot_val(vma->vm_page_prot);
    if (pat_wc_enabled && write_combine)
    prot |= _PAGE_CACHE_WC;
    + else if (pat_wc_enabled)
    + /*
    + * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
    + * To avoid attribute conflicts, request UC MINUS here
    + * aswell.
    + */
    + prot |= _PAGE_CACHE_UC_MINUS;
    else if (boot_cpu_data.x86 > 3)
    prot |= _PAGE_CACHE_UC;

    @@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
    * - request is uncached, return cannot be write-combine
    * - request is write-combine, return cannot be write-back
    */
    - if ((flags == _PAGE_CACHE_UC &&
    - (new_flags == _PAGE_CACHE_WB ||
    - new_flags == _PAGE_CACHE_WC)) ||
    + if ((flags == _PAGE_CACHE_UC_MINUS &&
    + (new_flags == _PAGE_CACHE_WB)) ||
    (flags == _PAGE_CACHE_WC &&
    new_flags == _PAGE_CACHE_WB)) {
    free_memtype(addr, addr+len);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [Bug 10328] New: [regression] performance drop for glx

    Am Freitag, den 25.04.2008, 17:07 -0700 schrieb Suresh Siddha:
    > On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
    > > > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
    > > > >> bugme-daemon@bugzilla.kernel.org wrote:
    > > > >>
    > > > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
    > > > >>>
    > > > >>> Summary: [regression] performance drop for glx
    > > > >
    > > > > Alexey, Can you please try the appended patch?
    > > > >
    > > > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
    > > > > ---
    > > > >

    > >
    > >
    > > This issue come again with commit:
    > >
    > > commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
    > > Author: venkatesh.pallipadi@intel.com
    > > Date: Tue Mar 18 17:00:14 2008 -0700
    > >
    > > x86: PAT infrastructure patch
    > >
    > > Sets up pat_init() infrastructure.
    > >

    >
    > Alexey, Sorry. I missed posting a fix in the post PAT series for this issue. Can you
    > please try the appended patch and Ack?
    >
    > copying X folks get their attention and their Ack on the patch and next steps outlined.
    >
    > Thanks.
    > ---
    >
    > [patch] use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
    >
    > Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
    > Once all the X drivers move to ioremap_wc(), we can go back to strong
    > UC semantics for ioremap() and ioremap_nocache().
    >


    Thank you. This patch working.

    Ack ?

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIExV9VVCoNUmKuAcRAtbZAKCjo97tjchJn/ntsK3WWqyR2VMRSwCeMjOD
    pKOjS7F7DP88DMDhXCXOfg4=
    =O3fr
    -----END PGP SIGNATURE-----


  9. Re: [Bug 10328] New: [regression] performance drop for glx


    * Suresh Siddha wrote:

    > Alexey, Sorry. I missed posting a fix in the post PAT series for this
    > issue. Can you please try the appended patch and Ack?
    >
    > copying X folks get their attention and their Ack on the patch and
    > next steps outlined.


    i queued up the patch below - hand-merged to x86.git.

    Ingo

    ----------->
    Subject: x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
    From: Suresh Siddha
    Date: Fri, 25 Apr 2008 17:07:22 -0700

    Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
    Once all the X drivers move to ioremap_wc(), we can go back to strong
    UC semantics for ioremap() and ioremap_nocache().

    To avoid attribute aliasing issues, pci_mmap_page_range() will also
    use UC_MINUS for default non write-combining mapping request.

    Next steps:
    a) change all the video drivers using ioremap() or ioremap_nocache()
    and adding WC MTTR using mttr_add() to ioremap_wc()

    b) for strict usage, we can go back to strong uc semantics
    for ioremap() and ioremap_nocache() after some grace period for
    completing step-a.

    c) user level X server needs to use the appropriate method for setting
    up WC mapping (like using resourceX_wc sysfs file instead of
    adding MTRR for WC and using /dev/mem or resourceX under /sys)

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Ingo Molnar
    ---
    arch/x86/mm/ioremap.c | 20 ++++++++++++++++----
    arch/x86/mm/pageattr.c | 10 ++++++++--
    arch/x86/pci/i386.c | 12 +++++++++---
    3 files changed, 33 insertions(+), 9 deletions(-)

    Index: linux-x86.q/arch/x86/mm/ioremap.c
    ================================================== =================
    --- linux-x86.q.orig/arch/x86/mm/ioremap.c
    +++ linux-x86.q/arch/x86/mm/ioremap.c
    @@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_
    /*
    * Do not fallback to certain memory types with certain
    * requested type:
    - * - request is uncached, return cannot be write-back
    - * - request is uncached, return cannot be write-combine
    + * - request is uc-, return cannot be write-back
    + * - request is uc-, return cannot be write-combine
    * - request is write-combine, return cannot be write-back
    */
    - if ((prot_val == _PAGE_CACHE_UC &&
    + if ((prot_val == _PAGE_CACHE_UC_MINUS &&
    (new_prot_val == _PAGE_CACHE_WB ||
    new_prot_val == _PAGE_CACHE_WC)) ||
    (prot_val == _PAGE_CACHE_WC &&
    @@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_
    default:
    prot = PAGE_KERNEL_NOCACHE;
    break;
    + case _PAGE_CACHE_UC_MINUS:
    + prot = PAGE_KERNEL_UC_MINUS;
    + break;
    case _PAGE_CACHE_WC:
    prot = PAGE_KERNEL_WC;
    break;
    @@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_
    */
    void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
    {
    - return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
    + /*
    + * Ideally, this should be:
    + * pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
    + *
    + * Till we fix all X drivers to use ioremap_wc(), we will use
    + * UC MINUS.
    + */
    + unsigned long val = _PAGE_CACHE_UC_MINUS;
    +
    + return __ioremap(phys_addr, size, val);
    }
    EXPORT_SYMBOL(ioremap_nocache);

    Index: linux-x86.q/arch/x86/mm/pageattr.c
    ================================================== =================
    --- linux-x86.q.orig/arch/x86/mm/pageattr.c
    +++ linux-x86.q/arch/x86/mm/pageattr.c
    @@ -829,14 +829,20 @@ static inline int change_page_attr_clear

    int _set_memory_uc(unsigned long addr, int numpages)
    {
    + /*
    + * for now UC MINUS. see comments in ioremap_nocache()
    + */
    return change_page_attr_set(&addr, numpages,
    - __pgprot(_PAGE_CACHE_UC), 0);
    + __pgprot(_PAGE_CACHE_UC_MINUS), 0);
    }

    int set_memory_uc(unsigned long addr, int numpages)
    {
    + /*
    + * for now UC MINUS. see comments in ioremap_nocache()
    + */
    if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
    - _PAGE_CACHE_UC, NULL))
    + _PAGE_CACHE_UC_MINUS, NULL))
    return -EINVAL;

    return _set_memory_uc(addr, numpages);
    Index: linux-x86.q/arch/x86/pci/i386.c
    ================================================== =================
    --- linux-x86.q.orig/arch/x86/pci/i386.c
    +++ linux-x86.q/arch/x86/pci/i386.c
    @@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *
    prot = pgprot_val(vma->vm_page_prot);
    if (pat_wc_enabled && write_combine)
    prot |= _PAGE_CACHE_WC;
    + else if (pat_wc_enabled)
    + /*
    + * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
    + * To avoid attribute conflicts, request UC MINUS here
    + * aswell.
    + */
    + prot |= _PAGE_CACHE_UC_MINUS;
    else if (boot_cpu_data.x86 > 3)
    prot |= _PAGE_CACHE_UC;

    @@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *
    * - request is uncached, return cannot be write-combine
    * - request is write-combine, return cannot be write-back
    */
    - if ((flags == _PAGE_CACHE_UC &&
    - (new_flags == _PAGE_CACHE_WB ||
    - new_flags == _PAGE_CACHE_WC)) ||
    + if ((flags == _PAGE_CACHE_UC_MINUS &&
    + (new_flags == _PAGE_CACHE_WB)) ||
    (flags == _PAGE_CACHE_WC &&
    new_flags == _PAGE_CACHE_WB)) {
    free_memtype(addr, addr+len);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread