[PATCH] x86: do not allow to optimize flag_is_changeable_p() - Kernel

This is a discussion on [PATCH] x86: do not allow to optimize flag_is_changeable_p() - Kernel ; From: Krzysztof Helt The flag_is_changeable_p() is used by has_cpuid_p() which can return different results in the code sequence below: if (!have_cpuid_p()) identify_cpu_without_cpuid(c); /* cyrix could have cpuid enabled via c_identify()*/ if (!have_cpuid_p()) return; Otherwise, the gcc 3.4.6 optimizes these two ...

+ Reply to Thread
Results 1 to 5 of 5

Thread: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

  1. [PATCH] x86: do not allow to optimize flag_is_changeable_p()

    From: Krzysztof Helt

    The flag_is_changeable_p() is used by
    has_cpuid_p() which can return different results
    in the code sequence below:

    if (!have_cpuid_p())
    identify_cpu_without_cpuid(c);

    /* cyrix could have cpuid enabled via c_identify()*/
    if (!have_cpuid_p())
    return;

    Otherwise, the gcc 3.4.6 optimizes these two calls
    into one which make the code not working correctly.
    Cyrix cpus have the CPUID instruction enabled but
    it is not detected due to the gcc optimization.
    Thus the ARR registers (mtrr like) are not detected
    on such a cpu.

    Signed-off-by: Krzysztof Helt
    ---

    I have tested the 6x86MX cpu with the CPUID
    disabled. I have used linux-next tree (20080819)
    and Yinghai Lu's patch:

    x86: identify_cpu_without_cpuid v2

    http://marc.info/?l=linux-kernel&m=122138380004347&w=2

    The patch below is required to make the patch
    above working correctly.

    diff -urp linux-orig/arch/x86/kernel/cpu/common.c linux-2.6.27/arch/x86/kernel/cpu/common.c
    --- linux-orig/arch/x86/kernel/cpu/common.c 2008-09-29 07:11:54.000000000 +0200
    +++ linux-2.6.27/arch/x86/kernel/cpu/common.c 2008-09-29 18:07:27.667392725 +0200
    @@ -124,18 +124,18 @@ static inline int flag_is_changeable_p(u
    {
    u32 f1, f2;

    - asm("pushfl\n\t"
    - "pushfl\n\t"
    - "popl %0\n\t"
    - "movl %0,%1\n\t"
    - "xorl %2,%0\n\t"
    - "pushl %0\n\t"
    - "popfl\n\t"
    - "pushfl\n\t"
    - "popl %0\n\t"
    - "popfl\n\t"
    - : "=&r" (f1), "=&r" (f2)
    - : "ir" (flag));
    + asm volatile ("pushfl\n\t"
    + "pushfl\n\t"
    + "popl %0\n\t"
    + "movl %0,%1\n\t"
    + "xorl %2,%0\n\t"
    + "pushl %0\n\t"
    + "popfl\n\t"
    + "pushfl\n\t"
    + "popl %0\n\t"
    + "popfl\n\t"
    + : "=&r" (f1), "=&r" (f2)
    + : "ir" (flag));

    return ((f1^f2) & flag) != 0;
    }

    ----------------------------------------------------------------------
    Dzwon taniej na zagraniczne komorki!
    Sprawdz >> http://link.interia.pl/f1f26

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

    Krzysztof Helt wrote:
    > From: Krzysztof Helt
    >
    > The flag_is_changeable_p() is used by
    > has_cpuid_p() which can return different results
    > in the code sequence below:
    >
    > if (!have_cpuid_p())
    > identify_cpu_without_cpuid(c);
    >
    > /* cyrix could have cpuid enabled via c_identify()*/
    > if (!have_cpuid_p())
    > return;
    >
    > Otherwise, the gcc 3.4.6 optimizes these two calls
    > into one which make the code not working correctly.
    > Cyrix cpus have the CPUID instruction enabled but
    > it is not detected due to the gcc optimization.
    > Thus the ARR registers (mtrr like) are not detected
    > on such a cpu.
    >


    I suspect we should also out-of-line this function. It's actually
    relatively sizable and certainly there is no point in inlining it.

    -hpa
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

    Krzysztof Helt wrote:
    > From: Krzysztof Helt
    >
    > The flag_is_changeable_p() is used by
    > has_cpuid_p() which can return different results
    > in the code sequence below:
    >
    > if (!have_cpuid_p())
    > identify_cpu_without_cpuid(c);
    >
    > /* cyrix could have cpuid enabled via c_identify()*/
    > if (!have_cpuid_p())
    > return;
    >
    > Otherwise, the gcc 3.4.6 optimizes these two calls
    > into one which make the code not working correctly.
    > Cyrix cpus have the CPUID instruction enabled but
    > it is not detected due to the gcc optimization.
    > Thus the ARR registers (mtrr like) are not detected
    > on such a cpu.
    >


    If "asm volatile" changes the code and fixes the bug, it seems like
    you're making use of an undocumented - or at least non-portable - behaviour.

    Does adding a "memory" clobber also fix the problem? That would have
    better defined characteristics.

    J

    > Signed-off-by: Krzysztof Helt
    > ---
    >
    > I have tested the 6x86MX cpu with the CPUID
    > disabled. I have used linux-next tree (20080819)
    > and Yinghai Lu's patch:
    >
    > x86: identify_cpu_without_cpuid v2
    >
    > http://marc.info/?l=linux-kernel&m=122138380004347&w=2
    >
    > The patch below is required to make the patch
    > above working correctly.
    >
    > diff -urp linux-orig/arch/x86/kernel/cpu/common.c linux-2.6.27/arch/x86/kernel/cpu/common.c
    > --- linux-orig/arch/x86/kernel/cpu/common.c 2008-09-29 07:11:54.000000000 +0200
    > +++ linux-2.6.27/arch/x86/kernel/cpu/common.c 2008-09-29 18:07:27.667392725 +0200
    > @@ -124,18 +124,18 @@ static inline int flag_is_changeable_p(u
    > {
    > u32 f1, f2;
    >
    > - asm("pushfl\n\t"
    > - "pushfl\n\t"
    > - "popl %0\n\t"
    > - "movl %0,%1\n\t"
    > - "xorl %2,%0\n\t"
    > - "pushl %0\n\t"
    > - "popfl\n\t"
    > - "pushfl\n\t"
    > - "popl %0\n\t"
    > - "popfl\n\t"
    > - : "=&r" (f1), "=&r" (f2)
    > - : "ir" (flag));
    > + asm volatile ("pushfl\n\t"
    > + "pushfl\n\t"
    > + "popl %0\n\t"
    > + "movl %0,%1\n\t"
    > + "xorl %2,%0\n\t"
    > + "pushl %0\n\t"
    > + "popfl\n\t"
    > + "pushfl\n\t"
    > + "popl %0\n\t"
    > + "popfl\n\t"
    > + : "=&r" (f1), "=&r" (f2)
    > + : "ir" (flag));
    >
    > return ((f1^f2) & flag) != 0;
    > }
    >
    > ----------------------------------------------------------------------
    > Dzwon taniej na zagraniczne komorki!
    > Sprawdz >> http://link.interia.pl/f1f26
    >
    > --
    > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    > the body of a message to majordomo@vger.kernel.org
    > More majordomo info at http://vger.kernel.org/majordomo-info.html
    > Please read the FAQ at http://www.tux.org/lkml/
    >


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

    On Mon, Sep 29, 2008 at 11:14 PM, Jeremy Fitzhardinge wrote:
    > Krzysztof Helt wrote:
    >> From: Krzysztof Helt
    >>
    >> The flag_is_changeable_p() is used by
    >> has_cpuid_p() which can return different results
    >> in the code sequence below:
    >>
    >> if (!have_cpuid_p())
    >> identify_cpu_without_cpuid(c);
    >>
    >> /* cyrix could have cpuid enabled via c_identify()*/
    >> if (!have_cpuid_p())
    >> return;
    >>
    >> Otherwise, the gcc 3.4.6 optimizes these two calls
    >> into one which make the code not working correctly.
    >> Cyrix cpus have the CPUID instruction enabled but
    >> it is not detected due to the gcc optimization.
    >> Thus the ARR registers (mtrr like) are not detected
    >> on such a cpu.
    >>

    >
    > If "asm volatile" changes the code and fixes the bug, it seems like
    > you're making use of an undocumented - or at least non-portable - behaviour.
    >
    > Does adding a "memory" clobber also fix the problem? That would have
    > better defined characteristics.
    >


    how about

    if (!have_cpuid_p()) {
    identify_cpu_without_cpuid(c);

    /* cyrix could have cpuid enabled via c_identify()*/
    if (!have_cpuid_p())
    return;
    }


    YH
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

    Yinghai Lu wrote:
    > On Mon, Sep 29, 2008 at 11:14 PM, Jeremy Fitzhardinge wrote:
    >
    >> Krzysztof Helt wrote:
    >>
    >>> From: Krzysztof Helt
    >>>
    >>> The flag_is_changeable_p() is used by
    >>> has_cpuid_p() which can return different results
    >>> in the code sequence below:
    >>>
    >>> if (!have_cpuid_p())
    >>> identify_cpu_without_cpuid(c);
    >>>
    >>> /* cyrix could have cpuid enabled via c_identify()*/
    >>> if (!have_cpuid_p())
    >>> return;
    >>>
    >>> Otherwise, the gcc 3.4.6 optimizes these two calls
    >>> into one which make the code not working correctly.
    >>> Cyrix cpus have the CPUID instruction enabled but
    >>> it is not detected due to the gcc optimization.
    >>> Thus the ARR registers (mtrr like) are not detected
    >>> on such a cpu.
    >>>
    >>>

    >> If "asm volatile" changes the code and fixes the bug, it seems like
    >> you're making use of an undocumented - or at least non-portable - behaviour.
    >>
    >> Does adding a "memory" clobber also fix the problem? That would have
    >> better defined characteristics.
    >>
    >>

    >
    > how about
    >
    > if (!have_cpuid_p()) {
    > identify_cpu_without_cpuid(c);
    >
    > /* cyrix could have cpuid enabled via c_identify()*/
    > if (!have_cpuid_p())
    > return;
    > }
    >


    That doesn't help, does it? If gcc thinks it can get away with
    evaluating have_cpuid_p() once, then that's the same as:

    if (!have_cpuid_p()) {
    identify_cpu_without_cpuid(c);

    return;
    }

    even though identify_cpu_without_cpuid() can cause the cpu to suddenly
    start supporting cpuid.

    The trouble is that flag_is_changeable_p() doesn't have any obvious
    global dependencies; it just takes a constant argument and returns a
    result. The asm() needs to be updated to have a "memory" constraint as
    a stand-in for the specific constraint of "cpu has switched into
    cpuid-supporting state".

    J
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread