[PATCH] Option to disable AMD C1E (allows dynticks to work) - Kernel

  1. [PATCH] Option to disable AMD C1E (allows dynticks to work)

    Some multiprocessor 64-bit AMD systems don't allow the user to disable
    the C1E C-state. The kernel detects C1E and marks the LAPIC as
    broken, thereby disabling dynticks. This patch adds an option to
    disable C1E when detected. It also allows the user to enable this
    processor feature even if that means disabling dynticks, which is
    useful in case C1E might provide better power savings (e.g.: C-states
    beyond C1 don't work). Tested on a Turion X2 TL-56 laptop. Thanks to
    Mikhail Kshevetskiy and FreeBSD for pointing out the relevant AMD docs.

    Signed-off-by: Eduard-Gabriel Munteanu

    ---
    Documentation/kernel-parameters.txt | 6 ++++++
    arch/x86/Kconfig | 16 ++++++++++++++++
    arch/x86/kernel/setup_64.c | 27 +++++++++++++++++++++++++--
    3 files changed, 47 insertions(+), 2 deletions(-)

    diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
    index 33121d6..0deee7a 100644
    --- a/Documentation/kernel-parameters.txt
    +++ b/Documentation/kernel-parameters.txt
    @@ -643,6 +643,12 @@ and is between 256 and 4096 characters. It is defined in the file
    floppy= [HW]
    See Documentation/floppy.txt.

    + force_amd_c1e [KNL,SMP,HW,BUGS=X86-64]
    + Don't disable C1E on AMD systems even if this means
    + disabling nohz. This is _not_ automatically implied by
    + any other parameters, such as "nohz=off".
    + Depends on CONFIG_X86_AMD_C1E_WORKAROUND.
    +
    gamecon.map[2|3]=
    [HW,JOY] Multisystem joystick and NES/SNES/PSX pad
    support via parallel port (up to 5 devices per port)
    diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
    index 368864d..8b9bb49 100644
    --- a/arch/x86/Kconfig
    +++ b/arch/x86/Kconfig
    @@ -198,6 +198,22 @@ config SMP

    If you don't know what to do here, say N.

    +config X86_AMD_C1E_WORKAROUND
    + bool "Disable C1E on AMD systems to make dynticks work"
    + default y
    + depends on X86_64 && SMP && NO_HZ
    + ---help---
    + On some systems, the C1E C-state is enabled by default and cannot be
    + disabled from the CMOS setup. Local APICs don't behave as they should
    + in this case. If you say Y here, C1E will be disabled to allow
    + dynamic ticks to work. It's safe to enable this option even if
    + your system doesn't have an AMD CPU (there are no side-effects if
    + such a CPU isn't detected).
    +
    + You can pass the "force_amd_c1e" boot parameter to the kernel to
    + disable this workaround without recompiling.
    + See Documentation/kernel-parameters.txt for more details.
    +
    choice
    prompt "Subarchitecture Type"
    default X86_PC
    diff --git a/arch/x86/kernel/setup_64.c b/arch/x86/kernel/setup_64.c
    index 30d94d1..15556a0 100644
    --- a/arch/x86/kernel/setup_64.c
    +++ b/arch/x86/kernel/setup_64.c
    @@ -583,6 +583,17 @@ static void __init amd_detect_cmp(struct cpuinfo_x86 *c)
    #endif
    }

    +#ifdef CONFIG_X86_AMD_C1E_WORKAROUND
    +static int __cpuinit disable_amd_c1e = 1;
    +
    +static int __cpuinit force_amd_c1e(char *str) {
    + disable_amd_c1e = 0;
    + return 1;
    +}
    +
    +__setup("force_amd_c1e", force_amd_c1e);
    +#endif /* CONFIG_X86_AMD_C1E_WORKAROUND */
    +
    #define ENABLE_C1E_MASK 0x18000000
    #define CPUID_PROCESSOR_SIGNATURE 1
    #define CPUID_XFAM 0x0ff00000
    @@ -597,6 +608,7 @@ static __cpuinit int amd_apic_timer_broken(void)
    {
    u32 lo, hi;
    u32 eax = cpuid_eax(CPUID_PROCESSOR_SIGNATURE);
    +
    switch (eax & CPUID_XFAM) {
    case CPUID_XFAM_K8:
    if ((eax & CPUID_XMOD) < CPUID_XMOD_REV_F)
    @@ -604,8 +616,19 @@ static __cpuinit int amd_apic_timer_broken(void)
    case CPUID_XFAM_10H:
    case CPUID_XFAM_11H:
    rdmsr(MSR_K8_ENABLE_C1E, lo, hi);
    - if (lo & ENABLE_C1E_MASK)
    - return 1;
    +#ifdef CONFIG_X86_AMD_C1E_WORKAROUND
    + if ((lo & ENABLE_C1E_MASK) && disable_amd_c1e) {
    + printk(KERN_INFO "Disabling AMD C1E on CPU %d\n",
    + smp_processor_id());
    + /*
    + * See AMD's "BIOS and Kernel Developer's Guide for AMD
    + * NPT Family 0Fh Processors", publication #32559,
    + * for details.
    + */
    + wrmsr(MSR_K8_ENABLE_C1E, lo & ~ENABLE_C1E_MASK, hi);
    + } else
    +#endif /* CONFIG_X86_AMD_C1E_WORKAROUND */
    + if (lo & ENABLE_C1E_MASK) return 1;
    break;
    default:
    /* err on the side of caution */
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/
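The MSR handling in the patch comes down to one mask test and one mask clear. Below is a minimal userspace sketch of that decision, assuming the mask value shown in the diff. `c1e_decide` and `struct c1e_result` are hypothetical names used only for illustration, and the register value is passed in as a plain integer: no real MSR is read or written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask taken from the diff: bits 27 and 28 of the low MSR word. */
#define ENABLE_C1E_MASK 0x18000000u

/* Hypothetical result type: what the patched check would conclude. */
struct c1e_result {
    uint32_t new_lo;       /* low word after the check (unchanged if no write) */
    bool     wrote_msr;    /* true if the C1E enable bits were cleared */
    bool     timer_broken; /* true if the LAPIC timer must be treated as broken */
};

/* Models the patched branch: clear C1E when allowed, otherwise report it. */
struct c1e_result c1e_decide(uint32_t lo, bool disable_amd_c1e)
{
    struct c1e_result r = { lo, false, false };

    if ((lo & ENABLE_C1E_MASK) && disable_amd_c1e) {
        /* Workaround active: clear only the C1E bits, keep the rest. */
        r.new_lo = lo & ~ENABLE_C1E_MASK;
        r.wrote_msr = true;
    } else if (lo & ENABLE_C1E_MASK) {
        /* C1E stays on (e.g. force_amd_c1e was given): timer is unusable. */
        r.timer_broken = true;
    }
    return r;
}
```

With the workaround active the C1E bits are cleared and the timer is not reported as broken. With `force_amd_c1e` (modeled here as `disable_amd_c1e == false`) the register is left alone and the timer is reported as broken, which is the pre-patch behavior.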

  2. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    Eduard-Gabriel Munteanu writes:
    >
    > + force_amd_c1e [KNL,SMP,HW,BUGS=X86-64]
    > + Don't disable C1E on AMD systems even if this means


    The description/option is not correct. The mainline kernel never disables C1e.
    Some distribution kernels and Xen do, perhaps you're confusing this
    with them.

    You would rather need a "force_disable_c1e" option if anything.

    Anyway, this should be nearly obsolete with forced HPET. With HPET,
    dynticks can be used even with C1e. So in most cases you can just
    use hpet=force instead and get dynticks and C1e together.

    -Andi

  3. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Thu, 13 Dec 2007 23:33:07 +0100
    Andi Kleen wrote:

    > The description/option is not correct. The mainline kernel never
    > disables C1e. Some distribution kernels and Xen do, perhaps you're
    > confusing this with them.
    >
    > You would rather need a "force_disable_c1e" option if anything.


    The option I added (which is set to Y by default, but that's another
    matter) disables C1E without any other kernel parameter. In my opinion,
    this should be the normal behavior: the kernel has both SMP and NO_HZ
    enabled, so do whatever is necessary to enable dynticks. But it would
    also be useful, for benchmarking purposes, to prevent the kernel from
    disabling C1E using a kernel parameter; that's what force_amd_c1e does.

    > Anyway, this should be nearly obsolete with forced HPET. With HPET,
    > dynticks can be used even with C1e. So in most cases you can just
    > use hpet=force instead and get dynticks and C1e together.


    On my system, hpet=force does not enable dynticks:
    $ dmesg | grep -E "(not functional|hpet)"
    Command line: ro hpet=force
    Kernel command line: ro hpet=force
    hpet clockevent registered
    hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
    hpet0: 3 32-bit timers, 25000000 Hz
    Time: hpet clocksource has been installed.
    Clockevents: could not switch to one-shot mode: lapic is not functional.
    Clockevents: could not switch to one-shot mode: lapic is not functional.
    hpet_resources: 0xfed00000 is busy

    But, using this patch, the kernel enables dynticks.

  4. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    >so do whatever is necessary to enable dynticks.

    dynticks' main purpose is to save power, but C1e saves more power.
    Disabling C1e for dynticks would be a fairly useless default
    trade off.

    -Andi

  5. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007 11:17:21 +0100
    Andi Kleen wrote:

    > >so do whatever is necessary to enable dynticks.

    >
    > dynticks' main purpose is to save power, but C1e saves more power.
    > Disabling C1e for dynticks would be a fairly useless default
    > trade off.


    I see. Also, AMD specs say that either higher C-states are enabled, or
    C1E, but not both at the same time. So if the BIOS doesn't offer an
    option to disable C1E, it can't provide _CST or P_LVL* ACPI
    objects for higher C-states.

    Dynticks also provides lower latencies as far as I know. I think this
    workaround should be merged, even if as a non-default config option.

    I'll resubmit a patch if you agree.

    (By the way, sorry for messing up the "Cc:" fields in the previous
    message.)

  6. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, Dec 14, 2007 at 03:41:06PM +0200, Eduard-Gabriel Munteanu wrote:
    > On Fri, 14 Dec 2007 11:17:21 +0100
    > Andi Kleen wrote:
    >
    > > >so do whatever is necessary to enable dynticks.

    > >
    > > dynticks' main purpose is to save power, but C1e saves more power.
    > > Disabling C1e for dynticks would be a fairly useless default
    > > trade off.

    >
    > I see. Also, AMD specs say that either higher C-states are enabled, or
    > C1E, but not both at the same time. So if the BIOS doesn't offer an


    AMD doesn't support states deeper than C1 on multi core currently, so
    in general they don't matter much right now.

    > Dynticks also provides lower latencies as far as I know.


    The better solution there is to use HPET instead. Newer systems
    generally have HPET already enabled in the BIOS and for older systems
    hpet=force gains more and more support. So try that.

    -Andi

  7. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007 13:20:48 +0100
    Andi Kleen wrote:

    > AMD doesn't support states deeper than C1 on multi core currently, so
    > in general they don't matter much right now.


    Thanks for the info, I wasn't aware of this.

    > The better solution there is to use HPET instead. Newer systems
    > generally have HPET already enabled in the BIOS and for older systems
    > hpet=force gains more and more support. So try that.


    Dynticks won't use the HPET, even if enabled. IIRC, HPET is enabled on
    my system (NVIDIA MCP51) even without "hpet=force". Here's dmesg's
    output on Linux 2.6.24-rc5:

    $ dmesg | grep -Ei "(lapic|hpet|disabling)"
    Command line: ro hpet=force
    ACPI: HPET 3FFA9730, 0038 (r1 A M I OEMHPET0 2000727 MSFT 97)
    ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
    ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
    ACPI: HPET id: 0x10de8201 base: 0xfed00000
    Kernel command line: ro hpet=force
    hpet clockevent registered
    TSC calibrated against HPET
    Disabling APIC timer
    hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
    hpet0: 3 32-bit timers, 25000000 Hz
    Time: hpet clocksource has been installed.
    Clockevents: could not switch to one-shot mode: lapic is not functional.
    Clockevents: could not switch to one-shot mode: lapic is not functional.
    Unpacking initramfs...<6>Clockevents: could not switch to one-shot mode:<6>Clockevents: could not switch to one-shot mode: lapic is not functional.
    lapic is not functional.
    hpet_resources: 0xfed00000 is busy

    LAPIC is seemingly disabled (C1E detection code does this), but
    clockevents still tries to use it, instead of relying on HPET. I'll
    look into this, but please give me a heads up if you know more about
    what's happening. Looks like fixing this is better than using LAPIC for
    dynticks (and disabling C1E) on such systems.

  8. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007, Eduard-Gabriel Munteanu wrote:

    > LAPIC is seemingly disabled (C1E detection code does this), but
    > clockevents still tries to use it, instead of relying on HPET.


    It relies on HPET. The LAPIC is just used as a mechanism which allows
    us to broadcast the tick to both cores.

    > I'll look into this, but please give me a heads up if you know more
    > about what's happening. Looks like fixing this is better than using
    > LAPIC for dynticks (and disabling C1E) on such systems.


    Yes, it's definitely worth fixing, but it's not trivial:

    For Highres/dyntick we need per CPU clock event devices to avoid
    serialization and broadcasting overhead in the normal operation mode.

    The LAPIC timer is per CPU, fast and the best thing we have, except
    for C1E enabled AMD systems.

    The perfect solution for those systems would be to use the HPET
    channels separately as per-CPU clock event devices. Venki tried this
    some time ago, but it's hard to resolve simply because none of the
    BIOSes gives us an idea to which interrupts we can route the HPET
    channel interrupts. The only choice we have is to use the legacy
    interrupts, which gives us the headache of emulating the RTC via the
    HPET and some other stupid legacy issues.

    We can not utilize the broadcast mechanism of the cpuidle code because
    we do not have an idea that we are going into C1E as it is done
    magically in the SMM code. To work around this we would need to add
    the broadcast notification to the halt(), safe_halt(), pm_idle_halt()
    variants which float around in the kernel and make this conditional on
    the C1E detection. That's nasty, but it seems the only solution for
    now.

    Thanks,

    tglx
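The workaround described above (notify the broadcast mechanism around every halt, but only when C1E was detected) can be sketched in userspace as follows. Every name here (`c1e_detected`, `broadcast_enter`, `broadcast_exit`, `cpu_halt`, `c1e_aware_halt`) is a hypothetical stand-in. The stubs only count calls so the control flow can be observed, and they do not model the real kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical state: set once by the C1E detection code. */
bool c1e_detected;

/* Call counters standing in for the real side effects. */
int broadcast_enter_calls;
int broadcast_exit_calls;
int halt_calls;

/* Stub: hand the tick over to a broadcast device before idling. */
void broadcast_enter(void) { broadcast_enter_calls++; }

/* Stub: take the tick back after wakeup. */
void broadcast_exit(void) { broadcast_exit_calls++; }

/* Stub: the actual halt instruction would go here. */
void cpu_halt(void) { halt_calls++; }

/*
 * Idle wrapper: the broadcast notification is conditional on C1E
 * detection, so systems without C1E pay no extra cost.
 */
void c1e_aware_halt(void)
{
    if (c1e_detected)
        broadcast_enter();
    cpu_halt();
    if (c1e_detected)
        broadcast_exit();
}
```

The awkward part raised in the message is not this wrapper itself but that the same conditional would have to be added to each halt variant in the kernel.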




  9. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    > LAPIC is seemingly disabled (C1E detection code does this), but

    It should only disable the LAPIC timer, but not the full use of the
    LAPIC.

    -Andi

  10. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    > magically in the SMM code. To work around this we would need to add
    > the broadcast notification to the halt(), safe_halt(), pm_idle_halt()
    > variants which float around in the kernel and make this conditional on
    > the C1E detection. That's nasty, but it seems the only solution for
    > now.


    On 64bit it would be easy using the idle notifiers. Perhaps they need
    to be extended to pass in the sleep state though.

    -Andi

  11. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007, Andi Kleen wrote:

    > > LAPIC is seemingly disabled (C1E detection code does this), but

    >
    > It should only disable the LAPIC timer, but not the full use of the
    > LAPIC.


    That's what it does. The LAPIC timer is invalidated and registered as
    a per CPU broadcast dummy source (CLOCK_EVT_FEAT_DUMMY).

    Thanks,

    tglx
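A simplified model of why a timer registered as a dummy source blocks the switch to one-shot mode, assuming a device carrying the dummy flag is what the log reports as "not functional". The flag values, `struct clock_dev`, and both function names are illustrative and are not claimed to match the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative feature flags; the values are assumptions. */
#define FEAT_ONESHOT 0x1u
#define FEAT_DUMMY   0x2u

/* Hypothetical, stripped-down clock event device. */
struct clock_dev {
    const char  *name;
    unsigned int features;
};

/* A dummy device only receives broadcast ticks; it cannot be programmed. */
bool device_is_functional(const struct clock_dev *dev)
{
    return !(dev->features & FEAT_DUMMY);
}

/* Returns 0 if one-shot mode can be used on this device, -1 otherwise. */
int switch_to_oneshot(const struct clock_dev *dev)
{
    if (dev == NULL)
        return -1; /* no tick device at all */
    if (!device_is_functional(dev))
        return -1; /* "lapic is not functional" case */
    if (!(dev->features & FEAT_ONESHOT))
        return -1; /* device cannot do one-shot */
    return 0;
}
```

In this model, a LAPIC timer marked as a dummy fails the check regardless of what the HPET can do, which is consistent with the dmesg output quoted earlier in the thread.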

  12. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)


    Thanks to both of you for shedding some light on this matter. I'll look
    into HPET-related efforts; it looks like a better solution than my
    patch.

  13. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007, Andi Kleen wrote:

    > > magically in the SMM code. To work around this we would need to add
    > > the broadcast notification to the halt(), safe_halt(), pm_idle_halt()
    > > variants which float around in the kernel and make this conditional on
    > > the C1E detection. That's nasty, but it seems the only solution for
    > > now.

    >
    > On 64bit it would be easy using the idle notifiers. Perhaps they need
    > to be extended to pass in the sleep state though.


    Well, that would interfere with the acpi-idle code.

    Anyway, the idle notifiers are a pretty artificial interface, which is
    on my get-rid-of-it list.

    Thanks,

    tglx


  14. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    > Well, that would interfere with the acpi-idle code.

    How so? idle notifiers should work for acpi idle too.

    > Anyway, the idle notifiers are a pretty artificial interface, which is
    > on my get-rid-of-it list.


    The original use cases were:
    - Accounting for idle time with stopped counters in oprofile, which
      would eliminate the need for idle=poll.
      That was never implemented, unfortunately, but I think it would
      still be a worthy feature.
    - Perfmon, for similar uses.
    - noidletick -- my original noidletick implementation used the idle
      notifiers, similar to the s390 implementation. Obsolete now.

    Right now it is used for the machine check early notification, but I never
    liked that (imho it just "fixes" a non problem only relevant
    in non realistic testing situations) and it would be fine to drop that one
    I think. In fact I already made it a config in the patchkit to use the 64bit
    mce code on 32bit too.

    -Andi


  15. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On 12/14/2007 05:17 AM, Andi Kleen wrote:
    >> so do whatever is necessary to enable dynticks.

    >
    > dynticks' main purpose is to save power, but C1e saves more power.
    > Disabling C1e for dynticks would be a fairly useless default
    > trade off.
    >


    What about machines where the BIOS has disabled C1e on CPU 0 but
    left it enabled on CPU 1 ??

  16. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, 14 Dec 2007 17:35:13 -0500
    Chuck Ebbert wrote:

    > On 12/14/2007 05:17 AM, Andi Kleen wrote:
    > >> so do whatever is necessary to enable dynticks.

    > >
    > > dynticks' main purpose is to save power, but C1e saves more power.
    > > Disabling C1e for dynticks would be a fairly useless default
    > > trade off.
    > >

    >
    > What about machines where the BIOS has disabled C1e on CPU 0 but
    > left it enabled on CPU 1 ??


    Do you mean Linux should enable C1E on CPU 0 if it's detected on CPU 1?
    C3 + dynticks make up a better power saver than simply C1E, as far as I
    know. Higher C-states should be enabled on such CPUs, as AMD docs say
    firmware should either enable C1E or C2 & C3 (it must provide one of
    these mutually exclusive options). I take having C1E on the second CPU
    but not the first as an attempt on BIOS's part to provide higher
    C-states instead of the former. How broken is it, really?

    But maybe someone with access to such hardware can tell us what
    happens: does he get C2/C3 power states under such circumstances?

  17. Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

    On Fri, Dec 14, 2007 at 05:35:13PM -0500, Chuck Ebbert wrote:
    > On 12/14/2007 05:17 AM, Andi Kleen wrote:
    > >> so do whatever is necessary to enable dynticks.

    > >
    > > dynticks' main purpose is to save power, but C1e saves more power.
    > > Disabling C1e for dynticks would be a fairly useless default
    > > trade off.
    > >

    >
    > What about machines where the BIOS has disabled C1e on CPU 0 but
    > left it enabled on CPU 1 ??


    It's a BIOS bug. We handle this and treat it as if C1e were always enabled.

    The right fix would be to enable it on both CPUs, but
    then that's not strictly needed for correct operation and I'm not
    sure it would be worth special fixup code in Linux. Probably not.

    -Andi
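Treating a mixed configuration as "C1e always enabled" amounts to a logical OR over all CPUs. Here is a minimal sketch, assuming the mask value from the patch at the top of the thread. `c1e_enabled_anywhere` is a hypothetical helper, and the per-CPU MSR low words are passed in as a plain array instead of being read from hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask taken from the patch: bits 27 and 28 of the low MSR word. */
#define ENABLE_C1E_MASK 0x18000000u

/*
 * Returns true if any CPU has a C1E enable bit set. A single hit is
 * enough to treat the whole system as C1E-enabled.
 */
bool c1e_enabled_anywhere(const uint32_t *msr_lo, size_t ncpus)
{
    for (size_t i = 0; i < ncpus; i++) {
        if (msr_lo[i] & ENABLE_C1E_MASK)
            return true;
    }
    return false;
}
```

For the case raised in the question (C1e off on CPU 0, on for CPU 1) this returns true, so the conservative path is taken for both CPUs.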