[PATCH 00/10] NR_CPUS: third reduction of NR_CPUS memory usage x86-version v2 - Kernel

This is a discussion on [PATCH 00/10] NR_CPUS: third reduction of NR_CPUS memory usage x86-version v2 - Kernel ; * Mike Travis wrote: > >> + int n = 0; > >> + int len = cpumask_scnprintf_len(nr_cpu_ids); > >> + char *mask_str = kmalloc(len, GFP_KERNEL); > >> + > >> + if (mask_str) { > >> + cpumask_scnprintf(mask_str, len, ...

+ Reply to Thread
Page 2 of 2 FirstFirst 1 2
Results 21 to 28 of 28

Thread: [PATCH 00/10] NR_CPUS: third reduction of NR_CPUS memory usage x86-version v2

  1. Re: [PATCH 06/10] x86: reduce memory and stack usage in intel_cacheinfo


    * Mike Travis wrote:

    > >> + int n = 0;
    > >> + int len = cpumask_scnprintf_len(nr_cpu_ids);
    > >> + char *mask_str = kmalloc(len, GFP_KERNEL);
    > >> +
    > >> + if (mask_str) {
    > >> + cpumask_scnprintf(mask_str, len, this_leaf->shared_cpu_map);
    > >> + n = sprintf(buf, "%s\n", mask_str);
    > >> + kfree(mask_str);
    > >> + }
    > >> + return n;

    > >
    > > the other changes look good, but this one looks a bit ugly and complex.
    > > We basically want to sprintf shared_cpu_map into 'buf', but we do that
    > > by first allocating a temporary buffer, print a string into it, then
    > > print that string into another buffer ...
    > >
    > > this very much smells like an API bug in cpumask_scnprintf() - why dont
    > > you create a cpumask_scnprintf_ptr() API that takes a pointer to a
    > > cpumask? Then this change would become a trivial and much more readable:
    > >
    > > - char mask_str[NR_CPUS];
    > > - cpumask_scnprintf(mask_str, NR_CPUS, this_leaf->shared_cpu_map);
    > > - return sprintf(buf, "%s\n", mask_str);
    > > + return cpumask_scnprintf_ptr(buf, NR_CPUS, &this_leaf->shared_cpu_map);
    > >
    > > Ingo

    >
    > The main goal was to avoid allocating 4096 bytes when only 32 would do
    > (characters needed to represent nr_cpu_ids cpus instead of NR_CPUS
    > cpus.) But I'll look at cleaning it up a bit more. It wouldn't have
    > to be a function if CHUNKSZ in cpumask_scnprintf() were visible (or a
    > non-changeable constant.)


    well, do we care about allocating 4096 bytes, as long as we also free
    it? It's not like we need to clear all the bytes or something. Am i
    missing something here?

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH 06/10] x86: reduce memory and stack usage in intel_cacheinfo

    Ingo Molnar wrote:
    >> The main goal was to avoid allocating 4096 bytes when only 32 would do
    >> (characters needed to represent nr_cpu_ids cpus instead of NR_CPUS
    >> cpus.) But I'll look at cleaning it up a bit more. It wouldn't have
    >> to be a function if CHUNKSZ in cpumask_scnprintf() were visible (or a
    >> non-changeable constant.)

    >
    > well, do we care about allocating 4096 bytes, as long as we also free
    > it? It's not like we need to clear all the bytes or something. Am i
    > missing something here?


    Well, 32 bytes fits on the stack, whereas 4096 bytes requires allocating
    a page -- which means either taking the risk of failing or blocking. Of
    course, we're doing this for output, which has the same issue.

    -hpa
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [PATCH 06/10] x86: reduce memory and stack usage in intel_cacheinfo


    * H. Peter Anvin wrote:

    > Ingo Molnar wrote:
    >>> The main goal was to avoid allocating 4096 bytes when only 32 would do
    >>> (characters needed to represent nr_cpu_ids cpus instead of NR_CPUS cpus.)
    >>> But I'll look at cleaning it up a bit more. It wouldn't have to be a
    >>> function if CHUNKSZ in cpumask_scnprintf() were visible (or a
    >>> non-changeable constant.)

    >>
    >> well, do we care about allocating 4096 bytes, as long as we also free it?
    >> It's not like we need to clear all the bytes or something. Am i missing
    >> something here?

    >
    > Well, 32 bytes fits on the stack, whereas 4096 bytes requires
    > allocating a page -- which means either taking the risk of failing or
    > blocking. Of course, we're doing this for output, which has the same
    > issue.


    hm, i thought this was all implemented via dynamic allocation already,
    within the cpumask_scnprintf function. But i see it doesnt do it - i
    guess a new call could be introduced, cpumask_scnprintf_ptr() which
    passes in a cpumask pointer and does dynamic allocation itself?

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH 06/10] x86: reduce memory and stack usage in intel_cacheinfo

    Jeremy Fitzhardinge wrote:
    > Mike Travis wrote:
    >> Hmm, I hadn't thought of that. There is commonly a format spec called
    >> %b for diags, etc. to print bit strings. Maybe something like:
    >>
    >> "... %*b ...", nr_cpu_ids, ptr_to_bitmap
    >>
    >> where the length arg is rounded up to 32 or 64 bits...?

    >
    > I think that would need to be %.*b, but I always need to try it both
    > ways anyway...
    >
    > But yes, that seems like the right way to go.


    I had the same thought after hitting return.

    But for this case, I was over thinking the problem. Turns out that the
    number of cpus in a leaf will be fairly small, even with new cpus around
    the corner (maybe 64 or 128 cpu threads per leaf?)

    So I dropped the cpumask_scnprintf_len() patch and have a new intel_cacheinfo
    patch which I'll send in a separate message.

    Thanks,
    Mike
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [PATCH 06/10] x86: reduce memory and stack usage in intel_cacheinfo

    Ingo Molnar wrote:
    > * H. Peter Anvin wrote:
    >
    >> Ingo Molnar wrote:
    >>>> The main goal was to avoid allocating 4096 bytes when only 32 would do
    >>>> (characters needed to represent nr_cpu_ids cpus instead of NR_CPUS cpus.)
    >>>> But I'll look at cleaning it up a bit more. It wouldn't have to be a
    >>>> function if CHUNKSZ in cpumask_scnprintf() were visible (or a
    >>>> non-changeable constant.)
    >>> well, do we care about allocating 4096 bytes, as long as we also free it?
    >>> It's not like we need to clear all the bytes or something. Am i missing
    >>> something here?

    >> Well, 32 bytes fits on the stack, whereas 4096 bytes requires
    >> allocating a page -- which means either taking the risk of failing or
    >> blocking. Of course, we're doing this for output, which has the same
    >> issue.

    >
    > hm, i thought this was all implemented via dynamic allocation already,
    > within the cpumask_scnprintf function. But i see it doesnt do it - i
    > guess a new call could be introduced, cpumask_scnprintf_ptr() which
    > passes in a cpumask pointer and does dynamic allocation itself?
    >
    > Ingo


    Here's a snippet of the new patch. This works fine (I think) for
    cpus on a leaf. The sched_debug_one problem should work the same way,
    hopefully ;-)

    [sorry, cut and pasted so no tabs]

    static ssize_t show_shared_cpu_map(struct _cpuid4_info *this_leaf, char *buf)
    {
    - char mask_str[NR_CPUS];
    - cpumask_scnprintf(mask_str, NR_CPUS, this_leaf->shared_cpu_map);
    - return sprintf(buf, "%s\n", mask_str);
    + /*
    + * cpulist_scnprintf() has the advantage of compressing
    + * consecutive cpu numbers into a single range which seems
    + * appropriate for cpus on a leaf. This will change what is
    + * output so scripts that process the output will have to change.
    + * The good news is that the output format is compatible
    + * with cpulist_parse() [bitmap_parselist()].
    + *
    + * Have to guess at output buffer size... 128 seems reasonable
    + * to represent all cpus on a leaf in the worst case, like
    + * if all cpus are non-consecutive and large numbers.
    + */
    + return cpulist_scnprintf(buf, 128, this_leaf->shared_cpu_map);
    }

    Thanks,
    Mike
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH 02/10] init: move setup of nr_cpu_ids to as early as possible v2

    Ingo Molnar wrote:
    > * Mike Travis wrote:
    >
    >> Ingo Molnar wrote:
    >>> * Mike Travis wrote:
    >>>
    >>>> Move the setting of nr_cpu_ids from sched_init() to
    >>>> setup_per_cpu_areas(), so that it's available as early as possible.
    >>> hm, why not a separate call before setup_per_cpu_areas(), so that we can
    >>> avoid spreading this from generic kernel into a bunch of architectures
    >>> that happen to have their own version of setup_per_cpu_areas():
    >>>
    >>>> 7 files changed, 43 insertions(+), 15 deletions(-)
    >>> Ingo

    >> I had this before but I then discovered that an arch would increase
    >> (and possible decrease) it's number of possible cpus in
    >> setup_per_cpu_areas(). So I figured that setting nr_cpu_ids (and the
    >> cpumask_of_cpu map) should be a side effect of setup_per_cpu_areas().

    >
    > well, then why not do it shortly after setup_per_cpu_areas()? That still
    > moves it earlier than sched_init() but doesnt export all this code and
    > complexity toevery setup_per_cpu_areas() implementation. (which clearly
    > didnt need this complexity before)
    >
    > Ingo



    Ok, will do.

    Thanks,
    Mike
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [PATCH 01/10] x86_64: Cleanup non-smp usage of cpu maps v2

    Ingo Molnar wrote:
    > * Mike Travis wrote:
    >
    >> Cleanup references to the early cpu maps for the non-SMP configuration
    >> and remove some functions called for SMP configurations only.

    >
    > thanks, applied.
    >
    > one observation:
    >
    >> +#ifdef CONFIG_SMP
    >> extern int x86_cpu_to_node_map_init[];
    >> extern void *x86_cpu_to_node_map_early_ptr;
    >> +#else
    >> +#define x86_cpu_to_node_map_early_ptr NULL
    >> +#endif

    >
    > Right now all these early_ptrs are in essence open-coded "early
    > per-cpu", right? But shouldnt we solve that in a much cleaner way: by
    > explicitly adding an early-per-cpu types and accessors, and avoid all
    > that #ifdeffery?
    >
    > Ingo


    How about something like the below? (I haven't tried compiling it yet.)

    [I also thought about not restricting it to only NR_CPUS type variables
    to allow for example, node-local node maps/variables.]

    Thanks,
    Mike
    ------------------------------------------------------------
    include/linux/percpu.h:

    #ifdef CONFIG_SMP
    #define DEFINE_EARLY_PER_CPU(type, name, initvalue) \
    DEFINE_PER_CPU(type, name) = initvalue; \
    type name##_early_map[NR_CPUS] __initdata = \
    { [0 ... NR_CPUS-1] = initvalue; } \
    type *name##_early_ptr = name##_early_map

    #define DECLARE_EARLY_PER_CPU(type, name) \
    DECLARE_PER_CPU(type, name); \
    extern type *name##_early_ptr; \
    extern type name##_early_map[]

    #define EXPORT_EARLY_PER_CPU(name) \
    EXPORT_PER_CPU(name)

    /* rvalue only */
    #define early_per_cpu(name, cpu) \
    (name##ptr? name##ptr[cpu] : per_cpu(name, cpu))
    #define early_per_cpu_ptr(name) (name##_early_ptr)
    #define early_per_cpu_map(name, idx) (name##_early_map[idx])

    #else /* !CONFIG_SMP */
    #define DEFINE_EARLY_PER_CPU(type, name, initvalue) \
    DEFINE_PER_CPU(type, name) = initvalue

    #define DECLARE_EARLY_PER_CPU(name) \
    DECLARE_PER_CPU(name)

    #define EXPORT_EARLY_PER_CPU(name) \
    EXPORT_PER_CPU(name)

    #define early_per_cpu(name, cpu) per_cpu(name, cpu)
    #define early_per_cpu_ptr(name) NULL
    /* no early_per_cpu_map() */

    #endif /* !CONFIG_SMP */


    ------------------------------------------------------------
    include/asm-x86/smp.h:

    DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
    DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);

    ------------------------------------------------------------
    arch/x86/kernel/setup.c:

    /* which logical CPU number maps to which CPU (physical APIC ID) */
    DEFINE_EARLY_PER_CPU(u16, x86_cpu_to_apicid, BAD_APICID);
    DEFINE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid, BAD_APICID);
    EXPORT_EARLY_PER_CPU(x86_cpu_to_apicid);
    EXPORT_EARLY_PER_CPU(x86_bios_cpu_apicid);

    #ifdef CONFIG_NUMA
    DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node, NUMA_NO_NODE);
    EXPORT_EARLY_PER_CPU(x86_cpu_to_node);
    #endif

    ....

    #if defined(CONFIG_HAVE_SETUP_PER_CPU_AREA) && defined(CONFIG_SMP)
    /*
    * Copy data used in early init routines from the initial arrays to the
    * per cpu data areas. These arrays then become expendable and the
    * *_early_ptr's are zeroed indicating that the static arrays are gone.
    */
    static void __init setup_per_cpu_maps(void)
    {
    int cpu;

    for_each_possible_cpu(cpu) {
    per_cpu(x86_cpu_to_apicid, cpu) =
    early_per_cpu_map(x86_cpu_to_apicid, cpu);
    per_cpu(x86_bios_cpu_apicid, cpu) =
    early_per_cpu_map(x86_bios_cpu_apicid, cpu);
    #ifdef CONFIG_NUMA
    per_cpu(x86_cpu_to_node_map, cpu) =
    early_per_cpu_map(x86_cpu_to_node_map, cpu);
    #endif
    }

    /* indicate the early static arrays will soon be gone */
    early_per_cpu_ptr(x86_cpu_to_apicid) = NULL;
    early_per_cpu_ptr(x86_bios_cpu_apicid) = NULL;
    #ifdef CONFIG_NUMA
    early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
    #endif
    ...

    ------------------------------------------------------------
    arch/x86/mm/numa_64.c:

    void __cpuinit numa_set_node(int cpu, int node)
    {
    int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);

    if(cpu_to_node_map)
    cpu_to_node_map[cpu] = node;
    else if(per_cpu_offset(cpu))
    per_cpu(x86_cpu_to_node_map, cpu) = node;
    ....
    void __init init_cpu_to_node(void)
    {
    int i;

    for (i = 0; i < NR_CPUS; i++) {
    int node;
    u16 apicid = early_per_cpu(x86_cpu_to_apicid, i);

    if (apicid == BAD_APICID)
    continue;
    ....

    ------------------------------------------------------------
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH 01/10] x86_64: Cleanup non-smp usage of cpu maps v2


    * Mike Travis wrote:

    > How about something like the below? (I haven't tried compiling it
    > yet.)


    looks good to me - the closer it is to the real API, the better.

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread
Page 2 of 2 FirstFirst 1 2