[PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code - Kernel

This is a discussion on [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code - Kernel ; Use cpuset.stack_list rather than kfifo, so we avoid memory allocation for kfifo. Signed-off-by: Li Zefan Signed-off-by: Lai Jiangshan --- kernel/cpuset.c | 21 ++++++++------------- 1 files changed, 8 insertions(+), 13 deletions(-) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 3624dc0..d0c57ac 100644 --- a/kernel/cpuset.c +++ ...

+ Reply to Thread
Results 1 to 10 of 10

Thread: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

  1. [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Use cpuset.stack_list rather than kfifo, so we avoid memory allocation
    for kfifo.

    Signed-off-by: Li Zefan
    Signed-off-by: Lai Jiangshan
    ---
    kernel/cpuset.c | 21 ++++++++-------------
    1 files changed, 8 insertions(+), 13 deletions(-)

    diff --git a/kernel/cpuset.c b/kernel/cpuset.c
    index 3624dc0..d0c57ac 100644
    --- a/kernel/cpuset.c
    +++ b/kernel/cpuset.c
    @@ -54,7 +54,6 @@
    #include
    #include
    #include
    -#include
    #include
    #include

    @@ -532,7 +531,7 @@ update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
    * So the reverse nesting would risk an ABBA deadlock.
    *
    * The three key local variables below are:
    - * q - a kfifo queue of cpuset pointers, used to implement a
    + * q - a linked-list queue of cpuset pointers, used to implement a
    * top-down scan of all cpusets. This scan loads a pointer
    * to each cpuset marked is_sched_load_balance into the
    * array 'csa'. For our purposes, rebuilding the schedulers
    @@ -567,7 +566,7 @@ update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)

    void rebuild_sched_domains(void)
    {
    - struct kfifo *q; /* queue of cpusets to be scanned */
    + LIST_HEAD(q); /* queue of cpusets to be scanned*/
    struct cpuset *cp; /* scans q */
    struct cpuset **csa; /* array of all cpuset ptrs */
    int csn; /* how many cpuset ptrs in csa so far */
    @@ -577,7 +576,6 @@ void rebuild_sched_domains(void)
    int ndoms; /* number of sched domains in result */
    int nslot; /* next empty doms[] cpumask_t slot */

    - q = NULL;
    csa = NULL;
    doms = NULL;
    dattr = NULL;
    @@ -597,20 +595,19 @@ void rebuild_sched_domains(void)
    goto rebuild;
    }

    - q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);
    - if (IS_ERR(q))
    - goto done;
    csa = kmalloc(number_of_cpusets * sizeof(cp), GFP_KERNEL);
    if (!csa)
    goto done;
    csn = 0;

    - cp = &top_cpuset;
    - __kfifo_put(q, (void *)&cp, sizeof(cp));
    - while (__kfifo_get(q, (void *)&cp, sizeof(cp))) {
    + list_add(&top_cpuset.stack_list, &q);
    + while (!list_empty(&q)) {
    struct cgroup *cont;
    struct cpuset *child; /* scans child cpusets of cp */

    + cp = list_first_entry(&q, struct cpuset, stack_list);
    + list_del(q.next);
    +
    if (cpus_empty(cp->cpus_allowed))
    continue;

    @@ -619,7 +616,7 @@ void rebuild_sched_domains(void)

    list_for_each_entry(cont, &cp->css.cgroup->children, sibling) {
    child = cgroup_cs(cont);
    - __kfifo_put(q, (void *)&child, sizeof(cp));
    + list_add_tail(&child->stack_list, &q);
    }
    }

    @@ -702,8 +699,6 @@ rebuild:
    put_online_cpus();

    done:
    - if (q && !IS_ERR(q))
    - kfifo_free(q);
    kfree(csa);
    /* Don't kfree(doms) -- partition_sched_domains() does that. */
    /* Don't kfree(dattr) -- partition_sched_domains() does that. */
    --
    1.5.4.rc3
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Li Zefan wrote:
    > + list_add(&top_cpuset.stack_list, &q);


    Can you figure out if we are holding any system wide lock, such as
    cgroup_mutex, at this point? There is only one top_cpuset.stack_list
    in the system, so I would think that we need to single thread its usage
    somehow.

    The hotplug hooks, such as common_cpu_mem_hotplug_unplug() that were
    added sometime in the last few months (while the so called maintainer,
    who's initials seem to be 'pj', was asleep don't describe what
    locking applies to them very well, and they call rebuild_sched_domains(),
    where the above line of code lives.

    --
    I won't rest till it's the best ...
    Programmer, Linux Scalability
    Paul Jackson 1.940.382.4214
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    pj wrote:
    > Can you figure out if we are holding any system wide lock, such as
    > cgroup_mutex, at this point?


    Nevermind ... this is in rebuild_sched_domains(), which must be
    called with cgroup_mutex held. So we danged well better be holding
    a system wide lock.

    --
    I won't rest till it's the best ...
    Programmer, Linux Scalability
    Paul Jackson 1.940.382.4214
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Li Zefan wrote:
    > - q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);



    The block comment for rebuild_sched_domains() states:

    > ... May take callback_mutex during
    > * call due to the kfifo_alloc() and kmalloc() calls.


    I suspect that mention of kfifo_alloc() is no longer correct,
    with your patch. If so, perhaps you could send a little additional
    fix patch, to remove that mention from the comment.

    --
    I won't rest till it's the best ...
    Programmer, Linux Scalability
    Paul Jackson 1.940.382.4214
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. references:in-reply-to:content-type:

    Paul Jackson wrote:
    > Li Zefan wrote:
    >> - q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);

    >
    >
    > The block comment for rebuild_sched_domains() states:
    >
    >> ... May take callback_mutex during
    >> * call due to the kfifo_alloc() and kmalloc() calls.

    >
    > I suspect that mention of kfifo_alloc() is no longer correct,
    > with your patch. If so, perhaps you could send a little additional
    > fix patch, to remove that mention from the comment.
    >


    Paul, please take a look at
    cpuset: Rework sched domains and CPU hotplug handling
    patch I sent out last week.
    I'd appreciate if we applied that one first. It simplifies lock nesting
    and rearranges the way sched domains are rebuilt. It is IMO a bit higher
    priority than this patch because scheduler depends on the
    rebuild_sched_domains() call and we have to call rebuild_sched_domains()
    from cpu hotplug handlers.

    Max

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    > Paul, please take a look at
    > cpuset: Rework sched domains and CPU hotplug handling
    > patch I sent out last week.
    > I'd appreciate if we applied that one first. It simplifies lock nesting
    > and rearranges the way sched domains are rebuilt. It is IMO a bit higher
    > priority than this patch because scheduler depends on the
    > rebuild_sched_domains() call and we have to call rebuild_sched_domains()
    > from cpu hotplug handlers.
    >


    I don't mind which one gets first.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Paul Jackson wrote:
    > Li Zefan wrote:
    >> - q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);

    >
    >
    > The block comment for rebuild_sched_domains() states:
    >
    >> ... May take callback_mutex during
    >> * call due to the kfifo_alloc() and kmalloc() calls.

    >
    > I suspect that mention of kfifo_alloc() is no longer correct,
    > with your patch. If so, perhaps you could send a little additional
    > fix patch, to remove that mention from the comment.
    >


    Yes, but I just noticed Max removes that comment completely in his patch,
    so I guess I can leave it as it is.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Paul Jackson wrote:
    > pj wrote:
    >> Can you figure out if we are holding any system wide lock, such as
    >> cgroup_mutex, at this point?

    >
    > Nevermind ... this is in rebuild_sched_domains(), which must be
    > called with cgroup_mutex held. So we danged well better be holding
    > a system wide lock.
    >


    Yes, and even after Max's patch, the code playing with ->stack_list is
    still protected by cgroup_mutex.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Li Zefan wrote:
    > Yes, but I just noticed Max removes that comment completely in his patch,
    > so I guess I can leave it as it is.


    Yeah - leave it as it is.

    --
    I won't rest till it's the best ...
    Programmer, Linux Scalability
    Paul Jackson 1.940.382.4214
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code

    Max wrote:
    > Paul, please take a look at
    > cpuset: Rework sched domains and CPU hotplug handling
    > patch I sent out last week.


    Ok - I've been trying to get to this for the last two days,
    but my good employer has reorganized me twice so far this week .

    If I can keep the same boss for more than eight hours,
    I should find time to get to this today.

    --
    I won't rest till it's the best ...
    Programmer, Linux Scalability
    Paul Jackson 1.940.382.4214
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread