Thread: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

  1. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)


    Gregory Haskins wrote:
    > Peter Zijlstra wrote:
    >> I'm thinking doing it explicitly with the new cpu mask is clearer and
    >> easier to understand than 'hiding' the variable in the root domain and
    >> having to understand all the domain/root-domain stuff.
    >>
    >> I think this was Linus' main point. It should be easier to understand
    >> this code.

    >
    > While I can appreciate this sentiment, note that IMO we conceptually
    > require that the root-domain masks be present. E.g. we really don't
    > want to migrate within just cpu_active_map, but rather within
    > rq->rd->span & cpu_active_map (otherwise we could route outside of a
    > disjoint cpuset). And this is precisely what rq->rd->online is (a
    > cached version of cpu_active_map & rq->rd->span).
    >
    > So while I can understand the motivation to keep things simple, note that
    > I tried to design the root-domain stuff to be as simple as possible while
    > still meeting the requirements for working within disjoint sets. I am
    > open to other suggestions, but I see nothing particularly complex or
    > wrong with what's going on there currently. Perhaps this very
    > conversation is evidence that I needed to comment better.
    >
    >>
    >> So, if there is functional overlap with the root domain stuff, it might
    >> be good to remove that bit and use the cpu_active_map stuff for that
    >> instead.

    >
    > I think we would be doing the code that does use it a disservice, per above.
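
    For concreteness, the invariant Greg is defending reduces to a mask
    intersection. A toy model in plain userspace C (cpumasks modelled as
    bitmasks, variable names illustrative; this is a sketch, not kernel code):

    #include <stdio.h>

    /* Toy model: rq->rd->online caches (rq->rd->span & cpu_active_map),
     * so migration targets can never escape a disjoint cpuset. */
    int main(void)
    {
            unsigned long rd_span        = 0x0ful; /* cpus 0-3: this root domain   */
            unsigned long cpu_active_map = 0xfeul; /* cpu 0 is being hot-unplugged */

            /* Migrating within "just cpu_active_map" could pick cpus 4-7,
             * which lie outside this root domain; the intersection cannot. */
            unsigned long rd_online = rd_span & cpu_active_map;

            printf("rd->online = 0x%02lx (cpus 1-3)\n", rd_online);
            return 0;
    }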


    Sorry for the delay. I finally had a chance to read through this thread again
    and to look at the rq->rd->online logic.

    One thing that came up during the original discussion with Linus and Dmitry
    is that cpusets can call partition_sched_domains() at any time (a user
    writes into /dev/cpuset/...), but the scheduler used to rely on a certain
    sequence of domain creation/deletion during cpu hotplug. That's exactly
    what caused the problem in the first place: my patch that fixed domain
    destruction on hotplug events changed the sequence the scheduler relied on,
    and things broke.

    Greg, correct me if I'm wrong, but we seem to have the exact same issue
    with the rq->rd->online map. Let's take "cpu going down" for example.
    We're clearing the rq->rd->online bit on the DYING event, but nothing
    AFAICS prevents another cpu from calling
    rebuild_sched_domains()->partition_sched_domains() in the middle of the
    hotplug sequence. partition_sched_domains() will happily reset the
    rq->rd->online mask and things will fail. I'm talking about this path:

    __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
        ...
        cpu_set(rq->cpu, rd->span);
        if (cpu_isset(rq->cpu, cpu_online_map))
                set_rq_online(rq);
        ...
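
    A toy replay of that interleaving in plain userspace C (bitmask model,
    hypothesised timeline; a sketch, not kernel code):

    #include <stdio.h>

    /* cpu1 is mid-offline: its DYING notifier has run, but it has not yet
     * been removed from cpu_online_map. */
    int main(void)
    {
            unsigned long cpu_online_map = 0x3ul; /* cpus 0,1 */
            unsigned long rd_online      = 0x3ul;

            /* Step 1, hotplug path: CPU_DYING -> set_rq_offline() on cpu1. */
            rd_online &= ~(1ul << 1);

            /* Step 2, concurrent cpuset write -> partition_sched_domains()
             * -> rq_attach_root(): cpu1 is still in cpu_online_map, so
             * set_rq_online() puts the bit straight back. */
            if (cpu_online_map & (1ul << 1))
                    rd_online |= 1ul << 1;

            printf("rd->online = 0x%lx (cpu1 wrongly marked online)\n", rd_online);
            return 0;
    }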

    --

    BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    introduced? I.e., instead of using cpu_online_map directly.

    Max

  2. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    Max Krasnyansky wrote:
    > Greg, correct me if I'm wrong, but we seem to have the exact same issue
    > with the rq->rd->online map. Let's take "cpu going down" for example.
    > We're clearing the rq->rd->online bit on the DYING event, but nothing
    > AFAICS prevents another cpu from calling
    > rebuild_sched_domains()->partition_sched_domains() in the middle of the
    > hotplug sequence. partition_sched_domains() will happily reset the
    > rq->rd->online mask and things will fail. I'm talking about this path:
    >
    > __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    >     ...
    >     cpu_set(rq->cpu, rd->span);
    >     if (cpu_isset(rq->cpu, cpu_online_map))
    >             set_rq_online(rq);
    >     ...


    I think you are right, but wouldn't s/online/active above fix that as
    well? The active_map didn't exist at the time that code first went in.

    > --
    >
    > BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    > introduced? I.e., instead of using cpu_online_map directly.
    >

    I think things were converted where they made sense to convert. But we
    also had a different goal at that time, so perhaps something was
    missed. If you think something else should be converted, please point
    it out.

    In the meantime, I would suggest we consider this patch on top of yours
    (applies to tip/sched/devel):

    ----------------------

    sched: Fully integrate cpus_active_map and root-domain code

    Signed-off-by: Gregory Haskins

    diff --git a/kernel/sched.c b/kernel/sched.c
    index 62b1b8e..99ba70d 100644
    --- a/kernel/sched.c
    +++ b/kernel/sched.c
    @@ -6611,7 +6611,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
             rq->rd = rd;

             cpu_set(rq->cpu, rd->span);
    -        if (cpu_isset(rq->cpu, cpu_online_map))
    +        if (cpu_isset(rq->cpu, cpu_active_map))
                     set_rq_online(rq);

             spin_unlock_irqrestore(&rq->lock, flags);
    diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
    index 7f70026..2bae8de 100644
    --- a/kernel/sched_fair.c
    +++ b/kernel/sched_fair.c
    @@ -1004,7 +1004,7 @@ static void yield_task_fair(struct rq *rq)
      * search starts with cpus closest then further out as needed,
      * so we always favor a closer, idle cpu.
      * Domains may include CPUs that are not usable for migration,
    - * hence we need to mask them out (cpu_active_map)
    + * hence we need to mask them out (rq->rd->online)
      *
      * Returns the CPU we should wake onto.
      */
    @@ -1032,7 +1032,7 @@ static int wake_idle(int cpu, struct task_struct *p)
                 || ((sd->flags & SD_WAKE_IDLE_FAR)
                     && !task_hot(p, task_rq(p)->clock, sd))) {
                 cpus_and(tmp, sd->span, p->cpus_allowed);
    -            cpus_and(tmp, tmp, cpu_active_map);
    +            cpus_and(tmp, tmp, task_rq(p)->rd->online);
                 for_each_cpu_mask(i, tmp) {
                     if (idle_cpu(i)) {
                         if (i != task_cpu(p)) {
    diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
    index 24621ce..d93169d 100644
    --- a/kernel/sched_rt.c
    +++ b/kernel/sched_rt.c
    @@ -936,13 +936,6 @@ static int find_lowest_rq(struct task_struct *task)
             return -1; /* No targets found */

         /*
    -     * Only consider CPUs that are usable for migration.
    -     * I guess we might want to change cpupri_find() to ignore those
    -     * in the first place.
    -     */
    -    cpus_and(*lowest_mask, *lowest_mask, cpu_active_map);
    -
    -    /*
         * At this point we have built a mask of cpus representing the
         * lowest priority tasks in the system. Now we want to elect
         * the best one based on our affinity and topology.

    --------------

    Regards,
    -Greg




  3. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    On Tue, 2008-07-22 at 10:06 -0400, Gregory Haskins wrote:
    > Max Krasnyansky wrote:
    > > Greg, correct me if I'm wrong, but we seem to have the exact same issue
    > > with the rq->rd->online map. Let's take "cpu going down" for example.
    > > We're clearing the rq->rd->online bit on the DYING event, but nothing
    > > AFAICS prevents another cpu from calling
    > > rebuild_sched_domains()->partition_sched_domains() in the middle of the
    > > hotplug sequence. partition_sched_domains() will happily reset the
    > > rq->rd->online mask and things will fail. I'm talking about this path:
    > >
    > > __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    > >     ...
    > >     cpu_set(rq->cpu, rd->span);
    > >     if (cpu_isset(rq->cpu, cpu_online_map))
    > >             set_rq_online(rq);
    > >     ...

    >
    > I think you are right, but wouldn't s/online/active above fix that as
    > well? The active_map didn't exist at the time that code first went in.
    >
    > > --
    > >
    > > BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    > > introduced? I.e., instead of using cpu_online_map directly.
    > >

    > I think things were converted where they made sense to convert. But we
    > also had a different goal at that time, so perhaps something was
    > missed. If you think something else should be converted, please point
    > it out.
    >
    > In the meantime, I would suggest we consider this patch on top of yours
    > (applies to tip/sched/devel):
    >
    > ----------------------
    >
    > sched: Fully integrate cpus_active_map and root-domain code
    >
    > Signed-off-by: Gregory Haskins



    Right, makes sense.

    ACK

    > [full patch snipped]



  4. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    Peter Zijlstra wrote:
    > On Tue, 2008-07-22 at 10:06 -0400, Gregory Haskins wrote:
    >
    >> Max Krasnyansky wrote:
    >>
    >>> Greg, correct me if I'm wrong, but we seem to have the exact same issue
    >>> with the rq->rd->online map. Let's take "cpu going down" for example.
    >>> We're clearing the rq->rd->online bit on the DYING event, but nothing
    >>> AFAICS prevents another cpu from calling
    >>> rebuild_sched_domains()->partition_sched_domains() in the middle of the
    >>> hotplug sequence. partition_sched_domains() will happily reset the
    >>> rq->rd->online mask and things will fail. I'm talking about this path:
    >>>
    >>> __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    >>>     ...
    >>>     cpu_set(rq->cpu, rd->span);
    >>>     if (cpu_isset(rq->cpu, cpu_online_map))
    >>>             set_rq_online(rq);
    >>>     ...

    >> I think you are right, but wouldn't s/online/active above fix that as
    >> well? The active_map didn't exist at the time that code first went in.
    >>
    >>
    >>> --
    >>>
    >>> BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    >>> introduced? I.e., instead of using cpu_online_map directly.
    >>>
    >>>

    >> I think things were converted where they made sense to convert. But we
    >> also had a different goal at that time, so perhaps something was
    >> missed. If you think something else should be converted, please point
    >> it out.
    >>
    >> In the meantime, I would suggest we consider this patch on top of yours
    >> (applies to tip/sched/devel):
    >>
    >> ----------------------
    >>
    >> sched: Fully integrate cpus_active_map and root-domain code
    >>
    >> Signed-off-by: Gregory Haskins
    >>

    >
    >
    > Right, makes sense.
    >
    > ACK
    >


    Thanks Peter.

    BTW: I should mention that this was just an RFC. I haven't even
    build-tested it yet. I will do this now.

    Regards,
    -Greg
    >> [full patch snipped]






  5. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    On Tue, 2008-07-22 at 10:17 -0400, Gregory Haskins wrote:

    > Thanks Peter.
    >
    > BTW: I should mention that this was just an RFC. I haven't even
    > build-tested it yet. I will do this now.


    Heh, might make sense to ensure it indeed builds before stuffing it into
    -tip ;-)


  6. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    Peter Zijlstra wrote:
    > On Tue, 2008-07-22 at 10:17 -0400, Gregory Haskins wrote:
    >
    >
    >> Thanks Peter.
    >>
    >> BTW: I should mention that this was just an RFC. I haven't even
    >> build-tested it yet. I will do this now.
    >>

    >
    > Heh, might make sense to ensure it indeed builds before stuffing it into
    > -tip ;-)
    >
    >


    Ok, it passes the sniff test without any corrections.

    Change the status from RFC to consider-for-inclusion.

    -Greg




  7. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    Gregory Haskins wrote:
    > Max Krasnyansky wrote:
    >> Greg, correct me if I'm wrong, but we seem to have the exact same issue
    >> with the rq->rd->online map. Let's take "cpu going down" for example.
    >> We're clearing the rq->rd->online bit on the DYING event, but nothing
    >> AFAICS prevents another cpu from calling
    >> rebuild_sched_domains()->partition_sched_domains() in the middle of the
    >> hotplug sequence. partition_sched_domains() will happily reset the
    >> rq->rd->online mask and things will fail. I'm talking about this path:
    >>
    >> __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    >>     ...
    >>     cpu_set(rq->cpu, rd->span);
    >>     if (cpu_isset(rq->cpu, cpu_online_map))
    >>             set_rq_online(rq);
    >>     ...

    >
    > I think you are right, but wouldn't s/online/active above fix that as
    > well? The active_map didn't exist at the time that code first went in.


    Actually, after a bit more thinking I realized that the scenario I
    described above cannot happen, because partition_sched_domains() must be
    called under get_online_cpus() and set_rq_online() happens in the
    hotplug writer's path (i.e. under cpu_hotplug.lock). Since I unified all
    the other domain-rebuild paths (arch_reinit_sched_domains(), etc.) we
    should be safe. But it again means we'd rely on the intricate
    dependencies that we wanted to avoid with cpu_active_map. Also, cpusets
    might still need to rebuild the domains in the hotplug writer's path.
    So it's better to fix it once and for all.
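
    A minimal sketch of that serialization, with a plain pthread mutex
    standing in for cpu_hotplug.lock (names are a stand-in for the kernel's;
    an illustration, not kernel code):

    #include <pthread.h>

    /* Domain rebuilds take the hotplug lock via get_online_cpus(), while
     * the DYING notifier runs with the writer side held, so today the two
     * paths cannot interleave -- but only by virtue of this convention. */
    static pthread_mutex_t cpu_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

    static void rebuild_sched_domains_path(void)
    {
            pthread_mutex_lock(&cpu_hotplug_lock);   /* get_online_cpus() */
            /* partition_sched_domains() -> rq_attach_root() runs here */
            pthread_mutex_unlock(&cpu_hotplug_lock); /* put_online_cpus() */
    }

    static void cpu_down_path(void)
    {
            pthread_mutex_lock(&cpu_hotplug_lock);   /* hotplug writer */
            /* CPU_DYING -> set_rq_offline() runs here */
            pthread_mutex_unlock(&cpu_hotplug_lock);
    }

    int main(void)
    {
            cpu_down_path();
            rebuild_sched_domains_path();
            return 0;
    }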

    >> --
    >>
    >> BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    >> introduced? I.e., instead of using cpu_online_map directly.
    >>

    > I think things were converted where they made sense to convert. But we
    > also had a different goal at that time, so perhaps something was
    > missed. If you think something else should be converted, please point
    > it out.

    Ok. I'll keep an eye on it.

    > In the meantime, I would suggest we consider this patch on top of yours
    > (applies to tip/sched/devel):
    >
    > ----------------------
    >
    > sched: Fully integrate cpus_active_map and root-domain code
    > Signed-off-by: Gregory Haskins


    Ack.
    The only thing I'm a bit unsure of is the error scenarios in the cpu
    hotplug event sequence. online_map is not cleared when something in the
    notifier chain fails, but active_map is.
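
    A toy replay of that asymmetry in plain userspace C (the rollback step
    is an assumption about the failure path, not confirmed here; not kernel
    code):

    #include <stdio.h>

    /* During a failed cpu_down(), cpu1 leaves active_map early while
     * staying in online_map, so the two masks briefly disagree. */
    int main(void)
    {
            unsigned long online = 1ul << 1, active = 1ul << 1;

            active &= ~(1ul << 1); /* down attempt: active bit cleared early */
            /* ... a notifier in the chain fails; online was never touched ... */
            printf("mid-failure:    online=0x%lx active=0x%lx\n", online, active);

            active |= 1ul << 1;    /* rollback: active bit restored (assumed) */
            printf("after rollback: online=0x%lx active=0x%lx\n", online, active);
            return 0;
    }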

    Max

  8. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)

    Hi Ingo,
    Here is another patch submitted that has not been acked/nacked yet.
    If you get a free moment, please let me know your thoughts. Here is the
    full thread for your convenience:

    http://lkml.org/lkml/2008/7/22/281

    (and FYI it was ACKed by Peter here: http://lkml.org/lkml/2008/7/22/286)

    -Greg

    Gregory Haskins wrote:
    > Max Krasnyansky wrote:
    >> Greg, correct me if I'm wrong, but we seem to have the exact same issue
    >> with the rq->rd->online map. Let's take "cpu going down" for example.
    >> We're clearing the rq->rd->online bit on the DYING event, but nothing
    >> AFAICS prevents another cpu from calling
    >> rebuild_sched_domains()->partition_sched_domains() in the middle of the
    >> hotplug sequence. partition_sched_domains() will happily reset the
    >> rq->rd->online mask and things will fail. I'm talking about this path:
    >>
    >> __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    >>     ...
    >>     cpu_set(rq->cpu, rd->span);
    >>     if (cpu_isset(rq->cpu, cpu_online_map))
    >>             set_rq_online(rq);
    >>     ...

    >
    > I think you are right, but wouldn't s/online/active above fix that as
    > well? The active_map didn't exist at the time that code first went in.
    >
    >> --
    >>
    >> BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    >> introduced? I.e., instead of using cpu_online_map directly.
    >>

    > I think things were converted where they made sense to convert. But
    > we also had a different goal at that time, so perhaps something was
    > missed. If you think something else should be converted, please point
    > it out.
    >
    > In the meantime, I would suggest we consider this patch on top of
    > yours (applies to tip/sched/devel):
    >
    > ----------------------
    >
    > sched: Fully integrate cpus_active_map and root-domain code
    > Signed-off-by: Gregory Haskins
    > [full patch snipped]






  9. Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2)



    Gregory Haskins wrote:
    > Hi Ingo,
    > Here is another patch submitted that has not been acked/nacked yet. If
    > you get a free moment, please let me know your thoughts. Here is the
    > full thread for your convenience:
    >
    > http://lkml.org/lkml/2008/7/22/281
    >
    > (and FYI it was ACKed by Peter here: http://lkml.org/lkml/2008/7/22/286)


    I thought this went in already. It looks good to me too.

    Max


    >
    > Gregory Haskins wrote:
    >> Max Krasnyansky wrote:
    >>> Greg, correct me if I'm wrong, but we seem to have the exact same issue
    >>> with the rq->rd->online map. Let's take "cpu going down" for example.
    >>> We're clearing the rq->rd->online bit on the DYING event, but nothing
    >>> AFAICS prevents another cpu from calling
    >>> rebuild_sched_domains()->partition_sched_domains() in the middle of the
    >>> hotplug sequence. partition_sched_domains() will happily reset the
    >>> rq->rd->online mask and things will fail. I'm talking about this path:
    >>>
    >>> __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
    >>>     ...
    >>>     cpu_set(rq->cpu, rd->span);
    >>>     if (cpu_isset(rq->cpu, cpu_online_map))
    >>>             set_rq_online(rq);
    >>>     ...

    >>
    >> I think you are right, but wouldn't s/online/active above fix that as
    >> well? The active_map didn't exist at the time that code first went in.
    >>
    >>> --
    >>>
    >>> BTW, why didn't we convert sched*.c to use rq->rd->online when it was
    >>> introduced? I.e., instead of using cpu_online_map directly.
    >>>

    >> I think things were converted where they made sense to convert. But
    >> we also had a different goal at that time, so perhaps something was
    >> missed. If you think something else should be converted, please point
    >> it out.
    >>
    >> In the meantime, I would suggest we consider this patch on top of
    >> yours (applies to tip/sched/devel):
    >>
    >> ----------------------
    >>
    >> sched: Fully integrate cpus_active_map and root-domain code
    >> Signed-off-by: Gregory Haskins
    >> [full patch snipped]

    >
    >

