[revert] mysql+oltp regression - Kernel



  1. [revert] mysql+oltp regression

    Greetings,

    During regression testing of tip/sched/clock fixes, a regression in low
    client count throughput turned up, which I traced back to the commit
    below. I don't see anything wrong with it, but suspect that it is
    preventing client/server pairs from staying together on the same CPU
    as buddies, which mysql definitely likes quite a lot. (I suspect that
    this is the case because I've seen this same performance curve while
    tinkering with wakeup affinity and breaking it all to pieces.)

    Changelog and test results below in case nobody sees a problem with the
    commit itself.

    Revert commit 6d299f1b53b84e2665f402d9bcc494800aba6386

    Testing of the tip/sched/clock tree revealed a mysql+oltp regression
    which bisection eventually traced back to this commit in mainline.

    Pertinent test results: Three run sysbench averages, throughput units
    in read/write requests/sec.

    clients         1      2      4      8     16     32     64
    6e0534f      9646  17876  34774  33868  32230  30767  29441
    2.6.26.1     9112  17936  34652  33383  31929  30665  29232
    6d299f1      9112  14637  28370  33339  32038  30762  29204

    Note: subsequent commits hide the majority of this regression until you
    apply the clock fixes, at which time it reemerges at full magnitude.
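To put the table in relative terms, the drop can be computed per client count. This is only a sketch: it assumes the 6e0534f row is the baseline without the commit and the 6d299f1 row is the result with it, and the names `pct_change`, `worst_column`, `base_6e0534f` and `with_6d299f1` are hypothetical helpers, not anything from sysbench or the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 7

/* Figures copied from the table above (read/write requests/sec). */
static const int clients[NCOLS] = { 1, 2, 4, 8, 16, 32, 64 };
static const double base_6e0534f[NCOLS] = {
	9646, 17876, 34774, 33868, 32230, 30767, 29441
};
static const double with_6d299f1[NCOLS] = {
	9112, 14637, 28370, 33339, 32038, 30762, 29204
};

/* Percentage change of `measured` relative to `baseline`; negative means slower. */
static double pct_change(double baseline, double measured)
{
	assert(baseline > 0.0);
	return (measured - baseline) * 100.0 / baseline;
}

/* Index of the column with the largest throughput drop. */
static size_t worst_column(void)
{
	size_t worst = 0;

	for (size_t i = 1; i < NCOLS; i++) {
		if (pct_change(base_6e0534f[i], with_6d299f1[i]) <
		    pct_change(base_6e0534f[worst], with_6d299f1[worst]))
			worst = i;
	}
	return worst;
}
```

Under that reading, the drop is roughly 18% at 2 and 4 clients and under 2% from 8 clients upward, which matches the "low client count" description.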

    Signed-off-by: Mike Galbraith

    diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
    index 1fe4c65..08ae848 100644
    --- a/kernel/sched_fair.c
    +++ b/kernel/sched_fair.c
    @@ -1275,18 +1275,23 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
     	struct task_struct *p = NULL;
     	struct sched_entity *se;

    -	while (next != &cfs_rq->tasks) {
    +	if (next == &cfs_rq->tasks)
    +		return NULL;
    +
    +	/* Skip over entities that are not tasks */
    +	do {
     		se = list_entry(next, struct sched_entity, group_node);
     		next = next->next;
    +	} while (next != &cfs_rq->tasks && !entity_is_task(se));

    -		/* Skip over entities that are not tasks */
    -		if (entity_is_task(se)) {
    -			p = task_of(se);
    -			break;
    -		}
    -	}
    +	if (next == &cfs_rq->tasks)
    +		return NULL;

     	cfs_rq->balance_iterator = next;
    +
    +	if (entity_is_task(se))
    +		p = task_of(se);
    +
     	return p;
     }



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [revert] mysql+oltp regression


    * Mike Galbraith wrote:

    > Greetings,
    >
    > During regression testing of tip/sched/clock fixes, a regression in
    > low client count throughput turned up, which I traced this back to the
    > commit below. I don't see anything wrong with it, but suspect that it
    > is preventing client/server pairs from staying together on the same
    > CPU as buddies, which mysql definitely likes quite a lot. (I suspect
    > that this is the case, because I've seen this same performance curve
    > while tinkering with wakeup affinity and breaking it all to pieces
    >
    > Changelog and test results below in case nobody sees a problem with
    > the commit itself.


    i've applied your fix to tip/sched/urgent for the time being, thanks
    Mike for tracking it down. We can re-try newer iterations of Greg's
    patch in tip/sched/devel.

    Ingo

  3. Re: [revert] mysql+oltp regression

    Ingo Molnar wrote:
    > * Mike Galbraith wrote:
    >
    >
    >> Greetings,
    >>
    >> During regression testing of tip/sched/clock fixes, a regression in
    >> low client count throughput turned up, which I traced this back to the
    >> commit below. I don't see anything wrong with it, but suspect that it
    >> is preventing client/server pairs from staying together on the same
    >> CPU as buddies, which mysql definitely likes quite a lot. (I suspect
    >> that this is the case, because I've seen this same performance curve
    >> while tinkering with wakeup affinity and breaking it all to pieces
    >>
    >> Changelog and test results below in case nobody sees a problem with
    >> the commit itself.
    >>

    >
    > i've applied your fix to tip/sched/urgent for the time being, thanks
    > Mike for tracking it down. We can re-try newer iterations of Greg's
    > patch in tip/sched/devel.
    >
    >


    Hmm.. The patch still looks correct afaict. I fear we are just
    papering over some other issue by reverting it, but I will try to see if
    I can track this down. We will, of course, now be skipping trying to
    balance the (effectively random) last task in the queue which may or may
    not result in better performance on sheer luck instead of algorithmic
    intelligence. This makes me nervous.

    Speaking of this: another patch I submitted to you, Ingo (it had to do
    with updating the load_weight inside task_setprio), seems to show the
    same phenomenon: i.e., it's technically correct, but further testing has
    revealed negative repercussions elsewhere. So please ignore that patch
    (or revert it if you already pulled it in, but I don't think you have).
    I'll try to look into this issue as well.

    -Greg
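
The point about the last task being skipped can be checked with a small userspace model of the two loops from the diff. This is a sketch, not kernel code: `struct node`, `struct queue`, `iter_while`, `iter_do_while`, `queue_init` and `count_visited` are hypothetical stand-ins for `list_head`, `cfs_rq`, the two versions of `__load_balance_iterator()` and its caller, and it assumes the caller stops iterating as soon as it gets NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an entry on the kernel's circular list. */
struct node {
	struct node *next;
	int is_task;		/* models entity_is_task(se) */
};

struct queue {
	struct node head;	/* sentinel, models cfs_rq->tasks */
	struct node *cursor;	/* models cfs_rq->balance_iterator */
};

/* Loop from commit 6d299f1: can return every task, including the last. */
static struct node *iter_while(struct queue *q, struct node *next)
{
	struct node *p = NULL;

	while (next != &q->head) {
		struct node *se = next;

		next = next->next;
		if (se->is_task) {
			p = se;
			break;
		}
	}
	q->cursor = next;
	return p;
}

/* Loop restored by the revert: gives up once `next` wraps to the head. */
static struct node *iter_do_while(struct queue *q, struct node *next)
{
	struct node *se;

	if (next == &q->head)
		return NULL;
	do {
		se = next;
		next = next->next;
	} while (next != &q->head && !se->is_task);

	if (next == &q->head)
		return NULL;	/* the final entry is dropped here */

	q->cursor = next;
	return se->is_task ? se : NULL;
}

/* Link n task nodes into q, in array order. */
static void queue_init(struct queue *q, struct node *nodes, int n)
{
	struct node *prev = &q->head;

	assert(nodes != NULL || n == 0);
	for (int i = 0; i < n; i++) {
		nodes[i].is_task = 1;
		prev->next = &nodes[i];
		prev = &nodes[i];
	}
	prev->next = &q->head;
	q->cursor = q->head.next;
}

typedef struct node *(*iter_fn)(struct queue *, struct node *);

/* Count how many tasks a start/next style walk hands to the caller. */
static int count_visited(struct queue *q, iter_fn it)
{
	int n = 0;

	for (struct node *p = it(q, q->head.next); p; p = it(q, q->cursor))
		n++;
	return n;
}
```

In this model, a queue of three tasks yields three candidates with `iter_while` and two with `iter_do_while`, and a queue holding a single task yields none with `iter_do_while`.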





  4. Re: [revert] mysql+oltp regression


    * Gregory Haskins wrote:

    > Ingo Molnar wrote:
    >> * Mike Galbraith wrote:
    >>
    >>
    >>> Greetings,
    >>>
    >>> During regression testing of tip/sched/clock fixes, a regression in
    >>> low client count throughput turned up, which I traced this back to
    >>> the commit below. I don't see anything wrong with it, but suspect
    >>> that it is preventing client/server pairs from staying together on
    >>> the same CPU as buddies, which mysql definitely likes quite a lot.
    >>> (I suspect that this is the case, because I've seen this same
    >>> performance curve while tinkering with wakeup affinity and breaking
    >>> it all to pieces
    >>>
    >>> Changelog and test results below in case nobody sees a problem with
    >>> the commit itself.
    >>>

    >>
    >> i've applied your fix to tip/sched/urgent for the time being, thanks
    >> Mike for tracking it down. We can re-try newer iterations of Greg's
    >> patch in tip/sched/devel.
    >>
    >>

    >
    > Hmm.. The patch still looks correct afaict. I fear we are just
    > papering over some other issue by reverting it, but I will try to see
    > if I can track this down. We will, of course, now be skipping trying
    > to balance the (effectively random) last task in the queue which may
    > or may not result in better performance on sheer luck instead of
    > algorithmic intelligence. This makes me nervous.


    yeah - but we had that behavior for quite some time.

    This is how the patch cycle normally works: we had a fair chance to
    discover this problem in your testing, then in -tip testing, and then in
    linux-next or -mm, but we didn't find it at any stage.

    Now we are in the upstream release cycle so unless there's some
    immediate fix available (or there are _really_ strong reasons against
    the revert) doing the revert is the right approach.

    A revert is not necessarily the indicator of the quality of the change
    in question, it is a tester-driven exception event that guarantees that
    the kernel improves in a monotonic way. (for all testers who opt to help
    us in doing so)

    And given that the problem was readily reproducible for Mike, it should
    be reproducible for you as well - so we don't actually make the bug
    harder to fix by doing the revert.

    Perhaps we should introduce the notion of "Defer-to-next-release"
    reverts - which this really is - in contrast to "Revert-because-bad",
    which your change definitely is not.

    > Speaking of this: Another patch I submitted to you Ingo (had to do
    > with updating the load_weight inside task_setprio) seems to also have
    > this phenomenon: e.g. its technically correct but further testing has
    > revealed negative repercussions elsewhere. So please ignore that
    > patch (or revert if you already pulled in, but I don't think you
    > have). Ill try to look into this issue as well.


    ok, under which thread/subject is that? Not queued in tip/sched/* yet,
    correct?

    Ingo

  5. Re: [revert] mysql+oltp regression


    * Ingo Molnar wrote:

    > And given that the problem was readily reproducible for Mike, it
    > should be reproducible for you as well - so we dont actually make the
    > bug harder to fix by doing the revert.
    >
    > Perhaps we should introduce the notion of "Defer-to-next-release"
    > reverts - which this really is - in contrast to "Revert-because-bad",
    > which your change definitely is not.


    i've edited the commit message in the manner below to make this
    distinction more apparent.

    Ingo

    ---------------------->
    From 77ae651347bdd46830da8b28b1efc5e4a9d7cbd0 Mon Sep 17 00:00:00 2001
    From: Mike Galbraith
    Date: Mon, 11 Aug 2008 13:32:02 +0200
    Subject: [PATCH] sched: fix mysql+oltp regression

    Defer commit 6d299f1b53b84e2665f402d9bcc494800aba6386 to the next release.

    Testing of the tip/sched/clock tree revealed a mysql+oltp regression
    which bisection eventually traced back to this commit in mainline.

    Pertinent test results: Three run sysbench averages, throughput units
    in read/write requests/sec.

    clients         1      2      4      8     16     32     64
    6e0534f      9646  17876  34774  33868  32230  30767  29441
    2.6.26.1     9112  17936  34652  33383  31929  30665  29232
    6d299f1      9112  14637  28370  33339  32038  30762  29204

    Note: subsequent commits hide the majority of this regression until you
    apply the clock fixes, at which time it reemerges at full magnitude.

    We cannot see anything bad about the change itself so we defer it to the
    next release until this problem is fully analysed.

    Signed-off-by: Mike Galbraith
    Acked-by: Peter Zijlstra
    Cc: Gregory Haskins
    Signed-off-by: Ingo Molnar
    ---
    kernel/sched_fair.c | 19 ++++++++++++-------
    1 files changed, 12 insertions(+), 7 deletions(-)

    diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
    index 0fe94ea..fb8994c 100644
    --- a/kernel/sched_fair.c
    +++ b/kernel/sched_fair.c
    @@ -1442,18 +1442,23 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
     	struct task_struct *p = NULL;
     	struct sched_entity *se;

    -	while (next != &cfs_rq->tasks) {
    +	if (next == &cfs_rq->tasks)
    +		return NULL;
    +
    +	/* Skip over entities that are not tasks */
    +	do {
     		se = list_entry(next, struct sched_entity, group_node);
     		next = next->next;
    +	} while (next != &cfs_rq->tasks && !entity_is_task(se));

    -		/* Skip over entities that are not tasks */
    -		if (entity_is_task(se)) {
    -			p = task_of(se);
    -			break;
    -		}
    -	}
    +	if (next == &cfs_rq->tasks)
    +		return NULL;

     	cfs_rq->balance_iterator = next;
    +
    +	if (entity_is_task(se))
    +		p = task_of(se);
    +
     	return p;
     }


  6. Re: [revert] mysql+oltp regression

    Ingo Molnar wrote:
    > * Gregory Haskins wrote:
    >
    >
    >> Ingo Molnar wrote:
    >>
    >>> * Mike Galbraith wrote:
    >>>
    >>>
    >>>
    >>>> Greetings,
    >>>>
    >>>> During regression testing of tip/sched/clock fixes, a regression in
    >>>> low client count throughput turned up, which I traced this back to
    >>>> the commit below. I don't see anything wrong with it, but suspect
    >>>> that it is preventing client/server pairs from staying together on
    >>>> the same CPU as buddies, which mysql definitely likes quite a lot.
    >>>> (I suspect that this is the case, because I've seen this same
    >>>> performance curve while tinkering with wakeup affinity and breaking
    >>>> it all to pieces
    >>>>
    >>>> Changelog and test results below in case nobody sees a problem with
    >>>> the commit itself.
    >>>>
    >>>>
    >>> i've applied your fix to tip/sched/urgent for the time being, thanks
    >>> Mike for tracking it down. We can re-try newer iterations of Greg's
    >>> patch in tip/sched/devel.
    >>>
    >>>
    >>>

    >> Hmm.. The patch still looks correct afaict. I fear we are just
    >> papering over some other issue by reverting it, but I will try to see
    >> if I can track this down. We will, of course, now be skipping trying
    >> to balance the (effectively random) last task in the queue which may
    >> or may not result in better performance on sheer luck instead of
    >> algorithmic intelligence. This makes me nervous.
    >>

    >
    > yeah - but we had that behavior for quite some time.
    >
    > This is how the patch cycle works normally: we had a fair chance to
    > discover this problem in your testing then in -tip testing and then in
    > linux-next or -mm but we didnt find it at any stage.
    >
    > Now we are in the upstream release cycle so unless there's some
    > immediate fix available (or there are _really_ strong reasons against
    > the revert) doing the revert is the right approach.
    >
    > A revert is not necessarily the indicator of the quality of the change
    > in question, it is a tester-driven exception event that guarantees that
    > the kernel improves in a monotonic way. (for all testers who opt to help
    > us in doing so)
    >
    > And given that the problem was readily reproducible for Mike, it should
    > be reproducible for you as well - so we dont actually make the bug
    > harder to fix by doing the revert.
    >
    > Perhaps we should introduce the notion of "Defer-to-next-release"
    > reverts - which this really is - in contrast to "Revert-because-bad",
    > which your change definitely is not.
    >


    Hi Ingo,
    Understood, and a totally reasonable stance. I mostly wanted to make
    sure it was understood that I don't think I can "fix" that particular
    patch since I think it was already correct. Rather, I will have to try
    to identify some other area (presumably the load balancer) to harmonize
    with it. I think we are on the same page, though.


    >
    >> Speaking of this: Another patch I submitted to you Ingo (had to do
    >> with updating the load_weight inside task_setprio) seems to also have
    >> this phenomenon: e.g. its technically correct but further testing has
    >> revealed negative repercussions elsewhere. So please ignore that
    >> patch (or revert if you already pulled in, but I don't think you
    >> have). Ill try to look into this issue as well.
    >>

    >
    > ok, under which thread/subject is that? Not queued in tip/sched/* yet,
    > correct?
    >

    Here is the original thread:

    http://lkml.org/lkml/2008/7/3/416

    I do not believe you have queued it anywhere (public anyway) yet.

    Note I have already invalidated 1/2, and now I am retracting 2/2 as
    well. (1/2 is actually a bogus patch; 2/2 is "technically correct" but
    causes ripples in the load balancer that need to be sorted out first.)

    Thanks!
    -Greg





  7. Re: [revert] mysql+oltp regression


    * Gregory Haskins wrote:

    >>> Speaking of this: Another patch I submitted to you Ingo (had to do
    >>> with updating the load_weight inside task_setprio) seems to also
    >>> have this phenomenon: e.g. its technically correct but further
    >>> testing has revealed negative repercussions elsewhere. So please
    >>> ignore that patch (or revert if you already pulled in, but I don't
    >>> think you have). Ill try to look into this issue as well.

    >>
    >> ok, under which thread/subject is that? Not queued in tip/sched/*
    >> yet, correct?
    >>

    > Here is the original thread:
    >
    > http://lkml.org/lkml/2008/7/3/416
    >
    > I do not believe you have queued it anywhere (public anyway) yet.
    >
    > Note I have already invalidated 1/2, and now I am retracting 2/2 as
    > well. (1/2 is actually a bogus patch, 2/2 is "technically correct"
    > but causes ripples in the load balancer that need to be sorted out
    > first.


    ok, thanks. I'm curious, what are those ripple effects? Stability or
    performance?

    Ingo

  8. Re: [revert] mysql+oltp regression

    Ingo Molnar wrote:
    > * Gregory Haskins wrote:
    >
    >
    >>>> Speaking of this: Another patch I submitted to you Ingo (had to do
    >>>> with updating the load_weight inside task_setprio) seems to also
    >>>> have this phenomenon: e.g. its technically correct but further
    >>>> testing has revealed negative repercussions elsewhere. So please
    >>>> ignore that patch (or revert if you already pulled in, but I don't
    >>>> think you have). Ill try to look into this issue as well.
    >>>>
    >>> ok, under which thread/subject is that? Not queued in tip/sched/*
    >>> yet, correct?
    >>>
    >>>

    >> Here is the original thread:
    >>
    >> http://lkml.org/lkml/2008/7/3/416
    >>
    >> I do not believe you have queued it anywhere (public anyway) yet.
    >>
    >> Note I have already invalidated 1/2, and now I am retracting 2/2 as
    >> well. (1/2 is actually a bogus patch, 2/2 is "technically correct"
    >> but causes ripples in the load balancer that need to be sorted out
    >> first.
    >>

    >
    > ok, thanks. I'm curious, what are those ripple effects? Stability or
    > performance?
    >


    Performance. I found it while working on my pi series (which, FYI, I
    should have a v2 refresh for soon, probably today... I am hoping to get
    some review feedback from you on that as well, time permitting, of course).

    Basically the behavior I was observing was that kernel builds via distcc
    would cluster all the cc1 jobs on a single core. At first I thought my
    pi-series was screwed up, but then I realized I had applied the patch
    referenced above earlier in my development tree, and removing it allowed
    pi to work fine.

    I found the problem with in once boot cycle with ftrace (thanks
    Steve!). Basically newidle balancing was always returning "no
    imbalance" even though I had 32 cc1 threads on 1 core, and 3 idle
    cores. Clearly not correct! So I think that by adjusting the load up,
    we throw off the hysteresis built into the load averages and cause the
    system to incorrectly think it's balanced. TBD.

    -Greg


    > Ingo
    >






  9. Re: [revert] mysql+oltp regression


    * Gregory Haskins wrote:

    >> ok, thanks. I'm curious, what are those ripple effects? Stability or
    >> performance?

    >
    > Performance. I found it while working on my pi series (which fyi I
    > should have a v2 refresh for soon, probably today...i am hoping to get
    > some review feedback from you on that as well, time permitting of
    > course .
    >
    > Basically the behavior I was observing was that kernel builds via
    > distcc would cluster all the cc1 jobs on a single core. At first I
    > thought my pi-series was screwed up, but then I realized I had applied
    > the patch referenced above earlier in my development tree, and
    > removing it allowed pi to work fine.
    >
    > I found the problem with in once boot cycle with ftrace (thanks
    > Steve!). Basically newidle balancing was always returning "no
    > imbalance" even though I had 32 cc1 threads on 1 core, and 3 idle
    > cores. Clearly not correct! So I think that by adjusting the load
    > up, we throw off the hysteresis built into the load averages and cause
    > the system to incorrectly think it's balanced. TBD.


    ok. If you touch that area, I'd suggest also testing Mike's
    mysql+sysbench workload (all CPU bound, not IO bound) - it is rather
    sensitive to many aspects of load-balancing. Mysqld also has scalability
    limits due to user-space locking as the number of cores increases, so it
    will show wakeup and buddy balancing artifacts very quickly.

    (With postgresql you'll need to go to 32 cores to see it scale badly
    under Linux due to user-space locking.)

    Ingo

  10. Re: [revert] mysql+oltp regression


    * Gregory Haskins wrote:

    > Gregory Haskins wrote:
    >> Ingo Molnar wrote:
    >>> * Gregory Haskins wrote:
    >>>
    >>>
    >>>>>> Speaking of this: Another patch I submitted to you Ingo (had to
    >>>>>> do with updating the load_weight inside task_setprio) seems to
    >>>>>> also have this phenomenon: e.g. its technically correct but
    >>>>>> further testing has revealed negative repercussions elsewhere.
    >>>>>> So please ignore that patch (or revert if you already pulled
    >>>>>> in, but I don't think you have). Ill try to look into this
    >>>>>> issue as well.
    >>>>>>
    >>>>> ok, under which thread/subject is that? Not queued in tip/sched/*
    >>>>> yet, correct?
    >>>>>
    >>>> Here is the original thread:
    >>>>
    >>>> http://lkml.org/lkml/2008/7/3/416
    >>>>
    >>>> I do not believe you have queued it anywhere (public anyway) yet.
    >>>>
    >>>> Note I have already invalidated 1/2, and now I am retracting 2/2 as
    >>>> well. (1/2 is actually a bogus patch, 2/2 is "technically correct"
    >>>> but causes ripples in the load balancer that need to be sorted out
    >>>> first.
    >>>>
    >>>
    >>> ok, thanks. I'm curious, what are those ripple effects? Stability or
    >>> performance?
    >>>

    >>
    >> Performance. I found it while working on my pi series (which fyi I
    >> should have a v2 refresh for soon, probably today...i am hoping to get
    >> some review feedback from you on that as well, time permitting of
    >> course .
    >>
    >> Basically the behavior I was observing was that kernel builds via
    >> distcc would cluster all the cc1 jobs on a single core. At first I
    >> thought my pi-series was screwed up, but then I realized I had applied
    >> the patch referenced above earlier in my development tree, and
    >> removing it allowed pi to work fine.
    >>
    >> I found the problem with in once boot cycle with ftrace (thanks Steve!).

    >
    > Hmm..Im not sure what went wrong between brain and hand above, but of
    > course I meant to say ".. within one boot cycle ..", not "with in
    > once". Heh.


    my second reading of that sentence auto-corrected it to your intended
    version ;-)

    Ingo

  11. Re: [revert] mysql+oltp regression

    Gregory Haskins wrote:
    > Ingo Molnar wrote:
    >> * Gregory Haskins wrote:
    >>
    >>
    >>>>> Speaking of this: Another patch I submitted to you Ingo (had to do
    >>>>> with updating the load_weight inside task_setprio) seems to also
    >>>>> have this phenomenon: e.g. its technically correct but further
    >>>>> testing has revealed negative repercussions elsewhere. So please
    >>>>> ignore that patch (or revert if you already pulled in, but I don't
    >>>>> think you have). Ill try to look into this issue as well.
    >>>>>
    >>>> ok, under which thread/subject is that? Not queued in tip/sched/*
    >>>> yet, correct?
    >>>>
    >>> Here is the original thread:
    >>>
    >>> http://lkml.org/lkml/2008/7/3/416
    >>>
    >>> I do not believe you have queued it anywhere (public anyway) yet.
    >>>
    >>> Note I have already invalidated 1/2, and now I am retracting 2/2 as
    >>> well. (1/2 is actually a bogus patch, 2/2 is "technically correct"
    >>> but causes ripples in the load balancer that need to be sorted out
    >>> first.
    >>>

    >>
    >> ok, thanks. I'm curious, what are those ripple effects? Stability or
    >> performance?
    >>

    >
    > Performance. I found it while working on my pi series (which fyi I
    > should have a v2 refresh for soon, probably today...i am hoping to get
    > some review feedback from you on that as well, time permitting of
    > course .
    >
    > Basically the behavior I was observing was that kernel builds via
    > distcc would cluster all the cc1 jobs on a single core. At first I
    > thought my pi-series was screwed up, but then I realized I had applied
    > the patch referenced above earlier in my development tree, and
    > removing it allowed pi to work fine.
    >
    > I found the problem with in once boot cycle with ftrace (thanks Steve!).


    Hmm.. I'm not sure what went wrong between brain and hand above, but of
    course I meant to say ".. within one boot cycle ..", not "with in
    once". Heh.


    > Basically newidle balancing was always returning "no imbalance" even
    > though I had 32 cc1 threads on 1 core, and 3 idle cores. Clearly not
    > correct! So I think that by adjusting the load up, we throw off the
    > hysteresis built into the load averages and cause the system to
    > incorrectly think it's balanced. TBD.
    >
    > -Greg
    >
    >
    >> Ingo
    >>

    >
    >





