[tbench regression fixes]: digging out smelly deadmen. - Kernel



Thread: [tbench regression fixes]: digging out smelly deadmen.

  1. Re: [tbench regression fixes]: digging out smelly deadmen.

    From: Mike Galbraith
    Date: Sat, 25 Oct 2008 06:05:01 +0200

    > My test data indicates (to me anyway) that there is another source of
    > localhost throughput loss in .27. In that data, there is no hrtick
    > overhead since I didn't have highres timers enabled, and computational
    > costs added in .27 were removed. Dunno where it lives, but it does
    > appear to exist.


    Disabling TSO on loopback doesn't fix that bit for you?
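
    The TSO/GSO toggling discussed here is normally done from the command line
    with "ethtool -K lo tso off gso off". Below is a minimal C sketch of the
    same operation via the SIOCETHTOOL ioctl; it is illustrative only and
    assumes the loopback driver accepts the TSO/GSO set operations, as it did
    in the kernels discussed in this thread.

        /* Disable TSO and GSO on the loopback device, the programmatic
         * equivalent of "ethtool -K lo tso off gso off".  Needs root.
         */
        #include <stdio.h>
        #include <string.h>
        #include <sys/ioctl.h>
        #include <sys/socket.h>
        #include <net/if.h>
        #include <linux/ethtool.h>
        #include <linux/sockios.h>

        static int ethtool_off(int fd, const char *dev, __u32 cmd)
        {
            struct ethtool_value ev = { .cmd = cmd, .data = 0 };
            struct ifreq ifr;

            memset(&ifr, 0, sizeof(ifr));
            strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
            ifr.ifr_data = (char *)&ev;
            return ioctl(fd, SIOCETHTOOL, &ifr);
        }

        int main(void)
        {
            int fd = socket(AF_INET, SOCK_DGRAM, 0);

            if (fd < 0) {
                perror("socket");
                return 1;
            }
            if (ethtool_off(fd, "lo", ETHTOOL_STSO) ||
                ethtool_off(fd, "lo", ETHTOOL_SGSO))
                perror("SIOCETHTOOL");
            return 0;
        }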


  2. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Fri, 2008-10-24 at 22:15 -0700, David Miller wrote:
    > From: Mike Galbraith
    > Date: Sat, 25 Oct 2008 06:05:01 +0200
    >
    > > My test data indicates (to me anyway) that there is another source of
    > > localhost throughput loss in .27. In that data, there is no hrtick
    > > overhead since I didn't have highres timers enabled, and computational
    > > costs added in .27 were removed. Dunno where it lives, but it does
    > > appear to exist.

    >
    > Disabling TSO on loopback doesn't fix that bit for you?


    No. Those numbers are with TSO/GSO disabled.

    I did a manual 100% revert of sched and everything related back to the 26
    scheduler, and had ~the same result as these numbers. 27 with the 100%
    revert actually performed a bit _worse_ for me than 27 with its overhead,
    which puzzles me greatly.

    -Mike


  3. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Fri, 2008-10-24 at 22:16 -0700, David Miller wrote:
    > From: Mike Galbraith
    > Date: Sat, 25 Oct 2008 05:37:28 +0200
    >
    > > Part of the .27 regression was added scheduler overhead going from .26
    > > to .27. The scheduler overhead is now gone, but an unidentified source
    > > of localhost throughput loss remains for both SMP and UP configs.

    >
    > It has to be the TSO thingy Evgeniy hit too, right?


    Dunno.

    > If not, please bisect this.


    (oh my gawd

    I spent long day manweeks trying to bisect and whatnot. It's immune to
    my feeble efforts, and my git-foo.

    -Mike


  4. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sat, 2008-10-25 at 07:58 +0200, Mike Galbraith wrote:
    > On Fri, 2008-10-24 at 22:16 -0700, David Miller wrote:


    > > If not, please bisect this.

    >
    > (oh my gawd
    >
    > I spent long day manweeks trying to bisect and whatnot. It's immune to
    > my feeble efforts, and my git-foo.


    but..

    (the tbench/netperf numbers were measured with gcc-4.1 at this point in the
    log; I went back and re-measured ring-test because I switched compilers)

    2.6.22.19-up
    ring-test - 1.204 us/cycle = 830 KHz (gcc-4.1)
    ring-test - doorstop (gcc-4.3)
    netperf - 147798.56 rr/s = 295 KHz (hmm, a bit unstable, 140K..147K rr/s)
    tbench - 374.573 MB/sec

    2.6.22.19-cfs-v24.1-up
    ring-test - 1.098 us/cycle = 910 KHz (gcc-4.1)
    ring-test - doorstop (gcc-4.3)
    netperf - 140039.03 rr/s = 280 KHz = 3.57us - 1.10us sched = 2.47us/packet network
    tbench - 364.191 MB/sec

    2.6.23.17-up
    ring-test - 1.252 us/cycle = 798 KHz (gcc-4.1)
    ring-test - 1.235 us/cycle = 809 KHz (gcc-4.3)
    netperf - 123736.40 rr/s = 247 KHz sb 268 KHz / 134336.37 rr/s
    tbench - 355.906 MB/sec

    2.6.23.17-cfs-v24.1-up
    ring-test - 1.100 us/cycle = 909 KHz (gcc-4.1)
    ring-test - 1.074 us/cycle = 931 KHz (gcc-4.3)
    netperf - 135847.14 rr/s = 271 KHz sb 280 KHz / 140039.03 rr/s
    tbench - 364.511 MB/sec

    2.6.24.7-up
    ring-test - 1.100 us/cycle = 909 KHz (gcc-4.1)
    ring-test - 1.068 us/cycle = 936 KHz (gcc-4.3)
    netperf - 122300.66 rr/s = 244 KHz sb 280 KHz / 140039.03 rr/s
    tbench - 341.523 MB/sec

    2.6.25.17-up
    ring-test - 1.163 us/cycle = 859 KHz (gcc-4.1)
    ring-test - 1.129 us/cycle = 885 KHz (gcc-4.3)
    netperf - 132102.70 rr/s = 264 KHz sb 275 KHz / 137627.30 rr/s
    tbench - 361.71 MB/sec

    ...in 25, something happened that dropped my max context switch rate from
    ~930 KHz to ~885 KHz. Maybe I'll have better luck trying to find that.
    Added to to-do list. Benchmark mysteries I'm going to have to leave
    alone, they've kicked my little butt quite thoroughly ;-)

    -Mike
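
    A note on the arithmetic above: the "rr/s = KHz" conversion appears to count
    two scheduler wakeups per netperf request/response transaction (e.g.
    140039.03 rr/s x 2 ~= 280K switches/sec, i.e. ~3.57 us per switch; taking
    away the ~1.10 us/cycle scheduler cost measured by ring-test leaves the
    quoted ~2.47 us of network-stack cost per packet). The ring-test program
    itself is not posted in the thread; what follows is a minimal, purely
    illustrative sketch in the same spirit: two tasks bouncing a token over
    pipes, with the parent reporting the cost of one round trip. On an SMP box
    it should be pinned to a single CPU (e.g. with taskset) so that a round
    trip really is two context switches.

        /* Purely illustrative ping-pong context-switch benchmark in the
         * spirit of the (unposted) ring-test: two processes bounce a byte
         * over a pair of pipes and the parent reports the average cost of
         * one round trip.  Not comparable to the numbers in the thread.
         */
        #include <stdio.h>
        #include <unistd.h>
        #include <sys/time.h>
        #include <sys/wait.h>

        #define ITERS 1000000L

        int main(void)
        {
            int ping[2], pong[2];
            char byte = 0;
            struct timeval t0, t1;
            double us;
            long i;

            if (pipe(ping) || pipe(pong)) {
                perror("pipe");
                return 1;
            }
            if (fork() == 0) {                /* child: echo the byte back */
                for (i = 0; i < ITERS; i++) {
                    if (read(ping[0], &byte, 1) != 1 ||
                        write(pong[1], &byte, 1) != 1)
                        break;
                }
                _exit(0);
            }
            gettimeofday(&t0, NULL);
            for (i = 0; i < ITERS; i++) {     /* parent: send, wait for echo */
                write(ping[1], &byte, 1);
                read(pong[0], &byte, 1);
            }
            gettimeofday(&t1, NULL);
            wait(NULL);

            us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
            printf("%.3f us/cycle = %.0f KHz\n", us / ITERS, ITERS / us * 1000.0);
            return 0;
        }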


  5. Re: [tbench regression fixes]: digging out smelly deadmen.

    From: Mike Galbraith
    Date: Sat, 25 Oct 2008 07:58:53 +0200

    > I spent long day manweeks trying to bisect and whatnot. It's immune to
    > my feeble efforts, and my git-foo.


    I understand; this is what happened to me when I tried to look into
    the gradual tbench regressions since 2.6.22.

    I guess the only way to attack these things is to analyze the code and
    make some debugging hacks to get some measurements and numbers.

  6. Re: [tbench regression fixes]: digging out smelly deadmen.

    From: Mike Galbraith
    Date: Sat, 25 Oct 2008 08:53:43 +0200

    > On Sat, 2008-10-25 at 07:58 +0200, Mike Galbraith wrote:
    > 2.6.24.7-up
    > ring-test - 1.100 us/cycle = 909 KHz (gcc-4.1)
    > ring-test - 1.068 us/cycle = 936 KHz (gcc-4.3)
    > netperf - 122300.66 rr/s = 244 KHz sb 280 KHz / 140039.03 rr/s
    > tbench - 341.523 MB/sec
    >
    > 2.6.25.17-up
    > ring-test - 1.163 us/cycle = 859 KHz (gcc-4.1)
    > ring-test - 1.129 us/cycle = 885 KHz (gcc-4.3)
    > netperf - 132102.70 rr/s = 264 KHz sb 275 KHz / 137627.30 rr/s
    > tbench - 361.71 MB/sec
    >
    > ..in 25, something happened that dropped my max context switch rate from
    > ~930 KHz to ~885 KHz. Maybe I'll have better luck trying to find that.
    > Added to to-do list. Benchmark mysteries I'm going to have to leave
    > alone, they've kicked my little butt quite thoroughly ;-)


    But note that tbench performance improved a bit in 2.6.25.

    In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    weird.

    Just for the public record here are the numbers I got in my testing.
    Each entry was run purely on the latest 2.6.X-stable tree for each
    release. First is the tbench score and then there are 40 numbers
    which are sparc64 cpu cycle counts of default_wake_function().

    v2.6.22:

    Throughput 173.677 MB/sec 2 clients 2 procs max_latency=38.192 ms

    1636 1483 1552 1560 1534 1522 1472 1530 1518 1468
    1534 1402 1468 1656 1383 1362 1516 1336 1392 1472
    1652 1522 1486 1363 1430 1334 1382 1398 1448 1439
    1662 1540 1526 1472 1539 1434 1452 1492 1502 1432

    v2.6.23: This is when CFS got added to the tree.

    Throughput 167.933 MB/sec 2 clients 2 procs max_latency=25.428 ms

    3435 3363 3165 3304 3401 3189 3280 3243 3156 3295
    3439 3375 2950 2945 2727 3383 3560 3417 3221 3271
    3595 3293 3323 3283 3267 3279 3343 3293 3203 3341
    3413 3268 3107 3361 3245 3195 3079 3184 3405 3191

    v2.6.24:

    Throughput 170.314 MB/sec 2 clients 2 procs max_latency=22.121 ms

    2136 1886 2030 1929 2021 1941 2009 2067 1895 2019
    2072 1985 1992 1986 2031 2085 2014 2103 1825 1705
    2018 2034 1921 2079 1901 1989 1976 2035 2053 1971
    2144 2059 2025 2024 2029 1932 1980 1947 1956 2008

    v2.6.25:

    Throughput 165.294 MB/sec 2 clients 2 procs max_latency=108.869 ms

    2551 2707 2674 2771 2641 2727 2647 2865 2800 2796
    2793 2745 2609 2753 2674 2618 2671 2668 2641 2744
    2727 2616 2897 2720 2682 2737 2551 2677 2687 2603
    2725 2717 2510 2682 2658 2581 2713 2608 2619 2586

    v2.6.26:

    Throughput 160.759 MB/sec 2 clients 2 procs max_latency=31.420 ms

    2576 2492 2556 2517 2496 2473 2620 2464 2535 2494
    2800 2297 2183 2634 2546 2579 2488 2455 2632 2540
    2566 2540 2536 2496 2432 2453 2462 2568 2406 2522
    2565 2620 2532 2416 2434 2452 2524 2440 2424 2412

    v2.6.27:

    Throughput 143.776 MB/sec 2 clients 2 procs max_latency=31.279 ms

    4783 4710 27307 4955 5363 4270 4514 4469 3949 4422
    4177 4424 4510 18290 4380 3956 4293 4368 3919 4283
    4607 3960 4294 3842 18957 3942 4402 4488 3988 5157
    4604 4219 4186 22628 4289 4149 4089 4543 4217 4075
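
    The instrumentation that produced these per-call cycle counts is not shown
    in the thread. Below is a sketch of the kind of throwaway hack that yields
    such numbers, written against the 2.6.27 default_wake_function() in
    kernel/sched.c and using get_cycles() (the %tick register on sparc64, the
    TSC on x86). It is illustrative only, not David's actual patch.

        /* Throwaway instrumentation sketch: time try_to_wake_up() from
         * default_wake_function() and print a sample of the per-call cycle
         * cost now and then (ratelimited, so the log is not flooded).
         * Drop-in replacement for the 2.6.27 version in kernel/sched.c.
         */
        #include <asm/timex.h>          /* get_cycles() */

        int default_wake_function(wait_queue_t *curr, unsigned mode, int sync,
                                  void *key)
        {
            cycles_t t0, t1;
            int ret;

            t0 = get_cycles();
            ret = try_to_wake_up(curr->private, mode, sync);
            t1 = get_cycles();

            if (printk_ratelimit())
                printk(KERN_DEBUG "wakeup cost: %llu cycles\n",
                       (unsigned long long)(t1 - t0));
            return ret;
        }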

  7. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sat, 2008-10-25 at 00:19 -0700, David Miller wrote:
    > From: Mike Galbraith
    > Date: Sat, 25 Oct 2008 07:58:53 +0200
    >
    > > I spent long day manweeks trying to bisect and whatnot. It's immune to
    > > my feeble efforts, and my git-foo.

    >
    > I understand, this is what happened to me when I tried to look into
    > the gradual tbench regressions since 2.6.22


    That's exactly what I've been trying to look into, but combined with
    netperf. The thing is an incredibly twisted maze of _this_ affects
    _that_... sometimes involving magic and/or mythical creatures.

    Very very annoying.

    -Mike


  8. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sat, 2008-10-25 at 00:24 -0700, David Miller wrote:
    > From: Mike Galbraith
    > Date: Sat, 25 Oct 2008 08:53:43 +0200
    >
    > > On Sat, 2008-10-25 at 07:58 +0200, Mike Galbraith wrote:
    > > 2.6.24.7-up
    > > ring-test - 1.100 us/cycle = 909 KHz (gcc-4.1)
    > > ring-test - 1.068 us/cycle = 936 KHz (gcc-4.3)
    > > netperf - 122300.66 rr/s = 244 KHz sb 280 KHz / 140039.03 rr/s
    > > tbench - 341.523 MB/sec
    > >
    > > 2.6.25.17-up
    > > ring-test - 1.163 us/cycle = 859 KHz (gcc-4.1)
    > > ring-test - 1.129 us/cycle = 885 KHz (gcc-4.3)
    > > netperf - 132102.70 rr/s = 264 KHz sb 275 KHz / 137627.30 rr/s
    > > tbench - 361.71 MB/sec
    > >
    > > ..in 25, something happened that dropped my max context switch rate from
    > > ~930 KHz to ~885 KHz. Maybe I'll have better luck trying to find that.
    > > Added to to-do list. Benchmark mysteries I'm going to have to leave
    > > alone, they've kicked my little butt quite thoroughly ;-)

    >
    > But note that tbench performance improved a bit in 2.6.25.


    Yeah, netperf too.

    > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    > weird.


    23->24 I can understand. In my testing, 23 CFS was not a wonderful
    experience for rapid switchers. 24 is cfs-24.1.

    > Just for the public record here are the numbers I got in my testing.
    > Each entry was run purely on the latest 2.6.X-stable tree for each
    > release. First is the tbench score and then there are 40 numbers
    > which are sparc64 cpu cycle counts of default_wake_function().


    Your numbers seem to ~agree with mine. And yeah, that hrtick is damned
    expensive. I didn't realize _how_ expensive until I trimmed my config
    way way down from distro. Just having highres timers enabled makes a
    very large difference here, even without hrtick enabled, and with the
    overhead of a disabled hrtick removed.

    -Mike


  9. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Saturday, 25 of October 2008, David Miller wrote:
    > From: "Rafael J. Wysocki"
    > Date: Sat, 25 Oct 2008 00:25:34 +0200
    >
    > > On Friday, 10 of October 2008, Ingo Molnar wrote:
    > > >
    > > > * Evgeniy Polyakov wrote:
    > > >
    > > > > On Fri, Oct 10, 2008 at 01:42:45PM +0200, Ingo Molnar (mingo@elte.hu) wrote:
    > > > > > > vanilla 27: 347.222
    > > > > > > no TSO/GSO: 357.331
    > > > > > > no hrticks: 382.983
    > > > > > > no balance: 389.802
    > > > > >
    > > > > > okay. The target is 470 MB/sec, right? (Assuming the workload is sane
    > > > > > and 'fixing' it does not mean we have to schedule worse.)
    > > > >
    > > > > Well, that's where I started/stopped, so maybe we will even move
    > > > > further?
    > > >
    > > > that's the right attitude

    > >
    > > Can anyone please tell me if there was any conclusion of this thread?

    >
    > I made some more analysis in private with Ingo and Peter Z. and found
    > that the tbench decreases correlate pretty much directly with the
    > ongoing increasing cpu cost of wake_up() and friends in the fair
    > scheduler.
    >
    > The largest increase in computational cost of wakeups came in 2.6.27,
    > when the hrtimer bits got added; it more than tripled the cost of a wakeup.
    > In 2.6.28-rc1 the hrtimer feature has been disabled, but I think that
    > should be backported into the 2.6.27-stable branch.


    Thanks a lot for the info.

    Could you please give me a pointer to the commit disabling the hrtimer feature?

    Rafael

  10. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sat, 25 Oct 2008, David Miller wrote:

    > But note that tbench performance improved a bit in 2.6.25.
    > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    > weird.
    > Just for the public record here are the numbers I got in my testing.


    I have currently been looking at a very similar-looking issue. For the
    public record, here are the numbers we have been able to come up with so
    far (measured with dbench, so the absolute values are slightly different,
    but they show a similar pattern):

    208.4 MB/sec -- vanilla 2.6.16.60
    201.6 MB/sec -- vanilla 2.6.20.1
    172.9 MB/sec -- vanilla 2.6.22.19
    74.2 MB/sec -- vanilla 2.6.23
    46.1 MB/sec -- vanilla 2.6.24.2
    30.6 MB/sec -- vanilla 2.6.26.1

    I.e. huge drop for 2.6.23 (this was with default configs for each
    respective kernel).
    2.6.23-rc1 shows 80.5 MB/s, i.e. a few % better than final 2.6.23, but
    still pretty bad.

    I have gone through the commits that went into -rc1 and tried to figure
    out which one could be responsible. Here are the numbers:

    85.3 MB/s for 2ba2d00363 (just before on-demand readahead has been merged)
    82.7 MB/s for 45426812d6 (before cond_resched() has been added into page
    invalidation code)
    187.7 MB/s for c1e4fe711a4 (just before CFS scheduler has been merged)

    So the current biggest suspect is CFS, but I don't have enough numbers yet
    to be able to point a finger at it with 100% certainty. Hopefully soon.

    Just my $0.02

    --
    Jiri Kosina
    SUSE Labs


  11. Re: [tbench regression fixes]: digging out smelly deadmen.

    From: "Rafael J. Wysocki"
    Date: Sat, 25 Oct 2008 13:13:20 +0200

    > Could you please give me a pointer to the commit disabling the hrtimer feature?


    Here it is:

    commit 0c4b83da58ec2e96ce9c44c211d6eac5f9dae478
    Author: Ingo Molnar
    Date: Mon Oct 20 14:27:43 2008 +0200

    sched: disable the hrtick for now

    David Miller reported that hrtick update overhead has tripled the
    wakeup overhead on Sparc64.

    That is too much - disable the HRTICK feature for now by default,
    until a faster implementation is found.

    Reported-by: David Miller
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    diff --git a/kernel/sched_features.h b/kernel/sched_features.h
    index 7c9e8f4..fda0162 100644
    --- a/kernel/sched_features.h
    +++ b/kernel/sched_features.h
    @@ -5,7 +5,7 @@ SCHED_FEAT(START_DEBIT, 1)
    SCHED_FEAT(AFFINE_WAKEUPS, 1)
    SCHED_FEAT(CACHE_HOT_BUDDY, 1)
    SCHED_FEAT(SYNC_WAKEUPS, 1)
    -SCHED_FEAT(HRTICK, 1)
    +SCHED_FEAT(HRTICK, 0)
    SCHED_FEAT(DOUBLE_TICK, 0)
    SCHED_FEAT(ASYM_GRAN, 1)
    SCHED_FEAT(LB_BIAS, 1)
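
    For anyone wanting the same effect without patching: on the 2.6.26/2.6.27
    kernels discussed here, the scheduler feature bits are exposed through
    debugfs when CONFIG_SCHED_DEBUG is set, so the hrtick can also be switched
    off at runtime with "echo NO_HRTICK > /sys/kernel/debug/sched_features"
    (assuming debugfs is mounted there). A small C equivalent, illustrative
    only:

        /* Show the current scheduler feature flags, then disable HRTICK.
         * Assumes CONFIG_SCHED_DEBUG and debugfs mounted on /sys/kernel/debug;
         * equivalent to:
         *   cat /sys/kernel/debug/sched_features
         *   echo NO_HRTICK > /sys/kernel/debug/sched_features
         */
        #include <stdio.h>

        static const char *path = "/sys/kernel/debug/sched_features";

        int main(void)
        {
            char buf[256];
            FILE *f = fopen(path, "r");

            if (f && fgets(buf, sizeof(buf), f))
                printf("before: %s", buf);
            if (f)
                fclose(f);

            f = fopen(path, "w");
            if (!f) {
                perror(path);
                return 1;
            }
            fputs("NO_HRTICK\n", f);
            fclose(f);
            return 0;
        }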

  12. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 2008-10-26 at 01:10 +0200, Jiri Kosina wrote:
    > On Sat, 25 Oct 2008, David Miller wrote:
    >
    > > But note that tbench performance improved a bit in 2.6.25.
    > > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    > > weird.
    > > Just for the public record here are the numbers I got in my testing.

    >
    > I have been currently looking at very similarly looking issue. For the
    > public record, here are the numbers we have been able to come up with so
    > far (measured with dbench, so the absolute values are slightly different,
    > but still shows similar pattern)
    >
    > 208.4 MB/sec -- vanilla 2.6.16.60
    > 201.6 MB/sec -- vanilla 2.6.20.1
    > 172.9 MB/sec -- vanilla 2.6.22.19
    > 74.2 MB/sec -- vanilla 2.6.23
    > 46.1 MB/sec -- vanilla 2.6.24.2
    > 30.6 MB/sec -- vanilla 2.6.26.1
    >
    > I.e. huge drop for 2.6.23 (this was with default configs for each
    > respective kernel).
    > 2.6.23-rc1 shows 80.5 MB/s, i.e. a few % better than final 2.6.23, but
    > still pretty bad.
    >
    > I have gone through the commits that went into -rc1 and tried to figure
    > out which one could be responsible. Here are the numbers:
    >
    > 85.3 MB/s for 2ba2d00363 (just before on-demand readahead has been merged)
    > 82.7 MB/s for 45426812d6 (before cond_resched() has been added into page
    > invalidation code)
    > 187.7 MB/s for c1e4fe711a4 (just before CFS scheduler has been merged)
    >
    > So the current biggest suspect is CFS, but I don't have enough numbers yet
    > to be able to point a finger at it with 100% certainty. Hopefully soon.


    Hi,

    High client count, right?

    I reproduced this on my Q6600 box. However, I also reproduced it with
    2.6.22.19. What I think you're seeing is just dbench creating a massive
    train wreck. With CFS, it appears to be more likely to _sustain_ the wreck
    start->end, but the wreckage is present in O(1) scheduler runs as well,
    and will sustain start->end there as well.

    kernel                      16 procs (MB/sec)   160 procs (MB/sec)
    2.6.22.19-smp                    967.933             147.879
                                     950.325             349.959
                                     953.382             126.821  <== massive jitter
    2.6.22.19-cfs-v24.1-smp          978.047             170.662
                                     943.254              39.388  <== sustained train wreck
                                     934.042             239.574
    2.6.23.17-smp                   1173.97              100.996
                                    1122.85               80.3747
                                    1113.60               99.3723
    2.6.24.7-smp                    1030.34              256.419
                                     970.602             257.008
                                    1056.48              248.841
    2.6.25.19-smp                    955.874              40.5735
                                     943.348              62.3966
                                     937.595              17.4639
    2.6.26.7-smp                     904.564             118.364
                                     891.824              34.2193
                                     880.850              22.4938
    2.6.27.4-smp                     856.660             168.243
                                     880.121             120.132
                                     880.121             142.105

    Check out fugliness:

    2.6.22.19-smp Throughput 35.5075 MB/sec 160 procs (start->end sustained train wreck)

    Full output from above run:

    dbench version 3.04 - Copyright Andrew Tridgell 1999-2004

    Running for 60 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 12 secs
    160 clients started
    160 54 310.43 MB/sec warmup 1 sec
    160 54 155.18 MB/sec warmup 2 sec
    160 54 103.46 MB/sec warmup 3 sec
    160 54 77.59 MB/sec warmup 4 sec
    160 56 64.81 MB/sec warmup 5 sec
    160 57 54.01 MB/sec warmup 6 sec
    160 57 46.29 MB/sec warmup 7 sec
    160 812 129.07 MB/sec warmup 8 sec
    160 1739 205.08 MB/sec warmup 9 sec
    160 2634 262.22 MB/sec warmup 10 sec
    160 3437 305.41 MB/sec warmup 11 sec
    160 3815 307.35 MB/sec warmup 12 sec
    160 4241 311.07 MB/sec warmup 13 sec
    160 5142 344.02 MB/sec warmup 14 sec
    160 5991 369.46 MB/sec warmup 15 sec
    160 6346 369.09 MB/sec warmup 16 sec
    160 6347 347.97 MB/sec warmup 17 sec
    160 6347 328.66 MB/sec warmup 18 sec
    160 6348 311.50 MB/sec warmup 19 sec
    160 6348 0.00 MB/sec execute 1 sec
    160 6348 2.08 MB/sec execute 2 sec
    160 6349 2.75 MB/sec execute 3 sec
    160 6356 16.25 MB/sec execute 4 sec
    160 6360 17.21 MB/sec execute 5 sec
    160 6574 45.07 MB/sec execute 6 sec
    160 6882 76.17 MB/sec execute 7 sec
    160 7006 86.37 MB/sec execute 8 sec
    160 7006 76.77 MB/sec execute 9 sec
    160 7006 69.09 MB/sec execute 10 sec
    160 7039 68.67 MB/sec execute 11 sec
    160 7043 64.71 MB/sec execute 12 sec
    160 7044 60.29 MB/sec execute 13 sec
    160 7044 55.98 MB/sec execute 14 sec
    160 7057 56.13 MB/sec execute 15 sec
    160 7057 52.63 MB/sec execute 16 sec
    160 7059 50.21 MB/sec execute 17 sec
    160 7083 49.73 MB/sec execute 18 sec
    160 7086 48.05 MB/sec execute 19 sec
    160 7088 46.40 MB/sec execute 20 sec
    160 7088 44.19 MB/sec execute 21 sec
    160 7094 43.59 MB/sec execute 22 sec
    160 7094 41.69 MB/sec execute 23 sec
    160 7094 39.96 MB/sec execute 24 sec
    160 7094 38.36 MB/sec execute 25 sec
    160 7094 36.88 MB/sec execute 26 sec
    160 7094 35.52 MB/sec execute 27 sec
    160 7098 34.91 MB/sec execute 28 sec
    160 7124 36.72 MB/sec execute 29 sec
    160 7124 35.50 MB/sec execute 30 sec
    160 7124 34.35 MB/sec execute 31 sec
    160 7124 33.28 MB/sec execute 32 sec
    160 7124 32.27 MB/sec execute 33 sec
    160 7124 31.32 MB/sec execute 34 sec
    160 7283 34.80 MB/sec execute 35 sec
    160 7681 44.95 MB/sec execute 36 sec
    160 7681 43.79 MB/sec execute 37 sec
    160 7681 42.64 MB/sec execute 38 sec
    160 7689 42.23 MB/sec execute 39 sec
    160 7691 41.48 MB/sec execute 40 sec
    160 7693 40.76 MB/sec execute 41 sec
    160 7703 40.54 MB/sec execute 42 sec
    160 7704 39.81 MB/sec execute 43 sec
    160 7704 38.91 MB/sec execute 44 sec
    160 7704 38.04 MB/sec execute 45 sec
    160 7704 37.21 MB/sec execute 46 sec
    160 7704 36.42 MB/sec execute 47 sec
    160 7704 35.66 MB/sec execute 48 sec
    160 7747 36.58 MB/sec execute 49 sec
    160 7854 38.00 MB/sec execute 50 sec
    160 7857 37.65 MB/sec execute 51 sec
    160 7861 37.29 MB/sec execute 52 sec
    160 7862 36.67 MB/sec execute 53 sec
    160 7864 36.21 MB/sec execute 54 sec
    160 7877 35.85 MB/sec execute 55 sec
    160 7877 35.21 MB/sec execute 56 sec
    160 8015 37.11 MB/sec execute 57 sec
    160 8019 36.57 MB/sec execute 58 sec
    160 8019 35.95 MB/sec execute 59 sec
    160 8019 35.36 MB/sec cleanup 60 sec
    160 8019 34.78 MB/sec cleanup 61 sec
    160 8019 34.23 MB/sec cleanup 63 sec
    160 8019 33.69 MB/sec cleanup 64 sec
    160 8019 33.16 MB/sec cleanup 65 sec
    160 8019 32.65 MB/sec cleanup 66 sec
    160 8019 32.21 MB/sec cleanup 67 sec
    160 8019 31.73 MB/sec cleanup 68 sec
    160 8019 31.27 MB/sec cleanup 69 sec
    160 8019 30.84 MB/sec cleanup 70 sec
    160 8019 30.40 MB/sec cleanup 71 sec
    160 8019 29.98 MB/sec cleanup 72 sec
    160 8019 29.58 MB/sec cleanup 73 sec
    160 8019 29.18 MB/sec cleanup 74 sec
    160 8019 29.03 MB/sec cleanup 74 sec

    Throughput 35.5075 MB/sec 160 procs

    Throughput 180.934 MB/sec 160 procs (next run, non-sustained train wreck)

    Full output of this run:

    dbench version 3.04 - Copyright Andrew Tridgell 1999-2004

    Running for 60 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 12 secs
    160 clients started
    160 67 321.43 MB/sec warmup 1 sec
    160 67 160.61 MB/sec warmup 2 sec
    160 67 107.04 MB/sec warmup 3 sec
    160 67 80.27 MB/sec warmup 4 sec
    160 67 64.21 MB/sec warmup 5 sec
    160 267 89.74 MB/sec warmup 6 sec
    160 1022 169.68 MB/sec warmup 7 sec
    160 1821 240.62 MB/sec warmup 8 sec
    160 2591 290.39 MB/sec warmup 9 sec
    160 3125 308.04 MB/sec warmup 10 sec
    160 3125 280.04 MB/sec warmup 11 sec
    160 3217 263.23 MB/sec warmup 12 sec
    160 3725 276.45 MB/sec warmup 13 sec
    160 4237 288.32 MB/sec warmup 14 sec
    160 4748 300.98 MB/sec warmup 15 sec
    160 4810 286.69 MB/sec warmup 16 sec
    160 4812 270.89 MB/sec warmup 17 sec
    160 4812 255.95 MB/sec warmup 18 sec
    160 4812 242.48 MB/sec warmup 19 sec
    160 4812 230.35 MB/sec warmup 20 sec
    160 4812 219.38 MB/sec warmup 21 sec
    160 4812 209.41 MB/sec warmup 22 sec
    160 4812 200.31 MB/sec warmup 23 sec
    160 4812 191.96 MB/sec warmup 24 sec
    160 4812 184.28 MB/sec warmup 25 sec
    160 4812 177.19 MB/sec warmup 26 sec
    160 4836 175.89 MB/sec warmup 27 sec
    160 4836 169.61 MB/sec warmup 28 sec
    160 4841 163.97 MB/sec warmup 29 sec
    160 5004 163.03 MB/sec warmup 30 sec
    160 5450 170.58 MB/sec warmup 31 sec
    160 5951 178.79 MB/sec warmup 32 sec
    160 6086 176.86 MB/sec warmup 33 sec
    160 6127 174.53 MB/sec warmup 34 sec
    160 6129 169.67 MB/sec warmup 35 sec
    160 6131 165.36 MB/sec warmup 36 sec
    160 6137 161.65 MB/sec warmup 37 sec
    160 6141 157.85 MB/sec warmup 38 sec
    160 6145 154.32 MB/sec warmup 39 sec
    160 6145 150.46 MB/sec warmup 40 sec
    160 6145 146.79 MB/sec warmup 41 sec
    160 6145 143.30 MB/sec warmup 42 sec
    160 6145 139.97 MB/sec warmup 43 sec
    160 6145 136.78 MB/sec warmup 44 sec
    160 6145 133.74 MB/sec warmup 45 sec
    160 6145 130.84 MB/sec warmup 46 sec
    160 6145 128.05 MB/sec warmup 47 sec
    160 6178 128.41 MB/sec warmup 48 sec
    160 6180 126.13 MB/sec warmup 49 sec
    160 6184 124.09 MB/sec warmup 50 sec
    160 6187 122.03 MB/sec warmup 51 sec
    160 6192 120.19 MB/sec warmup 52 sec
    160 6196 118.42 MB/sec warmup 53 sec
    160 6228 116.88 MB/sec warmup 54 sec
    160 6231 114.97 MB/sec warmup 55 sec
    160 6231 112.92 MB/sec warmup 56 sec
    160 6398 114.17 MB/sec warmup 57 sec
    160 6401 112.44 MB/sec warmup 58 sec
    160 6402 110.69 MB/sec warmup 59 sec
    160 6402 108.84 MB/sec warmup 60 sec
    160 6405 107.38 MB/sec warmup 61 sec
    160 6405 105.65 MB/sec warmup 62 sec
    160 6407 104.03 MB/sec warmup 64 sec
    160 6431 103.16 MB/sec warmup 65 sec
    160 6432 101.64 MB/sec warmup 66 sec
    160 6432 100.10 MB/sec warmup 67 sec
    160 6460 99.42 MB/sec warmup 68 sec
    160 6698 100.92 MB/sec warmup 69 sec
    160 7218 106.21 MB/sec warmup 70 sec
    160 7254 36.49 MB/sec execute 1 sec
    160 7254 18.24 MB/sec execute 2 sec
    160 7259 21.06 MB/sec execute 3 sec
    160 7359 37.80 MB/sec execute 4 sec
    160 7381 34.05 MB/sec execute 5 sec
    160 7381 28.37 MB/sec execute 6 sec
    160 7381 24.32 MB/sec execute 7 sec
    160 7381 21.28 MB/sec execute 8 sec
    160 7404 21.03 MB/sec execute 9 sec
    160 7647 43.24 MB/sec execute 10 sec
    160 7649 39.94 MB/sec execute 11 sec
    160 7672 38.48 MB/sec execute 12 sec
    160 7680 37.10 MB/sec execute 13 sec
    160 7856 46.09 MB/sec execute 14 sec
    160 7856 43.02 MB/sec execute 15 sec
    160 7856 40.33 MB/sec execute 16 sec
    160 7856 37.99 MB/sec execute 17 sec
    160 8561 71.30 MB/sec execute 18 sec
    160 9070 92.10 MB/sec execute 19 sec
    160 9080 88.86 MB/sec execute 20 sec
    160 9086 86.13 MB/sec execute 21 sec
    160 9089 82.70 MB/sec execute 22 sec
    160 9095 79.98 MB/sec execute 23 sec
    160 9098 77.32 MB/sec execute 24 sec
    160 9101 74.78 MB/sec execute 25 sec
    160 9105 72.70 MB/sec execute 26 sec
    160 9107 70.34 MB/sec execute 27 sec
    160 9110 68.40 MB/sec execute 28 sec
    160 9114 66.60 MB/sec execute 29 sec
    160 9114 64.38 MB/sec execute 30 sec
    160 9114 62.30 MB/sec execute 31 sec
    160 9146 61.31 MB/sec execute 32 sec
    160 9493 68.80 MB/sec execute 33 sec
    160 10040 80.50 MB/sec execute 34 sec
    160 10567 91.12 MB/sec execute 35 sec
    160 10908 96.72 MB/sec execute 36 sec
    160 11234 101.86 MB/sec execute 37 sec
    160 12062 118.23 MB/sec execute 38 sec
    160 12987 135.90 MB/sec execute 39 sec
    160 13883 152.07 MB/sec execute 40 sec
    160 14730 166.18 MB/sec execute 41 sec
    160 14829 165.26 MB/sec execute 42 sec
    160 14836 162.03 MB/sec execute 43 sec
    160 14851 158.64 MB/sec execute 44 sec
    160 14851 155.11 MB/sec execute 45 sec
    160 14851 151.74 MB/sec execute 46 sec
    160 15022 151.70 MB/sec execute 47 sec
    160 15292 153.38 MB/sec execute 48 sec
    160 15580 155.28 MB/sec execute 49 sec
    160 15846 156.73 MB/sec execute 50 sec
    160 16449 164.00 MB/sec execute 51 sec
    160 17097 171.56 MB/sec execute 52 sec
    160 17097 168.32 MB/sec execute 53 sec
    160 17310 168.62 MB/sec execute 54 sec
    160 18075 177.42 MB/sec execute 55 sec
    160 18828 186.31 MB/sec execute 56 sec
    160 18876 184.04 MB/sec execute 57 sec
    160 18876 180.87 MB/sec execute 58 sec
    160 18879 177.81 MB/sec execute 59 sec
    160 19294 180.80 MB/sec cleanup 60 sec
    160 19294 177.84 MB/sec cleanup 61 sec
    160 19294 174.97 MB/sec cleanup 63 sec
    160 19294 172.24 MB/sec cleanup 64 sec
    160 19294 169.55 MB/sec cleanup 65 sec
    160 19294 166.95 MB/sec cleanup 66 sec
    160 19294 164.42 MB/sec cleanup 67 sec
    160 19294 161.97 MB/sec cleanup 68 sec
    160 19294 159.59 MB/sec cleanup 69 sec
    160 19294 157.28 MB/sec cleanup 70 sec
    160 19294 155.03 MB/sec cleanup 71 sec
    160 19294 152.86 MB/sec cleanup 72 sec
    160 19294 150.76 MB/sec cleanup 73 sec
    160 19294 148.71 MB/sec cleanup 74 sec
    160 19294 146.70 MB/sec cleanup 75 sec
    160 19294 144.75 MB/sec cleanup 76 sec
    160 19294 142.85 MB/sec cleanup 77 sec
    160 19294 141.72 MB/sec cleanup 77 sec

    Throughput 180.934 MB/sec 160 procs



  13. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 2008-10-26 at 09:46 +0100, Mike Galbraith wrote:
    > On Sun, 2008-10-26 at 01:10 +0200, Jiri Kosina wrote:
    > > On Sat, 25 Oct 2008, David Miller wrote:
    > >
    > > > But note that tbench performance improved a bit in 2.6.25.
    > > > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    > > > weird.
    > > > Just for the public record here are the numbers I got in my testing.

    > >
    > > I have been currently looking at very similarly looking issue. For the
    > > public record, here are the numbers we have been able to come up with so
    > > far (measured with dbench, so the absolute values are slightly different,
    > > but still shows similar pattern)
    > >
    > > 208.4 MB/sec -- vanilla 2.6.16.60
    > > 201.6 MB/sec -- vanilla 2.6.20.1
    > > 172.9 MB/sec -- vanilla 2.6.22.19
    > > 74.2 MB/sec -- vanilla 2.6.23
    > > 46.1 MB/sec -- vanilla 2.6.24.2
    > > 30.6 MB/sec -- vanilla 2.6.26.1
    > >
    > > I.e. huge drop for 2.6.23 (this was with default configs for each
    > > respective kernel).
    > > 2.6.23-rc1 shows 80.5 MB/s, i.e. a few % better than final 2.6.23, but
    > > still pretty bad.
    > >
    > > I have gone through the commits that went into -rc1 and tried to figure
    > > out which one could be responsible. Here are the numbers:
    > >
    > > 85.3 MB/s for 2ba2d00363 (just before on-demand readahead has been merged)
    > > 82.7 MB/s for 45426812d6 (before cond_resched() has been added into page
    > > invalidation code)
    > > 187.7 MB/s for c1e4fe711a4 (just before CFS scheduler has been merged)
    > >
    > > So the current biggest suspect is CFS, but I don't have enough numbers yet
    > > to be able to point a finger at it with 100% certainty. Hopefully soon.


    > I reproduced this on my Q6600 box. However, I also reproduced it with
    > 2.6.22.19. What I think you're seeing is just dbench creating a
    > massive train wreck.


    wasn't dbench one of those non-benchmarks that thrives on randomness and
    unfairness?

    Andrew said recently:
    "dbench is pretty chaotic and it could be that a good change causes
    dbench to get worse. That's happened plenty of times in the past."

    So I'm not inclined to worry too much about dbench in any way shape or
    form.




  14. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 26 Oct 2008 10:00:48 +0100 Peter Zijlstra wrote:

    > On Sun, 2008-10-26 at 09:46 +0100, Mike Galbraith wrote:
    > > On Sun, 2008-10-26 at 01:10 +0200, Jiri Kosina wrote:
    > > > On Sat, 25 Oct 2008, David Miller wrote:
    > > >
    > > > > But note that tbench performance improved a bit in 2.6.25.
    > > > > In my tests I noticed a similar effect, but from 2.6.23 to 2.6.24,
    > > > > weird.
    > > > > Just for the public record here are the numbers I got in my testing.
    > > >
    > > > I have been currently looking at very similarly looking issue. For the
    > > > public record, here are the numbers we have been able to come up with so
    > > > far (measured with dbench, so the absolute values are slightly different,
    > > > but still shows similar pattern)
    > > >
    > > > 208.4 MB/sec -- vanilla 2.6.16.60
    > > > 201.6 MB/sec -- vanilla 2.6.20.1
    > > > 172.9 MB/sec -- vanilla 2.6.22.19
    > > > 74.2 MB/sec -- vanilla 2.6.23
    > > > 46.1 MB/sec -- vanilla 2.6.24.2
    > > > 30.6 MB/sec -- vanilla 2.6.26.1
    > > >
    > > > I.e. huge drop for 2.6.23 (this was with default configs for each
    > > > respective kernel).


    Was this when we decreased the default value of
    /proc/sys/vm/dirty_ratio, perhaps? dbench is sensitive to that.

    > > > 2.6.23-rc1 shows 80.5 MB/s, i.e. a few % better than final 2.6.23, but
    > > > still pretty bad.
    > > >
    > > > I have gone through the commits that went into -rc1 and tried to figure
    > > > out which one could be responsible. Here are the numbers:
    > > >
    > > > 85.3 MB/s for 2ba2d00363 (just before on-demand readahead has been merged)
    > > > 82.7 MB/s for 45426812d6 (before cond_resched() has been added into page
    > > > invalidation code)
    > > > 187.7 MB/s for c1e4fe711a4 (just before CFS scheduler has been merged)
    > > >
    > > > So the current biggest suspect is CFS, but I don't have enough numbers yet
    > > > to be able to point a finger at it with 100% certainty. Hopefully soon.

    >
    > > I reproduced this on my Q6600 box. However, I also reproduced it with
    > > 2.6.22.19. What I think you're seeing is just dbench creating a
    > > massive train wreck.

    >
    > wasn't dbench one of those non-benchmarks that thrives on randomness and
    > unfairness?
    >
    > Andrew said recently:
    > "dbench is pretty chaotic and it could be that a good change causes
    > dbench to get worse. That's happened plenty of times in the past."
    >
    > So I'm not inclined to worry too much about dbench in any way shape or
    > form.


    Well. If there is a consistent change in dbench throughput, it is
    important that we at least understand the reasons for it. But we
    don't necessarily want to optimise for dbench throughput.


  15. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 2008-10-26 at 10:00 +0100, Peter Zijlstra wrote:
    > On Sun, 2008-10-26 at 09:46 +0100, Mike Galbraith wrote:


    > > I reproduced this on my Q6600 box. However, I also reproduced it with
    > > 2.6.22.19. What I think you're seeing is just dbench creating a
    > > massive train wreck.

    >
    > wasn't dbench one of those non-benchmarks that thrives on randomness and
    > unfairness?
    >
    > Andrew said recently:
    > "dbench is pretty chaotic and it could be that a good change causes
    > dbench to get worse. That's happened plenty of times in the past."
    >
    > So I'm not inclined to worry too much about dbench in any way shape or
    > form.


    Yeah, I was just curious. The switch rate of dbench isn't high enough
    for math to be an issue, so I wondered how the heck CFS could be such a
    huge problem for this load. Looks to me like all the math in the
    _world_ couldn't hurt.. or help.

    -Mike


  16. Re: [tbench regression fixes]: digging out smelly deadmen.

    Hi.

    On Sun, Oct 26, 2008 at 02:11:53AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
    > > Andrew said recently:
    > > "dbench is pretty chaotic and it could be that a good change causes
    > > dbench to get worse. That's happened plenty of times in the past."
    > >
    > > So I'm not inclined to worry too much about dbench in any way shape or
    > > form.

    >
    > Well. If there is a consistent change in dbench throughput, it is
    > important that we at least understand the reasons for it. But we
    > don't necessarily want to optimise for dbench throughput.


    Sorry, but such excuses do not deserve to be said. No matter how
    ugly, wrong, unusual or whatever else you might say about some test,
    it shows the problem, which has to be fixed. There is no 'dbench tune';
    there is a fair number of problems, and at least several of them dbench
    has already helped to narrow down and precisely locate. The same regressions
    were also observed in other benchmarks, originally reported before I
    started this thread.

    --
    Evgeniy Polyakov

  17. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 26 Oct 2008 12:27:22 +0300 Evgeniy Polyakov wrote:

    > Hi.
    >
    > On Sun, Oct 26, 2008 at 02:11:53AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
    > > > Andrew said recently:
    > > > "dbench is pretty chaotic and it could be that a good change causes
    > > > dbench to get worse. That's happened plenty of times in the past."
    > > >
    > > > So I'm not inclined to worry too much about dbench in any way shape or
    > > > form.

    > >
    > > Well. If there is a consistent change in dbench throughput, it is
    > > important that we at least understand the reasons for it. But we
    > > don't necessarily want to optimise for dbench throughput.

    >
    > Sorry, but such excuses do not deserve to be said. No matter how
    > ugly, wrong, unusual or whatever else you might say about some test,
    > it shows the problem, which has to be fixed.


    Not necessarily. There are times when we have made changes which we
    knew full well reduced dbench's throughput, because we believed them to
    be of overall benefit. I referred to one of them above.

    > There is no 'dbench tune',
    > there is fair number of problems, and at least several of them dbench
    > already helped to narrow down and precisely locate. The same regressions
    > were also observed in other benchmarks, originally reported before I
    > started this thread.


    You seem to be saying what I said.

  18. Re: [tbench regression fixes]: digging out smelly deadmen.

    Hi Andrew.

    On Sun, Oct 26, 2008 at 02:34:39AM -0700, Andrew Morton (akpm@linux-foundation.org) wrote:
    > Not necessarily. There are times when we have made changes which we
    > knew full well reduced dbench's throughput, because we believed them to
    > be of overall benefit. I referred to one of them above.


    I suppose there were words to the effect that dbench is not a real-life
    test, so if it suddenly sucks, no one will care. Sigh, theorists...
    I'm not surprised there were no changes when I reported hrtimers to be
    the main guilty factor in my setup for dbench tests; something was only
    done when David showed that they also killed his sparcs via wake_up().
    Now this regression has even disappeared from the list.
    Good direction, we should always follow it.

    As a side note, is the hrtimer subsystem also used for the BH backend? I
    have not yet analyzed the data, but vanilla kernels are only able to accept
    clients at 20-30k accepts per second, while some other magical tree
    (not vanilla) around 2.6.18 was able to do that at 50k accepts per
    second. There are lots of CPUs, RAM and bandwidth which are effectively
    unused even behind a Linux load balancer...

    --
    Evgeniy Polyakov

  19. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sun, 2008-10-26 at 02:11 -0700, Andrew Morton wrote:

    > Was this when we decreased the default value of
    > /proc/sys/vm/dirty_ratio, perhaps? dbench is sensitive to that.


    Wow, indeed. I fired up an ext2 disk to take kjournald out of the
    picture (dunno, just a transient thought). Stock settings produced
    three perma-wrecks in a row. With dirty_ratio bumped to 50, three
    considerably nicer results in a row appeared.

    2.6.26.7-smp dirty_ratio = 10 (stock)
    Throughput 36.3649 MB/sec 160 procs
    Throughput 47.0787 MB/sec 160 procs
    Throughput 88.2055 MB/sec 160 procs

    2.6.26.7-smp dirty_ratio = 50
    Throughput 1009.98 MB/sec 160 procs
    Throughput 1101.57 MB/sec 160 procs
    Throughput 943.205 MB/sec 160 procs
    
    -Mike
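
    The knob being turned here is /proc/sys/vm/dirty_ratio, the percentage of
    memory that may be dirtied before writers are throttled into doing
    writeback themselves. Presumably it was bumped with the usual
    "echo 50 > /proc/sys/vm/dirty_ratio" or "sysctl -w vm.dirty_ratio=50"; a
    trivial C equivalent, for completeness:

        /* Raise vm.dirty_ratio to 50, equivalent to
         * "echo 50 > /proc/sys/vm/dirty_ratio" (needs root).
         */
        #include <stdio.h>

        int main(void)
        {
            FILE *f = fopen("/proc/sys/vm/dirty_ratio", "w");

            if (!f) {
                perror("/proc/sys/vm/dirty_ratio");
                return 1;
            }
            fprintf(f, "50\n");
            fclose(f);
            return 0;
        }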


  20. Re: [tbench regression fixes]: digging out smelly deadmen.

    On Sunday, 26 of October 2008, David Miller wrote:
    > From: "Rafael J. Wysocki"
    > Date: Sat, 25 Oct 2008 13:13:20 +0200
    >
    > > Could you please give me a pointer to the commit disabling the hrtimer feature?

    >
    > Here it is:


    Thanks a lot!

    > commit 0c4b83da58ec2e96ce9c44c211d6eac5f9dae478
    > Author: Ingo Molnar
    > Date: Mon Oct 20 14:27:43 2008 +0200
    >
    > sched: disable the hrtick for now
    >
    > David Miller reported that hrtick update overhead has tripled the
    > wakeup overhead on Sparc64.
    >
    > That is too much - disable the HRTICK feature for now by default,
    > until a faster implementation is found.
    >
    > Reported-by: David Miller
    > Acked-by: Peter Zijlstra
    > Signed-off-by: Ingo Molnar
    >
    > diff --git a/kernel/sched_features.h b/kernel/sched_features.h
    > index 7c9e8f4..fda0162 100644
    > --- a/kernel/sched_features.h
    > +++ b/kernel/sched_features.h
    > @@ -5,7 +5,7 @@ SCHED_FEAT(START_DEBIT, 1)
    > SCHED_FEAT(AFFINE_WAKEUPS, 1)
    > SCHED_FEAT(CACHE_HOT_BUDDY, 1)
    > SCHED_FEAT(SYNC_WAKEUPS, 1)
    > -SCHED_FEAT(HRTICK, 1)
    > +SCHED_FEAT(HRTICK, 0)
    > SCHED_FEAT(DOUBLE_TICK, 0)
    > SCHED_FEAT(ASYM_GRAN, 1)
    > SCHED_FEAT(LB_BIAS, 1)


    Rafael
