Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system


Thread: Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

  1. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    runs of the following on both it and after applying Mathieu Desnoyers
    11-patch sequence (19 September 2007).

    * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    volume per MSA used)

    * 10 runs with each configuration, averages shown below
    o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
    o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
    o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
    o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running

    * A run consists of doing the following in parallel:
    o Make an ext3 FS on each of the 10 volumes
    o Mount & unmount each volume
    + The unmounting generates a tremendous amount of writes
    to the disks - thus stressing the intended storage
    devices (10 volumes) plus the separate volume for all
    the blktrace data (when blk tracing is enabled).
    + Note the times reported below only cover the
    make/mount/unmount time - the actual blktrace runs
    extended beyond the times measured (took quite a while
    for the blk trace data to be output). We're only
    concerned with the impact on the "application"
    performance in this instance.
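
    A rough, illustrative sketch of that timing loop is below. It is not the
    actual test harness; the device and mount-point names (/dev/vol0,
    /mnt/vol0, ...) are placeholders, not the real MSA volumes.

        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/wait.h>
        #include <time.h>
        #include <unistd.h>

        #define NVOLS 10

        /* mkfs + mount + umount one volume; runs in a child process */
        static void run_one(int i)
        {
            char cmd[256];
            snprintf(cmd, sizeof(cmd),
                     "mkfs.ext3 -q /dev/vol%d && "
                     "mount /dev/vol%d /mnt/vol%d && "
                     "umount /mnt/vol%d", i, i, i, i);
            _exit(system(cmd) != 0);
        }

        int main(void)
        {
            struct timespec t0, t1;
            int i;

            clock_gettime(CLOCK_MONOTONIC, &t0);

            for (i = 0; i < NVOLS; i++)      /* all ten volumes in parallel */
                if (fork() == 0)
                    run_one(i);

            for (i = 0; i < NVOLS; i++)
                wait(NULL);

            /* only the make/mount/unmount window is timed; any blktrace
             * output continues after this point and is not counted */
            clock_gettime(CLOCK_MONOTONIC, &t1);
            printf("elapsed: %.6f s\n",
                   (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
            return 0;
        }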

    Results are:

    Kernel                                w/out BT  STDDEV w/ BT     STDDEV
    ------------------------------------- --------- ------ --------- ------
    2.6.23-rc6 + 2.6.23-rc6-mm1           14.679982 0.34   27.754796 2.09
    2.6.23-rc6 + 2.6.23-rc6-mm1 + markers 14.993041 0.59   26.694993 3.23

    It looks to be about a 2.1% increase in time to do the make/mount/unmount
    operations with the marker patches in place and no blktrace running.
    With blktrace running, we see about a 3.8% decrease in time to do the
    same operations.
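
    For reference, those percentages follow directly from the averages in
    the table above:

        (14.993041 - 14.679982) / 14.679982 ~= 0.021  (about 2.1% slower, no blktrace)
        (27.754796 - 26.694993) / 27.754796 ~= 0.038  (about 3.8% faster, with blktrace)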

    When our Oracle benchmarking machine frees up, and when the
    marker/blktrace patches are more stable, we'll try to get some "real"
    Oracle benchmark runs done to gauge the impact of the marker changes on
    performance...

    Alan D. Brunelle
    Hewlett-Packard / Open Source and Linux Organization / Scalability and
    Performance Group


  2. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    * Alan D. Brunelle (Alan.Brunelle@hp.com) wrote:
    > Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    > runs of the following on both it and after applying Mathieu Desnoyers
    > 11-patch sequence (19 September 2007).
    >
    > * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    > volume per MSA used)
    >
    > * 10 runs with each configuration, averages shown below
    > o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
    > o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
    > o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
    > o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running
    >
    > * A run consists of doing the following in parallel:
    > o Make an ext3 FS on each of the 10 volumes
    > o Mount & unmount each volume
    > + The unmounting generates a tremendous amount of writes
    > to the disks - thus stressing the intended storage
    > devices (10 volumes) plus the separate volume for all
    > the blktrace data (when blk tracing is enabled).
    > + Note the times reported below only cover the
    > make/mount/unmount time - the actual blktrace runs
    > extended beyond the times measured (took quite a while
    > for the blk trace data to be output). We're only
    > concerned with the impact on the "application"
    > performance in this instance.
    >
    > Results are:
    >
    > Kernel w/out BT STDDEV w/ BT STDDEV
    > ------------------------------------- --------- ------ --------- ------
    > 2.6.23-rc6 + 2.6.23-rc6-mm1 14.679982 0.34 27.754796 2.09
    > 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers 14.993041 0.59 26.694993 3.23
    >


    Interesting results, although given the std dev we cannot say that either
    configuration has much impact.

    Also, it could be interesting to add a "blktrace compiled out" kernel as
    a baseline.

    Thanks for running those tests,

    Mathieu

    > It looks to be about 2.1% increase in time to do the make/mount/unmount
    > operations with the marker patches in place and no blktrace operations.
    > With the blktrace operations in place we see about a 3.8% decrease in
    > time to do the same ops.
    >
    > When our Oracle benchmarking machine frees up, and when the
    > marker/blktrace patches are more stable, we'll try to get some "real"
    > Oracle benchmark runs done to gage the impact of the markers changes to
    > performance...
    >
    > Alan D. Brunelle
    > Hewlett-Packard / Open Source and Linux Organization / Scalability and
    > Performance Group
    >


    --
    Mathieu Desnoyers
    Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

  3. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    Mathieu Desnoyers wrote:
    > * Alan D. Brunelle (Alan.Brunelle@hp.com) wrote:
    >> Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    >> runs of the following on both it and after applying Mathieu Desnoyers
    >> 11-patch sequence (19 September 2007).
    >>
    >> * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    >> volume per MSA used)
    >>
    >> * 10 runs with each configuration, averages shown below
    >> o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
    >> o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
    >> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
    >> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running
    >>
    >> * A run consists of doing the following in parallel:
    >> o Make an ext3 FS on each of the 10 volumes
    >> o Mount & unmount each volume
    >> + The unmounting generates a tremendous amount of writes
    >> to the disks - thus stressing the intended storage
    >> devices (10 volumes) plus the separate volume for all
    >> the blktrace data (when blk tracing is enabled).
    >> + Note the times reported below only cover the
    >> make/mount/unmount time - the actual blktrace runs
    >> extended beyond the times measured (took quite a while
    >> for the blk trace data to be output). We're only
    >> concerned with the impact on the "application"
    >> performance in this instance.
    >>
    >> Results are:
    >>
    >> Kernel w/out BT STDDEV w/ BT STDDEV
    >> ------------------------------------- --------- ------ --------- ------
    >> 2.6.23-rc6 + 2.6.23-rc6-mm1 14.679982 0.34 27.754796 2.09
    >> 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers 14.993041 0.59 26.694993 3.23
    >>

    >
    > Interesting results, although we cannot say any of the solutions has much
    > impact due to the std dev.
    >
    > Also, it could be interesting to add the "blktrace compiled out" as a
    > base line.
    >
    > Thanks for running those tests,
    >
    > Mathieu

    Mathieu:

    Here are the results from 6 different kernels (including ones with
    blktrace not configured in), with now performing 40 runs per kernel.

    o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1

    o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
    configured respectively.

    o '- markers' or '+ markers' means a kernel without or with the
    11-patch marker series respectively.

    38 runs without blk traces being captured (dropped hi/lo value from 40 runs)

    Kernel Options     Min val   Avg val   Max val   Std Dev
    ------------------ --------- --------- --------- ---------
    - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    + markers - bt cfg 15.280382 16.202398 16.409257 0.191861

    - markers + bt cfg 14.464366 14.754347 16.052306 0.463665
    + markers + bt cfg 14.421765 14.644406 15.690871 0.233885

    38 runs with blk traces being captured (dropped hi/lo value from 40 runs)

    Kernel Options     Min val   Avg val   Max val   Std Dev
    ------------------ --------- --------- --------- ---------
    - markers + bt cfg 24.675859 28.480446 32.571484 1.713603
    + markers + bt cfg 18.713280 27.054927 31.684325 2.857186

    o It is not at all clear why running without blktrace configured
    into the kernel is slower than running with blktrace configured in
    (a 9.6 to 10.6% reduction).

    o The data is still not conclusive as to whether the marker patches
    change performance characteristics when we're not gathering traces.
    It appears that any change in performance is minimal at worst for
    this test.

    o The data so far still doesn't conclusively show a win in this case
    even when we are capturing traces, although the average certainly
    seems to be in its favor.

    One concern that I should be able to deal with easily is the choice of
    IO scheduler, both for the volumes the test is performed on and for the
    separate volume used to store the blk traces (when enabled). Right now
    I am using the default CFQ, when perhaps NOOP or DEADLINE would be a
    better choice. If there is enough interest in seeing how that changes
    things, I could try to get some runs in later this week.
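
    (For what it's worth, switching schedulers per device is just a sysfs
    write; a minimal sketch follows, with the device name as a placeholder.)

        /* Minimal sketch: select the "deadline" elevator for one block
         * device by writing to its sysfs queue/scheduler file.  The device
         * name below is a placeholder, not one of the MSA volumes. */
        #include <stdio.h>

        int main(void)
        {
            FILE *f = fopen("/sys/block/sdc/queue/scheduler", "w");

            if (!f) {
                perror("fopen");
                return 1;
            }
            fputs("deadline\n", f);    /* "noop" or "cfq" work the same way */
            fclose(f);
            return 0;
        }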

    Alan D. Brunelle
    Hewlett-Packard / Open Source and Linux Organization / Scalability and
    Performance Group


  4. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    On Wed, Sep 26 2007, Alan D. Brunelle wrote:
    > Mathieu Desnoyers wrote:
    >> * Alan D. Brunelle (Alan.Brunelle@hp.com) wrote:
    >>> Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    >>> runs of the following on both it and after applying Mathieu Desnoyers
    >>> 11-patch sequence (19 September 2007).
    >>>
    >>> * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    >>> volume per MSA used)
    >>>
    >>> * 10 runs with each configuration, averages shown below
    >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
    >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
    >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
    >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running
    >>>
    >>> * A run consists of doing the following in parallel:
    >>> o Make an ext3 FS on each of the 10 volumes
    >>> o Mount & unmount each volume
    >>> + The unmounting generates a tremendous amount of writes
    >>> to the disks - thus stressing the intended storage
    >>> devices (10 volumes) plus the separate volume for all
    >>> the blktrace data (when blk tracing is enabled).
    >>> + Note the times reported below only cover the
    >>> make/mount/unmount time - the actual blktrace runs
    >>> extended beyond the times measured (took quite a while
    >>> for the blk trace data to be output). We're only
    >>> concerned with the impact on the "application"
    >>> performance in this instance.
    >>>
    >>> Results are:
    >>>
    >>> Kernel                                w/out BT  STDDEV w/ BT     STDDEV
    >>> ------------------------------------- --------- ------ --------- ------
    >>> 2.6.23-rc6 + 2.6.23-rc6-mm1           14.679982 0.34   27.754796 2.09
    >>> 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers 14.993041 0.59   26.694993 3.23
    >>>

    >>
    >> Interesting results, although we cannot say any of the solutions has much
    >> impact due to the std dev.
    >>
    >> Also, it could be interesting to add the "blktrace compiled out" as a
    >> base line.
    >>
    >> Thanks for running those tests,
    >>
    >> Mathieu

    > Mathieu:
    >
    > Here are the results from 6 different kernels (including ones with blktrace
    > not configured in), with now performing 40 runs per kernel.
    >
    > o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1
    >
    > o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
    > configured respectively.
    >
    > o '- markers' or '+ markers' means a kernel without or with the 11-patch
    > marker series respectively.
    >
    > 38 runs without blk traces being captured (dropped hi/lo value from 40
    > runs)
    >
    > Kernel Options Min val Avg val Max val Std Dev
    > ------------------ --------- --------- --------- ---------
    > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861
    >
    > - markers + bt cfg 14.464366 14.754347 16.052306 0.463665
    > + markers + bt cfg 14.421765 14.644406 15.690871 0.233885
    >
    > 38 runs with blk traces being captured (dropped hi/lo value from 40 runs)
    >
    > Kernel Options Min val Avg val Max val Std Dev
    > ------------------ --------- --------- --------- ---------
    > - markers + bt cfg 24.675859 28.480446 32.571484 1.713603
    > + markers + bt cfg 18.713280 27.054927 31.684325 2.857186
    >
    > o It is not at all clear why running without blk trace configured into
    > the kernel runs slower than with blk trace configured in. (9.6 to 10.6%
    > reduction)
    > o The data is still not conclusive with respect to whether the marker
    > patches change performance characteristics when we're not gathering traces.
    > It appears
    > that any change in performance is minimal at worst for this test.
    > o The data so far still doesn't conclusively show a win in this case
    > even when we are capturing traces, although, the average certainly seems to
    > be in its favor.
    > One concern that I should be able to deal easily with is the choice of
    > the IO scheduler being used for both the volume being used to perform the
    > test on, as well as the one used for storing blk traces (when enabled).
    > Right now I was using the default CFQ, when perhaps NOOP or DEADLINE would
    > be a better choice. If there is enough interest in seeing how that changes
    > things I could try to get some runs in later this week.


    Alan,

    Thanks for running these numbers as well. I don't think you have to
    bother with it any more. My main concern was a performance regression
    increasing the overhead of running blktrace. So while we (well, you :-))
    could run more tests, I'd say the above is Good Enough for me. Mathieu,
    you can add my Acked-by: Jens Axboe <jens.axboe@oracle.com> to the
    blktrace part of your marker series.

    I do wonder about that performance _increase_ with blktrace enabled. I
    remember that we have seen and discussed something like this before,
    it's still a puzzle to me...

    --
    Jens Axboe


  5. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    * Jens Axboe (jens.axboe@oracle.com) wrote:
    > On Wed, Sep 26 2007, Alan D. Brunelle wrote:
    > > Mathieu Desnoyers wrote:
    > >> * Alan D. Brunelle (Alan.Brunelle@hp.com) wrote:
    > >>> Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    > >>> runs of the following on both it and after applying Mathieu Desnoyers
    > >>> 11-patch sequence (19 September 2007).
    > >>>
    > >>> * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    > >>> volume per MSA used)
    > >>>
    > >>> * 10 runs with each configuration, averages shown below
    > >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
    > >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
    > >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
    > >>> o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running
    > >>>
    > >>> * A run consists of doing the following in parallel:
    > >>> o Make an ext3 FS on each of the 10 volumes
    > >>> o Mount & unmount each volume
    > >>> + The unmounting generates a tremendous amount of writes
    > >>> to the disks - thus stressing the intended storage
    > >>> devices (10 volumes) plus the separate volume for all
    > >>> the blktrace data (when blk tracing is enabled).
    > >>> + Note the times reported below only cover the
    > >>> make/mount/unmount time - the actual blktrace runs
    > >>> extended beyond the times measured (took quite a while
    > >>> for the blk trace data to be output). We're only
    > >>> concerned with the impact on the "application"
    > >>> performance in this instance.
    > >>>
    > >>> Results are:
    > >>>
    > >>> Kernel                                w/out BT  STDDEV w/ BT     STDDEV
    > >>> ------------------------------------- --------- ------ --------- ------
    > >>> 2.6.23-rc6 + 2.6.23-rc6-mm1           14.679982 0.34   27.754796 2.09
    > >>> 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers 14.993041 0.59   26.694993 3.23
    > >>>
    > >>
    > >> Interesting results, although we cannot say any of the solutions has much
    > >> impact due to the std dev.
    > >>
    > >> Also, it could be interesting to add the "blktrace compiled out" as a
    > >> base line.
    > >>
    > >> Thanks for running those tests,
    > >>
    > >> Mathieu

    > > Mathieu:
    > >
    > > Here are the results from 6 different kernels (including ones with blktrace
    > > not configured in), with now performing 40 runs per kernel.
    > >
    > > o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1
    > >
    > > o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
    > > configured respectively.
    > >
    > > o '- markers' or '+ markers' means a kernel without or with the 11-patch
    > > marker series respectively.
    > >
    > > 38 runs without blk traces being captured (dropped hi/lo value from 40
    > > runs)
    > >
    > > Kernel Options Min val Avg val Max val Std Dev
    > > ------------------ --------- --------- --------- ---------
    > > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861
    > >
    > > - markers + bt cfg 14.464366 14.754347 16.052306 0.463665
    > > + markers + bt cfg 14.421765 14.644406 15.690871 0.233885
    > >
    > > 38 runs with blk traces being captured (dropped hi/lo value from 40 runs)
    > >
    > > Kernel Options Min val Avg val Max val Std Dev
    > > ------------------ --------- --------- --------- ---------
    > > - markers + bt cfg 24.675859 28.480446 32.571484 1.713603
    > > + markers + bt cfg 18.713280 27.054927 31.684325 2.857186
    > >
    > > o It is not at all clear why running without blk trace configured into
    > > the kernel runs slower than with blk trace configured in. (9.6 to 10.6%
    > > reduction)
    > > o The data is still not conclusive with respect to whether the marker
    > > patches change performance characteristics when we're not gathering traces.
    > > It appears
    > > that any change in performance is minimal at worst for this test.
    > > o The data so far still doesn't conclusively show a win in this case
    > > even when we are capturing traces, although, the average certainly seems to
    > > be in its favor.
    > > One concern that I should be able to deal easily with is the choice of
    > > the IO scheduler being used for both the volume being used to perform the
    > > test on, as well as the one used for storing blk traces (when enabled).
    > > Right now I was using the default CFQ, when perhaps NOOP or DEADLINE would
    > > be a better choice. If there is enough interest in seeing how that changes
    > > things I could try to get some runs in later this week.

    >
    > Alan,
    >
    > Thanks for running these numbers as well. I don't think you have to
    > bother with it more. My main concern was a performance regression,
    > increasing the overhead of running blktrace. So while we (well, you :-))
    > could run more tests, I'd say the above is Good Enough for me. Mathieu,
    > you can add my Acked-by: Jens Axboe to the
    > blktrace part of your marker series.
    >

    thanks!

    > I do wonder about that performance _increase_ with blktrace enabled. I
    > remember that we have seen and discussed something like this before,
    > it's still a puzzle to me...
    >


    Interesting question indeed.

    In those tests, when blktrace is running, are the relay buffers only
    written to, or are they also read?

    Running the tests without consuming the buffers (in overwrite mode)
    would tell us more about the nature of the disturbance causing the
    performance increase.

    Also, a kernel trace could help us understand more thoroughly what is
    happening there... is it caused by the scheduler? memory allocation?
    data cache alignment?

    I would suggest that you try aligning the block-layer data structures
    accessed by blktrace on the L2 cacheline size and compare the results
    (when blktrace is disabled).
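
    (A minimal illustration of the kind of alignment experiment being
    suggested; the structure and the 128-byte line size are assumptions for
    the sketch, not taken from the kernel source.)

        /* Illustrative only: pad/align a hot statistics structure to an
         * assumed 128-byte L2 cacheline so its fields do not share a line
         * with other hot data (avoiding false sharing).  Not an actual
         * kernel structure. */
        #define L2_CACHELINE_BYTES 128

        struct example_trace_hot_data {
            unsigned long traced_ios;
            unsigned long traced_sectors;
        } __attribute__((__aligned__(L2_CACHELINE_BYTES)));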

    Mathieu

    > --
    > Jens Axboe
    >


    --
    Mathieu Desnoyers
    Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

  6. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    Mathieu Desnoyers wrote:
    >> I do wonder about that performance _increase_ with blktrace enabled. I
    >> remember that we have seen and discussed something like this before,
    >> it's still a puzzle to me...
    >
    > Interesting question indeed.
    >
    > In those tests, when blktrace is running, are the relay buffers only
    > written to or they are also read ?
    >


    blktrace (the utility) was running too - so the relay buffers /were/
    being read and stored out to disk elsewhere.

    > Running the tests without consuming the buffers (in overwrite mode)
    > would tell us more about the nature of the disturbance causing the
    > performance increase.
    >


    I'd have to write a utility to enable the traces but then not read them.
    Let me think about that.
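
    (Roughly, such a utility could drive the same ioctls the blktrace tool
    uses, just without ever reading the relay files back. A hedged, untested
    sketch, with the device path as a placeholder:)

        /* Untested sketch: start block tracing on a device via the blktrace
         * ioctls and never consume the relay buffers.  Error handling is
         * minimal and the device path is a placeholder. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/ioctl.h>
        #include <linux/fs.h>
        #include <linux/blktrace_api.h>

        int main(void)
        {
            struct blk_user_trace_setup buts;
            int fd = open("/dev/sdc", O_RDONLY);   /* placeholder device */

            if (fd < 0) {
                perror("open");
                return 1;
            }

            memset(&buts, 0, sizeof(buts));
            buts.act_mask = 0xffff;          /* trace all action types */
            buts.buf_size = 512 * 1024;      /* relay sub-buffer size */
            buts.buf_nr   = 4;               /* sub-buffers per CPU */

            if (ioctl(fd, BLKTRACESETUP, &buts) < 0) {
                perror("BLKTRACESETUP");
                return 1;
            }
            if (ioctl(fd, BLKTRACESTART) < 0) {
                perror("BLKTRACESTART");
                return 1;
            }

            sleep(60);       /* run the workload; the buffers are never read */

            ioctl(fd, BLKTRACESTOP);
            ioctl(fd, BLKTRACETEARDOWN);
            close(fd);
            return 0;
        }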

    > Also, a kernel trace could help us understand more thoroughly what is
    > happening there.. is it caused by the scheduler ? memory allocation ?
    > data cache alignment ?
    >


    Yep - when I get some time, I'll look into that. [Clearly not a gating
    issue for marker support...]

    > I would suggest that you try aligning the block layer data structures
    > accessed by blktrace on L2 cacheline size and compare the results (when
    > blktrace is disabled).
    >


    Again, when I get some time! :-)

    Alan

  7. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system


    * Alan D. Brunelle wrote:

    > o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1
    >
    > o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
    > configured respectively.
    >
    > o '- markers' or '+ markers' means a kernel without or with the
    > 11-patch marker series respectively.
    >
    > 38 runs without blk traces being captured (dropped hi/lo value from 40 runs)
    >
    > Kernel Options Min val Avg val Max val Std Dev
    > ------------------ --------- --------- --------- ---------
    > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861
    >
    > - markers + bt cfg 14.464366 14.754347 16.052306 0.463665
    > + markers + bt cfg 14.421765 14.644406 15.690871 0.233885


    actually, the pure marker overhead seems to be a regression:

    > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861


    why isn't the marker near zero-cost as it should be (as long as they are
    enabled but are not in actual use)? A 2% increase is _A LOT_. That's the
    whole point of good probes: they do not slow down the normal kernel.

    _Worst case_ it should be at most a few instructions overhead but that
    does not explain the ~2% wall-clock time regression you measured here.

    So there's something wrong going on - either markers have unacceptably
    high cost, or the measurement is not valid.

    Ingo

  8. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    Ingo Molnar wrote:
    > actually, the pure marker overhead seems to be a regression:
    >
    >> Kernel Options     Min val   Avg val   Max val   Std Dev
    >> - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    >> + markers - bt cfg 15.280382 16.202398 16.409257 0.191861
    >
    > why isnt the marker near zero-cost as it should be? (as long as they are
    > enabled but are not in actual use) 2% increase is _ALOT_.


    The increase in the mean is actually 0.033, or 0.2%.

    > So there's something wrong going on - either markers have unacceptably
    > high cost, or the measurement is not valid.


    The third option is that the measurement just needs to be done more
    times. The standard error of the mean for the + markers case is
    0.191861 / sqrt(38), about 0.031, which is roughly the same size as the
    difference being measured.
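
    (A quick check of that arithmetic, as a throwaway C snippet:)

        /* Back-of-the-envelope check of the numbers discussed above:
         * difference of the means vs. the standard error of the mean for
         * the "+ markers - bt cfg" case (38 kept runs). */
        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            double diff   = 16.202398 - 16.169459; /* "+ markers" avg minus "- markers" avg */
            double stddev = 0.191861;              /* "+ markers - bt cfg" std dev */
            double n      = 38.0;                  /* runs kept after dropping hi/lo */
            double sem    = stddev / sqrt(n);      /* standard error of the mean */

            printf("difference of means : %.6f\n", diff);
            printf("std error of mean   : %.6f\n", sem);
            printf("difference, in SEMs : %.2f\n", diff / sem);
            return 0;
        }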
    --
    Joshua Root, jmr AT gelato.unsw.edu.au
    http://www.gelato.unsw.edu.au

  9. Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system

    * Ingo Molnar (mingo@elte.hu) wrote:
    >
    > * Alan D. Brunelle wrote:
    >
    > > o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1
    > >
    > > o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
    > > configured respectively.
    > >
    > > o '- markers' or '+ markers' means a kernel without or with the
    > > 11-patch marker series respectively.
    > >
    > > 38 runs without blk traces being captured (dropped hi/lo value from 40 runs)
    > >
    > > Kernel Options Min val Avg val Max val Std Dev
    > > ------------------ --------- --------- --------- ---------
    > > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861
    > >
    > > - markers + bt cfg 14.464366 14.754347 16.052306 0.463665
    > > + markers + bt cfg 14.421765 14.644406 15.690871 0.233885

    >
    > actually, the pure marker overhead seems to be a regression:
    >
    > > - markers - bt cfg 15.349127 16.169459 16.372980 0.184417
    > > + markers - bt cfg 15.280382 16.202398 16.409257 0.191861

    >
    > why isnt the marker near zero-cost as it should be? (as long as they are
    > enabled but are not in actual use) 2% increase is _ALOT_. That's the
    > whole point of good probes: they do not slow down the normal kernel.
    >
    > _Worst case_ it should be at most a few instructions overhead but that
    > does not explain the ~2% wall-clock time regression you measured here.
    >
    > So there's something wrong going on - either markers have unacceptably
    > high cost, or the measurement is not valid.
    >
    > Ingo


    Hi Ingo,

    Tests were executed under the following conditions:

    "Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
    runs of the following on both it and after applying Mathieu Desnoyers
    11-patch sequence (19 September 2007).

    * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
    volume per MSA used)"

    Even though the 19 Sept. 2007 markers were released with a dependency on
    immediate values, there is no optimized immediate-value implementation
    currently available on ia64. Therefore, every marker adds a d-cache hit
    until we merge immediate values and implement the ia64 optimization.
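
    (To illustrate the point, a simplified sketch of the two schemes, not
    the actual marker implementation: without immediate values, each marker
    site tests a flag loaded from memory; with them, the condition is a
    constant patched directly into the instruction stream.)

        /* Illustrative sketch only -- not the kernel's marker code.
         *
         * Without optimized immediate values, every marker site loads an
         * enable flag from memory, so even a disabled marker costs a data
         * access (the d-cache hit mentioned above). */
        struct example_marker_site {
            unsigned char enabled;          /* read on every pass */
            void (*probe)(void);            /* called only when armed */
        };

        static inline void example_marker_hit(struct example_marker_site *m)
        {
            if (m->enabled)                 /* memory load -> d-cache reference */
                m->probe();
        }

        /* With immediate values, the "enabled" test becomes a constant
         * embedded in the instruction stream and is patched when the marker
         * is (dis)armed, so the disabled case needs no data access.  That
         * optimization did not yet exist for ia64 at the time of this
         * thread. */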

    Mathieu

    --
    Mathieu Desnoyers
    Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
