[PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes - Kernel



Thread: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

  1. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    [H. Peter Anvin - Wed, Nov 05, 2008 at 10:04:50AM -0800]
    | Cyrill Gorcunov wrote:
    | >
    | > Ingo, what is the conclusion? As I understand from the thread --
    | >
    | > 1) Implement Peter's proposed cleanup/compress.
    | > 2) Test Alexander's patch.
    | >
    | > Did I miss something?
    | >
    |
    | Nope, that's pretty much it.
    |
    | However, there are good reasons to believe that using this kind of
    | segment selector trick is probably a bad idea in the long term,
    | especially since CPU vendors have strong incentives to reduce the size
    | of the segment descriptor cache now that none of the mainstream OSes
    | rely on more than a small handful of segments.
    |
    | I was planning to look at doing the obvious stub shrink today.
    |
    | -hpa
    |

    I see. Thanks! Btw Peter, I remember reading about segment caches
    a long time ago (back in my DOS programming days, actually), but
    only in general terms, i.e. that such a cache exists. Is it
    possible to know exactly how large such a cache is? You mentioned
    the number 32. (heh... I didn't remember it until you mentioned
    the cache :-)

    - Cyrill -

  2. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    [Andi Kleen - Tue, Nov 04, 2008 at 06:05:01PM +0100]
    | > not taking into account the cost of cs reading (which I
    | > don't suspect to be that expensive apart from writing,
    |
    | GDT accesses have an implied LOCK prefix. Especially
    | on some older CPUs that could be slow.
    |
    | I don't know if it's a problem or not but it would need
    | some careful benchmarking on different systems to make sure interrupt
    | latencies are not impacted.
    |
    | Another reason I would also be careful with this patch is that
    | it will likely trigger slow paths in JITs like qemu/vmware/etc.
    |
    | Also code segment switching is likely not something that
    | current and future microarchitectures will spend a lot of time optimizing.
    |
    | I'm not sure that risk is worth the small improvement in code
    | size.
    |
    | An alternative BTW to having all the stubs in the executable
    | would be to just dynamically generate them when the interrupt
    | is set up. Then you would only have the stubs around for the
    | interrupts which are actually used.
    |
    | -Andi
    |

    Thanks a lot for comments, Andi!

    - Cyrill -
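
    A rough sketch of Andi's suggestion to generate the stubs at
    interrupt-setup time (not the actual kernel code; stub_area and the
    use of common_interrupt here are assumptions for illustration):

    /*
     * Emit a per-vector stub at runtime, roughly:
     *     push $vector ; jmp common_interrupt
     * stub_area is assumed to be executable memory; only vectors that
     * are actually set up ever get a stub.
     */
    #include <stdint.h>
    #include <string.h>

    extern void common_interrupt(void);   /* assumed common entry point */
    extern uint8_t stub_area[];           /* assumed executable buffer  */

    #define STUB_SIZE 8                   /* 2-byte push + 5-byte jmp, padded */

    static void *emit_irq_stub(int vector)
    {
            uint8_t *p = stub_area + vector * STUB_SIZE;
            int32_t rel = (int32_t)((uintptr_t)common_interrupt -
                                    (uintptr_t)(p + 7));

            p[0] = 0x6a;                  /* push imm8                    */
            p[1] = (uint8_t)vector;
            p[2] = 0xe9;                  /* jmp rel32, relative to p + 7 */
            memcpy(&p[3], &rel, 4);
            return p;                     /* address to install in the IDT */
    }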

  3. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    [H. Peter Anvin - Wed, Nov 05, 2008 at 10:20:23AM -0800]
    | Cyrill Gorcunov wrote:
    | >
    | > I see. Thanks! Btw Peter, I remember reading about segment caches
    | > a long time ago (back in my DOS programming days, actually), but
    | > only in general terms, i.e. that such a cache exists. Is it
    | > possible to know exactly how large such a cache is? You mentioned
    | > the number 32. (heh... I didn't remember it until you mentioned
    | > the cache :-)
    | >
    |
    | As with any other caching structure, you can discover its size,
    | associativity, and replacement policy by artificially trying to provoke
    | patterns that produce pathological timings.
    |
    | At Transmeta, at one time we used a 32-entry direct-mapped cache, which
    | ended up with a ~96% hit rate on common Win95 benchmarks.
    |
    | I should, however, make it clear that there are other alternatives for
    | speeding up segment descriptor loading, and not all of them rely on a cache.
    |
    | -hpa
    |

    Thanks a lot for explanation!

    - Cyrill -

  4. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Cyrill Gorcunov wrote:
    >
    > I see. Thanks! Btw Peter, I remember reading about segment caches
    > a long time ago (back in my DOS programming days, actually), but
    > only in general terms, i.e. that such a cache exists. Is it
    > possible to know exactly how large such a cache is? You mentioned
    > the number 32. (heh... I didn't remember it until you mentioned
    > the cache :-)
    >


    As with any other caching structure, you can discover its size,
    associativity, and replacement policy by artificially trying to provoke
    patterns that produce pathological timings.

    At Transmeta, at one time we used a 32-entry direct-mapped cache, which
    ended up with a ~96% hit rate on common Win95 benchmarks.

    I should, however, make it clear that there are other alternatives for
    speeding up segment descriptor loading, and not all of them rely on a cache.

    -hpa
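
    A user-space sketch of the kind of probe hpa describes: time segment
    register reloads over working sets of different sizes and look for
    the knee in the curve. The selectors are assumed to have been set up
    beforehand (e.g. as LDT entries via modify_ldt()); the helper names
    are illustrative, not from this thread.

    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
            uint32_t lo, hi;
            asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
            return ((uint64_t)hi << 32) | lo;
    }

    static inline void load_fs(uint16_t sel)
    {
            asm volatile("mov %0, %%fs" :: "r"(sel));
    }

    /* Average cycles per %fs reload while cycling through n selectors.
     * (Serialization around rdtsc is omitted for brevity.) */
    static uint64_t time_working_set(const uint16_t *sel, int n, int iters)
    {
            uint64_t start = rdtsc();

            for (int i = 0; i < iters; i++)
                    load_fs(sel[i % n]);
            return (rdtsc() - start) / iters;
    }

    /* A knee in the plot of this average against n suggests the size of
     * the descriptor cache; the pattern of the misses hints at its
     * associativity and replacement policy. */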

  5. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * Jeremy Fitzhardinge wrote:

    > Why are the accesses locked? Is it because it does an update of the
    > accessed bit in the descriptor? (We should be pre-setting them all
    > anyway.)


    yes, the accessed bit in the segment descriptor has to be updated in
    an atomic transaction: the CPU has to do a MESI coherent
    read+compare+write transaction, without damaging other updates to the
    6-byte segment descriptor.

    Old OSs implemented paging to disk by swapping out segments based on
    the accessed bit, and clearing the present and accessed bits when a
    segment was swapped out.

    But given that all our GDT entries have the accessed bit set on Linux,
    there's no physical reason why the CPU should be using a locked cycle
    here - only to stay compatible with ancient stuff.

    So ... that notion just survived in the backwards-compatibility stream
    of CPU enhancements, over the past 10 years.

    On 64-bit Linux there's no reason to maintain that principle, so i'd
    expect future CPUs to relax this even more, were it ever to show up on
    the performance radar. Note that SYSCALL/SYSRET already optimize that
    away.

    Ingo
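
    To make the layout concrete: the accessed bit is the low bit of the
    descriptor's type field (bit 40 of the descriptor). Below is a
    minimal sketch of encoding a flat ring-0 code descriptor with that
    bit already set, so the first load of the selector never needs the
    locked read-modify-write; the helper name is made up for
    illustration.

    #include <stdint.h>

    static uint64_t make_code_desc(uint32_t base, uint32_t limit)
    {
            uint64_t d = 0;

            d |= (uint64_t)(limit & 0xffff);              /* limit 15:0        */
            d |= (uint64_t)(base & 0xffffff)     << 16;   /* base 23:0         */
            d |= (uint64_t)0xb                   << 40;   /* type: code, readable,
                                                             accessed bit set  */
            d |= (uint64_t)1                     << 44;   /* S: code/data      */
            d |= (uint64_t)0                     << 45;   /* DPL 0             */
            d |= (uint64_t)1                     << 47;   /* present           */
            d |= (uint64_t)((limit >> 16) & 0xf) << 48;   /* limit 19:16       */
            d |= (uint64_t)1                     << 54;   /* D: 32-bit segment */
            d |= (uint64_t)1                     << 55;   /* G: 4KiB units     */
            d |= (uint64_t)(base >> 24)          << 56;   /* base 31:24        */
            return d;
    }

    /* e.g. a flat 4GiB kernel code segment: make_code_desc(0, 0xfffff) */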

  6. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * Alexander van Heukelum wrote:

    > > | > | Opteron (cycles): 1024 / 1157 / 3527
    > > | > | Xeon E5345 (cycles): 1092 / 1085 / 6622
    > > | > | Athlon XP (cycles): 1028 / 1166 / 5192
    > > | >
    > > | > Xeon is definitely out of luck :-)
    > > |
    > > | it's still OK - i.e. no outrageous showstopper overhead anywhere in
    > > | that instruction sequence. The total round-trip overhead is what will
    > > | matter most.
    > > |
    > > | Ingo
    > > |
    > >
    > > Don't get me wrong please, I really like what Alexander has done!
    > > But frankly, six times slower scares me a bit.


    the cost is 6 cycles instead of 1 cycle, in a codepath that takes
    thousands of cycles and is often cache-limited.

    > Thanks again. Now it _is_ six times slower to do this tiny piece
    > of code... But please keep in mind all the activity that follows to
    > save the current data segment registers (the stack segment and code
    > segment are saved automatically), the general purpose registers and
    > to load most of the data segments with kernel-space values. And
    > looking at it now... do_IRQ is also not exactly trivial.
    >
    > Also, I kept the information that is saved on the stack exactly the
    > same. If this is not a requirement, "push %cs" is what is left of
    > this expensive (6 cycle!) sequence. Even that could be unnecessary
    > if the stack layout can be changed... But I'd like to consider that
    > separately.


    we really want to keep the stack frame consistent between all the
    context types. We can do things like return-to-userspace-from-irq or
    schedule-from-irq-initiated-event, etc. - so crossing between these
    context frames has to be standard and straightforward.

    Ingo

  7. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * H. Peter Anvin wrote:

    > Ingo Molnar wrote:
    >>
    >> yes, the accessed bit in the segment descriptor has to be updated
    >> in an atomic transaction: the CPU has to do a MESI coherent
    >> read+compare+write transaction, without damaging other updates to
    >> the 6-byte segment descriptor.

    >
    > 8 bytes, rather.


    heh, yes of course :-)

    Ingo

  8. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Ingo Molnar wrote:
    >
    > yes, the accessed bit in the segment descriptor has to be updated in
    > an atomic transaction: the CPU has to do a MESI coherent
    > read+compare+write transaction, without damaging other updates to the
    > 6-byte segment descriptor.
    >


    8 bytes, rather.

  9. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Alexander van Heukelum wrote:
    >
    > In general: after applying the patch, latencies are seen more
    > often by the rdtsctest. It also seems to cause a small
    > percentage decrease in the speed of hackbench.
    > Looking at the latency histograms I believe this is
    > a real effect, but I could not do enough boots/runs to
    > make this a certainty from the runtimes alone.
    >
    > At least for this PC, doing hpa's suggested cleanup of
    > the stub table is the right way to go for now... A
    > second option would be to get rid of the stub table by
    > assigning each important vector a unique handler and
    > to make sure those handlers do not rely on the vector
    > number at all.
    >


    Hi Alexander,

    First of all, great job on the timing analysis. I believe this confirms
    the concerns that I had about this technique.

    Here is a prototype patch of the compressed IRQ stubs -- this patch
    compresses them down to 7 stubs per 32-byte cache line (or part of cache
    line) at the expense of a back-to-back jmp which has the potential of
    being ugly on some pipelines (we can only get 4 stubs into 32 bytes
    without that).

    Would you be willing to run your timing test on this patch? This isn't
    submission-quality since it commingles multiple changes, and it needs
    some cleanup, but it should be useful for testing.

    As a side benefit it eliminates some gratuitous differences between the
    32- and 64-bit code.

    -hpa
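
    For readers following along, the compression hpa describes works
    roughly like the sketch below (illustrative only, not the actual
    patch; the real stubs may encode the pushed value differently): each
    stub is a 2-byte push plus a 2-byte short jump to one shared 5-byte
    jump, so 6*4 + 2 + 5 = 31 bytes, i.e. seven stubs per 32-byte line.
    With a full 5-byte jmp in every stub (7 bytes each), only four fit.

    /* Seven interrupt stubs sharing one 32-byte cache line; the shared
     * tail jump is the "back-to-back jmp" mentioned above. */
    asm(
    "        .text                          \n"
    "        .balign 32                     \n"
    "        push    $0x00;  jmp 2f         \n"   /* stub for vector 0 */
    "        push    $0x01;  jmp 2f         \n"
    "        push    $0x02;  jmp 2f         \n"
    "        push    $0x03;  jmp 2f         \n"
    "        push    $0x04;  jmp 2f         \n"
    "        push    $0x05;  jmp 2f         \n"
    "        push    $0x06                  \n"   /* 7th stub falls through */
    "2:      jmp     common_interrupt       \n"
    );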


  10. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * Alexander van Heukelum wrote:

    > Hi all,
    >
    > I have spent some time trying to find out how expensive the
    > segment-switching patch was. I have only one computer available at
    > the time: a "Sempron 2400+", 32-bit-only machine.
    >
    > Measured were timings of "hackbench 10" in a loop. The average was
    > taken of more than 100 runs. Timings were done for two separate
    > boots of the system.


    hackbench is _way_ too noisy to measure such cycle-level differences
    as irq entry changes cause. It also does not really stress interrupts
    - it only stresses networking, the VFS and the scheduler.

    a better test might have been to generate a ton of interrupts, but
    even then it's _very_ hard to measure it properly. The best method is
    what i've suggested to you early on: run a loop in user-space and
    observe irq costs via RDTSC, as they happen. Then build a histogram
    and compare the before/after histogram. Compare best-case results as
    well (the first slot of the histogram), as those are statistically
    much more significant than a noisy average.

    Measuring such things in a meaningful way is really tricky business.
    Using hackbench to measure IRQ entry micro-costs is like trying to
    take a photo of a delicate flower at night, by using an atomic bomb as
    the flash-light: you certainly get some sort of effect to report, but
    there's not many nuances left in the picture to really look at ;-)

    Ingo
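
    A minimal user-space sketch of the loop Ingo describes (and
    presumably close to what Alexander's rdtsctest.c does); the bucket
    width and sample count here are illustrative:

    #include <stdint.h>
    #include <stdio.h>

    #define NBUCKETS 4096            /* one bucket per 16 cycles */

    static inline uint64_t rdtsc(void)
    {
            uint32_t lo, hi;
            asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
            return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
            static uint64_t hist[NBUCKETS];
            uint64_t prev = rdtsc();

            /* Back-to-back TSC reads: a large delta means the CPU was
             * taken away from us, most commonly by an interrupt. */
            for (uint64_t n = 0; n < 1000000000ULL; n++) {
                    uint64_t now = rdtsc();
                    uint64_t bucket = (now - prev) / 16;

                    prev = now;
                    hist[bucket < NBUCKETS ? bucket : NBUCKETS - 1]++;
            }
            for (int i = 0; i < NBUCKETS; i++)
                    if (hist[i])
                            printf("%d %llu\n", i * 16,
                                   (unsigned long long)hist[i]);
            return 0;
    }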

  11. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    On Mon, 10 Nov 2008 09:58:46 +0100, "Ingo Molnar" said:
    > * Alexander van Heukelum wrote:
    > > Hi all,
    > >
    > > I have spent some time trying to find out how expensive the
    > > segment-switching patch was. I have only one computer available at
    > > the time: a "Sempron 2400+", 32-bit-only machine.
    > >
    > > Measured were timings of "hackbench 10" in a loop. The average was
    > > taken of more than 100 runs. Timings were done for two separate
    > > boots of the system.


    Hi Ingo,

    I guess you just stopped reading here?

    > hackbench is _way_ too noisy to measure such cycle-level differences
    > as irq entry changes cause. It also does not really stress interrupts
    > - it only stresses networking, the VFS and the scheduler.
    >
    > a better test might have been to generate a ton of interrupts, but
    > even then it's _very_ hard to measure it properly.


    I should have presented the second benchmark as the first I
    guess. I really just used hackbench as a workload. I gathered
    it would give a good amount of exceptions like page faults and
    maybe others. It would be nice to have a simple debug switch in
    the kernel to make it generate a lot of interrupts, though.

    > The best method is
    > what i've suggested to you early on: run a loop in user-space and
    > observe irq costs via RDTSC, as they happen. Then build a histogram
    > and compare the before/after histogram. Compare best-case results as
    > well (the first slot of the histogram), as those are statistically
    > much more significant than a noisy average.


    See the rest of the mail you replied to and its attachment. I've put
    the programs I used and the histogram in

    http://heukelum.fastmail.fm/irqstubs/

    I think rdtsctest.c is pretty much what you describe.

    Greetings,
    Alexander

    > Measuring such things in a meaningful way is really tricky business.
    > Using hackbench to measure IRQ entry micro-costs is like trying to
    > take a photo of a delicate flower at night, by using an atomic bomb as
    > the flash-light: you certainly get some sort of effect to report, but
    > there's not many nuances left in the picture to really look at ;-)
    >
    > Ingo

    --
    Alexander van Heukelum
    heukelum@fastmail.fm

    --
    http://www.fastmail.fm - Same, same, but different...


  12. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * Alexander van Heukelum wrote:

    > On Mon, 10 Nov 2008 09:58:46 +0100, "Ingo Molnar" said:
    > > * Alexander van Heukelum wrote:
    > > > Hi all,
    > > >
    > > > I have spent some time trying to find out how expensive the
    > > > segment-switching patch was. I have only one computer available at
    > > > the time: a "Sempron 2400+", 32-bit-only machine.
    > > >
    > > > Measured were timings of "hackbench 10" in a loop. The average was
    > > > taken of more than 100 runs. Timings were done for two separate
    > > > boots of the system.

    >
    > Hi Ingo,
    >
    > I guess you just stopped reading here?


    yeah, sorry! You described and did exactly the kind of histogram that i
    wanted to see done ;-)

    I'm not sure i can read out the same thing from the result though.
    Firstly, it seems the 'after' histograms are better, because there the
    histogram shifted towards shorter delays. (i.e. lower effective irq
    entry overhead)

    OTOH, unless i'm misreading them, it's a bit hard to compare them
    visually: the integral of the histograms does not seem to be constant,
    they don't seem to be normalized.

    It should be made constant for them to be comparable. (i.e. the total
    number of irq hits profiled should be equal - or should be normalized
    with the sum after the fact)

    Ingo

  13. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Ingo Molnar wrote:
    > * Alexander van Heukelum wrote:
    >
    > hackbench is _way_ too noisy to measure such cycle-level differences
    > as irq entry changes cause. It also does not really stress interrupts
    > - it only stresses networking, the VFS and the scheduler.
    >
    > a better test might have been to generate a ton of interrupts, but
    > even then it's _very_ hard to measure it properly. The best method is
    > what i've suggested to you early on: run a loop in user-space and
    > observe irq costs via RDTSC, as they happen. Then build a histogram
    > and compare the before/after histogram. Compare best-case results as
    > well (the first slot of the histogram), as those are statistically
    > much more significant than a noisy average.
    >


    For what it's worth, I tested this out, and I'm pretty sure you need to
    run a uniprocessor configuration (or system) for it to make sense --
    otherwise you end up missing too many of the interrupts. I first tested
    this on an 8-processor system and, well, came up with nothing.

    I'm going to try this later on a uniprocessor, unless Alexander beats me
    to it.

    -hpa

  14. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    On Mon, 10 Nov 2008 14:07:09 +0100, "Ingo Molnar" said:
    > * Alexander van Heukelum wrote:
    > > On Mon, 10 Nov 2008 09:58:46 +0100, "Ingo Molnar" said:
    > > > * Alexander van Heukelum wrote:
    > > > > Hi all,
    > > > >
    > > > > I have spent some time trying to find out how expensive the
    > > > > segment-switching patch was. I have only one computer available at
    > > > > the time: a "Sempron 2400+", 32-bit-only machine.
    > > > >
    > > > > Measured were timings of "hackbench 10" in a loop. The average was
    > > > > taken of more than 100 runs. Timings were done for two separate
    > > > > boots of the system.

    > >
    > > Hi Ingo,
    > >
    > > I guess you just stopped reading here?

    >
    > yeah, sorry! You described and did exactly the kind of histogram that i
    > wanted to see done ;-)


    I thought so.

    > I'm not sure i can read out the same thing from the result though.
    > Firstly, it seems the 'after' histograms are better, because there the
    > histogram shifted towards shorter delays. (i.e. lower effective irq
    > entry overhead)
    >
    > OTOH, unless i'm misreading them, it's a bit hard to compare them
    > visually: the integral of the histograms does not seem to be constant,
    > they don't seem to be normalized.


    The total number of measured intervals (between two almost-adjacent
    rdtsc's) is exactly the same for all histograms (10^10). Almost all
    measurements are of the "nothing happened" type, i.e., around 11
    clock cycles on this machine. The user time spent inside the
    rdtsctest program is almost independent of the load, but it
    measures time spent outside of the program... But what should be
    attributed to what effect is unclear to me at the moment.

    > It should be made constant for them to be comparable. (i.e. the total
    > number of irq hits profiled should be equal - or should be normalized
    > with the sum after the fact)


    Basically the difference between the "idle" and "hack10" versions
    should indicate the effect of extra interrupts (timer) and additional
    exceptions and cache effects due to context switching.

    Thanks,
    Alexander

    > Ingo

    --
    Alexander van Heukelum
    heukelum@fastmail.fm

    --
    http://www.fastmail.fm - I mean, what is it about a decent email service?


  15. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    On Mon, 10 Nov 2008 07:39:22 -0800, "H. Peter Anvin" said:
    > Ingo Molnar wrote:
    > > * Alexander van Heukelum wrote:
    > >
    > > hackbench is _way_ too noisy to measure such cycle-level differences
    > > as irq entry changes cause. It also does not really stress interrupts
    > > - it only stresses networking, the VFS and the scheduler.
    > >
    > > a better test might have been to generate a ton of interrupts, but
    > > even then it's _very_ hard to measure it properly. The best method is
    > > what i've suggested to you early on: run a loop in user-space and
    > > observe irq costs via RDTSC, as they happen. Then build a histogram
    > > and compare the before/after histogram. Compare best-case results as
    > > well (the first slot of the histogram), as those are statistically
    > > much more significant than a noisy average.
    > >

    >
    > For what it's worth, I tested this out, and I'm pretty sure you need to
    > run a uniprocessor configuration (or system) for it to make sense --
    > otherwise you end up missing too many of the interrupts. I first tested
    > this on an 8-processor system and, well, came up with nothing.
    >
    > I'm going to try this later on a uniprocessor, unless Alexander beats me
    > to it.


    I did the rdtsctest again for the irqstubs patch you sent. The data
    is at http://heukelum.fastmail.fm/irqstubs/ and the latency histogram
    is http://heukelum.fastmail.fm/irqstubs/latency_hpa.png

    Greetings,
    Alexander

    > -hpa

    --
    Alexander van Heukelum
    heukelum@fastmail.fm

    --
    http://www.fastmail.fm - Same, same, but different...


  16. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Alexander van Heukelum wrote:
    >>
    >> OTOH, unless i'm misreading them, it's a bit hard to compare them
    >> visually: the integral of the histograms does not seem to be constant,
    >> they don't seem to be normalized.

    >
    > The total number of measured intervals (between two almost-adjacent
    > rdtsc's) is exactly the same for all histograms (10^10). Almost all
    > measurements are of the "nothing happened" type, i.e., around 11
    > clock cycles on this machine. The user time spent inside the
    > rdtsctest program is almost independent of the load, but it
    > measures time spent outside of the program... But what should be
    > attributed to what effect is unclear to me at the moment.
    >


    I believe you need to remove the obvious null events at the low end (no
    interrupt happened) and renormalize to the same scale for the histograms
    to make sense.

    As it is, the difference in the number of events that actually matter
    dominates the graphs; for example, there are 142187 events >= 12 in
    hack10ticks, but 136533 in hack10ticks_hpa.

    -hpa

  17. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Alexander van Heukelum wrote:
    >
    > I did the rdtsctest again for the irqstubs patch you sent. The data
    > is at http://heukelum.fastmail.fm/irqstubs/ and the latency histogram
    > is http://heukelum.fastmail.fm/irqstubs/latency_hpa.png
    >


    Okay, I've stared at a bunch of different transformations of this data
    and I'm starting to think that it's getting lost in the noise. The
    difference between your "idleticks" and "idleticks2" data sets, for
    example, is as big as the differences between any two data sets that I
    can see.

    Just for reference, see this graph where I have filtered out events
    outside the [30..1000] cycle range and renormalized.

    http://www.zytor.com/~hpa/hist.pdf

    I don't know how to even figure out what a realistic error range looks
    like, other than repeating each run something like 100+ times and doing an
    "eye chart" kind of diagram.

    -hpa

  18. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

    Okay, after spending most of the day trying to get something that isn't
    completely like white noise (interesting problem, otherwise I'd have
    given up long ago) I did, eventually, come up with something that looks
    like it's significant. I did a set of multiple runs, and am looking for
    the "waterfall points" in the cumulative statistics.

    http://www.zytor.com/~hpa/baseline-hpa-3000-3600.pdf

    This particular set of data points was gathered on a 64-bit kernel, so I
    didn't try the segment technique.

    It looks to me like the collection of red lines is far enough to the
    left of the black ones that one can assume there is a significant
    effect, probably by about a cache miss's worth of time.

    -hpa


  19. Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


    * Alexander van Heukelum wrote:

    > > OTOH, unless i'm misreading them, it's a bit hard to compare them
    > > visually: the integral of the histograms does not seem to be
    > > constant, they don't seem to be normalized.

    >
    > The total number of measured intervals (between two almost-adjacent
    > rdtsc's) is exactly the same for all histograms (10^10). Almost all
    > measurements are of the "nothing happened" type, i.e., around 11
    > clock cycles on this machine. The user time spent inside the
    > rdtsctest program is almost independent of the load, but it measures
    > time spent outside of the program... But what should be attributed
    > to what effect is unclear to me at the moment.


    a high-pass filter should be applied in any case, to filter out the
    "nothing happened" baseline. Eliminating every delta below 500-1000
    cycles would do the trick i think; all IRQ costs are at least 1000
    cycles.

    then a low-pass filter should be applied to eliminate non-irq noise
    such as scheduling effects or expensive irqs (which are both
    uninteresting to such analysis).

    and then _that_ double-filtered dataset should be normalized: the
    number of events should be made the same. (just clip the larger
    dataset to the length of the smaller dataset)

    > > It should be made constant for them to be comparable. (i.e. the
    > > total number of irq hits profiled should be equal - or should be
    > > normalized with the sum after the fact)

    >
    > Basically the difference between the "idle" and "hack10" versions
    > should indicate the effect of extra interrupts (timer) and
    > additional exceptions and cache effects due to context switching.


    i was only looking at before/after duos, for the same basic type of
    workload. Idle versus hackbench is indeed apples to oranges.

    Ingo
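
    In code, the filtering and normalization Ingo suggests could look
    something like this sketch (the thresholds and names are assumptions,
    not from the thread); it operates on the raw per-event deltas before
    the histograms are built:

    #include <stddef.h>
    #include <stdint.h>

    #define LO_CYCLES  1000   /* high-pass: below this is "nothing happened" */
    #define HI_CYCLES 20000   /* low-pass: above this is scheduling/slow irqs */

    /* Keep only deltas in [LO_CYCLES, HI_CYCLES); returns the new count. */
    static size_t filter_deltas(uint64_t *delta, size_t n)
    {
            size_t kept = 0;

            for (size_t i = 0; i < n; i++)
                    if (delta[i] >= LO_CYCLES && delta[i] < HI_CYCLES)
                            delta[kept++] = delta[i];
            return kept;
    }

    /* Normalize two filtered runs by clipping both to the smaller count
     * before histogramming, so the integrals are directly comparable. */
    static size_t clip_to_common_length(size_t n_before, size_t n_after)
    {
            return n_before < n_after ? n_before : n_after;
    }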
