x86: Is there still value in having a special tlb flush IPI vector? - Kernel



Thread: x86: Is there still value in having a special tlb flush IPI vector?

  1. x86: Is there still value in having a special tlb flush IPI vector?

    Now that normal smp_function_call is no longer an enormous bottleneck,
    is there still value in having a specialised IPI vector for tlb
    flushes? It seems like quite a lot of duplicate code.

    The 64-bit tlb flush multiplexes the various cpus across 8 vectors to
    increase scalability. If this is a big issue, then the smp function call
    code can (and should) do the same thing. (Though looking at it more
    closely, the way the code uses the 8 vectors is actually a less general
    way of doing what smp_call_function is doing anyway.)

    Thoughts?

    (And uv should definitely be hooking pvops if it wants its own
    flush_tlb_others; vsmp sets the precedent for a subarch-like use of pvops.)

    J
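    For context, here is a simplified sketch of the 8-vector scheme referred
    to above, as found in arch/x86/kernel/tlb_64.c of that era (paraphrased
    rather than quoted, so names and details may differ slightly): each
    sending CPU hashes onto one of eight vector/slot pairs, so up to eight
    concurrent flushes avoid sharing a lock or cacheline.

        #define NUM_INVALIDATE_TLB_VECTORS 8

        union smp_flush_state {
                struct {
                        cpumask_t flush_cpumask;        /* targets still to ack */
                        struct mm_struct *flush_mm;
                        unsigned long flush_va;
                        spinlock_t tlbstate_lock;
                };
                char pad[SMP_CACHE_BYTES];              /* one cacheline per slot */
        };
        static union smp_flush_state flush_state[NUM_INVALIDATE_TLB_VECTORS];

        void flush_tlb_others(cpumask_t cpumask, struct mm_struct *mm,
                              unsigned long va)
        {
                /* Pick a slot (and matching IPI vector) based on the sender. */
                int sender = smp_processor_id() % NUM_INVALIDATE_TLB_VECTORS;
                union smp_flush_state *f = &flush_state[sender];

                spin_lock(&f->tlbstate_lock);
                f->flush_mm = mm;
                f->flush_va = va;
                cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask);
                send_IPI_mask(cpumask, INVALIDATE_TLB_VECTOR_START + sender);

                /* Wait for every target to clear its bit in the ack mask. */
                while (!cpus_empty(f->flush_cpumask))
                        cpu_relax();

                f->flush_mm = NULL;
                f->flush_va = 0;
                spin_unlock(&f->tlbstate_lock);
        }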

  2. Re: x86: Is there still value in having a special tlb flush IPI vector?

    Resend to cc: Andi on an address which actually works.

    Jeremy Fitzhardinge wrote:
    > Now that normal smp_function_call is no longer an enormous bottleneck,
    > is there still value in having a specialised IPI vector for tlb
    > flushes? It seems like quite a lot of duplicate code.
    >
    > The 64-bit tlb flush multiplexes the various cpus across 8 vectors to
    > increase scalability. If this is a big issue, then the smp function
    > call code can (and should) do the same thing. (Though looking at it
    > more closely, the way the code uses the 8 vectors is actually a less
    > general way of doing what smp_call_function is doing anyway.)
    >
    > Thoughts?
    >
    > (And uv should definitely be hooking pvops if it wants its own
    > flush_tlb_others; vsmp sets the precedent for a subarch-like use of
    > pvops.)
    >
    > J



  3. Re: x86: Is there still value in having a special tlb flush IPI vector?


    * Jeremy Fitzhardinge wrote:

    > Now that normal smp_function_call is no longer an enormous bottleneck,
    > is there still value in having a specialised IPI vector for tlb
    > flushes? It seems like quite a lot of duplicate code.
    >
    > The 64-bit tlb flush multiplexes the various cpus across 8 vectors to
    > increase scalability. If this is a big issue, then the smp function
    > call code can (and should) do the same thing. (Though looking at it
    > more closely, the way the code uses the 8 vectors is actually a less
    > general way of doing what smp_call_function is doing anyway.)


    yep, and we could eliminate the reschedule IPI as well.

    Ingo

  4. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Mon, Jul 28, 2008 at 04:20:53PM -0700, Jeremy Fitzhardinge wrote:
    > Resend to cc: Andi on an address which actually works.
    >
    > Jeremy Fitzhardinge wrote:
    > >Now that normal smp_function_call is no longer an enormous bottleneck,


    Hmm? It still uses a global lock at least as of current git tree.

    -Andi

  5. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tuesday 29 July 2008 09:34, Ingo Molnar wrote:
    > * Jeremy Fitzhardinge wrote:
    > > Now that normal smp_function_call is no longer an enormous bottleneck,
    > > is there still value in having a specialised IPI vector for tlb
    > > flushes? It seems like quite a lot of duplicate code.
    > >
    > > The 64-bit tlb flush multiplexes the various cpus across 8 vectors to
    > > increase scalability. If this is a big issue, then the smp function
    > > call code can (and should) do the same thing. (Though looking at it
    > > more closely, the way the code uses the 8 vectors is actually a less
    > > general way of doing what smp_call_function is doing anyway.)


    It definitely is not a clear win. They do not have the same characteristics.
    So numbers will be needed.

    smp_call_function is now properly scalable in smp_call_function_single
    form. The more general case of multiple targets is not so easy and it still
    takes a global lock and touches global cachelines.

    I don't think it is a good use of time, honestly. Do you have a good reason?


    > yep, and we could eliminate the reschedule IPI as well.


    No. The rewrite makes it now very good at synchronously sending a function
    to a single other CPU.

    Sending asynchronously requires a slab allocation and then a remote slab free
    (which is nasty for slab) at the other end, and bouncing of locks and
    cachelines. No way you want to do that in the reschedule IPI.

    Not to mention the minor problem that it still deadlocks when called with
    interrupts disabled
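
    To make the asynchronous-path cost described above concrete, here is a
    minimal sketch of how the generic code of the time handled it
    (simplified and paraphrased; helper and function names such as
    generic_exec_single(), async_call_sketch() and handle_one() are
    illustrative, not the exact source): the sender kmalloc()s a
    descriptor and the target CPU kfree()s it after running the function,
    i.e. one remote slab free per async call.

        struct call_single_data {
                struct list_head list;
                void (*func)(void *info);
                void *info;
                unsigned int flags;
        };

        /* Sender side: an async call needs a heap-allocated descriptor. */
        int async_call_sketch(int cpu, void (*func)(void *), void *info)
        {
                struct call_single_data *data;

                data = kmalloc(sizeof(*data), GFP_ATOMIC);
                if (!data)
                        return -ENOMEM;         /* the real code falls back to waiting */

                data->func = func;
                data->info = info;
                data->flags = CSD_FLAG_ALLOC;   /* tells the receiver to free it */

                generic_exec_single(cpu, data); /* queue on the target + send IPI */
                return 0;
        }

        /* Receiver side, in the IPI handler: the kfree() happens on a CPU
         * other than the one that allocated, which is unkind to slab. */
        void handle_one(struct call_single_data *data)
        {
                unsigned int flags = data->flags;

                data->func(data->info);
                if (flags & CSD_FLAG_ALLOC)
                        kfree(data);
        }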

  6. Re: x86: Is there still value in having a special tlb flush IPI vector?

    Andi Kleen wrote:
    >>> Now that normal smp_function_call is no longer an enormous bottleneck,
    >>>

    >
    > Hmm? It still uses a global lock at least as of current git tree.


    Yes, but it's only held briefly to put things onto the list. It doesn't
    get held over the whole IPI transaction as the old smp_call_function
    did, and the tlb flush code still does. RCU is used to manage the list
    walk and freeing, so there's no long-held locks there either.

    J
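
    A simplified sketch of the pattern Jeremy describes (not the literal
    kernel/smp.c code; structure, field and function names such as
    queue_and_kick() are approximate): the global lock protects only the
    list insertion, while receivers walk the queue under RCU.

        static LIST_HEAD(call_function_queue);
        static DEFINE_SPINLOCK(call_function_lock);

        /* Sender: the global lock is held only long enough to link the
         * request into the queue; the IPI and any waiting happen outside it. */
        void queue_and_kick(struct call_function_data *data, cpumask_t mask)
        {
                unsigned long flags;

                spin_lock_irqsave(&call_function_lock, flags);
                list_add_tail_rcu(&data->csd.list, &call_function_queue);
                spin_unlock_irqrestore(&call_function_lock, flags);

                arch_send_call_function_ipi(mask);
        }

        /* Receiver: walks the queue under RCU, never taking the global lock. */
        void generic_smp_call_function_interrupt_sketch(void)
        {
                struct call_function_data *data;
                int cpu = get_cpu();

                rcu_read_lock();
                list_for_each_entry_rcu(data, &call_function_queue, csd.list) {
                        if (!cpu_isset(cpu, data->cpumask))
                                continue;
                        data->csd.func(data->csd.info);
                        /* refcount drop / list removal elided; the element is
                         * freed only after an RCU grace period */
                }
                rcu_read_unlock();
                put_cpu();
        }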

  7. Re: x86: Is there still value in having a special tlb flush IPI vector?

    Nick Piggin wrote:
    > It definitely is not a clear win. They do not have the same characteristics.
    > So numbers will be needed.
    >
    > smp_call_function is now properly scalable in smp_call_function_single
    > form. The more general case of multiple targets is not so easy and it still
    > takes a global lock and touches global cachelines.
    >
    > I don't think it is a good use of time, honestly. Do you have a good reason?
    >


    Code cleanup, unification. It took about 20 minutes to do. It probably
    won't take too much longer to unify kernel/tlb.c. It seems that if
    there's any performance loss in making the transition, then we can make
    it up again by tuning smp_call_function_mask, benefiting all users.

    But, truth be told, the real reason is that I think there may be some
    correctness issue around smp_call_function* - I've seen occasional
    inexplicable crashes, all within generic_smp_call_function() - and I
    just can't exercise that code enough to get a solid reproducing case.
    But if it gets used for tlb flushes, then any bug is going to become
    pretty obvious. Regardless of whether these patches get accepted, I can
    use it as a test vehicle.

    > No. The rewrite makes it now very good at synchronously sending a function
    > to a single other CPU.
    >
    > Sending asynchronously requires a slab allocation and then a remote slab free
    > (which is nasty for slab) at the other end, and bouncing of locks and
    > cachelines. No way you want to do that in the reschedule IPI.
    >
    > Not to mention the minor problem that it still deadlocks when called with
    > interrupts disabled
    >


    In the async case? Or because it can become spontaneously sync if
    there's an allocation failure?

    J

  8. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tuesday 29 July 2008 16:19, Jeremy Fitzhardinge wrote:
    > Nick Piggin wrote:
    > > It definitely is not a clear win. They do not have the same
    > > characteristics. So numbers will be needed.

    >
    > > smp_call_function is now properly scalable in smp_call_function_single
    > > form. The more general case of multiple targets is not so easy and it
    > > still takes a global lock and touches global cachelines.
    > >
    > > I don't think it is a good use of time, honestly. Do you have a good
    > > reason?

    >
    > Code cleanup, unification. It took about 20 minutes to do. It probably


    OK, so nothing terribly important.


    > won't take too much longer to unify kernel/tlb.c. It seems that if
    > there's any performance loss in making the transition, then we can make
    > it up again by tuning smp_call_function_mask, benefiting all users.


    No, I don't think that is the right way to go for such important
    functionality. There are no ifs: smp_call_function does touch global
    cachelines and locks.

    smp_call_function is barely used, as should be very obvious because it
    was allowed to languish with such horrible performance for so long. So
    there aren't too many users.

    But if you get smp_call_function_mask performance at the same time,
    then there is less to argue about I guess (although it will always
    be necessarily more complex than plain tlb flushing).


    > But, truth be told, the real reason is that I think there may be some
    > correctness issue around smp_call_function* - I've seen occasional
    > inexplicable crashes, all within generic_smp_call_function() - and I
    > just can't exercise that code enough to get a solid reproducing case.
    > But if it gets used for tlb flushes, then any bug is going to become
    > pretty obvious. Regardless of whether these patches get accepted, I can
    > use it as a test vehicle.


    That's fair enough. Better still might be a test harness specifically
    to exercise it.


    > > No. The rewrite makes it now very good at synchronously sending a
    > > function to a single other CPU.
    > >
    > > Sending asynchronously requires a slab allocation and then a remote slab
    > > free (which is nasty for slab) at the other end, and bouncing of locks
    > > and cachelines. No way you want to do that in the reschedule IPI.
    > >
    > > Not to mention the minor problem that it still deadlocks when called with
    > > interrupts disabled

    >
    > In the async case? Or because it can become spontaneously sync if
    > there's an allocation failure?


    In both sync and async case, yes.

  9. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:

    > Not to mention the minor problem that it still deadlocks when called with
    > interrupts disabled


    __smp_call_function_single has potential though..


  10. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
    > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > > Not to mention the minor problem that it still deadlocks when called with
    > > > interrupts disabled

    > >
    > > __smp_call_function_single has potential though..

    >
    > For reschedule interrupt? I don't really agree.


    Not specifically, for not deadlocking from irq-off, more so.


  11. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > Not to mention the minor problem that it still deadlocks when called with
    > > interrupts disabled

    >
    > __smp_call_function_single has potential though..


    For reschedule interrupt? I don't really agree.

  12. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tuesday 29 July 2008 20:04, Peter Zijlstra wrote:
    > On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
    > > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > > > Not to mention the minor problem that it still deadlocks when called
    > > > > with interrupts disabled
    > > >
    > > > __smp_call_function_single has potential though..

    > >
    > > For reschedule interrupt? I don't really agree.

    >
    > Not specifically, for not deadlocking from irq-off, more so.


    Oh, well yes, it already does work from irq-off, so it has already
    realised its potential.

    Not sure exactly what kinds of users it is going to attract, but
    it should be interesting to see!

  13. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tue, 2008-07-29 at 20:17 +1000, Nick Piggin wrote:
    > On Tuesday 29 July 2008 20:04, Peter Zijlstra wrote:
    > > On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
    > > > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > > > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > > > > Not to mention the minor problem that it still deadlocks when called
    > > > > > with interrupts disabled
    > > > >
    > > > > __smp_call_function_single has potential though..
    > > >
    > > > For reschedule interrupt? I don't really agree.

    > >
    > > Not specifically, for not deadlocking from irq-off, more so.

    >
    > Oh, well yes it already does work from irq-off, so it has already
    > realised its potential
    >
    > Not sure exactly what kinds of users it is going to attract, but
    > it should be interesting to see!


    grep __smp_call_function_single kernel/sched.c
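
    (That grep points at the scheduler's hrtick code. The reason it can be
    used there even with interrupts off is that the call_single_data is
    embedded in a long-lived per-CPU structure instead of being allocated
    per call. A rough sketch of the usage pattern, with hypothetical names
    such as my_rq and my_kick_remote:)

        static void my_remote_handler(void *info)
        {
                /* runs on the target CPU from the IPI handler */
        }

        struct my_rq {
                struct call_single_data csd;    /* pre-allocated, reused each time */
                /* ... */
        };

        static void my_kick_remote(struct my_rq *rq, int cpu)
        {
                /* func/info would really be set up once at init time */
                rq->csd.func = my_remote_handler;
                rq->csd.info = rq;

                /* No allocation, so this is safe even with interrupts disabled. */
                __smp_call_function_single(cpu, &rq->csd);
        }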



  14. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
    > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > > Not to mention the minor problem that it still deadlocks when called with
    > > > interrupts disabled

    > >
    > > __smp_call_function_single has potential though..

    >
    > For reschedule interrupt? I don't really agree.


    How about using just arch_send_call_function_single_ipi() to implement
    smp_send_reschedule() ?

    The overhead of that is a smp_mb() and a list_empty() check in
    generic_smp_call_function_single_interrupt() if there is indeed no work
    to do.
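
    A hypothetical sketch of what is being proposed (simplified; the
    per-CPU queue layout is paraphrased from the kernel/smp.c of the time):
    the reschedule IPI simply becomes the call-function-single vector with
    nothing queued, and the receiver's only extra work is the barrier and
    empty-list check before returning, after which need_resched is honoured
    on the way out of the interrupt.

        struct call_single_queue {
                struct list_head list;
                spinlock_t lock;
        };
        static DEFINE_PER_CPU(struct call_single_queue, call_single_queue);

        /* The reschedule "IPI" is just the single-call vector, nothing queued. */
        void smp_send_reschedule(int cpu)
        {
                arch_send_call_function_single_ipi(cpu);
        }

        void generic_smp_call_function_single_interrupt_sketch(void)
        {
                struct call_single_queue *q = &__get_cpu_var(call_single_queue);

                smp_mb();                       /* pairs with the sender's queueing */
                if (list_empty(&q->list))
                        return;                 /* pure reschedule kick: no work */

                /* ...otherwise splice the list and run the queued functions... */
        }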




  15. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Mon, Jul 28, 2008 at 11:29:18PM -0700, Jeremy Fitzhardinge wrote:
    > Andi Kleen wrote:
    > >>>Now that normal smp_function_call is no longer an enormous bottleneck,
    > >>>

    > >
    > >Hmm? It still uses a global lock at least as of current git tree.

    >
    > Yes, but it's only held briefly to put things onto the list. It doesn't
    > get held over the whole IPI transaction as the old smp_call_function
    > did, and the tlb flush code still does. RCU is used to manage the list
    > walk and freeing, so there's no long-held locks there either.


    If it bounces regularly it will still hurt.

    -Andi

  16. Re: x86: Is there still value in having a special tlb flush IPI vector?

    Andi Kleen wrote:
    >> Yes, but it's only held briefly to put things onto the list. It doesn't
    >> get held over the whole IPI transaction as the old smp_call_function
    >> did, and the tlb flush code still does. RCU is used to manage the list
    >> walk and freeing, so there's no long-held locks there either.
    >>

    >
    > If it bounces regularly it will still hurt.
    >


    We could convert smp_call_function_mask to use a multi-vector scheme
    like tlb_64.c if that turns out to be an issue.

    J

  17. Re: x86: Is there still value in having a special tlb flush IPI vector?

    On Tue, Jul 29, 2008 at 07:46:32AM -0700, Jeremy Fitzhardinge wrote:
    > Andi Kleen wrote:
    > >>Yes, but it's only held briefly to put things onto the list. It doesn't
    > >>get held over the whole IPI transaction as the old smp_call_function
    > >>did, and the tlb flush code still does. RCU is used to manage the list
    > >>walk and freeing, so there's no long-held locks there either.
    > >>

    > >
    > >If it bounces regularly it will still hurt.
    > >

    >
    > We could convert smp_call_function_mask to use a multi-vector scheme
    > like tlb_64.c if that turns out to be an issue.


    Converting it first would be fine. Or rather in parallel, because
    you would need to reuse the TLB vectors (there are not that many
    free).

    But waiting first for a report would seem wrong to me.

    I can just see some poor performance person spend a lot of work to track
    down such a regression. While there's a lot of development manpower available
    for Linux, there's still no reason to waste it. I think if you want to change
    such performance critical paths you should make sure the new code is roughly
    performance equivalent first. And with the global lock I don't see that
    at all.

    -Andi

  18. Re: x86: Is there still value in having a special tlb flush IPI vector?


    * Peter Zijlstra wrote:

    > On Tue, 2008-07-29 at 20:00 +1000, Nick Piggin wrote:
    > > On Tuesday 29 July 2008 19:54, Peter Zijlstra wrote:
    > > > On Tue, 2008-07-29 at 14:30 +1000, Nick Piggin wrote:
    > > > > Not to mention the minor problem that it still deadlocks when called with
    > > > > interrupts disabled
    > > >
    > > > __smp_call_function_single has potential though..

    > >
    > > For reschedule interrupt? I don't really agree.

    >
    > How about using just arch_send_call_function_single_ipi() to implement
    > smp_send_reschedule() ?


    agreed, that's just a single IPI which kicks the need_resched logic on
    return-from-interrupt.

    > The overhead of that is a smp_mb() and a list_empty() check in
    > generic_smp_call_function_single_interrupt() if there is indeed no
    > work to do.


    that would be a minuscule cost - cacheline is read-shared amongst cpus
    so there's no real bouncing there. So i'm all for it ...

    Ingo

  19. Re: x86: Is there still value in having a special tlb flush IPI vector?

    Peter Zijlstra wrote:
    > How about using just arch_send_call_function_single_ipi() to implement
    > smp_send_reschedule() ?
    >
    > The overhead of that is a smp_mb() and a list_empty() check in
    > generic_smp_call_function_single_interrupt() if there is indeed no work
    > to do.
    >


    Is doing a no-op interrupt sufficient on all architectures? Is there
    some chance a function call IPI might not go through the normal
    reschedule interrupt exit path?

    J


  20. Re: x86: Is there still value in having a special tlb flush IPI vector?


    * Jeremy Fitzhardinge wrote:

    > Peter Zijlstra wrote:
    > > How about using just arch_send_call_function_single_ipi() to implement
    > > smp_send_reschedule() ?
    > >
    > > The overhead of that is a smp_mb() and a list_empty() check in
    > > generic_smp_call_function_single_interrupt() if there is indeed no work
    > > to do.

    >
    > Is doing a no-op interrupt sufficient on all architectures? Is there
    > some chance a function call IPI might not go through the normal
    > reschedule interrupt exit path?


    We'd still use the smp_send_reschedule(cpu) API, so it's an architecture
    detail. On x86 we'd use arch_send_call_function_single_ipi().

    Ingo
