[RFC patch 00/18] Trace Clock v2 - Kernel

This is a discussion on [RFC patch 00/18] Trace Clock v2 - Kernel ; On Fri, 07 Nov 2008 03:12:18 -0500 (EST) Nicolas Pitre wrote: > On Thu, 6 Nov 2008, Andrew Morton wrote: > > > On Fri, 07 Nov 2008 00:23:44 -0500 Mathieu Desnoyers wrote: > > > > > #define cnt32_to_63(cnt_lo) ...

+ Reply to Thread
Page 2 of 6 FirstFirst 1 2 3 4 ... LastLast
Results 21 to 40 of 112

Thread: [RFC patch 00/18] Trace Clock v2

  1. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    On Fri, 07 Nov 2008 03:12:18 -0500 (EST) Nicolas Pitre wrote:

    > On Thu, 6 Nov 2008, Andrew Morton wrote:
    >
    > > On Fri, 07 Nov 2008 00:23:44 -0500 Mathieu Desnoyers wrote:
    > >
    > > > #define cnt32_to_63(cnt_lo) \
    > > > ({ \
    > > > - static volatile u32 __m_cnt_hi; \
    > > > + static u32 __m_cnt_hi; \
    > > > union cnt32_to_63 __x; \
    > > > __x.hi = __m_cnt_hi; \
    > > > + smp_rmb(); /* read __m_cnt_hi before mmio cnt_lo */ \
    > > > __x.lo = (cnt_lo); \
    > > > if (unlikely((s32)(__x.hi ^ __x.lo) < 0)) \
    > > > __m_cnt_hi = __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31); \

    > >
    > > Oh dear. We have a macro which secretly maintains
    > > per-instantiation-site global state? And doesn't even implement locking
    > > to protect that state?

    >
    > Please do me a favor and look for those very unfrequent posts I've sent
    > to lkml lately.


    No. Reading the kernel code (and, at a pinch, the changelogs) should
    suffice. If it does not suffice, the kernel code is in error.

    > I've explained it all at least 3 times so far, to Peter
    > Zijlstra, to David Howells, to Mathieu Desnoyers, and now to you.


    If four heads have exploded (thus far) over one piece of code, perhaps
    the blame doesn't lie with those heads.

    > > I mean, the darned thing is called from sched_clock(), which can be
    > > concurrently called on separate CPUs and which can be called from
    > > interrupt context (with an arbitrary nesting level!) while it was running
    > > in process context.

    >
    > Yes! And this is so on *purpose*. Please take some time to read the
    > comment that goes along with it,


    OK.

    > and if you're still not convinced then
    > look for those explanation emails I've already posted.


    No.

    > > /*
    > > * Caller must provide locking to protect *caller_state
    > > */

    >
    > NO! This is meant to be LOCK FREE!


    We have a macro which must only have a single usage in any particular
    kernel build (and nothing to detect a violation of that).

    We have a macro which secretly hides internal state, on a per-expansion-site
    basis, no less.

    It apparently tries to avoid races via ordering tricks, as long
    as it is called with sufficient frequency. But nothing guarantees
    that it _is_ called sufficiently frequently?

    There is absolutely no reason why the first two of these quite bad things
    needed to be done. In fact there is no reason why it needed to be
    implemented as a macro at all.

    As I said in the text which you deleted and ignored, this would be
    better if it was implemented as a C function which requires that the
    caller explicitly pass in a reference to the state storage.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Mathieu Desnoyers wrote:

    > Assume the time source is a global clock which insures that time will never
    > *ever* go backward. Use a smp_rmb() to make sure the cnt_lo value is read before
    > __m_cnt_hi.


    If you have an smp_rmb(), then don't you need an smp_wmb()/smp_mb() to match
    it to make it work? And is your assumption valid that smp_rmb() will affect
    memory vs the I/O access to read the clock? There's no requirement that
    cnt_lo will have been read from an MMIO location at all.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Nicolas Pitre wrote:

    > > I mean, the darned thing is called from sched_clock(), which can be
    > > concurrently called on separate CPUs and which can be called from
    > > interrupt context (with an arbitrary nesting level!) while it was running
    > > in process context.

    >
    > Yes! And this is so on *purpose*. Please take some time to read the
    > comment that goes along with it, and if you're still not convinced then
    > look for those explanation emails I've already posted.


    I agree with Nicolas on this. It's abominably clever, but I think he's right.

    The one place I remain unconvinced is over the issue of preemption of a process
    that is in the middle of cnt32_to_63(), where if the preempted process is
    asleep for long enough, I think it can wind time backwards when it resumes, but
    that's not a problem for the one place I want to use it (sched_clock()) because
    that is (almost) always called with preemption disabled in one way or another.

    The one place it isn't is a debugging case that I'm not too worried about.

    > > /*
    > > * Caller must provide locking to protect *caller_state
    > > */

    >
    > NO! This is meant to be LOCK FREE!


    Absolutely.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Andrew Morton wrote:

    > I mean, the darned thing is called from sched_clock(), which can be
    > concurrently called on separate CPUs and which can be called from
    > interrupt context (with an arbitrary nesting level!) while it was running
    > in process context.
    >
    > Who let that thing into Linux?


    Having crawled all over it, and argued with Nicolas and Panasonic about it, I
    think it's safe in sched_clock(), provided sched_clock() never gets preempted -
    which appears to be the case.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Andrew Morton wrote:

    > We have a macro which must only have a single usage in any particular
    > kernel build (and nothing to detect a violation of that).


    That's not true. It's a macro containing a _static_ local variable, therefore
    the macro may be used multiple times, and each time it's used the compiler
    will allocate a new piece of storage.

    > It apparently tries to avoid races via ordering tricks, as long
    > as it is called with sufficient frequency. But nothing guarantees
    > that it _is_ called sufficiently frequently?


    The comment attached to it clearly states this restriction. Therefore the
    caller must guarantee it. That is something Mathieu's code and my code must
    deal with, not Nicolas's.

    > There is absolutely no reason why the first two of these quite bad things
    > needed to be done. In fact there is no reason why it needed to be
    > implemented as a macro at all.


    There's a very good reason to implement it as either a macro or an inline
    function: it's faster. Moving the whole thing out of line would impose an
    additional function call overhead - with a 64-bit return value on 32-bit
    platforms. For my case - sched_clock() - I'm willing to burn a bit of extra
    space to get the extra speed.

    > As I said in the text which you deleted and ignored, this would be
    > better if it was implemented as a C function which requires that the
    > caller explicitly pass in a reference to the state storage.


    I'd be quite happy if it was:

    static inline u64 cnt32_to_63(u32 cnt_lo, u32 *__m_cnt_hi)
    {
    union cnt32_to_63 __x;
    __x.hi = *__m_cnt_hi;
    __x.lo = cnt_lo;
    if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    *__m_cnt_hi =
    __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    return __x.val;
    }

    I imagine this would compile pretty much the same as the macro. I think it
    would make it more obvious about the independence of the storage.

    Alternatively, perhaps Nicolas just needs to mention this in the comment more
    clearly.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [RFC patch 17/18] MIPS : Trace clock

    On Fri, 2008-11-07 at 00:23 -0500, Mathieu Desnoyers wrote:
    > Note for Peter Zijlstra :
    > You should probably have a look at lockdep.c raw_spinlock_t lockdep_lock usage.
    > I suspect it may be used with preemption enabled in graph_lock(). (not sure
    > though, but it's worth double-checking.


    Are you worried about the graph_lock() instance in
    lookup_chain_cache() ?

    That is the locking for validate_chain,

    __lock_acquire()
    validate_chain()
    look_up_chain_cache()
    graph_lock()
    check_prevs_add()
    check_prev_add()
    graph_unlock()
    graph_lock()
    graph_unlock()

    which is all done without modifying IRQ state.

    However, __lock_acquire() is only called with IRQs disabled:

    lock_acquire()
    raw_local_irq_save()
    __lock_acquire()

    lock_release()
    raw_local_irq_save()
    __lock_release()
    lock_release_nested()
    __lock_acquire()
    lock_release_non_nested()
    __lock_acquire()

    lock_set_subclass()
    raw_local_irq_save()
    __lock_set_subclass()
    __lock_acquire()

    So I think we're good.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [RFC patch 04/18] get_cycles() : powerpc64 HAVE_GET_CYCLES

    On Fri, Nov 07, 2008 at 12:23:40AM -0500, Mathieu Desnoyers wrote:
    >This patch selects HAVE_GET_CYCLES and makes sure get_cycles_barrier() and
    >get_cycles_rate() are implemented.
    >
    >Signed-off-by: Mathieu Desnoyers
    >CC: benh@kernel.crashing.org
    >CC: paulus@samba.org
    >CC: David Miller
    >CC: Linus Torvalds
    >CC: Andrew Morton
    >CC: Ingo Molnar
    >CC: Peter Zijlstra
    >CC: Thomas Gleixner
    >CC: Steven Rostedt
    >CC: linux-arch@vger.kernel.org
    >---
    > arch/powerpc/Kconfig | 1 +
    > arch/powerpc/include/asm/timex.h | 11 +++++++++++
    > 2 files changed, 12 insertions(+)
    >
    >Index: linux.trees.git/arch/powerpc/Kconfig
    >================================================== =================
    >--- linux.trees.git.orig/arch/powerpc/Kconfig 2008-11-07 00:09:44.000000000 -0500
    >+++ linux.trees.git/arch/powerpc/Kconfig 2008-11-07 00:09:46.000000000 -0500
    >@@ -121,6 +121,7 @@ config PPC
    > select HAVE_DMA_ATTRS if PPC64
    > select USE_GENERIC_SMP_HELPERS if SMP
    > select HAVE_OPROFILE
    >+ select HAVE_GET_CYCLES if PPC64


    So maybe it's just me because it's Friday and I'm on vacation, but I don't
    see anything overly specific to ppc64 here. In fact, you use get_cycles_rate
    for all of powerpc in a later patch in the series.

    Is there something special about HAVE_GET_CYCLES that I'm missing that would
    make it only apply to ppc64 and not ppc32?

    josh
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    On Fri, 7 Nov 2008, David Howells wrote:

    > Andrew Morton wrote:
    >
    > > As I said in the text which you deleted and ignored, this would be
    > > better if it was implemented as a C function which requires that the
    > > caller explicitly pass in a reference to the state storage.


    The whole purpose of that thing is to be utterly fast and lightweight.
    Having an out of line C call would trash the major advantage of this
    code.

    > I'd be quite happy if it was:
    >
    > static inline u64 cnt32_to_63(u32 cnt_lo, u32 *__m_cnt_hi)
    > {
    > union cnt32_to_63 __x;
    > __x.hi = *__m_cnt_hi;
    > __x.lo = cnt_lo;
    > if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    > *__m_cnt_hi =
    > __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    > return __x.val;
    > }
    >
    > I imagine this would compile pretty much the same as the macro.


    Depends. As everybody has noticed now, the read ordering is important,
    and if gcc decides to not inline this for whatever reason then the
    ordering is lost. This is why this was a macro to start with.

    > I think it
    > would make it more obvious about the independence of the storage.


    I don't think having the associated storage be outside the macro make
    any sense either. There is simply no valid reason for having it shared
    between multiple invokations of the macro, as well as making its
    interface more complex for no gain.

    > Alternatively, perhaps Nicolas just needs to mention this in the comment more
    > clearly.


    I wrote that code so to me it is cristal clear already. Any suggestions
    as to how this could be improved?


    Nicolas
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    On Fri, 07 Nov 2008 10:01:01 -0500 (EST) Nicolas Pitre wrote:

    > On Fri, 7 Nov 2008, David Howells wrote:
    >
    > > Andrew Morton wrote:
    > >
    > > > As I said in the text which you deleted and ignored, this would be
    > > > better if it was implemented as a C function which requires that the
    > > > caller explicitly pass in a reference to the state storage.

    >
    > The whole purpose of that thing is to be utterly fast and lightweight.


    Well I'm glad it wasn't designed to demonstrate tastefulness.

    btw, do you know how damned irritating and frustrating it is for a code
    reviewer to have his comments deliberately ignored and deleted in
    replies?

    > Having an out of line C call would trash the major advantage of this
    > code.


    Not really.

    > > I'd be quite happy if it was:
    > >
    > > static inline u64 cnt32_to_63(u32 cnt_lo, u32 *__m_cnt_hi)
    > > {
    > > union cnt32_to_63 __x;
    > > __x.hi = *__m_cnt_hi;
    > > __x.lo = cnt_lo;
    > > if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    > > *__m_cnt_hi =
    > > __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    > > return __x.val;
    > > }
    > >
    > > I imagine this would compile pretty much the same as the macro.

    >
    > Depends. As everybody has noticed now, the read ordering is important,
    > and if gcc decides to not inline this


    If gcc did that then it would need to generate static instances of
    inlined functions within individual compilation units. It would be a
    disaster for the kernel. For a start, functions which are "inlined" in kernel
    modules wouldn't be able to access their static storage and modprobing
    them would fail.

    > for whatever reason then the
    > ordering is lost.


    Uninlining won't affect any ordering I can see.

    > This is why this was a macro to start with.
    >
    > > I think it
    > > would make it more obvious about the independence of the storage.

    >
    > I don't think having the associated storage be outside the macro make
    > any sense either. There is simply no valid reason for having it shared
    > between multiple invokations of the macro, as well as making its
    > interface more complex for no gain.


    oh god.

    > > Alternatively, perhaps Nicolas just needs to mention this in the comment more
    > > clearly.

    >
    > I wrote that code so to me it is cristal clear already. Any suggestions
    > as to how this could be improved?
    >


    Does mn10300's get_cycles() really count backwards? The first two
    callsites I looked at (crypto/tcrypt.c and fs/ext4/mballoc.c) assume
    that it is an upcounter.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Nicolas Pitre wrote:

    > The whole purpose of that thing is to be utterly fast and lightweight.
    > Having an out of line C call would trash the major advantage of this
    > code.


    No argument there.

    > > I imagine this would compile pretty much the same as the macro.


    Having said that, I realise it's wrong. The macro potentially takes a h/w
    read operation (cnt_lo) and does it at a place of its choosing - which the
    compiler may not be permitted to move if cnt_lo resolves to a bit of volatile
    inline asm with memory constraints. Converting it to an inline function
    forces cnt_lo to be resolved first.

    > Depends. As everybody has noticed now, the read ordering is important,
    > and if gcc decides to not inline this for whatever reason then the
    > ordering is lost. This is why this was a macro to start with.


    Good point. I wonder if you should've put a compiler barrier in there to at
    least make the point.

    > I don't think having the associated storage be outside the macro make any
    > sense either.


    It can have a comment attached to it to say what it represents. On the other
    hand, it'd probably need further comments attaching to it to fend off people
    who start thinking they can make use of this variable in other ways...


    > > Alternatively, perhaps Nicolas just needs to mention this in the comment
    > > more clearly.

    >
    > I wrote that code so to me it is cristal clear already. Any suggestions
    > as to how this could be improved?


    It ought to be, but clearly it isn't. Sometimes the obvious is all too easy to
    overlook. I'll think about it, but perhaps something like:

    * This macro uses a static internal variable to retain the upper counter.
    * This has two consequences: firstly, it may be used in multiple ways by
    * different callers for different things without interference; and secondly,
    * each caller will get its own, independent counter, and so an out of line
    * wrapper must be used if multiple callers want to use the same pair of
    * counters.

    It's a bit heavy-handed, but...

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: [RFC patch 07/18] Trace clock core

    * Andrew Morton (akpm@linux-foundation.org) wrote:
    > On Fri, 7 Nov 2008 01:16:43 -0500 Mathieu Desnoyers wrote:
    >
    > > > Is there something we should be fixing in m68k?
    > > >

    > >
    > > Yes, but I fear it's going to go deep into include hell :-(

    >
    > Oh, OK. I thought that the comment meant that m68k's on_each_cpu()
    > behaves differently at runtime from other architectures (and wrongly).
    >
    > If it's just some compile-time #include snafu then that's far less
    > of a concern.
    >


    Should I simply remove this comment then ?


    --
    Mathieu Desnoyers
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Andrew Morton wrote:

    > If gcc did that then it would need to generate static instances of
    > inlined functions within individual compilation units. It would be a
    > disaster for the kernel. For a start, functions which are "inlined" in kernel
    > modules wouldn't be able to access their static storage and modprobing
    > them would fail.


    Do you expect a static inline function that lives in a header file and that
    has a static variable in it to share that static variable over all instances
    of that function in a program? Or do you expect the static variable to be
    limited at the file level? Or just at the invocation level?

    > Does mn10300's get_cycles() really count backwards?


    Yes, because the value is generated by a pair of cascaded 16-bit hardware
    down-counters.

    > The first two callsites I looked at (crypto/tcrypt.c and fs/ext4/mballoc.c)
    > assume that it is an upcounter.


    Hmmm... I didn't occur to me that get_cycles() was available for use outside
    of arch code. Possibly it wasn't so used when I first came up with the code.

    I should probably make it count the other way.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: [RFC patch 07/18] Trace clock core

    On Fri, 7 Nov 2008 11:12:38 -0500 Mathieu Desnoyers wrote:

    > * Andrew Morton (akpm@linux-foundation.org) wrote:
    > > On Fri, 7 Nov 2008 01:16:43 -0500 Mathieu Desnoyers wrote:
    > >
    > > > > Is there something we should be fixing in m68k?
    > > > >
    > > >
    > > > Yes, but I fear it's going to go deep into include hell :-(

    > >
    > > Oh, OK. I thought that the comment meant that m68k's on_each_cpu()
    > > behaves differently at runtime from other architectures (and wrongly).
    > >
    > > If it's just some compile-time #include snafu then that's far less
    > > of a concern.
    > >

    >
    > Should I simply remove this comment then ?
    >


    umm, it could perhaps be clarified - mention that it's needed for an
    include order problem.

    It's a bit odd. Surely by the time we've included these:

    +#include
    +#include
    +#include
    +#include
    +#include
    +#include
    +#include
    +#include
    +#include
    +#include

    someone has already included sched.h, and the definition of
    _LINUX_SCHED_H will cause the later inclusion to not change anything?

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    On Fri, 07 Nov 2008 16:21:55 +0000 David Howells wrote:

    > Andrew Morton wrote:
    >
    > > If gcc did that then it would need to generate static instances of
    > > inlined functions within individual compilation units. It would be a
    > > disaster for the kernel. For a start, functions which are "inlined" in kernel
    > > modules wouldn't be able to access their static storage and modprobing
    > > them would fail.

    >
    > Do you expect a static inline function that lives in a header file and that
    > has a static variable in it to share that static variable over all instances
    > of that function in a program? Or do you expect the static variable to be
    > limited at the file level? Or just at the invocation level?


    I'd expect it to behave in the same way as it would if the function was
    implemented out-of-line.

    But it occurs to me that the modrobe-doesnt-work thing would happen if
    the function _is_ inlined anyway, so we won't be doing that.

    Whatever. Killing this many puppies because gcc may do something so
    bizarrely wrong isn't justifiable.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    * David Howells (dhowells@redhat.com) wrote:
    > Andrew Morton wrote:
    >
    > > We have a macro which must only have a single usage in any particular
    > > kernel build (and nothing to detect a violation of that).

    >
    > That's not true. It's a macro containing a _static_ local variable, therefore
    > the macro may be used multiple times, and each time it's used the compiler
    > will allocate a new piece of storage.
    >
    > > It apparently tries to avoid races via ordering tricks, as long
    > > as it is called with sufficient frequency. But nothing guarantees
    > > that it _is_ called sufficiently frequently?

    >
    > The comment attached to it clearly states this restriction. Therefore the
    > caller must guarantee it. That is something Mathieu's code and my code must
    > deal with, not Nicolas's.
    >
    > > There is absolutely no reason why the first two of these quite bad things
    > > needed to be done. In fact there is no reason why it needed to be
    > > implemented as a macro at all.

    >
    > There's a very good reason to implement it as either a macro or an inline
    > function: it's faster. Moving the whole thing out of line would impose an
    > additional function call overhead - with a 64-bit return value on 32-bit
    > platforms. For my case - sched_clock() - I'm willing to burn a bit of extra
    > space to get the extra speed.
    >
    > > As I said in the text which you deleted and ignored, this would be
    > > better if it was implemented as a C function which requires that the
    > > caller explicitly pass in a reference to the state storage.

    >
    > I'd be quite happy if it was:
    >
    > static inline u64 cnt32_to_63(u32 cnt_lo, u32 *__m_cnt_hi)
    > {
    > union cnt32_to_63 __x;
    > __x.hi = *__m_cnt_hi;
    > __x.lo = cnt_lo;
    > if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    > *__m_cnt_hi =
    > __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    > return __x.val;
    > }
    >


    Almost there. At least, with this kind of implementation, we would not
    have to resort to various tricks to make sure a single code path is
    called at a certain frequency. We would simply have to make sure the
    __m_cnt_hi value is updated at a certain frequency. Thinking about
    "data" rather than "code" makes much more sense.

    The only missing thing here is the correct ordering. The problem is, as
    I presented in more depth in my previous discussion with Nicolas, that
    the __m_cnt_hi value has to be read before cnt_lo. First off, using this
    macro with get_cycles() is simply buggy, because the macro expects
    _perfect_ order of timestamps, no skew whatsoever, or otherwise time
    could jump. This macro is therefore good only for mmio reads. One should
    use per-cpu variables to keep the state of get_cycles() reads (as I did
    in my other patch).

    The following approach should work :

    static inline u64 cnt32_to_63(u32 io_addr, u32 *__m_cnt_hi)
    {
    union cnt32_to_63 __x;
    __x.hi = *__m_cnt_hi; /* memory read for high bits internal state */
    smp_rmb(); /*
    * read high bits before low bits insures time
    * does not go backward.
    */
    __x.lo = readl(cnt_lo); /* mmio read */
    if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    *__m_cnt_hi =
    __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    return __x.val;
    }

    But any get_cycles() user of cnt32_to_63() should be shot down. The
    bright side is : there is no way get_cycles() can be used with this
    new code.

    e.g. of incorrect users for arm (unless they are UP only, but that seems
    like a weird design argument) :

    mach-sa1100/include/mach/SA-1100.h:#define OSCR __REG(0x90000010)
    /* OS timer Counter Reg. */
    mach-sa1100/generic.c: unsigned long long v = cnt32_to_63(OSCR);
    mach-pxa/include/mach/pxa-regs.h:#define OSCR __REG(0x40A00010) /* OS
    Timer Counter Register */
    mach-pxa/time.c: unsigned long long v = cnt32_to_63(OSCR);

    Correct user :
    mach-versatile/core.c: unsigned long long v =
    cnt32_to_63(readl(VERSATILE_REFCOUNTER));

    The new call would look like :

    /* Hi 32-bits of versatile refcounter state, kept for cnt32_to_64. */
    static u32 versatile_refcounter_hi;

    unsigned long long v = cnt32_to_64(VERSATILE_REFCOUNTER, refcounter_hi);

    Mathieu


    > I imagine this would compile pretty much the same as the macro. I think it
    > would make it more obvious about the independence of the storage.
    >
    > Alternatively, perhaps Nicolas just needs to mention this in the comment more
    > clearly.
    >
    > David


    --
    Mathieu Desnoyers
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    On Fri, 7 Nov 2008, Andrew Morton wrote:

    > On Fri, 07 Nov 2008 10:01:01 -0500 (EST) Nicolas Pitre wrote:
    >
    > > On Fri, 7 Nov 2008, David Howells wrote:
    > >
    > > > Andrew Morton wrote:
    > > >
    > > > > As I said in the text which you deleted and ignored, this would be
    > > > > better if it was implemented as a C function which requires that the
    > > > > caller explicitly pass in a reference to the state storage.

    > >
    > > The whole purpose of that thing is to be utterly fast and lightweight.

    >
    > Well I'm glad it wasn't designed to demonstrate tastefulness.


    Fast tricks aren't always meant to be beautiful. That,s why we have
    abstraction layers.

    > btw, do you know how damned irritating and frustrating it is for a code
    > reviewer to have his comments deliberately ignored and deleted in
    > replies?


    Do you know how irritating and frustrating it is when reviewers don't
    care reading the damn comments along with the code? Maybe it wasn't the
    case, but you gave the impression of jumping to conclusion without even
    bothering about the associated explanation in the code until I directed
    you at it.

    > > Having an out of line C call would trash the major advantage of this
    > > code.

    >
    > Not really.


    Your opinion.

    > > > I imagine this would compile pretty much the same as the macro.

    > >
    > > Depends. As everybody has noticed now, the read ordering is important,
    > > and if gcc decides to not inline this

    >
    > If gcc did that then it would need to generate static instances of
    > inlined functions within individual compilation units. It would be a
    > disaster for the kernel. For a start, functions which are "inlined" in kernel
    > modules wouldn't be able to access their static storage and modprobing
    > them would fail.


    That doesn't mean that access ordering is preserved within the function,
    unless the interface is pointer based, but that won't work if the
    counter is accessed through some special instruction rather than a
    memory location.

    > > for whatever reason then the
    > > ordering is lost.

    >
    > Uninlining won't affect any ordering I can see.


    See above.

    > > This is why this was a macro to start with.
    > >
    > > > I think it
    > > > would make it more obvious about the independence of the storage.

    > >
    > > I don't think having the associated storage be outside the macro make
    > > any sense either. There is simply no valid reason for having it shared
    > > between multiple invokations of the macro, as well as making its
    > > interface more complex for no gain.

    >
    > oh god.


    Thank you. ;-)

    > > > Alternatively, perhaps Nicolas just needs to mention this in the comment more
    > > > clearly.

    > >
    > > I wrote that code so to me it is cristal clear already. Any suggestions
    > > as to how this could be improved?
    > >

    >
    > Does mn10300's get_cycles() really count backwards? The first two
    > callsites I looked at (crypto/tcrypt.c and fs/ext4/mballoc.c) assume
    > that it is an upcounter.


    I know nothing about mn10300.


    Nicolas
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Nicolas Pitre wrote:

    > > Well I'm glad it wasn't designed to demonstrate tastefulness.

    >
    > Fast tricks aren't always meant to be beautiful. That,s why we have
    > abstraction layers.


    And comments. The comment attached to cnt32_to_63() is well written and
    informative.

    > I know nothing about mn10300.


    That's aimed my way.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  18. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    * David Howells (dhowells@redhat.com) wrote:
    > Nicolas Pitre wrote:
    >
    > > > I mean, the darned thing is called from sched_clock(), which can be
    > > > concurrently called on separate CPUs and which can be called from
    > > > interrupt context (with an arbitrary nesting level!) while it was running
    > > > in process context.

    > >
    > > Yes! And this is so on *purpose*. Please take some time to read the
    > > comment that goes along with it, and if you're still not convinced then
    > > look for those explanation emails I've already posted.

    >
    > I agree with Nicolas on this. It's abominably clever, but I think he's right.
    >
    > The one place I remain unconvinced is over the issue of preemption of a process
    > that is in the middle of cnt32_to_63(), where if the preempted process is
    > asleep for long enough, I think it can wind time backwards when it resumes, but
    > that's not a problem for the one place I want to use it (sched_clock()) because
    > that is (almost) always called with preemption disabled in one way or another.
    >
    > The one place it isn't is a debugging case that I'm not too worried about.
    >


    I am also concerned about the non-preemption off case.

    Then I think the function should document that it must be called with
    preempt disabled.

    Mathieu

    > > > /*
    > > > * Caller must provide locking to protect *caller_state
    > > > */

    > >
    > > NO! This is meant to be LOCK FREE!

    >
    > Absolutely.
    >
    > David


    --
    Mathieu Desnoyers
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  19. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    Mathieu Desnoyers wrote:

    > First off, using this macro with get_cycles() is simply buggy, because the
    > macro expects _perfect_ order of timestamps, no skew whatsoever, or
    > otherwise time could jump.


    Erm... Why can't I pass it get_cycles()? Are you saying that sched_clock()
    in MN10300 is wrong for it's use of get_cycles() with cnt32_to_63()?

    > __x.lo = readl(cnt_lo); /* mmio read */


    readl() might insert an extra barrier instruction. Not only that, io_addr
    must be unsigned long.

    David
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  20. Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

    * David Howells (dhowells@redhat.com) wrote:
    > Mathieu Desnoyers wrote:
    >
    > > First off, using this macro with get_cycles() is simply buggy, because the
    > > macro expects _perfect_ order of timestamps, no skew whatsoever, or
    > > otherwise time could jump.

    >
    > Erm... Why can't I pass it get_cycles()? Are you saying that sched_clock()
    > in MN10300 is wrong for it's use of get_cycles() with cnt32_to_63()?
    >


    Yes. Do you think the synchronization of the cycles counters is
    _perfect_ across CPUs so that there is no possible way whatsoever that
    two cycle counter values appear to go backward between CPUs ? (also
    taking in account delays in __m_cnt_hi write-back...)

    As I showed in my previous example, if you are unlucky enough to hit the
    spot where the cycle counters go backward at the time warp edge, time
    will jump of 2^32, so about 4.29s at 1GHz.

    > > __x.lo = readl(cnt_lo); /* mmio read */

    >
    > readl() might insert an extra barrier instruction. Not only that, io_addr
    > must be unsigned long.


    If we expect the only correct use-case to be with readl(), I don't see
    the problem with added synchronization.

    >


    Ah, right, then the parameters should be updated accordingly.

    static inline u64 cnt32_to_63(unsigned long io_addr, u32 *__m_cnt_hi)
    {
    union cnt32_to_63 __x;
    __x.hi = *__m_cnt_hi; /* memory read for high bits internal state */
    rmb(); /*
    * read high bits before low bits insures time
    * does not go backward. Sync across
    * CPUs and for interrupts.
    */
    __x.lo = readl(io_addr); /* mmio read */
    if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
    *__m_cnt_hi =
    __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
    return __x.val;
    }

    Mathieu

    > David


    --
    Mathieu Desnoyers
    OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread
Page 2 of 6 FirstFirst 1 2 3 4 ... LastLast