Re: Semphore -> mutex in the device tree - Kernel

This is a discussion on Re: Semphore -> mutex in the device tree - Kernel ; On Thu, 2008-04-17 at 12:11 -0400, Alan Stern wrote: > On Thu, 17 Apr 2008, Peter Zijlstra wrote: > > > On Thu, 2008-04-17 at 11:22 -0400, Alan Stern wrote: > > > Peter: > > > > > > ...

+ Reply to Thread
Results 1 to 6 of 6

Thread: Re: Semphore -> mutex in the device tree

  1. Re: Semphore -> mutex in the device tree

    On Thu, 2008-04-17 at 12:11 -0400, Alan Stern wrote:
    > On Thu, 17 Apr 2008, Peter Zijlstra wrote:
    >
    > > On Thu, 2008-04-17 at 11:22 -0400, Alan Stern wrote:
    > > > Peter:
    > > >
    > > > The obstacle to converting the semaphore in struct device to a mutex
    > > > has been that its tree-oriented usage pattern isn't compatible with
    > > > lockdep.
    > > >
    > > > In order to get around this and at least begin the conversion process,
    > > > how about adding a provision for making some classes of mutex invisible
    > > > to lockdep? I know it doesn't solve the fundamental problem, but maybe
    > > > it's a step in the right direction.

    > >
    > > the device lock has two problems with lockdep:
    > >
    > > 1) on suspend it takes more than MAX_LOCK_DEPTH (48) locks

    >
    > This isn't true any more. Not in Greg KH's development tree.
    >
    > > 2) tree nesting
    > >
    > >
    > > Lets start with the easy one first; would a similar solution to the
    > > radix tree locking as found in -rt work?
    > >
    > > http://programming.kicks-ass.net/ker...-lockdep.patch
    > >
    > > That does mean you have to set an effective max depth to the tree, is
    > > that a practical issue?

    >
    > I don't know. But I suspect it wouldn't be sufficient to solve the
    > problems associated with tree nesting.


    It works for strict top-down locking. The sideways locking you do:

    > For example, it's quite likely that some code somewhere needs to hold
    > two sibling nodes' locks at the same time. Provided the parent node is
    > already locked, this operation is perfectly safe. But is lockdep able
    > to handle it?


    Your siblings are ordered; so a simple mutex_lock_nested() should work
    between siblings as long as you never need more than 8 siblings locked
    at any one time.

    > There are other, more subtle problems too; this is just one example.


    Can you think of a situation where the top-down class annotation and the
    sideways _nesting() isn't sufficient? If so, please share.

    > > The harder part is 1), holding _that_ many locks. Would something
    > > obscene like this work for you:

    >
    > This is no longer needed, fortunately. :-)


    Ah, good :-)

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: Semphore -> mutex in the device tree

    On Thu, 17 Apr 2008, Peter Zijlstra wrote:

    > > > That does mean you have to set an effective max depth to the tree, is
    > > > that a practical issue?

    > >
    > > I don't know. But I suspect it wouldn't be sufficient to solve the
    > > problems associated with tree nesting.

    >
    > It works for strict top-down locking. The sideways locking you do:
    >
    > > For example, it's quite likely that some code somewhere needs to hold
    > > two sibling nodes' locks at the same time. Provided the parent node is
    > > already locked, this operation is perfectly safe. But is lockdep able
    > > to handle it?

    >
    > Your siblings are ordered; so a simple mutex_lock_nested() should work
    > between siblings as long as you never need more than 8 siblings locked
    > at any one time.


    I don't fully understand the implications here. Let A and B be sibling
    device nodes. Suppose task 0 locks device A with NESTING_PARENT and
    then device B with NESTING_CHILD, while at about the same time task 1
    locks B with NESTING PARENT and then A with NESTING_CHILD. This is
    perfectly safe provided both tasks acquire the parent lock first, but
    otherwise it isn't. How would lockdep account for this?

    And don't say that one task is ignoring the ordering of the siblings.
    In fact the ordering is a rather weak one (time of registration), there
    might be different orderings that are more relevant (e.g., the numbers
    of the ports into which the siblings are plugged), and it isn't
    particularly easy to tell which of two siblings should come first.
    Furthermore the order of locking isn't always under our control; there
    _will_ be times when the order has to be "backward".

    > > There are other, more subtle problems too; this is just one example.

    >
    > Can you think of a situation where the top-down class annotation and the
    > sideways _nesting() isn't sufficient? If so, please share.


    I don't understand all the intricacies of lockdep. In principle
    ordering of siblings gives rise to a total ordering of the entire tree,
    so let's assume we have such a total ordering and that everyone always
    obeys it.

    Even so there is a potential for trouble. I don't know of any concrete
    examples like this in the kernel, but they might exist. Suppose a
    driver keeps a private mutex associated with each device it manages.
    Normally the device's lock would be acquired first and the private
    mutex second. But there could be places where the driver acquires a
    child device's lock while holding the parent's mutex; this would look
    to lockdep like a violation.

    Alan Stern

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: Semphore -> mutex in the device tree

    On Thu, 2008-04-17 at 14:43 -0400, Alan Stern wrote:
    > On Thu, 17 Apr 2008, Peter Zijlstra wrote:
    >
    > > > > That does mean you have to set an effective max depth to the tree, is
    > > > > that a practical issue?
    > > >
    > > > I don't know. But I suspect it wouldn't be sufficient to solve the
    > > > problems associated with tree nesting.

    > >
    > > It works for strict top-down locking. The sideways locking you do:
    > >
    > > > For example, it's quite likely that some code somewhere needs to hold
    > > > two sibling nodes' locks at the same time. Provided the parent node is
    > > > already locked, this operation is perfectly safe. But is lockdep able
    > > > to handle it?

    > >
    > > Your siblings are ordered; so a simple mutex_lock_nested() should work
    > > between siblings as long as you never need more than 8 siblings locked
    > > at any one time.

    >
    > I don't fully understand the implications here. Let A and B be sibling
    > device nodes. Suppose task 0 locks device A with NESTING_PARENT and
    > then device B with NESTING_CHILD, while at about the same time task 1
    > locks B with NESTING PARENT and then A with NESTING_CHILD. This is
    > perfectly safe provided both tasks acquire the parent lock first, but
    > otherwise it isn't. How would lockdep account for this?


    Lockdep only cares about class order; if you always lock NESTING_PARENT
    -> NESTING_CHILD it won't complain; but it will not require you to hold
    the parent lock (although I can provide you with an annotation to that
    effect if you want; lockdep_assert_held(dev->parent->lock) - or
    something like that).

    So using these annotations you can annotate a real deadlock 'away'.

    > And don't say that one task is ignoring the ordering of the siblings.
    > In fact the ordering is a rather weak one (time of registration), there
    > might be different orderings that are more relevant (e.g., the numbers
    > of the ports into which the siblings are plugged), and it isn't
    > particularly easy to tell which of two siblings should come first.
    > Furthermore the order of locking isn't always under our control; there
    > _will_ be times when the order has to be "backward".


    As long as you're sure there aren't any actual deadlocks; and ensure you
    use your annotated classes in the same order:

    tree_level_n -> tree_level_n+1

    and

    tree_leve_n_sub_m -> tree_level_n_sub_m+1

    (subclasses are from the _nesting() annotation)

    lockdep will not complain,

    > > > There are other, more subtle problems too; this is just one example.

    > >
    > > Can you think of a situation where the top-down class annotation and the
    > > sideways _nesting() isn't sufficient? If so, please share.

    >
    > I don't understand all the intricacies of lockdep. In principle
    > ordering of siblings gives rise to a total ordering of the entire tree,
    > so let's assume we have such a total ordering and that everyone always
    > obeys it.
    >
    > Even so there is a potential for trouble. I don't know of any concrete
    > examples like this in the kernel, but they might exist. Suppose a
    > driver keeps a private mutex associated with each device it manages.
    > Normally the device's lock would be acquired first and the private
    > mutex second. But there could be places where the driver acquires a
    > child device's lock while holding the parent's mutex; this would look
    > to lockdep like a violation.


    So lockdep cares about classes and the hierarchy of thereof; so given
    your example:

    parent_tree_level
    child_tree_level
    device_lock

    Its perfectly fine to take a lock from 'parent_tree_level' and then a
    lock from 'device_lock', skipping the class in the middle - as long as
    you thereafter never acquire a lock from it.

    So given a pre-determined class hierarchy, you're not required to take
    all locks in that hierarchy; as long as you always go down. If you ever
    take a lock so that moves up in the hierarchy you're in trouble.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: Semphore -> mutex in the device tree

    On Fri, 18 Apr 2008, Peter Zijlstra wrote:

    > > Even so there is a potential for trouble. I don't know of any concrete
    > > examples like this in the kernel, but they might exist. Suppose a
    > > driver keeps a private mutex associated with each device it manages.
    > > Normally the device's lock would be acquired first and the private
    > > mutex second. But there could be places where the driver acquires a
    > > child device's lock while holding the parent's mutex; this would look
    > > to lockdep like a violation.

    >
    > So lockdep cares about classes and the hierarchy of thereof; so given
    > your example:
    >
    > parent_tree_level
    > child_tree_level
    > device_lock
    >
    > Its perfectly fine to take a lock from 'parent_tree_level' and then a
    > lock from 'device_lock', skipping the class in the middle - as long as
    > you thereafter never acquire a lock from it.
    >
    > So given a pre-determined class hierarchy, you're not required to take
    > all locks in that hierarchy; as long as you always go down. If you ever
    > take a lock so that moves up in the hierarchy you're in trouble.


    We must be talking at cross purposes. Are classes and subclasses all
    that lockdep looks at?

    Let's take a simpler example. Suppose driver D's probe routine
    registers a child device. Then we have:

    Subsystem: Register device A with driver core

    Driver core: Lock device A with NESTING_PARENT
    Call Drobe()

    Drobe(): Register device B with driver core
    as a child of A

    Driver core: Lock device B with NESTING_PARENT
    Call Erobe()

    (where E is the appropriate driver for B). Is this a lockdep
    violation? Both A and B are locked with the same nesting level,
    because they are locked by the same code in the driver core, but
    one is the parent of the other in the device tree.


    Or maybe I misunderstood, and you're proposing to use a node's level in
    the tree as its lockdep nesting level. In that case, consider this
    example. Suppose driver D associates a private mutex M with each of
    its devices. Suppose D is managing device A at level 4 and device B at
    level 5. Then we might have:

    D: Lock device B at level 5
    D: Lock B's associated M

    (which tells lockdep that M comes after level 5), together with

    D: Lock device A at level 4
    D: Lock A's associated M
    D: Lock A's child at level 5

    Won't this then be a lockdep violation (since M is now locked before a
    device at level 5)?

    Alan Stern

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: Semphore -> mutex in the device tree

    On Fri, 2008-04-18 at 10:27 -0400, Alan Stern wrote:
    > On Fri, 18 Apr 2008, Peter Zijlstra wrote:
    >
    > > > Even so there is a potential for trouble. I don't know of any concrete
    > > > examples like this in the kernel, but they might exist. Suppose a
    > > > driver keeps a private mutex associated with each device it manages.
    > > > Normally the device's lock would be acquired first and the private
    > > > mutex second. But there could be places where the driver acquires a
    > > > child device's lock while holding the parent's mutex; this would look
    > > > to lockdep like a violation.

    > >
    > > So lockdep cares about classes and the hierarchy of thereof; so given
    > > your example:
    > >
    > > parent_tree_level
    > > child_tree_level
    > > device_lock
    > >
    > > Its perfectly fine to take a lock from 'parent_tree_level' and then a
    > > lock from 'device_lock', skipping the class in the middle - as long as
    > > you thereafter never acquire a lock from it.
    > >
    > > So given a pre-determined class hierarchy, you're not required to take
    > > all locks in that hierarchy; as long as you always go down. If you ever
    > > take a lock so that moves up in the hierarchy you're in trouble.

    >
    > We must be talking at cross purposes. Are classes and subclasses all
    > that lockdep looks at?


    Yes, it is fully class based.

    > Let's take a simpler example. Suppose driver D's probe routine
    > registers a child device. Then we have:
    >
    > Subsystem: Register device A with driver core
    >
    > Driver core: Lock device A with NESTING_PARENT
    > Call Drobe()
    >
    > Drobe(): Register device B with driver core
    > as a child of A
    >
    > Driver core: Lock device B with NESTING_PARENT
    > Call Erobe()
    >
    > (where E is the appropriate driver for B). Is this a lockdep
    > violation? Both A and B are locked with the same nesting level,
    > because they are locked by the same code in the driver core, but
    > one is the parent of the other in the device tree.


    Do I interpert this correct when I envision a call-chain like this:

    register_devise(A, some_parent)
    lock_device(A, NESTING_PARENT)
    D->probe()
    register_device(B, A)
    lock_device(B, NESTING_PARENT)

    That would work iff register_device() sets a tree-level class on B that
    is one down from A.

    > Or maybe I misunderstood, and you're proposing to use a node's level in
    > the tree as its lockdep nesting level.


    Yes, associate a class with each level like this:

    static struct lockdep_class_key device_tree_class[MAX_DEVICE_TREE_DEPTH];

    register_device(child, parent)
    {
    ...
    child->depth = parent->depth + 1;
    WARN_ON(child->depth > MAX_DEVICE_TREE_DEPTH);
    mutex_destroy(&child->lock);
    mutex_init(&child->lock);
    lockdep_set_class(&child->lock, &device_tree_class[child->depth]);
    ...
    }

    Now suppose we have a tree like:

    0 A
    / | \
    1 B C D
    / | \
    2 E F F
    |
    3 H


    Now, you can lock the whole path to H like:

    mutex_lock(&A->lock);
    mutex_lock(&D->lock);
    mutex_unlock(&A->lock);
    mutex_lock(&E->lock);
    mutex_unlock(&D->lock);
    mutex_lock(&H->lock);
    mutex_unlock(&E->lock);

    < H locked >

    without a single other lockdep annotation; this will teach lockdep the
    following class order:

    device_tree_class[0]
    device_tree_class[1]
    device_tree_class[2]
    device_tree_class[3]

    So a lock sequence like:

    mutex_lock(&E->lock);
    mutex_lock(&D->lock);

    Which will go from 2 -> 1, will generate a complaint.

    So, now your sibling scenario:

    Lock D, E and F:

    mutex_lock(&D->lock);
    mutex_lock(&E->lock);
    mutex_lock_nested(&F->lock, SINGLE_DEPTH_NESTING);

    This will teach lockdep the following class order:

    device_tree_class[1]
    device_tree_class[2]
    device_tree_class[2].subclass[1]

    So, if at another time you do:

    mutex_lock(&D->lock);
    mutex_lock(&F->lock);
    mutex_lock(&E->lock, SINGLE_DEPTH_NESTING);

    you're still obeying that order; of course you have to somehow guarantee
    that it will never actually deadlock - otherwise you annotate a genuine
    warning away.

    > In that case, consider this
    > example. Suppose driver D associates a private mutex M with each of
    > its devices. Suppose D is managing device A at level 4 and device B at
    > level 5. Then we might have:
    >
    > D: Lock device B at level 5
    > D: Lock B's associated M
    >
    > (which tells lockdep that M comes after level 5), together with
    >
    > D: Lock device A at level 4
    > D: Lock A's associated M
    > D: Lock A's child at level 5

    ^ B, right?
    >
    > Won't this then be a lockdep violation (since M is now locked before a
    > device at level 5)?


    Interesting.. yes, this would make lockdep upset - basically because you
    introduce nesting of M.

    device_tree_class[4]
    M_class
    device_tree_class[5]
    M_class

    So you take M_class inside M_class.

    Is this a common scenario? Normally a driver would only deal with a
    single device instance at a time, so I guess that once this scenario can
    happen the driver is already aware of this, right?

    It would need a separate annotation; if the coupling would be static
    (ps2 keyboard/mouse comes to mind) then the driver can have different
    lockdep_class_key instances.


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: Semphore -> mutex in the device tree

    On Fri, 18 Apr 2008, Peter Zijlstra wrote:

    > Do I interpert this correct when I envision a call-chain like this:
    >
    > register_devise(A, some_parent)
    > lock_device(A, NESTING_PARENT)
    > D->probe()
    > register_device(B, A)
    > lock_device(B, NESTING_PARENT)
    >
    > That would work iff register_device() sets a tree-level class on B that
    > is one down from A.
    >
    > > Or maybe I misunderstood, and you're proposing to use a node's level in
    > > the tree as its lockdep nesting level.

    >
    > Yes, associate a class with each level like this:
    >
    > static struct lockdep_class_key device_tree_class[MAX_DEVICE_TREE_DEPTH];
    >
    > register_device(child, parent)
    > {
    > ...
    > child->depth = parent->depth + 1;
    > WARN_ON(child->depth > MAX_DEVICE_TREE_DEPTH);
    > mutex_destroy(&child->lock);
    > mutex_init(&child->lock);
    > lockdep_set_class(&child->lock, &device_tree_class[child->depth]);
    > ...
    > }


    Aha. I never understood that lockdep classes worked like that.

    ....
    > So, now your sibling scenario:
    >
    > Lock D, E and F:
    >
    > mutex_lock(&D->lock);
    > mutex_lock(&E->lock);
    > mutex_lock_nested(&F->lock, SINGLE_DEPTH_NESTING);
    >
    > This will teach lockdep the following class order:
    >
    > device_tree_class[1]
    > device_tree_class[2]
    > device_tree_class[2].subclass[1]
    >
    > So, if at another time you do:
    >
    > mutex_lock(&D->lock);
    > mutex_lock(&F->lock);
    > mutex_lock(&E->lock, SINGLE_DEPTH_NESTING);
    >
    > you're still obeying that order; of course you have to somehow guarantee
    > that it will never actually deadlock - otherwise you annotate a genuine
    > warning away.


    I'm still not sure this will end up working. In the situation I have
    in mind both sibling classes are locked by the same line of code;
    hence they will end up with the same nesting/subclass.

    > > In that case, consider this
    > > example. Suppose driver D associates a private mutex M with each of
    > > its devices. Suppose D is managing device A at level 4 and device B at
    > > level 5. Then we might have:
    > >
    > > D: Lock device B at level 5
    > > D: Lock B's associated M
    > >
    > > (which tells lockdep that M comes after level 5), together with
    > >
    > > D: Lock device A at level 4
    > > D: Lock A's associated M
    > > D: Lock A's child at level 5

    > ^ B, right?


    Doesn't have to be B. It could be any device.

    > > Won't this then be a lockdep violation (since M is now locked before a
    > > device at level 5)?

    >
    > Interesting.. yes, this would make lockdep upset - basically because you
    > introduce nesting of M.
    >
    > device_tree_class[4]
    > M_class
    > device_tree_class[5]
    > M_class
    >
    > So you take M_class inside M_class.
    >
    > Is this a common scenario? Normally a driver would only deal with a
    > single device instance at a time, so I guess that once this scenario can
    > happen the driver is already aware of this, right?


    I don't know if this ever occurs in the kernel. But it is a plausible
    scenario.

    In situations like this, where a private lock is acquired both before
    and after a device lock (albeit on different levels), the most logical
    solution is to assign each private lock to a class connected with the
    associated device's class. Can that be made to work?

    > It would need a separate annotation; if the coupling would be static
    > (ps2 keyboard/mouse comes to mind) then the driver can have different
    > lockdep_class_key instances.


    Rather like what I just said, right?


    I have wondered if it would be feasible for lockdep to support
    tree-ordered locks directly. It would have to know which mutexes
    belong to a tree structure, as well as the offsets to the parent
    pointer and to the mutex from the beginning of the enclosing structure.

    Since processes rarely lock more than a few tree members at a time,
    lockdep could get away with checking a few simple rules (for downward
    lock ordering):

    When locking a device D, if any other device locks are
    held then D's parent must already be locked. (A little
    more strict than necessary, but this should be acceptable.)

    When locking a device D, none of the other device locks
    already held may be for descendants of D.

    No doubt there are details making this less easy than I have painted
    it. Still, could it reasonably be done?

    Alan Stern

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread