[PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4) - Kernel

This is a discussion on [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4) - Kernel ; This patchset is intended to fix the oopses, memory corruption and mount failures when using the reproducer detailed here: https://bugzilla.samba.org/show_bug.cgi?id=5720 This is the fourth attempt at this. Since the third attempt, Steve French has committed the patch to handle the ...

+ Reply to Thread
Results 1 to 5 of 5

Thread: [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

  1. [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

    This patchset is intended to fix the oopses, memory corruption and mount
    failures when using the reproducer detailed here:

    https://bugzilla.samba.org/show_bug.cgi?id=5720

    This is the fourth attempt at this. Since the third attempt, Steve
    French has committed the patch to handle the server->tsk pointer more
    atomically so that patch is ommitted here. While that patch helps ensure
    that the shutdown of the demux thread happens cleanly, there are still
    other races.

    Andrew Morton has also taken the patch to clean up the server protocol
    handling for -mm. I've included this patch in the set since Steve has
    not yet taken it, and I've had to modify it slightly for changes that
    have gone into Steve's tree.

    This patchset is based on Steve French's cifs-2.6 git tree and should
    apply cleanly to its current state.

    The main differences in this patchset are that it fixes some bugs in
    list handling in the earlier patchsets. It also reenables the sharing
    of tree connects. With this, any structures that were previously
    shared should remain so (no loss of functionality).

    There's still some remaining cleanup work that can be done here. The
    cifs_mount code could stand to be broken up into smaller functions.
    cifs_debug_data_proc_show could also stand to be reorganized to better
    reflect the heirarchy of server->session->tcon. Those changes are
    probably more suitable in follow-on patches. I'd like to know whether
    these are acceptible before I spend time working on them.

    I've been able to run the reproducer in the above BZ overnight on this
    patchset. Without it, it usually crashes within a few minutes.

    Jeff Layton (4):
    cifs: clean up server protocol handling for TCP_Server_Info
    cifs: disable sharing session and tcon and add new TCP sharing code
    cifs: reinstate sharing of SMB sessions
    cifs: reinstate sharing of tree connections

    fs/cifs/cifs_debug.c | 286 +++++++++++++++++++---------------
    fs/cifs/cifsfs.c | 33 +++--
    fs/cifs/cifsglob.h | 28 ++--
    fs/cifs/cifssmb.c | 54 +------
    fs/cifs/connect.c | 426 +++++++++++++++++++++++++-------------------------
    fs/cifs/misc.c | 93 +++++------
    6 files changed, 456 insertions(+), 464 deletions(-)

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Fwd: [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

    This is much better since it doesn't regress the tcon sharing (which
    we still need to extend for the shared superblock case) - I
    particularly like the new routines to put (free) the tcon and put the
    smb session, but think the locking gets more complicated (with little
    performance gain) moving from one global spinlock covering
    tcon/smb-session/cifs-tcp-socket to one embedded within each
    structure, and also could be confusing since the cifs tcp socket is
    already protected by a semaphore and now would have a spinlock too.
    I think we greatly increase the chance of deadlock having to nest
    spinlocks.



    On Thu, Oct 30, 2008 at 10:16 AM, Jeff Layton wrote:
    >
    > This patchset is intended to fix the oopses, memory corruption and mount
    > failures when using the reproducer detailed here:
    >
    > https://bugzilla.samba.org/show_bug.cgi?id=5720
    >
    > This is the fourth attempt at this. Since the third attempt, Steve
    > French has committed the patch to handle the server->tsk pointer more
    > atomically so that patch is ommitted here. While that patch helps ensure
    > that the shutdown of the demux thread happens cleanly, there are still
    > other races.
    >
    > Andrew Morton has also taken the patch to clean up the server protocol
    > handling for -mm. I've included this patch in the set since Steve has
    > not yet taken it, and I've had to modify it slightly for changes that
    > have gone into Steve's tree.
    >
    > This patchset is based on Steve French's cifs-2.6 git tree and should
    > apply cleanly to its current state.
    >
    > The main differences in this patchset are that it fixes some bugs in
    > list handling in the earlier patchsets. It also reenables the sharing
    > of tree connects. With this, any structures that were previously
    > shared should remain so (no loss of functionality).
    >
    > There's still some remaining cleanup work that can be done here. The
    > cifs_mount code could stand to be broken up into smaller functions.
    > cifs_debug_data_proc_show could also stand to be reorganized to better
    > reflect the heirarchy of server->session->tcon. Those changes are
    > probably more suitable in follow-on patches. I'd like to know whether
    > these are acceptible before I spend time working on them.
    >
    > I've been able to run the reproducer in the above BZ overnight on this
    > patchset. Without it, it usually crashes within a few minutes.
    >
    > Jeff Layton (4):
    > cifs: clean up server protocol handling for TCP_Server_Info
    > cifs: disable sharing session and tcon and add new TCP sharing code
    > cifs: reinstate sharing of SMB sessions
    > cifs: reinstate sharing of tree connections
    >
    > fs/cifs/cifs_debug.c | 286 +++++++++++++++++++---------------
    > fs/cifs/cifsfs.c | 33 +++--
    > fs/cifs/cifsglob.h | 28 ++--
    > fs/cifs/cifssmb.c | 54 +------
    > fs/cifs/connect.c | 426 +++++++++++++++++++++++++-------------------------
    > fs/cifs/misc.c | 93 +++++------
    > 6 files changed, 456 insertions(+), 464 deletions(-)
    >




    --
    Thanks,

    Steve



    --
    Thanks,

    Steve
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

    On Thu, 30 Oct 2008 12:25:36 -0500
    "Steve French" wrote:

    > This is much better since it doesn't regress the tcon sharing (which we
    > still need to extend for the shared superblock case) - I particularly like
    > the new routines to put (free) the tcon and put the smb session, but think
    > the locking gets more complicated (with little performance gain) moving from
    > one global spinlock covering tcon/smb-session/cifs-tcp-socket to one
    > embedded within each structure, and also could be confusing since the cifs
    > tcp socket is already protected by a semaphore and now would have a spinlock
    > too. I think we greatly increase the chance of deadlock having to nest
    > spinlocks.
    >


    lockdep is pretty good about warning you if you're in danger of deadlock
    (and I generally always have lockdep turned on).

    I really think the fine-grained locking is the best scheme. It's pretty
    clear how it should work. If you're walking or altering the lists
    (and/or changing the refcounts) then you need to take the locks that
    protect them. (granted, I probably need to do another respin that adds
    some comments to this effect).

    I think we want to resist having locks that protect too many things.
    With that, we end up with the locks held over too much code. Not only is
    that generally worse for performance, but it can paper over race
    conditions.

    --
    Jeff Layton
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

    On Thu, Oct 30, 2008 at 12:42 PM, Jeff Layton wrote:
    > I think we want to resist having locks that protect too many things.
    > With that, we end up with the locks held over too much code. Not only is
    > that generally worse for performance, but it can paper over race
    > conditions.


    I agree that it is trivially worse for performance to have a single
    spinlock protecting the three interrelated structures (cifs tcp, smb
    and tree connection structs), but since they point to one another and
    frequently have operations that require us to use all three lists -
    to do things like iterate through all tree connections within a
    particular smb session, or iterate across all cifs smb sessions within
    each cifs tcp session - it makes code more complicated to have to grab
    and unlock multiple spinlocks in the correct order every time across
    all exit paths etc.



    --
    Thanks,

    Steve
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [PATCH 0/4] cifs: fix oopses and mem corruption with concurrent mount/umount (try #4)

    On Thu, 30 Oct 2008 12:51:03 -0500
    "Steve French" wrote:

    > On Thu, Oct 30, 2008 at 12:42 PM, Jeff Layton wrote:
    > > I think we want to resist having locks that protect too many things.
    > > With that, we end up with the locks held over too much code. Not only is
    > > that generally worse for performance, but it can paper over race
    > > conditions.

    >
    > I agree that it is trivially worse for performance to have a single
    > spinlock protecting the three interrelated structures (cifs tcp, smb
    > and tree connection structs), but since they point to one another and
    > frequently have operations that require us to use all three lists -
    > to do things like iterate through all tree connections within a
    > particular smb session, or iterate across all cifs smb sessions within
    > each cifs tcp session - it makes code more complicated to have to grab
    > and unlock multiple spinlocks in the correct order every time across
    > all exit paths etc.
    >


    A fair point, but most of that is in rarely-traveled procfile code. One
    thing we could consider is some helper macros or functions. For
    instance, a for_all_tcons() function or something that would take a
    pointer to a function that takes a tcon arg. It would
    basically just walk over all the tcons and handle the locking
    correctly and call the function for each.

    In any case, I don't see the benefit of not using fine grained locking
    here. deadlock is a possibility, but I think having well-defined
    locking rules mitigates that danger.

    --
    Jeff Layton
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread