BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17) - Kernel

This is a discussion on BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17) - Kernel ; It is SMP x86_64, rootfs is on XFS [ 4.419571] BUG: sleeping function called from invalid context at kernel/rwsem.c:131 [ 4.419576] in_atomic(): 1, irqs_disabled(): 0, pid: 1396, name: mkdir [ 4.419579] 2 locks held by mkdir/1396: [ 4.419581] #0: (&type->i_mutex_dir_key#2/1){--..}, ...

+ Reply to Thread
Results 1 to 20 of 20

Thread: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

  1. BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    It is SMP x86_64, rootfs is on XFS

    [ 4.419571] BUG: sleeping function called from invalid context at
    kernel/rwsem.c:131
    [ 4.419576] in_atomic(): 1, irqs_disabled(): 0, pid: 1396, name: mkdir
    [ 4.419579] 2 locks held by mkdir/1396:
    [ 4.419581] #0: (&type->i_mutex_dir_key#2/1){--..}, at:
    [] lookup_create+0x2b/0x99
    [ 4.419594] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
    [] xfs_ilock+0x60/0x7e
    [ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
    [ 4.419608] Call Trace:
    [ 4.419614] [] ? __debug_show_held_locks+0x22/0x24
    [ 4.419620] [] __might_sleep+0x120/0x122
    [ 4.419625] [] down_write_nested+0x2c/0x7b
    [ 4.419629] [] xfs_ilock+0x60/0x7e
    [ 4.419633] [] xfs_iget_core+0x3cc/0x4f1
    [ 4.419637] [] ? iget_locked+0xf0/0x125
    [ 4.419641] [] xfs_iget+0xeb/0x197
    [ 4.419646] [] xfs_trans_iget+0xb4/0x126
    [ 4.419651] [] xfs_ialloc+0x9e/0x51e
    [ 4.419655] [] ? get_parent_ip+0x11/0x41
    [ 4.419659] [] xfs_dir_ialloc+0x78/0x26a
    [ 4.419663] [] ? down_write_nested+0x73/0x7b
    [ 4.419667] [] xfs_mkdir+0x236/0x438
    [ 4.419673] [] xfs_vn_mknod+0xe7/0x13b
    [ 4.419677] [] xfs_vn_mkdir+0x13/0x15
    [ 4.419681] [] vfs_mkdir+0x4b/0x8f
    [ 4.419685] [] sys_mkdirat+0xa7/0x101
    [ 4.419691] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 4.419695] [] ? trace_hardirqs_on_caller+0x12c/0x157
    [ 4.419700] [] ? do_page_fault+0x394/0x884
    [ 4.419704] [] sys_mkdir+0x18/0x1a
    [ 4.419709] [] system_call_fastpath+0x16/0x1b
    [ 4.743247] dmesg used greatest stack depth: 2848 bytes left
    [ 5.930376] bnx2: eth0: using MSI
    [ 6.056409] BUG: sleeping function called from invalid context at
    kernel/rwsem.c:131
    [ 6.056413] in_atomic(): 1, irqs_disabled(): 0, pid: 1782, name: dhcpcd
    [ 6.056416] 2 locks held by dhcpcd/1782:
    [ 6.056418] #0: (&type->i_mutex_dir_key#2){--..}, at:
    [] do_filp_open+0x178/0x7d5
    [ 6.056431] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
    [] xfs_ilock+0x60/0x7e
    [ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
    [ 6.056445] Call Trace:
    [ 6.056452] [] ? __debug_show_held_locks+0x22/0x24
    [ 6.056458] [] __might_sleep+0x120/0x122
    [ 6.056464] [] down_write_nested+0x2c/0x7b
    [ 6.056468] [] xfs_ilock+0x60/0x7e
    [ 6.056472] [] xfs_iget_core+0x3cc/0x4f1
    [ 6.056476] [] ? iget_locked+0xf0/0x125
    [ 6.056481] [] xfs_iget+0xeb/0x197
    [ 6.056486] [] xfs_trans_iget+0xb4/0x126
    [ 6.056491] [] xfs_ialloc+0x9e/0x51e
    [ 6.056495] [] ? get_parent_ip+0x11/0x41
    [ 6.056499] [] xfs_dir_ialloc+0x78/0x26a
    [ 6.056503] [] ? down_write_nested+0x73/0x7b
    [ 6.056508] [] xfs_create+0x239/0x438
    [ 6.056512] [] ? get_lock_stats+0x2d/0x5c
    [ 6.056518] [] xfs_vn_mknod+0xce/0x13b
    [ 6.056522] [] xfs_vn_create+0x10/0x12
    [ 6.056526] [] vfs_create+0x5d/0x84
    [ 6.056530] [] do_filp_open+0x24e/0x7d5
    [ 6.056534] [] ? get_parent_ip+0x11/0x41
    [ 6.056539] [] ? sub_preempt_count+0xb0/0xc4
    [ 6.056544] [] ? alloc_fd+0x122/0x133
    [ 6.056549] [] do_sys_open+0x58/0xdf
    [ 6.056553] [] sys_open+0x20/0x22
    [ 6.056561] [] system_call_fastpath+0x16/0x1b
    [ 9.098603] bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full
    duplex, receive & transmit flow control ON
    [ 17.899872] BUG: sleeping function called from invalid context at
    kernel/rwsem.c:131
    [ 17.899876] in_atomic(): 1, irqs_disabled(): 0, pid: 1791, name:
    dhcpcd-run-hook
    [ 17.899879] 2 locks held by dhcpcd-run-hook/1791:
    [ 17.899881] #0: (&type->i_mutex_dir_key#2){--..}, at:
    [] do_filp_open+0x178/0x7d5
    [ 17.899893] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
    [] xfs_ilock+0x60/0x7e
    [ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
    2.6.27-next-20081017 #2
    [ 17.899907] Call Trace:
    [ 17.899914] [] ? __debug_show_held_locks+0x22/0x24
    [ 17.899920] [] __might_sleep+0x120/0x122
    [ 17.899926] [] down_write_nested+0x2c/0x7b
    [ 17.899930] [] xfs_ilock+0x60/0x7e
    [ 17.899934] [] xfs_iget_core+0x3cc/0x4f1
    [ 17.899939] [] ? iget_locked+0xf0/0x125
    [ 17.899943] [] xfs_iget+0xeb/0x197
    [ 17.899953] [] xfs_trans_iget+0xb4/0x126
    [ 17.899956] [] xfs_ialloc+0x9e/0x51e
    [ 17.899959] [] ? get_parent_ip+0x11/0x41
    [ 17.899962] [] xfs_dir_ialloc+0x78/0x26a
    [ 17.899965] [] ? down_write_nested+0x73/0x7b
    [ 17.899968] [] xfs_create+0x239/0x438
    [ 17.899971] [] ? get_lock_stats+0x2d/0x5c
    [ 17.899975] [] xfs_vn_mknod+0xce/0x13b
    [ 17.899978] [] xfs_vn_create+0x10/0x12
    [ 17.899981] [] vfs_create+0x5d/0x84
    [ 17.899984] [] do_filp_open+0x24e/0x7d5
    [ 17.899987] [] ? get_parent_ip+0x11/0x41
    [ 17.899990] [] ? sub_preempt_count+0xb0/0xc4
    [ 17.899993] [] ? alloc_fd+0x122/0x133
    [ 17.899997] [] do_sys_open+0x58/0xdf
    [ 17.900002] [] ? sigprocmask+0x37/0xd7
    [ 17.900004] [] sys_open+0x20/0x22
    [ 17.900008] [] system_call_fastpath+0x16/0x1b
    [ 18.190958] Adding 3911788k swap on /dev/sda1. Priority:-1
    extents:1 across:3911788k
    [ 30.773375] CPA self-test:
    [ 30.780146] 4k 523088 large 0 gb 0 x
    262144[ffff880000000000-ffff88003ffff000] miss 0
    [ 30.792087] 4k 523088 large 0 gb 0 x
    262144[ffff880000000000-ffff88003ffff000] miss 0
    [ 30.803460] 4k 523088 large 0 gb 0 x
    262144[ffff880000000000-ffff88003ffff000] miss 0
    [ 30.803463] ok.
    [ 3846.588515] BUG: sleeping function called from invalid context at
    kernel/rwsem.c:131
    [ 3846.588519] in_atomic(): 1, irqs_disabled(): 0, pid: 2005, name: screen
    [ 3846.588523] 2 locks held by screen/2005:
    [ 3846.588524] #0: (&type->i_mutex_dir_key#2/1){--..}, at:
    [] lookup_create+0x2b/0x99
    [ 3846.588538] #1: (&(&ip->i_lock)->mr_lock/1){--..}, at:
    [] xfs_ilock+0x60/0x7e
    [ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
    [ 3846.588552] Call Trace:
    [ 3846.588559] [] ? __debug_show_held_locks+0x22/0x24
    [ 3846.588565] [] __might_sleep+0x120/0x122
    [ 3846.588570] [] down_write_nested+0x2c/0x7b
    [ 3846.588574] [] xfs_ilock+0x60/0x7e
    [ 3846.588578] [] xfs_iget_core+0x3cc/0x4f1
    [ 3846.588582] [] ? iget_locked+0xf0/0x125
    [ 3846.588586] [] xfs_iget+0xeb/0x197
    [ 3846.588591] [] xfs_trans_iget+0xb4/0x126
    [ 3846.588596] [] xfs_ialloc+0x9e/0x51e
    [ 3846.588599] [] ? get_parent_ip+0x11/0x41
    [ 3846.588604] [] xfs_dir_ialloc+0x78/0x26a
    [ 3846.588607] [] ? down_write_nested+0x73/0x7b
    [ 3846.588612] [] xfs_create+0x239/0x438
    [ 3846.588617] [] xfs_vn_mknod+0xce/0x13b
    [ 3846.588621] [] vfs_mknod+0x75/0x9a
    [ 3846.588625] [] sys_mknodat+0x186/0x1e5
    [ 3846.588632] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 3846.588636] [] ? trace_hardirqs_on_caller+0x12c/0x157
    [ 3846.588640] [] sys_mknod+0x1a/0x1c
    [ 3846.588644] [] system_call_fastpath+0x16/0x1b
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Fri, Oct 17, 2008 at 04:43:07PM +0400, Alexander Beregalov wrote:
    > It is SMP x86_64, rootfs is on XFS


    The message comes from code in XFS, but I really wonder where the atomic
    context comes from. xfs_iget_core has a few changes in the tree pulled
    into todays Linux-next, but it's still missing the recent updates
    redoing this area completely. In some ways it looks like this process
    has the the atomic context flag left pending somewhere, but that should
    also trigger warnings on the VFS locks like i_mutex.

    Very misterious.

    IS there a specific workload you can reproduce this with?

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Fri, Oct 17, 2008 at 08:54:35PM +0400, Alexander Beregalov wrote:
    > Workload is not specific, many i/o commands trigger the message.
    >
    > dmesg|grep comm:
    > [ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
    > [ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
    > [ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
    > 2.6.27-next-20081017 #2
    > [ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
    > [ 4416.242723] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    > [ 4430.618037] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    > [ 4431.641578] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    > [ 4432.944265] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    > [ 4590.330706] Pid: 2026, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    > [14882.802479] Pid: 2058, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    > [16955.026436] Pid: 2091, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    > [18466.645935] Pid: 2119, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    > [19165.813049] Pid: 2150, comm: git-merge Not tainted 2.6.27-next-20081017 #2


    Very strange. Can you retry with tomorrow Linux-next which should have
    a large XFS update? If it still happens I'll investigate it in more
    detail.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/17 Christoph Hellwig :
    > On Fri, Oct 17, 2008 at 04:43:07PM +0400, Alexander Beregalov wrote:
    >> It is SMP x86_64, rootfs is on XFS

    >
    > The message comes from code in XFS, but I really wonder where the atomic
    > context comes from. xfs_iget_core has a few changes in the tree pulled
    > into todays Linux-next, but it's still missing the recent updates
    > redoing this area completely. In some ways it looks like this process
    > has the the atomic context flag left pending somewhere, but that should
    > also trigger warnings on the VFS locks like i_mutex.
    >
    > Very misterious.
    >
    > IS there a specific workload you can reproduce this with?
    >

    Workload is not specific, many i/o commands trigger the message.

    dmesg|grep comm:
    [ 4.419605] Pid: 1396, comm: mkdir Not tainted 2.6.27-next-20081017 #2
    [ 6.056443] Pid: 1782, comm: dhcpcd Not tainted 2.6.27-next-20081017 #2
    [ 17.899905] Pid: 1791, comm: dhcpcd-run-hook Not tainted
    2.6.27-next-20081017 #2
    [ 3846.588549] Pid: 2005, comm: screen Not tainted 2.6.27-next-20081017 #2
    [ 4416.242723] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    [ 4430.618037] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    [ 4431.641578] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    [ 4432.944265] Pid: 2023, comm: git Not tainted 2.6.27-next-20081017 #2
    [ 4590.330706] Pid: 2026, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    [14882.802479] Pid: 2058, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    [16955.026436] Pid: 2091, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    [18466.645935] Pid: 2119, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    [19165.813049] Pid: 2150, comm: git-merge Not tainted 2.6.27-next-20081017 #2
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/17 Christoph Hellwig :
    > Very strange. Can you retry with tomorrow Linux-next which should have
    > a large XFS update? If it still happens I'll investigate it in more
    > detail.


    I have tried next-20081016 with the same result.
    I am afraid I wont be able to test it tomorrow, it will likely be
    possible on Monday.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    I just ran the xfs testsuite over linux-next on qemu (i386), and I can't
    find anything at all. Really strange. Let's see if this still there
    with mondays linux-next, and if yes can you just try the xfs patch from
    the splitout linux-next patches and see if that alone causes it?

    In fact that might be useful for todays linux-next, too.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Fri, 17 Oct 2008 16:37:10 -0400
    Christoph Hellwig wrote:

    > I just ran the xfs testsuite over linux-next on qemu (i386), and I
    > can't find anything at all. Really strange. Let's see if this still
    > there with mondays linux-next, and if yes can you just try the xfs
    > patch from the splitout linux-next patches and see if that alone
    > causes it?
    >
    > In fact that might be useful for todays linux-next, too.
    > --


    also if you enable lockdep, it tracks where irq's got turned off (with
    a stacktrace I think) so it would provide some good clues.


    --
    Arjan van de Ven Intel Open Source Technology Centre
    For development, discussion and tips for power savings,
    visit http://www.lesswatts.org
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/18 Arjan van de Ven :
    > On Fri, 17 Oct 2008 16:37:10 -0400
    > Christoph Hellwig wrote:
    >
    >> I just ran the xfs testsuite over linux-next on qemu (i386), and I
    >> can't find anything at all. Really strange. Let's see if this still
    >> there with mondays linux-next, and if yes can you just try the xfs
    >> patch from the splitout linux-next patches and see if that alone
    >> causes it?


    Hi Christoph

    I have the same result with next-1020 and today's xfs-2.6.git/master
    (
    commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    Author: David Chinner
    Date: Fri Oct 17 15:36:23 2008 +1000
    Inode: Allow external list initialisation
    )

    Should I start bisecting?
    >>
    >> In fact that might be useful for todays linux-next, too.
    >> --

    >
    > also if you enable lockdep, it tracks where irq's got turned off (with
    > a stacktrace I think) so it would provide some good clues.


    Yes, lockdep was enabled, but there is no more information than I
    already posted.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
    > Hi Christoph
    >
    > I have the same result with next-1020 and today's xfs-2.6.git/master
    > (
    > commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    > Author: David Chinner
    > Date: Fri Oct 17 15:36:23 2008 +1000
    > Inode: Allow external list initialisation
    > )
    >
    > Should I start bisecting?


    That would be extremely helpful!


    And as mentioned in the other mail I still can't reproduce it, but I can
    currently only test on x86-32, not x86-64.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/20 Christoph Hellwig :
    > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
    >> Hi Christoph
    >>
    >> I have the same result with next-1020 and today's xfs-2.6.git/master
    >> (
    >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    >> Author: David Chinner
    >> Date: Fri Oct 17 15:36:23 2008 +1000
    >> Inode: Allow external list initialisation
    >> )

    Ha, that kernel (xfs/master) has made my system dead locked.
    SysRQ-d:
    Show all locks held in the system
    1 lock held by pdflush
    (&type->s_umount_key#18{----}, at writeback_inodes
    1 lock held by login
    (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
    and so on ( many locks at xfs_ilock)

    >>
    >> Should I start bisecting?

    >
    > That would be extremely helpful!
    >
    >
    > And as mentioned in the other mail I still can't reproduce it, but I can
    > currently only test on x86-32, not x86-64.
    >

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
    > 2008/10/20 Christoph Hellwig :
    > > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
    > >> Hi Christoph
    > >>
    > >> I have the same result with next-1020 and today's xfs-2.6.git/master
    > >> (
    > >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    > >> Author: David Chinner
    > >> Date: Fri Oct 17 15:36:23 2008 +1000
    > >> Inode: Allow external list initialisation
    > >> )

    > Ha, that kernel (xfs/master) has made my system dead locked.
    > SysRQ-d:
    > Show all locks held in the system
    > 1 lock held by pdflush
    > (&type->s_umount_key#18{----}, at writeback_inodes
    > 1 lock held by login
    > (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
    > and so on ( many locks at xfs_ilock)


    Curious. Can you post the full stack traces?

    Cheers,

    Dave.
    --
    Dave Chinner
    david@fromorbit.com
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/21 Dave Chinner :
    > On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
    >> 2008/10/20 Christoph Hellwig :
    >> > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
    >> >> Hi Christoph
    >> >>
    >> >> I have the same result with next-1020 and today's xfs-2.6.git/master
    >> >> (
    >> >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    >> >> Author: David Chinner
    >> >> Date: Fri Oct 17 15:36:23 2008 +1000
    >> >> Inode: Allow external list initialisation
    >> >> )

    >> Ha, that kernel (xfs/master) has made my system dead locked.
    >> SysRQ-d:
    >> Show all locks held in the system
    >> 1 lock held by pdflush
    >> (&type->s_umount_key#18{----}, at writeback_inodes
    >> 1 lock held by login
    >> (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
    >> and so on ( many locks at xfs_ilock)

    >
    > Curious. Can you post the full stack traces?

    I can not reproduce it, yet.

    Bisected to:
    dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
    commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
    Author: Lachlan McIlroy
    Date: Mon Sep 29 14:56:40 2008 +1000

    [XFS] Unlock inode before calling xfs_idestroy()

    Lock debugging reported the ilock was being destroyed without being
    unlocked. We don't need to lock the inode until we are going to insert it
    into the radix tree.

    SGI-PV: 987246

    SGI-Modid: xfs-linux-melb:xfs-kern:32159a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Christoph Hellwig

    :040000 040000 b56d4b74bf0cbdd66d0e61c537b4bb2f7461084f
    40773db6102a1e348a0a21b0e960cb33ffd0d419 M fs

    git bisect start 'fs/xfs'
    git bisect bad bfd2bd10da76378dc4afd87d7d204a1d3d70b347 Inode: Allow
    external list initialisation
    git bisect good 3fa8749e584b55f1180411ab1b51117190bac1e5 Linux 2.6.27
    git bisect skip a55c8e45381bcc5588a544ba73580719887372eb
    git bisect skip 35afc673a41114eb650c6ae766010160dd982a7b
    git bisect skip 7a9ba9bb899933293604a2b3c5ca4f40ad5a92a8
    git bisect skip c02afc5fa7433fbbc0a045afbb472533de0758de
    git bisect skip c0f47794ba870f0cd1dbe5c8fdbf56b86f5f2afa
    git bisect skip 7197197719eb94b527c6a422bf3fb682cdf89954
    git bisect good e0fe783155e4f1c7106f3579c258b9f995330c19
    git bisect bad 2472c6b938d2b3cb1698abe39cc90a3c1d7625b9
    git bisect good 20c1bd2cc6d9311fb7d7d0eb91b46dc4a42b5d11
    git bisect good 8e47c0b2427f0ea35984f02648163cc7a35d3592
    git bisect good a1853934a9bef78aeb1aa7539c629cdb755edab2
    git bisect bad 05de34dbbe744a4d235279f1493a8f05b903a4bb
    git bisect bad c4fa37724d18a3444bb4b4f77c4580b9dd525ed9
    git bisect good 4b4577db477462ff6f41babcc3b3e7036a1ba27d
    git bisect bad dd509097cb0b76d3836385f80d6b2d6fd3b97757
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
    > 2008/10/21 Dave Chinner :
    > > On Mon, Oct 20, 2008 at 09:13:05PM +0400, Alexander Beregalov wrote:
    > >> 2008/10/20 Christoph Hellwig :
    > >> > On Mon, Oct 20, 2008 at 06:58:09PM +0400, Alexander Beregalov wrote:
    > >> >> Hi Christoph
    > >> >>
    > >> >> I have the same result with next-1020 and today's xfs-2.6.git/master
    > >> >> (
    > >> >> commit bfd2bd10da76378dc4afd87d7d204a1d3d70b347
    > >> >> Author: David Chinner
    > >> >> Date: Fri Oct 17 15:36:23 2008 +1000
    > >> >> Inode: Allow external list initialisation
    > >> >> )
    > >> Ha, that kernel (xfs/master) has made my system dead locked.
    > >> SysRQ-d:
    > >> Show all locks held in the system
    > >> 1 lock held by pdflush
    > >> (&type->s_umount_key#18{----}, at writeback_inodes
    > >> 1 lock held by login
    > >> (&(&ip->i_iolock)->mr_lock){----} at xfs_ilock
    > >> and so on ( many locks at xfs_ilock)

    > >
    > > Curious. Can you post the full stack traces?

    > I can not reproduce it, yet.
    >
    > Bisected to:
    > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
    > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
    > Author: Lachlan McIlroy
    > Date: Mon Sep 29 14:56:40 2008 +1000
    >
    > [XFS] Unlock inode before calling xfs_idestroy()
    >
    > Lock debugging reported the ilock was being destroyed without being
    > unlocked. We don't need to lock the inode until we are going to insert it
    > into the radix tree.


    Ah, OK, I see the problem, though I don't understand why I'm not
    seeing the might_sleep() triggering all the time given that I always
    build with:

    $ grep SLEEP .config
    CONFIG_DEBUG_SPINLOCK_SLEEP=y

    Basically the above commit moved xfs_ilock() inside
    radix_tree_preload()/radix_tree_preload_end(), which means we are
    taking a rwsem() while we have an elevated preempt count. I'll
    get a patch out to fix it.

    The question is why isn't this triggering for me during QA runs?

    Cheers,

    Dave.
    --
    Dave Chinner
    david@fromorbit.com
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Wed, Oct 22, 2008 at 12:21:23PM +0400, Alexander Beregalov wrote:
    > > Ah, OK, I see the problem, though I don't understand why I'm not
    > > seeing the might_sleep() triggering all the time given that I always
    > > build with:
    > >
    > > $ grep SLEEP .config
    > > CONFIG_DEBUG_SPINLOCK_SLEEP=y
    > >
    > > Basically the above commit moved xfs_ilock() inside
    > > radix_tree_preload()/radix_tree_preload_end(), which means we are
    > > taking a rwsem() while we have an elevated preempt count. I'll
    > > get a patch out to fix it.

    > Could it cause the I/O dead lock or should I continue trying to reproduce it?


    The deadlock wouldn't be produced by the same thing that
    produced the sleeping-in-atomic warning. The missed unlock that
    I also fixed in the patch I just sent could possibly have caused
    that, but I'm just speculating on that...

    Cheers,

    Dave.
    --
    Dave Chinner
    david@fromorbit.com
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
    > On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
    > > Bisected to:
    > > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
    > > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
    > > Author: Lachlan McIlroy
    > > Date: Mon Sep 29 14:56:40 2008 +1000
    > >
    > > [XFS] Unlock inode before calling xfs_idestroy()
    > >
    > > Lock debugging reported the ilock was being destroyed without being
    > > unlocked. We don't need to lock the inode until we are going to insert it
    > > into the radix tree.

    >
    > Ah, OK, I see the problem, though I don't understand why I'm not
    > seeing the might_sleep() triggering all the time given that I always
    > build with:
    >
    > $ grep SLEEP .config
    > CONFIG_DEBUG_SPINLOCK_SLEEP=y
    >
    > Basically the above commit moved xfs_ilock() inside
    > radix_tree_preload()/radix_tree_preload_end(), which means we are
    > taking a rwsem() while we have an elevated preempt count. I'll
    > get a patch out to fix it.


    Patch below (against the xfs master/linux-next branch) should fix the
    regression. I've just started QA on it. Can you please check that
    it works for you, Alexander?

    Cheers,

    Dave.
    --
    Dave Chinner
    david@fromorbit.com

    XFS: Can't lock inodes in radix tree preload region

    When we are inside a radix tree preload region, we cannot
    sleep. Recently we moved the inode locking inside the
    preload region for the inode radix tree. Fix that,
    and fix a missed unlock in another error path in the
    same code at the same time.

    Signed-off-by: Dave Chinner
    ---
    fs/xfs/xfs_iget.c | 18 +++++++++++-------
    1 files changed, 11 insertions(+), 7 deletions(-)

    diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
    index fbc6088..837cae7 100644
    --- a/fs/xfs/xfs_iget.c
    +++ b/fs/xfs/xfs_iget.c
    @@ -159,18 +159,19 @@ xfs_iget_cache_miss(
    goto out_destroy;
    }

    + if (lock_flags)
    + xfs_ilock(ip, lock_flags);
    +
    /*
    * Preload the radix tree so we can insert safely under the
    - * write spinlock.
    + * write spinlock. Note that we cannot sleep inside the preload
    + * region.
    */
    if (radix_tree_preload(GFP_KERNEL)) {
    error = EAGAIN;
    - goto out_destroy;
    + goto out_unlock;
    }

    - if (lock_flags)
    - xfs_ilock(ip, lock_flags);
    -
    mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) - 1);
    first_index = agino & mask;
    write_lock(&pag->pag_ici_lock);
    @@ -181,7 +182,7 @@ xfs_iget_cache_miss(
    WARN_ON(error != -EEXIST);
    XFS_STATS_INC(xs_ig_dup);
    error = EAGAIN;
    - goto out_unlock;
    + goto out_preload_end;
    }

    /* These values _must_ be set before releasing the radix tree lock! */
    @@ -193,9 +194,12 @@ xfs_iget_cache_miss(
    *ipp = ip;
    return 0;

    -out_unlock:
    +out_preload_end:
    write_unlock(&pag->pag_ici_lock);
    radix_tree_preload_end();
    +out_unlock:
    + if (lock_flags)
    + xfs_iunlock(ip, lock_flags);
    out_destroy:
    xfs_destroy_inode(ip);
    return error;
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    > Ah, OK, I see the problem, though I don't understand why I'm not
    > seeing the might_sleep() triggering all the time given that I always
    > build with:
    >
    > $ grep SLEEP .config
    > CONFIG_DEBUG_SPINLOCK_SLEEP=y
    >
    > Basically the above commit moved xfs_ilock() inside
    > radix_tree_preload()/radix_tree_preload_end(), which means we are
    > taking a rwsem() while we have an elevated preempt count. I'll
    > get a patch out to fix it.

    Could it cause the I/O dead lock or should I continue trying to reproduce it?
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    2008/10/22 Dave Chinner :
    > On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
    >> On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
    >> > Bisected to:
    >> > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
    >> > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
    >> > Author: Lachlan McIlroy
    >> > Date: Mon Sep 29 14:56:40 2008 +1000
    >> >
    >> > [XFS] Unlock inode before calling xfs_idestroy()
    >> >
    >> > Lock debugging reported the ilock was being destroyed without being
    >> > unlocked. We don't need to lock the inode until we are going to insert it
    >> > into the radix tree.

    >>
    >> Ah, OK, I see the problem, though I don't understand why I'm not
    >> seeing the might_sleep() triggering all the time given that I always
    >> build with:
    >>
    >> $ grep SLEEP .config
    >> CONFIG_DEBUG_SPINLOCK_SLEEP=y
    >>
    >> Basically the above commit moved xfs_ilock() inside
    >> radix_tree_preload()/radix_tree_preload_end(), which means we are
    >> taking a rwsem() while we have an elevated preempt count. I'll
    >> get a patch out to fix it.

    >
    > Patch below (against the xfs master/linux-next branch) should fix the
    > regression. I've just started QA on it. Can you please check that
    > it works for you, Alexander?
    >
    > Cheers,
    >
    > Dave.
    > --
    > Dave Chinner
    > david@fromorbit.com
    >
    > XFS: Can't lock inodes in radix tree preload region
    >
    > When we are inside a radix tree preload region, we cannot
    > sleep. Recently we moved the inode locking inside the
    > preload region for the inode radix tree. Fix that,
    > and fix a missed unlock in another error path in the
    > same code at the same time.
    >
    > Signed-off-by: Dave Chinner

    Tested-by: Alexander Beregalov

    Yes, it fixes the issue. Thanks
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  18. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Wed, Oct 22, 2008 at 07:25:50PM +1100, Dave Chinner wrote:
    > > Basically the above commit moved xfs_ilock() inside
    > > radix_tree_preload()/radix_tree_preload_end(), which means we are
    > > taking a rwsem() while we have an elevated preempt count. I'll
    > > get a patch out to fix it.


    This really needs a warning. Then again I don't really understand this
    as the point of radix_tree_preload was that we can do the actual
    radix-tree under a lock, or not?

    Anyway, the patch looks correct to me, including the slightly unrelated
    hunk to fix the missing unlock and preload leak.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  19. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS?

    Dave Chinner writes:

    > On Wed, Oct 22, 2008 at 06:58:38PM +1100, Dave Chinner wrote:
    >> On Tue, Oct 21, 2008 at 03:42:16PM +0400, Alexander Beregalov wrote:
    >> > Bisected to:
    >> > dd509097cb0b76d3836385f80d6b2d6fd3b97757 is first bad commit
    >> > commit dd509097cb0b76d3836385f80d6b2d6fd3b97757
    >> > Author: Lachlan McIlroy
    >> > Date: Mon Sep 29 14:56:40 2008 +1000
    >> >
    >> > [XFS] Unlock inode before calling xfs_idestroy()
    >> >
    >> > Lock debugging reported the ilock was being destroyed without being
    >> > unlocked. We don't need to lock the inode until we are going to insert it
    >> > into the radix tree.

    >>
    >> Ah, OK, I see the problem, though I don't understand why I'm not
    >> seeing the might_sleep() triggering all the time given that I always
    >> build with:
    >>
    >> $ grep SLEEP .config
    >> CONFIG_DEBUG_SPINLOCK_SLEEP=y
    >>
    >> Basically the above commit moved xfs_ilock() inside
    >> radix_tree_preload()/radix_tree_preload_end(), which means we are
    >> taking a rwsem() while we have an elevated preempt count. I'll
    >> get a patch out to fix it.

    >
    > Patch below (against the xfs master/linux-next branch) should fix the
    > regression. I've just started QA on it. Can you please check that
    > it works for you, Alexander?


    Ran into the same problem, the fix worked for me.

    Thank you.

    Hannes
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  20. Re: BUG: sleeping function called from invalid context at kernel/rwsem.c:131 XFS? (was: Re: linux-next: Tree for October 17)

    On Wed, Oct 22, 2008 at 06:13:51AM -0400, Christoph Hellwig wrote:
    > On Wed, Oct 22, 2008 at 07:25:50PM +1100, Dave Chinner wrote:
    > > > Basically the above commit moved xfs_ilock() inside
    > > > radix_tree_preload()/radix_tree_preload_end(), which means we are
    > > > taking a rwsem() while we have an elevated preempt count. I'll
    > > > get a patch out to fix it.

    >
    > This really needs a warning. Then again I don't really understand this
    > as the point of radix_tree_preload was that we can do the actual
    > radix-tree under a lock, or not?


    Right - the preload allows us to do GFP_KERNEL allocations for radix
    tree nodes and use a spinlock for inserts into the tree. We could
    drop the preload stuff if we initialised the radix tree to use
    GFP_ATOMIC allocations for radix tree nodes, but that is more likely
    to lead to insert failures under low memory conditions compared to
    the preload method.

    Cheers,

    Dave.
    --
    Dave Chinner
    david@fromorbit.com
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread