ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c - Kernel

Thread: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

  1. Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

    On Thu, Jul 17, 2008 at 05:09:05PM -0600, Andreas Dilger wrote:
    > On Jul 17, 2008 10:43 -0400, Josef Bacik wrote:
    > > Yeah thats a hard to answer question, one that I will leave up to others
    > > who have been doing this much longer than I. My thought is remount-ro
    > > is there to keep you from crashing, so if you have errors=continue then
    > > you expect to live with the consequences. Course if that bit gets flipped
    > > via corruption thats not good either.

    >
    > It shouldn't cause the kernel to crash, but it should definitely return
    > an error to the application. This is probably one of the code paths
    > that the Coverity folks were reporting on in FAST this year where on-disk
    > errors are not propagated to the application.


    Ok, please revert the previous patch and apply this one. On errors=continue we
    will just abort the handle, which should keep the NULL pointer dereference from
    happening and return an error back to the application. Please let me know how
    this works, Vegard, and thanks a lot for testing all this.

    Signed-off-by: Josef Bacik

    Index: linux-2.6/fs/ext3/inode.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/inode.c
    +++ linux-2.6/fs/ext3/inode.c
    @@ -2023,13 +2023,27 @@ static void ext3_clear_blocks(handle_t *
    unsigned long count, __le32 *first, __le32 *last)
    {
    __le32 *p;
    + int ret;
    +
    if (try_to_extend_transaction(handle, inode)) {
    if (bh) {
    BUFFER_TRACE(bh, "call ext3_journal_dirty_metadata");
    - ext3_journal_dirty_metadata(handle, bh);
    + ret = ext3_journal_dirty_metadata(handle, bh);
    + if (ret) {
    + ext3_std_error(inode->i_sb, ret);
    + return;
    + }
    }
    - ext3_mark_inode_dirty(handle, inode);
    - ext3_journal_test_restart(handle, inode);
    + ret = ext3_mark_inode_dirty(handle, inode);
    + if (ret)
    + return;
    +
    + ret = ext3_journal_test_restart(handle, inode);
    + if (ret) {
    + ext3_std_error(inode->i_sb, ret);
    + return;
    + }
    +
    if (bh) {
    BUFFER_TRACE(bh, "retaking write access");
    ext3_journal_get_write_access(handle, bh);
    Index: linux-2.6/fs/ext3/balloc.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/balloc.c
    +++ linux-2.6/fs/ext3/balloc.c
    @@ -498,6 +498,7 @@ void ext3_free_blocks_sb(handle_t *handl
    ext3_error (sb, "ext3_free_blocks",
    "Freeing blocks not in datazone - "
    "block = "E3FSBLK", count = %lu", block, count);
    + err = -EIO;
    goto error_return;
    }

    @@ -535,6 +536,7 @@ do_more:
    "Freeing blocks in system zones - "
    "Block = "E3FSBLK", count = %lu",
    block, count);
    + err = -EIO;
    goto error_return;
    }

    Index: linux-2.6/fs/ext3/super.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/super.c
    +++ linux-2.6/fs/ext3/super.c
    @@ -167,7 +167,15 @@ static void ext3_handle_error(struct sup
    EXT3_SB(sb)->s_mount_opt |= EXT3_MOUNT_ABORT;
    if (journal)
    journal_abort(journal, -EIO);
    + } else {
    + handle_t *handle = current->journal_info;
    + if (handle && !is_handle_aborted(handle)) {
    + if (!handle->h_err)
    + handle->h_err = -EIO;
    + journal_abort_handle(handle);
    + }
    }
    +
    if (test_opt (sb, ERRORS_RO)) {
    printk (KERN_CRIT "Remounting filesystem read-only\n");
    sb->s_flags |= MS_RDONLY;
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/
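
Editorial sketch: the super.c hunk above adds an else branch for the case where the whole journal is not aborted. The handle of the current task records -EIO, unless an earlier error is already stored, and is then marked aborted so later journaled operations fail instead of continuing. The user-space C model below illustrates that first-error-wins pattern only. The names `model_handle`, `model_abort_handle`, and `model_dirty_metadata` are hypothetical stand-ins and are not the kernel's `handle_t` or JBD functions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a journal handle; not the kernel's handle_t. */
struct model_handle {
    bool aborted;  /* set once the handle must not be used for updates */
    int  err;      /* first error recorded on this handle, 0 if none */
};

/* Models the patch's else branch: do nothing for a missing or already
 * aborted handle; otherwise record -EIO only if no earlier error is
 * stored, then mark the handle aborted. */
static void model_abort_handle(struct model_handle *h)
{
    if (h == NULL || h->aborted)
        return;
    if (h->err == 0)
        h->err = -EIO;
    h->aborted = true;
}

/* A journaled operation in this model refuses to proceed on an aborted
 * handle and returns the stored error to its caller. */
static int model_dirty_metadata(struct model_handle *h)
{
    if (h->aborted)
        return h->err;
    return 0;
}
```

In this model a caller that checks the return value, as the inode.c hunk now does, sees -EIO after the abort and can return early.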

  2. Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

    On Fri, Jul 18, 2008 at 12:51 PM, Josef Bacik wrote:
    > On Thu, Jul 17, 2008 at 05:09:05PM -0600, Andreas Dilger wrote:
    >> On Jul 17, 2008 10:43 -0400, Josef Bacik wrote:
    >> > Yeah thats a hard to answer question, one that I will leave up to others
    >> > who have been doing this much longer than I. My thought is remount-ro
    >> > is there to keep you from crashing, so if you have errors=continue then
    >> > you expect to live with the consequences. Course if that bit gets flipped
    >> > via corruption thats not good either.

    >>
    >> It shouldn't cause the kernel to crash, but it should definitely return
    >> an error to the application. This is probably one of the code paths
    >> that the Coverity folks were reporting on in FAST this year where on-disk
    >> errors are not propagated to the application.

    >
    > Ok, please revert the previous patch and apply this one. On errors=continue we
    > will just abort the handle, which should keep the NULL pointer dereference from
    > happening and return an error back to the application. Please let me know how
    > this works, Vegard, and thanks a lot for testing all this.
    >
    > Signed-off-by: Josef Bacik


    Thanks for doing the patches :-)

    I still got this:

    loop0: rw=0, want=4294967298, limit=24576
    EXT3-fs error (device loop0): ext3_free_branches: Read failure,
    inode=74, block=2147483648
    EXT3-fs error (device loop0) in ext3_reserve_inode_write: Readonly filesystem
    EXT3-fs error (device loop0) in ext3_truncate: IO failure
    EXT3-fs error (device loop0) in ext3_reserve_inode_write: Readonly filesystem
    EXT3-fs error (device loop0) in ext3_orphan_del: Readonly filesystem
    EXT3-fs error (device loop0) in ext3_reserve_inode_write: Readonly filesystem
    EXT3-fs error (device loop0) in ext3_delete_inode: IO failure
    EXT3-fs unexpected failure: !jh->b_committed_data;
    inconsistent data on disk
    ext3_forget: aborting transaction: IO failure in __ext3_journal_forget
    BUG: unable to handle kernel paging request at f1e79ffc
    IP: [] read_block_bitmap+0xc6/0x180
    *pde = 33cc5163 *pte = 31e79160
    Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    Pid: 4257, comm: rm Not tainted (2.6.26-03416-g11155ca #46)
    EIP: 0060:[] EFLAGS: 00210297 CPU: 1
    EIP is at read_block_bitmap+0xc6/0x180
    EAX: ffffffff EBX: f1e7a000 ECX: f3c20000 EDX: 00000001
    ESI: f5663c30 EDI: f1e7a800 EBP: f62e3cdc ESP: f62e3cac
    DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    Process rm (pid: 4257, ti=f62e2000 task=f637dfa0 task.ti=f62e2000)
    Stack: 00000400 f637e4c0 f637dfa0 f62e3cd4 00200246 00000000 f3d2c860 00000000
    f1e7a000 f3c20098 00000000 f56c4b7c f62e3d3c c0222704 c025efd3 f637dfa0
    c015addb f77aa050 f3d2db0c 00000031 00000000 00000032 f3d2c860 f77aa050
    Call Trace:
    [] ? ext3_free_blocks_sb+0xd4/0x620
    [] ? journal_forget+0x213/0x220
    [] ? trace_hardirqs_on+0xb/0x10
    [] ? ext3_free_blocks+0x2a/0xa0
    [] ? ext3_clear_blocks+0x145/0x160
    [] ? ext3_free_data+0xc7/0x100
    [] ? ext3_free_branches+0x213/0x220
    [] ? sync_buffer+0x0/0x40
    [] ? ext3_free_branches+0xae/0x220
    [] ? ext3_free_branches+0xae/0x220
    [] ? ext3_truncate+0x5c8/0x940
    [] ? trace_hardirqs_on_caller+0x116/0x170
    [] ? journal_start+0xd3/0x110
    [] ? journal_start+0xb0/0x110
    [] ? ext3_delete_inode+0xd7/0xe0
    [] ? ext3_delete_inode+0x0/0xe0
    [] ? generic_delete_inode+0x81/0x120
    [] ? generic_drop_inode+0x127/0x180
    [] ? iput+0x47/0x50
    [] ? do_unlinkat+0xec/0x170
    [] ? vfs_readdir+0x6b/0xa0
    [] ? filldir64+0x0/0xf0
    [] ? trace_hardirqs_on_thunk+0xc/0x10
    [] ? trace_hardirqs_on_caller+0x116/0x170
    [] ? sys_unlinkat+0x23/0x50
    [] ? sysenter_past_esp+0x78/0xc5
    =======================
    Code: 00 00 00 8b 45 e8 8b 1f 8b 55 e4 8b 88 ac 02 00 00 8b 41 34 0f
    af 51 10 03 50 14 89 5d ec 8b 46 18 89 45 f0 89 d8 8b 5d f0 29 d0 <0f>
    a3 03 19 c0 85 c0 74 11 8b 47 04 89 45 ec 29 d0 0f a3 03 19
    EIP: [] read_block_bitmap+0xc6/0x180 SS:ESP 0068:f62e3cac
    Kernel panic - not syncing: Fatal exception
    ------------[ cut here ]------------

    This was with errors=continue.

    $ addr2line -e vmlinux -i c02224d6
    include/asm/bitops.h:305
    fs/ext3/balloc.c:98
    fs/ext3/balloc.c:167

    It looks similar to the ext2 crash which I just reported:
    http://lkml.org/lkml/2008/7/18/136

    Which had this EIP:

    $ addr2line -e vmlinux -i c026ee46
    include/asm/bitops.h:305
    fs/ext2/balloc.c:87
    fs/ext2/balloc.c:153

    You can see the full log at
    http://folk.uio.no/vegardno/linux/log-1216380709.txt which shows that
    it already survived a lot of failures, so I'm guessing your patch was
    correct and we just hit a different case. What do you think?


    Vegard

    --
    "The animistic metaphor of the bug that maliciously sneaked in while
    the programmer was not looking is intellectually dishonest as it
    disguises that the error is the programmer's own creation."
    -- E. W. Dijkstra, EWD1036
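
Editorial sketch: the first line of the log above can be checked by hand. Assuming 512-byte sectors and a 1 KiB filesystem block size (an assumption; the thread does not state the block size), block 2147483648 starts at sector 4294967296, and a one-block read ends at sector 4294967298, which matches `want=`. The `limit=24576` sectors correspond to a 12 MiB device, so the block number in the "Read failure" message lies far outside the image. The helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* sector unit assumed for want= and limit= */

/* End sector (exclusive) of a one-block read at filesystem block `block`,
 * for a block size of `block_size` bytes. */
static uint64_t end_sector_of_block(uint64_t block, uint32_t block_size)
{
    uint64_t sectors_per_block = block_size / SECTOR_SIZE;
    return (block + 1) * sectors_per_block;
}

/* True if the read would run past a device of `limit` sectors. */
static bool read_exceeds_device(uint64_t block, uint32_t block_size,
                                uint64_t limit)
{
    return end_sector_of_block(block, block_size) > limit;
}
```

With the values from the log, `read_exceeds_device(2147483648ULL, 1024, 24576)` is true under the stated block-size assumption.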

  3. Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

    On Fri, Jul 18, 2008 at 01:32:10PM +0200, Vegard Nossum wrote:
    > On Fri, Jul 18, 2008 at 12:51 PM, Josef Bacik wrote:
    > > On Thu, Jul 17, 2008 at 05:09:05PM -0600, Andreas Dilger wrote:
    > >> On Jul 17, 2008 10:43 -0400, Josef Bacik wrote:
    > >> > Yeah thats a hard to answer question, one that I will leave up to others
    > >> > who have been doing this much longer than I. My thought is remount-ro
    > >> > is there to keep you from crashing, so if you have errors=continue then
    > >> > you expect to live with the consequences. Course if that bit gets flipped
    > >> > via corruption thats not good either.
    > >>
    > >> It shouldn't cause the kernel to crash, but it should definitely return
    > >> an error to the application. This is probably one of the code paths
    > >> that the Coverity folks were reporting on in FAST this year where on-disk
    > >> errors are not propagated to the application.

    > >
    > > Ok, please revert the previous patch and apply this one. On errors=continue we
    > > will just abort the handle, which should keep the NULL pointer dereference from
    > > happening and return an error back to the application. Please let me know how
    > > this works, Vegard, and thanks a lot for testing all this.
    > >
    > > Signed-off-by: Josef Bacik

    >
    > Thanks for doing the patches :-)
    >
    > I still got this:
    >
    > [...]
    >
    > This was with errors=continue.
    >
    > $ addr2line -e vmlinux -i c02224d6
    > include/asm/bitops.h:305
    > fs/ext3/balloc.c:98
    > fs/ext3/balloc.c:167
    >
    > It looks similar to the ext2 crash which I just reported:
    > http://lkml.org/lkml/2008/7/18/136
    >
    > Which had this EIP:
    >
    > $ addr2line -e vmlinux -i c026ee46
    > include/asm/bitops.h:305
    > fs/ext2/balloc.c:87
    > fs/ext2/balloc.c:153
    >
    > You can see the full log at
    > http://folk.uio.no/vegardno/linux/log-1216380709.txt which shows that
    > it already survived a lot of failures, so I'm guessing your patch was
    > correct and we just hit a different case. What do you think?
    >


    Yeah, you are right, it's like a ****ty game of whack-a-mole. Here's another patch,
    same thing as last time: pull the other one out and put this one on. Thanks,

    Josef


    Index: linux-2.6/fs/ext3/inode.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/inode.c
    +++ linux-2.6/fs/ext3/inode.c
    @@ -2023,13 +2023,27 @@ static void ext3_clear_blocks(handle_t *
    unsigned long count, __le32 *first, __le32 *last)
    {
    __le32 *p;
    + int ret;
    +
    if (try_to_extend_transaction(handle, inode)) {
    if (bh) {
    BUFFER_TRACE(bh, "call ext3_journal_dirty_metadata");
    - ext3_journal_dirty_metadata(handle, bh);
    + ret = ext3_journal_dirty_metadata(handle, bh);
    + if (ret) {
    + ext3_std_error(inode->i_sb, ret);
    + return;
    + }
    }
    - ext3_mark_inode_dirty(handle, inode);
    - ext3_journal_test_restart(handle, inode);
    + ret = ext3_mark_inode_dirty(handle, inode);
    + if (ret)
    + return;
    +
    + ret = ext3_journal_test_restart(handle, inode);
    + if (ret) {
    + ext3_std_error(inode->i_sb, ret);
    + return;
    + }
    +
    if (bh) {
    BUFFER_TRACE(bh, "retaking write access");
    ext3_journal_get_write_access(handle, bh);
    @@ -2051,7 +2065,9 @@ static void ext3_clear_blocks(handle_t *

    *p = 0;
    bh = sb_find_get_block(inode->i_sb, nr);
    - ext3_forget(handle, 0, inode, bh, nr);
    + ret = ext3_forget(handle, 0, inode, bh, nr);
    + if (ret)
    + return;
    }
    }

    Index: linux-2.6/fs/ext3/balloc.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/balloc.c
    +++ linux-2.6/fs/ext3/balloc.c
    @@ -498,6 +498,7 @@ void ext3_free_blocks_sb(handle_t *handl
    ext3_error (sb, "ext3_free_blocks",
    "Freeing blocks not in datazone - "
    "block = "E3FSBLK", count = %lu", block, count);
    + err = -EIO;
    goto error_return;
    }

    @@ -535,6 +536,7 @@ do_more:
    "Freeing blocks in system zones - "
    "Block = "E3FSBLK", count = %lu",
    block, count);
    + err = -EIO;
    goto error_return;
    }

    Index: linux-2.6/fs/ext3/super.c
    ===================================================================
    --- linux-2.6.orig/fs/ext3/super.c
    +++ linux-2.6/fs/ext3/super.c
    @@ -167,7 +167,15 @@ static void ext3_handle_error(struct sup
    EXT3_SB(sb)->s_mount_opt |= EXT3_MOUNT_ABORT;
    if (journal)
    journal_abort(journal, -EIO);
    + } else {
    + handle_t *handle = current->journal_info;
    + if (handle && !is_handle_aborted(handle)) {
    + if (!handle->h_err)
    + handle->h_err = -EIO;
    + journal_abort_handle(handle);
    + }
    }
    +
    if (test_opt (sb, ERRORS_RO)) {
    printk (KERN_CRIT "Remounting filesystem read-only\n");
    sb->s_flags |= MS_RDONLY;

  4. Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

    On Fri, Jul 18, 2008 at 1:20 PM, Josef Bacik wrote:
    >> You can see the full log at
    >> http://folk.uio.no/vegardno/linux/log-1216380709.txt which shows that
    >> it already survived a lot of failures, so I'm guessing your patch was
    >> correct and we just hit a different case. What do you think?
    >>

    >
    > Yeah, you are right, it's like a ****ty game of whack-a-mole. Here's another patch,
    > same thing as last time: pull the other one out and put this one on. Thanks,


    It seems to hold up -- no stacktraces, but lots of IO failures.

    I would leave it in testing for a bit more, but I've got to run; I'll
    give it another go when I get home.

    You may see the log so far: http://folk.uio.no/vegardno/linux/log-1216382128.txt


    Vegard


  5. Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

    On Fri, Jul 18, 2008 at 1:58 PM, Vegard Nossum wrote:
    > On Fri, Jul 18, 2008 at 1:20 PM, Josef Bacik wrote:
    >>> You can see the full log at
    >>> http://folk.uio.no/vegardno/linux/log-1216380709.txt which shows that
    >>> it already survived a lot of failures, so I'm guessing your patch was
    >>> correct and we just hit a different case. What do you think?
    >>>

    >>
    >> Yeah, you are right, it's like a ****ty game of whack-a-mole. Here's another patch,
    >> same thing as last time: pull the other one out and put this one on. Thanks,

    >
    > It seems to hold up -- no stacktraces, but lots of IO failures.
    >
    > I would leave it in testing for a bit more, but I've got to run; I'll
    > give it another go when I get home.


    Ok, we still got this:

    BUG: unable to handle kernel NULL pointer dereference at 0000000c
    IP: [] journal_dirty_metadata+0xb8/0x1b0
    *pde = 00000000
    Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    Pid: 4770, comm: rm Not tainted (2.6.26-03421-g253a722 #49)
    EIP: 0060:[] EFLAGS: 00210246 CPU: 1
    EIP is at journal_dirty_metadata+0xb8/0x1b0
    EAX: 00000000 EBX: f3d70c90 ECX: 00000001 EDX: f3e12000
    ESI: 00000000 EDI: f21118f0 EBP: f3e13d94 ESP: f3e13d6c
    DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    Process rm (pid: 4770, ti=f3e12000 task=f62cdfa0 task.ti=f3e12000)
    Stack: f3d70430 f578047c f578047c f3e13d94 c0222cdb f779c000 f6ff2e70 f21118f0
    f779c000 f21118f0 f3e13db4 c02345ef 0000001c 00001499 c0760bc4 f21118f0
    00000000 ef36d004 f3e13de4 c0228e6f 0000147e 0000001c ef36d004 ef36d400
    Call Trace:
    [] ? ext3_free_blocks+0x6b/0xa0
    [] ? __ext3_journal_dirty_metadata+0x1f/0x50
    [] ? ext3_free_data+0x9f/0x100
    [] ? ext3_free_branches+0x213/0x220
    [] ? ext3_free_blocks+0x6b/0xa0
    [] ? ext3_free_branches+0xae/0x220
    [] ? ext3_truncate+0x58c/0x940
    [] ? trace_hardirqs_on_caller+0x116/0x170
    [] ? journal_start+0xd3/0x110
    [] ? journal_start+0xb0/0x110
    [] ? ext3_delete_inode+0xd7/0xe0
    [] ? ext3_delete_inode+0x0/0xe0
    [] ? generic_delete_inode+0x81/0x120
    [] ? generic_drop_inode+0x127/0x180
    [] ? iput+0x47/0x50
    [] ? do_unlinkat+0xec/0x170
    [] ? vfs_readdir+0x6b/0xa0
    [] ? filldir64+0x0/0xf0
    [] ? trace_hardirqs_on_thunk+0xc/0x10
    [] ? trace_hardirqs_on_caller+0x116/0x170
    [] ? sys_unlinkat+0x23/0x50
    [] ? sysenter_past_esp+0x78/0xc5
    =======================
    Code: b8 01 00 00 00 e8 c9 3f ed ff 89 e0 25 00 e0 ff ff f6 40 08 08
    74 05 e8 47 98 4e 00 83 c4 1c 31 c0 5b 5e 5f 5d c3 90 8d 74 26 00 <8b>
    46 0c 85 c0 0f 84 8d 00 00 00 8b 45 f0 39 46 18 74 66 8d 47
    EIP: [] journal_dirty_metadata+0xb8/0x1b0 SS:ESP 0068:f3e13d6c
    Kernel panic - not syncing: Fatal exception


    It looks similar to one of the others we saw. Are you sure I should
    back out all your previous patches? My stack looks like this:

    Duane Griffin (1):
    ext3: validate directory entry

    Josef Bacik (1):
    ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference

    And I am using errors=continue.

    Now I've modified my scripts to also save the bad image, so I (or
    whoever) can re-test a specific crash easily. For instance, this one
    can be downloaded from
    http://folk.uio.no/vegardno/linux/ext3-crash-fs.bin.bz2 and mounted.
    Then you run rm -rf mnt/* and it should crash.

    Log is also available at http://folk.uio.no/vegardno/linux/log-1216412153.txt


    Vegard
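
Editorial sketch: in the oops above, the marked instruction bytes `<8b> 46 0c` decode to `mov 0xc(%esi),%eax`, and ESI is 00000000. The faulting address 0000000c is therefore a member at byte offset 12 being read through a NULL structure pointer. The C sketch below shows how such an address arises. `struct example` and `fault_address` are hypothetical and do not reproduce any kernel structure layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three 32-bit members precede `target`, which
 * therefore sits at byte offset 12 (0xc). */
struct example {
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t target;
};

/* Address that a load of the member at `member_offset` would touch if
 * the structure pointer held `base`. With base == 0 the result is the
 * member offset itself, which is the address the oops reports. */
static uintptr_t fault_address(uintptr_t base, size_t member_offset)
{
    return base + member_offset;
}
```

Here `fault_address(0, offsetof(struct example, target))` evaluates to 0xc, the same small address pattern seen in the report.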

