Re: ext3: slow symlink corruption on umount... - Kernel

This is a discussion on Re: ext3: slow symlink corruption on umount... - Kernel ; On Mon, 3 Nov 2008 12:58:54 -0800 Arthur Jones wrote: > Hi Andrew, ... > > On Mon, Nov 03, 2008 at 12:37:50PM -0800, Andrew Morton wrote: > > [...] > > OK. But still, questions remain. > > > ...

+ Reply to Thread
Page 2 of 2 FirstFirst 1 2
Results 21 to 32 of 32

Thread: Re: ext3: slow symlink corruption on umount...

  1. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, 3 Nov 2008 12:58:54 -0800
    Arthur Jones wrote:

    > Hi Andrew, ...
    >
    > On Mon, Nov 03, 2008 at 12:37:50PM -0800, Andrew Morton wrote:
    > > [...]
    > > OK. But still, questions remain.
    > >
    > > - should we clear s_dirt if log_wait_commit() didn't work?
    > >
    > > - ext3_force_commit() clears s_dirt also
    > >
    > > - should ext3_sync_fs() be dropping the ext3_force_commit() return
    > > value on the floor?

    >
    > - does ext4 have a similar issue (ext4_sync_fs looks the same)?
    >


    I'm sure it does. Someone(tm) should prepare the ext4 patch once we've
    settled on the ext3 patch.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, 3 Nov 2008 16:19:29 -0500
    Theodore Tso wrote:

    > On Mon, Nov 03, 2008 at 01:13:13PM -0800, Andrew Morton wrote:
    > > On Mon, 3 Nov 2008 12:58:54 -0800
    > > Arthur Jones wrote:
    > >
    > > > Hi Andrew, ...
    > > >
    > > > On Mon, Nov 03, 2008 at 12:37:50PM -0800, Andrew Morton wrote:
    > > > > [...]
    > > > > OK. But still, questions remain.
    > > > >
    > > > > - should we clear s_dirt if log_wait_commit() didn't work?
    > > > >
    > > > > - ext3_force_commit() clears s_dirt also
    > > > >
    > > > > - should ext3_sync_fs() be dropping the ext3_force_commit() return
    > > > > value on the floor?

    >
    > Should all the callers of ->sync_fs also be dropping the error returns
    > on the floor?


    Oh, this is sync_fs. How meaningful would it be to return an error
    from sys_umount()? Would there be any point in leaving the errored fs
    mounted? Dunno.

    But if we're going to drop this error on the floor, we should do
    that at a higher level, not on a per-fs basis

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, Nov 03, 2008 at 01:27:35PM -0800, Andrew Morton wrote:
    > Oh, this is sync_fs. How meaningful would it be to return an error
    > from sys_umount()? Would there be any point in leaving the errored fs
    > mounted? Dunno.
    >
    > But if we're going to drop this error on the floor, we should do
    > that at a higher level, not on a per-fs basis


    At the very least we should log it at the VFS layer, IMHO.

    - Ted
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    Hi Andrew, ...

    On Mon, Nov 03, 2008 at 12:37:50PM -0800, Andrew Morton wrote:
    > [...]
    > OK. But still, questions remain.
    >
    > - should we clear s_dirt if log_wait_commit() didn't work?
    >
    > - ext3_force_commit() clears s_dirt also


    I'm not sure if it makes sense, but ATM I don't think it
    hurts anything given that ext3_write_super is just sb->s_dirt = 0;

    > - should ext3_sync_fs() be dropping the ext3_force_commit() return
    > value on the floor?


    Something like this (tested)?

    diff --git a/fs/ext3/super.c b/fs/ext3/super.c
    index 2b48b85..743acf1 100644
    --- a/fs/ext3/super.c
    +++ b/fs/ext3/super.c
    @@ -2386,13 +2386,15 @@ static void ext3_write_super (struct super_block * sb)

    static int ext3_sync_fs(struct super_block *sb, int wait)
    {
    + int ret = 0;
    +
    sb->s_dirt = 0;
    if (wait)
    - ext3_force_commit(sb);
    + ret = ext3_force_commit(sb);
    else
    journal_start_commit(EXT3_SB(sb)->s_journal, NULL);

    - return 0;
    + return ret;
    }

    /*

    Arthur
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    Here's the ext4 version of the patch. It's slightly altered from the
    ext3 version in that we do reflect the errors up to sync_fs's callers
    (which then throw it on the floor, but we might as well take away some
    of the fun from all of those academic researchers who like to write
    papers complaining about how often Linux doesn't do appropriat eerror
    checking :-).

    After doing some testing, I plan to carry it in the ext4 patch queue.
    I think similar changes should be made to the ext3 version of the
    patch; agreed?

    - Ted

    commit 8106ea5364c2680a385ed240e8f898611babf661
    Author: Theodore Ts'o
    Date: Mon Nov 3 17:01:22 2008 -0500

    ext4: wait on all pending commits in ext4_sync_fs()

    In ext4_sync_fs, we only wait for a commit to finish if we started it,
    but there may be one already in progress which will not be synced.

    In the case of a data=ordered umount with pending long symlinks which
    are delayed due to a long list of other I/O on the backing block
    device, this causes the buffer associated with the long symlinks to
    not be moved to the inode dirty list in the second phase of
    fsync_super. Then, before they can be dirtied again, kjournald exits,
    seeing the UMOUNT flag and the dirty pages are never written to the
    backing block device, causing long symlink corruption and exposing new
    or previously freed block data to userspace.

    To ensure all commits are synced, we flush all journal commits now
    when sync_fs'ing ext4.

    Signed-off-by: Arthur Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: "Theodore Ts'o"
    Cc: Eric Sandeen
    Cc:

    diff --git a/fs/ext4/super.c b/fs/ext4/super.c
    index 97cb896..e199773 100644
    --- a/fs/ext4/super.c
    +++ b/fs/ext4/super.c
    @@ -2899,8 +2899,9 @@ int ext4_force_commit(struct super_block *sb)
    return 0;

    journal = EXT4_SB(sb)->s_journal;
    - sb->s_dirt = 0;
    ret = ext4_journal_force_commit(journal);
    + if (!ret)
    + sb->s_dirt = 0;
    return ret;
    }

    @@ -2922,15 +2923,16 @@ static void ext4_write_super(struct super_block *sb)

    static int ext4_sync_fs(struct super_block *sb, int wait)
    {
    - tid_t target;
    + int ret;

    trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    - sb->s_dirt = 0;
    - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    - if (wait)
    - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    - }
    - return 0;
    + if (wait)
    + ret = ext4_force_commit(sb);
    + else
    + ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    + if (!ret)
    + sb->s_dirt = 0;
    + return ret;
    }

    /*
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    Hi Ted, ...

    On Mon, Nov 03, 2008 at 02:01:44PM -0800, Theodore Tso wrote:
    > [...]
    > @@ -2922,15 +2923,16 @@ static void ext4_write_super(struct super_block *sb)
    >
    > static int ext4_sync_fs(struct super_block *sb, int wait)
    > {
    > - tid_t target;
    > + int ret;
    >
    > trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    > - sb->s_dirt = 0;
    > - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    > - if (wait)
    > - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    > - }
    > - return 0;
    > + if (wait)
    > + ret = ext4_force_commit(sb);
    > + else
    > + ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    > + if (!ret)
    > + sb->s_dirt = 0;
    > + return ret;


    This bit will clear sb->s_dirt if we did _not_ start
    a commit in jbd2_journal_start_commit. It will return
    1 if we did start a commit which I don't think is what
    we want to do here...

    Arthur
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, 3 Nov 2008 17:01:44 -0500
    Theodore Tso wrote:

    > static int ext4_sync_fs(struct super_block *sb, int wait)
    > {
    > - tid_t target;
    > + int ret;
    >
    > trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    > - sb->s_dirt = 0;
    > - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    > - if (wait)
    > - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    > - }
    > - return 0;
    > + if (wait)
    > + ret = ext4_force_commit(sb);
    > + else
    > + ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    > + if (!ret)
    > + sb->s_dirt = 0;
    > + return ret;
    > }


    It should clear s_dirt before doing the "i/o", methinks?

    The usual pattern is

    foo->dirty = 0;
    do_io_on(foo);

    because


    do_io_on(foo);
    modify(foo);
    foo->dirty = 1;
    foo->dirty = 0;

    is racy.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, Nov 03, 2008 at 01:48:40PM -0800, Arthur Jones wrote:
    > > - should we clear s_dirt if log_wait_commit() didn't work?
    > >
    > > - ext3_force_commit() clears s_dirt also

    >
    > I'm not sure if it makes sense, but ATM I don't think it
    > hurts anything given that ext3_write_super is just sb->s_dirt = 0;


    I want to look at this some more, but it's possible we should just
    remove all of the references to sb->s_dirt completely. Also, the
    comment just above ext[34]_write_super() is out of date since before
    2.6.0.

    - Ted
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, Nov 03, 2008 at 02:27:06PM -0800, Andrew Morton wrote:
    > It should clear s_dirt before doing the "i/o", methinks?


    Yep, good point. As I mentioned earlier, though, I'm about 99% sure
    that the right fix is to remove all mention of s_dirt entirely, and in
    fact we can make super_operations.write_super be NULL for ext3 and
    ext4. But for now we should just keep it in its usual place for now,
    and save that for a cleanup commit later on.

    - Ted

    commit b20506dc713db1105287b691390563d2aace6d84
    Author: Theodore Ts'o
    Date: Mon Nov 3 17:54:41 2008 -0500

    ext4: wait on all pending commits in ext4_sync_fs()

    In ext4_sync_fs, we only wait for a commit to finish if we started it,
    but there may be one already in progress which will not be synced.

    In the case of a data=ordered umount with pending long symlinks which
    are delayed due to a long list of other I/O on the backing block
    device, this causes the buffer associated with the long symlinks to
    not be moved to the inode dirty list in the second phase of
    fsync_super. Then, before they can be dirtied again, kjournald exits,
    seeing the UMOUNT flag and the dirty pages are never written to the
    backing block device, causing long symlink corruption and exposing new
    or previously freed block data to userspace.

    To ensure all commits are synced, we flush all journal commits now
    when sync_fs'ing ext4.

    Signed-off-by: Arthur Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: "Theodore Ts'o"
    Cc: Eric Sandeen
    Cc:

    diff --git a/fs/ext4/super.c b/fs/ext4/super.c
    index 97cb896..5b5e38e 100644
    --- a/fs/ext4/super.c
    +++ b/fs/ext4/super.c
    @@ -2907,12 +2907,9 @@ int ext4_force_commit(struct super_block *sb)
    /*
    * Ext4 always journals updates to the superblock itself, so we don't
    * have to propagate any other updates to the superblock on disk at this
    - * point. Just start an async writeback to get the buffers on their way
    - * to the disk.
    - *
    - * This implicitly triggers the writebehind on sync().
    + * point. (We can probably nuke this function altogether, and remove
    + * any mention to sb->s_dirt in all of fs/ext4; eventual cleanup...)
    */
    -
    static void ext4_write_super(struct super_block *sb)
    {
    if (mutex_trylock(&sb->s_lock) != 0)
    @@ -2922,15 +2919,15 @@ static void ext4_write_super(struct super_block *sb)

    static int ext4_sync_fs(struct super_block *sb, int wait)
    {
    - tid_t target;
    + int ret;

    trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    sb->s_dirt = 0;
    - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    - if (wait)
    - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    - }
    - return 0;
    + if (wait)
    + ret = ext4_force_commit(sb);
    + else
    + ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    + return ret;
    }

    /*
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    Hi Ted, ...

    On Mon, Nov 03, 2008 at 02:55:44PM -0800, Theodore Tso wrote:
    > @@ -2922,15 +2919,15 @@ static void ext4_write_super(struct super_block *sb)
    >
    > static int ext4_sync_fs(struct super_block *sb, int wait)
    > {
    > - tid_t target;
    > + int ret;
    >
    > trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    > sb->s_dirt = 0;
    > - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    > - if (wait)
    > - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    > - }
    > - return 0;
    > + if (wait)
    > + ret = ext4_force_commit(sb);
    > + else
    > + ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    > + return ret;
    > }
    >
    > /*


    Same problem as the last one, jbd2_journal_start_commit returns
    true if it started a commit, not an error...

    Arthur
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    On Mon, Nov 03, 2008 at 03:01:04PM -0800, Arthur Jones wrote:
    >
    > Same problem as the last one, jbd2_journal_start_commit returns
    > true if it started a commit, not an error...


    Ah, right. !@#! differing return conventions....

    BTW, did you ever open a bugzilla entry for this the symlink
    corruption problem that we should reference in the commit log?

    - Ted

    commit fd7384f8243a957386af3767532d035346f0d149
    Author: Theodore Ts'o
    Date: Mon Nov 3 18:10:55 2008 -0500

    ext4: wait on all pending commits in ext4_sync_fs()

    In ext4_sync_fs, we only wait for a commit to finish if we started it,
    but there may be one already in progress which will not be synced.

    In the case of a data=ordered umount with pending long symlinks which
    are delayed due to a long list of other I/O on the backing block
    device, this causes the buffer associated with the long symlinks to
    not be moved to the inode dirty list in the second phase of
    fsync_super. Then, before they can be dirtied again, kjournald exits,
    seeing the UMOUNT flag and the dirty pages are never written to the
    backing block device, causing long symlink corruption and exposing new
    or previously freed block data to userspace.

    To ensure all commits are synced, we flush all journal commits now
    when sync_fs'ing ext4.

    Signed-off-by: Arthur Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: "Theodore Ts'o"
    Cc: Eric Sandeen
    Cc:

    diff --git a/fs/ext4/super.c b/fs/ext4/super.c
    index 97cb896..9e7e66c 100644
    --- a/fs/ext4/super.c
    +++ b/fs/ext4/super.c
    @@ -2907,12 +2907,9 @@ int ext4_force_commit(struct super_block *sb)
    /*
    * Ext4 always journals updates to the superblock itself, so we don't
    * have to propagate any other updates to the superblock on disk at this
    - * point. Just start an async writeback to get the buffers on their way
    - * to the disk.
    - *
    - * This implicitly triggers the writebehind on sync().
    + * point. (We can probably nuke this function altogether, and remove
    + * any mention to sb->s_dirt in all of fs/ext4; eventual cleanup...)
    */
    -
    static void ext4_write_super(struct super_block *sb)
    {
    if (mutex_trylock(&sb->s_lock) != 0)
    @@ -2922,15 +2919,15 @@ static void ext4_write_super(struct super_block *sb)

    static int ext4_sync_fs(struct super_block *sb, int wait)
    {
    - tid_t target;
    + int ret = 0;

    trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
    sb->s_dirt = 0;
    - if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
    - if (wait)
    - jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
    - }
    - return 0;
    + if (wait)
    + ret = ext4_force_commit(sb);
    + else
    + jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
    + return ret;
    }

    /*
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

    Hi Ted, ...

    On Mon, Nov 03, 2008 at 03:12:09PM -0800, Theodore Tso wrote:
    > [...]
    > BTW, did you ever open a bugzilla entry for this the symlink
    > corruption problem that we should reference in the commit log?


    Nope...

    Patch looks good now -- thanks for looking into this...

    Arthur
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread
Page 2 of 2 FirstFirst 1 2