
Thread: [2.6.24 BUG] 100% iowait on host while UML is running

  1. [2.6.24 BUG] 100% iowait on host while UML is running

    On 2.6.24, top started showing 100% iowait on one CPU when a UML
    instance was running (but completely idle). I've traced it to this
    commit. Reverting it cures the problem.
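
    The accounting behind the symptom: io_schedule() brackets the sleep
    with an increment and decrement of the runqueue's nr_iowait counter,
    and an idle CPU with nr_iowait non-zero is reported as iowait rather
    than idle. Roughly, from kernel/sched.c of this era (a simplified
    sketch, not the verbatim source):

    void __sched io_schedule(void)
    {
    	struct rq *rq = &__raw_get_cpu_var(runqueues);

    	delayacct_blkio_start();
    	atomic_inc(&rq->nr_iowait);	/* sleeper now counted as iowait */
    	schedule();
    	atomic_dec(&rq->nr_iowait);
    	delayacct_blkio_end();
    }

    So a task parked in io_getevents() with nothing in flight keeps one
    CPU's idle time in the iowait column for as long as it sleeps.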

    Miklos



    commit 41d10da3717409de33d5441f2f6d8f072ab3fbb6
    Author: Jeff Moyer
    Date: Tue Oct 16 23:27:20 2007 -0700

    aio: account I/O wait time properly

    Some months back I proposed changing the schedule() call in
    read_events to an io_schedule():
    http://osdir.com/ml/linux.kernel.aio.../msg00024.html
    This was rejected as there are AIO operations that do not initiate
    disk I/O. I've had another look at the problem, and the only AIO
    operation that will not initiate disk I/O is IOCB_CMD_NOOP. However,
    this command isn't even wired up!

    Given that it doesn't work, and hasn't for *years*, I'm going to
    suggest again that we do proper I/O accounting when using AIO.

    Signed-off-by: Jeff Moyer
    Acked-by: Zach Brown
    Cc: Benjamin LaHaise
    Cc: Suparna Bhattacharya
    Cc: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    diff --git a/fs/aio.c b/fs/aio.c
    index ea2e198..d02f43b 100644
    --- a/fs/aio.c
    +++ b/fs/aio.c
    @@ -303,7 +303,7 @@ static void wait_for_all_aios(struct kioctx *ctx)
     	set_task_state(tsk, TASK_UNINTERRUPTIBLE);
     	while (ctx->reqs_active) {
     		spin_unlock_irq(&ctx->ctx_lock);
    -		schedule();
    +		io_schedule();
     		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
     		spin_lock_irq(&ctx->ctx_lock);
     	}
    @@ -323,7 +323,7 @@ ssize_t fastcall wait_on_sync_kiocb(struct kiocb *iocb)
     		set_current_state(TASK_UNINTERRUPTIBLE);
     		if (!iocb->ki_users)
     			break;
    -		schedule();
    +		io_schedule();
     	}
     	__set_current_state(TASK_RUNNING);
     	return iocb->ki_user_data;
    @@ -1170,7 +1170,7 @@ retry:
     			ret = 0;
     			if (to.timed_out)	/* Only check after read evt */
     				break;
    -			schedule();
    +			io_schedule();
     			if (signal_pending(tsk)) {
     				ret = -EINTR;
     				break;

  2. Re: [2.6.24 BUG] 100% iowait on host while UML is running

    Miklos Szeredi writes:

    > On 2.6.24, top started showing 100% iowait on one CPU when a UML
    > instance was running (but completely idle). I've traced it to this
    > commit [1]. Reverting it cures the problem.


    Hi,

    The UML code sits in io_getevents waiting for an event to be submitted
    and completed. This is a case I completely overlooked when putting
    together the referenced patch.
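
    The pattern is easy to reproduce without UML. Here is a minimal idle
    waiter (a sketch assuming the libaio userspace wrappers are installed;
    build with -laio): with the referenced commit applied, it parks one
    CPU at 100% iowait even though it never submits an iocb.

    #include <libaio.h>
    #include <stdio.h>

    int main(void)
    {
    	io_context_t ctx = 0;
    	struct io_event ev;

    	if (io_setup(1, &ctx) < 0) {	/* one-slot AIO context */
    		perror("io_setup");
    		return 1;
    	}
    	/* Nothing submitted: block indefinitely in read_events(),
    	 * the same idle wait UML's AIO helper thread performs. */
    	io_getevents(ctx, 1, 1, &ev, NULL);
    	io_destroy(ctx);
    	return 0;
    }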

    We could check ctx->reqs_active before scheduling to determine whether
    or not we are waiting for I/O, but this would require taking the
    context lock in order to be accurate. Given that the test would be
    only for the sake of bookkeeping, it might be okay to do it outside
    of the lock.

    Zach, what are your thoughts on this?

    -Jeff


    [1] commit 41d10da3717409de33d5441f2f6d8f072ab3fbb6

  3. Re: [2.6.24 BUG] 100% iowait on host while UML is running


    > We could check ctx->reqs_active before scheduling to determine whether
    > or not we are waiting for I/O, but this would require taking the
    > context lock in order to be accurate. Given that the test would be
    > only for the sake of bookkeeping, it might be okay to do it outside
    > of the lock.
    >
    > Zach, what are your thoughts on this?


    I agree that it'd be OK to test it outside the lock, though we'll want
    some commentary:

    /* Try to only show up in io wait if there are ops in flight */
    if (ctx->reqs_active)
    	io_schedule();
    else
    	schedule();

    It's cheap, safe, and accurate the overwhelming majority of the time.

    We only need it in read_events(). The other two io_schedule() calls are
    only reached to wait on pending reqs specifically.

    It still won't make sense for iocbs which aren't performing IO, but I
    guess that's one more bridge to cross when we come to it.

    Do you want to throw this tiny patch together and submit it?

    - z

  4. Re: [2.6.24 BUG] 100% iowait on host while UML is running

    Zach Brown writes:

    >> We could check ctx->reqs_active before scheduling to determine whether
    >> or not we are waiting for I/O, but this would require taking the
    >> context lock in order to be accurate. Given that the test would be
    >> only for the sake of bookkeeping, it might be okay to do it outside
    >> of the lock.
    >>
    >> Zach, what are your thoughts on this?

    >
    > I agree that it'd be OK to test it outside the lock, though we'll want
    > some commentary:
    >
    > /* Try to only show up in io wait if there are ops in flight */
    > if (ctx->reqs_active)
    > 	io_schedule();
    > else
    > 	schedule();
    >
    > It's cheap, safe, and accurate the overwhelming majority of the time.
    >
    > We only need it in read_events(). The other two io_schedule() calls are
    > only reached to wait on pending reqs specifically.
    >
    > It still won't make sense for iocbs which aren't performing IO, but I
    > guess that's one more bridge to cross when we come to it.
    >
    > Do you want to throw this tiny patch together and submit it?


    Sure. I tested this on a system that I used to reproduce the problem,
    and it shows I/O wait back at normal levels on an idle system with one
    UML guest running.
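
    If you want to double-check the accounting numerically rather than
    eyeballing top, the per-CPU iowait tick counts are the fifth value on
    the cpuN lines of /proc/stat; a small reader (a sketch, field order
    per proc(5)):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
    	char line[512];
    	FILE *f = fopen("/proc/stat", "r");

    	if (!f) {
    		perror("/proc/stat");
    		return 1;
    	}
    	while (fgets(line, sizeof(line), f)) {
    		char cpu[16];
    		unsigned long long user, nice, sys, idle, iowait;

    		/* cpuN lines: user nice system idle iowait ... */
    		if (sscanf(line, "%15s %llu %llu %llu %llu %llu",
    			   cpu, &user, &nice, &sys, &idle, &iowait) == 6 &&
    		    strncmp(cpu, "cpu", 3) == 0)
    			printf("%-6s iowait=%llu ticks\n", cpu, iowait);
    	}
    	fclose(f);
    	return 0;
    }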

    Andrew, do you need a separate email with a [patch] heading or will
    this do?

    Cheers,

    Jeff

    Only account I/O wait time in read_events if there are active
    requests.

    Signed-off-by: Jeff Moyer

    diff --git a/fs/aio.c b/fs/aio.c
    index f12db41..9dec7d2 100644
    --- a/fs/aio.c
    +++ b/fs/aio.c
    @@ -1161,7 +1161,12 @@ retry:
     			ret = 0;
     			if (to.timed_out)	/* Only check after read evt */
     				break;
    -			io_schedule();
    +			/* Try to only show up in io wait if there are ops
    +			 * in flight */
    +			if (ctx->reqs_active)
    +				io_schedule();
    +			else
    +				schedule();
     			if (signal_pending(tsk)) {
     				ret = -EINTR;
     				break;

