[PATCH 2/3] workqueues: implement flush_work()


  1. [PATCH 2/3] workqueues: implement flush_work()

    Most users of flush_workqueue() can be changed to use cancel_work_sync(),
    but sometimes we really need to wait for the completion and cancelling is not
    an option. schedule_on_each_cpu() is a good example.

    Add the new helper, flush_work(work), which waits for the completion of the
    specific work_struct. More precisely, it "flushes" the result of the last
    queue_work() which is visible to the caller.

    For example, this code

    queue_work(wq, work);
    /* WINDOW */
    queue_work(wq, work);

    flush_work(work);

    doesn't necessarily work "as expected". What can happen in the WINDOW above is

    - wq starts the execution of work->func()

    - the caller migrates to another CPU

    now, after the 2nd queue_work() this work is active on the previous CPU, and
    at the same time it is queued on another. In this case flush_work(work) may
    return before the first work->func() completes.

    It is trivial to add another helper

    int flush_work_sync(struct work_struct *work)
    {
            return flush_work(work) || wait_on_work(work);
    }

    which works "more correctly", but it has to iterate over all CPUs and thus
    it is much slower than flush_work().

    Signed-off-by: Oleg Nesterov
    Acked-by: Max Krasnyansky

    --- 26-rc2/include/linux/workqueue.h~WQ_2_FLUSH_WORK	2008-05-18 15:42:34.000000000 +0400
    +++ 26-rc2/include/linux/workqueue.h	2008-05-18 15:42:34.000000000 +0400
    @@ -198,6 +198,8 @@ extern int keventd_up(void);
     extern void init_workqueues(void);
     int execute_in_process_context(work_func_t fn, struct execute_work *);

    +extern int flush_work(struct work_struct *work);
    +
     extern int cancel_work_sync(struct work_struct *work);

     /*
    --- 26-rc2/kernel/workqueue.c~WQ_2_FLUSH_WORK	2008-06-12 21:28:13.000000000 +0400
    +++ 26-rc2/kernel/workqueue.c	2008-06-29 18:20:33.000000000 +0400
    @@ -399,6 +399,52 @@ void flush_workqueue(struct workqueue_st
     }
     EXPORT_SYMBOL_GPL(flush_workqueue);

    +/**
    + * flush_work - block until a work_struct's callback has terminated
    + * @work: the work which is to be flushed
    + *
    + * It is expected that, prior to calling flush_work(), the caller has
    + * arranged for the work to not be requeued, otherwise it doesn't make
    + * sense to use this function.
    + */
    +int flush_work(struct work_struct *work)
    +{
    +	struct cpu_workqueue_struct *cwq;
    +	struct list_head *prev;
    +	struct wq_barrier barr;
    +
    +	might_sleep();
    +	cwq = get_wq_data(work);
    +	if (!cwq)
    +		return 0;
    +
    +	prev = NULL;
    +	spin_lock_irq(&cwq->lock);
    +	if (!list_empty(&work->entry)) {
    +		/*
    +		 * See the comment near try_to_grab_pending()->smp_rmb().
    +		 * If it was re-queued under us we are not going to wait.
    +		 */
    +		smp_rmb();
    +		if (unlikely(cwq != get_wq_data(work)))
    +			goto out;
    +		prev = &work->entry;
    +	} else {
    +		if (cwq->current_work != work)
    +			goto out;
    +		prev = &cwq->worklist;
    +	}
    +	insert_wq_barrier(cwq, &barr, prev->next);
    +out:
    +	spin_unlock_irq(&cwq->lock);
    +	if (!prev)
    +		return 0;
    +
    +	wait_for_completion(&barr.done);
    +	return 1;
    +}
    +EXPORT_SYMBOL_GPL(flush_work);
    +
     /*
      * Upon a successful return (>= 0), the caller "owns" WORK_STRUCT_PENDING bit,
      * so this work can't be re-armed in any way.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH 2/3] workqueues: implement flush_work()

    On Sun, Jun 29, 2008 at 06:49:26PM +0400, Oleg Nesterov wrote:
    ....
    > --- 26-rc2/kernel/workqueue.c~WQ_2_FLUSH_WORK 2008-06-12 21:28:13.000000000 +0400
    > +++ 26-rc2/kernel/workqueue.c 2008-06-29 18:20:33.000000000 +0400
    > @@ -399,6 +399,52 @@ void flush_workqueue(struct workqueue_st
    > }
    > EXPORT_SYMBOL_GPL(flush_workqueue);
    >
    > +/**
    > + * flush_work - block until a work_struct's callback has terminated
    > + * @work: the work which is to be flushed
    > + *
    > + * It is expected that, prior to calling flush_work(), the caller has
    > + * arranged for the work to not be requeued, otherwise it doesn't make
    > + * sense to use this function.
    > + */


    I missed this before, and probably it's not required, but "Returns..."
    could be added here.

    > +int flush_work(struct work_struct *work)
    > +{
    > +	struct cpu_workqueue_struct *cwq;
    > +	struct list_head *prev;
    > +	struct wq_barrier barr;
    > +
    > +	might_sleep();
    > +	cwq = get_wq_data(work);
    > +	if (!cwq)
    > +		return 0;
    > +
    > +	prev = NULL;
    > +	spin_lock_irq(&cwq->lock);
    > +	if (!list_empty(&work->entry)) {
    > +		/*
    > +		 * See the comment near try_to_grab_pending()->smp_rmb().
    > +		 * If it was re-queued under us we are not going to wait.
    > +		 */
    > +		smp_rmb();
    > +		if (unlikely(cwq != get_wq_data(work)))
    > +			goto out;
    > +		prev = &work->entry;
    > +	} else {


    Probably it doesn't matter too much, but one little doubt: don't we
    need (for consistency) smp_rmb() for this branch as well? It seems
    this cwq could be read out of order here too.

    > +		if (cwq->current_work != work)
    > +			goto out;
    > +		prev = &cwq->worklist;
    > +	}
    > +	insert_wq_barrier(cwq, &barr, prev->next);
    > +out:
    > +	spin_unlock_irq(&cwq->lock);
    > +	if (!prev)
    > +		return 0;
    > +
    > +	wait_for_completion(&barr.done);
    > +	return 1;
    > +}
    > +EXPORT_SYMBOL_GPL(flush_work);
    > +
    >  /*
    >   * Upon a successful return (>= 0), the caller "owns" WORK_STRUCT_PENDING bit,
    >   * so this work can't be re-armed in any way.
    >


    Otherwise, all looks correct to me as before.

    Regards,
    Jarek P.

  3. Re: [PATCH 2/3] workqueues: implement flush_work()

    On 06/30, Jarek Poplawski wrote:
    >
    > On Sun, Jun 29, 2008 at 06:49:26PM +0400, Oleg Nesterov wrote:
    > ...
    > > --- 26-rc2/kernel/workqueue.c~WQ_2_FLUSH_WORK 2008-06-12 21:28:13.000000000 +0400
    > > +++ 26-rc2/kernel/workqueue.c 2008-06-29 18:20:33.000000000 +0400
    > > @@ -399,6 +399,52 @@ void flush_workqueue(struct workqueue_st
    > > }
    > > EXPORT_SYMBOL_GPL(flush_workqueue);
    > >
    > > +/**
    > > + * flush_work - block until a work_struct's callback has terminated
    > > + * @work: the work which is to be flushed
    > > + *
    > > + * It is expected that, prior to calling flush_work(), the caller has
    > > + * arranged for the work to not be requeued, otherwise it doesn't make
    > > + * sense to use this function.
    > > + */

    >
    > I missed this before, and probably it's not required, but "Returns..."
    > could be added here.


    Agreed, I'll update the comment later, together with other changes
    in workqueue.c

    > > +	spin_lock_irq(&cwq->lock);
    > > +	if (!list_empty(&work->entry)) {
    > > +		/*
    > > +		 * See the comment near try_to_grab_pending()->smp_rmb().
    > > +		 * If it was re-queued under us we are not going to wait.
    > > +		 */
    > > +		smp_rmb();
    > > +		if (unlikely(cwq != get_wq_data(work)))
    > > +			goto out;
    > > +		prev = &work->entry;
    > > +	} else {

    >
    > Probably it doesn't matter too much, but one little doubt: don't we
    > need (for consistency) smp_rmb() for this branch as well? It seems
    > this cwq could be read out of order here too.
    >
    > > +		if (cwq->current_work != work)
    > > +			goto out;


    Yes, cwq can be "stale", but this doesn't matter and we can't have
    the false positive here.

    cwq->current_work is always changed under cwq->lock, and we hold this
    lock. If we see "cwq->current_work == work" we can safely insert the
    barrier and wait. Even if this work was already re-queued on another
    CPU or another workqueue_struct.

    Note also that rmb() can't really help here.

    > Otherwise, all looks correct to me as before.


    Thanks!

    Oleg.


  4. Re: [PATCH 2/3] workqueues: implement flush_work()

    On Tue, Jul 01, 2008 at 04:50:18PM +0400, Oleg Nesterov wrote:
    ....
    > Yes, cwq can be "stale", but this doesn't matter and we can't have
    > the false positive here.
    >
    > cwq->current_work is always changed under cwq->lock, and we hold this
    > lock. If we see "cwq->current_work == work" we can safely insert the
    > barrier and wait. Even if this work was already re-queued on another
    > CPU or another workqueue_struct.
    >
    > Note also that rmb() can't really help here.


    Right! The question is how "stale" this cwq could be when read without
    any lock or barrier. Of course, there can't be the false positive, but
    I wonder if we really do enough, to check if a work isn't current on
    some other cwq, even without any immediate re-queuing.

    Thanks for the explanation,
    Jarek P.

  5. Re: [PATCH 2/3] workqueues: implement flush_work()

    On 07/01, Jarek Poplawski wrote:
    >
    > On Tue, Jul 01, 2008 at 04:50:18PM +0400, Oleg Nesterov wrote:
    > ...
    > > Yes, cwq can be "stale", but this doesn't matter and we can't have
    > > the false positive here.
    > >
    > > cwq->current_work is always changed under cwq->lock, and we hold this
    > > lock. If we see "cwq->current_work == work" we can safely insert the
    > > barrier and wait. Even if this work was already re-queued on another
    > > CPU or another workqueue_struct.
    > >
    > > Note also that rmb() can't really help here.

    >
    > Right! The question is how "stale" this cwq could be when read without
    > any lock or barrier. Of course, there can't be the false positive, but
    > I wonder if we really do enough, to check if a work isn't current on
    > some other cwq, even without any immediate re-queuing.


    Not sure I understand...

    Of course, the work can be current on _all_ CPUs. So no, we don't do
    enough. Please look at the changelog, in particular the note about
    flush_work_sync().

    But without re-queuing cwq can't be wrong? Once again, flush_work()
    flushes the result of the last visible queue_work(). If not requeued,
    the work is either current, or it is pending and list_empty() == F.

    Oleg.

