Commit 8bc35475 authored by Tejun Heo

workqueue: Fix spurious data race in __flush_work()



When flushing a work item for cancellation, __flush_work() knows that it
exclusively owns the work item through its PENDING bit. 134874e2
("workqueue: Allow cancel_work_sync() and disable_work() from atomic
contexts on BH work items") added a read of @work->data to determine whether
to busy-wait for BH work items that are being canceled. The read is safe
when @from_cancel is set, but @work->data was read before @from_cancel was
tested to simplify code structure:

	data = *work_data_bits(work);
	if (from_cancel &&
	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {

Although the value read was never used when !@from_cancel, the unconditional
read could spuriously trigger KCSAN data race detection:

  ==================================================================
  BUG: KCSAN: data-race in __flush_work / __flush_work

  write to 0xffff8881223aa3e8 of 8 bytes by task 3998 on cpu 0:
   instrument_write include/linux/instrumented.h:41 [inline]
   ___set_bit include/asm-generic/bitops/instrumented-non-atomic.h:28 [inline]
   insert_wq_barrier kernel/workqueue.c:3790 [inline]
   start_flush_work kernel/workqueue.c:4142 [inline]
   __flush_work+0x30b/0x570 kernel/workqueue.c:4178
   flush_work kernel/workqueue.c:4229 [inline]
   ...

  read to 0xffff8881223aa3e8 of 8 bytes by task 50 on cpu 1:
   __flush_work+0x42a/0x570 kernel/workqueue.c:4188
   flush_work kernel/workqueue.c:4229 [inline]
   flush_delayed_work+0x66/0x70 kernel/workqueue.c:4251
   ...

  value changed: 0x0000000000400000 -> 0xffff88810006c00d

Reorganize the code so that @from_cancel is tested before @work->data is
accessed. Because the only problem is the spurious KCSAN report, the read
shouldn't need READ_ONCE() or other access qualifiers.

No functional changes.
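
The reordering described above can be illustrated with a minimal userspace
sketch. It is not the kernel code: DEMO_STRUCT_PWQ, DEMO_OFFQ_BH, demo_load()
and demo_reads are hypothetical stand-ins introduced here, and the bit values
are illustrative only. The sketch counts loads of the shared word to show
that the old shape reads it unconditionally, while the new shape reads it
only when from_cancel is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; values are illustrative. */
#define DEMO_STRUCT_PWQ	(1UL << 2)
#define DEMO_OFFQ_BH	(1UL << 4)

/* Counts how many times the shared word has been loaded. */
static int demo_reads;

static unsigned long demo_load(const unsigned long *word)
{
	demo_reads++;
	return *word;
}

/* Old shape: the word is loaded even when from_cancel is false. */
static bool busy_wait_old(const unsigned long *word, bool from_cancel)
{
	unsigned long data = demo_load(word);

	return from_cancel &&
	       !(data & DEMO_STRUCT_PWQ) && (data & DEMO_OFFQ_BH);
}

/* New shape: the word is loaded only after from_cancel has been tested. */
static bool busy_wait_new(const unsigned long *word, bool from_cancel)
{
	if (from_cancel) {
		unsigned long data = demo_load(word);

		if (!(data & DEMO_STRUCT_PWQ) && (data & DEMO_OFFQ_BH))
			return true;
	}
	return false;
}

/* Both shapes must return the same decision for every input. */
static void demo_check(void)
{
	const unsigned long words[] = {
		0, DEMO_STRUCT_PWQ, DEMO_OFFQ_BH,
		DEMO_STRUCT_PWQ | DEMO_OFFQ_BH,
	};

	for (unsigned int i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
		assert(busy_wait_old(&words[i], false) ==
		       busy_wait_new(&words[i], false));
		assert(busy_wait_old(&words[i], true) ==
		       busy_wait_new(&words[i], true));
	}
}
```

Calling demo_check() from a test harness confirms that the two shapes agree
on the result; the difference is only whether the load happens on the
!from_cancel path, which is the access KCSAN flagged.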

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: <syzbot+b3e4f2f51ed645fd5df2@syzkaller.appspotmail.com>
Fixes: 134874e2 ("workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items")
Link: http://lkml.kernel.org/r/000000000000ae429e061eea2157@google.com
Cc: Jens Axboe <axboe@kernel.dk>
parent 98cc1730
+25 −20
@@ -4166,7 +4166,6 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 static bool __flush_work(struct work_struct *work, bool from_cancel)
 {
 	struct wq_barrier barr;
-	unsigned long data;
 
 	if (WARN_ON(!wq_online))
 		return false;
@@ -4184,15 +4183,18 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	 * was queued on a BH workqueue, we also know that it was running in the
 	 * BH context and thus can be busy-waited.
 	 */
-	data = *work_data_bits(work);
-	if (from_cancel &&
-	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {
+	if (from_cancel) {
+		unsigned long data = *work_data_bits(work);
+
+		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
+		    (data & WORK_OFFQ_BH)) {
 			/*
-		 * On RT, prevent a live lock when %current preempted soft
-		 * interrupt processing or prevents ksoftirqd from running by
-		 * keeping flipping BH. If the BH work item runs on a different
-		 * CPU then this has no effect other than doing the BH
-		 * disable/enable dance for nothing. This is copied from
+			 * On RT, prevent a live lock when %current preempted
+			 * soft interrupt processing or prevents ksoftirqd from
+			 * running by keeping flipping BH. If the BH work item
+			 * runs on a different CPU then this has no effect other
+			 * than doing the BH disable/enable dance for nothing.
+			 * This is copied from
 			 * kernel/softirq.c::tasklet_unlock_spin_wait().
 			 */
 			while (!try_wait_for_completion(&barr.done)) {
@@ -4203,10 +4205,13 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 					cpu_relax();
 				}
 			}
-	} else {
-		wait_for_completion(&barr.done);
+			goto out_destroy;
+		}
 	}
 
+	wait_for_completion(&barr.done);
+
+out_destroy:
 	destroy_work_on_stack(&barr.work);
 	return true;
 }