Commit cd1635d8 authored by Yu Kuai

md/raid5: fix IO hang with degraded array with llbitmap

While an llbitmap bit is still in the unwritten state, any new write must
force rcw; handle_stripe_dirtying() enforces this by checking
bitmap_ops->blocks_synced(). However, the same check is missing in
need_this_block(), so the stripe loops forever during handling:
handle_stripe() decides to go to handle_stripe_fill(), while
need_this_block() always returns 0 and nothing is handled.
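
As a rough userspace sketch of the condition described above (every type and
function name with a "model" in it is a hypothetical stand-in, not a kernel
definition), the force-rcw decision can be modelled as:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
enum rmw_level_model {
	MODEL_PARITY_DISABLE_RMW,
	MODEL_PARITY_ENABLE_RMW,
};

struct bitmap_ops_model {
	/* Returns true when the region covering `sector` is already synced. */
	bool (*blocks_synced)(unsigned long long sector);
};

struct array_model {
	enum rmw_level_model rmw_level;
	const struct bitmap_ops_model *bitmap_ops;	/* may be NULL */
};

/*
 * Model of the predicate: rcw is forced when rmw is disabled, or when a
 * bitmap with a blocks_synced() hook reports the sector as not yet synced.
 */
static bool model_force_rcw(const struct array_model *a,
			    unsigned long long sector)
{
	if (a->rmw_level == MODEL_PARITY_DISABLE_RMW)
		return true;

	if (a->bitmap_ops && a->bitmap_ops->blocks_synced &&
	    !a->bitmap_ops->blocks_synced(sector))
		return true;

	return false;
}

/* Example hooks: a region that is still unwritten, and one that is synced. */
static bool model_unwritten(unsigned long long sector)
{
	(void)sector;
	return false;
}

static bool model_synced(unsigned long long sector)
{
	(void)sector;
	return true;
}
```

The point of the sketch is only that both code paths need to evaluate the same
predicate; if one path treats the stripe as rcw and the other does not, the
two can disagree about whether any block needs to be read.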

Link: https://lore.kernel.org/linux-raid/20260123182623.3718551-2-yukuai@fnnas.com
Fixes: 5ab829f1 ("md/md-llbitmap: introduce new lockless bitmap")
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
Reviewed-by: Li Nan <linan122@huawei.com>
parent 5d1dd579
+6 −1
@@ -3751,9 +3751,14 @@ static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
 	struct r5dev *dev = &sh->dev[disk_idx];
 	struct r5dev *fdev[2] = { &sh->dev[s->failed_num[0]],
 				  &sh->dev[s->failed_num[1]] };
+	struct mddev *mddev = sh->raid_conf->mddev;
+	bool force_rcw = false;
 	int i;
-	bool force_rcw = (sh->raid_conf->rmw_level == PARITY_DISABLE_RMW);

+	if (sh->raid_conf->rmw_level == PARITY_DISABLE_RMW ||
+	    (mddev->bitmap_ops && mddev->bitmap_ops->blocks_synced &&
+	     !mddev->bitmap_ops->blocks_synced(mddev, sh->sector)))
+		force_rcw = true;

 	if (test_bit(R5_LOCKED, &dev->flags) ||
 	    test_bit(R5_UPTODATE, &dev->flags))