Commit 94a8cea5 authored by Brian Foster's avatar Brian Foster Committed by Theodore Ts'o
Browse files

ext4: fix dirtyclusters double decrement on fs shutdown



fstests test generic/388 occasionally reproduces a warning in
ext4_put_super() associated with the dirty clusters count:

  WARNING: CPU: 7 PID: 76064 at fs/ext4/super.c:1324 ext4_put_super+0x48c/0x590 [ext4]

Tracing the failure shows that the warning fires due to an
s_dirtyclusters_counter value of -1. IOW, this appears to be a
spurious decrement as opposed to some sort of leak. Further tracing
of the dirty cluster count deltas and an LLM scan of the resulting
output identified the cause as a double decrement in the error path
between ext4_mb_mark_diskspace_used() and the caller
ext4_mb_new_blocks().

First, note that generic/388 is a shutdown vs. fsstress test and so
produces a random set of operations and shutdown injections. In the
problematic case, the shutdown triggers an error return from the
ext4_handle_dirty_metadata() call(s) made from
ext4_mb_mark_context(). The changed value is non-zero at this point,
so ext4_mb_mark_diskspace_used() does not exit after the error
bubbles up from ext4_mb_mark_context(). Instead, the former
decrements both cluster counters and returns the error up to
ext4_mb_new_blocks(). The latter falls into the !ar->len out path
which decrements the dirty clusters counter a second time, creating
the inconsistency.

To avoid this problem and simplify ownership of the cluster
reservation in this codepath, lift the counter reduction to a single
place in the caller. This makes it more clear that
ext4_mb_new_blocks() is responsible for acquiring cluster
reservation (via ext4_claim_free_clusters()) in the !delalloc case
as well as releasing it, regardless of whether it ends up consumed
or returned due to failure.

Fixes: 0087d9fb ("ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc")
Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
Reviewed-by: default avatarBaokun Li <libaokun1@huawei.com>
Link: https://patch.msgid.link/20260113171905.118284-1-bfoster@redhat.com


Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
parent 491f2927
Loading
Loading
Loading
Loading
+1 −1
Original line number Diff line number Diff line
@@ -567,7 +567,7 @@ test_mark_diskspace_used_range(struct kunit *test,

	bitmap = mbt_ctx_bitmap(sb, TEST_GOAL_GROUP);
	memset(bitmap, 0, sb->s_blocksize);
	ret = ext4_mb_mark_diskspace_used(ac, NULL, 0);
	ret = ext4_mb_mark_diskspace_used(ac, NULL);
	KUNIT_ASSERT_EQ(test, ret, 0);

	max = EXT4_CLUSTERS_PER_GROUP(sb);
+5 −16
Original line number Diff line number Diff line
@@ -4186,8 +4186,7 @@ ext4_mb_mark_context(handle_t *handle, struct super_block *sb, bool state,
 * Returns 0 if success or error code
 */
static noinline_for_stack int
ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
				handle_t *handle, unsigned int reserv_clstrs)
ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, handle_t *handle)
{
	struct ext4_group_desc *gdp;
	struct ext4_sb_info *sbi;
@@ -4242,13 +4241,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
	BUG_ON(changed != ac->ac_b_ex.fe_len);
#endif
	percpu_counter_sub(&sbi->s_freeclusters_counter, ac->ac_b_ex.fe_len);
	/*
	 * Now reduce the dirty block count also. Should not go negative
	 */
	if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
		/* release all the reserved blocks if non delalloc */
		percpu_counter_sub(&sbi->s_dirtyclusters_counter,
				   reserv_clstrs);

	return err;
}
@@ -6333,7 +6325,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
			ext4_mb_pa_put_free(ac);
	}
	if (likely(ac->ac_status == AC_STATUS_FOUND)) {
		*errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
		*errp = ext4_mb_mark_diskspace_used(ac, handle);
		if (*errp) {
			ext4_discard_allocated_blocks(ac);
			goto errout;
@@ -6364,12 +6356,9 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
out:
	if (inquota && ar->len < inquota)
		dquot_free_block(ar->inode, EXT4_C2B(sbi, inquota - ar->len));
	if (!ar->len) {
		if ((ar->flags & EXT4_MB_DELALLOC_RESERVED) == 0)
			/* release all the reserved blocks if non delalloc */
			percpu_counter_sub(&sbi->s_dirtyclusters_counter,
						reserv_clstrs);
	}
	/* release any reserved blocks */
	if (reserv_clstrs)
		percpu_counter_sub(&sbi->s_dirtyclusters_counter, reserv_clstrs);

	trace_ext4_allocate_blocks(ar, (unsigned long long)block);