Commit 0e6a169c authored by Dave Chen, committed by David Sterba

btrfs: fix unnecessary flush on close when truncating zero-sized files



In btrfs_setsize(), when a file is truncated to size 0, the
BTRFS_INODE_FLUSH_ON_CLOSE flag is unconditionally set to ensure
pending writes get flushed on close. This flag was designed to protect
the "truncate-then-rewrite" pattern, where an application truncates a
file with existing data down to zero and writes new content, ensuring
the new data reach disk on close.

However, when a file already has a size of 0 (e.g. a newly created
file opened with O_CREAT | O_TRUNC), oldsize and newsize are both 0.
In this case, setting BTRFS_INODE_FLUSH_ON_CLOSE is unnecessary because
no "good data" was truncated away. The subsequent filemap_flush() in
btrfs_release_file() then triggers avoidable writeback that disrupts
the normal delayed writeback batching, adding I/O overhead.
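The condition described above can be modelled in userspace as a small predicate. This is only a sketch of the decision, not the kernel code: flush_on_close_needed() is a hypothetical helper name, and it assumes the check is reached only on the shrinking (or same-size) truncate path.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the truncate check (hypothetical helper, not a
 * kernel function). Returns true only when existing data is truncated
 * away to zero, i.e. the "truncate-then-rewrite" case the flag protects.
 */
static bool flush_on_close_needed(uint64_t oldsize, uint64_t newsize)
{
	/* oldsize == 0 means nothing was discarded, so no flush is needed. */
	return newsize == 0 && oldsize != 0;
}
```

With this predicate, a file truncated from 4096 bytes to 0 still requests the flush, while a file that was already empty does not.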

This comes from a real workload. A backup service creates temporary
files via mkstemp(), closes them, and later reopens them with O_TRUNC
for writing. The O_TRUNC is defensive. File creation and usage are
handled by different components, so removing the unneeded truncation is
not straightforward. This pattern repeats for a large number of files,
and each close() triggers an unnecessary filemap_flush().
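The access pattern described above might look roughly like the following from userspace. This is an illustrative sketch, not the backup service's actual code: the function name create_then_reopen_trunc() and the path template passed to it are assumptions.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical reproduction of the workload: create an empty temporary
 * file, close it, then reopen it with a defensive O_TRUNC and write.
 * path_template must end in "XXXXXX" and is modified in place by mkstemp().
 * Returns 0 on success, -1 on error.
 */
static int create_then_reopen_trunc(char *path_template, const char *payload)
{
	/* Step 1: one component creates a zero-sized file and closes it. */
	int fd = mkstemp(path_template);
	if (fd < 0)
		return -1;
	if (close(fd) < 0)
		return -1;

	/*
	 * Step 2: another component reopens it with O_TRUNC. The file is
	 * already empty, so the truncate goes from size 0 to size 0.
	 */
	fd = open(path_template, O_WRONLY | O_TRUNC);
	if (fd < 0)
		return -1;

	size_t len = strlen(payload);
	ssize_t written = write(fd, payload, len);
	int ret = (written == (ssize_t)len) ? 0 : -1;

	/* Step 3: this close() is where the extra flush would be issued. */
	if (close(fd) < 0)
		ret = -1;
	return ret;
}
```

Calling it in a loop over many templates on a btrfs mount (for example with a writable template such as "/tmp/demo_XXXXXX") approximates the many-small-files case from the report.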

Signed-off-by: Dave Chen <davechen@synology.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent 30407652
+1 −1
@@ -5442,7 +5442,7 @@ static int btrfs_setsize(struct inode *inode, struct iattr *attr)
 		 * zero. Make sure any new writes to the file get on disk
 		 * on close.
 		 */
-		if (newsize == 0)
+		if (newsize == 0 && oldsize != 0)
 			set_bit(BTRFS_INODE_FLUSH_ON_CLOSE,
 				&BTRFS_I(inode)->runtime_flags);