linux-cryptodev-2.6

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git synced 2026-04-18 03:23:53 -04:00

Author	SHA1	Message	Date
Chao Yu	ab59919c8a	f2fs: check skipped write in f2fs_enable_checkpoint() This patch introduces sbi->nr_pages[F2FS_SKIPPED_WRITE] to record any skipped write during data flush in f2fs_enable_checkpoint(). So in the loop of data flush, if there is any skipped write in previous flush, let's retry sync_inode_sb(), otherwise, all dirty data written before f2fs_enable_checkpoint() should have been persisted, then break the retry loop. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-27 02:45:44 +00:00
Jaegeuk Kim	993663874b	Revert "f2fs: add timeout in f2fs_enable_checkpoint()" This reverts commit `4bc3477796`. Let's apply a better approach to flush only the dirty pages committed by the user to avoid the delay caused by unnecessary incoming ones. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-20 20:54:14 +00:00
Chao Yu	a5d8b9d94e	f2fs: fix to unlock folio in f2fs_read_data_large_folio() We missed to unlock folio in error path of f2fs_read_data_large_folio(), fix it. With below testcase, it can reproduce the bug. touch /mnt/f2fs/file truncate -s $((1024*1024*1024)) /mnt/f2fs/file f2fs_io setflags immutable /mnt/f2fs/file sync echo 3 > /proc/sys/vm/drop_caches time dd if=/mnt/f2fs/file of=/dev/null bs=1M count=1024 f2fs_io clearflags immutable /mnt/f2fs/file echo 1 > /proc/sys/vm/drop_caches time dd if=/mnt/f2fs/file of=/dev/null bs=1M count=1024 time dd if=/mnt/f2fs/file of=/dev/null bs=1M count=1024 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-19 17:07:47 +00:00
Chao Yu	fe15bc3d44	f2fs: fix error path handling in f2fs_read_data_large_folio() In error path of f2fs_read_data_large_folio(), if bio is valid, it may submit bio twice, fix it. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-19 17:07:47 +00:00
Jaegeuk Kim	ec8bb999dc	f2fs: use folio_end_read No logic change. Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Chao Yu	5c145c0318	f2fs: fix to avoid mapping wrong physical block for swapfile Xiaolong Guo reported a f2fs bug in bugzilla [1] [1] https://bugzilla.kernel.org/show_bug.cgi?id=220951 Quoted: "When using stress-ng's swap stress test on F2FS filesystem with kernel 6.6+, the system experiences data corruption leading to either: 1 dm-verity corruption errors and device reboot 2 F2FS node corruption errors and boot hangs The issue occurs specifically when: 1 Using F2FS filesystem (ext4 is unaffected) 2 Swapfile size is less than F2FS section size (2MB) 3 Swapfile has fragmented physical layout (multiple non-contiguous extents) 4 Kernel version is 6.6+ (6.1 is unaffected) The root cause is in check_swap_activate() function in fs/f2fs/data.c. When the first extent of a small swapfile (< 2MB) is not aligned to section boundaries, the function incorrectly treats it as the last extent, failing to map subsequent extents. This results in incorrect swap_extent creation where only the first extent is mapped, causing subsequent swap writes to overwrite wrong physical locations (other files' data). 
Steps to Reproduce 1 Setup a device with F2FS-formatted userdata partition 2 Compile stress-ng from https://github.com/ColinIanKing/stress-ng 3 Run swap stress test: (Android devices) adb shell "cd /data/stressng; ./stress-ng-64 --metrics-brief --timeout 60 --swap 0" Log: 1 Ftrace shows in kernel 6.6, only first extent is mapped during second f2fs_map_blocks call in check_swap_activate(): stress-ng-swap-8990: f2fs_map_blocks: ino=11002, file offset=0, start blkaddr=0x43143, len=0x1 (Only 4KB mapped, not the full swapfile) 2 in kernel 6.1, both extents are correctly mapped: stress-ng-swap-5966: f2fs_map_blocks: ino=28011, file offset=0, start blkaddr=0x13cd4, len=0x1 stress-ng-swap-5966: f2fs_map_blocks: ino=28011, file offset=1, start blkaddr=0x60c84b, len=0xff The problematic code is in check_swap_activate(): if ((pblock - SM_I(sbi)->main_blkaddr) % blks_per_sec || nr_pblocks % blks_per_sec || !f2fs_valid_pinned_area(sbi, pblock)) { bool last_extent = false; not_aligned++; nr_pblocks = roundup(nr_pblocks, blks_per_sec); if (cur_lblock + nr_pblocks > sis->max) nr_pblocks -= blks_per_sec; /* this extent is last one */ if (!nr_pblocks) { nr_pblocks = last_lblock - cur_lblock; last_extent = true; } ret = f2fs_migrate_blocks(inode, cur_lblock, nr_pblocks); if (ret) { if (ret == -ENOENT) ret = -EINVAL; goto out; } if (!last_extent) goto retry; } When the first extent is unaligned and roundup(nr_pblocks, blks_per_sec) exceeds sis->max, we subtract blks_per_sec resulting in nr_pblocks = 0. The code then incorrectly assumes this is the last extent, sets nr_pblocks = last_lblock - cur_lblock (entire swapfile), and performs migration. After migration, it doesn't retry mapping, so subsequent extents are never processed. " In order to fix this issue, we need to lookup block mapping info after we migrate all blocks in the tail of swapfile. 
Cc: stable@kernel.org Fixes: `9703d69d9d` ("f2fs: support file pinning for zoned devices") Cc: Daeho Jeong <daehojeong@google.com> Reported-and-tested-by: Xiaolong Guo <guoxiaolong2008@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220951 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Chao Yu	fe2961fb77	f2fs: avoid f2fs_map_blocks() for consecutive holes in readpages For consecutive large hole mapping across {d,id,did}nodes , we don't need to call f2fs_map_blocks() to check one hole block per one time, instead, we can use map.m_next_pgofs as a hint of next potential valid block, so that we can skip calling f2fs_map_blocks the range of [cur_pgofs + 1, .m_next_pgofs). 1) regular case touch /mnt/f2fs/file truncate -s $((1024*1024*1024)) /mnt/f2fs/file time dd if=/mnt/f2fs/file of=/dev/null bs=1M count=1024 Before: real 0m0.706s user 0m0.000s sys 0m0.706s After: real 0m0.620s user 0m0.008s sys 0m0.611s 2) large folio case touch /mnt/f2fs/file truncate -s $((1024*1024*1024)) /mnt/f2fs/file f2fs_io setflags immutable /mnt/f2fs/file sync echo 3 > /proc/sys/vm/drop_caches time dd if=/mnt/f2fs/file of=/dev/null bs=1M count=1024 Before: real 0m0.438s user 0m0.004s sys 0m0.433s After: real 0m0.368s user 0m0.004s sys 0m0.364s Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Nanzhe Zhao	d194f112a9	f2fs: advance index and offset after zeroing in large folio read In f2fs_read_data_large_folio(), the block zeroing path calls folio_zero_range() and then continues the loop. However, it fails to advance index and offset before continuing. This can cause the loop to repeatedly process the same subpage of the folio, leading to stalls/hangs and incorrect progress when reading large folios with holes/zeroed blocks. Fix it by advancing index and offset unconditionally in the loop iteration, so they are updated even when the zeroing path continues. Signed-off-by: Nanzhe Zhao <nzzhao@126.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Nanzhe Zhao	6afd05ca6d	f2fs: add 'folio_in_bio' to handle readahead folios with no BIO submission f2fs_read_data_large_folio() can build a single read BIO across multiple folios during readahead. If a folio ends up having none of its subpages added to the BIO (e.g. all subpages are zeroed / treated as holes), it will never be seen by f2fs_finish_read_bio(), so folio_end_read() is never called. This leaves the folio locked and not marked uptodate. Track whether the current folio has been added to a BIO via a local 'folio_in_bio' bool flag, and when iterating readahead folios, explicitly mark the folio uptodate (on success) and unlock it when nothing was added. Signed-off-by: Nanzhe Zhao <nzzhao@126.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Yongpeng Yang	540d34c182	f2fs: avoid unnecessary block mapping lookups in f2fs_read_data_large_folio In the second call to f2fs_map_blocks within f2fs_read_data_large_folio, map.m_len exceeds the logical address space to be read. This patch ensures map.m_len does not exceed the required address space. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Chao Yu	93ffb6c28f	f2fs: detect more inconsistent cases in sanity_check_node_footer() Let's enhance sanity_check_node_footer() to detect more inconsistent cases as below: Node Type Node Footer Info =================== ============================= NODE_TYPE_REGULAR inode = true and xnode = true NODE_TYPE_INODE inode = false or xnode = true NODE_TYPE_XATTR inode = true or xnode = false NODE_TYPE_NON_INODE inode = false Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:35 +00:00
Chao Yu	50ac3ecd8e	f2fs: fix to do sanity check on node footer in {read,write}_end_io -----------[ cut here ]------------ kernel BUG at fs/f2fs/data.c:358! Call Trace: <IRQ> blk_update_request+0x5eb/0xe70 block/blk-mq.c:987 blk_mq_end_request+0x3e/0x70 block/blk-mq.c:1149 blk_complete_reqs block/blk-mq.c:1224 [inline] blk_done_softirq+0x107/0x160 block/blk-mq.c:1229 handle_softirqs+0x283/0x870 kernel/softirq.c:579 __do_softirq kernel/softirq.c:613 [inline] invoke_softirq kernel/softirq.c:453 [inline] __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680 irq_exit_rcu+0x9/0x30 kernel/softirq.c:696 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1050 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1050 </IRQ> In f2fs_write_end_io(), it detects there is inconsistency in between node page index (nid) and footer.nid of node page. If footer of node page is corrupted in fuzzed image, then we load corrupted node page w/ async method, e.g. f2fs_ra_node_pages() or f2fs_ra_node_page(), in where we won't do sanity check on node footer, once node page becomes dirty, we will encounter this bug after node page writeback. Cc: stable@kernel.org Reported-by: syzbot+803dd716c4310d16ff3a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=803dd716c4310d16ff3a Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Chao Yu	0a736109c9	f2fs: fix to do sanity check on node footer in __write_node_folio() Add node footer sanity check during node folio's writeback, if sanity check fails, let's shutdown filesystem to avoid looping to redirty and writeback in .writepages. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Yangyang Zang	f7b929eda1	f2fs: clean up the type parameter in f2fs_sync_meta_pages() Clean up code to improve readability, no logic changes. Signed-off-by: Yangyang Zang <zangyangyang1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Daeho Jeong	e48e16f3e3	f2fs: support non-4KB block size without packed_ssa feature Currently, F2FS requires the packed_ssa feature to be enabled when utilizing non-4KB block sizes (e.g., 16KB). This restriction limits the flexibility of filesystem formatting options. This patch allows F2FS to support non-4KB block sizes even when the packed_ssa feature is disabled. It adjusts the SSA calculation logic to correctly handle summary entries in larger blocks without the packed layout. Cc: stable@kernel.org Fixes: `7ee8bc3942` ("f2fs: revert summary entry count from 2048 to 512 in 16kb block support") Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Chao Yu	1dd3b437d4	f2fs: make FAULT_DISCARD obsolete __blkdev_issue_discard() in __submit_discard_cmd() will never fail, so let's make FAULT_DISCARD fault injection obsolete. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Chao Yu	ce2739e482	f2fs: fix to avoid UAF in f2fs_write_end_io() As syzbot reported an use-after-free issue in f2fs_write_end_io(). It is caused by below race condition: loop device umount - worker_thread - loop_process_work - do_req_filebacked - lo_rw_aio - lo_rw_aio_complete - blk_mq_end_request - blk_update_request - f2fs_write_end_io - dec_page_count - folio_end_writeback - kill_f2fs_super - kill_block_super - f2fs_put_super : free(sbi) : get_pages(, F2FS_WB_CP_DATA) accessed sbi which is freed In kill_f2fs_super(), we will drop all page caches of f2fs inodes before call free(sbi), it guarantee that all folios should end its writeback, so it should be safe to access sbi before last folio_end_writeback(). Let's relocate ckpt thread wakeup flow before folio_end_writeback() to resolve this issue. Cc: stable@kernel.org Fixes: `e234088758` ("f2fs: avoid wait if IO end up when do_checkpoint for better performance") Reported-by: syzbot+b4444e3c972a7a124187@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b4444e3c972a7a124187 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-17 00:00:34 +00:00
Chao Yu	3996b70209	Revert "f2fs: block cache/dio write during f2fs_enable_checkpoint()" This reverts commit `196c81fdd4`. Original patch may cause below deadlock, revert it. write remount - write_begin - lock_page --- lock A - prepare_write_begin - f2fs_map_lock - f2fs_enable_checkpoint - down_write(cp_enable_rwsem) --- lock B - sync_inode_sb - writepages - lock_page --- lock A - down_read(cp_enable_rwsem) --- lock A Cc: stable@kernel.org Fixes: `196c81fdd4` ("f2fs: block cache/dio write during f2fs_enable_checkpoint()") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-16 03:49:31 +00:00
Chao Yu	0eda086de8	f2fs: fix to check sysfs filename w/ gc_pin_file_thresh correctly Sysfs entry name is gc_pin_file_thresh instead of gc_pin_file_threshold, fix it. Cc: stable@kernel.org Fixes: `c521a6ab4a` ("f2fs: fix to limit gc_pin_file_threshold") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:09 +00:00
Yongpeng Yang	7633a7387e	f2fs: fix IS_CHECKPOINTED flag inconsistency issue caused by concurrent atomic commit and checkpoint writes During SPO tests, when mounting F2FS, an -EINVAL error was returned from f2fs_recover_inode_page. The issue occurred under the following scenario Thread A Thread B f2fs_ioc_commit_atomic_write - f2fs_do_sync_file // atomic = true - f2fs_fsync_node_pages : last_folio = inode folio : schedule before folio_lock(last_folio) f2fs_write_checkpoint - block_operations// writeback last_folio - schedule before f2fs_flush_nat_entries : set_fsync_mark(last_folio, 1) : set_dentry_mark(last_folio, 1) : folio_mark_dirty(last_folio) - __write_node_folio(last_folio) : f2fs_down_read(&sbi->node_write)//block - f2fs_flush_nat_entries : {struct nat_entry}->flag |= BIT(IS_CHECKPOINTED) - unblock_operations : f2fs_up_write(&sbi->node_write) f2fs_write_checkpoint//return : f2fs_do_write_node_page() f2fs_ioc_commit_atomic_write//return SPO Thread A calls f2fs_need_dentry_mark(sbi, ino), and the last_folio has already been written once. However, the {struct nat_entry}->flag did not have the IS_CHECKPOINTED set, causing set_dentry_mark(last_folio, 1) and write last_folio again after Thread B finishes f2fs_write_checkpoint. After SPO and reboot, it was detected that {struct node_info}->blk_addr was not NULL_ADDR because Thread B successfully write the checkpoint. This issue only occurs in atomic write scenarios. For regular file fsync operations, the folio must be dirty. If block_operations->f2fs_sync_node_pages successfully submit the folio write, this path will not be executed. Otherwise, the f2fs_write_checkpoint will need to wait for the folio write submission to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the situation where f2fs_need_dentry_mark checks that the {struct nat_entry}->flag /wo the IS_CHECKPOINTED flag, but the folio write has already been submitted, will not occur. 
Therefore, for atomic file fsync, sbi->node_write should be acquired through __write_node_folio to ensure that the IS_CHECKPOINTED flag correctly indicates that the checkpoint write has been completed. Fixes: `608514deba` ("f2fs: set fsync mark only for the last dnode") Cc: stable@kernel.org Signed-off-by: Sheng Yong <shengyong1@xiaomi.com> Signed-off-by: Jinbao Liu <liujinbao1@xiaomi.com> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:09 +00:00
Yongpeng Yang	071e50d61c	f2fs: change seq_file_ra_mul and max_io_bytes to unsigned int {struct file_ra_state}->ra_pages and {struct bio}->bi_iter.bi_size is defined as unsigned int, so values of seq_file_ra_mul and max_io_bytes exceeding UINT_MAX are meaningless. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:09 +00:00
Yongpeng Yang	98ea0039db	f2fs: fix out-of-bounds access in sysfs attribute read/write Some f2fs sysfs attributes suffer from out-of-bounds memory access and incorrect handling of integer values whose size is not 4 bytes. For example: vm:~# echo 65537 > /sys/fs/f2fs/vde/carve_out vm:~# cat /sys/fs/f2fs/vde/carve_out 65537 vm:~# echo 4294967297 > /sys/fs/f2fs/vde/atgc_age_threshold vm:~# cat /sys/fs/f2fs/vde/atgc_age_threshold 1 carve_out maps to {struct f2fs_sb_info}->carve_out, which is a 8-bit integer. However, the sysfs interface allows setting it to a value larger than 255, resulting in an out-of-range update. atgc_age_threshold maps to {struct atgc_management}->age_threshold, which is a 64-bit integer, but its sysfs interface cannot correctly set values larger than UINT_MAX. The root causes are: 1. __sbi_store() treats all default values as unsigned int, which prevents updating integers larger than 4 bytes and causes out-of-bounds writes for integers smaller than 4 bytes. 2. f2fs_sbi_show() also assumes all default values are unsigned int, leading to out-of-bounds reads and incorrect access to integers larger than 4 bytes. This patch introduces {struct f2fs_attr}->size to record the actual size of the integer associated with each sysfs attribute. With this information, sysfs read and write operations can correctly access and update values according to their real data size, avoiding memory corruption and truncation. Fixes: `b59d0bae6c` ("f2fs: add sysfs support for controlling the gc_thread") Cc: stable@kernel.org Signed-off-by: Jinbao Liu <liujinbao1@xiaomi.com> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Nanzhe Zhao	c0c589fa1d	f2fs: Accounting large folio subpages before bio submission In f2fs_read_data_large_folio(), read_pages_pending is incremented only after the subpage has been added to the BIO. With a heavily fragmented file, each new subpage can force submission of the previous BIO. If the BIO completes quickly, f2fs_finish_read_bio() may decrement read_pages_pending to zero and call folio_end_read() while the read loop is still processing other subpages of the same large folio. Fix the ordering by incrementing read_pages_pending before any possible BIO submission for the current subpage, matching the iomap ordering and preventing premature folio_end_read(). Signed-off-by: Nanzhe Zhao <nzzhao@126.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Nanzhe Zhao	00feea1dfc	f2fs: Zero f2fs_folio_state on allocation f2fs_folio_state is attached to folio->private and is expected to start with read_pages_pending == 0. However, the structure was allocated from ffs_entry_slab without being fully initialized, which can leave read_pages_pending with stale values. Allocate the object with __GFP_ZERO so all fields are reliably zeroed at creation time. Signed-off-by: Nanzhe Zhao <nzzhao@126.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Chao Yu	d36de29f4b	f2fs: sysfs: introduce inject_lock_timeout This patch adds a new sysfs node in /sys/fs/f2fs/<disk>/inject_lock_timeout, it relies on CONFIG_F2FS_FAULT_INJECTION kernel config. It can be used to simulate different type of timeout in lock duration. ========== =============================== Flag_Value Flag_Description ========== =============================== 0x00000000 No timeout (default) 0x00000001 Simulate running time 0x00000002 Simulate IO type sleep time 0x00000003 Simulate Non-IO type sleep time 0x00000004 Simulate runnable time ========== =============================== Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Chao Yu	c56254e2e0	f2fs: introduce FAULT_LOCK_TIMEOUT This patch introduce a new fault type FAULT_LOCK_TIMEOUT, it can be used to inject timeout into lock duration. Timeout type can be set via /sys/fs/f2fs/<disk>/inject_timeout_type Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Chao Yu	7a127c80b0	f2fs: rename FAULT_TIMEOUT to FAULT_ATOMIC_TIMEOUT No logic changes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:08 +00:00
Chao Yu	6fa1160539	f2fs: fix timeout precision of f2fs_io_schedule_timeout_killable() Sometimes, f2fs_io_schedule_timeout_killable(HZ) may delay for about 2 seconds, this is because __f2fs_schedule_timeout(DEFAULT_SCHEDULE_TIMEOUT) may delay for about 2 * DEFAULT_SCHEDULE_TIMEOUT due to its precision, but we only account the delay as DEFAULT_SCHEDULE_TIMEOUT as below, fix it. f2fs_io_schedule_timeout_killable() .. timeout -= DEFAULT_SCHEDULE_TIMEOUT; Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	da90b67155	f2fs: fix to use jiffies based precision for DEFAULT_SCHEDULE_TIMEOUT Due to timeout parameter in {io,}_schedule_timeout() is based on jiffies unit precision. It will lose precision when using msecs_to_jiffies(x) for conversion. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	b5da276ae6	f2fs: clean up w/ __f2fs_schedule_timeout() No logic changes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	67972c2b89	f2fs: trace elapsed time for io_rwsem lock Use f2fs_{down,up}_{read,write}_trace for io_rwsem to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	ce9fe67c9c	f2fs: trace elapsed time for cp_global_sem lock Use f2fs_{down,up}_write_trace for cp_global_sem to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	e605302c14	f2fs: trace elapsed time for gc_lock lock Use f2fs_{down,up}_write_trace for gc_lock to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:07 +00:00
Chao Yu	bb28b66875	f2fs: trace elapsed time for node_write lock Use f2fs_{down,up}_read_trace for node_write to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
Chao Yu	f9f9360251	f2fs: trace elapsed time for node_change lock Use f2fs_{down,up}_read_trace for node_change to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
Chao Yu	66e9e0d55d	f2fs: trace elapsed time for cp_rwsem lock Use f2fs_{down,up}_read_trace for cp_rwsem to trace lock elapsed time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
Chao Yu	e4b75621fc	f2fs: sysfs: introduce max_lock_elapsed_time This patch adds a new sysfs node in /sys/fs/f2fs/<device>/max_lock_elapsed_time. This is a threshold: once a thread enters the critical region that a lock covers and the total elapsed time exceeds this threshold, f2fs will print a tracepoint to dump information about the related context. This sysfs entry can be used to control the value of the threshold; by default, the value is 500 ms. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
Chao Yu	79b3cebc70	f2fs: add lock elapsed time trace facility for f2fs rwsemphore This patch adds lock elapsed time trace facility for f2fs rwsemaphore. If total elapsed time of critical region covered by lock exceeds a threshold, it will print tracepoint to dump information of lock related context, including: - thread information - CPU/IO priority - lock information - elapsed time - total time - running time (depend on CONFIG_64BIT) - runnable time (depend on CONFIG_SCHED_INFO and CONFIG_SCHEDSTATS) - io sleep time (depend on CONFIG_TASK_DELAY_ACCT and /proc/sys/kernel/task_delayacct) - other time (by default other time will account nonio sleep time, but, if above kconfig is not defined, other time will include runnable time and/or io sleep time as well) output: f2fs_lock_elapsed_time: dev = (254,52), comm: sh, pid: 13855, prio: 120, ioprio_class: 2, ioprio_data: 4, lock_name: cp_rwsem, lock_type: rlock, total: 1000, running: 993, runnable: 2, io_sleep: 0, other: 5 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
Daeho Jeong	7ec199117c	f2fs: flush plug periodically during GC to maximize readahead effect During the garbage collection process, F2FS submits readahead I/Os for valid blocks. However, since the GC loop runs within a single plug scope without intermediate flushing, these readahead I/Os often accumulate in the block layer's plug list instead of being dispatched to the device immediately. Consequently, when the GC thread attempts to lock the page later, the I/O might not have completed (or even started), leading to a "read try and wait" scenario. This negates the benefit of readahead and causes unnecessary delays in GC latency. This patch addresses this issue by introducing an intermediate blk_finish_plug() and blk_start_plug() pair within the GC loop. This forces the dispatch of pending I/Os, ensuring that readahead pages are fetched in time, thereby reducing GC latency. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-07 03:17:06 +00:00
ZhaoYueNan	572b1c6f2a	f2fs: Update the default value of the documentation ckpt_thread_ioprio The commit `8a2d9f00d` has been updated to set its default value to "rt,3", fixing the outdated default value in the F2FS documentation. Signed-off-by: ZhaoYueNan <amktiao030215@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:30:54 +00:00
Yongpeng Yang	9609dd7047	f2fs: remove non-uptodate folio from the page cache in move_data_block During data movement, move_data_block acquires file folio without triggering a file read. Such folio are typically not uptodate, they need to be removed from the page cache after gc complete. This patch marks folio with the PG_dropbehind flag and uses folio_end_dropbehind to remove folio from the page cache. Signed-off-by: Yunlei He <heyunlei@xiaomi.com> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:30:25 +00:00
Yongpeng Yang	db1a8a7813	f2fs: return immediately after submitting the specified folio in __submit_merged_write_cond f2fs_folio_wait_writeback ensures the folio write is submitted to the block layer via __submit_merged_write_cond, then waits for the folio to complete. Other I/O submissions are irrelevant to f2fs_folio_wait_writeback. Thus, if the folio write bio is already submitted, the function can return immediately. This patch adds a writeback parameter to __submit_merged_write_cond(), which signals an immediate return after submitting the target folio; callers waiting on writeback can use this parameter. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:29:40 +00:00
Yongpeng Yang	86c1cf0578	f2fs: clean up the force parameter in __submit_merged_write_cond() The force parameter in __submit_merged_write_cond is redundant, where `force == true` implies `inode == NULL && folio == NULL && ino == 0` is true, and `force == false` implies `inode != NULL || folio != NULL || ino != 0` is true. Thus, this patch replaces the force parameter with a stack variable force. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:29:35 +00:00
Zhiguo Niu	761dac9073	f2fs: fix to add gc count stat in f2fs_gc_range It missed the stat count in f2fs_gc_range. Cc: stable@kernel.org Fixes: `9bf1dcbdfd` ("f2fs: fix to account gc stats correctly") Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:27:26 +00:00
Chao Yu	3cb396a2c7	f2fs: fix to do sanity check on nat entry of quota inode As Zhiguo reported, nat entry of quota inode could be corrupted: "ino/block_addr=NULL_ADDR in nid=4 entry" We'd better to do sanity check on quota inode to detect and record nat.blk_addr inconsistency, so that we can have a chance to repair it w/ later fsck. Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:26:56 +00:00
Zhiguo Niu	3250bd41d9	f2fs: remove some redundant codes in f2fs_quota_enable 1. qf_inum has been got and checked in its caller f2fs_enable_quotas 2. f2fs_sb_has_quota_ino has been checked in all its callers 3. use sbi cleanup F2FS_SB(sb) Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2026-01-01 03:26:08 +00:00
Joanne Chang	4a210a5be2	f2fs: improve check for enough free sections The check for enough sections in segment.h has the following issues: 1. has_not_enough_free_secs() should return "enough secs" when "free_secs >= upper_secs", not just strictly greater. Conversely, it should only return "not enough secs" when "free_secs < lower_secs", not when they are equal. This accounts for the possibility that blocks can fit within curseg without requiring an additional free section. 2. __get_secs_required() currently separates the needed space to section and block parts, checking them against free sections and curseg, respectively. This does not consider the case where curseg cannot hold the whole block part, but excess free sections beyond the section part can accommodate some of the block part. 3. has_curseg_enough_space() only checks CURSEG_HOT_DATA for dentry blocks, but when active_logs=6, they may be placed in WARM and COLD sections. Also, the current logic does not consider that dentry and data blocks can be put in the same section when active_logs=2 or 6. This patch modifies the three functions to address the above issues: 1. Rename has_curseg_enough_space() to get_additional_blocks_required(). Calculate the minimum node, dentry, and data blocks curseg can accommodate. Then subtract these from the total required blocks of respective type to determine the worst-case number of blocks that must be placed in free sections. 2. In __get_secs_required(), get the number of blocks needing new sections from the new get_additional_blocks_required(). Return the upper bound of necessary free sections for these blocks. For active_logs=2 or 6, dentry blocks are combined with data blocks. 3. In has_not_enough_free_secs(), get the required sections from __get_secs_required(), and return “not enough secs” if “free_secs < required_secs”. 
Signed-off-by: Joanne Chang <joannechien@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-16 00:49:25 +00:00
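The arithmetic described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the kernel code: the helper names `blocks_beyond_curseg`, `secs_for_blocks`, and `not_enough_free_secs`, their parameters, and the use of plain `unsigned int` counters are all assumptions made for the example. It shows only the three ideas the message names: subtract what curseg can hold, round the remainder up to whole sections, and compare with a strict "less than".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel function): number of blocks that
 * cannot be placed in the current segment and so need free sections.
 */
static unsigned int blocks_beyond_curseg(unsigned int needed,
					 unsigned int curseg_room)
{
	return needed > curseg_room ? needed - curseg_room : 0;
}

/*
 * Hypothetical helper: upper bound on the sections needed to hold
 * `blocks` blocks, i.e. a ceiling division by blocks per section.
 */
static unsigned int secs_for_blocks(unsigned int blocks,
				    unsigned int blks_per_sec)
{
	assert(blks_per_sec > 0);
	return blocks / blks_per_sec + (blocks % blks_per_sec != 0);
}

/*
 * Hypothetical helper: "not enough" only when free sections are
 * strictly fewer than required; equality counts as enough.
 */
static bool not_enough_free_secs(unsigned int free_secs,
				 unsigned int required_secs)
{
	return free_secs < required_secs;
}
```

With these assumed helpers, a request whose blocks all fit in curseg yields zero required sections, so even zero free sections counts as enough.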
Jaegeuk Kim	903c6e95bc	f2fs: add a tracepoint to see large folio read submission For example, 1327.539878: f2fs_preload_pages_start: dev = (252,16), ino = 14, i_size = 4294967296 start: 0, end: 8191 1327.539878: page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=8192 order=9 size=0 async_size=0 ra_pages=4096 mmap_miss=0 prev_pos=-1 1327.539879: page_cache_ra_order: dev=252:16 ino=e index=0 order=9 size=4096 async_size=2048 ra_pages=4096 1327.541895: f2fs_readpages: dev = (252,16), ino = 14, start = 0 nrpage = 4096 1327.541930: f2fs_lookup_extent_tree_start: dev = (252,16), ino = 14, pgofs = 0, type = Read 1327.541931: f2fs_lookup_read_extent_tree_end: dev = (252,16), ino = 14, pgofs = 0, read_ext_info(fofs: 0, len: 1048576, blk: 4221440) 1327.541931: f2fs_map_blocks: dev = (252,16), ino = 14, file offset = 0, start blkaddr = 0x406a00, len = 0x1000, flags = 2, seg_type = 8, may_create = 0, multidevice = 0, flag = 0, err = 0 1327.541989: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 0, nr_pages = 512, dirty = 0, uptodate = 0 1327.542012: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 512, nr_pages = 512, dirty = 0, uptodate = 0 1327.542036: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 1024, nr_pages = 512, dirty = 0, uptodate = 0 1327.542080: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 1536, nr_pages = 512, dirty = 0, uptodate = 0 1327.542127: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 2048, nr_pages = 512, dirty = 0, uptodate = 0 1327.542151: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 2560, nr_pages = 512, dirty = 0, uptodate = 0 1327.542196: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 3072, nr_pages = 512, dirty = 0, uptodate = 0 1327.542219: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 3584, nr_pages = 512, dirty = 0, uptodate = 0 1327.542239: f2fs_submit_read_bio: dev = (252,16)/(252,16), rw = READ(R), DATA, sector 
= 33771520, size = 16777216 1327.542269: page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=8192 order=9 size=4096 async_size=2048 ra_pages=4096 mmap_miss=0 prev_pos=-1 1327.542289: page_cache_ra_order: dev=252:16 ino=e index=4096 order=9 size=4096 async_size=2048 ra_pages=4096 1327.544485: f2fs_readpages: dev = (252,16), ino = 14, start = 4096 nrpage = 4096 1327.544521: f2fs_lookup_extent_tree_start: dev = (252,16), ino = 14, pgofs = 4096, type = Read 1327.544521: f2fs_lookup_read_extent_tree_end: dev = (252,16), ino = 14, pgofs = 4096, read_ext_info(fofs: 0, len: 1048576, blk: 4221440) 1327.544522: f2fs_map_blocks: dev = (252,16), ino = 14, file offset = 4096, start blkaddr = 0x407a00, len = 0x1000, flags = 2, seg_type = 8, may_create = 0, multidevice = 0, flag = 0, err = 0 1327.544550: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 4096, nr_pages = 512, dirty = 0, uptodate = 0 1327.544575: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 4608, nr_pages = 512, dirty = 0, uptodate = 0 1327.544601: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 5120, nr_pages = 512, dirty = 0, uptodate = 0 1327.544647: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 5632, nr_pages = 512, dirty = 0, uptodate = 0 1327.544692: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 6144, nr_pages = 512, dirty = 0, uptodate = 0 1327.544734: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 6656, nr_pages = 512, dirty = 0, uptodate = 0 1327.544777: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 7168, nr_pages = 512, dirty = 0, uptodate = 0 1327.544805: f2fs_read_folio: dev = (252,16), ino = 14, DATA, FILE, index = 7680, nr_pages = 512, dirty = 0, uptodate = 0 1327.544826: f2fs_submit_read_bio: dev = (252,16)/(252,16), rw = READ(R), DATA, sector = 33804288, size = 16777216 1327.544852: f2fs_preload_pages_end: dev = (252,16), ino = 14, i_size = 4294967296 start: 8192, end: 8191 
Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-16 00:46:49 +00:00
Jaegeuk Kim	05e65c14ea	f2fs: support large folio for immutable non-compressed case This patch enables large folio for limited case where we can get the high-order memory allocation. It supports the encrypted and fsverity files, which are essential for Android environment. How to test: - dd if=/dev/zero of=/mnt/test/test bs=1G count=4 - f2fs_io setflags immutable /mnt/test/test - echo 3 > /proc/sys/vm/drop_caches : to reload inode with large folio - f2fs_io read 32 0 1024 mmap 0 0 /mnt/test/test Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-16 00:46:49 +00:00
Linus Torvalds	8f0b4cce44	Linux 6.19-rc1	2025-12-14 16:05:07 +12:00

1412051 Commits