From eade54040384f54b7fb330e4b0975c5734850b3c Mon Sep 17 00:00:00 2001
From: Sheng Yong
Date: Fri, 27 Feb 2026 10:30:08 +0800
Subject: [PATCH 1/4] erofs: set fileio bio failed in short read case

For file-backed mount, IO requests are handled by vfs_iocb_iter_read().
However, it can be interrupted by SIGKILL, returning the number of
bytes actually copied. Unused folios in bio are unexpectedly marked as
uptodate.

  vfs_read
   filemap_read
    filemap_get_pages
     filemap_readahead
      erofs_fileio_readahead
       erofs_fileio_rq_submit
        vfs_iocb_iter_read
         filemap_read
          filemap_get_pages  <= detect signal
        erofs_fileio_ki_complete  <= set all folios uptodate

This patch addresses this by setting short read bio with an error
directly.

Fixes: bc804a8d7e86 ("erofs: handle end of filesystem properly for file-backed mounts")
Reported-by: chenguanyou
Signed-off-by: Yunlei He
Signed-off-by: Sheng Yong
Reviewed-by: Gao Xiang
Reviewed-by: Chao Yu
Signed-off-by: Gao Xiang
---
 fs/erofs/fileio.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index abe873f01297..98cdaa1cd1a7 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -25,10 +25,8 @@ static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
 			container_of(iocb, struct erofs_fileio_rq, iocb);
 	struct folio_iter fi;
 
-	if (ret >= 0 && ret != rq->bio.bi_iter.bi_size) {
-		bio_advance(&rq->bio, ret);
-		zero_fill_bio(&rq->bio);
-	}
+	if (ret >= 0 && ret != rq->bio.bi_iter.bi_size)
+		ret = -EIO;
 	if (!rq->bio.bi_end_io) {
 		bio_for_each_folio_all(fi, &rq->bio) {
 			DBG_BUGON(folio_test_uptodate(fi.folio));

From c23df30915f83e7257c8625b690a1cece94142a0 Mon Sep 17 00:00:00 2001
From: Jiucheng Xu
Date: Wed, 11 Mar 2026 17:11:31 +0800
Subject: [PATCH 2/4] erofs: add GFP_NOIO in the bio completion if needed

The bio completion path in the process context (e.g. dm-verity) will
directly call into decompression rather than trigger another workqueue
context for minimal scheduling latencies, which can then call
vm_map_ram() with GFP_KERNEL.

Due to insufficient memory, vm_map_ram() may generate memory swapping
I/O, which can cause submit_bio_wait to deadlock in some scenarios.

Trimmed down the call stack, as follows:

f2fs_submit_read_io
 submit_bio                          //bio_list is initialized.
  mmc_blk_mq_recovery
   z_erofs_endio
    vm_map_ram
     __pte_alloc_kernel
      __alloc_pages_direct_reclaim
       shrink_folio_list
        __swap_writepage
         submit_bio_wait             //bio_list is non-NULL, hang!!!

Use memalloc_noio_{save,restore}() to wrap up this path.

Reviewed-by: Gao Xiang
Signed-off-by: Jiucheng Xu
Reviewed-by: Chao Yu
Signed-off-by: Gao Xiang
---
 fs/erofs/zdata.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 3977e42b9516..fe8121df9ef2 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1445,6 +1445,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
 				       int bios)
 {
 	struct erofs_sb_info *const sbi = EROFS_SB(io->sb);
+	int gfp_flag;
 
 	/* wake up the caller thread for sync decompression */
 	if (io->sync) {
@@ -1477,7 +1478,9 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
 			sbi->sync_decompress = EROFS_SYNC_DECOMPRESS_FORCE_ON;
 		return;
 	}
+	gfp_flag = memalloc_noio_save();
 	z_erofs_decompressqueue_work(&io->u.work);
+	memalloc_noio_restore(gfp_flag);
 }
 
 static void z_erofs_fill_bio_vec(struct bio_vec *bvec,

From 938c418422c4b08523ae39aebbd828428dcfefd2 Mon Sep 17 00:00:00 2001
From: Gao Xiang
Date: Mon, 23 Mar 2026 17:48:57 +0800
Subject: [PATCH 3/4] erofs: update the Kconfig description

Refine the description to better highlight its features and use cases.

In addition, add instructions for building it as a module and clarify
the compression option.

Reviewed-by: Chao Yu
Signed-off-by: Gao Xiang
---
 fs/erofs/Kconfig | 43 +++++++++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index a9f645f57bb2..97c48ebe8458 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -16,22 +16,36 @@ config EROFS_FS
 	select ZLIB_INFLATE if EROFS_FS_ZIP_DEFLATE
 	select ZSTD_DECOMPRESS if EROFS_FS_ZIP_ZSTD
 	help
-	  EROFS (Enhanced Read-Only File System) is a lightweight read-only
-	  file system with modern designs (e.g. no buffer heads, inline
-	  xattrs/data, chunk-based deduplication, multiple devices, etc.) for
-	  scenarios which need high-performance read-only solutions, e.g.
-	  smartphones with Android OS, LiveCDs and high-density hosts with
-	  numerous containers;
+	  EROFS (Enhanced Read-Only File System) is a modern, lightweight,
+	  secure read-only filesystem for various use cases, such as immutable
+	  system images, container images, application sandboxes, and datasets.
 
-	  It also provides transparent compression and deduplication support to
-	  improve storage density and maintain relatively high compression
-	  ratios, and it implements in-place decompression to temporarily reuse
-	  page cache for compressed data using proper strategies, which is
-	  quite useful for ensuring guaranteed end-to-end runtime decompression
+	  EROFS uses a flexible, hierarchical on-disk design so that features
+	  can be enabled on demand: the core on-disk format is block-aligned in
+	  order to perform optimally on all kinds of devices, including block
+	  and memory-backed devices; the format is easy to parse and has zero
+	  metadata redundancy, unlike generic filesystems, making it ideal for
+	  filesystem auditing and remote access; inline data, random-access
+	  friendly directory data, inline/shared extended attributes and
+	  chunk-based deduplication ensure space efficiency while maintaining
+	  high performance.
+
+	  Optionally, it supports multiple devices to reference external data,
+	  enabling data sharing for container images.
+
+	  It also has advanced encoded on-disk layouts, particularly for data
+	  compression and fine-grained deduplication. It utilizes fixed-size
+	  output compression to improve storage density while keeping relatively
+	  high compression ratios. Furthermore, it implements in-place
+	  decompression to reuse file pages to keep compressed data temporarily
+	  with proper strategies, which ensures guaranteed end-to-end runtime
 	  performance under extreme memory pressure without extra cost.
 
-	  See the documentation at
-	  and the web pages at for more details.
+	  For more details, see the web pages at
+	  and the documentation at .
+
+	  To compile EROFS filesystem support as a module, choose M here. The
+	  module will be called erofs.
 
 	  If unsure, say N.
 
@@ -105,7 +119,8 @@ config EROFS_FS_ZIP
 	depends on EROFS_FS
 	default y
 	help
-	  Enable transparent compression support for EROFS file systems.
+	  Enable EROFS compression layouts so that filesystems containing
+	  compressed files can be parsed by the kernel.
 
 	  If you don't want to enable compression feature, say N.
 

From 2f0407ed923b7eb363424033fc12fe253da139c4 Mon Sep 17 00:00:00 2001
From: Gao Xiang
Date: Tue, 24 Mar 2026 23:54:07 +0800
Subject: [PATCH 4/4] erofs: fix .fadvise() for page cache sharing

Currently, .fadvise() doesn't work well if page cache sharing is on
since shared inodes belong to a pseudo fs generated with init_pseudo(),
and sb->s_bdi is the default one &noop_backing_dev_info.

Then, generic_fadvise() will just behave as a no-op if sb->s_bdi is
&noop_backing_dev_info, but as the bdev fs (the bdev fs changes
inode_to_bdi() instead), it's actually NOT a pure memfs.

Let's generate a real bdi for erofs_ishare_mnt instead.

Fixes: d86d7817c042 ("erofs: implement .fadvise for page cache share")
Reviewed-by: Hongbo Li
Signed-off-by: Gao Xiang
---
 fs/erofs/ishare.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
index 829d50d5c717..ec433bacc592 100644
--- a/fs/erofs/ishare.c
+++ b/fs/erofs/ishare.c
@@ -200,8 +200,19 @@ struct inode *erofs_real_inode(struct inode *inode, bool *need_iput)
 
 int __init erofs_init_ishare(void)
 {
-	erofs_ishare_mnt = kern_mount(&erofs_anon_fs_type);
-	return PTR_ERR_OR_ZERO(erofs_ishare_mnt);
+	struct vfsmount *mnt;
+	int ret;
+
+	mnt = kern_mount(&erofs_anon_fs_type);
+	if (IS_ERR(mnt))
+		return PTR_ERR(mnt);
+	/* generic_fadvise() doesn't work if s_bdi == &noop_backing_dev_info */
+	ret = super_setup_bdi(mnt->mnt_sb);
+	if (ret)
+		kern_unmount(mnt);
+	else
+		erofs_ishare_mnt = mnt;
+	return ret;
 }
 
 void erofs_exit_ishare(void)