Commit d87d7389 authored by Linus Torvalds

Merge tag 'ext4_for_linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "New ext4 features and performance improvements:

   - Fast commit performance improvements

   - Multi-fsblock atomic write support for bigalloc file systems

   - Large folio support for regular files

  This last can result in really stupendous performance for the right
  workloads. For example, see [1] where the Kernel Test Robot reported
  over 37% improvement on a large sequential I/O workload.

  There are also the usual bug fixes and cleanups. Of note are cleanups
  of the extent status tree to fix potential races that could result in
  the extent status tree getting corrupted under heavy simultaneous
  allocation and deallocation to a single file"

Link: https://lore.kernel.org/all/202505161418.ec0d753f-lkp@intel.com/ [1]

* tag 'ext4_for_linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (52 commits)
  ext4: Add a WARN_ON_ONCE for querying LAST_IN_LEAF instead
  ext4: Simplify flags in ext4_map_query_blocks()
  ext4: Rename and document EXT4_EX_FILTER to EXT4_EX_QUERY_FILTER
  ext4: Simplify last in leaf check in ext4_map_query_blocks
  ext4: Unwritten to written conversion requires EXT4_EX_NOCACHE
  ext4: only dirty folios when data journaling regular files
  ext4: Add atomic block write documentation
  ext4: Enable support for ext4 multi-fsblock atomic write using bigalloc
  ext4: Add multi-fsblock atomic write support with bigalloc
  ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS
  ext4: Make ext4_meta_trans_blocks() non-static for later use
  ext4: Check if inode uses extents in ext4_inode_can_atomic_write()
  ext4: Document an edge case for overwrites
  jbd2: remove journal_t argument from jbd2_superblock_csum()
  jbd2: remove journal_t argument from jbd2_chksum()
  ext4: remove sb argument from ext4_superblock_csum()
  ext4: remove sbi argument from ext4_chksum()
  ext4: enable large folio for regular file
  ext4: make online defragmentation support large folios
  ext4: make the writeback path support large folios
  ...
parents e9d71265 7acd1b31
+225 −0
.. SPDX-License-Identifier: GPL-2.0
.. _atomic_writes:

Atomic Block Writes
-------------------------

Introduction
~~~~~~~~~~~~

Atomic (untorn) block writes ensure that either the entire write is committed
to disk or none of it is. This prevents "torn writes" during power loss or
system crashes. The ext4 filesystem supports atomic writes (only with Direct
I/O) on regular files with extents, provided the underlying storage device
supports hardware atomic writes. This is supported in the following two ways:

1. **Single-fsblock Atomic Writes**:
   EXT4 has supported atomic write operations on a single filesystem block
   since v6.13. In this mode the atomic write unit minimum and maximum sizes
   are both set to the filesystem block size.
   For example, a 16KB atomic write with a 16KB filesystem block size is
   possible on a system with a 64KB page size.

2. **Multi-fsblock Atomic Writes with Bigalloc**:
   EXT4 now also supports atomic writes spanning multiple filesystem blocks
   using a feature known as bigalloc. The atomic write unit's minimum and
   maximum sizes are determined by the filesystem block size and cluster size,
   based on the underlying device’s supported atomic write unit limits.

Requirements
~~~~~~~~~~~~

Basic requirements for atomic writes in ext4:

 1. The extents feature must be enabled (default for ext4)
 2. The underlying block device must support atomic writes
 3. For single-fsblock atomic writes:

    1. A filesystem with appropriate block size (up to the page size)
 4. For multi-fsblock atomic writes:

    1. The bigalloc feature must be enabled
    2. The cluster size must be appropriately configured

NOTE: EXT4 does not support software or COW based atomic writes, which means
atomic writes on ext4 are only supported if the underlying storage device
supports them.

Multi-fsblock Implementation Details
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The bigalloc feature changes ext4 to allocate in units of multiple filesystem
blocks, also known as clusters. With bigalloc, each bit in the block bitmap
represents a cluster (a power-of-2 number of blocks) rather than an individual
filesystem block.
EXT4 supports multi-fsblock atomic writes with bigalloc, subject to the
following constraints. The minimum atomic write size is the larger of the fs
block size and the minimum hardware atomic write unit; the maximum atomic
write size is the smaller of the bigalloc cluster size and the maximum hardware
atomic write unit. Bigalloc ensures that all allocations are aligned to the
cluster size, which satisfies the LBA alignment requirements of the hardware
device if the start of the partition/logical volume is itself aligned correctly.
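
The size limits described above can be sketched as a small userspace helper.
This is only an illustration of the min/max rule, not the kernel's code; the
names ``awu_limits`` and ``awu_compute()`` are hypothetical, and all sizes are
assumed to be powers of two given in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the resulting atomic write unit limits. */
struct awu_limits {
	uint32_t min;	/* smallest atomic write size in bytes, 0 if unsupported */
	uint32_t max;	/* largest atomic write size in bytes, 0 if unsupported */
};

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/*
 * min = larger of the fs block size and the hardware minimum unit
 * max = smaller of the bigalloc cluster size and the hardware maximum unit
 *
 * Returns false (with zeroed limits) when the hardware reports no atomic
 * write support or when the resulting range is empty.
 */
bool awu_compute(uint32_t fs_block, uint32_t cluster,
		 uint32_t hw_min, uint32_t hw_max,
		 struct awu_limits *out)
{
	out->min = 0;
	out->max = 0;

	if (hw_min == 0 || hw_max == 0)
		return false;

	uint32_t lo = max_u32(fs_block, hw_min);
	uint32_t hi = min_u32(cluster, hw_max);

	if (lo > hi)
		return false;

	out->min = lo;
	out->max = hi;
	return true;
}
```

With a 4KB block size, a 64KB cluster size and a device that reports units of
512 bytes to 16KB, this sketch yields a range of 4KB to 16KB.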

Here is the block allocation strategy in bigalloc for atomic writes:

 * For regions with fully mapped extents, no additional work is needed
 * For append writes, a new mapped extent is allocated
 * For regions that are entirely holes, an unwritten extent is created
 * For large unwritten extents, the extent gets split into two unwritten
   extents of appropriate requested size
 * For mixed mapping regions (combinations of holes, unwritten extents, or
   mapped extents), ext4_map_blocks() is called in a loop with
   EXT4_GET_BLOCKS_ZERO flag to convert the region into a single contiguous
   mapped extent by writing zeroes to it and converting any unwritten extents to
   written, if found within the range.

Note: Writing on a single contiguous underlying extent, whether mapped or
unwritten, is not inherently problematic. However, writing to a mixed mapping
region (i.e. one containing a combination of mapped and unwritten extents)
must be avoided when performing atomic writes.
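
As a toy model of what "mixed mapping" means here, the following sketch
classifies a range by per-block state. It is an assumption-laden
simplification: the enum and function names are hypothetical, physical
contiguity is ignored, and the kernel works on extents rather than on
per-block arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block mapping state used only for this illustration. */
enum blk_state { BLK_HOLE, BLK_UNWRITTEN, BLK_MAPPED };

/*
 * Returns true if the range contains more than one kind of mapping
 * (any combination of holes, unwritten blocks and mapped blocks).
 * A range of zero or one block is never mixed.
 */
bool range_is_mixed(const enum blk_state *blocks, size_t nr)
{
	for (size_t i = 1; i < nr; i++) {
		if (blocks[i] != blocks[0])
			return true;
	}
	return false;
}
```

A range for which this returns true corresponds to the last bullet of the
allocation strategy above, where the region is first converted into a single
mapped extent.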

The reason is that atomic writes, when issued via pwritev2() with the RWF_ATOMIC
flag, require that either all data is written or none at all. In the event of
a system crash or unexpected power loss during the write operation, the affected
region (when later read) must reflect either the complete old data or the
complete new data, but never a mix of both.

To enforce this guarantee, we ensure that the write target is backed by
a single, contiguous extent before any data is written. This is critical because
ext4 defers the conversion of unwritten extents to written extents until the I/O
completion path (typically in ->end_io()). If a write is allowed to proceed over
a mixed mapping region (with mapped and unwritten extents) and a failure occurs
mid-write, the system could observe partially updated regions after reboot, i.e.
new data over mapped areas, and stale (old) data over unwritten extents that
were never marked written. This violates the atomicity and/or torn write
prevention guarantee.

To prevent such torn writes, ext4 proactively allocates a single contiguous
extent for the entire requested region in ``ext4_iomap_alloc()`` via
``ext4_map_blocks_atomic()``. EXT4 also forces a commit of the current journal
transaction if the allocation is done over a mixed mapping. This ensures that
any pending metadata updates (such as unwritten-to-written extent conversion)
in this range are consistent with the file data blocks before the actual write
I/O is performed. If the commit fails, the whole I/O must be aborted to prevent
any possible torn writes.
Only after this step is the actual data write operation performed by iomap.

Handling Split Extents Across Leaf Blocks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There can be a special edge case where logically and physically contiguous
extents are stored in separate leaf nodes of the on-disk extent tree.
This occurs because on-disk extent tree merges only happen within a leaf
block, except in the case of a 2-level tree, which can get merged and
collapsed entirely into the inode.
If such a layout exists and, in the worst case, the extent status cache entries
are reclaimed due to memory pressure, ``ext4_map_blocks()`` may never return
a single contiguous extent for these split leaf extents.

To address this edge case, a new get block flag,
``EXT4_GET_BLOCKS_QUERY_LAST_IN_LEAF``, is added to enhance the
``ext4_map_query_blocks()`` lookup behavior.

This new get block flag allows ``ext4_map_blocks()`` to first check if there is
an entry in the extent status cache for the full range.
If not present, it consults the on-disk extent tree using
``ext4_map_query_blocks()``.
If the located extent is at the end of a leaf node, it probes the next logical
block (lblk) to detect a contiguous extent in the adjacent leaf.

For now only one additional leaf block is queried to maintain efficiency, as
atomic writes are typically constrained to small sizes
(e.g. [blocksize, clustersize]).
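
The "probe one adjacent leaf" idea can be sketched with a simplified extent
model. This is not the kernel implementation: ``struct toy_extent`` and both
functions are hypothetical stand-ins, and the sketch assumes the caller has
already located the first extent and, optionally, the extent that starts at
the next logical block in the neighbouring leaf.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an extent; not the on-disk struct ext4_extent. */
struct toy_extent {
	uint32_t lblk;		/* first logical block */
	uint64_t pblk;		/* first physical block */
	uint32_t len;		/* length in blocks */
	bool unwritten;		/* true for an unwritten extent */
};

/* True if b continues a logically and physically, with the same state. */
bool toy_extents_contiguous(const struct toy_extent *a,
			    const struct toy_extent *b)
{
	return b->lblk == a->lblk + a->len &&
	       b->pblk == a->pblk + a->len &&
	       b->unwritten == a->unwritten;
}

/*
 * Length of the mapping starting at 'first', extended by at most ONE
 * extent from the adjacent leaf. 'next' may be NULL when 'first' is not
 * the last extent in its leaf or when no neighbour exists.
 */
uint32_t toy_query_len(const struct toy_extent *first,
		       const struct toy_extent *next)
{
	if (next && toy_extents_contiguous(first, next))
		return first->len + next->len;
	return first->len;
}
```

Limiting the probe to a single neighbour mirrors the efficiency trade-off
described above.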


Handling Journal transactions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To support multi-fsblock atomic writes, we ensure enough journal credits are
reserved during:

 1. Block allocation time in ``ext4_iomap_alloc()``. We first query if there
    could be a mixed mapping for the underlying requested range. If yes, then we
    reserve credits of up to ``m_len``, assuming every alternate block can be
    an unwritten extent followed by a hole.

 2. During ``->end_io()`` call, we make sure a single transaction is started for
    doing unwritten-to-written conversion. The loop for conversion is mainly
    only required to handle a split extent across leaf blocks.

How to
------

Creating Filesystems with Atomic Write Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First check the atomic write units supported by block device.
See :ref:`atomic_write_bdev_support` for more details.

For single-fsblock atomic writes with a larger block size
(on systems with block size < page size):

.. code-block:: bash

    # Create an ext4 filesystem with a 16KB block size
    # (requires page size >= 16KB)
    mkfs.ext4 -b 16384 /dev/device

For multi-fsblock atomic writes with bigalloc:

.. code-block:: bash

    # Create an ext4 filesystem with bigalloc and 64KB cluster size
    mkfs.ext4 -F -O bigalloc -b 4096 -C 65536 /dev/device

Where ``-b`` specifies the block size, ``-C`` specifies the cluster size in bytes,
and ``-O bigalloc`` enables the bigalloc feature.

Application Interface
~~~~~~~~~~~~~~~~~~~~~

Applications can use the ``pwritev2()`` system call with the ``RWF_ATOMIC`` flag
to perform atomic writes:

.. code-block:: c

    pwritev2(fd, iov, iovcnt, offset, RWF_ATOMIC);

The write must be aligned to the filesystem's block size and not exceed the
filesystem's maximum atomic write unit size.
See ``generic_atomic_write_valid()`` for more details.
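
A slightly fuller sketch of the call is shown below. The helper names are
hypothetical. The pre-check reflects my understanding of the kernel-side
validation (power-of-two length, offset naturally aligned to the length) plus
the unit limits reported by ``statx()``; treat it as an assumption rather than
a specification. The ``RWF_ATOMIC`` fallback value is taken from
``include/uapi/linux/fs.h`` for build environments whose libc headers predate
the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040	/* fallback for older libc headers */
#endif

/* Userspace pre-check of an atomic write request (illustrative only). */
bool atomic_write_args_ok(off_t offset, size_t len,
			  size_t unit_min, size_t unit_max)
{
	if (len == 0 || (len & (len - 1)) != 0)
		return false;	/* length must be a power of two */
	if (len < unit_min || len > unit_max)
		return false;	/* outside the supported unit range */
	if (offset < 0 || ((uint64_t)offset % len) != 0)
		return false;	/* offset must be aligned to the length */
	return true;
}

/*
 * Issue one atomic write using a single iovec. The caller is expected to
 * have opened fd with O_DIRECT and to pass a suitably aligned buffer
 * (for example from posix_memalign()).
 */
ssize_t atomic_pwrite(int fd, void *buf, size_t len, off_t offset,
		      size_t unit_min, size_t unit_max)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	if (!atomic_write_args_ok(offset, len, unit_min, unit_max)) {
		errno = EINVAL;
		return -1;
	}
	return pwritev2(fd, &iov, 1, offset, RWF_ATOMIC);
}
```

Passing exactly one iovec matches the segment limit that ``statx()`` currently
reports for atomic writes.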

The ``statx()`` system call with the ``STATX_WRITE_ATOMIC`` flag provides the
following details:

 * ``stx_atomic_write_unit_min``: Minimum size of an atomic write request.
 * ``stx_atomic_write_unit_max``: Maximum size of an atomic write request.
 * ``stx_atomic_write_segments_max``: Upper limit for segments, i.e. the number
   of separate memory buffers that can be gathered into a write operation
   (e.g. the ``iovcnt`` parameter of ``pwritev2()``). Currently, this is always
   set to one.

The ``STATX_ATTR_WRITE_ATOMIC`` flag in ``statx->stx_attributes`` is set if
atomic writes are supported.
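
A minimal sketch of such a query follows. The function name is hypothetical,
and the ``#ifdef`` guard is an assumption about the build environment: with
kernel or libc headers that predate ``STATX_WRITE_ATOMIC`` the function simply
reports "unsupported".

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Returns 0 and fills unit_min/unit_max (in bytes) if the file reports
 * atomic write support, or -1 if the query fails or is unsupported.
 */
int query_atomic_write_units(const char *path,
			     unsigned int *unit_min, unsigned int *unit_max)
{
#ifdef STATX_WRITE_ATOMIC
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_WRITE_ATOMIC, &stx) != 0)
		return -1;
	/* The filesystem did not fill in the atomic write fields. */
	if (!(stx.stx_mask & STATX_WRITE_ATOMIC))
		return -1;
	/* Atomic writes are not supported for this file. */
	if (!(stx.stx_attributes & STATX_ATTR_WRITE_ATOMIC))
		return -1;

	*unit_min = stx.stx_atomic_write_unit_min;
	*unit_max = stx.stx_atomic_write_unit_max;
	return 0;
#else
	(void)path;
	(void)unit_min;
	(void)unit_max;
	return -1;
#endif
}
```

The returned limits can then be used to size and align the buffer passed to
``pwritev2()``.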

.. _atomic_write_bdev_support:

Hardware Support
----------------

The underlying storage device must support atomic write operations.
Modern NVMe and SCSI devices often provide this capability.
The Linux kernel exposes this information through sysfs:

* ``/sys/block/<device>/queue/atomic_write_unit_min_bytes`` - Minimum atomic write size
* ``/sys/block/<device>/queue/atomic_write_unit_max_bytes`` - Maximum atomic write size

Nonzero values for these attributes indicate that the device supports
atomic writes.
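
Checking these attributes from a program could look like the sketch below.
Both function names are hypothetical, and the caller passes the full paths of
the minimum and maximum attributes listed above for the device in question.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_ulong_attr(const char *path, unsigned long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Returns 1 if both attributes are nonzero, 0 if either is zero,
 * and -1 if an attribute could not be read.
 */
int atomic_write_supported(const char *min_path, const char *max_path)
{
	unsigned long unit_min, unit_max;

	if (read_ulong_attr(min_path, &unit_min) != 0)
		return -1;
	if (read_ulong_attr(max_path, &unit_max) != 0)
		return -1;
	return unit_min > 0 && unit_max > 0;
}
```

A return value of -1 typically means the attribute does not exist, for example
on a kernel without atomic write support in the block layer.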

See Also
--------

* :doc:`bigalloc` - Documentation on the bigalloc feature
* :doc:`allocators` - Documentation on block allocation in ext4
* Support for atomic block writes in 6.13:
  https://lwn.net/Articles/1009298/
+1 −0
@@ -25,3 +25,4 @@ order.
.. include:: inlinedata.rst
.. include:: eainode.rst
.. include:: verity.rst
.. include:: atomic_writes.rst
+4 −4
@@ -30,7 +30,7 @@ int ext4_inode_bitmap_csum_verify(struct super_block *sb,

	sz = EXT4_INODES_PER_GROUP(sb) >> 3;
	provided = le16_to_cpu(gdp->bg_inode_bitmap_csum_lo);
	calculated = ext4_chksum(sbi, sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	calculated = ext4_chksum(sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	if (sbi->s_desc_size >= EXT4_BG_INODE_BITMAP_CSUM_HI_END) {
		hi = le16_to_cpu(gdp->bg_inode_bitmap_csum_hi);
		provided |= (hi << 16);
@@ -52,7 +52,7 @@ void ext4_inode_bitmap_csum_set(struct super_block *sb,
		return;

	sz = EXT4_INODES_PER_GROUP(sb) >> 3;
	csum = ext4_chksum(sbi, sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	csum = ext4_chksum(sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	gdp->bg_inode_bitmap_csum_lo = cpu_to_le16(csum & 0xFFFF);
	if (sbi->s_desc_size >= EXT4_BG_INODE_BITMAP_CSUM_HI_END)
		gdp->bg_inode_bitmap_csum_hi = cpu_to_le16(csum >> 16);
@@ -71,7 +71,7 @@ int ext4_block_bitmap_csum_verify(struct super_block *sb,
		return 1;

	provided = le16_to_cpu(gdp->bg_block_bitmap_csum_lo);
	calculated = ext4_chksum(sbi, sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	calculated = ext4_chksum(sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	if (sbi->s_desc_size >= EXT4_BG_BLOCK_BITMAP_CSUM_HI_END) {
		hi = le16_to_cpu(gdp->bg_block_bitmap_csum_hi);
		provided |= (hi << 16);
@@ -92,7 +92,7 @@ void ext4_block_bitmap_csum_set(struct super_block *sb,
	if (!ext4_has_feature_metadata_csum(sb))
		return;

	csum = ext4_chksum(sbi, sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	csum = ext4_chksum(sbi->s_csum_seed, (__u8 *)bh->b_data, sz);
	gdp->bg_block_bitmap_csum_lo = cpu_to_le16(csum & 0xFFFF);
	if (sbi->s_desc_size >= EXT4_BG_BLOCK_BITMAP_CSUM_HI_END)
		gdp->bg_block_bitmap_csum_hi = cpu_to_le16(csum >> 16);
+71 −20
@@ -256,9 +256,19 @@ struct ext4_allocation_request {
#define EXT4_MAP_UNWRITTEN	BIT(BH_Unwritten)
#define EXT4_MAP_BOUNDARY	BIT(BH_Boundary)
#define EXT4_MAP_DELAYED	BIT(BH_Delay)
/*
 * This is for use in ext4_map_query_blocks() for a special case where we can
 * have a physically and logically contiguous blocks split across two leaf
 * nodes instead of a single extent. This is required in case of atomic writes
 * to know whether the returned extent is last in leaf. If yes, then lookup for
 * next in leaf block in ext4_map_query_blocks_next_in_leaf().
 * - This is never going to be added to any buffer head state.
 * - We use the next available bit after BH_BITMAP_UPTODATE.
 */
#define EXT4_MAP_QUERY_LAST_IN_LEAF	BIT(BH_BITMAP_UPTODATE + 1)
#define EXT4_MAP_FLAGS		(EXT4_MAP_NEW | EXT4_MAP_MAPPED |\
				 EXT4_MAP_UNWRITTEN | EXT4_MAP_BOUNDARY |\
				 EXT4_MAP_DELAYED)
				 EXT4_MAP_DELAYED | EXT4_MAP_QUERY_LAST_IN_LEAF)

struct ext4_map_blocks {
	ext4_fsblk_t m_pblk;
@@ -706,9 +716,6 @@ enum {
#define EXT4_GET_BLOCKS_CONVERT			0x0010
#define EXT4_GET_BLOCKS_IO_CREATE_EXT		(EXT4_GET_BLOCKS_PRE_IO|\
					 EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT)
	/* Convert extent to initialized after IO complete */
#define EXT4_GET_BLOCKS_IO_CONVERT_EXT		(EXT4_GET_BLOCKS_CONVERT|\
					 EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT)
	/* Eventual metadata allocation (due to growing extent tree)
	 * should not fail, so try to use reserved blocks for that.*/
#define EXT4_GET_BLOCKS_METADATA_NOFAIL		0x0020
@@ -720,11 +727,23 @@ enum {
#define EXT4_GET_BLOCKS_ZERO			0x0200
#define EXT4_GET_BLOCKS_CREATE_ZERO		(EXT4_GET_BLOCKS_CREATE |\
					EXT4_GET_BLOCKS_ZERO)
	/* Caller will submit data before dropping transaction handle. This
	 * allows jbd2 to avoid submitting data before commit. */
	/* Caller is in the context of data submission, such as writeback,
	 * fsync, etc. Especially, in the generic writeback path, caller will
	 * submit data before dropping transaction handle. This allows jbd2
	 * to avoid submitting data before commit. */
#define EXT4_GET_BLOCKS_IO_SUBMIT		0x0400
	/* Convert extent to initialized after IO complete */
#define EXT4_GET_BLOCKS_IO_CONVERT_EXT		(EXT4_GET_BLOCKS_CONVERT |\
					 EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT |\
					 EXT4_GET_BLOCKS_IO_SUBMIT)
	/* Caller is in the atomic context, find extent if it has been cached */
#define EXT4_GET_BLOCKS_CACHED_NOWAIT		0x0800
/*
 * Atomic write caller needs this to query in the slow path of mixed mapping
 * case, when a contiguous extent can be split across two adjacent leaf nodes.
 * Look EXT4_MAP_QUERY_LAST_IN_LEAF.
 */
#define EXT4_GET_BLOCKS_QUERY_LAST_IN_LEAF	0x1000

/*
 * The bit position of these flags must not overlap with any of the
@@ -738,6 +757,13 @@ enum {
#define EXT4_EX_NOCACHE				0x40000000
#define EXT4_EX_FORCE_CACHE			0x20000000
#define EXT4_EX_NOFAIL				0x10000000
/*
 * ext4_map_query_blocks() uses this filter mask to filter the flags needed to
 * pass while lookup/querying of on disk extent tree.
 */
#define EXT4_EX_QUERY_FILTER	(EXT4_EX_NOCACHE | EXT4_EX_FORCE_CACHE |\
				 EXT4_EX_NOFAIL |\
				 EXT4_GET_BLOCKS_QUERY_LAST_IN_LEAF)

/*
 * Flags used by ext4_free_blocks
@@ -1061,16 +1087,16 @@ struct ext4_inode_info {
	/* End of lblk range that needs to be committed in this fast commit */
	ext4_lblk_t i_fc_lblk_len;

	/* Number of ongoing updates on this inode */
	atomic_t  i_fc_updates;

	spinlock_t i_raw_lock;	/* protects updates to the raw inode */

	/* Fast commit wait queue for this inode */
	wait_queue_head_t i_fc_wait;

	/* Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len */
	struct mutex i_fc_lock;
	/*
	 * Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len
	 * and inode's EXT4_FC_STATE_COMMITTING state bit.
	 */
	spinlock_t i_fc_lock;

	/*
	 * i_disksize keeps track of what the inode size is ON DISK, not
@@ -1754,7 +1780,7 @@ struct ext4_sb_info {
	 * following fields:
	 * ei->i_fc_list, s_fc_dentry_q, s_fc_q, s_fc_bytes, s_fc_bh.
	 */
	spinlock_t s_fc_lock;
	struct mutex s_fc_lock;
	struct buffer_head *s_fc_bh;
	struct ext4_fc_stats s_fc_stats;
	tid_t s_fc_ineligible_tid;
@@ -1913,6 +1939,7 @@ enum {
	EXT4_STATE_LUSTRE_EA_INODE,	/* Lustre-style ea_inode */
	EXT4_STATE_VERITY_IN_PROGRESS,	/* building fs-verity Merkle tree */
	EXT4_STATE_FC_COMMITTING,	/* Fast commit ongoing */
	EXT4_STATE_FC_FLUSHING_DATA,	/* Fast commit flushing data */
	EXT4_STATE_ORPHAN_FILE,		/* Inode orphaned in orphan file */
};

@@ -2295,10 +2322,12 @@ static inline int ext4_emergency_state(struct super_block *sb)
#define EXT4_DEFM_NODELALLOC	0x0800

/*
 * Default journal batch times
 * Default journal batch times and ioprio.
 */
#define EXT4_DEF_MIN_BATCH_TIME	0
#define EXT4_DEF_MAX_BATCH_TIME	15000 /* 15ms */
#define EXT4_DEF_JOURNAL_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))


/*
 * Default values for superblock update
@@ -2487,8 +2516,7 @@ static inline __le16 ext4_rec_len_to_disk(unsigned len, unsigned blocksize)
#define DX_HASH_SIPHASH			6
#define DX_HASH_LAST 			DX_HASH_SIPHASH

static inline u32 ext4_chksum(struct ext4_sb_info *sbi, u32 crc,
			      const void *address, unsigned int length)
static inline u32 ext4_chksum(u32 crc, const void *address, unsigned int length)
{
	return crc32c(crc, address, length);
}
@@ -2922,8 +2950,6 @@ void __ext4_fc_track_create(handle_t *handle, struct inode *inode,
void ext4_fc_track_create(handle_t *handle, struct dentry *dentry);
void ext4_fc_track_inode(handle_t *handle, struct inode *inode);
void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handle);
void ext4_fc_start_update(struct inode *inode);
void ext4_fc_stop_update(struct inode *inode);
void ext4_fc_del(struct inode *inode);
bool ext4_fc_replay_check_excluded(struct super_block *sb, ext4_fsblk_t block);
void ext4_fc_replay_cleanup(struct super_block *sb);
@@ -2973,6 +2999,7 @@ static inline bool ext4_mb_cr_expensive(enum criteria cr)
void ext4_inode_csum_set(struct inode *inode, struct ext4_inode *raw,
			 struct ext4_inode_info *ei);
int ext4_inode_is_fast_symlink(struct inode *inode);
void ext4_check_map_extents_env(struct inode *inode);
struct buffer_head *ext4_getblk(handle_t *, struct inode *, ext4_lblk_t, int);
struct buffer_head *ext4_bread(handle_t *, struct inode *, ext4_lblk_t, int);
int ext4_bread_batch(struct inode *inode, ext4_lblk_t block, int bh_count,
@@ -2993,6 +3020,7 @@ int ext4_walk_page_buffers(handle_t *handle,
				     struct buffer_head *bh));
int do_journal_get_write_access(handle_t *handle, struct inode *inode,
				struct buffer_head *bh);
bool ext4_should_enable_large_folio(struct inode *inode);
#define FALL_BACK_TO_NONDELALLOC 1
#define CONVERT_INLINE_DATA	 2

@@ -3039,6 +3067,8 @@ extern void ext4_set_aops(struct inode *inode);
extern int ext4_writepage_trans_blocks(struct inode *);
extern int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode);
extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks);
extern int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
				  int pextents);
extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
			     loff_t lstart, loff_t lend);
extern vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf);
@@ -3050,6 +3080,17 @@ extern void ext4_da_update_reserve_space(struct inode *inode,
extern int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk,
			      ext4_fsblk_t pblk, ext4_lblk_t len);

static inline bool is_special_ino(struct super_block *sb, unsigned long ino)
{
	struct ext4_super_block *es = EXT4_SB(sb)->s_es;

	return (ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) ||
		ino == le32_to_cpu(es->s_usr_quota_inum) ||
		ino == le32_to_cpu(es->s_grp_quota_inum) ||
		ino == le32_to_cpu(es->s_prj_quota_inum) ||
		ino == le32_to_cpu(es->s_orphan_file_inum);
}

/* indirect.c */
extern int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
				struct ext4_map_blocks *map, int flags);
@@ -3119,8 +3160,7 @@ extern int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wa
extern void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block);
extern int ext4_seq_options_show(struct seq_file *seq, void *offset);
extern int ext4_calculate_overhead(struct super_block *sb);
extern __le32 ext4_superblock_csum(struct super_block *sb,
				   struct ext4_super_block *es);
extern __le32 ext4_superblock_csum(struct ext4_super_block *es);
extern void ext4_superblock_csum_set(struct super_block *sb);
extern int ext4_alloc_flex_bg_array(struct super_block *sb,
				    ext4_group_t ngroup);
@@ -3378,6 +3418,13 @@ static inline unsigned int ext4_flex_bg_size(struct ext4_sb_info *sbi)
	return 1 << sbi->s_log_groups_per_flex;
}

static inline loff_t ext4_get_maxbytes(struct inode *inode)
{
	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
		return inode->i_sb->s_maxbytes;
	return EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
}

#define ext4_std_error(sb, errno)				\
do {								\
	if ((errno))						\
@@ -3710,6 +3757,8 @@ extern long ext4_fallocate(struct file *file, int mode, loff_t offset,
			  loff_t len);
extern int ext4_convert_unwritten_extents(handle_t *handle, struct inode *inode,
					  loff_t offset, ssize_t len);
extern int ext4_convert_unwritten_extents_atomic(handle_t *handle,
			struct inode *inode, loff_t offset, ssize_t len);
extern int ext4_convert_unwritten_io_end_vec(handle_t *handle,
					     ext4_io_end_t *io_end);
extern int ext4_map_blocks(handle_t *handle, struct inode *inode,
@@ -3847,7 +3896,9 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
static inline bool ext4_inode_can_atomic_write(struct inode *inode)
{

	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;
	return S_ISREG(inode->i_mode) &&
		ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
		EXT4_SB(inode->i_sb)->s_awu_min > 0;
}

extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
+2 −1
@@ -16,7 +16,8 @@ int ext4_inode_journal_mode(struct inode *inode)
	    ext4_test_inode_flag(inode, EXT4_INODE_EA_INODE) ||
	    test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA ||
	    (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA) &&
	    !test_opt(inode->i_sb, DELALLOC))) {
	    !test_opt(inode->i_sb, DELALLOC) &&
	    !mapping_large_folio_support(inode->i_mapping))) {
		/* We do not support data journalling for encrypted data */
		if (S_ISREG(inode->i_mode) && IS_ENCRYPTED(inode))
			return EXT4_INODE_ORDERED_DATA_MODE;  /* ordered */