Commit 4b84a4c8 authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull misc vfs updates from Christian Brauner:
 "Features:

   - Support caching symlink lengths in inodes

     The size is stored in a new union utilizing the same space as
     i_devices, thus avoiding growing the struct or taking up any more
     space

     When utilized it dodges strlen() in vfs_readlink(), giving about
     1.5% speed up when issuing readlink on /initrd.img on ext4

   - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag

     If a file system supports uncached buffered IO, it may set
     FOP_DONTCACHE and enable support for RWF_DONTCACHE.

     If RWF_DONTCACHE is attempted without the file system supporting
     it, it'll get errored with -EOPNOTSUPP

   - Enable VBOXGUEST and VBOXSF_FS on ARM64

     Now that VirtualBox is able to run as a host on arm64 (e.g. the
     Apple M3 processors) we can enable VBOXSF_FS (and in turn
     VBOXGUEST) for this architecture.

     Tested with various runs of bonnie++ and dbench on an Apple MacBook
     Pro with the latest Virtualbox 7.1.4 r165100 installed

  Cleanups:

   - Delay sysctl_nr_open check in expand_files()

   - Use kernel-doc includes in fiemap docbook

   - Use page->private instead of page->index in watch_queue

   - Use a consume fence in mnt_idmap() as it's heavily used in
     link_path_walk()

   - Replace magic number 7 with ARRAY_SIZE() in fc_log

   - Sort out a stale comment about races between fd alloc and dup2()

   - Fix return type of do_mount() from long to int

   - Various cosmetic cleanups for the lockref code

  Fixes:

   - Annotate spinning as unlikely() in __read_seqcount_begin

     The annotation already used to be there, but got lost in commit
     52ac39e5 ("seqlock: seqcount_t: Implement all read APIs as
     statement expressions")

   - Fix proc_handler for sysctl_nr_open

   - Flush delayed work in delayed fput()

   - Fix grammar and spelling in propagate_umount()

   - Fix ESP not readable during coredump

     In /proc/PID/stat, there is the kstkesp field which is the stack
     pointer of a thread. While the thread is active, this field reads
     zero. But during a coredump, it should have a valid value

     However, at the moment, kstkesp is zero even during coredump

   - Don't wake up the writer if the pipe is still full

   - Fix unbalanced user_access_end() in select code"

* tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  gfs2: use lockref_init for qd_lockref
  erofs: use lockref_init for pcl->lockref
  dcache: use lockref_init for d_lockref
  lockref: add a lockref_init helper
  lockref: drop superfluous externs
  lockref: use bool for false/true returns
  lockref: improve the lockref_get_not_zero description
  lockref: remove lockref_put_not_zero
  fs: Fix return type of do_mount() from long to int
  select: Fix unbalanced user_access_end()
  vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64
  pipe_read: don't wake up the writer if the pipe is still full
  selftests: coredump: Add stackdump test
  fs/proc: do_task_stat: Fix ESP not readable during coredump
  fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
  fs: sort out a stale comment about races between fd alloc and dup2
  fs: Fix grammar and spelling in propagate_umount()
  fs: fc_log replace magic number 7 with ARRAY_SIZE()
  fs: use a consume fence in mnt_idmap()
  file: flush delayed work in delayed fput()
  ...
parents d5829524 c859df52
Loading
Loading
Loading
Loading
+15 −34
Original line number Diff line number Diff line
@@ -12,21 +12,10 @@ returns a list of extents.
Request Basics
--------------

A fiemap request is encoded within struct fiemap::

  struct fiemap {
	__u64	fm_start;	 /* logical offset (inclusive) at
				  * which to start mapping (in) */
	__u64	fm_length;	 /* logical length of mapping which
				  * userspace cares about (in) */
	__u32	fm_flags;	 /* FIEMAP_FLAG_* flags for request (in/out) */
	__u32	fm_mapped_extents; /* number of extents that were
				    * mapped (out) */
	__u32	fm_extent_count; /* size of fm_extents array (in) */
	__u32	fm_reserved;
	struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
  };
A fiemap request is encoded within struct fiemap:

.. kernel-doc:: include/uapi/linux/fiemap.h
   :identifiers: fiemap

fm_start, and fm_length specify the logical range within the file
which the process would like mappings for. Extents returned mirror
@@ -60,6 +49,8 @@ FIEMAP_FLAG_XATTR
  If this flag is set, the extents returned will describe the inodes
  extended attribute lookup tree, instead of its data tree.

FIEMAP_FLAG_CACHE
  This flag requests caching of the extents.

Extent Mapping
--------------
@@ -77,18 +68,10 @@ complete the requested range and will not have the FIEMAP_EXTENT_LAST
flag set (see the next section on extent flags).

Each extent is described by a single fiemap_extent structure as
returned in fm_extents::

    struct fiemap_extent {
	    __u64	fe_logical;  /* logical offset in bytes for the start of
				* the extent */
	    __u64	fe_physical; /* physical offset in bytes for the start
				* of the extent */
	    __u64	fe_length;   /* length in bytes for the extent */
	    __u64	fe_reserved64[2];
	    __u32	fe_flags;    /* FIEMAP_EXTENT_* flags for this extent */
	    __u32	fe_reserved[3];
    };
returned in fm_extents:

.. kernel-doc:: include/uapi/linux/fiemap.h
    :identifiers: fiemap_extent

All offsets and lengths are in bytes and mirror those on disk.  It is valid
for an extents logical offset to start before the request or its logical
@@ -175,6 +158,8 @@ FIEMAP_EXTENT_MERGED
  userspace would be highly inefficient, the kernel will try to merge most
  adjacent blocks into 'extents'.

FIEMAP_EXTENT_SHARED
  This flag is set to request that space be shared with other files.

VFS -> File System Implementation
---------------------------------
@@ -191,14 +176,10 @@ each discovered extent::
                     u64 len);

->fiemap is passed struct fiemap_extent_info which describes the
fiemap request::

  struct fiemap_extent_info {
	unsigned int fi_flags;		/* Flags as passed from user */
	unsigned int fi_extents_mapped;	/* Number of mapped extents */
	unsigned int fi_extents_max;	/* Size of fiemap_extent array */
	struct fiemap_extent *fi_extents_start;	/* Start of fiemap_extent array */
  };
fiemap request:

.. kernel-doc:: include/linux/fiemap.h
    :identifiers: fiemap_extent_info

It is intended that the file system should not need to access any of this
structure directly. Filesystem handlers should be tolerant to signals and return
+1 −1
Original line number Diff line number Diff line
# SPDX-License-Identifier: GPL-2.0-only
config VBOXGUEST
	tristate "Virtual Box Guest integration support"
	depends on X86 && PCI && INPUT
	depends on (ARM64 || X86) && PCI && INPUT
	help
	  This is a driver for the Virtual Box Guest PCI device used in
	  Virtual Box virtual machines. Enabling this driver will add
+1 −2
Original line number Diff line number Diff line
@@ -1681,9 +1681,8 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
	/* Make sure we always see the terminating NUL character */
	smp_store_release(&dentry->d_name.name, dname); /* ^^^ */

	dentry->d_lockref.count = 1;
	dentry->d_flags = 0;
	spin_lock_init(&dentry->d_lock);
	lockref_init(&dentry->d_lockref, 1);
	seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);
	dentry->d_inode = NULL;
	dentry->d_parent = dentry;
+1 −2
Original line number Diff line number Diff line
@@ -747,8 +747,7 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe)
	if (IS_ERR(pcl))
		return PTR_ERR(pcl);

	spin_lock_init(&pcl->lockref.lock);
	pcl->lockref.count = 1;		/* one ref for this request */
	lockref_init(&pcl->lockref, 1); /* one ref for this request */
	pcl->algorithmformat = map->m_algorithmformat;
	pcl->length = 0;
	pcl->partial = true;
+2 −1
Original line number Diff line number Diff line
@@ -5006,10 +5006,11 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
		if (IS_ENCRYPTED(inode)) {
			inode->i_op = &ext4_encrypted_symlink_inode_operations;
		} else if (ext4_inode_is_fast_symlink(inode)) {
			inode->i_link = (char *)ei->i_data;
			inode->i_op = &ext4_fast_symlink_inode_operations;
			nd_terminate_link(ei->i_data, inode->i_size,
				sizeof(ei->i_data) - 1);
			inode_set_cached_link(inode, (char *)ei->i_data,
					      inode->i_size);
		} else {
			inode->i_op = &ext4_symlink_inode_operations;
		}
Loading