mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
synced 2026-04-18 03:23:53 -04:00
Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:

 "- statmount: accept fd as a parameter

    Extend struct mnt_id_req with a file descriptor field and a new
    STATMOUNT_BY_FD flag. When set, statmount() returns mount
    information for the mount the fd resides on, including detached
    mounts (unmounted via umount2(MNT_DETACH)).

    For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
    mask bits are cleared since neither is meaningful. The capability
    check is skipped for STATMOUNT_BY_FD since holding an fd already
    implies prior access to the mount and equivalent information is
    available through fstatfs() and /proc/pid/mountinfo without
    privilege. Includes comprehensive selftests covering both attached
    and detached mount cases.

  - fs: Remove internal old mount API code (1 patch)

    Now that every in-tree filesystem has been converted to the new
    mount API, remove all the legacy shim code in fs_context.c that
    handled unconverted filesystems. This deletes ~280 lines including
    legacy_init_fs_context(), the legacy_fs_context struct, and
    associated wrappers. The mount(2) syscall path for userspace remains
    untouched. Documentation references to the legacy callbacks are
    cleaned up.

  - mount: add OPEN_TREE_NAMESPACE to open_tree()

    Container runtimes currently use CLONE_NEWNS to copy the caller's
    entire mount namespace, only to then pivot_root() and recursively
    unmount everything they just copied. With large mount tables and
    thousands of parallel container launches this creates significant
    contention on the namespace semaphore.

    OPEN_TREE_NAMESPACE copies only the specified mount tree (like
    OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
    detached mount fd. The new namespace contains the copied tree
    mounted on top of a clone of the real rootfs.

    This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in
    a single syscall. Works with user namespaces: an
    unshare(CLONE_NEWUSER) followed by OPEN_TREE_NAMESPACE creates a
    mount namespace owned by the new user namespace. Mount namespace
    file mounts are excluded from the copy to prevent cycles. Includes
    ~1000 lines of selftests"

* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/open_tree: add OPEN_TREE_NAMESPACE tests
  mount: add OPEN_TREE_NAMESPACE
  fs: Remove internal old mount API code
  selftests: statmount: tests for STATMOUNT_BY_FD
  statmount: accept fd as a parameter
  statmount: permission check should return EPERM
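
The statmount entry in the message above says that the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID mask bits are cleared for detached mounts. A caller should therefore probably test the returned mask before reading the corresponding fields. The following is only a minimal userspace sketch of that check. It does not call statmount() itself, the helper names and the SKETCH_ macros are hypothetical, and the bit values are assumed to match include/uapi/linux/mount.h.

```c
#include <stdint.h>

/*
 * Local stand-ins for two statmount() mask bits. The values are assumed to
 * match STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID in
 * include/uapi/linux/mount.h; real code should include <linux/mount.h>
 * and use the kernel's own definitions.
 */
#define SKETCH_STATMOUNT_MNT_POINT	0x00000010U
#define SKETCH_STATMOUNT_MNT_NS_ID	0x00000040U

/* Hypothetical helper: nonzero if the returned mask reports a mount point. */
static int sketch_has_mnt_point(uint64_t returned_mask)
{
	return (returned_mask & SKETCH_STATMOUNT_MNT_POINT) != 0;
}

/* Hypothetical helper: nonzero if the returned mask reports a mount namespace ID. */
static int sketch_has_mnt_ns_id(uint64_t returned_mask)
{
	return (returned_mask & SKETCH_STATMOUNT_MNT_NS_ID) != 0;
}

/*
 * Hypothetical helper: choose a printable mount point. Falls back to a
 * caller-supplied placeholder when the bit was cleared, as the commit
 * message describes for detached mounts.
 */
static const char *sketch_mnt_point_or(uint64_t returned_mask,
				       const char *mnt_point,
				       const char *fallback)
{
	return sketch_has_mnt_point(returned_mask) ? mnt_point : fallback;
}
```

In real code the mask would come from the mask field of the struct statmount buffer filled in by the syscall; here it is passed in directly so the check can be exercised without any privileges or a recent kernel.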
@@ -180,7 +180,6 @@ prototypes::
 int (*freeze_fs) (struct super_block *);
 int (*unfreeze_fs) (struct super_block *);
 int (*statfs) (struct dentry *, struct kstatfs *);
-int (*remount_fs) (struct super_block *, int *, char *);
 void (*umount_begin) (struct super_block *);
 int (*show_options)(struct seq_file *, struct dentry *);
 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
@@ -204,7 +203,6 @@ sync_fs: read
 freeze_fs: write
 unfreeze_fs: write
 statfs: maybe(read) (see below)
-remount_fs: write
 umount_begin: no
 show_options: no (namespace_sem)
 quota_read: no (see below)
@@ -229,8 +227,6 @@ file_system_type
 
 prototypes::
 
-struct dentry *(*mount) (struct file_system_type *, int,
-const char *, void *);
 void (*kill_sb) (struct super_block *);
 
 locking rules:
@@ -238,13 +234,9 @@ locking rules:
 ======= =========
 ops may block
 ======= =========
-mount yes
 kill_sb yes
 ======= =========
 
-->mount() returns ERR_PTR or the root dentry; its superblock should be locked
-on return.
-
 ->kill_sb() takes a write-locked superblock, does all shutdown work on it,
 unlocks and drops the reference.
 
@@ -299,8 +299,6 @@ manage the filesystem context. They are as follows:
 On success it should return 0. In the case of an error, it should return
 a negative error code.
 
-.. Note:: reconfigure is intended as a replacement for remount_fs.
-
 
 Filesystem context Security
 ===========================
@@ -448,11 +448,8 @@ a file off.
 
 **mandatory**
 
-->get_sb() is gone. Switch to use of ->mount(). Typically it's just
-a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
-the function type. If you were doing it manually, just switch from setting
-->mnt_root to some pointer to returning that pointer. On errors return
-ERR_PTR(...).
+->get_sb() and ->mount() are gone. Switch to using the new mount API. See
+Documentation/filesystems/mount_api.rst for more details.
 
 ---
 
@@ -94,11 +94,9 @@ functions:
 
 The passed struct file_system_type describes your filesystem. When a
 request is made to mount a filesystem onto a directory in your
-namespace, the VFS will call the appropriate mount() method for the
-specific filesystem. New vfsmount referring to the tree returned by
-->mount() will be attached to the mountpoint, so that when pathname
-resolution reaches the mountpoint it will jump into the root of that
-vfsmount.
+namespace, the VFS will call the appropriate get_tree() method for the
+specific filesystem. See Documentation/filesystems/mount_api.rst
+for more details.
 
 You can see all filesystems that are registered to the kernel in the
 file /proc/filesystems.
@@ -117,8 +115,6 @@ members are defined:
 int fs_flags;
 int (*init_fs_context)(struct fs_context *);
 const struct fs_parameter_spec *parameters;
-struct dentry *(*mount) (struct file_system_type *, int,
-const char *, void *);
 void (*kill_sb) (struct super_block *);
 struct module *owner;
 struct file_system_type * next;
@@ -151,10 +147,6 @@ members are defined:
 'struct fs_parameter_spec'.
 More info in Documentation/filesystems/mount_api.rst.
 
-``mount``
-the method to call when a new instance of this filesystem should
-be mounted
-
 ``kill_sb``
 the method to call when an instance of this filesystem should be
 shut down
@@ -173,45 +165,6 @@ members are defined:
 s_lock_key, s_umount_key, s_vfs_rename_key, s_writers_key,
 i_lock_key, i_mutex_key, invalidate_lock_key, i_mutex_dir_key: lockdep-specific
 
-The mount() method has the following arguments:
-
-``struct file_system_type *fs_type``
-describes the filesystem, partly initialized by the specific
-filesystem code
-
-``int flags``
-mount flags
-
-``const char *dev_name``
-the device name we are mounting.
-
-``void *data``
-arbitrary mount options, usually comes as an ASCII string (see
-"Mount Options" section)
-
-The mount() method must return the root dentry of the tree requested by
-caller. An active reference to its superblock must be grabbed and the
-superblock must be locked. On failure it should return ERR_PTR(error).
-
-The arguments match those of mount(2) and their interpretation depends
-on filesystem type. E.g. for block filesystems, dev_name is interpreted
-as block device name, that device is opened and if it contains a
-suitable filesystem image the method creates and initializes struct
-super_block accordingly, returning its root dentry to caller.
-
-->mount() may choose to return a subtree of existing filesystem - it
-doesn't have to create a new one. The main result from the caller's
-point of view is a reference to dentry at the root of (sub)tree to be
-attached; creation of new superblock is a common side effect.
-
-The most interesting member of the superblock structure that the mount()
-method fills in is the "s_op" field. This is a pointer to a "struct
-super_operations" which describes the next level of the filesystem
-implementation.
-
-For more information on mounting (and the new mount API), see
-Documentation/filesystems/mount_api.rst.
-
 The Superblock Object
 =====================
 
@@ -244,7 +197,6 @@ filesystem. The following members are defined:
 enum freeze_wholder who);
 int (*unfreeze_fs) (struct super_block *);
 int (*statfs) (struct dentry *, struct kstatfs *);
-int (*remount_fs) (struct super_block *, int *, char *);
 void (*umount_begin) (struct super_block *);
 
 int (*show_options)(struct seq_file *, struct dentry *);
@@ -351,10 +303,6 @@ or bottom half).
 ``statfs``
 called when the VFS needs to get filesystem statistics.
 
-``remount_fs``
-called when the filesystem is remounted. This is called with
-the kernel lock held
-
 ``umount_begin``
 called when the VFS is unmounting a filesystem.
 