Unverified Commit 7a47db23 authored Dec 20, 2024 by Christian Brauner

Merge patch series "netfs: Read performance improvements and "single-blob" support"

David Howells <dhowells@redhat.com> says:

This set of patches is primarily about two things: improving read
performance and supporting monolithic single-blob objects that have to be
read/written as such (e.g. AFS directory contents).  The implementation of
the two parts is interwoven as each makes the other possible.

READ PERFORMANCE
================

The read performance improvements are intended to recover some loss of
performance detected in cifs and, to a lesser extent, in afs.  The problem is
that we queue too many work items during the collection of read results:
each individual subrequest is collected by its own work item, and then they
have to interact with each other when a series of subrequests don't exactly
align with the pattern of folios that are being read by the overall
request.

Whilst the processing of the pages covered by individual subrequests as
they complete potentially allows folios to be woken in parallel and with
minimum delay, it can shuffle wakeups for sequential reads out of order -
and that is the most common I/O pattern.

The final assessment and cleanup of an operation is then held up until the
last I/O completes - and for a synchronous sequential operation, this means
the bouncing around of work items just adds latency.

Two changes have been made to address this:

 (1) All collection is now done in a single "work item" that works
     progressively through the subrequests as they complete (and also
     dispatches retries as necessary).

 (2) For readahead and AIO, this work item is run on a workqueue and can
     run in parallel with the ultimate consumer of the data; for
     synchronous direct or unbuffered reads, the collection is run in the
     application thread and not offloaded.
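
As a rough illustration of (1), the in-order walk can be modelled in
userspace C as below.  This is only a sketch under stated assumptions:
struct toy_subreq, struct toy_request, toy_subreq_terminated() and
toy_collect() are made-up names for this example and are not the netfslib
structures or API; retries, folio unlocking and locking are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the netfslib structures. */
struct toy_subreq {
	size_t len;		/* Bytes covered by this subrequest */
	bool done;		/* Set when the I/O has terminated */
};

struct toy_request {
	struct toy_subreq *subreqs;	/* Subrequests in file order */
	size_t nr_subreqs;
	size_t next;			/* First subrequest not yet collected */
	size_t collected;		/* Bytes collected contiguously so far */
};

/* Completion side: only flag the subrequest.  The caller would then queue
 * or wake whatever runs toy_collect().
 */
static void toy_subreq_terminated(struct toy_subreq *subreq)
{
	subreq->done = true;
}

/* Collector: consume terminated subrequests strictly in order and stop at
 * the first one still in flight.  Returns true once every subrequest has
 * been collected.
 */
static bool toy_collect(struct toy_request *rreq)
{
	assert(rreq->next <= rreq->nr_subreqs);

	while (rreq->next < rreq->nr_subreqs &&
	       rreq->subreqs[rreq->next].done) {
		rreq->collected += rreq->subreqs[rreq->next].len;
		rreq->next++;
	}
	return rreq->next == rreq->nr_subreqs;
}
```

In this model a subrequest that finishes early stays uncollected until
everything before it has terminated, so data is always handed over in file
order.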

Functions such as smb2_readv_callback() then just tell netfslib that the
subrequest has terminated; netfslib does a minimal bit of processing on the
spot - stat counting and tracing mostly - and then queues/wakes up the
worker.  This simplifies the logic as the collector just walks sequentially
through the subrequests as they complete and walks through the folios, if
buffered, unlocking them as it goes.  It also keeps to a minimum the amount
of latency injected into the filesystem's low-level I/O handling.

The way netfs supports filesystems using the deprecated PG_private_2 flag
is changed: folios are flagged and added to a write request as they
complete and that takes care of scheduling the writes to the cache.  The
originating read request can then just unlock the pages whatever happens.

SINGLE-BLOB OBJECT SUPPORT
==========================

Single-blob objects are files for which the content of the file must be
read from or written to the server in a single operation because reading
them in parts may yield inconsistent results.  AFS directories are an
example of this as there exists the possibility that the contents are
generated on the fly and would differ between reads or might change due to
third party interference.

Such objects will be written to and retrieved from the cache if one is
present, though we may allow or need multiple subrequests to do so.
The important part is that read from/write to the *server* is monolithic.

Single blob reading is, for the moment, fully synchronous and does result
collection in the application thread and, also for the moment, the API is
supplied the buffer in the form of a folio_queue chain rather than using
the pagecache.

AFS CHANGES
===========

This series makes a number of changes to the kafs filesystem, primarily in
the area of directory handling:

 (1) AFS's FetchData RPC reply processing is made partially asynchronous
     which allows the netfs_io_request's outstanding operation counter to
     be removed as part of reducing the collection to a single work item.

 (2) Directory and symlink reading are plumbed through netfslib using the
     single-blob object API and are now cacheable with fscache.  This also
     allows the afs_read struct to be eliminated and netfs_io_subrequest to
     be used directly instead.

 (3) Directory and symlink content are now stored in a folio_queue buffer
     rather than in the pagecache.  This means we don't require the RCU
     read lock and xarray iteration to access it, and folios won't randomly
     disappear under us because the VM wants them back.

     There are some downsides to this, though: the storage folios are no
     longer known to the VM, drop_caches can't flush them and the folios
     are not migratable.  The inode must also be marked dirty manually to get
     the data written to the cache in the background.

 (4) The vnode operation lock is changed from a mutex struct to a private
     lock implementation.  The problem is that the lock now needs to be
     dropped in a separate thread and mutexes don't permit that.

 (5) When a new directory or symlink is created, we now initialise it
     locally and mark it valid rather than downloading it (we know what
     it's likely to look like).

 (6) We now use the in-directory hashtable to reduce the number of entries
     we need to scan when doing a lookup.  The edit routines have to
     maintain the hash chains.

 (7) Cancellation (e.g. by signal) of an async call after the rxrpc_call
     has been set up is now offloaded to the worker thread as there will be
     a notification from rxrpc upon completion.  This avoids a double
     cleanup.
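
As a rough illustration of (6), the sketch below models a hash table whose
buckets hold the index of the first entry in a chain, with each entry
holding the index of the next.  Everything here is hypothetical and chosen
for the example: the sizes, the layout, the toy_hash() function and the
toy_*() names are not the AFS directory format or the kafs code.  The
sketch assumes entry index 0 is reserved to mean "end of chain".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes and layout for illustration only. */
#define NR_BUCKETS	8
#define SLOTS_PER_BLOCK	4
#define NR_ENTRIES	16	/* Entry 0 is reserved: it means "no entry" */

struct toy_dirent {
	char name[16];
	uint16_t hash_next;	/* Next entry in the same bucket, 0 = none */
	uint8_t valid;
};

struct toy_dir {
	uint16_t hashtable[NR_BUCKETS];	/* Head entry per bucket, 0 = empty */
	struct toy_dirent entries[NR_ENTRIES];
};

/* An entry index splits into a block number and a slot within the block. */
static unsigned int toy_entry_block(uint16_t entry) { return entry / SLOTS_PER_BLOCK; }
static unsigned int toy_entry_slot(uint16_t entry) { return entry % SLOTS_PER_BLOCK; }

/* Toy hash function; an assumption of this sketch, not the AFS name hash. */
static unsigned int toy_hash(const char *name)
{
	unsigned int h = 0;

	while (*name)
		h = h * 31 + (unsigned char)*name++;
	return h % NR_BUCKETS;
}

/* Add: push the new entry onto the head of its bucket's chain. */
static void toy_add(struct toy_dir *d, uint16_t entry, const char *name)
{
	struct toy_dirent *de = &d->entries[entry];
	unsigned int bucket = toy_hash(name);

	assert(entry > 0 && entry < NR_ENTRIES);
	assert(strlen(name) < sizeof(de->name));
	strcpy(de->name, name);
	de->valid = 1;
	de->hash_next = d->hashtable[bucket];
	d->hashtable[bucket] = entry;
}

/* Lookup: walk a single chain rather than scanning every entry.  Returns
 * the entry index or 0, and reports the predecessor (0 = bucket head).
 */
static uint16_t toy_lookup(const struct toy_dir *d, const char *name,
			   uint16_t *prev)
{
	uint16_t p = 0, e = d->hashtable[toy_hash(name)];

	for (; e; p = e, e = d->entries[e].hash_next) {
		if (d->entries[e].valid && strcmp(d->entries[e].name, name) == 0) {
			if (prev)
				*prev = p;
			return e;
		}
	}
	return 0;
}

/* Remove: patch the bucket head if there is no predecessor, otherwise
 * patch the predecessor's link, then clear the entry.
 */
static int toy_remove(struct toy_dir *d, const char *name)
{
	uint16_t prev, e = toy_lookup(d, name, &prev);

	if (!e)
		return -1;
	if (!prev)
		d->hashtable[toy_hash(name)] = d->entries[e].hash_next;
	else
		d->entries[prev].hash_next = d->entries[e].hash_next;
	memset(&d->entries[e], 0, sizeof(d->entries[e]));
	return 0;
}
```

The point of the model is that a lookup only touches the entries that hash
to one bucket, at the cost of every add and remove having to keep the chain
links consistent.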

SUPPORTING CHANGES
==================

To support the above some other changes are also made:

 (1) A "rolling buffer" implementation is created to abstract out the two
     separate folio_queue chaining implementations I had (one for read and
     one for write).

 (2) Functions are provided to create/extend a buffer in a folio_queue
     chain and tear it down again.  This is used to handle AFS directories,
     but could also be used to create bounce buffers for content crypto and
     transport crypto.

 (3) The was_async argument is dropped from netfs_read_subreq_terminated().
     Instead we wake the read collection work item by either queuing it or
     waking up the app thread.

 (4) We don't need to use BH-excluding locks when communicating between the
     issuing thread and the collection thread as neither of them now run in
     BH context.
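
As a rough illustration of the idea behind (1), the sketch below models a
buffer made of chained fixed-size segments: the producer appends at the
head and allocates a new segment when the current one fills, and the
consumer advances from the tail and frees each segment once it has been
fully consumed.  This is a toy model with made-up names (struct toy_seg,
struct toy_rolling_buffer, toy_rb_*()) holding ints rather than folios; it
is not the netfslib implementation and has no locking.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_SEG_SLOTS 4		/* Hypothetical number of slots per segment */

struct toy_seg {
	struct toy_seg *next;
	int items[TOY_SEG_SLOTS];
	unsigned int used;	/* Slots filled by the producer */
};

struct toy_rolling_buffer {
	struct toy_seg *head;	/* Producer appends here */
	struct toy_seg *tail;	/* Consumer reads from here */
	unsigned int tail_slot;	/* Next slot to consume in the tail segment */
};

static int toy_rb_init(struct toy_rolling_buffer *rb)
{
	struct toy_seg *seg = calloc(1, sizeof(*seg));

	if (!seg)
		return -1;
	rb->head = rb->tail = seg;
	rb->tail_slot = 0;
	return 0;
}

/* Producer: fill the head segment, chaining on a new one when it is full. */
static int toy_rb_append(struct toy_rolling_buffer *rb, int item)
{
	struct toy_seg *seg = rb->head;

	if (seg->used == TOY_SEG_SLOTS) {
		struct toy_seg *new_seg = calloc(1, sizeof(*new_seg));

		if (!new_seg)
			return -1;
		seg->next = new_seg;
		rb->head = seg = new_seg;
	}
	seg->items[seg->used++] = item;
	return 0;
}

/* Consumer: take the oldest item, discarding a spent segment on the way.
 * Returns 0 on success or -1 if nothing is available.
 */
static int toy_rb_consume(struct toy_rolling_buffer *rb, int *item)
{
	struct toy_seg *seg = rb->tail;

	if (rb->tail_slot == TOY_SEG_SLOTS && seg->next) {
		rb->tail = seg->next;
		rb->tail_slot = 0;
		free(seg);
		seg = rb->tail;
	}
	if (rb->tail_slot >= seg->used)
		return -1;
	*item = seg->items[rb->tail_slot++];
	return 0;
}

/* Free whatever segments remain. */
static void toy_rb_destroy(struct toy_rolling_buffer *rb)
{
	while (rb->tail) {
		struct toy_seg *seg = rb->tail;

		rb->tail = seg->next;
		free(seg);
	}
	rb->head = NULL;
}
```

Because spent segments are released as the consumer advances, the memory
held at any time tracks the distance between producer and consumer rather
than the total amount of data that has passed through.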

MISCELLANY
==========

Also included are a number of new tracepoints and a split of the netfslib
write collection code to put retrying into its own file (it gets more
complicated with content encryption).

There are also some minor AFS fixes included: fixing the AFS directory
format struct layout, reducing some directory over-invalidation and making
afs_rmdir() translate EEXIST to ENOTEMPTY (the latter is not available on
all systems the servers support).

Finally, there's a patch to try and detect entry into the folio unlock
function with no folio_queue structs in the buffer (which isn't allowed in
the cases that can get there).  This is a debugging patch, but should add
minimal overhead.

* patches from https://lore.kernel.org/r/20241216204124.3752367-1-dhowells@redhat.com: (31 commits)
  netfs: Report on NULL folioq in netfs_writeback_unlock_folios()
  afs: Add a tracepoint for afs_read_receive()
  afs: Locally initialise the contents of a new symlink on creation
  afs: Use the contained hashtable to search a directory
  afs: Make afs_mkdir() locally initialise a new directory's content
  netfs: Change the read result collector to only use one work item
  afs: Make {Y,}FS.FetchData an asynchronous operation
  afs: Fix cleanup of immediately failed async calls
  afs: Eliminate afs_read
  afs: Use netfslib for symlinks, allowing them to be cached
  afs: Use netfslib for directories
  afs: Make afs_init_request() get a key if not given a file
  netfs: Add support for caching single monolithic objects such as AFS dirs
  netfs: Add functions to build/clean a buffer in a folio_queue
  afs: Add more tracepoints to do with tracking validity
  cachefiles: Add auxiliary data trace
  cachefiles: Add some subrequest tracepoints
  netfs: Remove some extraneous directory invalidations
  afs: Fix directory format encoding struct
  afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTY
  ...

Link: https://lore.kernel.org/r/20241216204124.3752367-1-dhowells@redhat.com


Signed-off-by: Christian Brauner <brauner@kernel.org>

parents 5fe85a5c 794d8cf3

fs/9p/vfs_addr.c

+3 −3

@@ -81,13 +81,13 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq)
 		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
 	if (pos + total >= i_size_read(rreq->inode))
 		__set_bit(NETFS_SREQ_HIT_EOF, &subreq->flags);
-
-	if (!err) {
+	if (!err && total) {
 		subreq->transferred += total;
 		__set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags);
 	}

-	netfs_read_subreq_terminated(subreq, err, false);
+	subreq->error = err;
+	netfs_read_subreq_terminated(subreq);
 }

 /**

fs/afs/Makefile

+1 −0

@@ -11,6 +11,7 @@ kafs-y := \
 	cmservice.o \
 	dir.o \
 	dir_edit.o \
+	dir_search.o \
 	dir_silly.o \
 	dynroot.o \
 	file.o \

fs/afs/callback.c

+2 −2

@@ -41,7 +41,7 @@ static void afs_volume_init_callback(struct afs_volume *volume)

 	list_for_each_entry(vnode, &volume->open_mmaps, cb_mmap_link) {
 		if (vnode->cb_v_check != atomic_read(&volume->cb_v_break)) {
-			atomic64_set(&vnode->cb_expires_at, AFS_NO_CB_PROMISE);
+			afs_clear_cb_promise(vnode, afs_cb_promise_clear_vol_init_cb);
 			queue_work(system_unbound_wq, &vnode->cb_work);
 		}
 	}
@@ -79,7 +79,7 @@ void __afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reas
 	_enter("");

 	clear_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
-	if (atomic64_xchg(&vnode->cb_expires_at, AFS_NO_CB_PROMISE) != AFS_NO_CB_PROMISE) {
+	if (afs_clear_cb_promise(vnode, afs_cb_promise_clear_cb_break)) {
 		vnode->cb_break++;
 		vnode->cb_v_check = atomic_read(&vnode->volume->cb_v_break);
 		afs_clear_permits(vnode);

fs/afs/dir.c

+417 −392

File changed.

Preview size limit exceeded, changes collapsed.

fs/afs/dir_edit.c

+225 −158

@@ -10,6 +10,7 @@
 #include <linux/namei.h>
 #include <linux/pagemap.h>
 #include <linux/iversion.h>
+#include <linux/folio_queue.h>
 #include "internal.h"
 #include "xdr_fs.h"

@@ -105,23 +106,57 @@ static void afs_clear_contig_bits(union afs_xdr_dir_block *block,
 }

 /*
- * Get a new directory folio.
+ * Get a specific block, extending the directory storage to cover it as needed.
  */
-static struct folio *afs_dir_get_folio(struct afs_vnode *vnode, pgoff_t index)
+static union afs_xdr_dir_block *afs_dir_get_block(struct afs_dir_iter *iter, size_t block)
 {
-	struct address_space *mapping = vnode->netfs.inode.i_mapping;
+	struct folio_queue *fq;
+	struct afs_vnode *dvnode = iter->dvnode;
 	struct folio *folio;
+	size_t blpos = block * AFS_DIR_BLOCK_SIZE;
+	size_t blend = (block + 1) * AFS_DIR_BLOCK_SIZE, fpos = iter->fpos;
+	int ret;

-	folio = __filemap_get_folio(mapping, index,
-				    FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
-				    mapping->gfp_mask);
-	if (IS_ERR(folio)) {
-		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
-		return NULL;
+	if (dvnode->directory_size < blend) {
+		size_t cur_size = dvnode->directory_size;
+
+		ret = netfs_alloc_folioq_buffer(
+			NULL, &dvnode->directory, &cur_size, blend,
+			mapping_gfp_mask(dvnode->netfs.inode.i_mapping));
+		dvnode->directory_size = cur_size;
+		if (ret < 0)
+			goto fail;
 	}
+
+	fq = iter->fq;
+	if (!fq)
+		fq = dvnode->directory;
+
+	/* Search the folio queue for the folio containing the block... */
+	for (; fq; fq = fq->next) {
+		for (int s = iter->fq_slot; s < folioq_count(fq); s++) {
+			size_t fsize = folioq_folio_size(fq, s);
+
+			if (blend <= fpos + fsize) {
+				/* ... and then return the mapped block. */
+				folio = folioq_folio(fq, s);
+				if (WARN_ON_ONCE(folio_pos(folio) != fpos))
+					goto fail;
+				iter->fq = fq;
+				iter->fq_slot = s;
+				iter->fpos = fpos;
+				return kmap_local_folio(folio, blpos - fpos);
+			}
+			fpos += fsize;
+		}
-	if (!folio_test_private(folio))
-		folio_attach_private(folio, (void *)1);
-	return folio;
+		iter->fq_slot = 0;
+	}
+
+fail:
+	iter->fq = NULL;
+	iter->fq_slot = 0;
+	afs_invalidate_dir(dvnode, afs_dir_invalid_edit_get_block);
+	return NULL;
 }

 /*
@@ -209,9 +244,8 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 {
 	union afs_xdr_dir_block *meta, *block;
 	union afs_xdr_dirent *de;
-	struct folio *folio0, *folio;
-	unsigned int need_slots, nr_blocks, b;
-	pgoff_t index;
+	struct afs_dir_iter iter = { .dvnode = vnode };
+	unsigned int nr_blocks, b, entry;
 	loff_t i_size;
 	int slot;

@@ -220,20 +254,17 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 	i_size = i_size_read(&vnode->netfs.inode);
 	if (i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
 	    (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
-		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		afs_invalidate_dir(vnode, afs_dir_invalid_edit_add_bad_size);
 		return;
 	}

-	folio0 = afs_dir_get_folio(vnode, 0);
-	if (!folio0) {
-		_leave(" [fgp]");
+	meta = afs_dir_get_block(&iter, 0);
+	if (!meta)
 		return;
-	}

 	/* Work out how many slots we're going to need. */
-	need_slots = afs_dir_calc_slots(name->len);
+	iter.nr_slots = afs_dir_calc_slots(name->len);

-	meta = kmap_local_folio(folio0, 0);
 	if (i_size == 0)
 		goto new_directory;
 	nr_blocks = i_size / AFS_DIR_BLOCK_SIZE;
@@ -245,22 +276,21 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 		/* If the directory extended into a new folio, then we need to
 		 * tack a new folio on the end.
 		 */
-		index = b / AFS_DIR_BLOCKS_PER_PAGE;
 		if (nr_blocks >= AFS_DIR_MAX_BLOCKS)
-			goto error;
-		if (index >= folio_nr_pages(folio0)) {
-			folio = afs_dir_get_folio(vnode, index);
-			if (!folio)
-				goto error;
-		} else {
-			folio = folio0;
-		}
+			goto error_too_many_blocks;

-		block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio));
+		/* Lower dir blocks have a counter in the header we can check. */
+		if (b < AFS_DIR_BLOCKS_WITH_CTR &&
+		    meta->meta.alloc_ctrs[b] < iter.nr_slots)
+			continue;
+
+		block = afs_dir_get_block(&iter, b);
+		if (!block)
+			goto error;

 		/* Abandon the edit if we got a callback break. */
 		if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
-			goto invalidated;
+			goto already_invalidated;

 		_debug("block %u: %2u %3u %u",
 		       b,
@@ -275,31 +305,23 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 			afs_set_i_size(vnode, (b + 1) * AFS_DIR_BLOCK_SIZE);
 		}

-		/* Only lower dir blocks have a counter in the header. */
-		if (b >= AFS_DIR_BLOCKS_WITH_CTR ||
-		    meta->meta.alloc_ctrs[b] >= need_slots) {
-			/* We need to try and find one or more consecutive
-			 * slots to hold the entry.
+		/* We need to try and find one or more consecutive slots to
+		 * hold the entry.
 		 */
-			slot = afs_find_contig_bits(block, need_slots);
+		slot = afs_find_contig_bits(block, iter.nr_slots);
 		if (slot >= 0) {
 			_debug("slot %u", slot);
 			goto found_space;
 		}
-		}

 		kunmap_local(block);
-		if (folio != folio0) {
-			folio_unlock(folio);
-			folio_put(folio);
-		}
 	}

 	/* There are no spare slots of sufficient size, yet the operation
 	 * succeeded. Download the directory again.
 	 */
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_nospc, 0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	afs_invalidate_dir(vnode, afs_dir_invalid_edit_add_no_slots);
 	goto out_unmap;

 new_directory:
@@ -307,8 +329,7 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 	i_size = AFS_DIR_BLOCK_SIZE;
 	afs_set_i_size(vnode, i_size);
 	slot = AFS_DIR_RESV_BLOCKS0;
-	folio = folio0;
-	block = kmap_local_folio(folio, 0);
+	block = afs_dir_get_block(&iter, 0);
 	nr_blocks = 1;
 	b = 0;

@@ -326,41 +347,39 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 	de->u.name[name->len] = 0;

 	/* Adjust the bitmap. */
-	afs_set_contig_bits(block, slot, need_slots);
-	kunmap_local(block);
-	if (folio != folio0) {
-		folio_unlock(folio);
-		folio_put(folio);
-	}
+	afs_set_contig_bits(block, slot, iter.nr_slots);

 	/* Adjust the allocation counter. */
 	if (b < AFS_DIR_BLOCKS_WITH_CTR)
-		meta->meta.alloc_ctrs[b] -= need_slots;
+		meta->meta.alloc_ctrs[b] -= iter.nr_slots;
+
+	/* Adjust the hash chain. */
+	entry = b * AFS_DIR_SLOTS_PER_BLOCK + slot;
+	iter.bucket = afs_dir_hash_name(name);
+	de->u.hash_next = meta->meta.hashtable[iter.bucket];
+	meta->meta.hashtable[iter.bucket] = htons(entry);
+	kunmap_local(block);

 	inode_inc_iversion_raw(&vnode->netfs.inode);
 	afs_stat_v(vnode, n_dir_cr);
 	_debug("Insert %s in %u[%u]", name->name, b, slot);

+	netfs_single_mark_inode_dirty(&vnode->netfs.inode);
+
 out_unmap:
 	kunmap_local(meta);
-	folio_unlock(folio0);
-	folio_put(folio0);
 	_leave("");
 	return;

-invalidated:
+already_invalidated:
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_inval, 0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	kunmap_local(block);
-	if (folio != folio0) {
-		folio_unlock(folio);
-		folio_put(folio);
-	}
 	goto out_unmap;

+error_too_many_blocks:
+	afs_invalidate_dir(vnode, afs_dir_invalid_edit_add_too_many_blocks);
 error:
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_error, 0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	goto out_unmap;
 }

@@ -374,13 +393,14 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 void afs_edit_dir_remove(struct afs_vnode *vnode,
 			 struct qstr *name, enum afs_edit_dir_reason why)
 {
-	union afs_xdr_dir_block *meta, *block;
-	union afs_xdr_dirent *de;
-	struct folio *folio0, *folio;
-	unsigned int need_slots, nr_blocks, b;
-	pgoff_t index;
+	union afs_xdr_dir_block *meta, *block, *pblock;
+	union afs_xdr_dirent *de, *pde;
+	struct afs_dir_iter iter = { .dvnode = vnode };
+	struct afs_fid fid;
+	unsigned int b, slot, entry;
 	loff_t i_size;
-	int slot;
+	__be16 next;
+	int found;

 	_enter(",,{%d,%s},", name->len, name->name);

@@ -388,81 +408,95 @@ void afs_edit_dir_remove(struct afs_vnode *vnode,
 	if (i_size < AFS_DIR_BLOCK_SIZE ||
 	    i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
 	    (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
-		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		afs_invalidate_dir(vnode, afs_dir_invalid_edit_rem_bad_size);
 		return;
 	}
-	nr_blocks = i_size / AFS_DIR_BLOCK_SIZE;

-	folio0 = afs_dir_get_folio(vnode, 0);
-	if (!folio0) {
-		_leave(" [fgp]");
+	if (!afs_dir_init_iter(&iter, name))
 		return;
-	}

-	/* Work out how many slots we're going to discard. */
-	need_slots = afs_dir_calc_slots(name->len);
-
-	meta = kmap_local_folio(folio0, 0);
+	meta = afs_dir_find_block(&iter, 0);
+	if (!meta)
+		return;

-	/* Find a block that has sufficient slots available. Each folio
-	 * contains two or more directory blocks.
-	 */
-	for (b = 0; b < nr_blocks; b++) {
-		index = b / AFS_DIR_BLOCKS_PER_PAGE;
-		if (index >= folio_nr_pages(folio0)) {
-			folio = afs_dir_get_folio(vnode, index);
-			if (!folio)
-				goto error;
-		} else {
-			folio = folio0;
+	/* Find the entry in the blob. */
+	found = afs_dir_search_bucket(&iter, name, &fid);
+	if (found < 0) {
+		/* Didn't find the dirent to clobber. Re-download. */
+		trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_noent,
+				   0, 0, 0, 0, name->name);
+		afs_invalidate_dir(vnode, afs_dir_invalid_edit_rem_wrong_name);
+		goto out_unmap;
 	}

-		block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio));
+	entry = found;
+	b = entry / AFS_DIR_SLOTS_PER_BLOCK;
+	slot = entry % AFS_DIR_SLOTS_PER_BLOCK;

-		/* Abandon the edit if we got a callback break. */
+	block = afs_dir_find_block(&iter, b);
+	if (!block)
+		goto error;
 	if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
-			goto invalidated;
-
-		if (b > AFS_DIR_BLOCKS_WITH_CTR ||
-		    meta->meta.alloc_ctrs[b] <= AFS_DIR_SLOTS_PER_BLOCK - 1 - need_slots) {
-			slot = afs_dir_scan_block(block, name, b);
-			if (slot >= 0)
-				goto found_dirent;
-		}
+		goto already_invalidated;

-		kunmap_local(block);
-		if (folio != folio0) {
-			folio_unlock(folio);
-			folio_put(folio);
-		}
-	}
-
-	/* Didn't find the dirent to clobber. Download the directory again. */
-	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_noent,
-			   0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
-	goto out_unmap;
-
-found_dirent:
 	/* Check and clear the entry. */
 	de = &block->dirents[slot];
+	if (de->u.valid != 1)
+		goto error_unmap;

 	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete, b, slot,
 			   ntohl(de->u.vnode), ntohl(de->u.unique),
 			   name->name);

-	memset(de, 0, sizeof(*de) * need_slots);
-
 	/* Adjust the bitmap. */
-	afs_clear_contig_bits(block, slot, need_slots);
-	kunmap_local(block);
-	if (folio != folio0) {
-		folio_unlock(folio);
-		folio_put(folio);
-	}
+	afs_clear_contig_bits(block, slot, iter.nr_slots);

 	/* Adjust the allocation counter. */
 	if (b < AFS_DIR_BLOCKS_WITH_CTR)
-		meta->meta.alloc_ctrs[b] += need_slots;
+		meta->meta.alloc_ctrs[b] += iter.nr_slots;
+
+	/* Clear the constituent entries. */
+	next = de->u.hash_next;
+	memset(de, 0, sizeof(*de) * iter.nr_slots);
+	kunmap_local(block);
+
+	/* Adjust the hash chain: if iter->prev_entry is 0, the hashtable head
+	 * index is previous; otherwise it's slot number of the previous entry.
+	 */
+	if (!iter.prev_entry) {
+		__be16 prev_next = meta->meta.hashtable[iter.bucket];
+
+		if (unlikely(prev_next != htons(entry))) {
+			pr_warn("%llx:%llx:%x: not head of chain b=%x p=%x,%x e=%x %*s",
+				vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique,
+				iter.bucket, iter.prev_entry, prev_next, entry,
+				name->len, name->name);
+			goto error;
+		}
+		meta->meta.hashtable[iter.bucket] = next;
+	} else {
+		unsigned int pb = iter.prev_entry / AFS_DIR_SLOTS_PER_BLOCK;
+		unsigned int ps = iter.prev_entry % AFS_DIR_SLOTS_PER_BLOCK;
+		__be16 prev_next;
+
+		pblock = afs_dir_find_block(&iter, pb);
+		if (!pblock)
+			goto error;
+		pde = &pblock->dirents[ps];
+		prev_next = pde->u.hash_next;
+		if (prev_next != htons(entry)) {
+			kunmap_local(pblock);
+			pr_warn("%llx:%llx:%x: not prev in chain b=%x p=%x,%x e=%x %*s",
+				vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique,
+				iter.bucket, iter.prev_entry, prev_next, entry,
+				name->len, name->name);
+			goto error;
+		}
+		pde->u.hash_next = next;
+		kunmap_local(pblock);
+	}
+
+	netfs_single_mark_inode_dirty(&vnode->netfs.inode);

 	inode_set_iversion_raw(&vnode->netfs.inode, vnode->status.data_version);
 	afs_stat_v(vnode, n_dir_rm);
		@@ -470,26 +504,20 @@ void afs_edit_dir_remove(struct afs_vnode *vnode,

 out_unmap:
 	kunmap_local(meta);
-	folio_unlock(folio0);
-	folio_put(folio0);
 	_leave("");
 	return;

-invalidated:
+already_invalidated:
+	kunmap_local(block);
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_inval,
 			   0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
-	kunmap_local(block);
-	if (folio != folio0) {
-		folio_unlock(folio);
-		folio_put(folio);
-	}
 	goto out_unmap;

+error_unmap:
+	kunmap_local(block);
 error:
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_error,
 			   0, 0, 0, 0, name->name);
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	goto out_unmap;
 }

@@ -502,9 +530,8 @@ void afs_edit_dir_update_dotdot(struct afs_vnode *vnode, struct afs_vnode *new_d
 {
 	union afs_xdr_dir_block *block;
 	union afs_xdr_dirent *de;
-	struct folio *folio;
+	struct afs_dir_iter iter = { .dvnode = vnode };
 	unsigned int nr_blocks, b;
-	pgoff_t index;
 	loff_t i_size;
 	int slot;

@@ -512,39 +539,35 @@ void afs_edit_dir_update_dotdot(struct afs_vnode *vnode, struct afs_vnode *new_d

 	i_size = i_size_read(&vnode->netfs.inode);
 	if (i_size < AFS_DIR_BLOCK_SIZE) {
-		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		afs_invalidate_dir(vnode, afs_dir_invalid_edit_upd_bad_size);
 		return;
 	}

 	nr_blocks = i_size / AFS_DIR_BLOCK_SIZE;

 	/* Find a block that has sufficient slots available. Each folio
 	 * contains two or more directory blocks.
 	 */
 	for (b = 0; b < nr_blocks; b++) {
-		index = b / AFS_DIR_BLOCKS_PER_PAGE;
-		folio = afs_dir_get_folio(vnode, index);
-		if (!folio)
+		block = afs_dir_get_block(&iter, b);
+		if (!block)
 			goto error;

-		block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio));
-
 		/* Abandon the edit if we got a callback break. */
 		if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
-			goto invalidated;
+			goto already_invalidated;

 		slot = afs_dir_scan_block(block, &dotdot_name, b);
 		if (slot >= 0)
 			goto found_dirent;

 		kunmap_local(block);
-		folio_unlock(folio);
-		folio_put(folio);
 	}

 	/* Didn't find the dirent to clobber. Download the directory again. */
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_update_nodd,
 			   0, 0, 0, 0, "..");
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	afs_invalidate_dir(vnode, afs_dir_invalid_edit_upd_no_dd);
 	goto out;

 found_dirent:
@@ -556,26 +579,70 @@ void afs_edit_dir_update_dotdot(struct afs_vnode *vnode, struct afs_vnode *new_d
 			   ntohl(de->u.vnode), ntohl(de->u.unique), "..");

 	kunmap_local(block);
-	folio_unlock(folio);
-	folio_put(folio);
+	netfs_single_mark_inode_dirty(&vnode->netfs.inode);
 	inode_set_iversion_raw(&vnode->netfs.inode, vnode->status.data_version);

 out:
 	_leave("");
 	return;

-invalidated:
+already_invalidated:
 	kunmap_local(block);
-	folio_unlock(folio);
-	folio_put(folio);
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_update_inval,
 			   0, 0, 0, 0, "..");
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	goto out;

 error:
 	trace_afs_edit_dir(vnode, why, afs_edit_dir_update_error,
 			   0, 0, 0, 0, "..");
-	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	goto out;
 }
+
+/*
+ * Initialise a new directory. We need to fill in the "." and ".." entries.
+ */
+void afs_mkdir_init_dir(struct afs_vnode *dvnode, struct afs_vnode *parent_dvnode)
+{
+	union afs_xdr_dir_block *meta;
+	struct afs_dir_iter iter = { .dvnode = dvnode };
+	union afs_xdr_dirent *de;
+	unsigned int slot = AFS_DIR_RESV_BLOCKS0;
+	loff_t i_size;
+
+	i_size = i_size_read(&dvnode->netfs.inode);
+	if (i_size != AFS_DIR_BLOCK_SIZE) {
+		afs_invalidate_dir(dvnode, afs_dir_invalid_edit_add_bad_size);
+		return;
+	}
+
+	meta = afs_dir_get_block(&iter, 0);
+	if (!meta)
+		return;
+
+	afs_edit_init_block(meta, meta, 0);
+
+	de = &meta->dirents[slot];
+	de->u.valid = 1;
+	de->u.vnode = htonl(dvnode->fid.vnode);
+	de->u.unique = htonl(dvnode->fid.unique);
+	memcpy(de->u.name, ".", 2);
+	trace_afs_edit_dir(dvnode, afs_edit_dir_for_mkdir, afs_edit_dir_mkdir, 0, slot,
+			   dvnode->fid.vnode, dvnode->fid.unique, ".");
+	slot++;
+
+	de = &meta->dirents[slot];
+	de->u.valid = 1;
+	de->u.vnode = htonl(parent_dvnode->fid.vnode);
+	de->u.unique = htonl(parent_dvnode->fid.unique);
+	memcpy(de->u.name, "..", 3);
+	trace_afs_edit_dir(dvnode, afs_edit_dir_for_mkdir, afs_edit_dir_mkdir, 0, slot,
+			   parent_dvnode->fid.vnode, parent_dvnode->fid.unique, "..");
+
+	afs_set_contig_bits(meta, AFS_DIR_RESV_BLOCKS0, 2);
+	meta->meta.alloc_ctrs[0] -= 2;
+	kunmap_local(meta);
+
+	netfs_single_mark_inode_dirty(&dvnode->netfs.inode);
+	set_bit(AFS_VNODE_DIR_VALID, &dvnode->flags);
+	set_bit(AFS_VNODE_DIR_READ, &dvnode->flags);
+}
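
The block search in afs_dir_get_block() in the fs/afs/dir_edit.c diff
walks a chain of variable-sized folios, summing their sizes until it
reaches the folio that contains the wanted block.  The arithmetic can be
sketched in userspace C as follows.  This is a simplified model under
stated assumptions: struct toy_chunk, toy_find_block() and TOY_BLOCK_SIZE
are made up for the example, a flat array stands in for the folio_queue
chain, every chunk size is assumed to be a multiple of the block size, and
the search restarts from the beginning each time rather than resuming from
a cached iterator position.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 2048	/* Hypothetical directory block size */

/* Stand-in for a folio: only its size matters for the search. */
struct toy_chunk {
	size_t size;		/* Assumed to be a multiple of TOY_BLOCK_SIZE */
};

/* Find the chunk holding the given block and the block's byte offset
 * within that chunk.  Returns the chunk index, or -1 if the storage does
 * not extend far enough to cover the block.
 */
static int toy_find_block(const struct toy_chunk *chunks, size_t nr_chunks,
			  size_t block, size_t *offset)
{
	size_t blpos = block * TOY_BLOCK_SIZE;		/* Start of the block */
	size_t blend = blpos + TOY_BLOCK_SIZE;		/* End of the block */
	size_t fpos = 0;				/* Start of current chunk */

	for (size_t i = 0; i < nr_chunks; i++) {
		assert(chunks[i].size % TOY_BLOCK_SIZE == 0);

		/* The block lies wholly inside this chunk if it ends at or
		 * before the chunk's end; earlier chunks were ruled out.
		 */
		if (blend <= fpos + chunks[i].size) {
			*offset = blpos - fpos;
			return (int)i;
		}
		fpos += chunks[i].size;
	}
	return -1;
}
```

With chunks of 4096, 8192 and 4096 bytes, block 5 starts at byte 10240,
which falls in the second chunk at offset 6144, and block 8 lies beyond the
16384 bytes of storage.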