Commit b33f7dec authored by Mike Snitzer, committed by Anna Schumaker

nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_



Also update Documentation/filesystems/nfs/localio.rst accordingly
and reduce the technical documentation debt that was previously
captured in that document.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
parent 39972494
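
The renamed nfsd_net_try_get() and nfsd_net_put() pair follows a "try get, then put" reference pattern: a new reference is refused once shutdown has begun, and the object is released only when the last holder drops its reference. The following is a minimal userspace sketch of that pattern, assuming C11 atomics as a stand-in for the kernel's percpu_ref. Every demo_* name is hypothetical and is not part of the kernel or of this commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the percpu_ref embedded in nfsd_net. */
#define DEMO_DEAD 0x80000000u	/* set once shutdown has begun */

struct demo_net_ref {
	atomic_uint state;	/* low 31 bits: refcount; top bit: DEMO_DEAD */
};

static void demo_net_ref_init(struct demo_net_ref *r)
{
	atomic_init(&r->state, 1u);	/* initial reference owned by the server */
}

/* Take a reference unless shutdown has started. */
static bool demo_net_try_get(struct demo_net_ref *r)
{
	unsigned int old = atomic_load(&r->state);

	do {
		if (old & DEMO_DEAD)
			return false;
	} while (!atomic_compare_exchange_weak(&r->state, &old, old + 1));
	return true;
}

/* Drop a reference; returns true when the last reference went away. */
static bool demo_net_put(struct demo_net_ref *r)
{
	unsigned int old = atomic_fetch_sub(&r->state, 1u);

	return (old & ~DEMO_DEAD) == 1u;
}

/* Begin shutdown: refuse new references, then drop the initial one. */
static bool demo_net_kill(struct demo_net_ref *r)
{
	atomic_fetch_or(&r->state, DEMO_DEAD);
	return demo_net_put(r);
}

/* Walk through the client/server interlock in a single thread. */
static int demo_scenario(void)
{
	struct demo_net_ref ref;

	demo_net_ref_init(&ref);
	assert(demo_net_try_get(&ref));		/* client opens a local file */
	assert(!demo_net_kill(&ref));		/* shutdown starts; client still holds a ref */
	assert(!demo_net_try_get(&ref));	/* new opens are now refused */
	assert(demo_net_put(&ref));		/* client's put drops the last reference */
	return 0;
}
```

In the sketch, demo_net_kill() plays the role that server shutdown plays in the real interlock, and the failed demo_net_try_get() corresponds to the -ENXIO path in nfs_open_local_fh(). The kernel version additionally relies on RCU to make dereferencing uuid->net safe, which this sketch does not model.
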
+27 −58
@@ -218,64 +218,30 @@ NFS Client and Server Interlock
===============================

LOCALIO provides the nfs_uuid_t object and associated interfaces to
allow proper network namespace (net-ns) and NFSD object refcounting:

    We don't want to keep a long-term counted reference on each NFSD's
    net-ns in the client because that prevents a server container from
    completely shutting down.

    So we avoid taking a reference at all and rely on the per-cpu
    reference to the server (detailed below) being sufficient to keep
    the net-ns active. This involves allowing the NFSD's net-ns exit
    code to iterate all active clients and clear their ->net pointers
    (which are needed to find the per-cpu-refcount for the nfsd_serv).

    Details:

     - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
       that can be used to find the client. It does add the 16-byte
       uuid_t to nfs_client so it is bigger than needed (given that
       uuid_t is only used during the initial NFS client and server
       LOCALIO handshake to determine if they are local to each other).
       If that is really a problem we can find a fix.

     - When the nfs server confirms that the uuid_t is local, it moves
       the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.

     - When each server's net-ns is shutting down - in a "pre_exit"
       handler, all these nfs_uuid_t have their ->net cleared. There is
       an rcu_synchronize() call between pre_exit() handlers and exit()
       handlers so any caller that sees nfs_uuid_t ->net as not NULL can
       safely manage the per-cpu-refcount for nfsd_serv.

     - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
       can safely dereference ->net in a private rcu_read_lock() section
       to allow safe access to the associated nfsd_net and nfsd_serv.

So LOCALIO required the introduction and use of NFSD's percpu_ref to
interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh(), and
allow proper network namespace (net-ns) and NFSD object refcounting.

LOCALIO required the introduction and use of NFSD's percpu nfsd_net_ref
to interlock nfsd_shutdown_net() and nfsd_open_local_fh(), to ensure
each net-ns is not destroyed while in use by nfsd_open_local_fh(), and
warrants a more detailed explanation:

    nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
    nfsd_open_local_fh() uses nfsd_net_try_get() before opening its
    nfsd_file handle and then the caller (NFS client) must drop the
    reference for the nfsd_file and associated nn->nfsd_serv using
    nfs_file_put_local() once it has completed its IO.
    reference for the nfsd_file and associated net-ns using
    nfsd_file_put_local() once it has completed its IO.

    This interlock working relies heavily on nfsd_open_local_fh() being
    afforded the ability to safely deal with the possibility that the
    NFSD's net-ns (and nfsd_net by association) may have been destroyed
    by nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
    possible given the nfs_uuid_t ->net pointer managemenet detailed
    above.

All told, this elaborate interlock of the NFS client and server has been
verified to fix an easy to hit crash that would occur if an NFSD
instance running in a container, with a LOCALIO client mounted, is
shutdown. Upon restart of the container and associated NFSD the client
would go on to crash due to NULL pointer dereference that occurred due
to the LOCALIO client's attempting to nfsd_open_local_fh(), using
nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
    by nfsd_destroy_serv() via nfsd_shutdown_net().

This interlock of the NFS client and server has been verified to fix an
easy to hit crash that would occur if an NFSD instance running in a
container, with a LOCALIO client mounted, is shutdown. Upon restart of
the container and associated NFSD, the client would go on to crash due
to NULL pointer dereference that occurred due to the LOCALIO client's
attempting to nfsd_open_local_fh() without having a proper reference on
NFSD's net-ns.

NFS Client issues IO instead of Server
======================================
@@ -308,16 +274,19 @@ fs/nfs/localio.c:nfs_local_commit().

With normal NFS that makes use of RPC to issue IO to the server, if an
application uses O_DIRECT the NFS client will bypass the pagecache but
the NFS server will not. Because the NFS server's use of buffered IO
affords applications to be less precise with their alignment when
issuing IO to the NFS client. LOCALIO can be configured to use O_DIRECT
semantics by setting the 'localio_O_DIRECT_semantics' nfs module
the NFS server will not. The NFS server's use of buffered IO affords
applications to be less precise with their alignment when issuing IO to
the NFS client. But if all applications properly align their IO, LOCALIO
can be configured to use end-to-end O_DIRECT semantics from the NFS
client to the underlying local filesystem, that it is sharing with
the NFS server, by setting the 'localio_O_DIRECT_semantics' nfs module
parameter to Y, e.g.:

    echo Y > /sys/module/nfs/parameters/localio_O_DIRECT_semantics

Once enabled, it will cause LOCALIO to use O_DIRECT semantics (this may
cause IO to fail if applications do not properly align their IO).
Once enabled, it will cause LOCALIO to use end-to-end O_DIRECT semantics
(but again, this may cause IO to fail if applications do not properly
align their IO).

Security
========
+7 −3
@@ -159,6 +159,10 @@ static void nfs_uuid_add_file(nfs_uuid_t *nfs_uuid, struct nfs_file_localio *nfl
	spin_unlock(&nfs_uuid_lock);
}

/*
 * Caller is responsible for calling nfsd_net_put and
 * nfsd_file_put (via nfs_to_nfsd_file_put_local).
 */
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
		   struct rpc_clnt *rpc_clnt, const struct cred *cred,
		   const struct nfs_fh *nfs_fh, struct nfs_file_localio *nfl,
@@ -171,7 +175,7 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
	 * Not running in nfsd context, so must safely get reference on nfsd_serv.
	 * But the server may already be shutting down, if so disallow new localio.
	 * uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
	 * if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
	 * if uuid->net is not NULL, then calling nfsd_net_try_get() is safe
	 * and if it succeeds we will have an implied reference to the net.
	 *
	 * Otherwise NFS may not have ref on NFSD and therefore cannot safely
@@ -179,12 +183,12 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
	 */
	rcu_read_lock();
	net = rcu_dereference(uuid->net);
	if (!net || !nfs_to->nfsd_serv_try_get(net)) {
	if (!net || !nfs_to->nfsd_net_try_get(net)) {
		rcu_read_unlock();
		return ERR_PTR(-ENXIO);
	}
	rcu_read_unlock();
	/* We have an implied reference to net thanks to nfsd_serv_try_get */
	/* We have an implied reference to net thanks to nfsd_net_try_get */
	localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
					     cred, nfs_fh, fmode);
	if (IS_ERR(localio))
+1 −1
@@ -391,7 +391,7 @@ nfsd_file_put(struct nfsd_file *nf)
}

/**
 * nfsd_file_put_local - put nfsd_file reference and arm nfsd_serv_put in caller
 * nfsd_file_put_local - put nfsd_file reference and arm nfsd_net_put in caller
 * @nf: nfsd_file of which to put the reference
 *
 * First save the associated net to return to caller, then put
+2 −2
@@ -25,8 +25,8 @@
#include "cache.h"

static const struct nfsd_localio_operations nfsd_localio_ops = {
	.nfsd_serv_try_get  = nfsd_serv_try_get,
	.nfsd_serv_put  = nfsd_serv_put,
	.nfsd_net_try_get  = nfsd_net_try_get,
	.nfsd_net_put  = nfsd_net_put,
	.nfsd_open_local_fh = nfsd_open_local_fh,
	.nfsd_file_put_local = nfsd_file_put_local,
	.nfsd_file_get = nfsd_file_get,
+6 −5
@@ -140,9 +140,10 @@ struct nfsd_net {

	struct svc_info nfsd_info;
#define nfsd_serv nfsd_info.serv
	struct percpu_ref nfsd_serv_ref;
	struct completion nfsd_serv_confirm_done;
	struct completion nfsd_serv_free_done;

	struct percpu_ref nfsd_net_ref;
	struct completion nfsd_net_confirm_done;
	struct completion nfsd_net_free_done;

	/*
	 * clientid and stateid data for construction of net unique COPY
@@ -229,8 +230,8 @@ struct nfsd_net {
extern bool nfsd_support_version(int vers);
extern unsigned int nfsd_net_id;

bool nfsd_serv_try_get(struct net *net);
void nfsd_serv_put(struct net *net);
bool nfsd_net_try_get(struct net *net);
void nfsd_net_put(struct net *net);

void nfsd_copy_write_verifier(__be32 verf[2], struct nfsd_net *nn);
void nfsd_reset_write_verifier(struct nfsd_net *nn);