Commit 18c64378 authored by Jeff Layton, committed by Chuck Lever

sunrpc: add info about xprt queue times to svc_xprt_dequeue tracepoint

I've been looking at a problem where we see increased RPC timeouts in
clients when the nfs_layout_flexfiles dataserver_timeo value is tuned
very low (6s). This is necessary to ensure quick failover to a different
mirror if a server goes down, but it causes a lot more major RPC timeouts.

Ultimately, however, the problem is server-side: the server sometimes
doesn't respond to connection attempts. My theory is that the interrupt
handler runs when a connection comes in, the xprt ends up being enqueued,
but it takes a significant amount of time for the nfsd thread to pick it up.

Currently, the svc_xprt_dequeue tracepoint displays "wakeup-us". This is
the time between the wake_up() call and the thread dequeueing the xprt.
If no thread was woken, or the thread ended up picking up a different
xprt than intended, then this value won't tell us how long the xprt was
waiting.

Add a new xpt_qtime field to struct svc_xprt and set it in
svc_xprt_enqueue(). When the dequeue tracepoint fires, also store the
time that the xprt sat on the queue in total. Display it as "qtime-us".

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
parent 8c4aae55
+1 −0
@@ -53,6 +53,7 @@ struct svc_xprt {
	struct svc_xprt_class	*xpt_class;
	const struct svc_xprt_ops *xpt_ops;
	struct kref		xpt_ref;
+	ktime_t			xpt_qtime;
	struct list_head	xpt_list;
	struct lwq_node		xpt_ready;
	unsigned long		xpt_flags;
+7 −6
@@ -2040,19 +2040,20 @@ TRACE_EVENT(svc_xprt_dequeue,

	TP_STRUCT__entry(
		SVC_XPRT_ENDPOINT_FIELDS(rqst->rq_xprt)
-
		__field(unsigned long, wakeup)
+		__field(unsigned long, qtime)
	),

	TP_fast_assign(
-		SVC_XPRT_ENDPOINT_ASSIGNMENTS(rqst->rq_xprt);
+		ktime_t ktime = ktime_get();

-		__entry->wakeup = ktime_to_us(ktime_sub(ktime_get(),
-							rqst->rq_qtime));
+		SVC_XPRT_ENDPOINT_ASSIGNMENTS(rqst->rq_xprt);
+		__entry->wakeup = ktime_to_us(ktime_sub(ktime, rqst->rq_qtime));
+		__entry->qtime = ktime_to_us(ktime_sub(ktime, rqst->rq_xprt->xpt_qtime));
	),

-	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " wakeup-us=%lu",
-		SVC_XPRT_ENDPOINT_VARARGS, __entry->wakeup)
+	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " wakeup-us=%lu qtime-us=%lu",
+		SVC_XPRT_ENDPOINT_VARARGS, __entry->wakeup, __entry->qtime)
);

DECLARE_EVENT_CLASS(svc_xprt_event,
+1 −0
@@ -488,6 +488,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
	pool = svc_pool_for_cpu(xprt->xpt_server);

	percpu_counter_inc(&pool->sp_sockets_queued);
+	xprt->xpt_qtime = ktime_get();
	lwq_enqueue(&xprt->xpt_ready, &pool->sp_xprts);

	svc_pool_wake_idle_thread(pool);
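
The difference between "wakeup-us" and "qtime-us" described in the commit message can be illustrated with a small userspace sketch. This is not kernel code: the sketch_* names and structs are hypothetical stand-ins for svc_xprt, svc_rqst and the tracepoint assignment, and timestamps are plain microsecond integers rather than ktime_t values.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct svc_xprt: only the enqueue timestamp. */
struct sketch_xprt {
	int64_t qtime_us;	/* models xpt_qtime, set at enqueue time */
};

/* Hypothetical stand-in for struct svc_rqst: wake timestamp plus xprt. */
struct sketch_rqst {
	int64_t wake_us;	/* models rq_qtime, set when the thread is woken */
	struct sketch_xprt *xprt;
};

/* The two values the patched tracepoint would report. */
struct sketch_sample {
	int64_t wakeup_us;	/* time since the thread was woken */
	int64_t qtime_us;	/* time the xprt spent on the queue */
};

/* Models the timestamp taken in svc_xprt_enqueue(). */
static void sketch_enqueue(struct sketch_xprt *xprt, int64_t now_us)
{
	xprt->qtime_us = now_us;
}

/* Models a thread being woken; it may be woken for some other xprt. */
static void sketch_wake(struct sketch_rqst *rqst, int64_t now_us)
{
	rqst->wake_us = now_us;
}

/* Models the dequeue tracepoint: one "now", two subtractions. */
static struct sketch_sample sketch_dequeue(const struct sketch_rqst *rqst,
					   int64_t now_us)
{
	struct sketch_sample s;

	s.wakeup_us = now_us - rqst->wake_us;
	s.qtime_us = now_us - rqst->xprt->qtime_us;
	return s;
}
```

Under these assumptions, if an xprt is enqueued at t=0 while no thread is idle, and a thread woken at t=900 for other work dequeues it at t=1000, the sketch yields a wakeup value of 100 and a queue time of 1000. Only the second number reflects how long the xprt actually waited.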