Commit 2e04247f authored by Linus Torvalds
Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) have
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer as it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches <module>, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started are limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.
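
The guard() and free() cleanup item above relies on scope-based cleanup. The following is a minimal userspace sketch of that idea, assuming a GCC or Clang compiler with the `cleanup` variable attribute. The names `TOY_GUARD`, `TOY_FREE`, `toy_unlock`, `toy_free`, and `copy_name` are hypothetical stand-ins for illustration and are not the kernel's helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; NOT the kernel's guard()/__free(). */
static int lock_depth;	/* toy "lock": number of current holders */
static int frees_seen;	/* number of buffers released by cleanup */

/* Cleanup handlers receive a pointer to the annotated variable. */
static void toy_unlock(int **lockp)
{
	(**lockp)--;
}

static void toy_free(char **p)
{
	if (*p) {
		free(*p);
		frees_seen++;
	}
}

/* Take the toy lock now, drop it when the enclosing scope ends. */
#define TOY_GUARD(lock) \
	int *toy_guard_ __attribute__((cleanup(toy_unlock))) = \
		((*(lock))++, (lock))

/* Free the annotated pointer when the enclosing scope ends. */
#define TOY_FREE __attribute__((cleanup(toy_free)))

/* Copies src into dst via a temporary buffer, with no "goto out". */
static int copy_name(const char *src, char *dst, size_t dst_len)
{
	TOY_GUARD(&lock_depth);
	TOY_FREE char *tmp = malloc(strlen(src) + 1);

	if (!tmp)
		return -1;
	strcpy(tmp, src);
	if (strlen(tmp) >= dst_len)
		return -2;	/* early return: free and unlock still run */
	strcpy(dst, tmp);
	return 0;
}
```

Every return path in `copy_name()` releases the buffer and then the toy lock, in reverse order of declaration, which is the property that makes the explicit error labels unnecessary.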

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
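
The per-task shadow stack described in the first item of the pull message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_task`, `SHADOW_DEPTH`, `TOY_TRAMPOLINE`, `toy_hook_entry()`, and `toy_hook_exit()` are hypothetical names, and the real function graph code stores more than a bare return address.

```c
#include <assert.h>
#include <stddef.h>

#define SHADOW_DEPTH 64			/* hypothetical depth, not a kernel value */
#define TOY_TRAMPOLINE 0xdeadbeefUL	/* stand-in for the return trampoline */

/* One shadow stack per task, holding the original return addresses. */
struct toy_task {
	unsigned long ret_stack[SHADOW_DEPTH];
	size_t top;
};

/*
 * Function entry: save the real return address on the task's shadow
 * stack and redirect the return to the trampoline. Returns -1 when the
 * shadow stack is full, in which case the return is left untouched.
 */
static int toy_hook_entry(struct toy_task *t, unsigned long *ret_slot)
{
	if (t->top >= SHADOW_DEPTH)
		return -1;
	t->ret_stack[t->top++] = *ret_slot;
	*ret_slot = TOY_TRAMPOLINE;
	return 0;
}

/*
 * Trampoline side: pop the most recent original return address so the
 * traced function can resume in its real caller. Returns 0 if empty.
 */
static unsigned long toy_hook_exit(struct toy_task *t)
{
	return t->top ? t->ret_stack[--t->top] : 0;
}
```

Because the stack belongs to the task rather than to a call site, nested traced calls unwind in last-in, first-out order without reserving slots per probe location.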
parents 0074adea 31f505dc
+27 −15
@@ -9,9 +9,10 @@ Fprobe - Function entry/exit probe
Introduction
============

Fprobe is a function entry/exit probe mechanism based on ftrace.
Instead of using ftrace full feature, if you only want to attach callbacks
on function entry and exit, similar to the kprobes and kretprobes, you can
Fprobe is a function entry/exit probe based on the function-graph tracing
feature in ftrace.
Instead of tracing all functions, if you want to attach callbacks on specific
function entry and exit, similar to the kprobes and kretprobes, you can
use fprobe. Compared with kprobes and kretprobes, fprobe gives faster
instrumentation for multiple functions with single handler. This document
describes how to use fprobe.
@@ -91,12 +92,14 @@ The prototype of the entry/exit callback function are as follows:

.. code-block:: c

 int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
 int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);

 void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
 void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);

Note that the @entry_ip is saved at function entry and passed to exit handler.
If the entry callback function returns !0, the corresponding exit callback will be cancelled.
Note that the @entry_ip is saved at function entry and passed to exit
handler.
If the entry callback function returns !0, the corresponding exit callback
will be cancelled.

@fp
        This is the address of `fprobe` data structure related to this handler.
@@ -112,12 +115,10 @@ If the entry callback function returns !0, the corresponding exit callback will
        This is the return address that the traced function will return to,
        somewhere in the caller. This can be used at both entry and exit.

@regs
        This is the `pt_regs` data structure at the entry and exit. Note that
        the instruction pointer of @regs may be different from the @entry_ip
        in the entry_handler. If you need traced instruction pointer, you need
        to use @entry_ip. On the other hand, in the exit_handler, the instruction
        pointer of @regs is set to the current return address.
@fregs
        This is the `ftrace_regs` data structure at the entry and exit. This
        includes the function parameters, or the return values. So user can
        access those values via appropriate `ftrace_regs_*` APIs.

@entry_data
        This is a local storage to share the data between entry and exit handlers.
@@ -125,6 +126,17 @@ If the entry callback function returns !0, the corresponding exit callback will
        and `entry_data_size` field when registering the fprobe, the storage is
        allocated and passed to both `entry_handler` and `exit_handler`.

Entry data size and exit handlers on the same function
======================================================

Since the entry data is passed via per-task stack and it has limited size,
the entry data size per probe is limited to `15 * sizeof(long)`. You also need
to take care that the different fprobes are probing on the same function, this
limit becomes smaller. The entry data size is aligned to `sizeof(long)` and
each fprobe which has exit handler uses a `sizeof(long)` space on the stack,
you should keep the number of fprobes on the same function as small as
possible.
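
The size rule above can be checked with a few lines of arithmetic. The sketch below assumes only what the paragraph states (a `15 * sizeof(long)` per-probe limit and alignment to `sizeof(long)`); the names `TOY_MAX_ENTRY_DATA`, `toy_align_entry_size()`, and `toy_entry_size_ok()` are hypothetical and are not part of the fprobe API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limit quoted in the text above; the macro name is hypothetical. */
#define TOY_MAX_ENTRY_DATA (15 * sizeof(long))

/* Round a requested entry data size up to a multiple of sizeof(long). */
static size_t toy_align_entry_size(size_t size)
{
	return (size + sizeof(long) - 1) / sizeof(long) * sizeof(long);
}

/* True if the aligned size still fits within the per-probe limit. */
static bool toy_entry_size_ok(size_t size)
{
	return toy_align_entry_size(size) <= TOY_MAX_ENTRY_DATA;
}
```

On a 64-bit build this gives a 120-byte ceiling, and a request of `15 * sizeof(long) + 1` bytes is rejected because it rounds up to 16 words.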

Share the callbacks with kprobes
================================

@@ -165,8 +177,8 @@ This counter counts up when;
 - fprobe fails to take ftrace_recursion lock. This usually means that a function
   which is traced by other ftrace users is called from the entry_handler.

 - fprobe fails to setup the function exit because of the shortage of rethook
   (the shadow stack for hooking the function return.)
 - fprobe fails to setup the function exit because of failing to allocate the
   data buffer from the per-task shadow stack.

The `fprobe::nmissed` field counts up in both cases. Therefore, the former
skips both of entry and exit callback and the latter skips the exit
+2 −0
@@ -217,9 +217,11 @@ config ARM64
	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
	select HAVE_EFFICIENT_UNALIGNED_ACCESS
	select HAVE_GUP_FAST
	select HAVE_FTRACE_GRAPH_FUNC
	select HAVE_FTRACE_MCOUNT_RECORD
	select HAVE_FUNCTION_TRACER
	select HAVE_FUNCTION_ERROR_INJECTION
	select HAVE_FUNCTION_GRAPH_FREGS
	select HAVE_FUNCTION_GRAPH_TRACER
	select HAVE_FUNCTION_GRAPH_RETVAL
	select HAVE_GCC_PLUGINS
+1 −0
@@ -8,6 +8,7 @@ syscall-y += unistd_32.h
syscall-y += unistd_compat_32.h

generic-y += early_ioremap.h
generic-y += fprobe.h
generic-y += mcs_spinlock.h
generic-y += mmzone.h
generic-y += qrwlock.h
+34 −17
@@ -52,6 +52,8 @@ extern unsigned long ftrace_graph_call;
extern void return_to_handler(void);

unsigned long ftrace_call_adjust(unsigned long addr);
unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip);
#define ftrace_get_symaddr(fentry_ip) arch_ftrace_get_symaddr(fentry_ip)

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
#define HAVE_ARCH_FTRACE_REGS
@@ -129,6 +131,38 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
	arch_ftrace_regs(fregs)->pc = arch_ftrace_regs(fregs)->lr;
}

static __always_inline unsigned long
ftrace_regs_get_frame_pointer(const struct ftrace_regs *fregs)
{
	return arch_ftrace_regs(fregs)->fp;
}

static __always_inline unsigned long
ftrace_regs_get_return_address(const struct ftrace_regs *fregs)
{
	return arch_ftrace_regs(fregs)->lr;
}

static __always_inline struct pt_regs *
ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
{
	struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);

	memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
	regs->sp = afregs->sp;
	regs->pc = afregs->pc;
	regs->regs[29] = afregs->fp;
	regs->regs[30] = afregs->lr;
	return regs;
}

#define arch_ftrace_fill_perf_regs(fregs, _regs) do {		\
		(_regs)->pc = arch_ftrace_regs(fregs)->pc;			\
		(_regs)->regs[29] = arch_ftrace_regs(fregs)->fp;		\
		(_regs)->sp = arch_ftrace_regs(fregs)->sp;			\
		(_regs)->pstate = PSR_MODE_EL1h;		\
	} while (0)

int ftrace_regs_query_register_offset(const char *name);

int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
@@ -186,23 +220,6 @@ static inline bool arch_syscall_match_sym_name(const char *sym,

#ifndef __ASSEMBLY__
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
struct fgraph_ret_regs {
	/* x0 - x7 */
	unsigned long regs[8];

	unsigned long fp;
	unsigned long __unused;
};

static inline unsigned long fgraph_ret_regs_return_value(struct fgraph_ret_regs *ret_regs)
{
	return ret_regs->regs[0];
}

static inline unsigned long fgraph_ret_regs_frame_pointer(struct fgraph_ret_regs *ret_regs)
{
	return ret_regs->fp;
}

void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
			   unsigned long frame_pointer);
+0 −12
@@ -179,18 +179,6 @@ int main(void)
  DEFINE(FTRACE_OPS_FUNC,		offsetof(struct ftrace_ops, func));
#endif
  BLANK();
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
  DEFINE(FGRET_REGS_X0,			offsetof(struct fgraph_ret_regs, regs[0]));
  DEFINE(FGRET_REGS_X1,			offsetof(struct fgraph_ret_regs, regs[1]));
  DEFINE(FGRET_REGS_X2,			offsetof(struct fgraph_ret_regs, regs[2]));
  DEFINE(FGRET_REGS_X3,			offsetof(struct fgraph_ret_regs, regs[3]));
  DEFINE(FGRET_REGS_X4,			offsetof(struct fgraph_ret_regs, regs[4]));
  DEFINE(FGRET_REGS_X5,			offsetof(struct fgraph_ret_regs, regs[5]));
  DEFINE(FGRET_REGS_X6,			offsetof(struct fgraph_ret_regs, regs[6]));
  DEFINE(FGRET_REGS_X7,			offsetof(struct fgraph_ret_regs, regs[7]));
  DEFINE(FGRET_REGS_FP,			offsetof(struct fgraph_ret_regs, fp));
  DEFINE(FGRET_REGS_SIZE,		sizeof(struct fgraph_ret_regs));
#endif
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
  DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
#endif