Commit b78f1293 authored by Linus Torvalds
Pull tracing updates from Steven Rostedt:

 - Have module addresses get updated in the persistent ring buffer

   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address
   in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address in
   the trace event to point to the address that is loaded in memory.
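
   The shift described above can be pictured with a small userspace
   sketch. It is not the kernel implementation: struct mod_range and
   shift_module_addr() are hypothetical names, and the only assumption is
   that each saved module has a base address from the previous boot, a
   base address in the current boot (0 if it is not loaded), and a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one module's text range across two boots. */
struct mod_range {
	uint64_t old_base;	/* base saved in the persistent ring buffer */
	uint64_t new_base;	/* base in the current boot, 0 if not loaded */
	uint64_t size;		/* size of the module's address range */
};

/*
 * Return addr moved into the current boot's address space when it falls
 * inside a module that was saved and is loaded again. Any other address
 * is returned unchanged.
 */
static uint64_t shift_module_addr(const struct mod_range *mods, size_t n,
				  uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		const struct mod_range *m = &mods[i];

		if (!m->new_base)
			continue;	/* saved, but not loaded in this boot */
		if (addr >= m->old_base && addr - m->old_base < m->size)
			return m->new_base + (addr - m->old_base);
	}
	return addr;
}
```

   Keeping the offset within the module constant is what lets a symbol
   lookup on the shifted address resolve to the same function.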

 - Print function names for irqs off and preempt off callsites

   When ignoring the print fmt of a trace event and just printing the
   fields directly, have the fields for preempt off and irqs off events
   still show the function name (via kallsyms) instead of just showing
   the raw address.

 - Clean ups of the histogram code

   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold
   this information and have a separate location for each context level
   (thread, softirq, IRQ and NMI).

   Also add some more comments to the code.
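
   As a rough illustration of why one buffer per context level is
   needed, the userspace sketch below keeps a separate scratch area for
   each (CPU, context) pair. All names and sizes here are hypothetical
   and chosen only for the example; the kernel uses real per-cpu
   allocations rather than a static array.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS	4	/* hypothetical CPU count */
#define SKETCH_BUF_SIZE	1024	/* hypothetical scratch size per context */

/* The four context levels named in the commit message. */
enum ctx_level { CTX_THREAD, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

struct ctx_bufs {
	char buf[CTX_MAX][SKETCH_BUF_SIZE];
};

static struct ctx_bufs percpu_bufs[SKETCH_NR_CPUS];

/*
 * Return the scratch buffer for this CPU and context level, or NULL if
 * either index is out of range.
 */
static char *hist_scratch(int cpu, enum ctx_level ctx)
{
	if (cpu < 0 || cpu >= SKETCH_NR_CPUS || (unsigned int)ctx >= CTX_MAX)
		return NULL;
	return percpu_bufs[cpu].buf[ctx];
}
```

   Because an interrupt that preempts thread context on the same CPU gets
   a different slot, it cannot overwrite data the interrupted code is
   still using.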

 - Add "common_comm" field for histograms

   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.

 - Show "subops" in the enabled_functions file

   When the function graph infrastructure is used, a subsystem has a
   "subops" that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that
   calls the subops functions, also show the subops functions that will
   get called for that function too.

 - Add "copy_trace_marker" option to instances

   There are cases where an instance is created for tooling to write
   into, but the old tooling has the top level instance hardcoded into
   the application. New tools want to consume the data from an instance
   and not the top level buffer. By adding a copy_trace_marker option,
   whenever the top instance trace_marker is written into, a copy of it
   is also written into the instance with this option set. This allows
   new tools to read what old tools are writing into the top buffer.

   If this option is cleared in the top instance, then what is written
   into the trace_marker is not written into the top instance. This is a
   way to redirect the trace_marker writes into another instance.
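
   The routing rule can be modeled in a few lines of userspace C. This
   is only a model: struct instance and write_top_trace_marker() are
   hypothetical, and returning the number of receiving instances is an
   assumption made for the example. The -ENODEV result when no instance
   has the option set follows the documentation text added by this pull.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracing instance. */
struct instance {
	const char *name;
	bool copy_trace_marker;	/* the option described above */
	unsigned int marks;	/* marker writes this instance received */
};

/*
 * Model one write to the top level trace_marker: every instance with
 * the option set gets a copy. Returns how many instances received it,
 * or -ENODEV if none did.
 */
static int write_top_trace_marker(struct instance *inst, size_t n)
{
	int written = 0;

	for (size_t i = 0; i < n; i++) {
		if (!inst[i].copy_trace_marker)
			continue;
		inst[i].marks++;
		written++;
	}
	return written ? written : -ENODEV;
}
```

   With the option set on both the top instance and a second instance,
   each write lands in both. Clearing it on the top instance leaves only
   the second instance receiving the writes, which is the redirect case.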

 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()

   If a tracepoint is created by DECLARE_TRACE() instead of
   TRACE_EVENT(), then it will not be exposed via tracefs. Currently
   there's no way in the kernel to tell tracepoint functions that are
   exposed via tracefs apart from those that are not. By convention, a
   "_tp" suffix has been appended manually to events created by
   DECLARE_TRACE(). Instead of doing this manually, force it so that all
   DECLARE_TRACE() events have this notation.
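
   The naming rule comes down to token pasting in the preprocessor. The
   sketch below uses simplified, hypothetical SKETCH_* macros (the real
   macros also handle static keys, probes and conditions) to show how
   pasting "_tp" onto the name before the inner macro builds
   trace_<name>() yields trace_<name>_tp().

```c
/*
 * Simplified stand-in for __DECLARE_TRACE(): defines a hit counter and
 * a trace_<name>() function that bumps it.
 */
#define SKETCH_DECLARE_TRACE_INNER(name, type)				\
	static int name##_hits;						\
	static inline void trace_##name(type arg)			\
	{								\
		(void)arg;						\
		name##_hits++;						\
	}

/* DECLARE_TRACE() style: the "_tp" suffix is forced onto the name. */
#define SKETCH_DECLARE_TRACE(name, type)				\
	SKETCH_DECLARE_TRACE_INNER(name##_tp, type)

/* TRACE_EVENT() style: the name is passed through unchanged. */
#define SKETCH_DECLARE_TRACE_EVENT(name, type)				\
	SKETCH_DECLARE_TRACE_INNER(name, type)

SKETCH_DECLARE_TRACE(foo, int)		/* defines trace_foo_tp() */
SKETCH_DECLARE_TRACE_EVENT(bar, int)	/* defines trace_bar() */
```

   A caller of trace_foo_tp() can tell from the name alone that the
   tracepoint has no matching event in tracefs.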

 - Use __string() for task->comm in some sched events

   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events, use __string(), which makes it dynamic. Note, if
   these events are parsed by user space, that parsing may break, and the
   event may have to be converted back to the hardcoded size.

 - Have function graph "depth" be unsigned to the user

   Internally to the kernel, the "depth" field of the function graph
   event is signed due to -1 being used for end of boundary. What
   actually gets recorded in the event itself is zero or positive.
   Reflect this to user space by showing "depth" as unsigned int and be
   consistent across all events.

 - Allow an arbitrarily long CPU string to osnoise_cpus_write()

   The filtering of which CPUs to write to can exceed 256 bytes. If a
   machine has 256 CPUs, and the filter is to filter every other CPU,
   the write would take a string larger than 256 bytes. Instead of using
   a fixed size buffer on the stack that is 256 bytes, allocate it to
   handle what is passed in.
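
   To make the size concrete, the userspace sketch below builds the
   "every other CPU" list for a given CPU count into a heap buffer sized
   from the input. The function name is hypothetical. For 256 CPUs the
   list "0,2,...,254" has 128 entries and is 456 characters long, well
   past a 256-byte stack buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "0,2,4,..." for every other CPU below nr_cpus. The buffer is
 * sized from the input instead of using a fixed-size array. The caller
 * frees the result. Returns NULL on allocation failure.
 */
static char *every_other_cpu_list(unsigned int nr_cpus)
{
	/* Worst case per entry: 10 digits for an unsigned int plus a comma. */
	size_t cap = ((size_t)nr_cpus / 2 + 1) * 11 + 1;
	char *buf = malloc(cap);
	size_t len = 0;

	if (!buf)
		return NULL;
	buf[0] = '\0';
	for (unsigned int cpu = 0; cpu < nr_cpus; cpu += 2)
		len += (size_t)snprintf(buf + len, cap - len, "%s%u",
					len ? "," : "", cpu);
	return buf;
}
```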

 - Stop having ftrace check the per-cpu data "disabled" flag

   The "disabled" flag in the data structure passed to most ftrace
   functions is checked to know if tracing has been disabled or not.
   This flag was added back in 2008 before the ring buffer had its own
   way to disable tracing. The "disabled" flag is now not always set when
   needed, and the ring buffer flag should be used in all locations
   where the disabled state is needed. Since the "disabled" flag is
   redundant and incorrect, stop using it. Fix up some locations that use
   the "disabled" flag to use the ring buffer info.

 - Use a new tracer_tracing_disable/enable() instead of data->disabled
   flag

   There are a few cases that set the data->disabled flag to stop
   tracing, but this flag is not consistently used. It is also an on/off
   switch where if a function sets it and calls another function that
   sets it, the called function may incorrectly enable it.

   Use a new tracer_tracing_disable() and tracer_tracing_enable() that
   use a counter and can be nested. These use the ring buffer flags
   which are always checked, making the disabling more consistent.
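
   The nesting problem with an on/off switch, and how a counter avoids
   it, can be shown with a small userspace model. Every name below is
   hypothetical. The sketch models only the nesting behavior, not the
   ring buffer flags that the real helpers use.

```c
#include <stdbool.h>

struct tracer_state {
	bool flag_disabled;	/* on/off switch, like the old flag */
	int disable_depth;	/* nesting counter, like the new helpers */
};

static void flag_disable(struct tracer_state *t) { t->flag_disabled = true; }
static void flag_enable(struct tracer_state *t) { t->flag_disabled = false; }

static void counted_disable(struct tracer_state *t) { t->disable_depth++; }
static void counted_enable(struct tracer_state *t) { t->disable_depth--; }

/* Tracing is on only when every disable has been matched by an enable. */
static bool counted_is_on(const struct tracer_state *t)
{
	return t->disable_depth == 0;
}

/* An inner helper that disables tracing around its own work. */
static void inner_flag(struct tracer_state *t)
{
	flag_disable(t);
	flag_enable(t);		/* also undoes the caller's disable */
}

static void inner_counted(struct tracer_state *t)
{
	counted_disable(t);
	counted_enable(t);	/* the caller's disable is still in effect */
}
```

   If an outer function disables tracing and then calls inner_flag(),
   tracing is back on when the call returns. With inner_counted() the
   depth only drops back to 1, so tracing stays off until the outer
   function calls its own enable.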

 - Save the trace clock in the persistent ring buffer

   Save what clock was used for tracing in the persistent ring buffer
   and set it back to that clock after a reboot.

 - Remove unused reference to a per CPU data pointer in mmiotrace
   functions

 - Remove unused buffer_page field from trace_array_cpu structure

 - Remove more strncpy() instances

 - Other minor clean ups and fixes

* tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits)
  tracing: Fix compilation warning on arm32
  tracing: Record trace_clock and recover when reboot
  tracing/sched: Use __string() instead of fixed lengths for task->comm
  tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
  tracing: Cleanup upper_empty() in pid_list
  tracing: Allow the top level trace_marker to write into another instances
  tracing: Add a helper function to handle the dereference arg in verifier
  tracing: Remove unnecessary "goto out" that simply returns ret is trigger code
  tracing: Fix error handling in event_trigger_parse()
  tracing: Rename event_trigger_alloc() to trigger_data_alloc()
  tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf
  tracing: Remove unused buffer_page field from trace_array_cpu structure
  tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer
  tracing: Convert the per CPU "disabled" counter to local from atomic
  tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field
  ring-buffer: Add ring_buffer_record_is_on_cpu()
  tracing: Do not use per CPU array_buffer.data->disabled for cpumask
  ftrace: Do not disabled function graph based on "disabled" field
  tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled
  tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one()
  ...
parents 472c5f73 2fbdb6d8
+13 −0
@@ -1205,6 +1205,19 @@ Here are the available options:
	default instance. The only way the top level instance has this flag
	cleared, is by it being set in another instance.

+  copy_trace_marker
+	If there are applications that hard code writing into the top level
+	trace_marker file (/sys/kernel/tracing/trace_marker or trace_marker_raw),
+	and the tooling would like it to go into an instance, this option can
+	be used. Create an instance and set this option, and then all writes
+	into the top level trace_marker file will also be redirected into this
+	instance.
+
+	Note, by default this option is set for the top level instance. If it
+	is disabled, then writes to the trace_marker or trace_marker_raw files
+	will not be written into the top level file. If no instance has this
+	option set, then a write will error with the errno of ENODEV.
+
  annotate
	It is sometimes confusing when the CPU buffers are full
	and one CPU buffer had a lot of events recently, thus
+11 −6
@@ -71,7 +71,7 @@ In subsys/file.c (where the tracing statement must be added)::
	void somefct(void)
	{
		...
-		trace_subsys_eventname(arg, task);
+		trace_subsys_eventname_tp(arg, task);
		...
	}

@@ -129,12 +129,12 @@ within an if statement with the following::
		for (i = 0; i < count; i++)
			tot += calculate_nuggets();

-		trace_foo_bar(tot);
+		trace_foo_bar_tp(tot);
	}

-All trace_<tracepoint>() calls have a matching trace_<tracepoint>_enabled()
+All trace_<tracepoint>_tp() calls have a matching trace_<tracepoint>_enabled()
function defined that returns true if the tracepoint is enabled and
-false otherwise. The trace_<tracepoint>() should always be within the
+false otherwise. The trace_<tracepoint>_tp() should always be within the
block of the if (trace_<tracepoint>_enabled()) to prevent races between
the tracepoint being enabled and the check being seen.

@@ -143,7 +143,10 @@ the static_key of the tracepoint to allow the if statement to be implemented
with jump labels and avoid conditional branches.

.. note:: The convenience macro TRACE_EVENT provides an alternative way to
-      define tracepoints. Check http://lwn.net/Articles/379903,
+      define tracepoints. Note, DECLARE_TRACE(foo) creates a function
+      "trace_foo_tp()" whereas TRACE_EVENT(foo) creates a function
+      "trace_foo()", and also exposes the tracepoint as a trace event in
+      /sys/kernel/tracing/events directory.  Check http://lwn.net/Articles/379903,
      http://lwn.net/Articles/381064 and http://lwn.net/Articles/383362
      for a series of articles with more details.

@@ -159,7 +162,9 @@ In a C file::

	void do_trace_foo_bar_wrapper(args)
	{
-		trace_foo_bar(args);
+		trace_foo_bar_tp(args); // for tracepoints created via DECLARE_TRACE
+					//   or
+		trace_foo_bar(args);    // for tracepoints created via TRACE_EVENT
	}

In the header file::
+2 −0
@@ -328,6 +328,7 @@ ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops);
 * DIRECT - Used by the direct ftrace_ops helper for direct functions
 *            (internal ftrace only, should not be used by others)
 * SUBOP  - Is controlled by another op in field managed.
+ * GRAPH  - Is a component of the fgraph_ops structure
 */
enum {
	FTRACE_OPS_FL_ENABLED			= BIT(0),
@@ -349,6 +350,7 @@ enum {
	FTRACE_OPS_FL_PERMANENT                 = BIT(16),
	FTRACE_OPS_FL_DIRECT			= BIT(17),
	FTRACE_OPS_FL_SUBOP			= BIT(18),
+	FTRACE_OPS_FL_GRAPH			= BIT(19),
};

#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+1 −0
@@ -192,6 +192,7 @@ void ring_buffer_record_off(struct trace_buffer *buffer);
void ring_buffer_record_on(struct trace_buffer *buffer);
bool ring_buffer_record_is_on(struct trace_buffer *buffer);
bool ring_buffer_record_is_set_on(struct trace_buffer *buffer);
+bool ring_buffer_record_is_on_cpu(struct trace_buffer *buffer, int cpu);
void ring_buffer_record_disable_cpu(struct trace_buffer *buffer, int cpu);
void ring_buffer_record_enable_cpu(struct trace_buffer *buffer, int cpu);

+26 −12
@@ -464,16 +464,30 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
#endif

#define DECLARE_TRACE(name, proto, args)				\
-	__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),		\
+	__DECLARE_TRACE(name##_tp, PARAMS(proto), PARAMS(args),		\
			cpu_online(raw_smp_processor_id()),		\
			PARAMS(void *__data, proto))

#define DECLARE_TRACE_CONDITION(name, proto, args, cond)		\
-	__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),		\
+	__DECLARE_TRACE(name##_tp, PARAMS(proto), PARAMS(args),		\
			cpu_online(raw_smp_processor_id()) && (PARAMS(cond)), \
			PARAMS(void *__data, proto))

#define DECLARE_TRACE_SYSCALL(name, proto, args)			\
+	__DECLARE_TRACE_SYSCALL(name##_tp, PARAMS(proto), PARAMS(args),	\
+				PARAMS(void *__data, proto))
+
+#define DECLARE_TRACE_EVENT(name, proto, args)				\
+	__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),		\
+			cpu_online(raw_smp_processor_id()),		\
+			PARAMS(void *__data, proto))
+
+#define DECLARE_TRACE_EVENT_CONDITION(name, proto, args, cond)		\
+	__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),		\
+			cpu_online(raw_smp_processor_id()) && (PARAMS(cond)), \
+			PARAMS(void *__data, proto))
+
+#define DECLARE_TRACE_EVENT_SYSCALL(name, proto, args)			\
	__DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args),	\
				PARAMS(void *__data, proto))

@@ -591,32 +605,32 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)

#define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print)
#define DEFINE_EVENT(template, name, proto, args)		\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT(name, PARAMS(proto), PARAMS(args))
#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg)\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT(name, PARAMS(proto), PARAMS(args))
#define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT(name, PARAMS(proto), PARAMS(args))
#define DEFINE_EVENT_CONDITION(template, name, proto,		\
			       args, cond)			\
-	DECLARE_TRACE_CONDITION(name, PARAMS(proto),		\
+	DECLARE_TRACE_EVENT_CONDITION(name, PARAMS(proto),	\
				PARAMS(args), PARAMS(cond))

#define TRACE_EVENT(name, proto, args, struct, assign, print)	\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT(name, PARAMS(proto), PARAMS(args))
#define TRACE_EVENT_FN(name, proto, args, struct,		\
		assign, print, reg, unreg)			\
-	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT(name, PARAMS(proto), PARAMS(args))
#define TRACE_EVENT_FN_COND(name, proto, args, cond, struct,	\
		assign, print, reg, unreg)			\
-	DECLARE_TRACE_CONDITION(name, PARAMS(proto),	\
+	DECLARE_TRACE_EVENT_CONDITION(name, PARAMS(proto),	\
			PARAMS(args), PARAMS(cond))
#define TRACE_EVENT_CONDITION(name, proto, args, cond,		\
			      struct, assign, print)		\
-	DECLARE_TRACE_CONDITION(name, PARAMS(proto),		\
+	DECLARE_TRACE_EVENT_CONDITION(name, PARAMS(proto),	\
				PARAMS(args), PARAMS(cond))
#define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign,	\
			    print, reg, unreg)			\
-	DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args))
+	DECLARE_TRACE_EVENT_SYSCALL(name, PARAMS(proto), PARAMS(args))

#define TRACE_EVENT_FLAGS(event, flag)
