Commit 4ff261e7 authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull runtime verification updates from Steven Rostedt:

 - Added Linear temporal logic monitors for RT application

   Real-time applications may have design flaws causing them to have
   unexpected latency. For example, the applications may raise page
   faults, or may be blocked trying to take a mutex without priority
   inheritance.

   However, while attempting to implement DA monitors for these
   real-time rules, deterministic automaton is found to be inappropriate
   as the specification language. The automaton is complicated, hard to
   understand, and error-prone.

   For these cases, linear temporal logic is found to be more suitable.
   The LTL is more concise and intuitive.

 - Make printk_deferred() public

   The new monitors needed access to printk_deferred(). Make them
   visible for the entire kernel.

 - Add a vpanic() to allow for va_list to be passed to panic.

 - Add rtapp container monitor.

   A collection of monitors that check for common problems with
   real-time applications that cause unexpected latency.

 - Add page fault tracepoints to risc-v

   These tracepoints are necessary to for the RV monitor to run on
   risc-v.

 - Fix the behaviour of the rv tool with -s and idle tasks.

 - Allow the rv tool to gracefully terminate with SIGTERM

 - Adjusts dot2c not to create lines over 100 columns

 - Properly order nested monitors in the RV Kconfig file

 - Return the registration error in all DA monitor instead of 0

 - Update and add new sched collection monitors

   Replace tss and sncid monitors with more complete sts:

   Not only prove that switches occur in scheduling context and scheduling
   needs interrupt disabled but also that each call to the scheduler
   disables interrupts to (optionally) switch.

   New monitor: nrp
     Preemption requires need resched which is cleared by any switch
     (includes a non optimal workaround for /nested/ preemptions)

   New monitor: sssw
     suspension requires setting the task to sleepable and, after the
     switch occurs, the task requires a wakeup to come back to runnable

   New monitor: opid
      waking and need-resched operations occur with interrupts and
      preemption disabled or in IRQ without explicitly disabling
      preemption"

* tag 'trace-rv-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (48 commits)
  rv: Add opid per-cpu monitor
  rv: Add nrp and sssw per-task monitors
  rv: Replace tss and sncid monitors with more complete sts
  sched: Adapt sched tracepoints for RV task model
  rv: Retry when da monitor detects race conditions
  rv: Adjust monitor dependencies
  rv: Use strings in da monitors tracepoints
  rv: Remove trailing whitespace from tracepoint string
  rv: Add da_handle_start_run_event_ to per-task monitors
  rv: Fix wrong type cast in reactors_show() and monitor_reactor_show()
  rv: Fix wrong type cast in monitors_show()
  rv: Remove struct rv_monitor::reacting
  rv: Remove rv_reactor's reference counter
  rv: Merge struct rv_reactor_def into struct rv_reactor
  rv: Merge struct rv_monitor_def into struct rv_monitor
  rv: Remove unused field in struct rv_monitor_def
  rv: Return init error when registering monitors
  verification/rvgen: Organise Kconfig entries for nested monitors
  tools/dot2c: Fix generated files going over 100 column limit
  tools/rv: Stop gracefully also on SIGTERM
  ...
parents d50b07d0 61438453
Loading
Loading
Loading
Loading
+3 −1
Original line number Diff line number Diff line
@@ -8,8 +8,10 @@ Runtime Verification

   runtime-verification.rst
   deterministic_automata.rst
   da_monitor_synthesis.rst
   linear_temporal_logic.rst
   monitor_synthesis.rst
   da_monitor_instrumentation.rst
   monitor_wip.rst
   monitor_wwnr.rst
   monitor_sched.rst
   monitor_rtapp.rst
+134 −0
Original line number Diff line number Diff line
Linear temporal logic
=====================

Introduction
------------

Runtime verification monitor is a verification technique which checks that the
kernel follows a specification. It does so by using tracepoints to monitor the
kernel's execution trace, and verifying that the execution trace sastifies the
specification.

Initially, the specification can only be written in the form of deterministic
automaton (DA).  However, while attempting to implement DA monitors for some
complex specifications, deterministic automaton is found to be inappropriate as
the specification language. The automaton is complicated, hard to understand,
and error-prone.

Thus, RV monitors based on linear temporal logic (LTL) are introduced. This type
of monitor uses LTL as specification instead of DA. For some cases, writing the
specification as LTL is more concise and intuitive.

Many materials explain LTL in details. One book is::

  Christel Baier and Joost-Pieter Katoen: Principles of Model Checking, The MIT
  Press, 2008.

Grammar
-------

Unlike some existing syntax, kernel's implementation of LTL is more verbose.
This is motivated by considering that the people who read the LTL specifications
may not be well-versed in LTL.

Grammar:
    ltl ::= opd | ( ltl ) | ltl binop ltl | unop ltl

Operands (opd):
    true, false, user-defined names consisting of upper-case characters, digits,
    and underscore.

Unary Operators (unop):
    always
    eventually
    next
    not

Binary Operators (binop):
    until
    and
    or
    imply
    equivalent

This grammar is ambiguous: operator precedence is not defined. Parentheses must
be used.

Example linear temporal logic
-----------------------------
.. code-block::

   RAIN imply (GO_OUTSIDE imply HAVE_UMBRELLA)

means: if it is raining, going outside means having an umbrella.

.. code-block::

   RAIN imply (WET until not RAIN)

means: if it is raining, it is going to be wet until the rain stops.

.. code-block::

   RAIN imply eventually not RAIN

means: if it is raining, rain will eventually stop.

The above examples are referring to the current time instance only. For kernel
verification, the `always` operator is usually desirable, to specify that
something is always true at the present and for all future. For example::

    always (RAIN imply eventually not RAIN)

means: *all* rain eventually stops.

In the above examples, `RAIN`, `GO_OUTSIDE`, `HAVE_UMBRELLA` and `WET` are the
"atomic propositions".

Monitor synthesis
-----------------

To synthesize an LTL into a kernel monitor, the `rvgen` tool can be used:
`tools/verification/rvgen`. The specification needs to be provided as a file,
and it must have a "RULE = LTL" assignment. For example::

    RULE = always (ACQUIRE imply ((not KILLED and not CRASHED) until RELEASE))

which says: if `ACQUIRE`, then `RELEASE` must happen before `KILLED` or
`CRASHED`.

The LTL can be broken down using sub-expressions. The above is equivalent to:

   .. code-block::

    RULE = always (ACQUIRE imply (ALIVE until RELEASE))
    ALIVE = not KILLED and not CRASHED

From this specification, `rvgen` generates the C implementation of a Buchi
automaton - a non-deterministic state machine which checks the satisfiability of
the LTL. See Documentation/trace/rv/monitor_synthesis.rst for details on using
`rvgen`.

References
----------

One book covering model checking and linear temporal logic is::

  Christel Baier and Joost-Pieter Katoen: Principles of Model Checking, The MIT
  Press, 2008.

For an example of using linear temporal logic in software testing, see::

  Ruijie Meng, Zhen Dong, Jialin Li, Ivan Beschastnikh, and Abhik Roychoudhury.
  2022. Linear-time temporal logic guided greybox fuzzing. In Proceedings of the
  44th International Conference on Software Engineering (ICSE '22).  Association
  for Computing Machinery, New York, NY, USA, 1343–1355.
  https://doi.org/10.1145/3510003.3510082

The kernel's LTL monitor implementation is based on::

  Gerth, R., Peled, D., Vardi, M.Y., Wolper, P. (1996). Simple On-the-fly
  Automatic Verification of Linear Temporal Logic. In: Dembiński, P., Średniawa,
  M. (eds) Protocol Specification, Testing and Verification XV. PSTV 1995. IFIP
  Advances in Information and Communication Technology. Springer, Boston, MA.
  https://doi.org/10.1007/978-0-387-34892-6_1
+133 −0
Original line number Diff line number Diff line
Real-time application monitors
==============================

- Name: rtapp
- Type: container for multiple monitors
- Author: Nam Cao <namcao@linutronix.de>

Description
-----------

Real-time applications may have design flaws such that they experience
unexpected latency and fail to meet their time requirements. Often, these flaws
follow a few patterns:

  - Page faults: A real-time thread may access memory that does not have a
    mapped physical backing or must first be copied (such as for copy-on-write).
    Thus a page fault is raised and the kernel must first perform the expensive
    action. This causes significant delays to the real-time thread
  - Priority inversion: A real-time thread blocks waiting for a lower-priority
    thread. This causes the real-time thread to effectively take on the
    scheduling priority of the lower-priority thread. For example, the real-time
    thread needs to access a shared resource that is protected by a
    non-pi-mutex, but the mutex is currently owned by a non-real-time thread.

The `rtapp` monitor detects these patterns. It aids developers to identify
reasons for unexpected latency with real-time applications. It is a container of
multiple sub-monitors described in the following sections.

Monitor pagefault
+++++++++++++++++

The `pagefault` monitor reports real-time tasks raising page faults. Its
specification is::

  RULE = always (RT imply not PAGEFAULT)

To fix warnings reported by this monitor, `mlockall()` or `mlock()` can be used
to ensure physical backing for memory.

This monitor may have false negatives because the pages used by the real-time
threads may just happen to be directly available during testing.  To minimize
this, the system can be put under memory pressure (e.g.  invoking the OOM killer
using a program that does `ptr = malloc(SIZE_OF_RAM); memset(ptr, 0,
SIZE_OF_RAM);`) so that the kernel executes aggressive strategies to recycle as
much physical memory as possible.

Monitor sleep
+++++++++++++

The `sleep` monitor reports real-time threads sleeping in a manner that may
cause undesirable latency. Real-time applications should only put a real-time
thread to sleep for one of the following reasons:

  - Cyclic work: real-time thread sleeps waiting for the next cycle. For this
    case, only the `clock_nanosleep` syscall should be used with `TIMER_ABSTIME`
    (to avoid time drift) and `CLOCK_MONOTONIC` (to avoid the clock being
    changed). No other method is safe for real-time. For example, threads
    waiting for timerfd can be woken by softirq which provides no real-time
    guarantee.
  - Real-time thread waiting for something to happen (e.g. another thread
    releasing shared resources, or a completion signal from another thread). In
    this case, only futexes (FUTEX_LOCK_PI, FUTEX_LOCK_PI2 or one of
    FUTEX_WAIT_*) should be used.  Applications usually do not use futexes
    directly, but use PI mutexes and PI condition variables which are built on
    top of futexes. Be aware that the C library might not implement conditional
    variables as safe for real-time. As an alternative, the librtpi library
    exists to provide a conditional variable implementation that is correct for
    real-time applications in Linux.

Beside the reason for sleeping, the eventual waker should also be
real-time-safe. Namely, one of:

  - An equal-or-higher-priority thread
  - Hard interrupt handler
  - Non-maskable interrupt handler

This monitor's warning usually means one of the following:

  - Real-time thread is blocked by a non-real-time thread (e.g. due to
    contention on a mutex without priority inheritance). This is priority
    inversion.
  - Time-critical work waits for something which is not safe for real-time (e.g.
    timerfd).
  - The work executed by the real-time thread does not need to run at real-time
    priority at all.  This is not a problem for the real-time thread itself, but
    it is potentially taking the CPU away from other important real-time work.

Application developers may purposely choose to have their real-time application
sleep in a way that is not safe for real-time. It is debatable whether that is a
problem. Application developers must analyze the warnings to make a proper
assessment.

The monitor's specification is::

  RULE = always ((RT and SLEEP) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))

  RT_FRIENDLY_SLEEP = (RT_VALID_SLEEP_REASON or KERNEL_THREAD)
                  and ((not WAKE) until RT_FRIENDLY_WAKE)

  RT_VALID_SLEEP_REASON = FUTEX_WAIT
                       or RT_FRIENDLY_NANOSLEEP

  RT_FRIENDLY_NANOSLEEP = CLOCK_NANOSLEEP
                      and NANOSLEEP_TIMER_ABSTIME
                      and NANOSLEEP_CLOCK_MONOTONIC

  RT_FRIENDLY_WAKE = WOKEN_BY_EQUAL_OR_HIGHER_PRIO
                  or WOKEN_BY_HARDIRQ
                  or WOKEN_BY_NMI
                  or KTHREAD_SHOULD_STOP

  ALLOWLIST = BLOCK_ON_RT_MUTEX
           or FUTEX_LOCK_PI
           or TASK_IS_RCU
           or TASK_IS_MIGRATION

Beside the scenarios described above, this specification also handle some
special cases:

  - `KERNEL_THREAD`: kernel tasks do not have any pattern that can be recognized
    as valid real-time sleeping reasons. Therefore sleeping reason is not
    checked for kernel tasks.
  - `KTHREAD_SHOULD_STOP`: a non-real-time thread may stop a real-time kernel
    thread by waking it and waiting for it to exit (`kthread_stop()`). This
    wakeup is safe for real-time.
  - `ALLOWLIST`: to handle known false positives with the kernel.
  - `BLOCK_ON_RT_MUTEX` is included in the allowlist due to its implementation.
    In the release path of rt_mutex, a boosted task is de-boosted before waking
    the rt_mutex's waiter. Consequently, the monitor may see a real-time-unsafe
    wakeup (e.g. non-real-time task waking real-time task). This is actually
    real-time-safe because preemption is disabled for the duration.
  - `FUTEX_LOCK_PI` is included in the allowlist for the same reason as
    `BLOCK_ON_RT_MUTEX`.
+269 −38

File changed.

Preview size limit exceeded, changes collapsed.

+271 −0
Original line number Diff line number Diff line
Deterministic Automata Monitor Synthesis
========================================
Runtime Verification Monitor Synthesis
======================================

The starting point for the application of runtime verification (RV) techniques
is the *specification* or *modeling* of the desired (or undesired) behavior
@@ -36,24 +36,26 @@ below::
                                  |  +----> panic ?
                                  +-------> <user-specified>

DA monitor synthesis
RV monitor synthesis
--------------------

The synthesis of automata-based models into the Linux *RV monitor* abstraction
is automated by the dot2k tool and the rv/da_monitor.h header file that
contains a set of macros that automatically generate the monitor's code.
The synthesis of a specification into the Linux *RV monitor* abstraction is
automated by the rvgen tool and the header file containing common code for
creating monitors. The header files are:

dot2k
  * rv/da_monitor.h for deterministic automaton monitor.
  * rv/ltl_monitor.h for linear temporal logic monitor.

rvgen
-----

The dot2k utility leverages dot2c by converting an automaton model in
the DOT format into the C representation [1] and creating the skeleton of
a kernel monitor in C.
The rvgen utility converts a specification into the C presentation and creating
the skeleton of a kernel monitor in C.

For example, it is possible to transform the wip.dot model present in
[1] into a per-cpu monitor with the following command::

  $ dot2k -d wip.dot -t per_cpu
  $ rvgen monitor -c da -s wip.dot -t per_cpu

This will create a directory named wip/ with the following files:

@@ -63,18 +65,38 @@ This will create a directory named wip/ with the following files:
The wip.c file contains the monitor declaration and the starting point for
the system instrumentation.

Monitor macros
--------------
Similarly, a linear temporal logic monitor can be generated with the following
command::

  $ rvgen monitor -c ltl -s pagefault.ltl -t per_task

This generates pagefault/ directory with:

- pagefault.h: The Buchi automaton (the non-deterministic state machine to
  verify the specification)
- pagefault.c: The skeleton for the RV monitor

Monitor header files
--------------------

The header files:

- `rv/da_monitor.h` for deterministic automaton monitor
- `rv/ltl_monitor` for linear temporal logic monitor

include common macros and static functions for implementing *Monitor
Instance(s)*.

The rv/da_monitor.h enables automatic code generation for the *Monitor
Instance(s)* using C macros.
The benefits of having all common functionalities in a single header file are
3-fold:

The benefits of the usage of macro for monitor synthesis are 3-fold as it:
  - Reduce the code duplication;
  - Facilitate the bug fix/improvement;
  - Avoid the case of developers changing the core of the monitor code to
    manipulate the model in a (let's say) non-standard way.

- Reduces the code duplication;
- Facilitates the bug fix/improvement;
- Avoids the case of developers changing the core of the monitor code
  to manipulate the model in a (let's say) non-standard way.
rv/da_monitor.h
+++++++++++++++

This initial implementation presents three different types of monitor instances:

@@ -87,7 +109,7 @@ the second for monitors with per-cpu instances, and the third with per-task
instances.

In all cases, the 'name' argument is a string that identifies the monitor, and
the 'type' argument is the data type used by dot2k on the representation of
the 'type' argument is the data type used by rvgen on the representation of
the model in C.

For example, the wip model with two states and three events can be
@@ -130,11 +152,113 @@ While the event "preempt_enabled" will use::
To notify the monitor that the system will be returning to the initial state,
so the system and the monitor should be in sync.

rv/ltl_monitor.h
++++++++++++++++
This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`)
to be complete. For example, for the `pagefault` monitor, the `pagefault.c`
source file must include::

  #include "pagefault.h"
  #include <rv/ltl_monitor.h>

(the skeleton monitor file generated by `rvgen` already does this).

`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the
implementation of the Buchi automaton - a non-deterministic state machine that
verifies the LTL specification. While `rv/ltl_monitor.h` includes the common
helper functions to interact with the Buchi automaton and to implement an RV
monitor. An important definition in `$(MODEL_NAME).h` is::

  enum ltl_atom {
      LTL_$(FIRST_ATOMIC_PROPOSITION),
      LTL_$(SECOND_ATOMIC_PROPOSITION),
      ...
      LTL_NUM_ATOM
  };

which is the list of atomic propositions present in the LTL specification
(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the
functions interacting with the Buchi automaton.

While generating code, `rvgen` cannot understand the meaning of the atomic
propositions. Thus, that task is left for manual work. The recommended pratice
is adding tracepoints to places where the atomic propositions change; and in the
tracepoints' handlers: the Buchi automaton is executed using::

  void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value)

which tells the Buchi automaton that the atomic proposition `atom` is now
`value`. The Buchi automaton checks whether the LTL specification is still
satisfied, and invokes the monitor's error tracepoint and the reactor if
violation is detected.

Tracepoints and `ltl_atom_update()` should be used whenever possible. However,
it is sometimes not the most convenient. For some atomic propositions which are
changed in multiple places in the kernel, it is cumbersome to trace all those
places. Furthermore, it may not be important that the atomic propositions are
updated at precise times. For example, considering the following linear temporal
logic::

  RULE = always (RT imply not PAGEFAULT)

This LTL states that a real-time task does not raise page faults. For this
specification, it is not important when `RT` changes, as long as it has the
correct value when `PAGEFAULT` is true.  Motivated by this case, another
function is introduced::

  void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)

This function is called whenever the Buchi automaton is triggered. Therefore, it
can be manually implemented to "fetch" `RT`::

  void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)
  {
      ltl_atom_set(mon, LTL_RT, rt_task(task));
  }

Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`,
`RT` is also fetched. Thus, the LTL specification can be verified without
tracing `RT` everywhere.

For atomic propositions which act like events, they usually need to be set (or
cleared) and then immediately cleared (or set). A convenient function is
provided::

  void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value)

which is equivalent to::

  ltl_atom_update(task, atom, value);
  ltl_atom_update(task, atom, !value);

To initialize the atomic propositions, the following function must be
implemented::

  ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)

This function is called for all running tasks when the monitor is enabled. It is
also called for new tasks created after the enabling the monitor. It should
initialize as many atomic propositions as possible, for example::

  void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
  {
      ltl_atom_set(mon, LTL_RT, rt_task(task));
      if (task_creation)
          ltl_atom_set(mon, LTL_PAGEFAULT, false);
  }

Atomic propositions not initialized by `ltl_atom_init()` will stay in the
unknown state until relevant tracepoints are hit, which can take some time. As
monitoring for a task cannot be done until all atomic propositions is known for
the task, the monitor may need some time to start validating tasks which have
been running before the monitor is enabled. Therefore, it is recommended to
start the tasks of interest after enabling the monitor.

Final remarks
-------------

With the monitor synthesis in place using the rv/da_monitor.h and
dot2k, the developer's work should be limited to the instrumentation
With the monitor synthesis in place using the header files and
rvgen, the developer's work should be limited to the instrumentation
of the system, increasing the confidence in the overall approach.

[1] For details about deterministic automata format and the translation
@@ -142,6 +266,6 @@ from one representation to another, see::

  Documentation/trace/rv/deterministic_automata.rst

[2] dot2k appends the monitor's name suffix to the events enums to
[2] rvgen appends the monitor's name suffix to the events enums to
avoid conflicting variables when exporting the global vmlinux.h
use by BPF programs.
Loading