Commit 72885116 authored by Linus Torvalds
Pull landlock updates from Mickaël Salaün:
 "This brings two main changes to Landlock:

   - A signal scoping fix with a new interface for user space to know if
     it is compatible with the running kernel.

   - Audit support to give visibility on why access requests are denied,
     including the origin of the security policy, missing access rights,
     and description of object(s). This was designed to limit log spam
     as much as possible while still alerting about unexpected blocked
     access.

  With these changes come new and improved documentation, and a lot of
  new tests"

* tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits)
  landlock: Add audit documentation
  selftests/landlock: Add audit tests for network
  selftests/landlock: Add audit tests for filesystem
  selftests/landlock: Add audit tests for abstract UNIX socket scoping
  selftests/landlock: Add audit tests for ptrace
  selftests/landlock: Test audit with restrict flags
  selftests/landlock: Add tests for audit flags and domain IDs
  selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
  selftests/landlock: Add test for invalid ruleset file descriptor
  samples/landlock: Enable users to log sandbox denials
  landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
  landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
  landlock: Log scoped denials
  landlock: Log TCP bind and connect denials
  landlock: Log truncate and IOCTL denials
  landlock: Factor out IOCTL hooks
  landlock: Log file-related denials
  landlock: Log mount-related denials
  landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
  landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
  ...
parents 78fb88ec 8e2dd47b
+1 −0
@@ -48,3 +48,4 @@ subdirectories.
   Yama
   SafeSetID
   ipe
   landlock
+158 −0
.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2025 Microsoft Corporation

================================
Landlock: system-wide management
================================

:Author: Mickaël Salaün
:Date: March 2025

Landlock can leverage the audit framework to log events.

User space documentation can be found here:
Documentation/userspace-api/landlock.rst.

Audit
=====

Denied access requests are logged by default for a sandboxed program if `audit`
is enabled.  This default behavior can be changed with the
sys_landlock_restrict_self() flags (cf.
Documentation/userspace-api/landlock.rst).  Landlock logs can also be masked
thanks to audit rules.  Landlock can generate 2 audit record types.

Record types
------------

AUDIT_LANDLOCK_ACCESS
    This record type identifies a denied access request to a kernel resource.
    The ``domain`` field indicates the ID of the domain that blocked the
    request.  The ``blockers`` field indicates the cause(s) of this denial
    (separated by a comma), and the following fields identify the kernel object
    (similar to SELinux).  There may be more than one of this record type per
    audit event.

    Example with a file link request generating two records in the same event::

        domain=195ba459b blockers=fs.refer path="/usr/bin" dev="vda2" ino=351
        domain=195ba459b blockers=fs.make_reg,fs.refer path="/usr/local" dev="vda2" ino=365

AUDIT_LANDLOCK_DOMAIN
    This record type describes the status of a Landlock domain.  The ``status``
    field can be either ``allocated`` or ``deallocated``.

    The ``allocated`` status is part of the same audit event and follows
    the first logged ``AUDIT_LANDLOCK_ACCESS`` record of a domain.  It identifies
    Landlock domain information at the time of the sys_landlock_restrict_self()
    call with the following fields:

    - the ``domain`` ID
    - the enforcement ``mode``
    - the domain creator's ``pid``
    - the domain creator's ``uid``
    - the domain creator's executable path (``exe``)
    - the domain creator's command name (``comm``)

    Example::

        domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer"

    The ``deallocated`` status is an event on its own and it identifies a
    Landlock domain release.  After such an event, it is guaranteed that the
    related domain ID will never be reused during the lifetime of the system.
    The ``domain`` field indicates the ID of the domain which is released, and
    the ``denials`` field indicates the total number of denied access requests,
    which might not have been logged according to the audit rules and
    sys_landlock_restrict_self()'s flags.

    Example::

        domain=195ba459b status=deallocated denials=3
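
The fields of both record types are space-separated ``key=value`` pairs, so
they can be split with a few lines of user space code.  The following is a
minimal sketch, not an official tool: the helper name
``parse_landlock_fields`` is hypothetical, and it assumes values are either
bare or double-quoted as in the examples above (other encodings that audit may
use for unusual strings are not handled).

```python
import shlex


def parse_landlock_fields(record: str) -> dict:
    """Split a Landlock record body into a dict of its key=value fields.

    Hypothetical helper for illustration only.
    """
    fields = {}
    # shlex.split() honours the double quotes around values such as
    # path="/usr/bin", so the quotes are removed from the result.
    for token in shlex.split(record):
        key, _, value = token.partition("=")
        fields[key] = value
    # The blockers field may list several causes separated by commas.
    if "blockers" in fields:
        fields["blockers"] = fields["blockers"].split(",")
    return fields


# Record bodies copied from the examples above.
access = parse_landlock_fields(
    'domain=195ba459b blockers=fs.make_reg,fs.refer '
    'path="/usr/local" dev="vda2" ino=365'
)
released = parse_landlock_fields(
    "domain=195ba459b status=deallocated denials=3"
)

print(access["domain"], access["blockers"], access["path"])
print(released["status"], released["denials"])
```

All values are kept as strings; converting ``denials`` or ``ino`` to integers
is left to the caller.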


Event samples
--------------

Here are two examples of log events (see serial numbers).

In this example a sandboxed program (``kill``) tries to send a signal to the
init process, which is denied because of the signal scoping restriction
(``LL_SCOPED=s``)::

  $ LL_FS_RO=/ LL_FS_RW=/ LL_SCOPED=s LL_FORCE_LOG=1 ./sandboxer kill 1

This command generates two events, each identified with a unique serial
number following a timestamp (``msg=audit(1729738800.268:30)``).  The first
event (serial ``30``) contains 4 records.  The first record
(``type=LANDLOCK_ACCESS``) shows an access denied by the domain `1a6fdc66f`.
The cause of this denial is the signal scoping restriction
(``blockers=scope.signal``).  The process that would have received this signal
is the init process (``opid=1 ocomm="systemd"``).

The second record (``type=LANDLOCK_DOMAIN``) describes (``status=allocated``)
domain `1a6fdc66f`.  This domain was created by process ``286`` executing the
``/root/sandboxer`` program launched by the root user.

The third record (``type=SYSCALL``) describes the syscall, its provided
arguments, its result (``success=no exit=-1``), and the process that called it.

The fourth record (``type=PROCTITLE``) shows the command's name as a
hexadecimal value.  This can be translated with ``python -c
'print(bytes.fromhex("6B696C6C0031"))'``.

Finally, the last record (``type=LANDLOCK_DOMAIN``) is also the only one from
the second event (serial ``31``).  It is not tied to a direct user space action
but an asynchronous one to free resources tied to a Landlock domain
(``status=deallocated``).  This can be useful to know that the following logs
will not concern the domain ``1a6fdc66f`` anymore.  This record also summarizes
the number of requests this domain denied (``denials=1``), whether they were
logged or not.

.. code-block::

  type=LANDLOCK_ACCESS msg=audit(1729738800.268:30): domain=1a6fdc66f blockers=scope.signal opid=1 ocomm="systemd"
  type=LANDLOCK_DOMAIN msg=audit(1729738800.268:30): domain=1a6fdc66f status=allocated mode=enforcing pid=286 uid=0 exe="/root/sandboxer" comm="sandboxer"
  type=SYSCALL msg=audit(1729738800.268:30): arch=c000003e syscall=62 success=no exit=-1 [...] ppid=272 pid=286 auid=0 uid=0 gid=0 [...] comm="kill" [...]
  type=PROCTITLE msg=audit(1729738800.268:30): proctitle=6B696C6C0031
  type=LANDLOCK_DOMAIN msg=audit(1729738800.324:31): domain=1a6fdc66f status=deallocated denials=1
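
Grouping records by their serial number makes the event boundaries described
above explicit.  This is a minimal sketch under stated assumptions: the
function name ``group_by_event`` and the ``HEADER`` pattern are hypothetical,
and the pattern only covers the ``type=... msg=audit(timestamp:serial):``
prefix shown in these samples.

```python
import re
from collections import defaultdict

# Matches the record prefix used in the samples, for instance
# "type=SYSCALL msg=audit(1729738800.268:30): ".
HEADER = re.compile(r"type=(\S+) msg=audit\((\d+\.\d+):(\d+)\): ")


def group_by_event(lines):
    """Map each event serial number to the list of its record types."""
    events = defaultdict(list)
    for line in lines:
        match = HEADER.match(line.strip())
        if match:
            record_type, _timestamp, serial = match.groups()
            events[int(serial)].append(record_type)
    return dict(events)


# Records copied from the sample above (elisions kept as-is).
sample = [
    'type=LANDLOCK_ACCESS msg=audit(1729738800.268:30): domain=1a6fdc66f blockers=scope.signal opid=1 ocomm="systemd"',
    'type=LANDLOCK_DOMAIN msg=audit(1729738800.268:30): domain=1a6fdc66f status=allocated mode=enforcing pid=286 uid=0 exe="/root/sandboxer" comm="sandboxer"',
    'type=SYSCALL msg=audit(1729738800.268:30): arch=c000003e syscall=62 success=no exit=-1 [...] comm="kill" [...]',
    'type=PROCTITLE msg=audit(1729738800.268:30): proctitle=6B696C6C0031',
    'type=LANDLOCK_DOMAIN msg=audit(1729738800.324:31): domain=1a6fdc66f status=deallocated denials=1',
]

events = group_by_event(sample)
for serial, record_types in sorted(events.items()):
    print(serial, record_types)
```

With this input, serial ``30`` collects four record types and serial ``31``
only the ``LANDLOCK_DOMAIN`` deallocation record.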

Here is another example showcasing filesystem access control::

  $ LL_FS_RO=/ LL_FS_RW=/tmp LL_FORCE_LOG=1 ./sandboxer sh -c "echo > /etc/passwd"

The related audit logs contain 8 records from 3 different events (serials 33,
34 and 35) created by the same domain `1a6fdc679`::

  type=LANDLOCK_ACCESS msg=audit(1729738800.221:33): domain=1a6fdc679 blockers=fs.write_file path="/dev/tty" dev="devtmpfs" ino=9
  type=LANDLOCK_DOMAIN msg=audit(1729738800.221:33): domain=1a6fdc679 status=allocated mode=enforcing pid=289 uid=0 exe="/root/sandboxer" comm="sandboxer"
  type=SYSCALL msg=audit(1729738800.221:33): arch=c000003e syscall=257 success=no exit=-13 [...] ppid=272 pid=289 auid=0 uid=0 gid=0 [...] comm="sh" [...]
  type=PROCTITLE msg=audit(1729738800.221:33): proctitle=7368002D63006563686F203E202F6574632F706173737764
  type=LANDLOCK_ACCESS msg=audit(1729738800.221:34): domain=1a6fdc679 blockers=fs.write_file path="/etc/passwd" dev="vda2" ino=143821
  type=SYSCALL msg=audit(1729738800.221:34): arch=c000003e syscall=257 success=no exit=-13 [...] ppid=272 pid=289 auid=0 uid=0 gid=0 [...] comm="sh" [...]
  type=PROCTITLE msg=audit(1729738800.221:34): proctitle=7368002D63006563686F203E202F6574632F706173737764
  type=LANDLOCK_DOMAIN msg=audit(1729738800.261:35): domain=1a6fdc679 status=deallocated denials=2
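
The ``proctitle`` values in both samples are hexadecimal dumps of the command
line, with arguments separated by NUL bytes.  As a small sketch (the helper
name ``decode_proctitle`` is hypothetical), they can be turned back into an
argument list like this:

```python
def decode_proctitle(hex_value: str) -> list:
    """Decode a hex-encoded proctitle value into its list of arguments."""
    raw = bytes.fromhex(hex_value)
    # Arguments are separated by NUL bytes in the encoded command line.
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\x00")]


# Values copied from the two samples above.
kill_args = decode_proctitle("6B696C6C0031")
sh_args = decode_proctitle("7368002D63006563686F203E202F6574632F706173737764")

print(kill_args)  # ['kill', '1']
print(sh_args)    # ['sh', '-c', 'echo > /etc/passwd']
```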


Event filtering
---------------

If you get spammed with audit logs related to Landlock, this is either an
attack attempt or a bug in the security policy.  We can put in place some
filters to limit noise in two complementary ways:

- with sys_landlock_restrict_self()'s flags if we can fix the sandboxed
  programs,
- or with audit rules (see :manpage:`auditctl(8)`).

Additional documentation
========================

* `Linux Audit Documentation`_
* Documentation/userspace-api/landlock.rst
* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _Linux Audit Documentation:
   https://github.com/linux-audit/audit-documentation/wiki
+12 −1
@@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
==================================

:Author: Mickaël Salaün
:Date: December 2022
:Date: March 2025

Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
harden a whole system, this feature should be available to any process,
@@ -45,6 +45,10 @@ Guiding principles for safe access controls
  sandboxed process shall retain their scoped accesses (at the time of resource
  acquisition) whatever process uses them.
  Cf. `File descriptor access rights`_.
* Access denials shall be logged according to system and Landlock domain
  configurations.  Log entries must contain information about the cause of the
  denial and the owner of the related security policy.  Such log generation
  should have a negligible performance and memory impact on allowed requests.

Design choices
==============
@@ -124,6 +128,13 @@ makes the reasoning much easier and helps avoid pitfalls.
.. kernel-doc:: security/landlock/ruleset.h
    :identifiers:

Additional documentation
========================

* Documentation/userspace-api/landlock.rst
* Documentation/admin-guide/LSM/landlock.rst
* https://landlock.io

.. Links
.. _tools/testing/selftests/landlock/:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/landlock/
+44 −28
@@ -8,7 +8,7 @@ Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: January 2025
:Date: March 2025

The goal of Landlock is to enable restriction of ambient rights (e.g. global
filesystem or network access) for a set of processes.  Because Landlock
@@ -317,33 +317,32 @@ IPC scoping
-----------

Similar to the implicit `Ptrace restrictions`_, we may want to further restrict
interactions between sandboxes. Each Landlock domain can be explicitly scoped
for a set of actions by specifying it on a ruleset.  For example, if a
sandboxed process should not be able to :manpage:`connect(2)` to a
non-sandboxed process through abstract :manpage:`unix(7)` sockets, we can
specify such a restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``.
Moreover, if a sandboxed process should not be able to send a signal to a
non-sandboxed process, we can specify this restriction with
``LANDLOCK_SCOPE_SIGNAL``.

A sandboxed process can connect to a non-sandboxed process when its domain is
not scoped. If a process's domain is scoped, it can only connect to sockets
created by processes in the same scope.
Moreover, if a process is scoped to send signal to a non-scoped process, it can
only send signals to processes in the same scope.

A connected datagram socket behaves like a stream socket when its domain is
scoped, meaning if the domain is scoped after the socket is connected, it can
still :manpage:`send(2)` data just like a stream socket.  However, in the same
scenario, a non-connected datagram socket cannot send data (with
:manpage:`sendto(2)`) outside its scope.

A process with a scoped domain can inherit a socket created by a non-scoped
process. The process cannot connect to this socket since it has a scoped
domain.

IPC scoping does not support exceptions, so if a domain is scoped, no rules can
be added to allow access to resources or processes outside of the scope.
interactions between sandboxes.  Therefore, at ruleset creation time, each
Landlock domain can restrict the scope for certain operations, so that these
operations can only reach out to processes within the same Landlock domain or in
a nested Landlock domain (the "scope").

The operations which can be scoped are:

``LANDLOCK_SCOPE_SIGNAL``
    This limits the sending of signals to target processes which run within the
    same or a nested Landlock domain.

``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``
    This limits the abstract :manpage:`unix(7)` sockets to which we can
    :manpage:`connect(2)` to those whose socket addresses were created by a
    process in the same or a nested Landlock domain.

    A :manpage:`sendto(2)` on a non-connected datagram socket is treated as if
    it were doing an implicit :manpage:`connect(2)` and will be blocked if the
    remote end does not stem from the same or a nested Landlock domain.

    A :manpage:`sendto(2)` on a socket which was previously connected will not
    be restricted.  This works for both datagram and stream sockets.

IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`.
If an operation is scoped within a domain, no rules can be added to allow access
to resources or processes outside of the scope.

Truncating files
----------------
@@ -595,6 +594,16 @@ Starting with the Landlock ABI version 6, it is possible to restrict
:manpage:`signal(7)` sending by setting ``LANDLOCK_SCOPE_SIGNAL`` to the
``scoped`` ruleset attribute.

Logging (ABI < 7)
-----------------

Starting with the Landlock ABI version 7, it is possible to control logging of
Landlock audit events with the ``LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF``,
``LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON``, and
``LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF`` flags passed to
sys_landlock_restrict_self().  See Documentation/admin-guide/LSM/landlock.rst
for more details on audit.

.. _kernel_support:

Kernel support
@@ -683,9 +692,16 @@ fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

How to disable Landlock audit records?
--------------------------------------

You might want to put in place filters as explained here:
Documentation/admin-guide/LSM/landlock.rst

Additional documentation
========================

* Documentation/admin-guide/LSM/landlock.rst
* Documentation/security/landlock.rst
* https://landlock.io

+1 −0
@@ -13157,6 +13157,7 @@ L: linux-security-module@vger.kernel.org
S:	Supported
W:	https://landlock.io
T:	git https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git
F:	Documentation/admin-guide/LSM/landlock.rst
F:	Documentation/security/landlock.rst
F:	Documentation/userspace-api/landlock.rst
F:	fs/ioctl.c