Commit e40030a4 authored by Alexei Starovoitov

Merge branch 'bpf-make-kf_trusted_args-default'

Puranjay Mohan says:

====================
bpf: Make KF_TRUSTED_ARGS default

v2: https://lore.kernel.org/all/20251231171118.1174007-1-puranjay@kernel.org/
Changes in v2->v3:
- Fix documentation: add a new section for kfunc parameters (Eduard)
- Remove all occurrences of KF_TRUSTED from comments, etc. (Eduard)
- Fix the netfilter kfuncs to drop dead NULL checks.
- Fix selftest for netfilter kfuncs to check for verification failures
  and remove the runtime failures that are no longer possible after
  this change.

v1: https://lore.kernel.org/all/20251224192448.3176531-1-puranjay@kernel.org/
Changes in v1->v2:
- Update kfunc_dynptr_param selftest to use a real pointer that is not
  ptr_to_stack and not CONST_PTR_TO_DYNPTR rather than casting 1
  (Alexei)
- Thoroughly review all kfuncs to find regressions or missing
  annotations. (Eduard)
- Fix kfuncs found from the above step.

This series makes trusted arguments the default requirement for all BPF
kfuncs, inverting the current opt-in model. Instead of requiring
explicit KF_TRUSTED_ARGS flags, kfuncs now require trusted arguments by
default and must explicitly opt-out using __nullable/__opt annotations
or the KF_RCU flag.

This improves security and type safety by preventing BPF programs from
passing untrusted or NULL pointers to kernel functions at verification
time, while maintaining flexibility for the small number of kfuncs that
legitimately need to accept NULL or RCU pointers.

MOTIVATION

The current opt-in model is error-prone and inconsistent. Most kfuncs already
require trusted pointers from sources like KF_ACQUIRE, struct_ops callbacks, or
tracepoints. Making trusted arguments the default:

- Prevents NULL pointer dereferences at verification time
- Reduces defensive NULL checks in kernel code
- Provides better error messages for invalid BPF programs
- Aligns with existing patterns (context pointers, struct_ops already trusted)

IMPACT ANALYSIS

Comprehensive analysis of all 304+ kfuncs across 37 kernel files found:
- Most kfuncs (299/304) are already safe and require no changes
- Only 4 kfuncs required fixes (all included in this series)
- 0 regressions found in independent verification

All bpf selftests are passing. The hid_bpf tests are also passing:
# PASSED: 20 / 20 tests passed.
# Totals: pass:20 fail:0 xfail:0 xpass:0 skip:0 error:0

bpf programs in drivers/hid/bpf/progs/ show no regression as shown by
veristat:

Done. Processed 24 files, 62 programs. Skipped 0 files, 0 programs.

TECHNICAL DETAILS

The verifier now validates kfunc arguments in this order:
1. NULL check (runs first): Rejects NULL unless parameter has __nullable/__opt
2. Trusted check: Rejects untrusted pointers unless kfunc has KF_RCU

Special cases that bypass trusted checking:
- Context pointers (xdp_md, __sk_buff): Handled via KF_ARG_PTR_TO_CTX
- Struct_ops callbacks: Pre-marked as PTR_TRUSTED during initialization
- KF_RCU kfuncs: Have separate validation path for RCU pointers
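
The check order above can be sketched as a small userspace model. This is
only an illustration, not verifier code: every type and function name below
is hypothetical, and the special cases (context pointers, struct_ops
arguments) are left out. It assumes, per the KF_RCU documentation, that a
KF_RCU kfunc accepts trusted or RCU pointers but still rejects NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel verifier's types. */
enum check_result { ARG_OK, REJECT_NULL, REJECT_UNTRUSTED };

enum ptr_kind { PTR_IS_NULL, PTR_IS_TRUSTED, PTR_IS_RCU, PTR_IS_UNTRUSTED };

struct arg_model {
	enum ptr_kind kind;	/* what the BPF program passes */
	bool nullable;		/* parameter carries __nullable or __opt */
};

static enum check_result check_kfunc_arg(const struct arg_model *arg,
					 bool kfunc_is_rcu)
{
	assert(arg != NULL);

	/* Step 1: the NULL check runs first. */
	if (arg->kind == PTR_IS_NULL)
		return arg->nullable ? ARG_OK : REJECT_NULL;

	/* Step 2: the trusted check; KF_RCU additionally admits RCU pointers. */
	if (arg->kind == PTR_IS_TRUSTED)
		return ARG_OK;
	if (arg->kind == PTR_IS_RCU && kfunc_is_rcu)
		return ARG_OK;
	return REJECT_UNTRUSTED;
}
```

In this model a NULL argument is rejected before the trusted check is ever
reached, which is why a NULL passed to a non-nullable parameter reports a
NULL failure rather than an untrusted-pointer failure.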

BACKWARD COMPATIBILITY

This affects BPF program verification, not runtime:
- Valid programs passing trusted pointers: Continue to work
- Programs with bugs: May now fail verification (preventing runtime crashes)

This series introduces two intentional breaking changes to the BPF
verifier's kfunc handling:

1. NULL pointer rejection timing: Kfuncs that previously accepted NULL
pointers without KF_TRUSTED_ARGS will now reject NULL at verification
time instead of returning runtime errors. This affects netfilter
connection tracking functions (bpf_xdp_ct_lookup, bpf_skb_ct_lookup,
bpf_xdp_ct_alloc, bpf_skb_ct_alloc), which now enforce their documented
"Cannot be NULL" requirements at load time rather than returning -EINVAL
at runtime.

2. Fentry/fexit program restrictions: BPF programs using fentry/fexit
attachment points can no longer pass their callback arguments directly
to kfuncs, as these arguments are not marked as trusted by default.
Programs requiring trusted argument semantics should migrate to tp_btf
(tracepoint with BTF) attachment points where arguments are guaranteed
trusted by the verifier.

Both changes strengthen the verifier's safety guarantees by catching
errors earlier in the development cycle and are accompanied by
comprehensive test updates demonstrating the new expected behaviors.
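
The effect of the first change on kernel-side code can be sketched as
follows. This is a userspace illustration under stated assumptions:
demo_ctx, demo_kfunc_old and demo_kfunc_new are hypothetical stand-ins, not
functions from this series, and the assert() merely stands in for the
guarantee the verifier now provides at load time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical context type, used only for this illustration. */
struct demo_ctx {
	size_t allocated_size;
};

/* Before: NULL could reach the kfunc at runtime, so it checked defensively. */
static int demo_kfunc_old(struct demo_ctx *ctx, size_t len)
{
	if (!ctx)
		return -EINVAL;
	return len > ctx->allocated_size ? -EINVAL : 0;
}

/* After: a NULL argument is rejected at load time, so the check is dead code. */
static int demo_kfunc_new(struct demo_ctx *ctx, size_t len)
{
	assert(ctx != NULL);	/* models the load-time guarantee */
	return len > ctx->allocated_size ? -EINVAL : 0;
}
```

For every non-NULL argument the two variants behave identically. Only the
NULL case moves from a runtime -EINVAL to a verification failure.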
====================

Link: https://patch.msgid.link/20260102180038.2708325-1-puranjay@kernel.org


Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents c286e7e9 cf503eb2
+91 −93
@@ -50,7 +50,70 @@ A wrapper kfunc is often needed when we need to annotate parameters of the
kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.

2.2 Annotating kfunc parameters
2.2 kfunc Parameters
--------------------

All kfuncs now require trusted arguments by default. This means that all
pointer arguments must be valid, and all pointers to BTF objects must be
passed in their unmodified form (at a zero offset, and without having been
obtained from walking another pointer, with exceptions described below).

There are two types of pointers to kernel objects which are considered "trusted":

1. Pointers which are passed as tracepoint or struct_ops callback arguments.
2. Pointers which were returned from a KF_ACQUIRE kfunc.

Pointers to non-BTF objects (e.g. scalar pointers) may also be passed to
kfuncs, and may have a non-zero offset.

The definition of "valid" pointers is subject to change at any time, and has
absolutely no ABI stability guarantees.

As mentioned above, a nested pointer obtained from walking a trusted pointer is
no longer trusted, with one exception. If a struct type has a field that is
guaranteed to be valid (trusted or rcu, as in KF_RCU description below) as long
as its parent pointer is valid, the following macros can be used to express
that to the verifier:

* ``BTF_TYPE_SAFE_TRUSTED``
* ``BTF_TYPE_SAFE_RCU``
* ``BTF_TYPE_SAFE_RCU_OR_NULL``

For example,

.. code-block:: c

	BTF_TYPE_SAFE_TRUSTED(struct socket) {
		struct sock *sk;
	};

or

.. code-block:: c

	BTF_TYPE_SAFE_RCU(struct task_struct) {
		const cpumask_t *cpus_ptr;
		struct css_set __rcu *cgroups;
		struct task_struct __rcu *real_parent;
		struct task_struct *group_leader;
	};

In other words, you must:

1. Wrap the valid pointer type in a ``BTF_TYPE_SAFE_*`` macro.

2. Specify the type and name of the valid nested field. This field must match
   the field in the original type definition exactly.

A new type declared by a ``BTF_TYPE_SAFE_*`` macro also needs to be emitted so
that it appears in BTF. For example, ``BTF_TYPE_SAFE_TRUSTED(struct socket)``
is emitted in the ``type_is_trusted()`` function as follows:

.. code-block:: c

	BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct socket));

2.3 Annotating kfunc parameters
-------------------------------

Similar to BPF helpers, there is sometime need for additional context required
@@ -58,7 +121,7 @@ by the verifier to make the usage of kernel functions safer and more useful.
Hence, we can annotate a parameter by suffixing the name of the argument of the
kfunc with a __tag, where tag may be one of the supported annotations.

2.2.1 __sz Annotation
2.3.1 __sz Annotation
---------------------

This annotation is used to indicate a memory and size pair in the argument list.
@@ -74,7 +137,7 @@ argument as its size. By default, without __sz annotation, the size of the type
of the pointer is used. Without __sz annotation, a kfunc cannot accept a void
pointer.

2.2.2 __k Annotation
2.3.2 __k Annotation
--------------------

This annotation is only understood for scalar arguments, where it indicates that
@@ -98,7 +161,7 @@ Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
size parameter, and the value of the constant matters for program safety, __k
suffix should be used.

2.2.3 __uninit Annotation
2.3.3 __uninit Annotation
-------------------------

This annotation is used to indicate that the argument will be treated as
@@ -115,7 +178,7 @@ Here, the dynptr will be treated as an uninitialized dynptr. Without this
annotation, the verifier will reject the program if the dynptr passed in is
not initialized.

2.2.4 __opt Annotation
2.3.4 __opt Annotation
-------------------------

This annotation is used to indicate that the buffer associated with an __sz or __szk
@@ -135,7 +198,7 @@ Either way, the returned buffer is either NULL, or of size buffer_szk. Without t
annotation, the verifier will reject the program if a null pointer is passed in with
a nonzero size.

2.2.5 __str Annotation
2.3.5 __str Annotation
----------------------------
This annotation is used to indicate that the argument is a constant string.

@@ -160,7 +223,7 @@ Or::
                ...
        }

2.2.6 __prog Annotation
2.3.6 __prog Annotation
---------------------------
This annotation is used to indicate that the argument needs to be fixed up to
the bpf_prog_aux of the caller BPF program. Any value passed into this argument
@@ -179,7 +242,7 @@ An example is given below::

.. _BPF_kfunc_nodef:

2.3 Using an existing kernel function
2.4 Using an existing kernel function
-------------------------------------

When an existing function in the kernel is fit for consumption by BPF programs,
@@ -187,7 +250,7 @@ it can be directly registered with the BPF subsystem. However, care must still
be taken to review the context in which it will be invoked by the BPF program
and whether it is safe to do so.

2.4 Annotating kfuncs
2.5 Annotating kfuncs
---------------------

In addition to kfuncs' arguments, verifier may need more information about the
@@ -216,7 +279,7 @@ protected. An example is given below::
        ...
        }

2.4.1 KF_ACQUIRE flag
2.5.1 KF_ACQUIRE flag
---------------------

The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
@@ -226,7 +289,7 @@ referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier fails the
loading of the BPF program until no lingering references remain in all possible
explored states of the program.

2.4.2 KF_RET_NULL flag
2.5.2 KF_RET_NULL flag
----------------------

The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
@@ -235,87 +298,21 @@ returned from the kfunc before making use of it (dereferencing or passing to
another helper). This flag is often used in pairing with KF_ACQUIRE flag, but
both are orthogonal to each other.

2.4.3 KF_RELEASE flag
2.5.3 KF_RELEASE flag
---------------------

The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
passed in to it. There can be only one referenced pointer that can be passed
in. All copies of the pointer being released are invalidated as a result of
invoking kfunc with this flag. KF_RELEASE kfuncs automatically receive the
protection afforded by the KF_TRUSTED_ARGS flag described below.

2.4.4 KF_TRUSTED_ARGS flag
--------------------------

The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
indicates that the all pointer arguments are valid, and that all pointers to
BTF objects have been passed in their unmodified form (that is, at a zero
offset, and without having been obtained from walking another pointer, with one
exception described below).

There are two types of pointers to kernel objects which are considered "valid":

1. Pointers which are passed as tracepoint or struct_ops callback arguments.
2. Pointers which were returned from a KF_ACQUIRE kfunc.

Pointers to non-BTF objects (e.g. scalar pointers) may also be passed to
KF_TRUSTED_ARGS kfuncs, and may have a non-zero offset.

The definition of "valid" pointers is subject to change at any time, and has
absolutely no ABI stability guarantees.

As mentioned above, a nested pointer obtained from walking a trusted pointer is
no longer trusted, with one exception. If a struct type has a field that is
guaranteed to be valid (trusted or rcu, as in KF_RCU description below) as long
as its parent pointer is valid, the following macros can be used to express
that to the verifier:

* ``BTF_TYPE_SAFE_TRUSTED``
* ``BTF_TYPE_SAFE_RCU``
* ``BTF_TYPE_SAFE_RCU_OR_NULL``

For example,

.. code-block:: c

	BTF_TYPE_SAFE_TRUSTED(struct socket) {
		struct sock *sk;
	};

or

.. code-block:: c

	BTF_TYPE_SAFE_RCU(struct task_struct) {
		const cpumask_t *cpus_ptr;
		struct css_set __rcu *cgroups;
		struct task_struct __rcu *real_parent;
		struct task_struct *group_leader;
	};

In other words, you must:

1. Wrap the valid pointer type in a ``BTF_TYPE_SAFE_*`` macro.

2. Specify the type and name of the valid nested field. This field must match
   the field in the original type definition exactly.

A new type declared by a ``BTF_TYPE_SAFE_*`` macro also needs to be emitted so
that it appears in BTF. For example, ``BTF_TYPE_SAFE_TRUSTED(struct socket)``
is emitted in the ``type_is_trusted()`` function as follows:

.. code-block:: c

	BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct socket));

invoking kfunc with this flag.

2.4.5 KF_SLEEPABLE flag
2.5.4 KF_SLEEPABLE flag
-----------------------

The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only
be called by sleepable BPF programs (BPF_F_SLEEPABLE).

2.4.6 KF_DESTRUCTIVE flag
2.5.5 KF_DESTRUCTIVE flag
--------------------------

The KF_DESTRUCTIVE flag is used to indicate functions calling which is
@@ -324,18 +321,19 @@ rebooting or panicking. Due to this additional restrictions apply to these
calls. At the moment they only require CAP_SYS_BOOT capability, but more can be
added later.

2.4.7 KF_RCU flag
2.5.6 KF_RCU flag
-----------------

The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free. The pointers are not
NULL, but the object's refcount could have reached zero. The kfuncs need to
consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE
pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely
also be KF_RET_NULL.
The KF_RCU flag allows kfuncs to opt out of the default trusted args
requirement and accept RCU pointers with weaker guarantees. The kfuncs marked
with KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier
guarantees that the objects are valid and there is no use-after-free. The
pointers are not NULL, but the object's refcount could have reached zero. The
kfuncs need to consider doing refcnt != 0 check, especially when returning a
KF_ACQUIRE pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should
very likely also be KF_RET_NULL.

2.4.8 KF_RCU_PROTECTED flag
2.5.7 KF_RCU_PROTECTED flag
---------------------------

The KF_RCU_PROTECTED flag is used to indicate that the kfunc must be invoked in
@@ -354,7 +352,7 @@ RCU protection but do not take RCU protected arguments.

.. _KF_deprecated_flag:

2.4.9 KF_DEPRECATED flag
2.5.8 KF_DEPRECATED flag
------------------------

The KF_DEPRECATED flag is used for kfuncs which are scheduled to be
@@ -374,7 +372,7 @@ encouraged to make their use-cases known as early as possible, and participate
in upstream discussions regarding whether to keep, change, deprecate, or remove
those kfuncs if and when such discussions occur.

2.5 Registering the kfuncs
2.6 Registering the kfuncs
--------------------------

Once the kfunc is prepared for use, the final step to making it visible is
@@ -397,7 +395,7 @@ type. An example is shown below::
        }
        late_initcall(init_subsystem);

2.6  Specifying no-cast aliases with ___init
2.7  Specifying no-cast aliases with ___init
--------------------------------------------

The verifier will always enforce that the BTF type of a pointer passed to a
+1 −4
@@ -295,9 +295,6 @@ hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr
{
	struct hid_bpf_ctx_kern *ctx_kern;

	if (!ctx)
		return NULL;

	ctx_kern = container_of(ctx, struct hid_bpf_ctx_kern, ctx);

	if (rdwr_buf_size + offset > ctx->allocated_size)
@@ -364,7 +361,7 @@ __hid_bpf_hw_check_params(struct hid_bpf_ctx *ctx, __u8 *buf, size_t *buf__sz,
	u32 report_len;

	/* check arguments */
	if (!ctx || !hid_ops || !buf)
	if (!hid_ops)
		return -EINVAL;

	switch (rtype) {
+9 −14
@@ -68,10 +68,7 @@ __bpf_kfunc void bpf_put_file(struct file *file)
 *
 * Resolve the pathname for the supplied *path* and store it in *buf*. This BPF
 * kfunc is the safer variant of the legacy bpf_d_path() helper and should be
 * used in place of bpf_d_path() whenever possible. It enforces KF_TRUSTED_ARGS
 * semantics, meaning that the supplied *path* must itself hold a valid
 * reference, or else the BPF program will be outright rejected by the BPF
 * verifier.
 * used in place of bpf_d_path() whenever possible.
 *
 * This BPF kfunc may only be called from BPF LSM programs.
 *
@@ -359,14 +356,13 @@ __bpf_kfunc int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name__s
__bpf_kfunc_end_defs();

BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
BTF_ID_FLAGS(func, bpf_get_task_exe_file,
	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_get_task_exe_file, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE)
BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_path_d_path)
BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)

static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
@@ -377,9 +373,8 @@ static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
	return -EACCES;
}

/* bpf_[set|remove]_dentry_xattr.* hooks have KF_TRUSTED_ARGS and
 * KF_SLEEPABLE, so they are only available to sleepable hooks with
 * dentry arguments.
/* bpf_[set|remove]_dentry_xattr.* hooks have KF_SLEEPABLE, so they are only
 * available to sleepable hooks with dentry arguments.
 *
 * Setting and removing xattr requires exclusive lock on dentry->d_inode.
 * Some hooks already locked d_inode, while some hooks have not locked
+1 −1
@@ -162,7 +162,7 @@ __bpf_kfunc int bpf_get_fsverity_digest(struct file *file, struct bpf_dynptr *di
__bpf_kfunc_end_defs();

BTF_KFUNCS_START(fsverity_set_ids)
BTF_ID_FLAGS(func, bpf_get_fsverity_digest, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_get_fsverity_digest)
BTF_KFUNCS_END(fsverity_set_ids)

static int bpf_get_fsverity_digest_filter(const struct bpf_prog *prog, u32 kfunc_id)
+1 −1
@@ -753,7 +753,7 @@ enum bpf_type_flag {
	MEM_ALLOC		= BIT(11 + BPF_BASE_TYPE_BITS),

	/* PTR was passed from the kernel in a trusted context, and may be
	 * passed to KF_TRUSTED_ARGS kfuncs or BPF helper functions.
	 * passed to kfuncs or BPF helper functions.
	 * Confusingly, this is _not_ the opposite of PTR_UNTRUSTED above.
	 * PTR_UNTRUSTED refers to a kptr that was read directly from a map
	 * without invoking bpf_kptr_xchg(). What we really need to know is