Commit 5bde5750 authored by Alexei Starovoitov

Merge branch 'introduce-load-acquire-and-store-release-bpf-instructions'

Peilin Ye says:

====================
Introduce load-acquire and store-release BPF instructions

This patchset adds kernel support for BPF load-acquire and store-release
instructions (for background, please see [1]), including core/verifier
and arm64/x86-64 JIT compiler changes, as well as selftests.  riscv64
support is also planned.  The corresponding LLVM changes can be found
at:

  https://github.com/llvm/llvm-project/pull/108636

The first 3 patches from v4 have already been applied:

  - [bpf-next,v4,01/10] bpf/verifier: Factor out atomic_ptr_type_ok()
    https://git.kernel.org/bpf/bpf-next/c/b2d9ef71d4c9
  - [bpf-next,v4,02/10] bpf/verifier: Factor out check_atomic_rmw()
    https://git.kernel.org/bpf/bpf-next/c/d430c46c7580
  - [bpf-next,v4,03/10] bpf/verifier: Factor out check_load_mem() and check_store_reg()
    https://git.kernel.org/bpf/bpf-next/c/d38ad248fb7a

Please refer to the LLVM PR and individual kernel patches for details.
Thanks!
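
For readers new to the terminology, the ordering these instructions provide
corresponds to acquire/release in the C11 memory model.  Below is a minimal
sketch in plain userspace C11, not BPF.  The names publish(), try_consume(),
payload and ready are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;       /* plain data, written before the flag */
static atomic_int ready;  /* flag, published with release semantics */

/* Writer: the release store orders the payload write before the flag write. */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: the acquire load orders the flag read before the payload read. */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

A reader that observes ready == 1 through the acquire load is guaranteed to
see the payload written before the matching release store.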

v5: https://lore.kernel.org/all/cover.1741046028.git.yepeilin@google.com/
v5..v6 change:

  o (Alexei) avoid using #ifndef in verifier.c

v4: https://lore.kernel.org/bpf/cover.1740978603.git.yepeilin@google.com/
v4..v5 notable changes:

  o (kernel test robot) for 32-bit arches: make the verifier reject
                        64-bit load-acquires/store-releases, and fix
                        build error in interpreter changes
    * tested ARCH=arc build following instructions from kernel test
      robot
  o (Alexei) drop Documentation/ patch (v4 10/10) for now

v3: https://lore.kernel.org/bpf/cover.1740009184.git.yepeilin@google.com/
v3..v4 notable changes:

  o (Alexei) add x86-64 JIT support (including arena)
  o add Acked-by: tags from Xu

v2: https://lore.kernel.org/bpf/cover.1738888641.git.yepeilin@google.com/
v2..v3 notable changes:

  o (Alexei) change encoding to BPF_LOAD_ACQ=0x100, BPF_STORE_REL=0x110
  o add Acked-by: tags from Ilya and Eduard
  o make new selftests depend on:
    * __clang_major__ >= 18, and
    * ENABLE_ATOMICS_TESTS is defined (currently this means -mcpu=v3 or
      v4), and
    * JIT supports load_acq/store_rel (currently only arm64)
  o work around llvm-17 CI job failure by conditionally defining
    __arena_global variables as 64-bit if __clang_major__ < 18, to make
    sure .addr_space.1 has no holes
  o add Google copyright notice in new files
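
The encoding chosen in v3 can be illustrated with a small classifier.  This is
a sketch only: struct demo_insn and demo_is_load_store() are hypothetical
stand-ins, and the only values taken from the changelog are 0x100 and 0x110.
It assumes the operation is carried in the instruction's imm field, which is
what the JIT hunks in this commit switch on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the v2..v3 changelog. */
#define BPF_LOAD_ACQ	0x100
#define BPF_STORE_REL	0x110

/* Hypothetical stand-in for the imm field of a BPF instruction. */
struct demo_insn {
	int32_t imm;
};

/* True when the atomic instruction is a load-acquire or a store-release. */
static bool demo_is_load_store(const struct demo_insn *insn)
{
	switch (insn->imm) {
	case BPF_LOAD_ACQ:
	case BPF_STORE_REL:
		return true;
	default:
		return false;
	}
}
```

Any other imm value is treated here as a read-modify-write operation, which
is the distinction the JIT uses to pick an emitter.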

v1: https://lore.kernel.org/all/cover.1737763916.git.yepeilin@google.com/
v1..v2 notable changes:

  o (Eduard) for x86 and s390, make
             bpf_jit_supports_insn(..., /*in_arena=*/true) return false
             for load_acq/store_rel
  o add Eduard's Acked-by: tag
  o (Eduard) extract LDX and non-ATOMIC STX handling into helpers, see
             PATCH v2 3/9
  o allow unpriv programs to store-release pointers to stack
  o (Alexei) make it clearer in the interpreter code (PATCH v2 4/9) that
             only W and DW are supported for atomic RMW
  o test misaligned load_acq/store_rel
  o (Eduard) other selftests/ changes:
    * test load_acq/store_rel with !atomic_ptr_type_ok() pointers:
      - PTR_TO_CTX, for is_ctx_reg()
      - PTR_TO_PACKET, for is_pkt_reg()
      - PTR_TO_FLOW_KEYS, for is_flow_key_reg()
      - PTR_TO_SOCKET, for is_sk_reg()
    * drop atomics/ tests
    * delete unnecessary 'pid' checks from arena_atomics/ tests
    * avoid depending on __BPF_FEATURE_LOAD_ACQ_STORE_REL, use
      __imm_insn() and inline asm macros instead

RFC v1: https://lore.kernel.org/all/cover.1734742802.git.yepeilin@google.com
RFC v1..v1 notable changes:

  o 1-2/8: minor verifier.c refactoring patches
  o   3/8: core/verifier changes
         * (Eduard) handle load-acquire properly in backtrack_insn()
         * (Eduard) avoid skipping checks (e.g.,
                    bpf_jit_supports_insn()) for load-acquires
         * track the value stored by store-releases, just like how
           non-atomic STX instructions are handled
         * (Eduard) add missing link in commit message
         * (Eduard) always print 'r' for disasm.c changes
  o   4/8: arm64/insn: avoid treating load_acq/store_rel as
           load_ex/store_ex
  o   5/8: arm64/insn: add load_acq/store_rel
         * (Xu) include Should-Be-One (SBO) bits in "mask" and "value",
                to avoid setting fixed bits during runtime (JIT-compile
                time)
  o   6/8: arm64 JIT compiler changes
         * (Xu) use emit_a64_add_i() for "pointer + offset" to optimize
                code emission
  o   7/8: selftests
         * (Eduard) avoid adding new tests to the 'test_verifier' runner
         * add more tests, e.g., checking mark_precise logic
  o   8/8: instruction-set.rst changes

[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/

Thanks,
====================

Link: https://patch.msgid.link/cover.1741049567.git.yepeilin@google.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents 3a6fa573 ff3afe5d
+10 −2
@@ -188,8 +188,10 @@ enum aarch64_insn_ldst_type {
 	AARCH64_INSN_LDST_STORE_PAIR_PRE_INDEX,
 	AARCH64_INSN_LDST_LOAD_PAIR_POST_INDEX,
 	AARCH64_INSN_LDST_STORE_PAIR_POST_INDEX,
+	AARCH64_INSN_LDST_LOAD_ACQ,
 	AARCH64_INSN_LDST_LOAD_EX,
 	AARCH64_INSN_LDST_LOAD_ACQ_EX,
+	AARCH64_INSN_LDST_STORE_REL,
 	AARCH64_INSN_LDST_STORE_EX,
 	AARCH64_INSN_LDST_STORE_REL_EX,
 	AARCH64_INSN_LDST_SIGNED_LOAD_IMM_OFFSET,
@@ -351,8 +353,10 @@ __AARCH64_INSN_FUNCS(ldr_imm, 0x3FC00000, 0x39400000)
 __AARCH64_INSN_FUNCS(ldr_lit,	0xBF000000, 0x18000000)
 __AARCH64_INSN_FUNCS(ldrsw_lit,	0xFF000000, 0x98000000)
 __AARCH64_INSN_FUNCS(exclusive,	0x3F800000, 0x08000000)
-__AARCH64_INSN_FUNCS(load_ex,	0x3F400000, 0x08400000)
-__AARCH64_INSN_FUNCS(store_ex,	0x3F400000, 0x08000000)
+__AARCH64_INSN_FUNCS(load_acq,  0x3FDFFC00, 0x08DFFC00)
+__AARCH64_INSN_FUNCS(store_rel, 0x3FDFFC00, 0x089FFC00)
+__AARCH64_INSN_FUNCS(load_ex,	0x3FC00000, 0x08400000)
+__AARCH64_INSN_FUNCS(store_ex,	0x3FC00000, 0x08000000)
 __AARCH64_INSN_FUNCS(mops,	0x3B200C00, 0x19000400)
 __AARCH64_INSN_FUNCS(stp,	0x7FC00000, 0x29000000)
 __AARCH64_INSN_FUNCS(ldp,	0x7FC00000, 0x29400000)
@@ -602,6 +606,10 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
 				     int offset,
 				     enum aarch64_insn_variant variant,
 				     enum aarch64_insn_ldst_type type);
+u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
+					enum aarch64_insn_register base,
+					enum aarch64_insn_size_type size,
+					enum aarch64_insn_ldst_type type);
 u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 				   enum aarch64_insn_register base,
 				   enum aarch64_insn_register state,
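
A worked example may help explain why the load_ex/store_ex masks change in
this hunk.  Each (mask, value) pair is used as a test of the form
(insn & mask) == value.  The sketch below copies the pairs from the hunk and
assumes that 0x3FC00000 replaces 0x3F400000.  demo_match() is a hypothetical
helper, and the encoding used for "ldar x0, [x1]" is an assumption taken from
the A64 instruction set, not from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* (mask, value) pairs copied from the hunk above. */
#define LOAD_ACQ_MASK		0x3FDFFC00u
#define LOAD_ACQ_VALUE		0x08DFFC00u
#define LOAD_EX_NARROW_MASK	0x3F400000u	/* assumed to be the old mask */
#define LOAD_EX_WIDE_MASK	0x3FC00000u	/* assumed to be the new mask */
#define LOAD_EX_VALUE		0x08400000u

/* Assumed encoding of "ldar x0, [x1]"; not taken from the patch. */
#define LDAR_X0_X1		0xC8DFFC20u

/* Hypothetical helper mirroring the mask-and-compare test. */
static bool demo_match(uint32_t insn, uint32_t mask, uint32_t value)
{
	return (insn & mask) == value;
}
```

With these numbers, the LDAR word matches the load_acq pair, also matches
load_ex under the narrow mask, and stops matching load_ex once bit 23 is
included in the mask.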
+29 −0
@@ -540,6 +540,35 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
					     offset >> shift);
}

u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
					enum aarch64_insn_register base,
					enum aarch64_insn_size_type size,
					enum aarch64_insn_ldst_type type)
{
	u32 insn;

	switch (type) {
	case AARCH64_INSN_LDST_LOAD_ACQ:
		insn = aarch64_insn_get_load_acq_value();
		break;
	case AARCH64_INSN_LDST_STORE_REL:
		insn = aarch64_insn_get_store_rel_value();
		break;
	default:
		pr_err("%s: unknown load-acquire/store-release encoding %d\n",
		       __func__, type);
		return AARCH64_BREAK_FAULT;
	}

	insn = aarch64_insn_encode_ldst_size(size, insn);

	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
					    reg);

	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn,
					    base);
}
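
The encoding steps in aarch64_insn_gen_load_acq_store_rel() can be sketched
without the kernel helpers.  This is an illustration under stated
assumptions: demo_encode() is hypothetical, the base values are the "value"
constants from the header hunk, and the field positions (size in bits 31:30,
Rn in bits 9:5, Rt in bits 4:0) are assumed from the A64 load/store layout
rather than quoted from this patch.

```c
#include <stdint.h>

/* "value" constants from the __AARCH64_INSN_FUNCS lines in this commit. */
#define LOAD_ACQ_VALUE	0x08DFFC00u
#define STORE_REL_VALUE	0x089FFC00u

/*
 * Hypothetical encoder.
 * size_log2: 0 = byte, 1 = halfword, 2 = word, 3 = doubleword.
 * rt: data register number, rn: base (address) register number.
 */
static uint32_t demo_encode(uint32_t base_value, unsigned int size_log2,
			    unsigned int rt, unsigned int rn)
{
	uint32_t insn = base_value;

	insn |= (uint32_t)(size_log2 & 0x3) << 30;	/* access size */
	insn |= (uint32_t)(rn & 0x1f) << 5;		/* base register */
	insn |= (uint32_t)(rt & 0x1f);			/* data register */
	return insn;
}
```

For instance, demo_encode(LOAD_ACQ_VALUE, 3, 0, 1) yields 0xC8DFFC20, which
under the assumptions above is the word for a 64-bit LDAR of x0 from [x1].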

u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
				   enum aarch64_insn_register base,
				   enum aarch64_insn_register state,
+20 −0
@@ -119,6 +119,26 @@
	aarch64_insn_gen_load_store_ex(Rt, Rn, Rs, A64_SIZE(sf), \
				       AARCH64_INSN_LDST_STORE_REL_EX)

/* Load-acquire & store-release */
#define A64_LDAR(Rt, Rn, size)  \
	aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
					    AARCH64_INSN_LDST_LOAD_ACQ)
#define A64_STLR(Rt, Rn, size)  \
	aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
					    AARCH64_INSN_LDST_STORE_REL)

/* Rt = [Rn] (load acquire) */
#define A64_LDARB(Wt, Xn)	A64_LDAR(Wt, Xn, 8)
#define A64_LDARH(Wt, Xn)	A64_LDAR(Wt, Xn, 16)
#define A64_LDAR32(Wt, Xn)	A64_LDAR(Wt, Xn, 32)
#define A64_LDAR64(Xt, Xn)	A64_LDAR(Xt, Xn, 64)

/* [Rn] = Rt (store release) */
#define A64_STLRB(Wt, Xn)	A64_STLR(Wt, Xn, 8)
#define A64_STLRH(Wt, Xn)	A64_STLR(Wt, Xn, 16)
#define A64_STLR32(Wt, Xn)	A64_STLR(Wt, Xn, 32)
#define A64_STLR64(Xt, Xn)	A64_STLR(Xt, Xn, 64)

/*
 * LSE atomics
 *
+84 −2
@@ -647,6 +647,81 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
	return 0;
}

static int emit_atomic_ld_st(const struct bpf_insn *insn, struct jit_ctx *ctx)
{
	const s32 imm = insn->imm;
	const s16 off = insn->off;
	const u8 code = insn->code;
	const bool arena = BPF_MODE(code) == BPF_PROBE_ATOMIC;
	const u8 arena_vm_base = bpf2a64[ARENA_VM_START];
	const u8 dst = bpf2a64[insn->dst_reg];
	const u8 src = bpf2a64[insn->src_reg];
	const u8 tmp = bpf2a64[TMP_REG_1];
	u8 reg;

	switch (imm) {
	case BPF_LOAD_ACQ:
		reg = src;
		break;
	case BPF_STORE_REL:
		reg = dst;
		break;
	default:
		pr_err_once("unknown atomic load/store op code %02x\n", imm);
		return -EINVAL;
	}

	if (off) {
		emit_a64_add_i(1, tmp, reg, tmp, off, ctx);
		reg = tmp;
	}
	if (arena) {
		emit(A64_ADD(1, tmp, reg, arena_vm_base), ctx);
		reg = tmp;
	}

	switch (imm) {
	case BPF_LOAD_ACQ:
		switch (BPF_SIZE(code)) {
		case BPF_B:
			emit(A64_LDARB(dst, reg), ctx);
			break;
		case BPF_H:
			emit(A64_LDARH(dst, reg), ctx);
			break;
		case BPF_W:
			emit(A64_LDAR32(dst, reg), ctx);
			break;
		case BPF_DW:
			emit(A64_LDAR64(dst, reg), ctx);
			break;
		}
		break;
	case BPF_STORE_REL:
		switch (BPF_SIZE(code)) {
		case BPF_B:
			emit(A64_STLRB(src, reg), ctx);
			break;
		case BPF_H:
			emit(A64_STLRH(src, reg), ctx);
			break;
		case BPF_W:
			emit(A64_STLR32(src, reg), ctx);
			break;
		case BPF_DW:
			emit(A64_STLR64(src, reg), ctx);
			break;
		}
		break;
	default:
		pr_err_once("unexpected atomic load/store op code %02x\n",
			    imm);
		return -EINVAL;
	}

	return 0;
}
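
The address handling in emit_atomic_ld_st() can be modelled in a few lines.
LDAR and STLR take only a base register, so the JIT first folds the offset,
and for arena accesses the arena base, into a temporary register.  The
sketch below is a hypothetical model of that arithmetic, not JIT code;
demo_op, demo_regs and demo_effective_addr() are invented names.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_op { DEMO_LOAD_ACQ, DEMO_STORE_REL };

/* Hypothetical register file holding the two BPF operand registers. */
struct demo_regs {
	uint64_t dst;
	uint64_t src;
};

/*
 * A load-acquire reads through src, a store-release writes through dst.
 * The signed 16-bit offset is added first, then the arena base if needed.
 */
static uint64_t demo_effective_addr(enum demo_op op,
				    const struct demo_regs *regs,
				    int16_t off, bool arena,
				    uint64_t arena_vm_base)
{
	uint64_t addr = (op == DEMO_LOAD_ACQ) ? regs->src : regs->dst;

	addr += (uint64_t)(int64_t)off;
	if (arena)
		addr += arena_vm_base;
	return addr;
}
```

The model mirrors the order used in the function above: pick the pointer
register by operation, apply the offset, then apply the arena base.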

#ifdef CONFIG_ARM64_LSE_ATOMICS
static int emit_lse_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
{
@@ -1641,11 +1716,17 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 			return ret;
 		break;

+	case BPF_STX | BPF_ATOMIC | BPF_B:
+	case BPF_STX | BPF_ATOMIC | BPF_H:
 	case BPF_STX | BPF_ATOMIC | BPF_W:
 	case BPF_STX | BPF_ATOMIC | BPF_DW:
+	case BPF_STX | BPF_PROBE_ATOMIC | BPF_B:
+	case BPF_STX | BPF_PROBE_ATOMIC | BPF_H:
 	case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
 	case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
-		if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
+		if (bpf_atomic_is_load_store(insn))
+			ret = emit_atomic_ld_st(insn, ctx);
+		else if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
 			ret = emit_lse_atomic(insn, ctx);
 		else
 			ret = emit_ll_sc_atomic(insn, ctx);
@@ -2669,7 +2750,8 @@ bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
 	switch (insn->code) {
 	case BPF_STX | BPF_ATOMIC | BPF_W:
 	case BPF_STX | BPF_ATOMIC | BPF_DW:
-		if (!cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
+		if (!bpf_atomic_is_load_store(insn) &&
+		    !cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
 			return false;
 	}
 	return true;
+10 −4
@@ -2919,10 +2919,16 @@ bool bpf_jit_supports_arena(void)

 bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
 {
-	/*
-	 * Currently the verifier uses this function only to check which
-	 * atomic stores to arena are supported, and they all are.
-	 */
+	if (!in_arena)
+		return true;
+	switch (insn->code) {
+	case BPF_STX | BPF_ATOMIC | BPF_B:
+	case BPF_STX | BPF_ATOMIC | BPF_H:
+	case BPF_STX | BPF_ATOMIC | BPF_W:
+	case BPF_STX | BPF_ATOMIC | BPF_DW:
+		if (bpf_atomic_is_load_store(insn))
+			return false;
+	}
 	return true;
 }
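
The two bpf_jit_supports_insn() hunks in this commit implement different
policies, which the following sketch places side by side.  It is a model
under assumptions: struct demo_insn and the demo_* functions are
hypothetical, has_lse stands in for cpus_have_cap(ARM64_HAS_LSE_ATOMICS),
and the opcode constants are redefined locally with the values I believe the
BPF UAPI header uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed opcode field values; redefined here to stay self-contained. */
#define BPF_STX		0x03
#define BPF_ATOMIC	0xc0
#define BPF_W		0x00
#define BPF_H		0x08
#define BPF_B		0x10
#define BPF_DW		0x18
#define BPF_LOAD_ACQ	0x100
#define BPF_STORE_REL	0x110

/* Hypothetical stand-in for the two struct bpf_insn fields used here. */
struct demo_insn {
	uint8_t code;
	int32_t imm;
};

static bool demo_is_load_store(const struct demo_insn *insn)
{
	return insn->imm == BPF_LOAD_ACQ || insn->imm == BPF_STORE_REL;
}

/* arm64 hunk: only read-modify-write atomics depend on LSE. */
static bool demo_arm64_supports(const struct demo_insn *insn, bool has_lse)
{
	switch (insn->code) {
	case BPF_STX | BPF_ATOMIC | BPF_W:
	case BPF_STX | BPF_ATOMIC | BPF_DW:
		if (!demo_is_load_store(insn) && !has_lse)
			return false;
	}
	return true;
}

/* Last hunk: load-acquire/store-release is refused only inside an arena. */
static bool demo_last_hunk_supports(const struct demo_insn *insn, bool in_arena)
{
	if (!in_arena)
		return true;
	switch (insn->code) {
	case BPF_STX | BPF_ATOMIC | BPF_B:
	case BPF_STX | BPF_ATOMIC | BPF_H:
	case BPF_STX | BPF_ATOMIC | BPF_W:
	case BPF_STX | BPF_ATOMIC | BPF_DW:
		if (demo_is_load_store(insn))
			return false;
	}
	return true;
}
```

In this model a 32-bit load-acquire is accepted by the arm64 policy even
without LSE, while the last hunk's policy rejects the same instruction when
it targets an arena.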
