Commit 31a24ae8 authored by Linus Torvalds
Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
parents 6a713827 a27a8816
+1 −2
@@ -2279,8 +2279,7 @@
				   state is kept private from the host.
				   Not valid if the kernel is running in EL2.

-			Defaults to VHE/nVHE based on hardware support and
-			the value of CONFIG_ARM64_VHE.
+			Defaults to VHE/nVHE based on hardware support.

	kvm-arm.vgic_v3_group0_trap=
			[KVM,ARM] Trap guest accesses to GICv3 group-0
+54 −0
@@ -53,6 +53,60 @@ Example usage of perf::
  $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5

For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
as PMU v1, but some new functions are added to the hardware.

(a) L3C PMU supports filtering by core/thread within the cluster which can be
specified as a bitmap::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5

This will only count the operations from core/thread 0 and 1 in this cluster.
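
As a minimal sketch of how the tt_core bitmap relates to core/thread indices, the Python helper below sets one bit per index and formats an event string in the style of the command above. The helper names (tt_core_bitmap, l3c_event) are hypothetical and not part of perf or the driver. The width of the tt_core field is not stated here, so the sketch does not enforce one.

```python
def tt_core_bitmap(cores):
    """Return a bitmap with bit N set for every core/thread index N in cores."""
    bitmap = 0
    for core in cores:
        if core < 0:
            raise ValueError("core/thread index must be non-negative")
        bitmap |= 1 << core
    return bitmap


def l3c_event(pmu, config, tt_core):
    """Format a perf event string in the style of the examples above."""
    return f"{pmu}/config={config:#04x},tt_core={tt_core:#x}/"


# Cores/threads 0 and 1 give bitmap 0x3, matching the command above.
event = l3c_event("hisi_sccl3_l3c0", 0x02, tt_core_bitmap([0, 1]))
print(event)
```

The resulting string can be passed to "perf stat -a -e".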

(b) Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
3'b111 represents atomic non-store operations; other values are reserved::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

This will only count the read operations in this cluster.
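
The tt_req encodings listed above can be kept in a small lookup table. This is only an illustrative sketch: the dictionary keys and the function name tt_req_term are made up for the example, while the numeric values come from the text.

```python
# tt_req encodings from the text above; the key names are illustrative only.
TT_REQ = {
    "read": 0b100,
    "write": 0b101,
    "atomic_store": 0b110,
    "atomic_non_store": 0b111,
}


def tt_req_term(operation):
    """Return the 'tt_req=...' term for a named operation type."""
    try:
        value = TT_REQ[operation]
    except KeyError:
        raise ValueError(f"unknown or reserved operation: {operation!r}")
    return f"tt_req={value:#x}"


print(tt_req_term("read"))
```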

(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
Some important codes are as follows:
5'b00001: comes from L3C in this die;
5'b01000: comes from L3C in the cross-die;
5'b01001: comes from L3C which is in another socket;
5'b01110: comes from the local DDR;
5'b01111: comes from the cross-die DDR;
5'b10000: comes from cross-socket DDR;
and so on. This is mainly useful for finding the data source nearest to the CPU
cores. If datasrc_cfg is used on a multi-chip system, datasrc_skt must also be
configured in the perf command::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
  hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
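
To read the datasrc_cfg values in the command above, a lookup over the codes listed in this section can help. The sketch below covers only the codes named here; any other value is reported as not listed, since this text does not define it. The function name describe_datasrc is hypothetical.

```python
# Data source codes listed in this section (5-bit values).
DATASRC = {
    0b00001: "L3C in this die",
    0b01000: "L3C in the cross-die",
    0b01001: "L3C in another socket",
    0b01110: "local DDR",
    0b01111: "cross-die DDR",
    0b10000: "cross-socket DDR",
}


def describe_datasrc(code):
    """Map a datasrc_cfg value to the description given in this section."""
    if not 0 <= code < (1 << 5):
        raise ValueError("datasrc_cfg is a 5-bit value")
    return DATASRC.get(code, "not listed in this section")


# The two events in the command above use 0xE and 0xF.
for cfg in (0xE, 0xF):
    print(f"datasrc_cfg={cfg:#x}: {describe_datasrc(cfg)}")
```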

(d) Some HiSilicon SoCs encapsulate multiple CPU and I/O dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID. Each ID is 11 bits, comprising a 6-bit SCCL-ID and a
5-bit CCL/ICL-ID. For an I/O die, the ICL-ID is one of the following:
5'b00000: I/O_MGMT_ICL;
5'b00001: Network_ICL;
5'b00011: HAC_ICL;
5'b10000: PCIe_ICL;

Users can configure IDs to count data coming from a specific CCL/ICL by setting
srcid_cmd and srcid_msk, and data destined for a specific CCL/ICL by setting
tgtid_cmd and tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
check the bit when matching against the srcid_cmd/tgtid_cmd.
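
The matching rule above can be modelled in a few lines. This is a sketch under one stated assumption: the 6-bit SCCL-ID is taken to occupy the upper bits of the 11-bit ID and the 5-bit CCL/ICL-ID the lower bits, which the text does not spell out. The function names pack_id and id_matches are hypothetical.

```python
SCCL_BITS = 6
CCL_BITS = 5
ID_MASK = (1 << (SCCL_BITS + CCL_BITS)) - 1  # 11-bit ID


def pack_id(sccl_id, ccl_id):
    """Build an 11-bit ID (assumption: SCCL-ID in the upper 6 bits)."""
    if not 0 <= sccl_id < (1 << SCCL_BITS):
        raise ValueError("SCCL-ID must fit in 6 bits")
    if not 0 <= ccl_id < (1 << CCL_BITS):
        raise ValueError("CCL/ICL-ID must fit in 5 bits")
    return (sccl_id << CCL_BITS) | ccl_id


def id_matches(observed, cmd, msk):
    """Compare observed with cmd, ignoring every bit that is set in msk."""
    care = ~msk & ID_MASK
    return (observed & care) == (cmd & care)


# Match any CCL/ICL inside SCCL 3 by masking out the lower 5 bits.
cmd = pack_id(3, 0)
msk = 0x1F
print(id_matches(pack_id(3, 0b10000), cmd, msk))  # same SCCL, any ICL
print(id_matches(pack_id(2, 0b10000), cmd, msk))  # different SCCL
```

With a mask of 0 every bit is compared, so only an exact ID match counts.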

If all of these options are disabled, the PMU uses its default behaviour: it
does not distinguish by filter condition or ID information and returns the
total counts in the PMU counters.

The current driver does not support sampling, so "perf record" is unsupported.
Attaching to a task is also unsupported, as the events are all uncore.

+10 −3
@@ -202,9 +202,10 @@ Before jumping into the kernel, the following conditions must be met:

- System registers

-  All writable architected system registers at the exception level where
-  the kernel image will be entered must be initialised by software at a
-  higher exception level to prevent execution in an UNKNOWN state.
+  All writable architected system registers at or below the exception
+  level where the kernel image will be entered must be initialised by
+  software at a higher exception level to prevent execution in an UNKNOWN
+  state.

  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
    executing on.
@@ -270,6 +271,12 @@ Before jumping into the kernel, the following conditions must be met:
      having 0b1 set for the corresponding bit for each of the auxiliary
      counters present.

  For CPUs with the Fine Grained Traps (FEAT_FGT) extension present:

  - If EL3 is present and the kernel is entered at EL2:

    - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1.
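
As a small illustration of the bit position named above, the following sketch tests bit 27 in a raw SCR_EL3 value. It is only a model for checking a register dump offline. Reading SCR_EL3 itself requires EL3 firmware code and is outside the scope of this sketch, and the name fgten_is_set is hypothetical.

```python
SCR_EL3_FGTEN = 1 << 27  # bit position given in the requirement above


def fgten_is_set(scr_el3):
    """Return True if the FGTEn bit is set in a raw SCR_EL3 value."""
    return bool(scr_el3 & SCR_EL3_FGTEN)


print(fgten_is_set(0x0800_0000))  # only bit 27 set
print(fgten_is_set(0x0000_0401))  # bit 27 clear
```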

The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs.  All CPUs must
enter the kernel in the same exception level.
+34 −0
@@ -107,3 +107,37 @@ filter out the Pointer Authentication system key registers from
KVM_GET/SET_REG_* ioctls and mask those features from cpufeature ID
register. Any attempt to use the Pointer Authentication instructions will
result in an UNDEFINED exception being injected into the guest.


Enabling and disabling keys
---------------------------

The prctl PR_PAC_SET_ENABLED_KEYS allows the user program to control which
PAC keys are enabled in a particular task. It takes two arguments, the
first being a bitmask of PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY
and PR_PAC_APDBKEY specifying which keys shall be affected by this prctl,
and the second being a bitmask of the same bits specifying whether the key
should be enabled or disabled. For example::

  prctl(PR_PAC_SET_ENABLED_KEYS,
        PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY,
        PR_PAC_APIBKEY, 0, 0);

disables all keys except the IB key.
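
The two-bitmask semantics can be modelled as follows. This is a sketch of the behaviour described above, not the kernel implementation. The bit values are the PR_PAC_* constants from the Linux UAPI prctl header. The sketch rejects enable bits that fall outside the affected mask, because the text above does not say how such input is handled. The function name apply_enabled_keys is hypothetical.

```python
# Key bits as defined in the Linux UAPI header <linux/prctl.h>.
PR_PAC_APIAKEY = 1 << 0
PR_PAC_APIBKEY = 1 << 1
PR_PAC_APDAKEY = 1 << 2
PR_PAC_APDBKEY = 1 << 3
ALL_KEYS = PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY


def apply_enabled_keys(current, affected, enabled):
    """Model the key state after PR_PAC_SET_ENABLED_KEYS.

    Keys in 'affected' take their new state from 'enabled'; all other
    keys keep their state from 'current'.
    """
    if affected & ~ALL_KEYS:
        raise ValueError("unknown key bits in affected mask")
    if enabled & ~affected:
        # Assumption of this sketch: treat such input as invalid.
        raise ValueError("enabled contains keys outside the affected mask")
    return (current & ~affected) | enabled


# Processes start with IA, IB, DA and DB enabled.
state = apply_enabled_keys(ALL_KEYS, ALL_KEYS, PR_PAC_APIBKEY)
print(f"{state:#x}")  # only the IB key remains enabled
```

Passing a narrower affected mask, for example only PR_PAC_APDAKEY with an enabled mask of 0, disables that key and leaves the others unchanged.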

The main reason why this is useful is to enable a userspace ABI that uses PAC
instructions to sign and authenticate function pointers and other pointers
exposed outside of the function, while still allowing binaries conforming to
the ABI to interoperate with legacy binaries that do not sign or authenticate
pointers.

The idea is that a dynamic loader or early startup code would issue this
prctl very early after establishing that a process may load legacy binaries,
but before executing any PAC instructions.

For compatibility with previous kernel versions, processes start up with IA,
IB, DA and DB enabled, and are reset to this state on exec(). Processes created
via fork() and clone() inherit the key enabled state from the calling process.

It is recommended to avoid disabling the IA key, as this has higher performance
overhead than disabling any of the other keys.
+1 −1
@@ -40,7 +40,7 @@ space obtained in one of the following ways:
  during creation and with the same restrictions as for ``mmap()`` above
  (e.g. data, bss, stack).

-The AArch64 Tagged Address ABI has two stages of relaxation depending
+The AArch64 Tagged Address ABI has two stages of relaxation depending on
how the user addresses are used by the kernel:

1. User addresses not accessed by the kernel but used for address space