Commit 7e340f4f authored by Palmer Dabbelt

Merge patch series "Svvptc extension to remove preventive sfence.vma"

Alexandre Ghiti <alexghiti@rivosinc.com> says:

In RISC-V, after a new mapping is established, an sfence.vma needs to be
emitted, for one of two reasons depending on the uarch:

- if the uarch caches invalid entries, the stale entry must be invalidated,
  otherwise an access would trap on it,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mappings are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patches 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com


Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
parents 1845d381 7a21b2e3
+7 −0
@@ -171,6 +171,13 @@ properties:
             memory types as ratified in the 20191213 version of the privileged
             ISA specification.

+        - const: svvptc
+          description:
+            The standard Svvptc supervisor-level extension for
+            address-translation cache behaviour with respect to invalid entries
+            as ratified at commit 4a69197e5617 ("Update to ratified state") of
+            riscv-svvptc.
+
         - const: zacas
           description: |
             The Zacas extension for Atomic Compare-and-Swap (CAS) instructions
+17 −1
@@ -46,7 +46,23 @@ do { \
 } while (0)

 #ifdef CONFIG_64BIT
-#define flush_cache_vmap(start, end)		flush_tlb_kernel_range(start, end)
+extern u64 new_vmalloc[NR_CPUS / sizeof(u64) + 1];
+extern char _end[];
+#define flush_cache_vmap flush_cache_vmap
+static inline void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+	if (is_vmalloc_or_module_addr((void *)start)) {
+		int i;
+
+		/*
+		 * We don't care if concurrently a cpu resets this value since
+		 * the only place this can happen is in handle_exception() where
+		 * an sfence.vma is emitted.
+		 */
+		for (i = 0; i < ARRAY_SIZE(new_vmalloc); ++i)
+			new_vmalloc[i] = -1ULL;
+	}
+}
 #define flush_cache_vmap_early(start, end)	local_flush_tlb_kernel_range(start, end)
 #endif

+1 −0
@@ -92,6 +92,7 @@
 #define RISCV_ISA_EXT_ZCF		83
 #define RISCV_ISA_EXT_ZCMOP		84
 #define RISCV_ISA_EXT_ZAWRS		85
+#define RISCV_ISA_EXT_SVVPTC		86

 #define RISCV_ISA_EXT_XLINUXENVCFG	127

+10 −0
@@ -497,6 +497,9 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
 		struct vm_area_struct *vma, unsigned long address,
 		pte_t *ptep, unsigned int nr)
 {
+	asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1)
+		 : : : : svvptc);
+
 	/*
 	 * The kernel assumes that TLBs don't cache invalid entries, but
 	 * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
@@ -506,6 +509,13 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
 	 */
 	while (nr--)
 		local_flush_tlb_page(address + nr * PAGE_SIZE);
+
+svvptc:;
+	/*
+	 * Svvptc guarantees that the new valid pte will be visible within
+	 * a bounded timeframe, so when the uarch does not cache invalid
+	 * entries, we don't have to do anything.
+	 */
 }
 #define update_mmu_cache(vma, addr, ptep) \
 	update_mmu_cache_range(NULL, vma, addr, ptep, 1)
+7 −0
@@ -61,6 +61,13 @@ struct thread_info {
 	void			*scs_base;
 	void			*scs_sp;
 #endif
+#ifdef CONFIG_64BIT
+	/*
+	 * Used in handle_exception() to save a0, a1 and a2 before knowing if we
+	 * can access the kernel stack.
+	 */
+	unsigned long		a0, a1, a2;
+#endif
 };

 #ifdef CONFIG_SHADOW_CALL_STACK