+10
−1
Loading
After removing uneccessary TLBIs, the next bottleneck when creating the
page tables for the linear map is DSB and ISB, which were previously
issued per-pte in __set_pte(). Since we are writing multiple ptes in a
given pte table, we can elide these barriers and insert them once we
have finished writing to the table.
Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:
| Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
| VM, 16G | VM, 64G | VM, 256G | Metal, 512G
---------------|-------------|-------------|-------------|-------------
| ms (%) | ms (%) | ms (%) | ms (%)
---------------|-------------|-------------|-------------|-------------
before | 78 (0%) | 435 (0%) | 1723 (0%) | 3779 (0%)
after | 11 (-86%) | 161 (-63%) | 656 (-62%) | 1654 (-56%)
Signed-off-by:
Ryan Roberts <ryan.roberts@arm.com>
Tested-by:
Itaru Kitayama <itaru.kitayama@fujitsu.com>
Tested-by:
Eric Chanudet <echanude@redhat.com>
Reviewed-by:
Mark Rutland <mark.rutland@arm.com>
Reviewed-by:
Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-3-ryan.roberts@arm.com
Signed-off-by:
Will Deacon <will@kernel.org>