Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "powerpc/64s: do not re-activate batched TLB flush" makes
   arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)

   It adds a generic enter/leave layer and switches architectures to use
   it. Various hacks were removed in the process.

 - "zram: introduce compressed data writeback" implements data
   compression for zram writeback (Richard Chang and Sergey Senozhatsky)

 - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
   page ranges for hugepages. Large improvements during demand faulting
   are demonstrated (David Hildenbrand)

 - "memcg cleanups" tidies up some memcg code (Chen Ridong)

 - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
   stats" improves DAMOS stat's provided information, deterministic
   control, and readability (SeongJae Park)

 - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
   issues in the hugetlb cgroup charging selftests (Li Wang)

 - "Fix va_high_addr_switch.sh test failure - again" addresses several
   issues in the va_high_addr_switch test (Chunyu Hu)

 - "mm/damon/tests/core-kunit: extend existing test scenarios" improves
   the KUnit test coverage for DAMON (Shu Anzai)

 - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
   glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
   transiently return -EAGAIN (Shivank Garg)

 - "arch, mm: consolidate hugetlb early reservation" reworks and
   consolidates a pile of straggly code related to reservation of
   hugetlb memory from bootmem and creation of CMA areas for hugetlb
   (Mike Rapoport)

 - "mm: clean up anon_vma implementation" cleans up the anon_vma
   implementation in various ways (Lorenzo Stoakes)

 - "tweaks for __alloc_pages_slowpath()" does a little streamlining of
   the page allocator's slowpath code (Vlastimil Babka)

 - "memcg: separate private and public ID namespaces" cleans up the
   memcg ID code and prevents the internal-only private IDs from being
   exposed to userspace (Shakeel Butt)

 - "mm: hugetlb: allocate frozen gigantic folio" cleans up the
   allocation of frozen folios and avoids some atomic refcount
   operations (Kefeng Wang)

 - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
   of memory betewwn the active and inactive LRUs and adds auto-tuning
   of the ratio-based quotas and of monitoring intervals (SeongJae Park)

 - "Support page table check on PowerPC" makes
   CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)

 - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
   nodes_and() and nodes_andnot() propagate the return values from the
   underlying bit operations, enabling some cleanup in calling code
   (Yury Norov)

 - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
   some DAMON internal interfaces (SeongJae Park)

 - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
   in khupaged and fixes a scan limit accounting issue (Shivank Garg)

 - "mm: balloon infrastructure cleanups" goes to town on the balloon
   infrastructure and its page migration function. Mainly cleanups, also
   some locking simplification (David Hildenbrand)

 - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
   additional tracepoints to the page reclaim code (Jiayuan Chen)

 - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
   part of Marco's kernel-wide migration from the legacy workqueue APIs
   over to the preferred unbound workqueues (Marco Crivellari)

 - "Various mm kselftests improvements/fixes" provides various unrelated
   improvements/fixes for the mm kselftests (Kevin Brodsky)

 - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
   folio allocation, mainly by avoiding unnecessary work in
   pfn_range_valid_contig() (Kefeng Wang)

 - "selftests/damon: improve leak detection and wss estimation
   reliability" improves the reliability of two of the DAMON selftests
   (SeongJae Park)

 - "mm/damon: cleanup kdamond, damon_call(), damos filter and
   DAMON_MIN_REGION" does some cleanup work in the core DAMON code
   (SeongJae Park)

 - "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
   performs maintenance work on the DAMON documentation (SeongJae Park)

 - "mm: add and use vma_assert_stabilised() helper" refactors and cleans
   up the core VMA code. The main aim here is to be able to use the mmap
   write lock's lockdep state to perform various assertions regarding
   the locking which the VMA code requires (Lorenzo Stoakes)

 - "mm, swap: swap table phase II: unify swapin use" removes some old
   swap code (swap cache bypassing and swap synchronization) which
   wasn't working very well. Various other cleanups and simplifications
   were made. The end result is a 20% speedup in one benchmark (Kairui
   Song)

 - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
   available on 64-bit alpha, loongarch, mips, parisc, and um. Various
   cleanups were performed along the way (Qi Zheng)

* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
  mm/memory: handle non-split locks correctly in zap_empty_pte_table()
  mm: move pte table reclaim code to memory.c
  mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
  mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
  um: mm: enable MMU_GATHER_RCU_TABLE_FREE
  parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
  LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
  alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
  mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
  zsmalloc: make common caches global
  mm: add SPDX id lines to some mm source files
  mm/zswap: use %pe to print error pointers
  mm/vmscan: use %pe to print error pointers
  mm/readahead: fix typo in comment
  mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
  mm: refactor vma_map_pages to use vm_insert_pages
  mm/damon: unify address range representation with damon_addr_range
  mm/cma: replace snprintf with strscpy in cma_new_area
  ...
This commit is contained in:
Linus Torvalds
2026-02-12 11:32:37 -08:00
332 changed files with 6257 additions and 5611 deletions

View File

@@ -172,6 +172,7 @@ config PPC
select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC_BOOK3S || PPC_8xx
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if !HUGETLB_PAGE
select ARCH_SUPPORTS_SCHED_MC if SMP
select ARCH_SUPPORTS_SCHED_SMT if PPC64 && SMP
select SCHED_MC if ARCH_SUPPORTS_SCHED_MC
@@ -304,6 +305,7 @@ config PPC
select LOCK_MM_AND_FIND_VMA
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
select HAVE_ARCH_TLB_REMOVE_TABLE
select MMU_GATHER_MERGE_VMAS
select MMU_LAZY_TLB_SHOOTDOWN if PPC_BOOK3S_64
select MODULES_USE_ELF_RELA

View File

@@ -198,6 +198,7 @@ void unmap_kernel_page(unsigned long va);
#ifndef __ASSEMBLER__
#include <linux/sched.h>
#include <linux/threads.h>
#include <linux/page_table_check.h>
/* Bits to mask out from a PGD to get to the PUD page */
#define PGD_MASKED_BITS 0
@@ -311,7 +312,11 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
pte_t *ptep)
{
return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
page_table_check_pte_clear(mm, addr, old_pte);
return old_pte;
}
#define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -433,6 +438,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write)
return true;
}
static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr)
{
return pte_present(pte) && !is_kernel_addr(addr);
}
/* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
*

View File

@@ -144,6 +144,8 @@
#define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
#ifndef __ASSEMBLER__
#include <linux/page_table_check.h>
/*
* page table defines
*/
@@ -416,8 +418,11 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
return __pte(old);
pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
page_table_check_pte_clear(mm, addr, old_pte);
return old_pte;
}
#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
@@ -426,11 +431,16 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
pte_t *ptep, int full)
{
if (full && radix_enabled()) {
pte_t old_pte;
/*
* We know that this is a full mm pte clear and
* hence can be sure there is no parallel set_pte.
*/
return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
old_pte = radix__ptep_get_and_clear_full(mm, addr, ptep, full);
page_table_check_pte_clear(mm, addr, old_pte);
return old_pte;
}
return ptep_get_and_clear(mm, addr, ptep);
}
@@ -539,6 +549,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write)
return arch_pte_access_permitted(pte_val(pte), write, 0);
}
static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr)
{
return pte_present(pte) && pte_user(pte);
}
/*
* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
@@ -909,6 +924,12 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
return pte_access_permitted(pud_pte(pud), write);
}
#define pud_user_accessible_page pud_user_accessible_page
static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr)
{
return pud_leaf(pud) && pte_user_accessible_page(pud_pte(pud), addr);
}
#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
static inline __be64 p4d_raw(p4d_t x)
{
@@ -1074,6 +1095,12 @@ static inline bool pmd_access_permitted(pmd_t pmd, bool write)
return pte_access_permitted(pmd_pte(pmd), write);
}
#define pmd_user_accessible_page pmd_user_accessible_page
static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr)
{
return pmd_leaf(pmd) && pte_user_accessible_page(pmd_pte(pmd), addr);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
extern pud_t pfn_pud(unsigned long pfn, pgprot_t pgprot);
@@ -1284,19 +1311,34 @@ extern int pudp_test_and_clear_young(struct vm_area_struct *vma,
static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
unsigned long addr, pmd_t *pmdp)
{
if (radix_enabled())
return radix__pmdp_huge_get_and_clear(mm, addr, pmdp);
return hash__pmdp_huge_get_and_clear(mm, addr, pmdp);
pmd_t old_pmd;
if (radix_enabled()) {
old_pmd = radix__pmdp_huge_get_and_clear(mm, addr, pmdp);
} else {
old_pmd = hash__pmdp_huge_get_and_clear(mm, addr, pmdp);
}
page_table_check_pmd_clear(mm, addr, old_pmd);
return old_pmd;
}
#define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR
static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
unsigned long addr, pud_t *pudp)
{
if (radix_enabled())
return radix__pudp_huge_get_and_clear(mm, addr, pudp);
BUG();
return *pudp;
pud_t old_pud;
if (radix_enabled()) {
old_pud = radix__pudp_huge_get_and_clear(mm, addr, pudp);
} else {
BUG();
}
page_table_check_pud_clear(mm, addr, old_pud);
return old_pud;
}
static inline pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,

View File

@@ -12,7 +12,6 @@
#define PPC64_TLB_BATCH_NR 192
struct ppc64_tlb_batch {
int active;
unsigned long index;
struct mm_struct *mm;
real_pte_t pte[PPC64_TLB_BATCH_NR];
@@ -24,12 +23,8 @@ DECLARE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch);
extern void __flush_tlb_pending(struct ppc64_tlb_batch *batch);
#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
static inline void arch_enter_lazy_mmu_mode(void)
{
struct ppc64_tlb_batch *batch;
if (radix_enabled())
return;
/*
@@ -37,11 +32,9 @@ static inline void arch_enter_lazy_mmu_mode(void)
* operating on kernel page tables.
*/
preempt_disable();
batch = this_cpu_ptr(&ppc64_tlb_batch);
batch->active = 1;
}
static inline void arch_leave_lazy_mmu_mode(void)
static inline void arch_flush_lazy_mmu_mode(void)
{
struct ppc64_tlb_batch *batch;
@@ -51,11 +44,16 @@ static inline void arch_leave_lazy_mmu_mode(void)
if (batch->index)
__flush_tlb_pending(batch);
batch->active = 0;
preempt_enable();
}
#define arch_flush_lazy_mmu_mode() do {} while (0)
static inline void arch_leave_lazy_mmu_mode(void)
{
if (radix_enabled())
return;
arch_flush_lazy_mmu_mode();
preempt_enable();
}
extern void hash__tlbiel_all(unsigned int action);

View File

@@ -68,7 +68,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t pte, int dirty);
void gigantic_hugetlb_cma_reserve(void) __init;
#include <asm-generic/hugetlb.h>
#else /* ! CONFIG_HUGETLB_PAGE */
@@ -77,10 +76,6 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma,
{
}
static inline void __init gigantic_hugetlb_cma_reserve(void)
{
}
static inline void __init hugetlbpage_init_defaultsize(void)
{
}

View File

@@ -29,6 +29,8 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
#ifndef __ASSEMBLER__
#include <linux/page_table_check.h>
extern int icache_44x_need_flush;
#ifndef pte_huge_size
@@ -122,7 +124,11 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
pte_t *ptep)
{
return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
page_table_check_pte_clear(mm, addr, old_pte);
return old_pte;
}
#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
@@ -243,6 +249,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write)
return true;
}
static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr)
{
return pte_present(pte) && !is_kernel_addr(addr);
}
/* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
*

View File

@@ -271,6 +271,7 @@ static inline const void *pfn_to_kaddr(unsigned long pfn)
struct page;
extern void clear_user_page(void *page, unsigned long vaddr, struct page *pg);
#define clear_user_page clear_user_page
extern void copy_user_page(void *to, void *from, unsigned long vaddr,
struct page *p);
extern int devmem_is_allowed(unsigned long pfn);

View File

@@ -34,6 +34,8 @@ struct mm_struct;
void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
pte_t pte, unsigned int nr);
#define set_ptes set_ptes
void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte);
#define update_mmu_cache(vma, addr, ptep) \
update_mmu_cache_range(NULL, vma, addr, ptep, 1)
@@ -202,6 +204,14 @@ static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size)
#endif /* CONFIG_PPC64 */
#ifndef pmd_user_accessible_page
#define pmd_user_accessible_page(pmd, addr) false
#endif
#ifndef pud_user_accessible_page
#define pud_user_accessible_page(pud, addr) false
#endif
#endif /* __ASSEMBLER__ */
#endif /* _ASM_POWERPC_PGTABLE_H */

View File

@@ -20,7 +20,11 @@ extern void reloc_got2(unsigned long);
void check_for_initrd(void);
void mem_topology_setup(void);
#ifdef CONFIG_NUMA
void initmem_init(void);
#else
static inline void initmem_init(void) {}
#endif
void setup_panic(void);
#define ARCH_PANIC_TIMEOUT 180

View File

@@ -154,12 +154,10 @@ void arch_setup_new_exec(void);
/* Don't move TLF_NAPPING without adjusting the code in entry_32.S */
#define TLF_NAPPING 0 /* idle thread enabled NAP mode */
#define TLF_SLEEPING 1 /* suspend code enabled SLEEP mode */
#define TLF_LAZY_MMU 3 /* tlb_batch is active */
#define TLF_RUNLATCH 4 /* Is the runlatch enabled? */
#define _TLF_NAPPING (1 << TLF_NAPPING)
#define _TLF_SLEEPING (1 << TLF_SLEEPING)
#define _TLF_LAZY_MMU (1 << TLF_LAZY_MMU)
#define _TLF_RUNLATCH (1 << TLF_RUNLATCH)
#ifndef __ASSEMBLER__

View File

@@ -37,7 +37,6 @@ extern void tlb_flush(struct mmu_gather *tlb);
*/
#define tlb_needs_table_invalidate() radix_enabled()
#define __HAVE_ARCH_TLB_REMOVE_TABLE
/* Get the generic bits... */
#include <asm-generic/tlb.h>

View File

@@ -1281,9 +1281,6 @@ struct task_struct *__switch_to(struct task_struct *prev,
{
struct thread_struct *new_thread, *old_thread;
struct task_struct *last;
#ifdef CONFIG_PPC_64S_HASH_MMU
struct ppc64_tlb_batch *batch;
#endif
new_thread = &new->thread;
old_thread = &current->thread;
@@ -1291,14 +1288,6 @@ struct task_struct *__switch_to(struct task_struct *prev,
WARN_ON(!irqs_disabled());
#ifdef CONFIG_PPC_64S_HASH_MMU
batch = this_cpu_ptr(&ppc64_tlb_batch);
if (batch->active) {
current_thread_info()->local_flags |= _TLF_LAZY_MMU;
if (batch->index)
__flush_tlb_pending(batch);
batch->active = 0;
}
/*
* On POWER9 the copy-paste buffer can only paste into
* foreign real addresses, so unprivileged processes can not
@@ -1369,20 +1358,6 @@ struct task_struct *__switch_to(struct task_struct *prev,
*/
#ifdef CONFIG_PPC_BOOK3S_64
#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* This applies to a process that was context switched while inside
* arch_enter_lazy_mmu_mode(), to re-activate the batch that was
* deactivated above, before _switch(). This will never be the case
* for new tasks.
*/
if (current_thread_info()->local_flags & _TLF_LAZY_MMU) {
current_thread_info()->local_flags &= ~_TLF_LAZY_MMU;
batch = this_cpu_ptr(&ppc64_tlb_batch);
batch->active = 1;
}
#endif
/*
* Math facilities are masked out of the child MSR in copy_thread.
* A new task does not need to restore_math because it will

View File

@@ -1003,7 +1003,6 @@ void __init setup_arch(char **cmdline_p)
fadump_cma_init();
kdump_cma_reserve();
kvm_cma_reserve();
gigantic_hugetlb_cma_reserve();
early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn << PAGE_SHIFT);

View File

@@ -8,6 +8,7 @@
#include <linux/sched.h>
#include <linux/mm_types.h>
#include <linux/mm.h>
#include <linux/page_table_check.h>
#include <linux/stop_machine.h>
#include <asm/sections.h>
@@ -230,6 +231,9 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
pmd = *pmdp;
pmd_clear(pmdp);
page_table_check_pmd_clear(vma->vm_mm, address, pmd);
/*
* Wait for all pending hash_page to finish. This is needed
* in case of subpage collapse. When we collapse normal pages

View File

@@ -25,11 +25,12 @@
#include <asm/tlb.h>
#include <asm/bug.h>
#include <asm/pte-walk.h>
#include <kunit/visibility.h>
#include <trace/events/thp.h>
DEFINE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch);
EXPORT_SYMBOL_IF_KUNIT(ppc64_tlb_batch);
/*
* A linux PTE was changed and the corresponding hash table entry
@@ -100,7 +101,7 @@ void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
* Check if we have an active batch on this CPU. If not, just
* flush now and return.
*/
if (!batch->active) {
if (!is_lazy_mmu_mode_active()) {
flush_hash_page(vpn, rpte, psize, ssize, mm_is_thread_local(mm));
put_cpu_var(ppc64_tlb_batch);
return;
@@ -154,6 +155,7 @@ void __flush_tlb_pending(struct ppc64_tlb_batch *batch)
flush_hash_range(i, local);
batch->index = 0;
}
EXPORT_SYMBOL_IF_KUNIT(__flush_tlb_pending);
void hash__tlb_flush(struct mmu_gather *tlb)
{
@@ -205,7 +207,7 @@ void __flush_hash_table_range(unsigned long start, unsigned long end)
* way to do things but is fine for our needs here.
*/
local_irq_save(flags);
arch_enter_lazy_mmu_mode();
lazy_mmu_mode_enable();
for (; start < end; start += PAGE_SIZE) {
pte_t *ptep = find_init_mm_pte(start, &hugepage_shift);
unsigned long pte;
@@ -217,7 +219,7 @@ void __flush_hash_table_range(unsigned long start, unsigned long end)
continue;
hpte_need_flush(&init_mm, start, ptep, pte, hugepage_shift);
}
arch_leave_lazy_mmu_mode();
lazy_mmu_mode_disable();
local_irq_restore(flags);
}
@@ -237,7 +239,7 @@ void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long
* way to do things but is fine for our needs here.
*/
local_irq_save(flags);
arch_enter_lazy_mmu_mode();
lazy_mmu_mode_enable();
start_pte = pte_offset_map(pmd, addr);
if (!start_pte)
goto out;
@@ -249,6 +251,6 @@ void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long
}
pte_unmap(start_pte);
out:
arch_leave_lazy_mmu_mode();
lazy_mmu_mode_disable();
local_irq_restore(flags);
}

View File

@@ -10,6 +10,7 @@
#include <linux/pkeys.h>
#include <linux/debugfs.h>
#include <linux/proc_fs.h>
#include <linux/page_table_check.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
@@ -127,7 +128,8 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
WARN_ON(!(pmd_leaf(pmd)));
#endif
trace_hugepage_set_pmd(addr, pmd_val(pmd));
return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
page_table_check_pmd_set(mm, addr, pmdp, pmd);
return set_pte_at_unchecked(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
}
void set_pud_at(struct mm_struct *mm, unsigned long addr,
@@ -144,7 +146,8 @@ void set_pud_at(struct mm_struct *mm, unsigned long addr,
WARN_ON(!(pud_leaf(pud)));
#endif
trace_hugepage_set_pud(addr, pud_val(pud));
return set_pte_at(mm, addr, pudp_ptep(pudp), pud_pte(pud));
page_table_check_pud_set(mm, addr, pudp, pud);
return set_pte_at_unchecked(mm, addr, pudp_ptep(pudp), pud_pte(pud));
}
static void do_serialize(void *arg)
@@ -179,23 +182,27 @@ void serialize_against_pte_lookup(struct mm_struct *mm)
pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
{
unsigned long old_pmd;
pmd_t old_pmd;
VM_WARN_ON_ONCE(!pmd_present(*pmdp));
old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID);
old_pmd = __pmd(pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID));
flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
return __pmd(old_pmd);
page_table_check_pmd_clear(vma->vm_mm, address, old_pmd);
return old_pmd;
}
pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
pud_t *pudp)
{
unsigned long old_pud;
pud_t old_pud;
VM_WARN_ON_ONCE(!pud_present(*pudp));
old_pud = pud_hugepage_update(vma->vm_mm, address, pudp, _PAGE_PRESENT, _PAGE_INVALID);
old_pud = __pud(pud_hugepage_update(vma->vm_mm, address, pudp, _PAGE_PRESENT, _PAGE_INVALID));
flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE);
return __pud(old_pud);
page_table_check_pud_clear(vma->vm_mm, address, old_pud);
return old_pud;
}
pmd_t pmdp_huge_get_and_clear_full(struct vm_area_struct *vma,
@@ -550,7 +557,7 @@ void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
if (radix_enabled())
return radix__ptep_modify_prot_commit(vma, addr,
ptep, old_pte, pte);
set_pte_at(vma->vm_mm, addr, ptep, pte);
set_pte_at_unchecked(vma->vm_mm, addr, ptep, pte);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE

View File

@@ -14,6 +14,7 @@
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/mm.h>
#include <linux/page_table_check.h>
#include <linux/hugetlb.h>
#include <linux/string_helpers.h>
#include <linux/memory.h>
@@ -1474,6 +1475,8 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
pmd = *pmdp;
pmd_clear(pmdp);
page_table_check_pmd_clear(vma->vm_mm, address, pmd);
radix__flush_tlb_collapsed_pmd(vma->vm_mm, address);
return pmd;
@@ -1606,7 +1609,7 @@ void radix__ptep_modify_prot_commit(struct vm_area_struct *vma,
(atomic_read(&mm->context.copros) > 0))
radix__flush_tlb_page(vma, addr);
set_pte_at(mm, addr, ptep, pte);
set_pte_at_unchecked(mm, addr, ptep, pte);
}
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
@@ -1617,7 +1620,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
if (!radix_enabled())
return 0;
set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pud);
set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pud);
return 1;
}
@@ -1664,7 +1667,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
if (!radix_enabled())
return 0;
set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pmd);
set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pmd);
return 1;
}

View File

@@ -73,13 +73,13 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
if (!pte)
return;
arch_enter_lazy_mmu_mode();
lazy_mmu_mode_enable();
for (; npages > 0; --npages) {
pte_update(mm, addr, pte, 0, 0, 0);
addr += PAGE_SIZE;
++pte;
}
arch_leave_lazy_mmu_mode();
lazy_mmu_mode_disable();
pte_unmap_unlock(pte - 1, ptl);
}

View File

@@ -200,18 +200,15 @@ static int __init hugetlbpage_init(void)
arch_initcall(hugetlbpage_init);
void __init gigantic_hugetlb_cma_reserve(void)
unsigned int __init arch_hugetlb_cma_order(void)
{
unsigned long order = 0;
if (radix_enabled())
order = PUD_SHIFT - PAGE_SHIFT;
return PUD_SHIFT - PAGE_SHIFT;
else if (!firmware_has_feature(FW_FEATURE_LPAR) && mmu_psize_defs[MMU_PAGE_16G].shift)
/*
* For pseries we do use ibm,expected#pages for reserving 16G pages.
*/
order = mmu_psize_to_shift(MMU_PAGE_16G) - PAGE_SHIFT;
return mmu_psize_to_shift(MMU_PAGE_16G) - PAGE_SHIFT;
if (order)
hugetlb_cma_reserve(order);
return 0;
}

View File

@@ -182,11 +182,6 @@ void __init mem_topology_setup(void)
memblock_set_node(0, PHYS_ADDR_MAX, &memblock.memory, 0);
}
void __init initmem_init(void)
{
sparse_init();
}
/* mark pages that don't exist as nosave */
static int __init mark_nonram_nosave(void)
{
@@ -221,7 +216,16 @@ static int __init mark_nonram_nosave(void)
* anyway) will take a first dip into ZONE_NORMAL and get otherwise served by
* ZONE_DMA.
*/
static unsigned long max_zone_pfns[MAX_NR_ZONES];
void __init arch_zone_limits_init(unsigned long *max_zone_pfns)
{
#ifdef CONFIG_ZONE_DMA
max_zone_pfns[ZONE_DMA] = min((zone_dma_limit >> PAGE_SHIFT) + 1, max_low_pfn);
#endif
max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
#ifdef CONFIG_HIGHMEM
max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
#endif
}
/*
* paging_init() sets up the page tables - in fact we've already done this.
@@ -259,17 +263,6 @@ void __init paging_init(void)
zone_dma_limit = DMA_BIT_MASK(zone_dma_bits);
#ifdef CONFIG_ZONE_DMA
max_zone_pfns[ZONE_DMA] = min(max_low_pfn,
1UL << (zone_dma_bits - PAGE_SHIFT));
#endif
max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
#ifdef CONFIG_HIGHMEM
max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
#endif
free_area_init(max_zone_pfns);
mark_nonram_nosave();
}

View File

@@ -1213,8 +1213,6 @@ void __init initmem_init(void)
setup_node_data(nid, start_pfn, end_pfn);
}
sparse_init();
/*
* We need the numa_cpu_lookup_table to be accurate for all CPUs,
* even before we online them, so that we can use cpu_to_{node,mem}

View File

@@ -22,6 +22,7 @@
#include <linux/mm.h>
#include <linux/percpu.h>
#include <linux/hardirq.h>
#include <linux/page_table_check.h>
#include <linux/hugetlb.h>
#include <asm/tlbflush.h>
#include <asm/tlb.h>
@@ -206,6 +207,9 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
* and not hw_valid ptes. Hence there is no translation cache flush
* involved that need to be batched.
*/
page_table_check_ptes_set(mm, addr, ptep, pte, nr);
for (;;) {
/*
@@ -224,6 +228,14 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
}
}
void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
pte = set_pte_filter(pte, addr);
__set_pte_at(mm, addr, ptep, pte, 0);
}
void unmap_kernel_page(unsigned long va)
{
pmd_t *pmdp = pmd_off_k(va);

View File

@@ -93,6 +93,7 @@ config PPC_BOOK3S_64
select IRQ_WORK
select PPC_64S_HASH_MMU if !PPC_RADIX_MMU
select KASAN_VMALLOC if KASAN
select ARCH_HAS_LAZY_MMU_MODE
config PPC_BOOK3E_64
bool "Embedded processors"

View File

@@ -120,7 +120,7 @@ config PPC_SMLPAR
config CMM
tristate "Collaborative memory management"
depends on PPC_SMLPAR
select MEMORY_BALLOON
select BALLOON
default y
help
Select this option, if you want to enable the kernel interface

View File

@@ -19,7 +19,7 @@
#include <linux/stringify.h>
#include <linux/swap.h>
#include <linux/device.h>
#include <linux/balloon_compaction.h>
#include <linux/balloon.h>
#include <asm/firmware.h>
#include <asm/hvcall.h>
#include <asm/mmu.h>
@@ -165,7 +165,6 @@ static long cmm_alloc_pages(long nr)
balloon_page_enqueue(&b_dev_info, page);
atomic_long_inc(&loaned_pages);
adjust_managed_page_count(page, -1);
nr--;
}
@@ -190,7 +189,6 @@ static long cmm_free_pages(long nr)
if (!page)
break;
plpar_page_set_active(page);
adjust_managed_page_count(page, 1);
__free_page(page);
atomic_long_dec(&loaned_pages);
nr--;
@@ -496,13 +494,11 @@ static struct notifier_block cmm_mem_nb = {
.priority = CMM_MEM_HOTPLUG_PRI
};
#ifdef CONFIG_BALLOON_COMPACTION
#ifdef CONFIG_BALLOON_MIGRATION
static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
struct page *newpage, struct page *page,
enum migrate_mode mode)
{
unsigned long flags;
/*
* loan/"inflate" the newpage first.
*
@@ -517,47 +513,17 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
return -EBUSY;
}
/* balloon page list reference */
get_page(newpage);
/*
* When we migrate a page to a different zone, we have to fixup the
* count of both involved zones as we adjusted the managed page count
* when inflating.
*/
if (page_zone(page) != page_zone(newpage)) {
adjust_managed_page_count(page, 1);
adjust_managed_page_count(newpage, -1);
}
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
balloon_page_insert(b_dev_info, newpage);
__count_vm_event(BALLOON_MIGRATE);
b_dev_info->isolated_pages--;
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
/*
* activate/"deflate" the old page. We ignore any errors just like the
* other callers.
*/
plpar_page_set_active(page);
balloon_page_finalize(page);
/* balloon page list reference */
put_page(page);
return 0;
}
static void cmm_balloon_compaction_init(void)
{
b_dev_info.migratepage = cmm_migratepage;
}
#else /* CONFIG_BALLOON_COMPACTION */
static void cmm_balloon_compaction_init(void)
{
}
#endif /* CONFIG_BALLOON_COMPACTION */
#else /* CONFIG_BALLOON_MIGRATION */
int cmm_migratepage(struct balloon_dev_info *b_dev_info, struct page *newpage,
struct page *page, enum migrate_mode mode);
#endif /* CONFIG_BALLOON_MIGRATION */
/**
* cmm_init - Module initialization
@@ -573,11 +539,13 @@ static int cmm_init(void)
return -EOPNOTSUPP;
balloon_devinfo_init(&b_dev_info);
cmm_balloon_compaction_init();
b_dev_info.adjust_managed_page_count = true;
if (IS_ENABLED(CONFIG_BALLOON_MIGRATION))
b_dev_info.migratepage = cmm_migratepage;
rc = register_oom_notifier(&cmm_oom_nb);
if (rc < 0)
goto out_balloon_compaction;
return rc;
if ((rc = register_reboot_notifier(&cmm_reboot_nb)))
goto out_oom_notifier;
@@ -606,7 +574,6 @@ out_reboot_notifier:
unregister_reboot_notifier(&cmm_reboot_nb);
out_oom_notifier:
unregister_oom_notifier(&cmm_oom_nb);
out_balloon_compaction:
return rc;
}