Summary of significant series in this pull request:

- The 3 patch series "mm, swap: improve cluster scan strategy" from
   Kairui Song improves performance and reduces the failure rate of swap
   cluster allocation.
 
 - The 4 patch series "support large align and nid in Rust allocators"
   from Vitaly Wool permits Rust allocators to set NUMA node and large
   alignment when performing slub and vmalloc reallocs.
 
 - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
   Yueyang Pan extends DAMOS_STAT's handling of the DAMON operations sets
   for virtual address spaces for ops-level DAMOS filters.
 
 - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
   from Suren Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps.
 
 - The 2 patch series "mm/mincore: minor clean up for swap cache
   checking" from Kairui Song performs some cleanup in the swap code.
 
 - The 11 patch series "mm: vm_normal_page*() improvements" from David
   Hildenbrand provides code cleanup in the pagemap code.
 
 - The 5 patch series "add persistent huge zero folio support" from
   Pankaj Raghav provides a block layer speedup by optionally making the
   huge_zero_page persistent, instead of releasing it when its refcount
   falls to zero.
 
 - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
   few touchups to the recently added Kexec Handover feature.
 
 - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
   arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap, to
   end the constant struggle with space shortage on 32-bit conflicting with
   64-bit's needs.
 
 - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
   cleans up some swap code.
 
 - The 7 patch series "selftests/mm: Fix false positives and skip
   unsupported tests" from Donet Tom fixes a few things in our selftests
   code.
 
 - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
   THPs when advised" from David Hildenbrand "allows individual processes
   to opt-out of THP=always into THP=madvise, without affecting other
   workloads on the system".
 
   It's a long story - the [1/N] changelog spells out the considerations.
 
 - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
   gets us started on the memdesc project.  Please see
   https://kernelnewbies.org/MatthewWilcox/Memdescs and
   https://blogs.oracle.com/linux/post/introducing-memdesc.
 
 - The 3 patch series "Tiny optimization for large read operations" from
   Chi Zhiling improves the efficiency of the pagecache read path.
 
 - The 5 patch series "Better split_huge_page_test result check" from Zi
   Yan improves our folio splitting selftest code.
 
 - The 2 patch series "test that rmap behaves as expected" from Wei Yang
   adds some rmap selftests.
 
 - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
   removes that function and converts its two remaining callers.
 
 - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
   fixes some UFFD selftests issues.
 
 - The 3 patch series "introduce kernel file mapped folios" from Boris
   Burkov introduces the concept of "kernel file pages".  Using these
   permits btrfs to account its metadata pages to the root cgroup, rather
   than to the cgroups of random inappropriate tasks.
 
 - The 2 patch series "mm/pageblock: improve readability of some
   pageblock handling" from Wei Yang provides some readability improvements
   to the page allocator code.
 
 - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
   Park teaches DAMON to understand arm32 highmem.
 
 - The 4 patch series "tools: testing: Use existing atomic.h for
   vma/maple tests" from Brendan Jackman performs some code cleanups and
   deduplication under tools/testing/.
 
 - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
   Liam Howlett fixes a couple of 32-bit issues in
   tools/testing/radix-tree.c.
 
 - The 2 patch series "kasan: unify kasan_enabled() and remove
   arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
   arch-specific initialization code into a common arch-neutral
   implementation.
 
 - The 3 patch series "mm: remove zpool" from Johannes Weiner removes
   zpool - an indirection layer which now only redirects to a single thing
   (zsmalloc).
 
 - The 2 patch series "mm: task_stack: Stack handling cleanups" from
   Pasha Tatashin makes a couple of cleanups in the fork code.
 
 - The 37 patch series "mm: remove nth_page()" from David Hildenbrand
   makes rather a lot of adjustments at various nth_page() callsites,
   eventually permitting the removal of that undesirable helper function.
 
 - The 2 patch series "introduce kasan.write_only option in hw-tags" from
   Yeoreum Yun creates a KASAN write-only mode for ARM, using that
   architecture's memory tagging feature.  It is felt that a write-only mode
   KASAN is suitable for use in production systems rather than debug-only.
 
 - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
   from Kefeng Wang does some tidying in the hugetlb folio allocation code.
 
 - The 12 patch series "mm: establish const-correctness for pointer
   parameters" from Max Kellermann makes quite a number of the MM API
   functions more accurate about the constness of their arguments.  This
   was getting in the way of subsystems (in this case CEPH) when they
   attempt to improve their own const/non-const accuracy.
 
 - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
   fixes a number of code sites which were confused over when to use
   free_pages() vs __free_pages().
 
 - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
   Ryhl makes the mapletree code accessible to Rust.  Required by nouveau
   and by its forthcoming successor: the new Rust Nova driver.
 
 - The 2 patch series "selftests/mm: split_huge_page_test:
   split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
   some cleanups to the thp selftesting code.
 
 - The 14 patch series "mm, swap: introduce swap table as swap cache
   (phase I)" from Chris Li and Kairui Song is the first step along the
   path to implementing "swap tables" - a new approach to swap allocation
   and state tracking which is expected to yield speed and space
   improvements.  This patchset itself yields a 5-20% performance benefit
   in some situations.
 
 - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
   the new memdesc layer to clean up the ptdesc code a little.
 
 - The 3 patch series "Fix va_high_addr_switch.sh test failure" from
   Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
 
 - The 2 patch series "Minor fixes for memory allocation profiling" from
   Suren Baghdasaryan addresses a couple of minor issues in the relatively
   new memory allocation profiling feature.
 
 - The 3 patch series "Small cleanups" from Matthew Wilcox has a few
   cleanups in preparation for more memdesc work.
 
 - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
   DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
   furtherance of supporting arm highmem.
 
 - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
   warnings" from Muhammad Anjum adds that compiler check to selftests code
   and fixes the fallout, by removing dead code.
 
 - The 10 patch series "Improvements to Victim Process Thawing and OOM
   Reaper Traversal Order" from zhongjinji makes a number of improvements
   in the OOM killer: mainly thawing a more appropriate group of victim
   threads so they can release resources.
 
 - The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
   from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
 
 - The 7 patch series "mm/damon: define and use DAMON initialization
   check function" from SeongJae Park implements reliability and
   maintainability improvements to a recently-added bug fix.
 
 - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
   non-idle ages" from SeongJae Park provides additional transparency to
   userspace clients of the DAMON_STAT information.
 
 - The 2 patch series "Expand scope of khugepaged anonymous collapse"
   from Dev Jain removes some constraints on khugepaged's collapsing of
   anon VMAs.  It also increases the success rate of MADV_COLLAPSE against
   an anon vma.
 
 - The 2 patch series "mm: do not assume file == vma->vm_file in
   compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
   removal of file_operations.mmap().  This patchset concentrates upon
   clearing up the treatment of stacked filesystems.
 
 - The 6 patch series "mm: Improve mlock tracking for large folios" from
   Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
   of large folios.  /proc/meminfo's "Mlocked" field became more accurate.
 
 - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
   during fork" from Donet Tom fixes several user-visible KSM stats
   inaccuracies across forks and adds selftest code to verify these
   counters.
 
 - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
   Yang addresses some potential but presently benign issues in KSM's
   mm_slot handling.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
 jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
 =4mhY
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "mm, swap: improve cluster scan strategy" from Kairui Song improves
   performance and reduces the failure rate of swap cluster allocation

 - "support large align and nid in Rust allocators" from Vitaly Wool
   permits Rust allocators to set NUMA node and large alignment when
   performing slub and vmalloc reallocs

 - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extends
   DAMOS_STAT's handling of the DAMON operations sets for virtual
   address spaces for ops-level DAMOS filters

 - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
   Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps

 - "mm/mincore: minor clean up for swap cache checking" from Kairui Song
   performs some cleanup in the swap code

 - "mm: vm_normal_page*() improvements" from David Hildenbrand provides
   code cleanup in the pagemap code

 - "add persistent huge zero folio support" from Pankaj Raghav provides
   a block layer speedup by optionally making the
   huge_zero_page persistent, instead of releasing it when its refcount
   falls to zero

 - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
   the recently added Kexec Handover feature

 - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
   Stoakes turns mm_struct.flags into a bitmap, to end the constant
   struggle with space shortage on 32-bit conflicting with 64-bit's
   needs

 - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
   code

 - "selftests/mm: Fix false positives and skip unsupported tests" from
   Donet Tom fixes a few things in our selftests code

 - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
   from David Hildenbrand "allows individual processes to opt-out of
   THP=always into THP=madvise, without affecting other workloads on the
   system".

   It's a long story - the [1/N] changelog spells out the considerations

 - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
   the memdesc project. Please see

      https://kernelnewbies.org/MatthewWilcox/Memdescs and
      https://blogs.oracle.com/linux/post/introducing-memdesc

 - "Tiny optimization for large read operations" from Chi Zhiling
   improves the efficiency of the pagecache read path

 - "Better split_huge_page_test result check" from Zi Yan improves our
   folio splitting selftest code

 - "test that rmap behaves as expected" from Wei Yang adds some rmap
   selftests

 - "remove write_cache_pages()" from Christoph Hellwig removes that
   function and converts its two remaining callers

 - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
   selftests issues

 - "introduce kernel file mapped folios" from Boris Burkov introduces
   the concept of "kernel file pages". Using these permits btrfs to
   account its metadata pages to the root cgroup, rather than to the
   cgroups of random inappropriate tasks

 - "mm/pageblock: improve readability of some pageblock handling" from
   Wei Yang provides some readability improvements to the page allocator
   code

 - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
   to understand arm32 highmem

 - "tools: testing: Use existing atomic.h for vma/maple tests" from
   Brendan Jackman performs some code cleanups and deduplication under
   tools/testing/

 - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
   a couple of 32-bit issues in tools/testing/radix-tree.c

 - "kasan: unify kasan_enabled() and remove arch-specific
   implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
   initialization code into a common arch-neutral implementation

 - "mm: remove zpool" from Johannes Weiner removes zpool - an
   indirection layer which now only redirects to a single thing
   (zsmalloc)

 - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
   couple of cleanups in the fork code

 - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
   adjustments at various nth_page() callsites, eventually permitting
   the removal of that undesirable helper function

 - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
   creates a KASAN write-only mode for ARM, using that architecture's
   memory tagging feature. It is felt that a write-only mode KASAN is
   suitable for use in production systems rather than debug-only

 - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
   some tidying in the hugetlb folio allocation code

 - "mm: establish const-correctness for pointer parameters" from Max
   Kellermann makes quite a number of the MM API functions more accurate
   about the constness of their arguments. This was getting in the way
   of subsystems (in this case CEPH) when they attempt to improve
   their own const/non-const accuracy

 - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
   code sites which were confused over when to use free_pages() vs
   __free_pages()

 - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
   mapletree code accessible to Rust. Required by nouveau and by its
   forthcoming successor: the new Rust Nova driver

 - "selftests/mm: split_huge_page_test: split_pte_mapped_thp
   improvements" from David Hildenbrand adds a fix and some cleanups to
   the thp selftesting code

 - "mm, swap: introduce swap table as swap cache (phase I)" from Chris
   Li and Kairui Song is the first step along the path to implementing
   "swap tables" - a new approach to swap allocation and state tracking
   which is expected to yield speed and space improvements. This
   patchset itself yields a 5-20% performance benefit in some situations

 - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
   layer to clean up the ptdesc code a little

 - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
   issues in our 5-level pagetable selftesting code

 - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
   addresses a couple of minor issues in the relatively new memory
   allocation profiling feature

 - "Small cleanups" from Matthew Wilcox has a few cleanups in
   preparation for more memdesc work

 - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
   Quanmin Yan makes some changes to DAMON in furtherance of supporting
   arm highmem

 - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
   Anjum adds that compiler check to selftests code and fixes the
   fallout, by removing dead code

 - "Improvements to Victim Process Thawing and OOM Reaper Traversal
   Order" from zhongjinji makes a number of improvements in the OOM
   killer: mainly thawing a more appropriate group of victim threads so
   they can release resources

 - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
   is a bunch of small and unrelated fixups for DAMON

 - "mm/damon: define and use DAMON initialization check function" from
   SeongJae Park implements reliability and maintainability improvements
   to a recently-added bug fix

 - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
   SeongJae Park provides additional transparency to userspace clients
   of the DAMON_STAT information

 - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
   some constraints on khugepaged's collapsing of anon VMAs. It also
   increases the success rate of MADV_COLLAPSE against an anon vma

 - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
   from Lorenzo Stoakes moves us further towards removal of
   file_operations.mmap(). This patchset concentrates upon clearing up
   the treatment of stacked filesystems

 - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
   provides some fixes and improvements to mlock's tracking of large
   folios. /proc/meminfo's "Mlocked" field became more accurate

 - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
   Donet Tom fixes several user-visible KSM stats inaccuracies across
   forks and adds selftest code to verify these counters

 - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
   some potential but presently benign issues in KSM's mm_slot handling

* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
  mm: swap: check for stable address space before operating on the VMA
  mm: convert folio_page() back to a macro
  mm/khugepaged: use start_addr/addr for improved readability
  hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
  alloc_tag: fix boot failure due to NULL pointer dereference
  mm: silence data-race in update_hiwater_rss
  mm/memory-failure: don't select MEMORY_ISOLATION
  mm/khugepaged: remove definition of struct khugepaged_mm_slot
  mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
  hugetlb: increase number of reserving hugepages via cmdline
  selftests/mm: add fork inheritance test for ksm_merging_pages counter
  mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
  drivers/base/node: fix double free in register_one_node()
  mm: remove PMD alignment constraint in execmem_vmalloc()
  mm/memory_hotplug: fix typo 'esecially' -> 'especially'
  mm/rmap: improve mlock tracking for large folios
  mm/filemap: map entire large folio faultaround
  mm/fault: try to map the entire file folio in finish_fault()
  mm/rmap: mlock large folios in try_to_unmap_one()
  mm/rmap: fix a mlock race condition in folio_referenced_one()
  ...
Linus Torvalds 2025-10-02 18:18:33 -07:00
commit 8804d970fa
381 changed files with 9091 additions and 5069 deletions


@ -77,6 +77,13 @@ Description: Writing a keyword for a monitoring operations set ('vaddr' for
Note that only the operations sets that listed in
'avail_operations' file are valid inputs.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/addr_unit
Date: Aug 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing an integer to this file sets the 'address unit'
parameter of the given operations set of the context. Reading
the file returns the last-written 'address unit' value.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/sample_us
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>


@ -175,4 +175,4 @@ Below command makes every memory region of size >=4K that has not accessed for
$ sudo damo start --damos_access_rate 0 0 --damos_sz_region 4K max \
--damos_age 60s max --damos_action pageout \
<pid of your workload>
--target_pid <pid of your workload>


@ -61,7 +61,7 @@ comma (",").
:ref:`kdamonds <sysfs_kdamonds>`/nr_kdamonds
│ │ :ref:`0 <sysfs_kdamond>`/state,pid,refresh_ms
│ │ │ :ref:`contexts <sysfs_contexts>`/nr_contexts
│ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations
│ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations,addr_unit
│ │ │ │ │ :ref:`monitoring_attrs <sysfs_monitoring_attrs>`/
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
│ │ │ │ │ │ │ intervals_goal/access_bp,aggrs,min_sample_us,max_sample_us
@ -188,9 +188,9 @@ details). At the moment, only one context per kdamond is supported, so only
contexts/<N>/
-------------
In each context directory, two files (``avail_operations`` and ``operations``)
and three directories (``monitoring_attrs``, ``targets``, and ``schemes``)
exist.
In each context directory, three files (``avail_operations``, ``operations``
and ``addr_unit``) and three directories (``monitoring_attrs``, ``targets``,
and ``schemes``) exist.
DAMON supports multiple types of :ref:`monitoring operations
<damon_design_configurable_operations_set>`, including those for virtual address
@ -205,6 +205,9 @@ You can set and get what type of monitoring operations DAMON will use for the
context by writing one of the keywords listed in ``avail_operations`` file and
reading from the ``operations`` file.
``addr_unit`` file is for setting and getting the :ref:`address unit
<damon_design_addr_unit>` parameter of the operations set.
.. _sysfs_monitoring_attrs:
contexts/<N>/monitoring_attrs/


@ -225,6 +225,42 @@ to "always" or "madvise"), and it'll be automatically shutdown when
PMD-sized THP is disabled (when both the per-size anon control and the
top-level control are "never")
process THP controls
--------------------
A process can control its own THP behaviour using the ``PR_SET_THP_DISABLE``
and ``PR_GET_THP_DISABLE`` pair of prctl(2) calls. The THP behaviour set using
``PR_SET_THP_DISABLE`` is inherited across fork(2) and execve(2). These calls
support the following arguments::
prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0):
This will disable THPs completely for the process, irrespective
of global THP controls or madvise(..., MADV_COLLAPSE) being used.
prctl(PR_SET_THP_DISABLE, 1, PR_THP_DISABLE_EXCEPT_ADVISED, 0, 0):
This will disable THPs for the process except when the usage of THPs is
advised. Consequently, THPs will only be used when:
- Global THP controls are set to "always" or "madvise" and
madvise(..., MADV_HUGEPAGE) or madvise(..., MADV_COLLAPSE) is used.
- Global THP controls are set to "never" and madvise(..., MADV_COLLAPSE)
is used. This is the same behavior as if THPs would not be disabled on
a process level.
Note that MADV_COLLAPSE is currently always rejected if
madvise(..., MADV_NOHUGEPAGE) is set on an area.
prctl(PR_SET_THP_DISABLE, 0, 0, 0, 0):
This will re-enable THPs for the process, as if they were never disabled.
Whether THPs will actually be used depends on global THP controls and
madvise() calls.
prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0):
This returns a value whose bits indicate how THP-disable is configured:
Bits
1 0 Value Description
|0|0| 0 No THP-disable behaviour specified.
|0|1| 1 THP is entirely disabled for this process.
|1|1| 3 THP-except-advised mode is set for this process.
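For example, a userspace program could select the advise-only mode roughly as
follows (illustrative sketch only; take the real value of
``PR_THP_DISABLE_EXCEPT_ADVISED`` from the kernel's UAPI ``<linux/prctl.h>``
rather than the assumed definition below)::

    #include <stdio.h>
    #include <sys/prctl.h>

    #ifndef PR_THP_DISABLE_EXCEPT_ADVISED
    #define PR_THP_DISABLE_EXCEPT_ADVISED (1 << 1) /* assumed value for illustration */
    #endif

    int main(void)
    {
            /* Disable THPs for this process except where madvise() asks for them. */
            if (prctl(PR_SET_THP_DISABLE, 1, PR_THP_DISABLE_EXCEPT_ADVISED, 0, 0))
                    perror("PR_SET_THP_DISABLE");

            /* Expected readback is 0, 1 or 3, matching the table above. */
            printf("THP disable state: %d\n", prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0));
            return 0;
    }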
Khugepaged controls
-------------------
@ -383,6 +419,8 @@ option: ``huge=``. It can have following values:
always
Attempt to allocate huge pages every time we need a new page;
Always try PMD-sized huge pages first, and fall back to smaller-sized
huge pages if the PMD-sized huge page allocation fails;
never
Do not allocate huge pages. Note that ``madvise(..., MADV_COLLAPSE)``
@ -390,7 +428,9 @@ never
is specified everywhere;
within_size
Only allocate huge page if it will be fully within i_size.
Only allocate huge page if it will be fully within i_size;
Always try PMD-sized huge pages first, and fall back to smaller-sized
huge pages if the PMD-sized huge page allocation fails;
Also respect madvise() hints;
advise


@ -53,26 +53,17 @@ Zswap receives pages for compression from the swap subsystem and is able to
evict pages from its own compressed pool on an LRU basis and write them back to
the backing swap device in the case that the compressed pool is full.
Zswap makes use of zpool for the managing the compressed memory pool. Each
allocation in zpool is not directly accessible by address. Rather, a handle is
Zswap makes use of zsmalloc for the managing the compressed memory pool. Each
allocation in zsmalloc is not directly accessible by address. Rather, a handle is
returned by the allocation routine and that handle must be mapped before being
accessed. The compressed memory pool grows on demand and shrinks as compressed
pages are freed. The pool is not preallocated. By default, a zpool
of type selected in ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created,
but it can be overridden at boot time by setting the ``zpool`` attribute,
e.g. ``zswap.zpool=zsmalloc``. It can also be changed at runtime using the sysfs
``zpool`` attribute, e.g.::
echo zsmalloc > /sys/module/zswap/parameters/zpool
The zsmalloc type zpool has a complex compressed page storage method, and it
can achieve great storage densities.
pages are freed. The pool is not preallocated.
When a swap page is passed from swapout to zswap, zswap maintains a mapping
of the swap entry, a combination of the swap type and swap offset, to the zpool
handle that references that compressed swap page. This mapping is achieved
with a red-black tree per swap type. The swap offset is the search key for the
tree nodes.
of the swap entry, a combination of the swap type and swap offset, to the
zsmalloc handle that references that compressed swap page. This mapping is
achieved with a red-black tree per swap type. The swap offset is the search
key for the tree nodes.
During a page fault on a PTE that is a swap entry, the swapin code calls the
zswap load function to decompress the page into the page allocated by the page
@ -96,11 +87,11 @@ attribute, e.g.::
echo lzo > /sys/module/zswap/parameters/compressor
When the zpool and/or compressor parameter is changed at runtime, any existing
compressed pages are not modified; they are left in their own zpool. When a
request is made for a page in an old zpool, it is uncompressed using its
original compressor. Once all pages are removed from an old zpool, the zpool
and its compressor are freed.
When the compressor parameter is changed at runtime, any existing compressed
pages are not modified; they are left in their own pool. When a request is
made for a page in an old pool, it is uncompressed using its original
compressor. Once all pages are removed from an old pool, the pool and its
compressor are freed.
Some of the pages in zswap are same-value filled pages (i.e. contents of the
page have same value or repetitive pattern). These pages include zero-filled


@ -118,7 +118,6 @@ More Memory Management Functions
.. kernel-doc:: mm/memremap.c
.. kernel-doc:: mm/hugetlb.c
.. kernel-doc:: mm/swap.c
.. kernel-doc:: mm/zpool.c
.. kernel-doc:: mm/memcontrol.c
.. #kernel-doc:: mm/memory-tiers.c (build warnings)
.. kernel-doc:: mm/shmem.c


@ -143,6 +143,9 @@ disabling KASAN altogether or controlling its features:
Asymmetric mode: a bad access is detected synchronously on reads and
asynchronously on writes.
- ``kasan.write_only=off`` or ``kasan.write_only=on`` controls whether KASAN
checks the write (store) accesses only or all accesses (default: ``off``).
- ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
allocations (default: ``on``).


@ -476,7 +476,6 @@ Use the following commands to enable zswap::
# echo 0 > /sys/module/zswap/parameters/enabled
# echo 50 > /sys/module/zswap/parameters/max_pool_percent
# echo deflate-iaa > /sys/module/zswap/parameters/compressor
# echo zsmalloc > /sys/module/zswap/parameters/zpool
# echo 1 > /sys/module/zswap/parameters/enabled
# echo 100 > /proc/sys/vm/swappiness
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
@ -625,7 +624,6 @@ the 'fixed' compression mode::
echo 0 > /sys/module/zswap/parameters/enabled
echo 50 > /sys/module/zswap/parameters/max_pool_percent
echo deflate-iaa > /sys/module/zswap/parameters/compressor
echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 1 > /sys/module/zswap/parameters/enabled
echo 100 > /proc/sys/vm/swappiness


@ -291,8 +291,9 @@ It's slow but very precise.
HugetlbPages size of hugetlb memory portions
CoreDumping process's memory is currently being dumped
(killing the process may lead to a corrupted core)
THP_enabled process is allowed to use THP (returns 0 when
PR_SET_THP_DISABLE is set on the process
THP_enabled process is allowed to use THP (returns 0 when
PR_SET_THP_DISABLE is set on the process to disable
THP completely, not just partially)
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
@ -1008,6 +1009,19 @@ number, module (if originates from a loadable module) and the function calling
the allocation. The number of bytes allocated and number of calls at each
location are reported. The first line indicates the version of the file, the
second line is the header listing fields in the file.
If file version is 2.0 or higher then each line may contain additional
<key>:<value> pairs representing extra information about the call site.
For example if the counters are not accurate, the line will be appended with
"accurate:no" pair.
Supported markers in v2:
accurate:no
Absolute values of the counters in this line are not accurate
because of the failure to allocate memory to track some of the
allocations made at this location. Deltas in these counters are
accurate, therefore counters can be used to track allocation size
and count changes.
Example output.


@ -67,7 +67,7 @@ processes, NUMA nodes, files, and backing memory devices would be supportable.
Also, if some architectures or devices support special optimized access check
features, those will be easily configurable.
DAMON currently provides below three operation sets. Below two subsections
DAMON currently provides below three operation sets. Below three subsections
describe how those work.
- vaddr: Monitor virtual address spaces of specific processes
@ -135,6 +135,20 @@ the interference is the responsibility of sysadmins. However, it solves the
conflict with the reclaim logic using ``PG_idle`` and ``PG_young`` page flags,
as Idle page tracking does.
.. _damon_design_addr_unit:
Address Unit
------------
DAMON core layer uses ``unsigned long`` type for monitoring target address
ranges. In some cases, the address space for a given operations set could be
too large to be handled with the type. ARM (32-bit) with large physical
address extension is an example. For such cases, a per-operations set
parameter called ``address unit`` is provided. It is the scale factor that
the core layer's address is multiplied by to calculate the real address on
the given address space. Support of the ``address unit`` parameter is
up to each operations set implementation. ``paddr`` is the only operations set
implementation that supports the parameter.
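As an illustrative sketch (the function name here is invented, not a DAMON
internal), the translation described above is a single widening
multiplication::

    #include <stdint.h>

    /* Convert a core-layer address to a real address on the target address
     * space.  On 32-bit ARM with LPAE, 'unsigned long' is 32 bits, so the
     * value must be widened before multiplying by the address unit. */
    static inline uint64_t damon_core_to_real_addr(unsigned long core_addr,
                                                   unsigned long addr_unit)
    {
            return (uint64_t)core_addr * (uint64_t)addr_unit;
    }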
.. _damon_core_logic:
@ -689,7 +703,7 @@ DAMOS accounts below statistics for each scheme, from the beginning of the
scheme's execution.
- ``nr_tried``: Total number of regions that the scheme is tried to be applied.
- ``sz_trtied``: Total size of regions that the scheme is tried to be applied.
- ``sz_tried``: Total size of regions that the scheme is tried to be applied.
- ``sz_ops_filter_passed``: Total bytes that passed operations set
layer-handled DAMOS filters.
- ``nr_applied``: Total number of regions that the scheme is applied.


@ -89,18 +89,13 @@ the maintainer.
Community meetup
----------------
DAMON community is maintaining two bi-weekly meetup series for community
members who prefer synchronous conversations over mails.
DAMON community has a bi-weekly meetup series for members who prefer
synchronous conversations over mails. It is for discussions on specific topics
between a group of members including the maintainer. The maintainer shares the
available time slots, and attendees should reserve one of those at least 24
hours before the time slot, by reaching out to the maintainer.
The first one is for any discussion between every community member. No
reservation is needed.
The seconds one is for discussions on specific topics between restricted
members including the maintainer. The maintainer shares the available time
slots, and attendees should reserve one of those at least 24 hours before the
time slot, by reaching out to the maintainer.
Schedules and available reservation time slots are available at the Google `doc
Schedules and reservation status are available at the Google `doc
<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_.
There is also a public Google `calendar
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_


@ -20,6 +20,7 @@ see the :doc:`admin guide <../admin-guide/mm/index>`.
highmem
page_reclaim
swap
swap-table
page_cache
shmfs
oom


@ -0,0 +1,69 @@
.. SPDX-License-Identifier: GPL-2.0
:Author: Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>
==========
Swap Table
==========
Swap table implements swap cache as a per-cluster swap cache value array.
Swap Entry
----------
A swap entry contains the information required to serve the anonymous page
fault.
Swap entry is encoded as two parts: swap type and swap offset.
The swap type indicates which swap device to use.
The swap offset is the offset of the swap file to read the page data from.
Swap Cache
----------
Swap cache is a map to look up folios using swap entry as the key. The result
value can have three possible types depending on which stage this swap entry
is in.
1. NULL: This swap entry is not used.
2. folio: A folio has been allocated and bound to this swap entry. This is
the transient state of swap out or swap in. The folio data can be in
the folio or swap file, or both.
3. shadow: The shadow contains the working set information of the swapped
out folio. This is the normal state for a swapped out page.
Swap Table Internals
--------------------
The previous swap cache is implemented by XArray. The XArray is a tree
structure. Each lookup will go through multiple nodes. Can we do better?
Notice that most of the time when we look up the swap cache, we are either
in a swap in or swap out path. We should already have the swap cluster,
which contains the swap entry.
If we have a per-cluster array to store swap cache values in the cluster,
swap cache lookup within the cluster can be a very simple array lookup.
We give such a per-cluster swap cache value array a name: the swap table.
A swap table is an array of pointers. Each pointer is the same size as a
PTE. The size of a swap table for one swap cluster typically matches a PTE
page table, which is one page on modern 64-bit systems.
With the swap table, swap cache lookup achieves better locality and is
simpler and faster.
Locking
-------
Swap table modification requires taking the cluster lock. If a folio
is being added to or removed from the swap table, the folio must be
locked prior to the cluster lock. After adding or removing is done, the
folio shall be unlocked.
Swap table lookup is protected by RCU and atomic read. If the lookup
returns a folio, the user must lock the folio before use.
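The core idea can be sketched in a few lines of C (illustrative only; the type
and constant names below are invented and are not the kernel's actual
definitions)::

    #define SWAP_CLUSTER_ENTRIES 512    /* assumed: one PTE page worth of slots */

    struct swap_cluster_table {
            /* Each slot is NULL, a folio pointer, or a shadow value. */
            void *slots[SWAP_CLUSTER_ENTRIES];
    };

    /* With the cluster already in hand, a swap cache lookup is a plain array
     * index instead of an XArray tree walk. */
    static void *swap_table_lookup(const struct swap_cluster_table *table,
                                   unsigned int offset_in_cluster)
    {
            return table->slots[offset_in_cluster % SWAP_CLUSTER_ENTRIES];
    }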


@ -6795,7 +6795,7 @@ S: Maintained
W: https://docs.dasharo.com/
F: drivers/platform/x86/dasharo-acpi.c
DATA ACCESS MONITOR
DAMON
M: SeongJae Park <sj@kernel.org>
L: damon@lists.linux.dev
L: linux-mm@kvack.org
@ -14852,6 +14852,8 @@ F: net/mctp/
MAPLE TREE
M: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Alice Ryhl <aliceryhl@google.com>
R: Andrew Ballance <andrewjballance@gmail.com>
L: maple-tree@lists.infradead.org
L: linux-mm@kvack.org
S: Supported
@ -14860,6 +14862,8 @@ F: include/linux/maple_tree.h
F: include/trace/events/maple_tree.h
F: lib/maple_tree.c
F: lib/test_maple_tree.c
F: rust/helpers/maple_tree.c
F: rust/kernel/maple_tree.rs
F: tools/testing/radix-tree/maple.c
F: tools/testing/shared/linux/maple_tree.h
@ -16394,6 +16398,7 @@ S: Maintained
F: include/linux/rmap.h
F: mm/page_vma_mapped.c
F: mm/rmap.c
F: tools/testing/selftests/mm/rmap.c
MEMORY MANAGEMENT - SECRETMEM
M: Andrew Morton <akpm@linux-foundation.org>
@ -16413,12 +16418,14 @@ R: Barry Song <baohua@kernel.org>
R: Chris Li <chrisl@kernel.org>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/mm/swap-table.rst
F: include/linux/swap.h
F: include/linux/swapfile.h
F: include/linux/swapops.h
F: mm/page_io.c
F: mm/swap.c
F: mm/swap.h
F: mm/swap_table.h
F: mm/swap_state.c
F: mm/swapfile.c
@ -28211,9 +28218,7 @@ R: Chengming Zhou <chengming.zhou@linux.dev>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/admin-guide/mm/zswap.rst
F: include/linux/zpool.h
F: include/linux/zswap.h
F: mm/zpool.c
F: mm/zswap.c
F: tools/testing/selftests/cgroup/test_zswap.c


@ -151,9 +151,6 @@
/* Helpers */
#define TO_KB(bytes) ((bytes) >> 10)
#define TO_MB(bytes) (TO_KB(bytes) >> 10)
#define PAGES_TO_KB(n_pages) ((n_pages) << (PAGE_SHIFT - 10))
#define PAGES_TO_MB(n_pages) (PAGES_TO_KB(n_pages) >> 10)
/*
***************************************************************

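Many of the architecture diffs below convert &folio->flags to &folio->flags.f.
That is the "Add and use memdesc_flags_t" series in action: the raw flags word
moves behind a one-member wrapper struct, roughly (a sketch based on the usage
visible in the hunks, not necessarily the exact upstream definition):

    typedef struct {
            unsigned long f;    /* the raw page/folio flags word */
    } memdesc_flags_t;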

@ -704,7 +704,7 @@ static inline void arc_slc_enable(void)
void flush_dcache_folio(struct folio *folio)
{
clear_bit(PG_dc_clean, &folio->flags);
clear_bit(PG_dc_clean, &folio->flags.f);
return;
}
EXPORT_SYMBOL(flush_dcache_folio);
@ -889,8 +889,8 @@ void copy_user_highpage(struct page *to, struct page *from,
copy_page(kto, kfrom);
clear_bit(PG_dc_clean, &dst->flags);
clear_bit(PG_dc_clean, &src->flags);
clear_bit(PG_dc_clean, &dst->flags.f);
clear_bit(PG_dc_clean, &src->flags.f);
kunmap_atomic(kto);
kunmap_atomic(kfrom);
@ -900,7 +900,7 @@ void clear_user_page(void *to, unsigned long u_vaddr, struct page *page)
{
struct folio *folio = page_folio(page);
clear_page(to);
clear_bit(PG_dc_clean, &folio->flags);
clear_bit(PG_dc_clean, &folio->flags.f);
}
EXPORT_SYMBOL(clear_user_page);


@ -488,7 +488,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
*/
if (vma->vm_flags & VM_EXEC) {
struct folio *folio = page_folio(page);
int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags);
int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags.f);
if (dirty) {
unsigned long offset = offset_in_folio(folio, paddr);
nr = folio_nr_pages(folio);


@ -46,9 +46,9 @@ extern pte_t *pkmap_page_table;
#endif
#ifdef ARCH_NEEDS_KMAP_HIGH_GET
extern void *kmap_high_get(struct page *page);
extern void *kmap_high_get(const struct page *page);
static inline void *arch_kmap_local_high_get(struct page *page)
static inline void *arch_kmap_local_high_get(const struct page *page)
{
if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !cache_is_vivt())
return NULL;
@ -57,7 +57,7 @@ static inline void *arch_kmap_local_high_get(struct page *page)
#define arch_kmap_local_high_get arch_kmap_local_high_get
#else /* ARCH_NEEDS_KMAP_HIGH_GET */
static inline void *kmap_high_get(struct page *page)
static inline void *kmap_high_get(const struct page *page)
{
return NULL;
}

View File

@ -17,7 +17,7 @@
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
}
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags


@ -67,7 +67,7 @@ void v4_mc_copy_user_highpage(struct page *to, struct page *from,
struct folio *src = page_folio(from);
void *kto = kmap_atomic(to);
if (!test_and_set_bit(PG_dcache_clean, &src->flags))
if (!test_and_set_bit(PG_dcache_clean, &src->flags.f))
__flush_dcache_folio(folio_flush_mapping(src), src);
raw_spin_lock(&minicache_lock);


@ -73,7 +73,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
if (!test_and_set_bit(PG_dcache_clean, &src->flags))
if (!test_and_set_bit(PG_dcache_clean, &src->flags.f))
__flush_dcache_folio(folio_flush_mapping(src), src);
/* FIXME: not highmem safe */


@ -87,7 +87,7 @@ void xscale_mc_copy_user_highpage(struct page *to, struct page *from,
struct folio *src = page_folio(from);
void *kto = kmap_atomic(to);
if (!test_and_set_bit(PG_dcache_clean, &src->flags))
if (!test_and_set_bit(PG_dcache_clean, &src->flags.f))
__flush_dcache_folio(folio_flush_mapping(src), src);
raw_spin_lock(&minicache_lock);


@ -718,7 +718,7 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
if (size < sz)
break;
if (!offset)
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
offset = 0;
size -= sz;
if (!size)


@ -203,7 +203,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
folio = page_folio(pfn_to_page(pfn));
mapping = folio_flush_mapping(folio);
if (!test_and_set_bit(PG_dcache_clean, &folio->flags))
if (!test_and_set_bit(PG_dcache_clean, &folio->flags.f))
__flush_dcache_folio(mapping, folio);
if (mapping) {
if (cache_is_vivt())


@ -304,7 +304,7 @@ void __sync_icache_dcache(pte_t pteval)
else
mapping = NULL;
if (!test_and_set_bit(PG_dcache_clean, &folio->flags))
if (!test_and_set_bit(PG_dcache_clean, &folio->flags.f))
__flush_dcache_folio(mapping, folio);
if (pte_exec(pteval))
@ -343,8 +343,8 @@ void flush_dcache_folio(struct folio *folio)
return;
if (!cache_ops_need_broadcast() && cache_is_vipt_nonaliasing()) {
if (test_bit(PG_dcache_clean, &folio->flags))
clear_bit(PG_dcache_clean, &folio->flags);
if (test_bit(PG_dcache_clean, &folio->flags.f))
clear_bit(PG_dcache_clean, &folio->flags.f);
return;
}
@ -352,14 +352,14 @@ void flush_dcache_folio(struct folio *folio)
if (!cache_ops_need_broadcast() &&
mapping && !folio_mapped(folio))
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
else {
__flush_dcache_folio(mapping, folio);
if (mapping && cache_is_vivt())
__flush_dcache_aliases(mapping, folio);
else if (mapping)
__flush_icache_all();
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
EXPORT_SYMBOL(flush_dcache_folio);


@ -300,6 +300,6 @@ void __init kasan_init(void)
local_flush_tlb_all();
memset(kasan_early_shadow_page, 0, PAGE_SIZE);
pr_info("Kernel address sanitizer initialized\n");
init_task.kasan_depth = 0;
kasan_init_generic();
}


@ -737,7 +737,7 @@ static void *__init late_alloc(unsigned long sz)
if (!ptdesc || !pagetable_pte_ctor(NULL, ptdesc))
BUG();
return ptdesc_to_virt(ptdesc);
return ptdesc_address(ptdesc);
}
static pte_t * __init arm_pte_alloc(pmd_t *pmd, unsigned long addr,


@ -1549,7 +1549,6 @@ source "kernel/Kconfig.hz"
config ARCH_SPARSEMEM_ENABLE
def_bool y
select SPARSEMEM_VMEMMAP_ENABLE
select SPARSEMEM_VMEMMAP
config HW_PERF_EVENTS
def_bool y


@ -21,12 +21,12 @@ extern bool arch_hugetlb_migration_supported(struct hstate *h);
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
#ifdef CONFIG_ARM64_MTE
if (system_supports_mte()) {
clear_bit(PG_mte_tagged, &folio->flags);
clear_bit(PG_mte_lock, &folio->flags);
clear_bit(PG_mte_tagged, &folio->flags.f);
clear_bit(PG_mte_lock, &folio->flags.f);
}
#endif
}


@ -308,6 +308,7 @@ static inline const void *__tag_set(const void *addr, u8 tag)
#define arch_enable_tag_checks_sync() mte_enable_kernel_sync()
#define arch_enable_tag_checks_async() mte_enable_kernel_async()
#define arch_enable_tag_checks_asymm() mte_enable_kernel_asymm()
#define arch_enable_tag_checks_write_only() mte_enable_kernel_store_only()
#define arch_suppress_tag_checks_start() mte_enable_tco()
#define arch_suppress_tag_checks_stop() mte_disable_tco()
#define arch_force_async_tag_fault() mte_check_tfsr_exit()


@ -200,6 +200,7 @@ static inline void mte_set_mem_tag_range(void *addr, size_t size, u8 tag,
void mte_enable_kernel_sync(void);
void mte_enable_kernel_async(void);
void mte_enable_kernel_asymm(void);
int mte_enable_kernel_store_only(void);
#else /* CONFIG_ARM64_MTE */
@ -251,6 +252,11 @@ static inline void mte_enable_kernel_asymm(void)
{
}
static inline int mte_enable_kernel_store_only(void)
{
return -EINVAL;
}
#endif /* CONFIG_ARM64_MTE */
#endif /* __ASSEMBLY__ */


@ -48,12 +48,12 @@ static inline void set_page_mte_tagged(struct page *page)
* before the page flags update.
*/
smp_wmb();
set_bit(PG_mte_tagged, &page->flags);
set_bit(PG_mte_tagged, &page->flags.f);
}
static inline bool page_mte_tagged(struct page *page)
{
bool ret = test_bit(PG_mte_tagged, &page->flags);
bool ret = test_bit(PG_mte_tagged, &page->flags.f);
VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
@ -82,7 +82,7 @@ static inline bool try_page_mte_tagging(struct page *page)
{
VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
if (!test_and_set_bit(PG_mte_lock, &page->flags))
if (!test_and_set_bit(PG_mte_lock, &page->flags.f))
return true;
/*
@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page)
* already. Check if the PG_mte_tagged flag has been set or wait
* otherwise.
*/
smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));
smp_cond_load_acquire(&page->flags.f, VAL & (1UL << PG_mte_tagged));
return false;
}
@ -173,13 +173,13 @@ static inline void folio_set_hugetlb_mte_tagged(struct folio *folio)
* before the folio flags update.
*/
smp_wmb();
set_bit(PG_mte_tagged, &folio->flags);
set_bit(PG_mte_tagged, &folio->flags.f);
}
static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
{
bool ret = test_bit(PG_mte_tagged, &folio->flags);
bool ret = test_bit(PG_mte_tagged, &folio->flags.f);
VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
@ -196,7 +196,7 @@ static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
{
VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
if (!test_and_set_bit(PG_mte_lock, &folio->flags))
if (!test_and_set_bit(PG_mte_lock, &folio->flags.f))
return true;
/*
@ -204,7 +204,7 @@ static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
* already. Check if the PG_mte_tagged flag has been set or wait
* otherwise.
*/
smp_cond_load_acquire(&folio->flags, VAL & (1UL << PG_mte_tagged));
smp_cond_load_acquire(&folio->flags.f, VAL & (1UL << PG_mte_tagged));
return false;
}


@ -2956,7 +2956,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "Store Only MTE Tag Check",
.capability = ARM64_MTE_STORE_ONLY,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, MTESTOREONLY, IMP)
},


@ -157,6 +157,24 @@ void mte_enable_kernel_asymm(void)
mte_enable_kernel_sync();
}
}
int mte_enable_kernel_store_only(void)
{
/*
* If the CPU does not support MTE store only,
* the kernel checks all operations.
*/
if (!cpus_have_cap(ARM64_MTE_STORE_ONLY))
return -EINVAL;
sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCSO_MASK,
SYS_FIELD_PREP(SCTLR_EL1, TCSO, 1));
isb();
pr_info_once("MTE: enabled store only mode at EL1\n");
return 0;
}
#endif
#ifdef CONFIG_KASAN_HW_TAGS


@ -53,11 +53,11 @@ void __sync_icache_dcache(pte_t pte)
{
struct folio *folio = page_folio(pte_page(pte));
if (!test_bit(PG_dcache_clean, &folio->flags)) {
if (!test_bit(PG_dcache_clean, &folio->flags.f)) {
sync_icache_aliases((unsigned long)folio_address(folio),
(unsigned long)folio_address(folio) +
folio_size(folio));
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
EXPORT_SYMBOL_GPL(__sync_icache_dcache);
@ -69,8 +69,8 @@ EXPORT_SYMBOL_GPL(__sync_icache_dcache);
*/
void flush_dcache_folio(struct folio *folio)
{
if (test_bit(PG_dcache_clean, &folio->flags))
clear_bit(PG_dcache_clean, &folio->flags);
if (test_bit(PG_dcache_clean, &folio->flags.f))
clear_bit(PG_dcache_clean, &folio->flags.f);
}
EXPORT_SYMBOL(flush_dcache_folio);


@ -399,14 +399,12 @@ void __init kasan_init(void)
{
kasan_init_shadow();
kasan_init_depth();
#if defined(CONFIG_KASAN_GENERIC)
kasan_init_generic();
/*
* Generic KASAN is now fully initialized.
* Software and Hardware Tag-Based modes still require
* kasan_init_sw_tags() and kasan_init_hw_tags() correspondingly.
*/
pr_info("KernelAddressSanitizer initialized (generic)\n");
#endif
}
#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */


@ -1239,7 +1239,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
vmem_altmap_free(altmap, size >> PAGE_SHIFT);
} else {
WARN_ON(PageReserved(page));
free_pages((unsigned long)page_address(page), get_order(size));
__free_pages(page, get_order(size));
}
}


@ -25,12 +25,12 @@ void flush_dcache_folio(struct folio *folio)
mapping = folio_flush_mapping(folio);
if (mapping && !folio_mapped(folio))
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
else {
dcache_wbinv_all();
if (mapping)
icache_inv_all();
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
EXPORT_SYMBOL(flush_dcache_folio);
@ -56,7 +56,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
return;
folio = page_folio(pfn_to_page(pfn));
if (!test_and_set_bit(PG_dcache_clean, &folio->flags))
if (!test_and_set_bit(PG_dcache_clean, &folio->flags.f))
dcache_wbinv_all();
if (folio_flush_mapping(folio)) {


@ -9,6 +9,7 @@ config LOONGARCH
select ACPI_PPTT if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
select ARCH_BINFMT_ELF_STATE
select ARCH_NEEDS_DEFER_KASAN
select ARCH_DISABLE_KASAN_INLINE
select ARCH_ENABLE_MEMORY_HOTPLUG
select ARCH_ENABLE_MEMORY_HOTREMOVE


@ -106,7 +106,6 @@ CONFIG_CMDLINE_PARTITION=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BFQ_GROUP_IOSCHED=y
CONFIG_BINFMT_MISC=m
CONFIG_ZPOOL=y
CONFIG_ZSWAP=y
CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD=y
CONFIG_ZSMALLOC=y


@ -66,7 +66,6 @@
#define XKPRANGE_WC_SHADOW_OFFSET (KASAN_SHADOW_START + XKPRANGE_WC_KASAN_OFFSET)
#define XKVRANGE_VC_SHADOW_OFFSET (KASAN_SHADOW_START + XKVRANGE_VC_KASAN_OFFSET)
extern bool kasan_early_stage;
extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
#define kasan_mem_to_shadow kasan_mem_to_shadow
@ -75,12 +74,6 @@ void *kasan_mem_to_shadow(const void *addr);
#define kasan_shadow_to_mem kasan_shadow_to_mem
const void *kasan_shadow_to_mem(const void *shadow_addr);
#define kasan_arch_is_ready kasan_arch_is_ready
static __always_inline bool kasan_arch_is_ready(void)
{
return !kasan_early_stage;
}
#define addr_has_metadata addr_has_metadata
static __always_inline bool addr_has_metadata(const void *addr)
{


@ -40,11 +40,9 @@ static pgd_t kasan_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
#define __pte_none(early, pte) (early ? pte_none(pte) : \
((pte_val(pte) & _PFN_MASK) == (unsigned long)__pa(kasan_early_shadow_page)))
bool kasan_early_stage = true;
void *kasan_mem_to_shadow(const void *addr)
{
if (!kasan_arch_is_ready()) {
if (!kasan_enabled()) {
return (void *)(kasan_early_shadow_page);
} else {
unsigned long maddr = (unsigned long)addr;
@ -298,7 +296,8 @@ void __init kasan_init(void)
kasan_populate_early_shadow(kasan_mem_to_shadow((void *)VMALLOC_START),
kasan_mem_to_shadow((void *)KFENCE_AREA_END));
kasan_early_stage = false;
/* Enable KASAN here before kasan_mem_to_shadow(). */
kasan_init_generic();
/* Populate the linear mapping */
for_each_mem_range(i, &pa_start, &pa_end) {
@ -329,5 +328,4 @@ void __init kasan_init(void)
/* At this point kasan is fully initialized. Enable error messages */
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized.\n");
}


@ -37,11 +37,11 @@
#define PG_dcache_dirty PG_arch_1
#define folio_test_dcache_dirty(folio) \
test_bit(PG_dcache_dirty, &(folio)->flags)
test_bit(PG_dcache_dirty, &(folio)->flags.f)
#define folio_set_dcache_dirty(folio) \
set_bit(PG_dcache_dirty, &(folio)->flags)
set_bit(PG_dcache_dirty, &(folio)->flags.f)
#define folio_clear_dcache_dirty(folio) \
clear_bit(PG_dcache_dirty, &(folio)->flags)
clear_bit(PG_dcache_dirty, &(folio)->flags.f)
extern void (*flush_cache_all)(void);
extern void (*__flush_cache_all)(void);
@ -50,13 +50,14 @@ extern void (*flush_cache_mm)(struct mm_struct *mm);
extern void (*flush_cache_range)(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
extern void __flush_dcache_pages(struct page *page, unsigned int nr);
void __flush_dcache_folio_pages(struct folio *folio, struct page *page, unsigned int nr);
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
static inline void flush_dcache_folio(struct folio *folio)
{
if (cpu_has_dc_aliases)
__flush_dcache_pages(&folio->page, folio_nr_pages(folio));
__flush_dcache_folio_pages(folio, folio_page(folio, 0),
folio_nr_pages(folio));
else if (!cpu_has_ic_fills_f_dc)
folio_set_dcache_dirty(folio);
}
@ -64,10 +65,12 @@ static inline void flush_dcache_folio(struct folio *folio)
static inline void flush_dcache_page(struct page *page)
{
struct folio *folio = page_folio(page);
if (cpu_has_dc_aliases)
__flush_dcache_pages(page, 1);
__flush_dcache_folio_pages(folio, page, 1);
else if (!cpu_has_ic_fills_f_dc)
folio_set_dcache_dirty(page_folio(page));
folio_set_dcache_dirty(folio);
}
#define flush_dcache_mmap_lock(mapping) do { } while (0)


@ -99,9 +99,9 @@ SYSCALL_DEFINE3(cacheflush, unsigned long, addr, unsigned long, bytes,
return 0;
}
void __flush_dcache_pages(struct page *page, unsigned int nr)
void __flush_dcache_folio_pages(struct folio *folio, struct page *page,
unsigned int nr)
{
struct folio *folio = page_folio(page);
struct address_space *mapping = folio_flush_mapping(folio);
unsigned long addr;
unsigned int i;
@ -117,12 +117,12 @@ void __flush_dcache_pages(struct page *page, unsigned int nr)
* get faulted into the tlb (and thus flushed) anyways.
*/
for (i = 0; i < nr; i++) {
addr = (unsigned long)kmap_local_page(nth_page(page, i));
addr = (unsigned long)kmap_local_page(page + i);
flush_data_cache_page(addr);
kunmap_local((void *)addr);
}
}
EXPORT_SYMBOL(__flush_dcache_pages);
EXPORT_SYMBOL(__flush_dcache_folio_pages);
void __flush_anon_page(struct page *page, unsigned long vmaddr)
{


@ -187,7 +187,7 @@ void flush_dcache_folio(struct folio *folio)
/* Flush this page if there are aliases. */
if (mapping && !mapping_mapped(mapping)) {
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
} else {
__flush_dcache_folio(folio);
if (mapping) {
@ -195,7 +195,7 @@ void flush_dcache_folio(struct folio *folio)
flush_aliases(mapping, folio);
flush_icache_range(start, start + folio_size(folio));
}
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
EXPORT_SYMBOL(flush_dcache_folio);
@ -227,7 +227,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
return;
folio = page_folio(pfn_to_page(pfn));
if (!test_and_set_bit(PG_dcache_clean, &folio->flags))
if (!test_and_set_bit(PG_dcache_clean, &folio->flags.f))
__flush_dcache_folio(folio);
mapping = folio_flush_mapping(folio);


@ -75,7 +75,7 @@ static inline void sync_icache_dcache(struct page *page)
static inline void flush_dcache_folio(struct folio *folio)
{
clear_bit(PG_dc_clean, &folio->flags);
clear_bit(PG_dc_clean, &folio->flags.f);
}
#define flush_dcache_folio flush_dcache_folio


@ -83,7 +83,7 @@ void update_cache(struct vm_area_struct *vma, unsigned long address,
{
unsigned long pfn = pte_val(*pte) >> PAGE_SHIFT;
struct folio *folio = page_folio(pfn_to_page(pfn));
int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags);
int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags.f);
/*
* Since icaches do not snoop for updated data on OpenRISC, we


@ -48,7 +48,7 @@
#ifndef __ASSEMBLER__
struct rlimit;
unsigned long mmap_upper_limit(struct rlimit *rlim_stack);
unsigned long mmap_upper_limit(const struct rlimit *rlim_stack);
unsigned long calc_max_stack_size(unsigned long stack_max);
/*


@ -122,10 +122,10 @@ void __update_cache(pte_t pte)
pfn = folio_pfn(folio);
nr = folio_nr_pages(folio);
if (folio_flush_mapping(folio) &&
test_bit(PG_dcache_dirty, &folio->flags)) {
test_bit(PG_dcache_dirty, &folio->flags.f)) {
while (nr--)
flush_kernel_dcache_page_addr(pfn_va(pfn + nr));
clear_bit(PG_dcache_dirty, &folio->flags);
clear_bit(PG_dcache_dirty, &folio->flags.f);
} else if (parisc_requires_coherency())
while (nr--)
flush_kernel_dcache_page_addr(pfn_va(pfn + nr));
@@ -481,7 +481,7 @@ void flush_dcache_folio(struct folio *folio)
pgoff_t pgoff;
if (mapping && !mapping_mapped(mapping)) {
set_bit(PG_dcache_dirty, &folio->flags);
set_bit(PG_dcache_dirty, &folio->flags.f);
return;
}


@@ -77,7 +77,7 @@ unsigned long calc_max_stack_size(unsigned long stack_max)
* indicating that "current" should be used instead of a passed-in
* value from the exec bprm as done with arch_pick_mmap_layout().
*/
unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
unsigned long mmap_upper_limit(const struct rlimit *rlim_stack)
{
unsigned long stack_base;


@@ -122,6 +122,7 @@ config PPC
# Please keep this list sorted alphabetically.
#
select ARCH_32BIT_OFF_T if PPC32
select ARCH_NEEDS_DEFER_KASAN if PPC_RADIX_MMU
select ARCH_DISABLE_KASAN_INLINE if PPC_RADIX_MMU
select ARCH_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE
select ARCH_ENABLE_MEMORY_HOTPLUG


@@ -40,8 +40,8 @@ static inline void flush_dcache_folio(struct folio *folio)
if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
return;
/* avoid an atomic op if possible */
if (test_bit(PG_dcache_clean, &folio->flags))
clear_bit(PG_dcache_clean, &folio->flags);
if (test_bit(PG_dcache_clean, &folio->flags.f))
clear_bit(PG_dcache_clean, &folio->flags.f);
}
#define flush_dcache_folio flush_dcache_folio


@@ -53,18 +53,6 @@
#endif
#ifdef CONFIG_KASAN
#ifdef CONFIG_PPC_BOOK3S_64
DECLARE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key);
static __always_inline bool kasan_arch_is_ready(void)
{
if (static_branch_likely(&powerpc_kasan_enabled_key))
return true;
return false;
}
#define kasan_arch_is_ready kasan_arch_is_ready
#endif
void kasan_early_init(void);
void kasan_mmu_init(void);


@@ -939,9 +939,9 @@ static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
/* Clear i-cache for new pages */
folio = page_folio(pfn_to_page(pfn));
if (!test_bit(PG_dcache_clean, &folio->flags)) {
if (!test_bit(PG_dcache_clean, &folio->flags.f)) {
flush_dcache_icache_folio(folio);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}


@@ -1562,11 +1562,11 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
folio = page_folio(pte_page(pte));
/* page is dirty */
if (!test_bit(PG_dcache_clean, &folio->flags) &&
if (!test_bit(PG_dcache_clean, &folio->flags.f) &&
!folio_test_reserved(folio)) {
if (trap == INTERRUPT_INST_STORAGE) {
flush_dcache_icache_folio(folio);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
} else
pp |= HPTE_R_N;
}


@@ -780,7 +780,7 @@ static void __meminit free_vmemmap_pages(struct page *page,
while (nr_pages--)
free_reserved_page(page++);
} else
free_pages((unsigned long)page_address(page), order);
__free_pages(page, order);
}
static void __meminit remove_pte_table(pte_t *pte_start, unsigned long addr,


@@ -165,7 +165,7 @@ void __init kasan_init(void)
/* At this point kasan is fully initialized. Enable error messages */
init_task.kasan_depth = 0;
pr_info("KASAN init done\n");
kasan_init_generic();
}
void __init kasan_late_init(void)


@@ -127,7 +127,7 @@ void __init kasan_init(void)
/* Enable error messages */
init_task.kasan_depth = 0;
pr_info("KASAN init done\n");
kasan_init_generic();
}
void __init kasan_late_init(void) { }


@@ -19,8 +19,6 @@
#include <linux/memblock.h>
#include <asm/pgalloc.h>
DEFINE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key);
static void __init kasan_init_phys_region(void *start, void *end)
{
unsigned long k_start, k_end, k_cur;
@@ -92,11 +90,9 @@ void __init kasan_init(void)
*/
memset(kasan_early_shadow_page, 0, PAGE_SIZE);
static_branch_inc(&powerpc_kasan_enabled_key);
/* Enable error messages */
init_task.kasan_depth = 0;
pr_info("KASAN init done\n");
kasan_init_generic();
}
void __init kasan_early_init(void) { }


@@ -87,9 +87,9 @@ static pte_t set_pte_filter_hash(pte_t pte, unsigned long addr)
struct folio *folio = maybe_pte_to_folio(pte);
if (!folio)
return pte;
if (!test_bit(PG_dcache_clean, &folio->flags)) {
if (!test_bit(PG_dcache_clean, &folio->flags.f)) {
flush_dcache_icache_folio(folio);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
return pte;
@@ -127,13 +127,13 @@ static inline pte_t set_pte_filter(pte_t pte, unsigned long addr)
return pte;
/* If the page clean, we move on */
if (test_bit(PG_dcache_clean, &folio->flags))
if (test_bit(PG_dcache_clean, &folio->flags.f))
return pte;
/* If it's an exec fault, we flush the cache and make it clean */
if (is_exec_fault()) {
flush_dcache_icache_folio(folio);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
return pte;
}
@@ -175,12 +175,12 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
goto bail;
/* If the page is already clean, we move on */
if (test_bit(PG_dcache_clean, &folio->flags))
if (test_bit(PG_dcache_clean, &folio->flags.f))
goto bail;
/* Clean the page and set PG_dcache_clean */
flush_dcache_icache_folio(folio);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
bail:
return pte_mkexec(pte);


@@ -69,7 +69,7 @@ static const struct flag_info flag_array[] = {
}
};
struct pgtable_level pg_level[5] = {
struct ptdump_pg_level pg_level[5] = {
{ /* pgd */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),


@@ -102,7 +102,7 @@ static const struct flag_info flag_array[] = {
}
};
struct pgtable_level pg_level[5] = {
struct ptdump_pg_level pg_level[5] = {
{ /* pgd */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),


@@ -11,12 +11,12 @@ struct flag_info {
int shift;
};
struct pgtable_level {
struct ptdump_pg_level {
const struct flag_info *flag;
size_t num;
u64 mask;
};
extern struct pgtable_level pg_level[5];
extern struct ptdump_pg_level pg_level[5];
void pt_dump_size(struct seq_file *m, unsigned long delta);


@@ -67,7 +67,7 @@ static const struct flag_info flag_array[] = {
}
};
struct pgtable_level pg_level[5] = {
struct ptdump_pg_level pg_level[5] = {
{ /* pgd */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),


@@ -545,7 +545,7 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
/* balloon page list reference */
put_page(page);
return MIGRATEPAGE_SUCCESS;
return 0;
}
static void cmm_balloon_compaction_init(void)


@@ -23,8 +23,8 @@ static inline void local_flush_icache_range(unsigned long start,
static inline void flush_dcache_folio(struct folio *folio)
{
if (test_bit(PG_dcache_clean, &folio->flags))
clear_bit(PG_dcache_clean, &folio->flags);
if (test_bit(PG_dcache_clean, &folio->flags.f))
clear_bit(PG_dcache_clean, &folio->flags.f);
}
#define flush_dcache_folio flush_dcache_folio
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1


@@ -7,7 +7,7 @@
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
}
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags


@@ -101,9 +101,9 @@ void flush_icache_pte(struct mm_struct *mm, pte_t pte)
{
struct folio *folio = page_folio(pte_page(pte));
if (!test_bit(PG_dcache_clean, &folio->flags)) {
if (!test_bit(PG_dcache_clean, &folio->flags.f)) {
flush_icache_mm(mm, false);
set_bit(PG_dcache_clean, &folio->flags);
set_bit(PG_dcache_clean, &folio->flags.f);
}
}
#endif /* CONFIG_MMU */


@@ -1630,7 +1630,7 @@ static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
if (PageReserved(page))
free_reserved_page(page);
else
free_pages((unsigned long)page_address(page), 0);
__free_pages(page, 0);
p4d_clear(p4d);
}
@@ -1652,7 +1652,7 @@ static void __meminit free_vmemmap_storage(struct page *page, size_t size,
return;
}
free_pages((unsigned long)page_address(page), order);
__free_pages(page, order);
}
static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long addr, unsigned long end,


@@ -533,4 +533,5 @@ void __init kasan_init(void)
csr_write(CSR_SATP, PFN_DOWN(__pa(swapper_pg_dir)) | satp_mode);
local_flush_tlb_all();
kasan_init_generic();
}


@@ -712,7 +712,6 @@ menu "Memory setup"
config ARCH_SPARSEMEM_ENABLE
def_bool y
select SPARSEMEM_VMEMMAP_ENABLE
select SPARSEMEM_VMEMMAP
config ARCH_SPARSEMEM_DEFAULT
def_bool y


@@ -39,7 +39,7 @@ static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
}
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags


@@ -21,6 +21,7 @@
#include <linux/kernel.h>
#include <asm/asm-extable.h>
#include <linux/memblock.h>
#include <linux/kasan.h>
#include <asm/access-regs.h>
#include <asm/asm-offsets.h>
#include <asm/machine.h>
@@ -65,7 +66,7 @@ static void __init kasan_early_init(void)
{
#ifdef CONFIG_KASAN
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
kasan_init_generic();
#endif
}


@@ -144,7 +144,7 @@ int uv_destroy_folio(struct folio *folio)
folio_get(folio);
rc = uv_destroy(folio_to_phys(folio));
if (!rc)
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
folio_put(folio);
return rc;
}
@@ -193,7 +193,7 @@ int uv_convert_from_secure_folio(struct folio *folio)
folio_get(folio);
rc = uv_convert_from_secure(folio_to_phys(folio));
if (!rc)
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
folio_put(folio);
return rc;
}
@@ -289,7 +289,7 @@ static int __make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)
expected = expected_folio_refs(folio) + 1;
if (!folio_ref_freeze(folio, expected))
return -EBUSY;
set_bit(PG_arch_1, &folio->flags);
set_bit(PG_arch_1, &folio->flags.f);
/*
* If the UVC does not succeed or fail immediately, we don't want to
* loop for long, or we might get stall notifications.
@@ -483,18 +483,18 @@ int arch_make_folio_accessible(struct folio *folio)
* convert_to_secure.
* As secure pages are never large folios, both variants can co-exists.
*/
if (!test_bit(PG_arch_1, &folio->flags))
if (!test_bit(PG_arch_1, &folio->flags.f))
return 0;
rc = uv_pin_shared(folio_to_phys(folio));
if (!rc) {
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
return 0;
}
rc = uv_convert_from_secure(folio_to_phys(folio));
if (!rc) {
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
return 0;
}


@@ -2272,7 +2272,7 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
start = pmd_val(*pmd) & HPAGE_MASK;
end = start + HPAGE_SIZE;
__storage_key_init_range(start, end);
set_bit(PG_arch_1, &folio->flags);
set_bit(PG_arch_1, &folio->flags.f);
cond_resched();
return 0;
}


@@ -155,7 +155,7 @@ static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
paddr = rste & PMD_MASK;
}
if (!test_and_set_bit(PG_arch_1, &folio->flags))
if (!test_and_set_bit(PG_arch_1, &folio->flags.f))
__storage_key_init_range(paddr, paddr + size);
}


@@ -27,7 +27,7 @@ static unsigned long stack_maxrandom_size(void)
return STACK_RND_MASK << PAGE_SHIFT;
}
static inline int mmap_is_legacy(struct rlimit *rlim_stack)
static inline int mmap_is_legacy(const struct rlimit *rlim_stack)
{
if (current->personality & ADDR_COMPAT_LAYOUT)
return 1;
@@ -47,7 +47,7 @@ static unsigned long mmap_base_legacy(unsigned long rnd)
}
static inline unsigned long mmap_base(unsigned long rnd,
struct rlimit *rlim_stack)
const struct rlimit *rlim_stack)
{
unsigned long gap = rlim_stack->rlim_cur;
unsigned long pad = stack_maxrandom_size() + stack_guard_gap;
@@ -169,7 +169,7 @@ check_asce_limit:
* This function, called very early during the creation of a new
* process VM image, sets up which VM layout function to use:
*/
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
void arch_pick_mmap_layout(struct mm_struct *mm, const struct rlimit *rlim_stack)
{
unsigned long random_factor = 0UL;
@@ -182,10 +182,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
*/
if (mmap_is_legacy(rlim_stack)) {
mm->mmap_base = mmap_base_legacy(random_factor);
clear_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_clear(MMF_TOPDOWN, mm);
} else {
mm->mmap_base = mmap_base(random_factor, rlim_stack);
set_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_set(MMF_TOPDOWN, mm);
}
}


@@ -25,7 +25,7 @@ unsigned long *crst_table_alloc_noprof(struct mm_struct *mm)
ptdesc = pagetable_alloc_noprof(gfp, CRST_ALLOC_ORDER);
if (!ptdesc)
return NULL;
table = ptdesc_to_virt(ptdesc);
table = ptdesc_address(ptdesc);
__arch_set_page_dat(table, 1UL << CRST_ALLOC_ORDER);
return table;
}
@@ -123,7 +123,7 @@ struct ptdesc *page_table_alloc_pgste_noprof(struct mm_struct *mm)
ptdesc = pagetable_alloc_noprof(GFP_KERNEL_ACCOUNT, 0);
if (ptdesc) {
table = (u64 *)ptdesc_to_virt(ptdesc);
table = (u64 *)ptdesc_address(ptdesc);
__arch_set_page_dat(table, 1);
memset64(table, _PAGE_INVALID, PTRS_PER_PTE);
memset64(table + PTRS_PER_PTE, 0, PTRS_PER_PTE);
@@ -153,7 +153,7 @@ unsigned long *page_table_alloc_noprof(struct mm_struct *mm)
pagetable_free(ptdesc);
return NULL;
}
table = ptdesc_to_virt(ptdesc);
table = ptdesc_address(ptdesc);
__arch_set_page_dat(table, 1);
memset64((u64 *)table, _PAGE_INVALID, PTRS_PER_PTE);
memset64((u64 *)table + PTRS_PER_PTE, 0, PTRS_PER_PTE);


@@ -14,7 +14,7 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
}
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags


@@ -114,7 +114,7 @@ static void sh4_flush_dcache_folio(void *arg)
struct address_space *mapping = folio_flush_mapping(folio);
if (mapping && !mapping_mapped(mapping))
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
else
#endif
{


@@ -138,7 +138,7 @@ static void sh7705_flush_dcache_folio(void *arg)
struct address_space *mapping = folio_flush_mapping(folio);
if (mapping && !mapping_mapped(mapping))
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
else {
unsigned long pfn = folio_pfn(folio);
unsigned int i, nr = folio_nr_pages(folio);


@@ -64,14 +64,14 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
struct folio *folio = page_folio(page);
if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
test_bit(PG_dcache_clean, &folio->flags)) {
test_bit(PG_dcache_clean, &folio->flags.f)) {
void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(vto, src, len);
kunmap_coherent(vto);
} else {
memcpy(dst, src, len);
if (boot_cpu_data.dcache.n_aliases)
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
}
if (vma->vm_flags & VM_EXEC)
@@ -85,14 +85,14 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
struct folio *folio = page_folio(page);
if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
test_bit(PG_dcache_clean, &folio->flags)) {
test_bit(PG_dcache_clean, &folio->flags.f)) {
void *vfrom = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(dst, vfrom, len);
kunmap_coherent(vfrom);
} else {
memcpy(dst, src, len);
if (boot_cpu_data.dcache.n_aliases)
clear_bit(PG_dcache_clean, &folio->flags);
clear_bit(PG_dcache_clean, &folio->flags.f);
}
}
@@ -105,7 +105,7 @@ void copy_user_highpage(struct page *to, struct page *from,
vto = kmap_atomic(to);
if (boot_cpu_data.dcache.n_aliases && folio_mapped(src) &&
test_bit(PG_dcache_clean, &src->flags)) {
test_bit(PG_dcache_clean, &src->flags.f)) {
vfrom = kmap_coherent(from, vaddr);
copy_page(vto, vfrom);
kunmap_coherent(vfrom);
@@ -148,7 +148,7 @@ void __update_cache(struct vm_area_struct *vma,
if (pfn_valid(pfn)) {
struct folio *folio = page_folio(pfn_to_page(pfn));
int dirty = !test_and_set_bit(PG_dcache_clean, &folio->flags);
int dirty = !test_and_set_bit(PG_dcache_clean, &folio->flags.f);
if (dirty)
__flush_purge_region(folio_address(folio),
folio_size(folio));
@@ -162,7 +162,7 @@ void __flush_anon_page(struct page *page, unsigned long vmaddr)
if (pages_do_alias(addr, vmaddr)) {
if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
test_bit(PG_dcache_clean, &folio->flags)) {
test_bit(PG_dcache_clean, &folio->flags.f)) {
void *kaddr;
kaddr = kmap_coherent(page, vmaddr);


@@ -31,7 +31,7 @@ void *kmap_coherent(struct page *page, unsigned long addr)
enum fixed_addresses idx;
unsigned long vaddr;
BUG_ON(!test_bit(PG_dcache_clean, &folio->flags));
BUG_ON(!test_bit(PG_dcache_clean, &folio->flags.f));
preempt_disable();
pagefault_disable();


@@ -294,7 +294,7 @@ static unsigned long mmap_rnd(void)
return rnd << PAGE_SHIFT;
}
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
void arch_pick_mmap_layout(struct mm_struct *mm, const struct rlimit *rlim_stack)
{
unsigned long random_factor = mmap_rnd();
unsigned long gap;
@@ -309,7 +309,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
gap == RLIM_INFINITY ||
sysctl_legacy_va_layout) {
mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
clear_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_clear(MMF_TOPDOWN, mm);
} else {
/* We know it's 32-bit */
unsigned long task_size = STACK_TOP32;
@@ -320,7 +320,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
gap = (task_size / 6 * 5);
mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
set_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_set(MMF_TOPDOWN, mm);
}
}


@@ -224,7 +224,7 @@ inline void flush_dcache_folio_impl(struct folio *folio)
((1UL<<ilog2(roundup_pow_of_two(NR_CPUS)))-1UL)
#define dcache_dirty_cpu(folio) \
(((folio)->flags >> PG_dcache_cpu_shift) & PG_dcache_cpu_mask)
(((folio)->flags.f >> PG_dcache_cpu_shift) & PG_dcache_cpu_mask)
static inline void set_dcache_dirty(struct folio *folio, int this_cpu)
{
@@ -243,7 +243,7 @@ static inline void set_dcache_dirty(struct folio *folio, int this_cpu)
"bne,pn %%xcc, 1b\n\t"
" nop"
: /* no outputs */
: "r" (mask), "r" (non_cpu_bits), "r" (&folio->flags)
: "r" (mask), "r" (non_cpu_bits), "r" (&folio->flags.f)
: "g1", "g7");
}
@@ -265,7 +265,7 @@ static inline void clear_dcache_dirty_cpu(struct folio *folio, unsigned long cpu
" nop\n"
"2:"
: /* no outputs */
: "r" (cpu), "r" (mask), "r" (&folio->flags),
: "r" (cpu), "r" (mask), "r" (&folio->flags.f),
"i" (PG_dcache_cpu_mask),
"i" (PG_dcache_cpu_shift)
: "g1", "g7");
@@ -292,7 +292,7 @@ static void flush_dcache(unsigned long pfn)
struct folio *folio = page_folio(page);
unsigned long pg_flags;
pg_flags = folio->flags;
pg_flags = folio->flags.f;
if (pg_flags & (1UL << PG_dcache_dirty)) {
int cpu = ((pg_flags >> PG_dcache_cpu_shift) &
PG_dcache_cpu_mask);
@@ -480,7 +480,7 @@ void flush_dcache_folio(struct folio *folio)
mapping = folio_flush_mapping(folio);
if (mapping && !mapping_mapped(mapping)) {
bool dirty = test_bit(PG_dcache_dirty, &folio->flags);
bool dirty = test_bit(PG_dcache_dirty, &folio->flags.f);
if (dirty) {
int dirty_cpu = dcache_dirty_cpu(folio);


@@ -5,6 +5,7 @@ menu "UML-specific options"
config UML
bool
default y
select ARCH_NEEDS_DEFER_KASAN if STATIC_LINK
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_HAS_CACHE_LINE_SIZE
select ARCH_HAS_CPU_FINALIZE_INIT


@@ -24,10 +24,9 @@
#ifdef CONFIG_KASAN
void kasan_init(void);
extern int kasan_um_is_ready;
#ifdef CONFIG_STATIC_LINK
#define kasan_arch_is_ready() (kasan_um_is_ready)
#if defined(CONFIG_STATIC_LINK) && defined(CONFIG_KASAN_INLINE)
#error UML does not work in KASAN_INLINE mode with STATIC_LINK enabled!
#endif
#else
static inline void kasan_init(void) { }


@@ -21,10 +21,10 @@
#include <os.h>
#include <um_malloc.h>
#include <linux/sched/task.h>
#include <linux/kasan.h>
#ifdef CONFIG_KASAN
int kasan_um_is_ready;
void kasan_init(void)
void __init kasan_init(void)
{
/*
* kasan_map_memory will map all of the required address space and
@@ -32,7 +32,11 @@ void kasan_init(void)
*/
kasan_map_memory((void *)KASAN_SHADOW_START, KASAN_SHADOW_SIZE);
init_task.kasan_depth = 0;
kasan_um_is_ready = true;
/*
* Since kasan_init() is called before main(),
* KASAN is initialized but the enablement is deferred after
* jump_label_init(). See arch_mm_preinit().
*/
}
static void (*kasan_init_ptr)(void)
@@ -58,6 +62,9 @@ static unsigned long brk_end;
void __init arch_mm_preinit(void)
{
/* Safe to call after jump_label_init(). Enables KASAN. */
kasan_init_generic();
/* clear the zero-page */
memset(empty_zero_page, 0, PAGE_SIZE);


@@ -1565,7 +1565,6 @@ config ARCH_SPARSEMEM_ENABLE
def_bool y
select SPARSEMEM_STATIC if X86_32
select SPARSEMEM_VMEMMAP_ENABLE if X86_64
select SPARSEMEM_VMEMMAP if X86_64
config ARCH_SPARSEMEM_DEFAULT
def_bool X86_64 || (NUMA && X86_32)


@@ -34,6 +34,7 @@
* We need to define the tracepoints somewhere, and tlb.c
* is only compiled when SMP=y.
*/
#define CREATE_TRACE_POINTS
#include <trace/events/tlb.h>
#include "mm_internal.h"


@@ -1031,7 +1031,7 @@ static void __meminit free_pagetable(struct page *page, int order)
free_reserved_pages(page, nr_pages);
#endif
} else {
free_pages((unsigned long)page_address(page), order);
__free_pages(page, order);
}
}


@@ -451,5 +451,5 @@ void __init kasan_init(void)
__flush_tlb_all();
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
kasan_init_generic();
}


@@ -80,7 +80,7 @@ unsigned long arch_mmap_rnd(void)
}
static unsigned long mmap_base(unsigned long rnd, unsigned long task_size,
struct rlimit *rlim_stack)
const struct rlimit *rlim_stack)
{
unsigned long gap = rlim_stack->rlim_cur;
unsigned long pad = stack_maxrandom_size(task_size) + stack_guard_gap;
@@ -110,7 +110,7 @@ static unsigned long mmap_legacy_base(unsigned long rnd,
*/
static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
unsigned long random_factor, unsigned long task_size,
struct rlimit *rlim_stack)
const struct rlimit *rlim_stack)
{
*legacy_base = mmap_legacy_base(random_factor, task_size);
if (mmap_is_legacy())
@@ -119,12 +119,12 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
*base = mmap_base(random_factor, task_size, rlim_stack);
}
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
void arch_pick_mmap_layout(struct mm_struct *mm, const struct rlimit *rlim_stack)
{
if (mmap_is_legacy())
clear_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_clear(MMF_TOPDOWN, mm);
else
set_bit(MMF_TOPDOWN, &mm->flags);
mm_flags_set(MMF_TOPDOWN, mm);
arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
arch_rnd(mmap64_rnd_bits), task_size_64bit(0),


@@ -126,7 +126,7 @@ __setup("debugpat", pat_debug_setup);
static inline enum page_cache_mode get_page_memtype(struct page *pg)
{
unsigned long pg_flags = pg->flags & _PGMT_MASK;
unsigned long pg_flags = pg->flags.f & _PGMT_MASK;
if (pg_flags == _PGMT_WB)
return _PAGE_CACHE_MODE_WB;
@@ -161,10 +161,10 @@ static inline void set_page_memtype(struct page *pg,
break;
}
old_flags = READ_ONCE(pg->flags);
old_flags = READ_ONCE(pg->flags.f);
do {
new_flags = (old_flags & _PGMT_CLEAR_MASK) | memtype_flags;
} while (!try_cmpxchg(&pg->flags, &old_flags, new_flags));
} while (!try_cmpxchg(&pg->flags.f, &old_flags, new_flags));
}
#else
static inline enum page_cache_mode get_page_memtype(struct page *pg)


@@ -42,7 +42,7 @@ void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
struct page *p = pfn_to_page(PHYS_PFN(phys));
unsigned int order = get_order(size);
free_pages((unsigned long) page_address(p), order);
__free_pages(p, order);
}
}


@@ -29,7 +29,7 @@
#if DCACHE_WAY_SIZE > PAGE_SIZE
#define get_pkmap_color get_pkmap_color
static inline int get_pkmap_color(struct page *page)
static inline int get_pkmap_color(const struct page *page)
{
return DCACHE_ALIAS(page_to_phys(page));
}


@@ -134,8 +134,8 @@ void flush_dcache_folio(struct folio *folio)
*/
if (mapping && !mapping_mapped(mapping)) {
if (!test_bit(PG_arch_1, &folio->flags))
set_bit(PG_arch_1, &folio->flags);
if (!test_bit(PG_arch_1, &folio->flags.f))
set_bit(PG_arch_1, &folio->flags.f);
return;
} else {
@@ -232,7 +232,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
#if (DCACHE_WAY_SIZE > PAGE_SIZE)
if (!folio_test_reserved(folio) && test_bit(PG_arch_1, &folio->flags)) {
if (!folio_test_reserved(folio) && test_bit(PG_arch_1, &folio->flags.f)) {
unsigned long phys = folio_pfn(folio) * PAGE_SIZE;
unsigned long tmp;
@@ -247,10 +247,10 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
}
preempt_enable();
clear_bit(PG_arch_1, &folio->flags);
clear_bit(PG_arch_1, &folio->flags.f);
}
#else
if (!folio_test_reserved(folio) && !test_bit(PG_arch_1, &folio->flags)
if (!folio_test_reserved(folio) && !test_bit(PG_arch_1, &folio->flags.f)
&& (vma->vm_flags & VM_EXEC) != 0) {
for (i = 0; i < nr; i++) {
void *paddr = kmap_local_folio(folio, i * PAGE_SIZE);
@@ -258,7 +258,7 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
__invalidate_icache_page((unsigned long)paddr);
kunmap_local(paddr);
}
set_bit(PG_arch_1, &folio->flags);
set_bit(PG_arch_1, &folio->flags.f);
}
#endif
}


@@ -94,5 +94,5 @@ void __init kasan_init(void)
/* At this point kasan is fully initialized. Enable error messages. */
current->kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
kasan_init_generic();
}


@@ -196,6 +196,8 @@ static void __blkdev_issue_zero_pages(struct block_device *bdev,
sector_t sector, sector_t nr_sects, gfp_t gfp_mask,
struct bio **biop, unsigned int flags)
{
struct folio *zero_folio = largest_zero_folio();
while (nr_sects) {
unsigned int nr_vecs = __blkdev_sectors_to_bio_pages(nr_sects);
struct bio *bio;
@@ -208,15 +210,14 @@ static void __blkdev_issue_zero_pages(struct block_device *bdev,
break;
do {
unsigned int len, added;
unsigned int len;
len = min_t(sector_t,
PAGE_SIZE, nr_sects << SECTOR_SHIFT);
added = bio_add_page(bio, ZERO_PAGE(0), len, 0);
if (added < len)
len = min_t(sector_t, folio_size(zero_folio),
nr_sects << SECTOR_SHIFT);
if (!bio_add_folio(bio, zero_folio, len, 0))
break;
nr_sects -= added >> SECTOR_SHIFT;
sector += added >> SECTOR_SHIFT;
nr_sects -= len >> SECTOR_SHIFT;
sector += len >> SECTOR_SHIFT;
} while (nr_sects);
*biop = bio_chain_and_submit(*biop, bio);


@@ -88,7 +88,7 @@ static int hash_walk_new_entry(struct crypto_hash_walk *walk)
sg = walk->sg;
walk->offset = sg->offset;
walk->pg = nth_page(sg_page(walk->sg), (walk->offset >> PAGE_SHIFT));
walk->pg = sg_page(walk->sg) + (walk->offset >> PAGE_SHIFT);
walk->offset = offset_in_page(walk->offset);
walk->entrylen = sg->length;
@@ -226,7 +226,7 @@ int shash_ahash_digest(struct ahash_request *req, struct shash_desc *desc)
if (!IS_ENABLED(CONFIG_HIGHMEM))
return crypto_shash_digest(desc, data, nbytes, req->result);
page = nth_page(page, offset >> PAGE_SHIFT);
page += offset >> PAGE_SHIFT;
offset = offset_in_page(offset);
if (nbytes > (unsigned int)PAGE_SIZE - offset)

Some files were not shown because too many files have changed in this diff Show More