mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/
synced 2026-04-18 06:33:43 -04:00
sched/numa: Skip some page migrations after a shared fault
Shared faults can lead to lots of unnecessary page migrations, slowing down the system, and causing private faults to hit the per-pgdat migration ratelimit. This patch adds sysctl numa_balancing_migrate_deferred, which specifies how many shared page migrations to skip unconditionally, after each page migration that is skipped because it is a shared fault. This reduces the number of page migrations back and forth in shared fault situations. It also gives a strong preference to the tasks that are already running where most of the memory is, and to moving the other tasks to near the memory. Testing this with a much higher scan rate than the default still seems to result in fewer page migrations than before. Memory seems to be somewhat better consolidated than previously, with multi-instance specjbb runs on a 4 node system. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-62-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
committed by
Ingo Molnar
parent
1e3646ffc6
commit
de1c9ce6f0
@@ -375,7 +375,8 @@ feature should be disabled. Otherwise, if the system overhead from the
|
||||
feature is too high then the rate the kernel samples for NUMA hinting
|
||||
faults may be controlled by the numa_balancing_scan_period_min_ms,
|
||||
numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
|
||||
numa_balancing_scan_size_mb and numa_balancing_settle_count sysctls.
|
||||
numa_balancing_scan_size_mb, numa_balancing_settle_count sysctls and
|
||||
numa_balancing_migrate_deferred.
|
||||
|
||||
==============================================================
|
||||
|
||||
@@ -421,6 +422,13 @@ the schedule balancer stops pushing the task towards a preferred node. This
|
||||
gives the scheduler a chance to place the task on an alternative node if the
|
||||
preferred node is overloaded.
|
||||
|
||||
numa_balancing_migrate_deferred is how many page migrations get skipped
|
||||
unconditionally, after a page migration is skipped because a page is shared
|
||||
with other tasks. This reduces page migration overhead, and determines
|
||||
how much stronger the "move task near its memory" policy scheduler becomes,
|
||||
versus the "move memory near its task" memory management policy, for workloads
|
||||
with shared memory.
|
||||
|
||||
==============================================================
|
||||
|
||||
osrelease, ostype & version:
|
||||
|
||||
Reference in New Issue
Block a user