Commit 1333c35c authored by Sean Christopherson's avatar Sean Christopherson
Browse files

KVM: Conditionally reschedule when resetting the dirty ring



When resetting a dirty ring, conditionally reschedule on each iteration
after the first.  The recently introduced hard limit mitigates the issue
of an endless reset, but isn't sufficient to completely prevent RCU
stalls, soft lockups, etc., nor is the hard limit intended to guard
against such badness.

Note!  Take care to check for reschedule even in the "continue" paths,
as a pathological scenario (or malicious userspace) could dirty the same
gfn over and over, i.e. always hit the continue path.

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 	4-....: (5249 ticks this GP) idle=51e4/1/0x4000000000000000 softirq=309/309 fqs=2563
  rcu: 	(t=5250 jiffies g=-319 q=608 ncpus=24)
  CPU: 4 UID: 1000 PID: 1067 Comm: dirty_log_test Tainted: G             L     6.13.0-rc3-17fa7a24ea1e-HEAD-vm #814
  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x26/0x200 [kvm]
  Call Trace:
   <TASK>
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x17/0x200 [kvm]
  Call Trace:
   <TASK>
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>

Fixes: fb04a1ed ("KVM: X86: Implement ring-based dirty memory tracking")
Reviewed-by: default avatarJames Houghton <jthoughton@google.com>
Reviewed-by: default avatarYan Zhao <yan.y.zhao@intel.com>
Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250516213540.2546077-4-seanjc@google.com


Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
parent 49005a2a
Loading
Loading
Loading
Loading
+10 −0
Original line number Diff line number Diff line
@@ -134,6 +134,16 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,

		ring->reset_index++;
		(*nr_entries_reset)++;

		/*
		 * While the size of each ring is fixed, it's possible for the
		 * ring to be constantly re-dirtied/harvested while the reset
		 * is in-progress (the hard limit exists only to guard against
		 * wrapping the count into negative space).
		 */
		if (!first_round)
			cond_resched();

		/*
		 * Try to coalesce the reset operations when the guest is
		 * scanning pages in the same slot.