Previous high mem size initialized for vGPU type was too small which caused
failure for some VMs. This trys to take minimal value of 384MB for each VM and
enlarge default high mem size to make guest driver happy.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
In function intel_vgpu_emulate_mmio_read, the untracked mmio register is
dumped through kernel log, but the register value is not correct. This
patch fixes this issue.
V2: fix the fromat warning from checkpatch.pl.
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The readq and writeq are already offered by drm_os_linux.h. So we can
use them directly whithout dectecting their presence. This patch removed
the duplicated code.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Return ealier for a invalid access, else it would false set
tlb flag for RCS.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The current prototype of new_mmio_info() uses void* for parameters read
and write, which are functions with precise calling conventions
(argument types and return type). Write down these conventions in
new_mmio_info() definition.
This has been reported by the following warnings when clang is used to
build the kernel:
drivers/gpu/drm/i915/gvt/handlers.c:124:21: error: pointer type
mismatch ('void *' and 'int (*)(struct intel_vgpu *, unsigned int,
void *, unsigned int)') [-Werror,-Wpointer-type-mismatch]
info->read = read ? read : intel_vgpu_default_mmio_read;
^ ~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gvt/handlers.c:125:23: error: pointer type
mismatch ('void *' and 'int (*)(struct intel_vgpu *, unsigned int,
void *, unsigned int)') [-Werror,-Wpointer-type-mismatch]
info->write = write ? write : intel_vgpu_default_mmio_write;
^ ~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This allows the compiler to detect that sbi_ctl_mmio_write() returns a
"bool" value instead of an expected "int" one. Fix this.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Pull VFIO fixes from Alex Williamson:
- Add mtty sample driver properly into build system (Alex Williamson)
- Restore type1 mapping performance after mdev (Alex Williamson)
- Fix mdev device race (Alex Williamson)
- Cleanups to the mdev ABI used by vendor drivers (Alex Williamson)
- Build fix for old compilers (Arnd Bergmann)
- Fix sample driver error path (Dan Carpenter)
- Handle pci_iomap() error (Arvind Yadav)
- Fix mdev ioctl return type (Paul Gortmaker)
* tag 'vfio-v4.10-rc3' of git://github.com/awilliam/linux-vfio:
vfio-mdev: fix non-standard ioctl return val causing i386 build fail
vfio-pci: Handle error from pci_iomap
vfio-mdev: fix some error codes in the sample code
vfio-pci: use 32-bit comparisons for register address for gcc-4.5
vfio-mdev: Make mdev_device private and abstract interfaces
vfio-mdev: Make mdev_parent private
vfio-mdev: de-polute the namespace, rename parent_device & parent_ops
vfio-mdev: Fix remove race
vfio/type1: Restore mapping performance with mdev support
vfio-mdev: Fix mtty sample driver building
Backmerge Linux 4.10-rc2 to resync with our -fixes cherry-picks. I've
done the backmerge directly because Dave is on vacation.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
PCI basic config space's size is 256 bytes. When check if access crosses
space range, should use "> 256".
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There's an issue in current cfg space emulation for PCI_COMMAND (offset
0x4): when guest changes some bits other than PCI_COMMAND_MEMORY, this
write operation will not be written to virutal cfg space successfully.
This patch is to fix the wrong behavior above.
Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The release action might be triggered from either user's closing
mdev or the detaching event of kvm and vfio_group, so this patch
introduces an atomic to prevent double-release.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
gfn_to_memslot() may return NULL if the gfn is mmio
or invalid. A malicious user might input a bad gfn
to panic the host if we don't check it.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Though there is no issue exposed yet, it's possible that another
thread releases the entry while our trying to deref it out of the
lock. Fit it by moving the dereference within lock.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The GGTT space is partitioned between vGPUs, it could be reused by
next vGPU after previous one is release, the stale entries need
point to scratch page when vGPU created.
v2: Reset logic move to vGPU create.
v3: Correct the commit msg.
v4: Move the reset function to vGPU init gtt function, as result it's no
need explicitly in vGPU reset logic as vGPU init gtt called during
reset.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It should be vgpu_opregion(vgpu)->va, not vgpu_opregion(vgpu).
Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
KVMGT leverages vfio/mdev to mediate device accesses from guest,
this patch adds the vfio/mdev support, thereby completes the
functionality. An intel_vgpu is presented as a mdev device,
and full userspace API compatibility with vfio-pci is kept.
An intel_vgpu_ops is provided to mdev framework, methods get
called to create/remove a vgpu, to open/close it, and to
access it.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Previously to read/write a GPA, we at first try to pin the GFN it belongs
to, then translate the pinned PFN to a kernel HVA, then read/write it.
This is however not necessary. A GFN should be pinned IFF it would be
accessed by peripheral devices (DMA), not by CPU. This patch changes
the read/write method to KVM API, which will leverage userspace HVA
and copy_{from|to}_usr instead.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Need to be careful to release struct_mutext when request alloc
failed and take consistent handling for return status as with
normal go out path. Ensure to check correct workload request in
complete path too.
v2: Add Fixes note
Fixes: 90d27a1b18 ("drm/i915/gvt: fix deadlock in workload_thread")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Pei Zhang <pei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
For 64bit bar while reading the higher 32bit the value should be returned
directly.
In the current implementation the higher 32bit value was discarded and not
written to the cfg space of vgpu which lead to an incorrect bar size.
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Instead of partially depending on vfio pin/unpin pages interface if
mdev is available, which would result in failure if vfio is not
on. But replace with a wrapper which need to be fixed till mdev
support got fully merged.
Cc: Jike Song <jike.song@intel.com>
Cc: Xiaoguang Chen <xiaoguang.chen@intel.com>
Reviewed-by: Xiaoguang Chen <Xiaoguang.chen@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch will fix warning log print during command scan caused by
empty workload (ring head equals tail). This patch avoid going into
real scan process if workload is empty. It's guest's responsibility
to make sure if an empty workload is proper to submit to HW.
[v2] modify the patch description. It's a fix, not a w/a.
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Since there's no opregion in vgpu so clear the opregion bits in case
guest access it.
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Add more MMIO regs with command access flag for whitelist as they are
accessed by command.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Static checker gave warning on:
drivers/gpu/drm/i915/gvt/edid.c:506 intel_gvt_i2c_handle_aux_ch_write()
warn: odd binop '0x0 & 0xff'
We try to return ACK for I2C reply which is defined with 0. Remove
bit shift which caused misleading bit op.
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Gvt gdrst handler handle_device_reset() invoke function
setup_vgpu_mmio() to reset mmio status. In this case,
the virtual mmio memory has been allocated already. The
new allocation just cause old mmio memory leakage.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We initiate vgpu->workload_q_head via for_each_engine
macro which may skip unavailable engines. So we should
follow this rule anywhere. The function
intel_vgpu_reset_execlist is not aware of this. Kernel
crash when touch a uninitiated vgpu->workload_q_head[x].
Let's fix it by using for_each_engine_masked and skip
unavailable engine ID. Meanwhile rename ring_bitmap to
general name engine_mask.
v2: remove unnecessary engine activation check (zhenyu)
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It's a classical abba type deadlock when using 2 mutex objects, which
are gvt.lock(a) and drm.struct_mutex(b). Deadlock happens in threads:
1. intel_gvt_create/destroy_vgpu: P(a)->P(b)
2. workload_thread: P(b)->P(a)
Fix solution is align the lock acquire sequence in both threads. This
patch choose to adjust the sequence in workload_thread function.
This fixed lockup symptom for guest-reboot stress test.
v2: adjust sequence in workload_thread based on zhenyu's suggestion.
adjust sequence in create/destroy_vgpu function.
v3: fix to still require struct_mutex for dispatch_workload()
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
[zhenyuw: fix unused variables warnings.]
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
KVMGT is the MPT implementation based on VFIO/KVM. It provides
a kvmgt_mpt ops to gvt for vGPU access mediation, e.g. to
mediate and emulate the MMIO accesses, to inject interrupts
to vGPU user, to intercept the GTT writing and replace it with
DMA-able address, to write-protect guest PPGTT table for
shadowing synchronization, etc. This patch provides the MPT
implementation for GVT, not yet functional due to theabsence
of mdev.
It's built as kvmgt.ko, depends on vfio.ko, kvm.ko and mdev.ko,
and being required by i915.ko. To not introduce hard dependency
in i915.ko, we used indirect symbol reference. But that means
users have to include kvmgt.ko into init ramdisk if their
i915.ko is included.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There are currently 4 methods in intel_gvt_io_emulation_ops
to emulate CFG/MMIO reading/writing for intel vGPU. A possibly
better scope is: add 3 more methods for vgpu create/destroy/reset
respectively, and rename the ops to 'intel_gvt_ops', then pass
it to the MPT module (say the future kvmgt) to use: they are
all methods for external usage.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Hypervisors are different, the MPT ops is a only superset of
all possibly supported hypervisors. There might be other way
out of the MPT to achieve same target. e.g. vfio-based kvmgt
won't provide map_gfn_to_mfn method to establish guest EPT
mapping for aperture, since it will be done in QEMU/KVM, MMIO
is also trapped elsewhere, etc.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
GVT host needs init/exit hooks to do some initialization/cleanup
work, e.g.: vfio mdev host device register/unregister.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Current GVT contains some obsolete logic originally cooked to
support the old, non-vfio kvmgt, which is actually workarounds.
We don't support that anymore, so it's safe to remove it and
make a better framework.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
By providing predefined vGPU types, users can choose which type a vgpu
to create and use, without specifying detailed parameters.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
kmap_atomic doesn't allow sleep until unmapped. However,
it's necessary to allow sleep during reading/writing guest
memory, so use kmap instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
On guest writing a PPGTT entry, if it contains value and the old
entry is valid, gvt will read it and find & free the corresponding
old data for it. However, with the KVM write protection provided
by page_track, the guest entry will be written with new value
before gvt handling. To avoid that, we should use the shadow
entry instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
All the unused entries in the page table tree(PML4E->PDPE->PDE->PTE)
should point to scratch page table/scratch page to avoid page walk error
due to the page prefetching.
When removing an entry in shadow PPGTT, it need map to scratch page
also, the older implementation use single scratch page to assign to all
level entries, it doesn't align the page walk behavior when removed
entry is in PML, PDP, PD. To avoid potential page walk error this patch
implement a scratch page tree to replace the single scratch page.
v2: more details in commit message address Kevin's comments.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When SW wishes to reset the render engine, it will program
engine's reset control register and wait response from HW.
We need emulate the behavior of this register so guest i915
driver could walk through the engine reset flow. The registers
are not emulated in gvt yet, this patch add the emulation
logic.
v2: add more desc info in commit message.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
From commit e95433c73a, workload status setting
was changed to only capture on error path, but we need to set it properly in
normal path too, otherwise we'll fail to complete workload which could lead
guest VM vGPU reset.
v2: uses braces and add Fixes tag.
Fixes: e95433c73a ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>