Driver Changes:
- Avoid infinite GPU waits by avoidin premature release of request's
reusable memory (Chris, Janusz)
- Expose RPS thresholds in sysfs (Tvrtko)
- Apply GuC SLPC min frequency softlimit correctly (Vinay)
- Restore SLPC efficient freq earlier (Vinay)
- Consider OA buffer boundary when zeroing out reports (Umesh)
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL (Matt R)
- Fix context workarounds with non-masked regs on MTL/DG2 (Lucas)
- Enable the CCS_FLUSH bit in the pipe control and in the CS for MTL+ (Andi)
- Update MTL workarounds 14018778641, 22016122933 (Tejas, Zhanjun)
- Ensure memory quiesced before AUX CCS invalidation (Jonathan)
- Add a gsc_info debugfs (Daniele)
- Invalidate the TLBs on each GT on multi-GT device (Chris)
- Fix a VMA UAF for multi-gt platform (Nirmoy)
- Do not use stolen on MTL due to HW bug (Nirmoy)
- Check HuC and GuC version compatibility on MTL (Daniele)
- Dump perf_limit_reasons for slow GuC init debug (Vinay)
- Replace kmap() with kmap_local_page() (Sumitra, Ira)
- Add sentinel to xehp_oa_b_counters for KASAN (Andrzej)
- Add the gen12_needs_ccs_aux_inv helper (Andi)
- Fixes and updates for GSC memory allocation (Daniele)
- Fix one wrong caching mode enum usage (Tvrtko)
- Fixes for GSC wakeref (Alan)
- Static checker fixes (Harshit, Arnd, Dan, Cristophe, David, Andi)
- Rename flags with bit_group_X according to the datasheet (Andi)
- Use direct alias for i915 in requests (Andrzej)
- Replace i915->gt0 with to_gt(i915) (Andi)
- Use the i915_vma_flush_writes helper (Tvrtko)
- Selftest improvements (Alan)
- Remove dead code (Tvrtko)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# Conflicts:
# drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMy6kDd9npweR4uy@jlahtine-mobl.ger.corp.intel.com
On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.
That said, all subsytems that access the GSC engine today does check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.
To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete. Selftests shouldn't
care if the proxy-init failed as that should be flagged elsewhere.
Difference from prior versions:
v7: - Change the fw status to INTEL_UC_FIRMWARE_LOAD_FAIL if the
proxy-init fails so that intel_gsc_uc_fw_proxy_get_status
catches it. (Daniele)
v6: - Add a helper that returns something more than a boolean
so we selftest can stop waiting if proxy-init hadn't
completed but failed (Daniele).
v5: - Move the call to __wait_gsc_proxy_completed from common
__run_selftests dispatcher to the group-level selftest
function (Trvtko).
- change the pr_info to pr_warn if we hit the timeout.
v4: - Remove generalized waiters function table framework (Tvrtko).
- Remove mention of CI-framework-timeout from comments (Tvrtko).
v3: - Rebase to latest drm-tip.
v2: - Based on internal testing, increase the timeout for gsc-proxy
specific case to 8 seconds.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720230126.375566-1-alan.previn.teres.alexis@intel.com
Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.
From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity. PAT is expected to work much like MOCS already works today,
and by design userspace is expected to select the index that exactly
matches the desired behavior described in the hardware specification.
For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For kernel objects, cache_level is used
for simplicity and backward compatibility. For Pre-gen12 platforms PAT can
have 1:1 mapping to i915_cache_level, so these two are interchangeable. see
the use of LEGACY_CACHELEVEL.
One consequence of this change is that gen8_pte_encode is no longer working
for gen12 platforms due to the fact that gen12 platforms has different PAT
definitions. In the meantime the mtl_pte_encode introduced specfically for
MTL becomes generic for all gen12 platforms. This patch renames the MTL
PTE encode function into gen12_pte_encode and apply it to all gen12. Even
though this change looks unrelated, but separating them would temporarily
break gen12 PTE encoding, thus squash them in one patch.
Special note: this patch changes the way caching behavior is controlled in
the sense that some objects are left to be managed by userspace. For such
objects we need to be careful not to change the userspace settings.There
are kerneldoc and comments added around obj->cache_coherent, cache_dirty,
and how to bypass the checkings by i915_gem_object_has_cache_level. For
full understanding, these changes need to be looked at together with the
two follow-up patches, one disables the {set|get}_caching ioctl's and the
other adds set_pat extension to the GEM_CREATE uAPI.
Bspec: 63019
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
This patch is a preparation for replacing enum i915_cache_level with PAT
index. Caching policy for buffer objects is set through the PAT index in
PTE, the old i915_cache_level is not sufficient to represent all caching
modes supported by the hardware.
Preparing the transition by adding some platform dependent data structures
and helper functions to translate the cache_level to pat_index.
cachelevel_to_pat: a platform dependent array mapping cache_level to
pat_index.
max_pat_index: the maximum PAT index recommended in hardware specification
Needed for validating the PAT index passed in from user
space.
i915_gem_get_pat_index: function to convert cache_level to PAT index.
obj_to_i915(obj): macro moved to header file for wider usage.
I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
convenience of coding.
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-2-fei.yang@intel.com
UAPI Changes:
- (Build-time only, should not have any impact)
drm/i915/uapi: Replace fake flex-array with flexible-array member
"Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead."
This is on core kernel request moving towards GCC 13.
Driver Changes:
- Fix context runtime accounting on sysfs fdinfo for heavy workloads (Tvrtko)
- Add support for OA media units on MTL (Umesh)
- Add new workarounds for Meteorlake (Daniele, Radhakrishna, Haridhar)
- Fix sysfs to read actual frequency for MTL and Gen6 and earlier
(Ashutosh)
- Synchronize i915/BIOS on C6 enabling on MTL (Vinay)
- Fix DMAR error noise due to GPU error capture (Andrej)
- Fix forcewake during BAR resize on discrete (Andrzej)
- Flush lmem contents after construction on discrete (Chris)
- Fix GuC loading timeout on systems where IFWI programs low boot
frequency (John)
- Fix race condition UAF in i915_perf_add_config_ioctl (Min)
- Sanitycheck MMIO access early in driver load and during forcewake
(Matt)
- Wakeref fixes for GuC RC error scenario and active VM tracking (Chris)
- Cancel HuC delayed load timer on reset (Daniele)
- Limit double GT reset to pre-MTL (Daniele)
- Use i915 instead of dev_priv insied the file_priv structure (Andi)
- Improve GuC load error reporting (John)
- Simplify VCS/BSD engine selection logic (Tvrtko)
- Perform uc late init after probe error injection (Andrzej)
- Fix format for perf_limit_reasons in debugfs (Vinay)
- Create per-gt debugfs files (Andi)
- Documentation and kerneldoc fixes (Nirmoy, Lee)
- Selftest improvements (Fei, Jonathan)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZC6APj/feB+jBf2d@jlahtine-mobl.ger.corp.intel.com
Driver Changes:
- Fix issue #6333: "list_add corruption" and full system lockup from
performance monitoring (Janusz)
- Give the punit time to settle before fatally failing (Aravind, Chris)
- Don't use stolen memory or BAR for ring buffers on LLC platforms (John)
- Add missing ecodes and correct timeline seqno on GuC error captures (John)
- Make sure DSM size has correct 1MiB granularity on Gen12+ (Nirmoy,
Lucas)
- Fix potential SSEU max_subslices array-index-out-of-bounds access on Gen11 (Andrea)
- Whitelist COMMON_SLICE_CHICKEN3 for UMD access on Gen12+ (Matt R.)
- Apply Wa_1408615072/Wa_1407596294 correctly on Gen11 (Matt R)
- Apply LNCF/LBCF workarounds correctly on XeHP SDV/PVC/DG2 (Matt R)
- Implement Wa_1606376872 for Xe_LP (Gustavo)
- Consider GSI offset when doing MCR lookups on Meteorlake+ (Matt R.)
- Add engine TLB invalidation for Meteorlake (Matt R.)
- Fix GSC Driver-FLR completion on Meteorlake (Alan)
- Fix GSC races on driver load/unload on Meteorlake+ (Daniele)
- Disable MC6 for MTL A step (Badal)
- Consolidate TLB invalidation flow (Tvrtko)
- Improve debug GuC/HuC debug messages (Michal Wa., John)
- Move fd_install after last use of fence (Rob)
- Initialize the obj flags for shmem objects (Aravind)
- Fix missing debug object activation (Nirmoy)
- Probe lmem before the stolen portion (Matt A)
- Improve clean up of GuC busyness stats worker (John)
- Fix missing return code checks in GuC submission init (John)
- Annotate two more workaround/tuning registers as MCR on PVC (Matt R)
- Fix GEN8_MISCCPCTL definition and remove unused INF_UNIT_LEVEL_CLKGATE (Lucas)
- Use sysfs_emit() and sysfs_emit_at() (Nirmoy)
- Make kobj_type structures constant (Thomas W.)
- make kobj attributes const on gt/ (Jani)
- Remove the unused virtualized start hack on buddy allocator (Matt A)
- Remove redundant check for DG1 (Lucas)
- Move DG2 tuning to the right function (Lucas)
- Rename dev_priv to i915 for private data naming consistency in gt/ (Andi)
- Remove unnecessary whitelisting of CS_CTX_TIMESTAMP on Xe_HP platforms (Matt R.)
-
- Escape wildcard in method names in kerneldoc (Bagas)
- Selftest improvements (Chris, Jonathan, Tvrtko, Anshuman, Tejas)
- Fix sparse warnings (Jani)
[airlied: fix unused variable in intel_workarounds]
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZBMSb42yjjzczRhj@jlahtine-mobl.ger.corp.intel.com
Due to holidays we started -next with more -fixes in-flight than
usual, and people have been asking where they are. Backmerge to get
things better in sync.
Conflicts:
- Tiny conflict in drm_fbdev_generic.c between variable rename and
missing error handling that got added.
- Conflict in drm_fb_helper.c between the added call to vgaswitcheroo
in drm_fb_helper_single_fb_probe and a refactor patch that extracted
lots of helpers and incidentally removed the dev local variable.
Readd it to make things compile.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The definition of intel_selftest_modify_policy() does not match the
declaration, as gcc-13 points out:
drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:29:5: error: conflicting types for 'intel_selftest_modify_policy' due to enum/integer mismatch; have 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, u32)' {aka 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, unsigned int)'} [-Werror=enum-int-mismatch]
29 | int intel_selftest_modify_policy(struct intel_engine_cs *engine,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:11:
drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h:28:5: note: previous declaration of 'intel_selftest_modify_policy' with type 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, enum selftest_scheduler_modify)'
28 | int intel_selftest_modify_policy(struct intel_engine_cs *engine,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change the type in the definition to match.
Fixes: 617e87c05c ("drm/i915/selftest: Fix hangcheck self test for GuC submission")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117163743.1003219-1-arnd@kernel.org
(cherry picked from commit 8d7eb8ed3f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The definition of intel_selftest_modify_policy() does not match the
declaration, as gcc-13 points out:
drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:29:5: error: conflicting types for 'intel_selftest_modify_policy' due to enum/integer mismatch; have 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, u32)' {aka 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, unsigned int)'} [-Werror=enum-int-mismatch]
29 | int intel_selftest_modify_policy(struct intel_engine_cs *engine,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:11:
drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h:28:5: note: previous declaration of 'intel_selftest_modify_policy' with type 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, enum selftest_scheduler_modify)'
28 | int intel_selftest_modify_policy(struct intel_engine_cs *engine,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change the type in the definition to match.
Fixes: 617e87c05c ("drm/i915/selftest: Fix hangcheck self test for GuC submission")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117163743.1003219-1-arnd@kernel.org
There is an impedance mismatch between the scatterlist API using unsigned
int and our memory/page accounting in unsigned long. That is we may try
to create a scatterlist for a large object that overflows returning a
small table into which we try to fit very many pages. As the object size
is under the control of userspace, we have to be prudent and catch the
conversion errors.
To catch the implicit truncation we check before calling scattterlist
creation Apis. we use overflows_type check and report E2BIG if the
overflows may raise. When caller does not return errno, use WARN_ON to
report a problem.
This is already used in our create ioctls to indicate if the uABI request
is simply too large for the backing store. Failing that type check,
we have a second check at sg_alloc_table time to make sure the values
we are passing into the scatterlist API are not truncated.
v2: Move added i915_utils's macro into drm_util header (Jani N)
v5: Fix macros to be enclosed in parentheses for complex values
Fix too long line warning
v8: Replace safe_conversion() with check_assign() (Kees)
v14: Remove shadowing macros of scatterlist creation api and fix to
explicitly overflow check where the scatterlist creation APIs are
called. (Jani)
v15: Add missing returning of error code when the WARN_ON() has been
detected. (Jani)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221228192252.917299-3-gwan-gyeong.mun@intel.com
We already wrap i915_vma.node.start for use with the GGTT, as there we
can perform additional sanity checks that the node belongs to the GGTT
and fits within the 32b registers. In the next couple of patches, we
will introduce guard pages around the objects _inside_ the drm_mm_node
allocation. That is we will offset the vma->pages so that the first page
is at drm_mm_node.start + vma->guard (not 0 as is currently the case).
All users must then not use i915_vma.node.start directly, but compute
the guard offset, thus all users are converted to use a
i915_vma_offset() wrapper.
The notable exceptions are the selftests that are testing exact
behaviour of i915_vma_pin/i915_vma_insert.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com
On XE_LPM+ platforms the media engines are carved out into a separate
GT but have a common GGTMMADR address range which essentially makes
the GGTT address space to be shared between media and render GT. As a
result any updates in GGTT shall invalidate TLB of GTs sharing it and
similarly any operation on GGTT requiring an action on a GT will have to
involve all GTs sharing it. setup_private_pat was being done on a per
GGTT based as that doesn't touch any GGTT structures moved it to per GT
based.
BSPEC: 63834
v2:
1. Add details to commit msg
2. includes fix for failure to add item to ggtt->gt_list, as suggested
by Lucas
3. as ggtt_flush() is used only for ggtt drop i915_is_ggtt check within
it.
4. setup_private_pat moved out of intel_gt_tiles_init
v3:
1. Move out for_each_gt from i915_driver.c (Jani Nikula)
v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list
(Matt Roper)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221122070126.4813-1-aravind.iddamsetty@intel.com
We rely on page_sizes.sg in setup_scratch_page() reporting the correct
value if the underlying sgl is not contiguous, however in
get_pages_internal() we are only looking at the layout of the created
pages when calculating the sg_page_sizes, and not the final sgl, which
could in theory be completely different. In such a situation we might
incorrectly think we have a 64K scratch page, when it is actually only
4K or similar split over multiple non-contiguous entries, which could
lead to broken behaviour when touching the scratch space within the
padding of a 64K GTT page-table. For most of the other backends we
already just call i915_sg_dma_sizes() on the final mapping, so rather
just move that into __i915_gem_object_set_pages() to avoid such issues
coming back to bite us later.
v2: Update missing conversion in gvt
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108103238.165447-1-matthew.auld@intel.com
Driver Changes:
- Fix for #7306: [Arc A380] white flickering when using arc as a
secondary gpu (Matt A)
- Add Wa_18017747507 for DG2 (Wayne)
- Avoid spurious WARN on DG1 due to incorrect cache_dirty flag
(Niranjana, Matt A)
- Corrections to CS timestamp support for Gen5 and earlier (Ville)
- Fix a build error used with clang compiler on hwmon (GG)
- Improvements to LMEM handling with RPM (Anshuman, Matt A)
- Cleanups in dmabuf code (Mike)
- Selftest improvements (Matt A)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com
The goal in launching the request smoketest is to have sufficient tasks
running across the system such that we are likely to detect concurrency
issues. We aim to have 2 tasks using the same engine, gt, device (each
level of locking around submission and signaling) running at the same
time. While tasks may not be running all the time as they synchronise
with the gpu, they will be running most of the time, in which case
having many more tasks than cores available is wasteful (and
dramatically increases the workload causing excess runtime). Aim to
limit the number of tasks such that there is at least 2 running per
engine, spreading surplus cores around the engines (rather than running
a task per core per engine.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Tested-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102155709.31717-1-nirmoy.das@intel.com
Since a7c01fa93a ("signal: break out of wait loops on kthread_stop()")
kthread_stop() started asserting a pending signal which wreaks havoc with
a few of our selftests. Mainly because they are not fully expecting to
handle signals, but also cutting the intended test runtimes short due
signal_pending() now returning true (via __igt_timeout), which therefore
breaks both the patterns of:
kthread_run()
..sleep for igt_timeout_ms to allow test to exercise stuff..
kthread_stop()
And check for errors recorded in the thread.
And also:
Main thread | Test thread
---------------+------------------------------
kthread_run() |
kthread_stop() | do stuff until __igt_timeout
| -- exits early due signal --
Where this kthread_stop() was assume would have a "join" semantics, which
it would have had if not the new signal assertion issue.
To recap, threads are now likely to catch a previously impossible
ERESTARTSYS or EINTR, marking the test as failed, or have a pointlessly
short run time.
To work around this start using kthread_work(er) API which provides
an explicit way of waiting for threads to exit. And for cases where
parent controls the test duration we add explicit signaling which threads
will now use instead of relying on kthread_should_stop().
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020130841.3845791-1-tvrtko.ursulin@linux.intel.com
It turns out that on production DG2/ATS HW we should have support for
PS64. This feature allows to provide a 64K TLB hint at the PTE level,
which is a lot more flexible than the current method of enabling 64K GTT
pages for the entire page-table, since that leads to all kinds of
annoying restrictions, as documented in:
commit caa574ffc4
Author: Matthew Auld <matthew.auld@intel.com>
Date: Sat Feb 19 00:17:49 2022 +0530
drm/i915/uapi: document behaviour for DG2 64K support
On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.
With PS64, we can now drop the 2M GTT alignment restriction, and instead
only require 64K or larger when dealing with lmem. We still use the
compact-pt layout when possible, but only when we are certain that this
doesn't interfere with userspace.
Note that this is a change in uAPI behaviour, but hopefully shouldn't be
a concern (IGT is at least able to autodetect the alignment), since we
are only making the GTT alignment constraint less restrictive.
Based on a patch from CQ Tang.
v2: update the comment wrt scratch page
v3: (Nirmoy)
- Fix the selftest to actually use the random size, plus some comment
improvements, also drop the rem stuff.
Reported-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Yang A Shi <yang.a.shi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221004114915.221708-1-matthew.auld@intel.com
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Daniele needs 84d4333c1e ("misc/mei: Add NULL check to component match
callback functions") in order to merge the DG2 HuC patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>