Driver Changes:
- Avoid infinite GPU waits by avoidin premature release of request's
reusable memory (Chris, Janusz)
- Expose RPS thresholds in sysfs (Tvrtko)
- Apply GuC SLPC min frequency softlimit correctly (Vinay)
- Restore SLPC efficient freq earlier (Vinay)
- Consider OA buffer boundary when zeroing out reports (Umesh)
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL (Matt R)
- Fix context workarounds with non-masked regs on MTL/DG2 (Lucas)
- Enable the CCS_FLUSH bit in the pipe control and in the CS for MTL+ (Andi)
- Update MTL workarounds 14018778641, 22016122933 (Tejas, Zhanjun)
- Ensure memory quiesced before AUX CCS invalidation (Jonathan)
- Add a gsc_info debugfs (Daniele)
- Invalidate the TLBs on each GT on multi-GT device (Chris)
- Fix a VMA UAF for multi-gt platform (Nirmoy)
- Do not use stolen on MTL due to HW bug (Nirmoy)
- Check HuC and GuC version compatibility on MTL (Daniele)
- Dump perf_limit_reasons for slow GuC init debug (Vinay)
- Replace kmap() with kmap_local_page() (Sumitra, Ira)
- Add sentinel to xehp_oa_b_counters for KASAN (Andrzej)
- Add the gen12_needs_ccs_aux_inv helper (Andi)
- Fixes and updates for GSC memory allocation (Daniele)
- Fix one wrong caching mode enum usage (Tvrtko)
- Fixes for GSC wakeref (Alan)
- Static checker fixes (Harshit, Arnd, Dan, Cristophe, David, Andi)
- Rename flags with bit_group_X according to the datasheet (Andi)
- Use direct alias for i915 in requests (Andrzej)
- Replace i915->gt0 with to_gt(i915) (Andi)
- Use the i915_vma_flush_writes helper (Tvrtko)
- Selftest improvements (Alan)
- Remove dead code (Tvrtko)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# Conflicts:
# drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMy6kDd9npweR4uy@jlahtine-mobl.ger.corp.intel.com
UAPI Changes:
- I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake.
Driver Changes:
Fixes/improvements/new stuff:
- Use large rings for compute contexts (Chris Wilson)
- Better logging/debug of unexpected GuC communication issues (Michal Wajdeczko)
- Clear out entire reports after reading if not power of 2 size (Ashutosh Dixit)
- Limit lmem allocation size to succeed on SmallBars (Andrzej Hajda)
- perf/OA capture robustness improvements on DG2 (Umesh Nerlige Ramappa)
- Fix error code in intel_gsc_uc_heci_cmd_submit_nonpriv() (Dan Carpenter)
Future platform enablement:
- Add workaround 14016712196 (Tejas Upadhyay)
- HuC loading for MTL (Daniele Ceraolo Spurio)
- Allow user to set cache at BO creation (Fei Yang)
Miscellaneous:
- Use system include style for drm headers (Jani Nikula)
- Drop legacy CTB definitions (Michal Wajdeczko)
- Turn off the timer to sample frequencies when GT is parked (Ashutosh Dixit)
- Make PMU sample array two-dimensional (Ashutosh Dixit)
- Use the correct error value when kernel_context() fails (Andi Shyti)
- Fix second parameter type of pre-gen8 pte_encode callbacks (Nathan Chancellor)
- Fix parameter in gmch_ggtt_insert_{entries, page}() (Nathan Chancellor)
- Fix size_t format specifier in gsccs_send_message() (Nathan Chancellor)
- Use the fdinfo helper (Tvrtko Ursulin)
- Add some missing error propagation (Tvrtko Ursulin)
- Reduce I915_MAX_GT to 2 (Matt Atwood)
- Rename I915_PMU_MAX_GTS to I915_PMU_MAX_GT (Matt Atwood)
- Remove some obsolete definitions (John Harrison)
Merges:
- Merge drm/drm-next into drm-intel-gt-next (Tvrtko Ursulin)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZIH09fqe5v5yArsu@tursulin-desk
MTL introduces a new way to instruct the PUnit with
power and bandwidth requirements of DE. Add the functionality
to program the registers and handle waits using interrupts.
The current wait time for timeouts is programmed for 10 msecs to
factor in the worst case scenarios. Changes made to use REG_BIT
for a register that we touched(GEN8_DE_MISC_IER _MMIO).
Wa_14016740474 is added which applies to Xe_LPD+ display
v2: checkpatch warning fixes, simplify program pmdemand part
v3: update to dbufs and pipes values to pmdemand register(stan)
Removed the macro usage in update_pmdemand_values()
v4: move the pmdemand_pre_plane_update before cdclk update
pmdemand_needs_update included cdclk params comparisons
pmdemand_state NULL check (Gustavo)
pmdemand.o in sorted order in the makefile (Jani)
update pmdemand misc irq handler loop (Gustavo)
active phys bitmask and programming correction (Gustavo)
v5: simplify pmdemand_state structure
simplify methods to find active phys and max port clock
Timeout in case of previou pmdemand task pending (Gustavo)
v6: rebasing
updates to max_ddiclk calculations (Gustavo)
updates to active_phys count method (Gustavo)
v7: use two separate loop to iterate throug old and new
crtc states to calculate the active phys (Gustavo)
v8: use uniform function names (Gustavo)
v9: For phys change iterate through connectors (Imre)
Look for change in phys for pmdemand update (Gustavo, Imre)
Some more stlying changes (Imre)
Update pmdemand state during HW readout/sanitize (Imre)
v10: Fix CI checkpatch warnings
v11: use correct pmdemand object pointer during hw readout,
simplify the check for phys need update (Gustavo)
v12: Handle possible non serialize cases (Imre)
Initialise also pmdemand params HW readout (Imre)
Update active phys mask during sanitize calls (Imre)
Check TC/encoder changes to limit connector update (Imre)
v13: Check display version before accessing pmdemand functions
v14: Move is_serialized to intel_global_state.c
simplify update params and other stlying issues (Imre)
Bspec: 66451, 64636, 64602, 64603
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> #v4
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com> #v11
Reviewed-by: Imre Deak <imre.deak@intel.com>
[RK: Fixed minor typo in one of the comments. s/qclck_gc/qclk_gv/]
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230606201032.347449-1-vinod.govindapillai@intel.com
Before we add the second step of the MTL HuC auth (via GSC), we need to
have the ability to differentiate between them. To do so, the huc
authentication check is duplicated for GuC and GSC auth, with
GSC-enabled binaries being considered fully authenticated only after
the GSC auth step.
To report the difference between the 2 auth steps, a new case is added
to the HuC getparam. This way, the clear media driver can start
submitting before full auth, as partial auth is enough for those
workloads.
v2: fix authentication status check for DG2
v3: add a better comment at the top of the HuC file to explain the
different approaches to load and auth (John)
v4: update call to intel_huc_is_authenticated in the pxp code to check
for GSC authentication
v5: drop references to meu and esclamation mark in huc_auth print (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> #v2
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-5-daniele.ceraolospurio@intel.com
VLV has a so called "wide gamut color correction" unit (WGC).
What it is is a 3x3 matrix similar to the later CHV CGM
CSC, with less precisions/range. In fact CHV also has the WGC
but using it there doesn't really make sense when you have the
superior CGM CSC around.
Hook up the necessary stuff to expose the WGC as the CTM
crtc property.
One additional crazy idea that came to mind would be to use
the WGC as an output CSC on CHV for YCbCr output. But it
would be incompatible with the legacy LUT usage. In fact
since the WGC lacks post-offsets we'd probably have to
use the legacy LUT to do that final part of the RGB->YCbCr
conversion. Sounds doable, but perhaps not worth the hassle.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230413164916.4221-6-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Rather than selecting the display IP and feature flags at the same time
the general PCI probing happens, move this step into the display code
itself so that it can be more easily re-used outside of i915 (i.e., by
the Xe driver).
v2:
- Make intel_display_device_probe() always return a non-NULL pointer
and simplify copying of runtime_defaults. (Andrzej)
v3:
- Redefine INTEL_VGA_DEVICE/INTEL_QUANTA_DEVICE to eliminate a cast and
an include of linux/mod_devicetable.h. (Jani)
- Keep explicit memcpy for runtime defaults. (Jani)
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230523195609.73627-5-matthew.d.roper@intel.com
Display to communicate display pipe count/CDCLK/voltage configuration
to Pcode for more accurate power accounting for DG2.
Existing sequence is only sending the voltage value to the Pcode.
Adding new sequence with current cdclk associate with voltage value masking.
Adding pcode request when any pipe power well will disable or enable.
v2: - Make intel_cdclk_need_serialize static to make CI compiler happy.
v3: - Removed redundant return(Jani Nikula)
- Changed intel_cdclk_power_usage_to_pcode_(pre|post)_notification to be
static and also naming to intel_cdclk_pcode_(pre|post)_notify(Jani Nikula)
- Changed u8 to be u16 for cdclk parameter in intel_pcode_notify function,
as according to BSpec it requires 10 bits(Jani Nikula)
- Replaced dev_priv's with i915's(Jani Nikula)
- Simplified expression in intel_cdclk_need_serialize(Jani Nikula)
- Removed redundant kernel-doc and indentation(Jani Nikula)
v4: - Fixed some checkpatch warnings
v5: - According to HW team comments that change should affect only DG2,
fix correspodent platform check to account this.
v6: - Added one more missing IS_DG2 check(Vinod Govindapillai)
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230504093959.12085-1-stanislav.lisovskiy@intel.com
Define and use the bitmasks for the x/y components
of the skl+ scaler window pos/size registers.
We stick to the full 16 bits mask here even though the
hardware limits are actually lower. The current (ADL)
hardware maximums are in fact: 14 bits for X size, 13 bits
for X pos, 13 bits for Y size/pos. Yes, that is correct,
X pos has less bits than the X size for some reason. But
that doesn't actually matter for now as we don't currently
even support such wide displays without the use of bigjoiner
(due to max plane width limit).
v2: Switch back to full 16bit masks since that's what
we use transcoder timign regs and PIPESRC as well
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230426135019.7603-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Define and use the bitmasks for the x/y components
of the ilk+ panel filter window pos/size registers.
Note that we stick to the full 16 bit mask even though
the actual hardware limits are lower (and somewhat
platform dependent). BDW is actually limited to
13 bits horizontal and 12 bits vertical, with the high
bits being hardwired to zero. HSW should have the same
limits as BDW. And pre-HSW should be limited to 12bits
in both directions as that's already the limit of the
transcoder timing registers. Curiously on HSW and earlier
platforms all 16 bits can actually be set, but presumably
the hardware ignores the high bits.
v2: Switch back to full 16bit masks since that's what
we use transcoder timign regs and PIPESRC as well
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230426135019.7603-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Define more of the PSR mask bits, and describe in detail
what some of them do. Even if we don't set them all from
the driver they can be very useful during PSR debugging.
Having to trawl through bspec every time to find them is
not fun, and re-reverse engineering the behaviour every
time is time consuming (even if a bit more fun than spec
trawling).
v2: Moar bits
Put the description into a comment to be easily available
v2: Fix the BDW_UNMASK_VBL_TO_REGS_IN_SRD/HSW_UNMASK_VBL_TO_REGS_IN_SRD
description
Rebase due to intel_psr_regs.h
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230411191429.29895-6-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
The differences between MTL and TGL DP sequences are big enough to
MTL have its own functions.
Also it is much easier to follow MTL sequences against spec with
its own functions.
One change worthy to mention is the move of
'intel_display_power_get(dev_priv, dig_port->ddi_io_power_domain)'.
This call is not necessary for MTL but we have _put() counter part in
intel_ddi_post_disable_dp() that needs to balanced.
We could add a display version check on it but instead here it is
moving it to intel_ddi_pre_enable_dp() so it is executed for all
platforms in a single place and this will not cause any harm in MTL
and newer platforms.
v2:
- Fix logic to wait for buf idle.
- Use the right register to wait for ddi active.(RK)
v3:
- Increase wait timeout for ddi buf active (Mika)
v4:
- Increase idle timeout for ddi buf idle (Mika)
v5: use rmw in mtl_disable_ddi_buf. Donot clear
link training mask(Imre)
BSpec: 65448 65505
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Satyeshwar Singh <satyeshwar.singh@intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230413212443.1504245-7-radhakrishna.sripada@intel.com
XELPDP has C10 and C20 phys from Synopsys to drive displays. Each phy
has a dedicated PIPE 5.2 Message bus for configuration. This message
bus is used to configure the phy internal registers.
XELPDP has C10 phys to drive output to the EDP and the native output
from the display engine. Add structures, programming hardware state
readout logic. Port clock calculations are similar to DG2. Use the DG2
formulae to calculate the port clock but use the relevant pll signals.
Note: PHY lane 0 is always used for PLL programming.
Add sequences for C10 phy enable/disable phy lane reset,
powerdown change sequence and phy lane programming.
Bspec: 64539, 64568, 64599, 65100, 65101, 65450, 65451, 67610, 67636
v2: Squash patches related to C10 phy message bus and pll
programming support (Jani)
Move register definitions to a new file i.e. intel_cx0_reg_defs.h (Jani)
Move macro definitions (Jani)
DP rates as separate patch (Jani)
Spin out xelpdp register definitions into a separate file (Jani)
Replace macro to select registers based on phy lane with
function calls (Jani)
Fix styling issues (Jani)
Call XELPDP_PORT_P2M_MSGBUS_STATUS() with port instead of phy (Lucas)
v3: Move clear request flag into try-loop
v4: On PHY idle change drm_err_once() as drm_dbg_kms() (Jani)
use __intel_de_wait_for_register() instead of __intel_wait_for_register
and uncomment intel_uncore.h (Jani)
Add DP-alt support for PHY lane programming (Khaled)
v4: Add tx and cmn on c10mpllb_state (Imre)
Add missing waits for pending transactions between two message bus
writes (Imre)
General cleanups and simplifications (Imre)
v5: Few nit cleanups from rev4 (imre)
s/dev_priv/i915/ , s/c10mpllb/c10pll/ (RK)
Rebase
v6: Move the mtl code from intel_c10pll_calc_port_clock to mtl function
Fix typo in comment for REG_FIELD_PREP8 definition(Imre)
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v4)
Link: https://patchwork.freedesktop.org/patch/msgid/20230413212443.1504245-4-radhakrishna.sripada@intel.com
- Fix DPT+shmem combo and add i915.enable_dpt modparam (Ville)
- i915.enable_sagv module parameter (Ville)
- Correction to QGV related register addresses (Vinod)
- IPS debugfs per-crtc and new file for false_color (Ville)
- More clean-up and reorganization of Display code (Jani)
- DP DSC related fixes and improvements (Stanislav, Ankit, Suraj, Swati)
- Make utility pin asserts more accurate (Ville)
- Meteor Lake enabling (Daniele)
- High refresh rate PSR fixes (Jouni)
- Cursor and Plane chicken register fixes (Ville)
- Align the ADL-P TypeC sequences with hardware specification (Imre)
- Documentation build fixes and improvements to catch bugs earlier (Lee, Jani)
- PL1 power limit hwmon entry changed to use 0 as disabled state (Ashutosh)
- DP aux sync fix and improvements (Ville)
- DP MST fixes and w/a (Stanislav)
- Limit PXP drm-errors or warning on firmware API failures (Alan)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZC7RR3Laet8ywHRo@intel.com
UAPI Changes:
- (Build-time only, should not have any impact)
drm/i915/uapi: Replace fake flex-array with flexible-array member
"Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead."
This is on core kernel request moving towards GCC 13.
Driver Changes:
- Fix context runtime accounting on sysfs fdinfo for heavy workloads (Tvrtko)
- Add support for OA media units on MTL (Umesh)
- Add new workarounds for Meteorlake (Daniele, Radhakrishna, Haridhar)
- Fix sysfs to read actual frequency for MTL and Gen6 and earlier
(Ashutosh)
- Synchronize i915/BIOS on C6 enabling on MTL (Vinay)
- Fix DMAR error noise due to GPU error capture (Andrej)
- Fix forcewake during BAR resize on discrete (Andrzej)
- Flush lmem contents after construction on discrete (Chris)
- Fix GuC loading timeout on systems where IFWI programs low boot
frequency (John)
- Fix race condition UAF in i915_perf_add_config_ioctl (Min)
- Sanitycheck MMIO access early in driver load and during forcewake
(Matt)
- Wakeref fixes for GuC RC error scenario and active VM tracking (Chris)
- Cancel HuC delayed load timer on reset (Daniele)
- Limit double GT reset to pre-MTL (Daniele)
- Use i915 instead of dev_priv insied the file_priv structure (Andi)
- Improve GuC load error reporting (John)
- Simplify VCS/BSD engine selection logic (Tvrtko)
- Perform uc late init after probe error injection (Andrzej)
- Fix format for perf_limit_reasons in debugfs (Vinay)
- Create per-gt debugfs files (Andi)
- Documentation and kerneldoc fixes (Nirmoy, Lee)
- Selftest improvements (Fei, Jonathan)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZC6APj/feB+jBf2d@jlahtine-mobl.ger.corp.intel.com
The WA states that we need to alert the GSC FW before doing a GSC engine
reset and then wait for 200ms. The GuC owns engine reset, so on the i915
side we only need to apply this for full GT reset.
Given that we do full GT resets in the resume paths to cleanup the HW
state and that a long wait in those scenarios would not be acceptable,
a faster path has been introduced where, if the GSC is idle, we try first
to individually reset the GuC and all engines except the GSC and only fall
back to full reset if that fails.
Note: according to the WA specs, if the GSC is idle it should be possible
to only wait for the uC wakeup time (~15ms) instead of the whole 200ms.
However, the GSC FW team have mentioned that the wakeup time can change
based on other things going on in the HW and pcode, so a good security
margin would be required. Given that when the GSC is idle we already
skip the wait & reset entirely and that this reduced wait would still
likely be too long to use in resume paths, it's not worth adding support
for this reduced wait.
v2: add comment to explain why it is safe to skip the GSC reset (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323231857.2194435-2-daniele.ceraolospurio@intel.com