The page aligned surface address calculation needs to know which way
things are rotated. The contract now says that the caller must pass the
rotate x/y coordinates, as well as the tile_height aligned stride in
the tile_height direction. This will make it fairly simple to deal with
90/270 degree rotation on SKL+ where we have to deal with the rotated
view into the GTT.
v2: Pass rotation instead of bool even thoughwe only care about 0/180 vs. 90/270
v3: Introduce intel_tile_dims(), and don't mix up different units so much
v4: Unconfuse bytes vs. pixels even more
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Assorted changes in the areas of code cleanup, reduction of
invariant conditional in the interrupt handler and lock
contention and MMIO access optimisation.
* Remove needless initialization.
* Improve cache locality by reorganizing code and/or using
branch hints to keep unexpected or error conditions out
of line.
* Favor busy submit path vs. empty queue.
* Less branching in hot-paths.
v2:
* Avoid mmio reads when possible. (Chris Wilson)
* Use natural integer size for csb indices.
* Remove useless return value from execlists_update_context.
* Extract 32-bit ppgtt PDPs update so it is out of line and
shared with two callers.
* Grab forcewake across all mmio operations to ease the
load on uncore lock and use chepear mmio ops.
v3:
* Removed some more pointless u8 data types.
* Removed unused return from execlists_context_queue.
* Commit message updates.
v4:
* Unclumsify the unqueue if statement. (Chris Wilson)
* Hide forcewake from the queuing function. (Chris Wilson)
Version 3 now makes the irq handling code path ~20% smaller on
48-bit PPGTT hardware, and a little bit less elsewhere. Hot
paths are mostly in-line now and hammering on the uncore
spinlock is greatly reduced together with mmio traffic to an
extent.
Benchmarking with "gem_latency -n 100" (keep submitting
batches with 100 nop instruction) shows approximately 4% higher
throughput, 2% less CPU time and 22% smaller latencies. This was
on a big-core while small-cores could benefit even more.
Most likely reason for the improvements are the MMIO
optimization and uncore lock traffic reduction.
One odd result is with "gem_latency -n 0" (dispatching empty
batches) which shows 5% more throughput, 8% less CPU time,
25% better producer and consumer latencies, but 15% higher
dispatch latency which is yet unexplained.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1456505912-22286-1-git-send-email-tvrtko.ursulin@linux.intel.com
In addition to calculating final watermarks, let's also pre-calculate a
set of intermediate watermark values at atomic check time. These
intermediate watermarks are a combination of the watermarks for the old
state and the new state; they should satisfy the requirements of both
states which means they can be programmed immediately when we commit the
atomic state (without waiting for a vblank). Once the vblank does
happen, we can then re-program watermarks to the more optimal final
value.
v2: Significant rebasing/rewriting.
v3:
- Move 'need_postvbl_update' flag to CRTC state (Daniel)
- Don't forget to check intermediate watermark values for validity
(Maarten)
- Don't due async watermark optimization; just do it at the end of the
atomic transaction, after waiting for vblanks. We do want it to be
async eventually, but adding that now will cause more trouble for
Maarten's in-progress work. (Maarten)
- Don't allocate space in crtc_state for intermediate watermarks on
platforms that don't need it (gen9+).
- Move WaCxSRDisabledForSpriteScaling:ivb into intel_begin_crtc_commit
now that ilk_update_wm is gone.
v4:
- Add a wm_mutex to cover updates to intel_crtc->active and the
need_postvbl_update flag. Since we don't have async yet it isn't
terribly important yet, but might as well add it now.
- Change interface to program watermarks. Platforms will now expose
.initial_watermarks() and .optimize_watermarks() functions to do
watermark programming. These should lock wm_mutex, copy the
appropriate state values into intel_crtc->active, and then call
the internal program watermarks function.
v5:
- Skip intermediate watermark calculation/check during initial hardware
readout since we don't trust the existing HW values (and don't have
valid values of our own yet).
- Don't try to call .optimize_watermarks() on platforms that don't have
atomic watermarks yet. (Maarten)
v6:
- Rebase
v7:
- Further rebase
v8:
- A few minor indentation and line length fixes
v9:
- Yet another rebase since Maarten's patches reworked a bunch of the
code (wm_pre, wm_post, etc.) that this was previously based on.
v10:
- Move wm_mutex to dev_priv to protect against racing commits against
disjoint CRTC sets. (Maarten)
- Drop unnecessary clearing of cstate->wm.need_postvbl_update (Maarten)
v11:
- Now that we've moved to atomic watermark updates, make sure we call
the proper function to program watermarks in
{ironlake,haswell}_crtc_enable(); the failure to do so on the
previous patch iteration led to us not actually programming the
watermarks before turning on the CRTC, which was the cause of the
underruns that the CI system was seeing.
- Fix inverted logic for determining when to optimize watermarks. We
were needlessly optimizing when the intermediate/optimal values were
the same (harmless), but not actually optimizing when they differed
(also harmless, but wasteful from a power/bandwidth perspective).
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456276813-5689-1-git-send-email-matthew.d.roper@intel.com
Elsewhere we have adopted the convention of using '_link' to denote
elements in the list (and '_list' for the actual list_head itself), and
that the name should indicate which list the link belongs to (and
preferrably not just where the link is being stored).
s/vma_link/obj_link/ (we iterate over obj->vma_list)
s/mm_list/vm_link/ (we iterate over vm->[in]active_list)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Currently we perform our own wait in post_plane_update,
but the atomic core performs another one in wait_for_vblanks.
This means that 2 vblanks are done when a fb is changed,
which is a bit overkill.
Merge them by creating a helper function that takes a crtc mask
for the planes to wait on.
The broadwell vblank workaround may look gone entirely but this is
not the case. pipe_config->wm_changed is set to true
when any plane is turned on, which forces a vblank wait.
Changes since v1:
- Removing the double vblank wait on broadwell moved to its own commit.
Changes since v2:
- Move out POWER_DOMAIN_MODESET handling to its own commit.
Changes since v3:
- Do not wait for vblank on legacy cursor updates. (Ville)
- Move broadwell vblank workaround comment to page_flip_finished. (Ville)
Changes since v4:
- Compile fix, legacy_cursor_flip -> *_update.
Changes since v5:
- Kill brackets.
- Add WARN_ON when wait_for_vblanks fails.
- Remove extra newlines.
- Split the checks whether vblank is needed to a separate function,
with comments why a vblank is needed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56CD84DA.5030507@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
After we've told the irq code we don't want to handle display irqs
anymore, we must make sure any display irq handling already
kicked off has finished before we actually turn off the power well.
I wouldn't expect PIPESTAT based interrupts to occur anymore since
vblanks/page flips/gmbus/etc should all be quiescent at this point.
But at least hotplug interrupts could still occur. Hotplug
interrupts may also kick off the workqueue based hotplug processing,
but that code should take the required power domain references
itself, so there shouldn't be any need to synchronize with the
hotplug processing from the power well code.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455900112-15387-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
These platforms should be fine now.
FBC can allow very significant power savings for screen-on idle
systems, but it is worth mentioning that a lot of people won't get
significant power savings by enabling this feature because they may
have something else preventing the system from getting into the
deepest sleep states. Examples may include a hungry wifi device or a
max_performance SATA link power management policy. You can check your
PC state residencies on the powertop "Idle stats" tab. I recommend
trying to run "sudo powertop --auto-tune" and then seeing if the
residencies improve.
Oh, and in case you - the person reading this commit message - found
this commit through git bisect, please do the following:
- Check your dmesg and see if there are error messages mentioning
underruns around the time your problem started happening.
- Download intel-gpu-tools, compile it, and run:
$ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
Then send us the fbc.txt file, especially if you get a failure.
This will really maximize your chances of getting the bug fixed
quickly.
- Try to find a reliable way to reproduce the problem, and tell us.
- Boot with drm.debug=0xe, reproduce the problem, then send us the
dmesg file.
v2: Don't enable by default on SKL.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455655643-2535-1-git-send-email-paulo.r.zanoni@intel.com
With a reliable frontbuffer tracking and all instability corner cases
on Haswell and Broadwell solved let's re-enabled PSR by default on
these platforms.
In case a new issue is found and PSR is the main suspect, please check
if i915.enable_psr=0 really makes your problem go away. If this is the case
PSR is the culprit so after that please check if i915.enable_psr=2
or i915.enable_psr=3 solves your issue and please let us know.
There are many panels out there and not all implementations apparently
work as we would expect.
In case you needed to force it on standby or disabled or in case of any
PSR related bug please report it at bugs.freedesktop.org.
In a bugzilla entry for PSR is desirable:
- dmesg (drm.debug=0xe)
- output of /sys/kernel/debug/dri/0/i915_edp_psr_status
- Platform information. Vendor, model, id, pci id.
- Graphical environment: Gnome, KDE, openbox, etc...
- Details how to reproduce.
- Also good if you could run PSR test cases of Intel-gpu-tools
- Please mention if forcing main link standby or main link off helps you.
There are Intel-gpu-tools test cases that can be helpful to
determine if PSR is working as expected:
kms_psr_sink_crc and kms_psr_frontbuffer_tracking.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455278893-1307-2-git-send-email-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With a reliable frontbuffer tracking and all instability corner cases
solved for this platform let's re-enabled PSR by default.
In case a new issue is found and PSR is the main suspect, please check
if i915.enable_psr=0 really makes your problem go away,
please report it at bugs.freedesktop.org.
In a bugzilla entry for PSR is desirable:
- dmesg (drm.debug=0xe)
- output of /sys/kernel/debug/dri/0/i915_edp_psr_status
- Platform information. Vendor, model, id, pci id.
- Graphical environment: Gnome, KDE, openbox, etc...
- Details how to reproduce.
- Also good if you could run PSR test cases of Intel-gpu-tools
- Please mention if forcing main link standby or main link off helps you.
There are Intel-gpu-tools test cases that can be helpful to
determine if PSR is working as expected:
kms_psr_sink_crc and kms_psr_frontbuffer_tracking.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will give us flexibility to enable PSR by default independently so
issues and corner cases in one platform won't affect others were we have
it working properly.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The assumption when adding the intel_display_power_is_enabled() checks
was that if it returns success the power can't be turned off afterwards
during the HW access, which is guaranteed by modeset locks. This isn't
always true, so make sure we hold a dedicated reference for the time of
the access.
While at it also add the missing reference around the HW access in
i915_interrupt_info().
v2:
- update the commit message mentioning that this also fixes the
HW access in the interrupt info debugfs entry (Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455296121-4742-9-git-send-email-imre.deak@intel.com