Commit Graph

95006 Commits

Author SHA1 Message Date
Dave Airlie 9a767faa94 Merge tag 'drm-msm-fixes-2023-07-27' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.5-rc4

Display:
+ Fix to correct the UBWC programming for decoder version 4.3 seen
  on SM8550
+ Add the missing flush and fetch bits for DMA4 and DMA5 SSPPs.
+ Fix to drop the unused dpu_core_perf_data_bus_id enum from the code
+ Drop the unused dsi_phy_14nm_17mA_regulators from QCM 2290 DSI cfg.

GPU:
+ Fix warn splat for newer devices without revn
+ Remove name/revn for a690.. we shouldn't be populating these for
  newer devices, for consistency, but it slipped through review
+ Fix a6xx gpu snapshot BINDLESS_DATA size (was listed in bytes
  instead of dwords, causing AHB faults on a6xx gen4/a660-family)
+ Disallow submit with fence id 0

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9MwCSfiyv8i7yWAsJKYEzCDyzaTx=ujX80Y23rZd9RA@mail.gmail.com
2023-07-28 11:59:14 +10:00
Dave Airlie 0dd9c514d2 amd-drm-fixes-6.5-2023-07-26:
amdgpu:
 - gfxhub partition fix
 - Fix error handling in psp_sw_init()
 - SMU13 fix
 - DCN 3.1 fix
 - DCN 3.2 fix
 - Fix for display PHY programming sequence
 - DP MST error handling fix
 - GFX 9.4.3 fix
 
 amdkfd:
 - GFX11 trap handling fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZMFpxwAKCRC93/aFa7yZ
 2BgxAQDI1hcwi2rb4rj2e1G5eU9KLjMPl0ybsEKmpllXWye2nwEAnmP0CFxtMpoR
 hzPkDpPG/kdNXIy0ekxwPqnkeCca9Ak=
 =RojP
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.5-2023-07-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.5-2023-07-26:

amdgpu:
- gfxhub partition fix
- Fix error handling in psp_sw_init()
- SMU13 fix
- DCN 3.1 fix
- DCN 3.2 fix
- Fix for display PHY programming sequence
- DP MST error handling fix
- GFX 9.4.3 fix

amdkfd:
- GFX11 trap handling fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230726184936.7812-1-alexander.deucher@amd.com
2023-07-28 11:55:32 +10:00
Rob Clark 1b5d0ddcb3 drm/msm: Disallow submit with fence id 0
A fence id of zero is expected to be invalid, and is not removed from
the fence_idr table.  If userspace is requesting to specify the fence
id with the FENCE_SN_IN flag, we need to reject a zero fence id value.

Fixes: 17154addc5 ("drm/msm: Add MSM_SUBMIT_FENCE_SN_IN")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/549180/
2023-07-26 10:50:04 -07:00
Lijo Lazar bc1688fce2 drm/amdgpu: Restore HQD persistent state register
On GFX v9.4.3, compute queue MQD is populated using the values in HQD
persistent state register. Hence don't clear the values on module
unload, instead restore it to the default reset value so that MQD is
initialized correctly during next module load. In particular, preload
flag needs to be set on compute queue MQD, otherwise it could cause
uninitialized values being used at device reset state resulting in EDC.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:26:38 -04:00
Dan Carpenter 38ac4e8385 drm/amd/display: Unlock on error path in dm_handle_mst_sideband_msg_ready_event()
This error path needs to unlock the "aconnector->handle_mst_msg_ready"
mutex before returning.

Fixes: 4f6d9e38c4 ("drm/amd/display: Add polling method to handle MST reply packet")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:23:51 -04:00
Leo Chen de612738e9 drm/amd/display: Exit idle optimizations before attempt to access PHY
[Why & How]
DMUB may hang when powering down pixel clocks due to no dprefclk.

It is fixed by exiting idle optimization before the attempt to access PHY.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Leo Chen <sancchen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:21:54 -04:00
Alvin Lee 4509e69a07 drm/amd/display: Don't apply FIFO resync W/A if rdivider = 0
[Description]
It is not valid to set the WDIVIDER value to 0, so do not
re-write to DISPCLK_WDIVIDER if the current value is 0
(i.e., it is at it's initial value and we have not made any
requests to change DISPCLK yet).

Reviewed-by: Saaem Rizvi <syedsaaem.rizvi@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:20:59 -04:00
George Shen 25b054c3c8 drm/amd/display: Guard DCN31 PHYD32CLK logic against chip family
[Why]
Current yellow carp B0 PHYD32CLK logic is incorrectly applied to other
ASICs.

[How]
Add guard to check chip family is yellow carp before applying logic.

Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:19:24 -04:00
Jane Jian 4a37c55b85 drm/amd/smu: use AverageGfxclkFrequency* to replace previous GFX Curr Clock
Report current GFX clock also from average clock value as the original
CurrClock data is not valid/accurate any more as per FW team

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:17:17 -04:00
Mario Limonciello c01aebeef3 drm/amd: Fix an error handling mistake in psp_sw_init()
If the second call to amdgpu_bo_create_kernel() fails, the memory
allocated from the first call should be cleared.  If the third call
fails, the memory from the second call should be cleared.

Fixes: b95b539168 ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:16:57 -04:00
Victor Lu 9beb223f2a drm/amdgpu: Fix infinite loop in gfxhub_v1_2_xcc_gart_enable (v2)
An instance of for_each_inst() was not changed to match its new
behaviour and is causing a loop.

v2: remove tmp_mask variable

Fixes: b579ea632f ("drm/amdgpu: Modify for_each_inst macro")
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:15:57 -04:00
Jonathan Kim 602816c3ee drm/amdkfd: fix trap handling work around for debugging
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Ruili Ji <ruili.ji@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:13:58 -04:00
Christophe JAILLET e354f67733 drm/i915: Fix an error handling path in igt_write_huge()
All error handling paths go to 'out', except this one. Be consistent and
also branch to 'out' here.

Fixes: c10a652e23 ("drm/i915/selftests: Rework context handling in hugepages selftests")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7a036b88671312ee9adc01c74ef5b3376f690b76.1689619758.git.christophe.jaillet@wanadoo.fr
(cherry picked from commit 361ecaadb1)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-25 08:38:12 +01:00
Radhakrishna Sripada 3844ed5e78 drm/i915/dpt: Use shmem for dpt objects
Dpt objects that are created from internal get evicted when there is
memory pressure and do not get restored when pinned during scanout. The
pinned page table entries look corrupted and programming the display
engine with the incorrect pte's result in DE throwing pipe faults.

Create DPT objects from shmem and mark the object as dirty when pinning so
that the object is restored when shrinker evicts an unpinned buffer object.

v2: Unconditionally mark the dpt objects dirty during pinning(Chris).

Fixes: 0dc987b699 ("drm/i915/display: Add smem fallback allocation for dpt")
Cc: <stable@vger.kernel.org> # v6.0+
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718225118.2562132-1-radhakrishna.sripada@intel.com
(cherry picked from commit e91a777a6e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-25 08:38:09 +01:00
Daniel Vetter 4e076c73e4 drm/atomic: Fix potential use-after-free in nonblocking commits
This requires a bit of background.  Properly done a modeset driver's
unload/remove sequence should be

	drm_dev_unplug();
	drm_atomic_helper_shutdown();
	drm_dev_put();

The trouble is that the drm_dev_unplugged() checks are by design racy,
they do not synchronize against all outstanding ioctl.  This is because
those ioctl could block forever (both for modeset and for driver
specific ioctls), leading to deadlocks in hotunplug.  Instead the code
sections that touch the hardware need to be annotated with
drm_dev_enter/exit, to avoid accessing hardware resources after the
unload/remove has finished.

To avoid use-after-free issues all the involved userspace visible
objects are supposed to hold a reference on the underlying drm_device,
like drm_file does.

The issue now is that we missed one, the atomic modeset ioctl can be run
in a nonblocking fashion, and in that case it cannot rely on the implied
drm_device reference provided by the ioctl calling context.  This can
result in a use-after-free if an nonblocking atomic commit is carefully
raced against a driver unload.

Fix this by unconditionally grabbing a drm_device reference for any
drm_atomic_state structures.  Strictly speaking this isn't required for
blocking commits and TEST_ONLY calls, but it's the simpler approach.

Thanks to shanzhulig for the initial idea of grabbing an unconditional
reference, I just added comments, a condensed commit message and fixed a
minor potential issue in where exactly we drop the final reference.

Reported-by: shanzhulig <shanzhulig@gmail.com>
Suggested-by: shanzhulig <shanzhulig@gmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-21 09:53:30 -07:00
Dave Airlie 28801cc859 amd-drm-fixes-6.5-2023-07-20:
amdgpu:
 - More PCIe DPM fixes for Intel platforms
 - DCN3.0.1 fixes
 - Virtual display timer fix
 - Async flip fix
 - SMU13 clock reporting fixes
 - Add missing PSP firmware declaration
 - DP MST fix
 - DCN3.1.x fixes
 - Slab out of bounds fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZLk2sQAKCRC93/aFa7yZ
 2I7jAP9D26a1N0J3bJd24L4h8FHYRauz1F+z1QFTNv1f51PJJwEAkDpp3xsFb2dM
 YBaLDyeSASlgAB3DQ79cyCtpVBCcOQs=
 =nvAQ
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.5-2023-07-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.5-2023-07-20:

amdgpu:
- More PCIe DPM fixes for Intel platforms
- DCN3.0.1 fixes
- Virtual display timer fix
- Async flip fix
- SMU13 clock reporting fixes
- Add missing PSP firmware declaration
- DP MST fix
- DCN3.1.x fixes
- Slab out of bounds fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720133456.7826-1-alexander.deucher@amd.com
2023-07-21 12:16:47 +10:00
Dave Airlie 534a7915c6 Merge tag 'drm-intel-fixes-2023-07-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Add sentinel to xehp_oa_b_counters [perf] (Andrzej Hajda)
- Revert "drm/i915: use localized __diag_ignore_all() instead of per file" (Jani Nikula)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZLjuwhLhwab5B7RY@tursulin-desk
2023-07-21 12:15:09 +10:00
Dave Airlie f4f19c03cf Memory leak fixes in drm/client, memory access/leak fixes for
accel/qaic, another leak fix in dma-buf and three nouveau fixes around
 hotplugging.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZLjo/gAKCRDj7w1vZxhR
 xTWbAP99XcDE5j72H1R85umRYU45LRKn/0NOFA4V4wRlBGRSbwD+N2aiRpyP45cu
 J5LjdHgSK1DCDjcFxeYqfsU/7jHY8w8=
 =jtaK
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2023-07-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Memory leak fixes in drm/client, memory access/leak fixes for
accel/qaic, another leak fix in dma-buf and three nouveau fixes around
hotplugging.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fmj5nok7zggux2lcpdtls2iknweba54wfc6o4zxq6i6s3dgi2r@7z3eawwhyhen
2023-07-21 12:14:05 +10:00
Ben Skeggs ea293f823a drm/nouveau/kms/nv50-: init hpd_irq_lock for PIOR DP
Fixes OOPS on boards with ANX9805 DP encoders.

Cc: stable@vger.kernel.org # 6.4+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-3-skeggsb@gmail.com
2023-07-19 11:08:47 +02:00
Ben Skeggs 2b5d1c29f6 drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
Fixes crash on boards with ANX9805 TMDS/DP encoders.

Cc: stable@vger.kernel.org # 6.4+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-2-skeggsb@gmail.com
2023-07-19 11:08:47 +02:00
Ben Skeggs 752a281032 drm/nouveau/i2c: fix number of aux event slots
This was completely bogus before, using maximum DCB device index rather
than maximum AUX ID to size the buffer that stores event refcounts.

*Pretty* unlikely to have been an actual problem on most configurations,
that is, unless you've got one of the rare boards that have off-chip DP.

There, it'll likely crash.

Cc: stable@vger.kernel.org # 6.4+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-1-skeggsb@gmail.com
2023-07-19 11:08:47 +02:00
Guchun Chen b13d3e9c6b drm/amdgpu: use a macro to define no xcp partition case
~0 as no xcp partition is used in several places, so improve its
definition by a macro for code consistency.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:42:54 -04:00
Guchun Chen 8e78127143 drm/amdgpu/vm: use the same xcp_id from root PD
Other PDs/PTs allocation should just use the same xcp_id as that
stored in root PD.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:42:47 -04:00
Guchun Chen 8ecee4cbc7 drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create
Recent code set xcp_id stored from file private data when opening
device to amdgpu bo for accounting memory usage etc, but not all
VMs are attached to this fpriv structure like the vm cases in
amdgpu_mes_self_test, otherwise, KASAN will complain below out
of bound access. And more importantly, VM code should not touch
fpriv structure, so drop fpriv code handling from amdgpu_vm_pt.

[   77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069
[   77.294146] Call Trace:
[   77.294178]  <TASK>
[   77.294208]  dump_stack_lvl+0x49/0x63
[   77.294260]  print_report+0x16f/0x4a6
[   77.294307]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.295979]  ? kasan_complete_mode_report_info+0x3c/0x200
[   77.296057]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.297556]  kasan_report+0xb4/0x130
[   77.297609]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.299202]  __asan_load4+0x6f/0x90
[   77.299272]  amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.300796]  ? amdgpu_init+0x6e/0x1000 [amdgpu]
[   77.302222]  ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
[   77.303721]  ? preempt_count_sub+0x18/0xc0
[   77.303786]  amdgpu_vm_init+0x39e/0x870 [amdgpu]
[   77.305186]  ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
[   77.306683]  ? kasan_set_track+0x25/0x30
[   77.306737]  ? kasan_save_alloc_info+0x1b/0x30
[   77.306795]  ? __kasan_kmalloc+0x87/0xa0
[   77.306852]  amdgpu_mes_self_test+0x169/0x620 [amdgpu]

v2: without specifying xcp partition for PD/PT bo, the xcp id is -1.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686
Fixes: 3ebfd221c1 ("drm/amdkfd: Store xcp partition id to amdgpu bo")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:42:34 -04:00
Guchun Chen dcaa32e1f5 drm/amdgpu: Allocate root PD on correct partition
file_priv needs to be setup firstly, otherwise, root PD
will always be allocated on partition 0, even if opening
the device from other partitions.

Fixes: 3ebfd221c1 ("drm/amdkfd: Store xcp partition id to amdgpu bo")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:42:23 -04:00
Nicholas Kazlauskas 2387ccf43e drm/amd/display: Keep PHY active for DP displays on DCN31
[Why & How]
Port of a change that went into DCN314 to keep the PHY enabled
when we have a connected and active DP display.

The PHY can hang if PHY refclk is disabled inadvertently.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:41:23 -04:00
Daniel Miess 2a9482e559 drm/amd/display: Prevent vtotal from being set to 0
[Why]
In dcn314 DML the destination pipe vtotal was being set
to the crtc adjustment vtotal_min value even in cases
where that value is 0.

[How]
Only set vtotal to the crtc adjustment vtotal_min value
in cases where the value is non-zero.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Daniel Miess <daniel.miess@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:40:56 -04:00
Zhikai Zhai a460beefe7 drm/amd/display: Disable MPC split by default on special asic
[WHY]
All of pipes will be used when the MPC split enable on the dcn
which just has 2 pipes. Then MPO enter will trigger the minimal
transition which need programe dcn from 2 pipes MPC split to 2
pipes MPO. This action will cause lag if happen frequently.

[HOW]
Disable the MPC split for the platform which dcn resource is limited

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:39:44 -04:00
Taimur Hassan 5a25cefc09 drm/amd/display: check TG is non-null before checking if enabled
[Why & How]
If there is no TG allocation we can dereference a NULL pointer when
checking if the TG is enabled.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Taimur Hassan <syed.hassan@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:33:47 -04:00
Wayne Lin 4f6d9e38c4 drm/amd/display: Add polling method to handle MST reply packet
[Why]
Specific TBT4 dock doesn't send out short HPD to notify source
that IRQ event DOWN_REP_MSG_RDY is set. Which violates the spec
and cause source can't send out streams to mst sinks.

[How]
To cover this misbehavior, add an additional polling method to detect
DOWN_REP_MSG_RDY is set. HPD driven handling method is still kept.
Just hook up our handler to drm mgr->cbs->poll_hpd_irq().

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:32:37 -04:00
Srinivasan Shanmugam 87279fdf5e drm/amd/display: Clean up errors & warnings in amdgpu_dm.c
Fix the following errors & warnings reported by checkpatch:

ERROR: space required before the open brace '{'
ERROR: space required before the open parenthesis '('
ERROR: that open brace { should be on the previous line
ERROR: space prohibited before that ',' (ctx:WxW)
ERROR: else should follow close brace '}'
ERROR: open brace '{' following function definitions go on the next line
ERROR: code indent should use tabs where possible

WARNING: braces {} are not necessary for single statement blocks
WARNING: void function return statements are not generally useful
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line

Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:32:32 -04:00
Candice Li b9c2213cdf drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta
Allow the initramfs generator to automatically include psp_13_0_6_ta
firmware to initramfs.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:23:11 -04:00
Alex Deucher 068c8bb10f drm/amdgpu/pm: make mclk consistent for smu 13.0.7
Use current uclk to be consistent with other dGPUs.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-07-18 14:22:49 -04:00
Alex Deucher a4eb118241 drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
Use average gfxclock for consistency with other dGPUs.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-07-18 14:22:12 -04:00
Simon Ser 1ca67aba8d drm/amd/display: only accept async flips for fast updates
Up until now, amdgpu was silently degrading to vsync when
user-space requested an async flip but the hardware didn't support
it.

The hardware doesn't support immediate flips when the update changes
the FB pitch, the DCC state, the rotation, enables or disables CRTCs
or planes, etc. This is reflected in the dm_crtc_state.update_type
field: UPDATE_TYPE_FAST means that immediate flip is supported.

Silently degrading async flips to vsync is not the expected behavior
from a uAPI point-of-view. Xorg expects async flips to fail if
unsupported, to be able to fall back to a blit. i915 already behaves
this way.

This patch aligns amdgpu with uAPI expectations and returns a failure
when an async flip is not possible.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-07-18 14:21:02 -04:00
Guchun Chen b42ae87a7b drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
In below thousands of screen rotation loop tests with virtual display
enabled, a CPU hard lockup issue may happen, leading system to unresponsive
and crash.

do {
	xrandr --output Virtual --rotate inverted
	xrandr --output Virtual --rotate right
	xrandr --output Virtual --rotate left
	xrandr --output Virtual --rotate normal
} while (1);

NMI watchdog: Watchdog detected hard LOCKUP on cpu 1

? hrtimer_run_softirq+0x140/0x140
? store_vblank+0xe0/0xe0 [drm]
hrtimer_cancel+0x15/0x30
amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu]
drm_vblank_disable_and_save+0x185/0x1f0 [drm]
drm_crtc_vblank_off+0x159/0x4c0 [drm]
? record_print_text.cold+0x11/0x11
? wait_for_completion_timeout+0x232/0x280
? drm_crtc_wait_one_vblank+0x40/0x40 [drm]
? bit_wait_io_timeout+0xe0/0xe0
? wait_for_completion_interruptible+0x1d7/0x320
? mutex_unlock+0x81/0xd0
amdgpu_vkms_crtc_atomic_disable

It's caused by a stuck in lock dependency in such scenario on different
CPUs.

CPU1                                             CPU2
drm_crtc_vblank_off                              hrtimer_interrupt
    grab event_lock (irq disabled)                   __hrtimer_run_queues
        grab vbl_lock/vblank_time_block                  amdgpu_vkms_vblank_simulate
            amdgpu_vkms_disable_vblank                       drm_handle_vblank
                hrtimer_cancel                                         grab dev->event_lock

So CPU1 stucks in hrtimer_cancel as timer callback is running endless on
current clock base, as that timer queue on CPU2 has no chance to finish it
because of failing to hold the lock. So NMI watchdog will throw the errors
after its threshold, and all later CPUs are impacted/blocked.

So use hrtimer_try_to_cancel to fix this, as disable_vblank callback
does not need to wait the handler to finish. And also it's not necessary
to check the return value of hrtimer_try_to_cancel, because even if it's
-1 which means current timer callback is running, it will be reprogrammed
in hrtimer_start with calling enable_vblank to make it works.

v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes
tag as well

v3: drop warn printing (Christian)

v4: drop superfluous check of blank->enabled in timer function, as it's
guaranteed in drm_handle_vblank (Christian)

Fixes: 84ec374bd5 ("drm/amdgpu: create amdgpu_vkms (v4)")
Cc: stable@vger.kernel.org
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:18:05 -04:00
Aurabindo Pillai d1792509e1 drm/amd/display: add DCN301 specific logic for OTG programming
[Why&How]
DCN301 does not have FAMS hence the workaround needed on other DCN3x
variants related to OTG min/max selector programming is not applicable for it.
Hence isolate it and have it use the old sequence without workaround.

Fixes: 1598fc5764 ("drm/amd/display: Program OTG vtotal min/max selectors unconditionally for DCN1+")
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:17:35 -04:00
Aurabindo Pillai 2ed5a4c461 drm/amd/display: export some optc function for reuse
[Why&How]
Make a few functions non static so that they can be reused for other
asic. This is in preparation for separating out OTG programming sequence
for DCN301

Fixes: 1598fc5764 ("drm/amd/display: Program OTG vtotal min/max selectors unconditionally for DCN1+")
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:17:11 -04:00
Mario Limonciello 60a2dae490 drm/amd: Use amdgpu_device_pcie_dynamic_switching_supported() for SMU7
SMU7 does a check if the dGPU is inserted into a Rocket Lake system,
to turn off DPM.  Extend this check to all systems that have problems
with dynamic switching by using the
amdgpu_device_pcie_dynamic_switching_supported() helper.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 14:16:38 -04:00
Rob Clark 1cd0787f08 drm/msm: Fix hw_fence error path cleanup
In an error path where the submit is free'd without the job being run,
the hw_fence pointer is simply a kzalloc'd block of memory.  In this
case we should just kfree() it, rather than trying to decrement it's
reference count.  Fortunately we can tell that this is the case by
checking for a zero refcount, since if the job was run, the submit would
be holding a reference to the hw_fence.

Fixes: f94e6a51e1 ("drm/msm: Pre-allocate hw_fence")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/547088/
2023-07-17 12:54:20 -07:00
Gaosheng Cui 6e8a996563 drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
The msm_gem_get_vaddr() returns an ERR_PTR() on failure, and a null
is catastrophic here, so we should use IS_ERR_OR_NULL() to check
the return value.

Fixes: 6a8bd08d04 ("drm/msm: add sudo flag to submit ioctl")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/547712/
Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-07-17 11:57:02 -07:00
Jani Nikula 2c27770a7b Revert "drm/i915: use localized __diag_ignore_all() instead of per file"
This reverts commit 88e9664434.

__diag_ignore_all() only works for GCC 8 or later.

-Woverride-init (from -Wextra, enabled in i915 Makefile) combined with
CONFIG_WERROR=y or W=e breaks the build for older GCC.

With i386_defconfig and x86_64_defconfig enabling CONFIG_WERROR=y by
default, we really need to roll back the change.

An alternative would be to disable -Woverride-init in the Makefile for
GCC <8, but the revert seems like the safest bet now.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8768
Reported-by: John Garry <john.g.garry@oracle.com>
References: https://lore.kernel.org/r/ad2601c0-84bb-c574-3702-a83ff8faf98c@oracle.com
References: https://lore.kernel.org/r/87wmzezns4.fsf@intel.com
Fixes: 88e9664434 ("drm/i915: use localized __diag_ignore_all() instead of per file")
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230711110214.25093-1-jani.nikula@intel.com
(cherry picked from commit 290d161045)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-17 13:39:04 +01:00
Andrzej Hajda 785b3f667b drm/i915/perf: add sentinel to xehp_oa_b_counters
Arrays passed to reg_in_range_table should end with empty record.

The patch solves KASAN detected bug with signature:
BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
Read of size 4 at addr ffffffffa1555d90 by task perf/1518

CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1
Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023
Call Trace:
<TASK>
...
xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]

Fixes: 0fa9349dda ("drm/i915/perf: complete programming whitelisting for XEHPSDV")
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230711153410.1224997-1-andrzej.hajda@intel.com
(cherry picked from commit 2f42c5afb3)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-17 13:39:01 +01:00
Rob Clark bd846ceee9 drm/msm/adreno: Fix snapshot BINDLESS_DATA size
The incorrect size was causing "CP | AHB bus error" when snapshotting
the GPU state on a6xx gen4 (a660 family).

Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Fixes: 1707add815 ("drm/msm/a6xx: Add a6xx gpu state")
Patchwork: https://patchwork.freedesktop.org/patch/546763/
2023-07-15 08:19:35 -07:00
Rob Clark 317ab1b90e drm/msm/a690: Remove revn and name
These fields are deprecated.  But any userspace new enough to support
a690 also knows how to identify the GPU based on chip-id.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/545552/
2023-07-15 08:12:01 -07:00
Rob Clark 7164360030 drm/msm/adreno: Fix warn splat for devices without revn
Recently, a WARN_ON() was introduced to ensure that revn is filled before
adreno_is_aXYZ is called. This however doesn't work very well when revn is
0 by design (such as for A635).

Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Fixes: cc943f43ec ("drm/msm/adreno: warn if chip revn is verified before being set")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> # sc7280
Patchwork: https://patchwork.freedesktop.org/patch/545554/
2023-07-15 08:12:00 -07:00
Dave Airlie 38d88d5e97 amd-drm-fixes-6.5-2023-07-12:
amdgpu:
 - SMU i2c locking fix
 - Fix a possible deadlock in process restoration for ROCm apps
 - Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZK7yYQAKCRC93/aFa7yZ
 2JvYAQDpMj8/rLUsmWRk30jvkaZivgaeUEAG0FGaMpKaaATbvwEA2eHxN3xk5GKs
 ethEPp/zdivIrz6h/JWSCFrpCzqg4g8=
 =rsRJ
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.5-2023-07-12:

amdgpu:
- SMU i2c locking fix
- Fix a possible deadlock in process restoration for ROCm apps
- Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230712184009.7740-1-alexander.deucher@amd.com
2023-07-14 13:19:54 +10:00
Dave Airlie 864e029fea Merge tag 'drm-intel-fixes-2023-07-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Don't preserve dpll_hw_state for slave crtc in Bigjoiner (Stanislav Lisovskiy)
- Consider OA buffer boundary when zeroing out reports [perf] (Umesh Nerlige Ramappa)
- Remove dead code from gen8_pte_encode (Tvrtko Ursulin)
- Fix one wrong caching mode enum usage (Tvrtko Ursulin)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZK+nHLCltaxoxVw/@tursulin-desk
2023-07-14 11:13:15 +10:00
Marijn Suijten 97368254a0 drm/msm/dsi: Drop unused regulators from QCM2290 14nm DSI PHY config
The regulator setup was likely copied from other SoCs by mistake.  Just
like SM6125 the DSI PHY on this platform is not getting power from a
regulator but from the MX power domain.

Fixes: 572e9fd6d1 ("drm/msm/dsi: Add phy configuration for QCM2290")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/544536/
Link: https://lore.kernel.org/r/20230627-sm6125-dpu-v2-1-03e430a2078c@somainline.org
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2023-07-13 12:49:47 -07:00
Dmitry Baryshkov e8383f5cf1 drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
Drop the leftover of bus-client -> interconnect conversion, the enum
dpu_core_perf_data_bus_id.

Fixes: cb88482e25 ("drm/msm/dpu: clean up references of DPU custom bus scaling")
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/546048/
Link: https://lore.kernel.org/r/20230707193942.3806526-2-dmitry.baryshkov@linaro.org
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2023-07-13 11:56:24 -07:00