Commit Graph

4199 Commits

Author SHA1 Message Date
Dave Airlie 754270c7c5 Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
First feature pull for 4.15.  Highlights:
- Per VM BO support
- Lots of powerplay cleanups
- Powerplay support for CI
- pasid mgr for kfd
- interrupt infrastructure for recoverable page faults
- SR-IOV fixes
- initial GPU reset for vega10
- prime mmap support
- ttm page table debugging improvements
- lots of bug fixes

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (232 commits)
  drm/amdgpu: clarify license in amdgpu_trace_points.c
  drm/amdgpu: Add gem_prime_mmap support
  drm/amd/powerplay: delete dead code in smumgr
  drm/amd/powerplay: delete SMUM_FIELD_MASK
  drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_READ_FIELD
  drm/amd/powerplay: delete SMUM_SET_FIELD
  drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_FIELD
  drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
  drm/amd/powerplay: move macros to hwmgr.h
  drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: add new helper functions in hwmgr.h
  drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
  drm/amd/powerplay: refine powerplay code.
  drm/amd/powerplay: delete dead code in hwmgr.h
  drm/amd/powerplay: refine interface in struct pp_smumgr_func
  ...
2017-09-28 08:37:02 +10:00
Alex Deucher 6f87a89570 drm/amdgpu: clarify license in amdgpu_trace_points.c
It was not clear.  The rest of the driver is MIT/X11.

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:37 -04:00
Samuel Li dfced2e4bc drm/amdgpu: Add gem_prime_mmap support
v2: drop hdp invalidate/flush.
v3: honor pgoff during prime mmap. Add a barrier after cpu access.
v4: drop begin/end_cpu_access() for now, revisit later.

Signed-off-by: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:37 -04:00
Rex Zhu aec8d5cc28 drm/amd/powerplay: delete dead code in smumgr
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:36 -04:00
Rex Zhu 63196fe79b drm/amd/powerplay: delete SMUM_FIELD_MASK
repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:36 -04:00
Rex Zhu 0041e6007e drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
repeated defining in hwmgr.h
use PHM_WAIT_INDIRECT_FIELD instand.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:35 -04:00
Rex Zhu 515113f5e5 drm/amd/powerplay: delete SMUM_READ_FIELD
repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:34 -04:00
Rex Zhu 95175869bd drm/amd/powerplay: delete SMUM_SET_FIELD
repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:34 -04:00
Rex Zhu f0f6e3752a drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:33 -04:00
Rex Zhu 37192704d9 drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:33 -04:00
Rex Zhu a9eca3a685 drm/amd/powerplay: delete SMUM_WRITE_FIELD
the macro is as same as PHM_WRITE_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:32 -04:00
Rex Zhu fbabae4696 drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
the macro is as same as PHM_WRITE_INDIRECT_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:32 -04:00
Rex Zhu 538fdf1fe7 drm/amd/powerplay: move macros to hwmgr.h
the macro is not relevant to SMU,
so rename SMU_WAIT_FIELD_UNEQUAL to
PHM_WAIT_FIELD_UNEQUAL and move to hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:31 -04:00
Rex Zhu 57d13f794d drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
the macro is not relevant to SMU, so move to hwmgr.h
and rename to PHM_WAIT_VFPF_INDIRECT_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:30 -04:00
Rex Zhu 554d95da39 drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
the macro is not relevant to SMU, so move to hwmgr.h
and rename to PHM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:30 -04:00
Rex Zhu b05720cbf6 drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
the macro is not relevent to SMU, so move to hwmgr.h
and rename to PHM_WAIT_INDIRECT_FIELD_UNEQUAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:29 -04:00
Rex Zhu d92cb1629b drm/amd/powerplay: add new helper functions in hwmgr.h
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:29 -04:00
Rex Zhu be49be4085 drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
in VFPF macros to support virtualization

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:28 -04:00
Rex Zhu b3b030520d drm/amd/powerplay: refine powerplay code.
delete struct smumgr, put smu backend function table
in struct hwmgr

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:27 -04:00
Rex Zhu 221c89f980 drm/amd/powerplay: delete dead code in hwmgr.h
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:27 -04:00
Rex Zhu d3f8c0abf4 drm/amd/powerplay: refine interface in struct pp_smumgr_func
unify to use struct hwmgr as function parameter in
smumgr.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:26 -04:00
Christian König e9c7577c09 drm/amdgpu: simplify pinning into visible VRAM
Just set the CPU access required flag when we pin it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:25 -04:00
Monk Liu c833d8aa4d drm/amdgpu:fix firmware memoryleak(v2)
this fix memory leak due to request_firmware after driver
unloaded

v2:
release gmc firmware for gmc6/7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:25 -04:00
Monk Liu 4ff184d70e drm/amdgpu:fix uvd ring fini routine(v2)
fix missing finish uvd enc_ring.
v2:
since the adev pointer check in already in ring_fini
so drop the check outsider

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:24 -04:00
Monk Liu beb8410284 drm/amdgpu/sriov:alloc KIQ MQD in VRAM(v2)
this way after KIQ MQD released in drv unloading, CPC
can still let KIQ access this MQD thus RLCV SAVE_VF
will not fail

v2:
always use VRAM domain for KIQ MQD no matter BM or SRIOV

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:24 -04:00
Monk Liu 85f95ad629 drm/amdgpu:unmap KCQ in gfx hw_fini(v2)
v2:
move kcq_disable out of SRIOV, make it genearal

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:23 -04:00
Monk Liu 4bd9a67e17 drm/amdgpu:halt when vm fault
only with this way we can debug the VMC page fault issue

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:22 -04:00
Yong Zhao e6d921974a drm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcs
Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping().

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:22 -04:00
Yong Zhao 7bdc53f925 drm/amdgpu: Fix a bug in amdgpu_fill_buffer()
When max_bytes is not 8 bytes aligned and bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre SDMA 4.0, max_bytes = 0x1fffff, and the bo size
is 0x200000, the problem will happen.

In order to fix the problem, we separately store the max nums of
PTEs/PDEs a single operation can set in amdgpu_vm_pte_funcs
structure, rather than inferring it from bytes limit of SDMA
constant fill, i.e. fill_max_bytes.

Together with the fix, we replace the hard code value "10" in
amdgpu_vm_bo_update_mapping() with the corresponding values from
structure amdgpu_vm_pte_funcs.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:21 -04:00
Yong Zhao dfe5c2b76b drm/amdgpu: Correct bytes limit for SDMA 3.0 copy and fill
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:21 -04:00
Christian König a8ffeac96d drm/amdgpu: use 2MB fragment size for GFX6,7 and 8
Use 2MB fragment size by default for older hardware generations as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: John Bridgman <john.bridgman@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:20 -04:00
Xiangliang.Yu fd4495e57c drm/amdgpu: Fix driver reloading failure
SRIOV doesn't implement PMC capability of PCIe, so it can't update
power state by reading PMC register.

Currently, amdgpu driver doesn't disable pci device when removing
driver, the enable_cnt of pci device will not be decrease to 0.
When reloading driver, pci_enable_device will do nothing as
enable_cnt is not zero. And power state will not be updated as PMC
is not support.
So current_state of pci device is not D0 state and pci_enable_msi
return fail.

Add pci_disable_device when remmoving driver to fix the issue.

Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:19 -04:00
Rex Zhu 6042e8560e drm/amd/powerplay: refine phm_register_thermal_interrupt interface
currently, not all asics implement this callback function
so not return error to avoid powerplay initialize failed
in those asices

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:19 -04:00
Evan Quan 5c58301856 drm/amd/amdgpu: add vega10/raven mmhub/athub golden settings
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:18 -04:00
Eric Huang fafa359840 drm/amd/powerplay: change alert temperature range
Change to more meaningful range that triggers thermal
interrupts.

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:18 -04:00
Eric Huang a1665a55c8 drm/amd/powerplay: implement register thermal interrupt for Vega10
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:17 -04:00
Eric Huang 2a5b64c9fc drm/amd/powerplay: add register thermal interrupt in hwmgr_hw_init
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:16 -04:00
Eric Huang 4d1f9fb721 drm/amdgpu: add cgs query info of pci bus devfn
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:16 -04:00
Tom St Denis 10cfafd62a drm/amd/amdgpu: Partial revert of iova debugfs
We discovered that on some devices even with iommu enabled
you can access all of system memory through the iommu translation.

Therefore, we revert the read method to the translation only service
and drop the write method completely.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christan König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:15 -04:00
Evan Quan a49ccdbd1d drm/amd/amgpu: update vega10 sdma golden setting
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:15 -04:00
Evan Quan 6fe8542957 drm/amd/amgpu: update raven sdma golden setting
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:14 -04:00
Monk Liu d59c026b7b drm/amdgpu/sriov:fix memory leak after gpu reset
GPU reset will require all hw doing hw_init thus
ucode_init_bo will be invoked again, which lead to
memory leak

skip the fw_buf allocation during sriov gpu reset to avoid
memory leak.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:13 -04:00
Monk Liu eb01abc7c4 drm/amdgpu:make ctx_add_fence interruptible(v2)
otherwise a gpu hang will make application couldn't be killed
under timedout=0 mode

v2:
Fix memoryleak job/job->s_fence issue
unlock mn
remove the ERROR msg after waiting being interrupted

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:13 -04:00
Monk Liu f840cc5f84 drm/amdgpu/sriov:init csb for gfxv9
RLC need CSB registers initiated under SRIOV during world switch
otherwise the clear state buffer behav will not be recovered to
current VF scheme after switch back

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:12 -04:00
Horace Chen 6e132ca0bb drm/amdgpu/sriov:increate mailbox polling timeout
increase timeout to 12 seconds,because there may have multiple
FLR waiting for done, the waiting time of events may be long,
increase to 12s to reduce timeout failure.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:12 -04:00
Monk Liu 030308fcbd drm/amdgpu/sriov:fix page fault issue of driver unload
bo_free on csa is too late to put in amdgpu_fini because that
time ttm is already finished,
Move it earlier to avoid the page fault.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:11 -04:00
Monk Liu 6e2e216fad drm/amdgpu:use formal register to trigger hdp invalidate
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:10 -04:00
Monk Liu 1d4e0a8c4f drm/amdgpu:hdp flush should be put it initialized
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:10 -04:00
Monk Liu 2ea6ab2741 drm/amdgpu:insert TMZ_BEGIN
FRAME_CONTROL(begin) is needed for vega10 due to ucode logic change,
it can fix some CTS random fail under gfx preemption enabled mode.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:09 -04:00
Monk Liu 55981bd2e8 drm/amdgpu/sriov:don't load psp fw during gpu reset
At least for SRIOV we found reload PSP fw during
gpu reset cause PSP hang.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:09 -04:00