This patch enables TMZ bit in FRAME_CONTROL for gfx10.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable sdma TMZ mode via setting TMZ bit in sdma copy pkt
for sdma v5.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable sdma TMZ mode via setting TMZ bit in sdma copy pkt
for sdma v4
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands amdgpu_copy_buffer interface with tmz parameter.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands sdma copy_buffer interface with tmz parameter.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a buffer object is secure, i.e. created with
AMDGPU_GEM_CREATE_ENCRYPTED, then the TMZ bit of
the PTEs that belong the buffer object should be
set.
v1: design and draft the skeletion of TMZ bits setting on PTEs (Alex)
v2: return failure once create secure BO on non-TMZ platform (Ray)
v3: amdgpu_bo_encrypted() only checks the BO (Luben)
v4: move TMZ flag setting into amdgpu_vm_bo_update (Christian)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark a job as secure, if and only if the command
submission flag has the secure flag set.
v2: fix the null job pointer while in vmid 0
submission.
v3: Context --> Command submission.
v4: filling cs parser with cs->in.flags
v5: move the job secure flag setting out of amdgpu_cs_submit()
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands the context control function to support trusted flag while we
want to set command buffer in trusted mode.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands the emit_tmz function to support trusted flag while we want
to set command buffer in trusted mode.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds tmz bit in frame control pm4 packet, and it will used in future.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a function to check tmz capability with kernel parameter and ASIC type.
v2: use a per device tmz variable instead of global amdgpu_tmz.
v3: refine the comments for the function. (Luben)
v4: add amdgpu_tmz.c/h for future use.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch to add amdgpu_tmz structure which stores all tmz related fields.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds tmz parameter to enable/disable
the feature in the amdgpu kernel module. Nomally,
by default, it should be auto (rely on the
hardware capability).
But right now, it need to set "off" to avoid
breaking other developers' work because it's not
totally completed.
Will set "auto" till the feature is stable and
completely verified.
v2: add "auto" option for future use.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Define the TMZ (encryption) bit in the page table entry (PTE) for
Raven and newer asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
This fixes GPU hangs due to cache coherency issues.
Bump the driver version. Split out from the original patch.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This fixes GPU hangs due to cache coherency issues.
v2: Split the version bump to a separate patch
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wait for tiles off after unpause to fix transcode timeout issue.
It is a work around.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In order to surface the ASIC revision to user level, we want
to put it into the HSA topology. This can be because different
ASIC revisions may require user-level software to do different
things (e.g. patch code for things that are changed in later
hardware revisions).
The ASIC revision from the hardware is maximum of 4 bits at this
time, so put it into 4 of the open bits in the HSA capability.
Then user-level software can use this capability information to
know -- for each ASIC -- what revision-based things must be done.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The '>' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3004:68-73: WARNING:
conversion to bool not needed here
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to hardware bug that when RSMU UMC index is disabled,
clear EccErrCnt at the first UMC instance will clean up all other
EccErrCnt registes from other instances at the same time. This
will break the correctable error count log in EccErrCnt register
once querying it. So decouple both to make error count query workable.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This makes consistent with other regsiters' access in this module.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CG/PG ungate is already performed in ip_suspend_phase1. Otherwise,
the CG/PG ungate will be performed twice. That will cause gfxoff
disablement is performed twice also on runpm enter while gfxoff
enablemnt once on rump exit. That will put gfxoff into disabled
state.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This sequence change should be safe as what did in ip_suspend_phase1
is to suspend DCE only. And this is a prerequisite for coming
redundant cg/pg ungate dropping.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver steered p-state switching is designed for Vega20 only.
Also simplify early return for temporary disable due to SMU FW
bug.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2534:2-3: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Currently the error returns paths are unlocking lock kiq->ring_lock
however it seems this should be dev->gfx.kiq.ring_lock as this
is the lock that is being locked and unlocked around the ring
operations. This looks like a bug, but it's not. The kiq is just
a local variable pointing to the same structure. Make it consistent.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wait for the oldest sequence on the ring
to be signaled in order to make sure there
will be no command overrun.
v2: fix coding stype and remove abs operation
v3: remove the initialization of variable r
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to
finish
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
because nv12 SRIOV support one vf mode
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
retired those early sos version used in vega10 bring up
phase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
call common helper function to init sos ucode, instead
of duplicate codes per ip version
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
driver already had psp_firmware_header struture to
deal with different layout of sos ucode. the sos
micorcode initialization could be common one.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
call common helper function to initialize asd ucode
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
asd is unified ucode across asic. it is not necessary
to keep its software structure to be ip specific one
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The driver can't access UCODE_DATA/ADDR registers on production boards.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
tOS version is available through debugfs interface
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vmr ring is dedicated for sriov vf (i.e.guest driver
in sriov), which is general communication interface
between driver and psp fw accross all ip version.
it is not correct to make it as ip specific callback.
it is even worse to check specific tOS version per IP
version (like psp_v11/v12).
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reduce the mem->lock`s protected code area, no need to protect pr_debug.
This also simplifies error handling.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For chip like CHIP_OLAND with si enabled(amdgpu.si_support=1),
the amdgpu will expose pp_num_states to the /sys directory.
In this moment, read the pp_num_states file will excute the
amdgpu_get_pp_num_states func. In our case, the data hasn't
been initialized, so the kernel will access some ilegal
address, trigger the segmentfault and system will reboot soon:
uos@uos-PC:~$ cat /sys/devices/pci0000\:00/0000\:00\:00.0/0000\:01\:00
.0/pp_num_states
Message from syslogd@uos-PC at Apr 22 09:26:20 ...
kernel:[ 82.154129] Internal error: Oops: 96000004 [#1] SMP
This patch aims to fix this problem, avoid that reading file
triggers the kernel sementfault.
Signed-off-by: limingyu <limingyu@uniontech.com>
Signed-off-by: zhoubinbin <zhoubinbin@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function amdgpu_job_submit:
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:148:26: warning: variable priority set but not used [-Wunused-but-set-variable]
commit 33abcb1f5a ("drm/amdgpu: set compute queue priority at mqd_init")
left behind this, remove it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix a kernel-doc warning of missing struct field desription:
../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or member 'vm' not described in 'amdgpu_vm_eviction_lock'
Fixes: a269e44989 ("drm/amdgpu: Avoid reclaim fs while eviction lock")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
According to the current kiq read register method,
there will be race condition when using KIQ to read
register if multiple clients want to read at same time
just like the expample below:
1. client-A start to read REG-0 throguh KIQ
2. client-A poll the seqno-0
3. client-B start to read REG-1 through KIQ
4. client-B poll the seqno-1
5. the kiq complete these two read operation
6. client-A to read the register at the wb buffer and
get REG-1 value
Therefore, use amdgpu_device_wb_get() to request reg_val_offs
for each kiq read register.
v2: fix the error remove
v3: fix the print typo
v4: remove unused variables
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In pp_one_vf mode avoid the extra overhead and read/write the
registers without the KIQ.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Yintian Tao <yintian.tao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If set error query ready in amdgpu_ras_late_init, which will
cause some IP blocks aren't initialized, but their error query
is ready.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make code more readable.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is basically just some code cosmetic. The current design
for XGMI setup gput reset is to operate on current device(adev)
first and then on other devices from the hive(by another 'for' loop).
But actually we can do some sort to the device list(to put current
device 1st position) and handle all the devices in a single 'for'
loop.
V2: added missing hive->hive_lock protection
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As for XGMI setup, it should be performed on other devices
from the hive also.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As for XGMI setup, it needs to be performed on
all the devices from the same hive.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code a bit more readable by using a common
error handling pattern.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Christian König <christian.koenig@amd.com>.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
clean up unused variable:
1. ring_lru_list
2. ring_lru_list_lock
related-commit:
drm/amdgpu: remove ring lru handling
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prefix RAS message printing in gfx/mmhub with PCI device info,
which assists the debug in multiple GPU case.
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
disble vblank in dce_vitual_crtc_commit(), which is skipped
under sriov before
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiawei <Jiawei.Gu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is convenient for multiple teams to obtain the information. Also,
add device info by using dev_info().
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Turn off the printing by default because it is not very useful, while
adding more details.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vega20 arbitrates pstate at hive level and not device level. Last peer to
remote buffer unmap could drop P-State while another process is still
remote buffer mapped.
With this fix, P-States still needs to be disabled for now as SMU bug
was discovered on synchronous P2P transfers. This should be fixed in the
next FW update.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 3fded222f4
This is work around for vcn1 only. Currently vcn1 has separate
begin_use and idle work handle.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Tested-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the list with supported Arcturus chips, but disable for now until
final list is confirmed.
Ideally we can poll atombios for FRU support, instead of maintaining
this list of chips, but this will enable serial number reading for
supported ASICs for the time-being.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes a minor typo in the file.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit c12b84d6e0.
The original patch causes a RAS event and subsequent kernel hard-hang
when running the KFDMemoryTest.PtraceAccessInvisibleVram on VG20 and
Arcturus
dmesg output at hang time:
[drm] RAS event of type ERREVENT_ATHUB_INTERRUPT detected!
amdgpu 0000:67:00.0: GPU reset begin!
Evicting PASID 0x8000 queues
Started evicting pasid 0x8000
qcm fence wait loop timeout expired
The cp might be in an unrecoverable state due to an unsuccessful queues preemption
Failed to evict process queues
Failed to suspend process 0x8000
Finished evicting pasid 0x8000
Started restoring pasid 0x8000
Finished restoring pasid 0x8000
[drm] UVD VCPU state may lost due to RAS ERREVENT_ATHUB_INTERRUPT
amdgpu: [powerplay] Failed to send message 0x26, response 0x0
amdgpu: [powerplay] Failed to set soft min gfxclk !
amdgpu: [powerplay] Failed to upload DPM Bootup Levels!
amdgpu: [powerplay] Failed to send message 0x7, response 0x0
amdgpu: [powerplay] [DisableAllSMUFeatures] Failed to disable all smu features!
amdgpu: [powerplay] [DisableDpmTasks] Failed to disable all smu features!
amdgpu: [powerplay] [PowerOffAsic] Failed to disable DPM!
[drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <powerplay> failed -5
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set MP1 state to prepare for unload before reloading SMU FW
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added dedicated function to check if particular fw should be skipped from loading.
Added dedicated function for SMU FW loading via PSP
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The system reboot failed as some IP blocks enter power gate before perform
hw resource destory. Meanwhile use unify interface to set device CGPG to ungate
state can simplify the amdgpu poweroff or reset ungate guard.
Fixes: 487eca11a3 ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang@amd.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The system reboot failed as some IP blocks enter power gate before perform
hw resource destory. Meanwhile use unify interface to set device CGPG to ungate
state can simplify the amdgpu poweroff or reset ungate guard.
Fixes: 487eca11a3 ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang@amd.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Vram lost counter is wrongly increased by two during baco reset.
V2: assumed vram lost for mode1 reset on all ASICs
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This code is dead, let's remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Let format prefixes take care of printing the module name
through pr_fmt and dev_fmt definitions.
Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu uses lots of pr_* calls for printing error messages.
With this prefix, errors shall be more obvious to the end
use regarding its origin, and may help debugging.
Prefix format:
[xxx.xxxxx] amdgpu: ...
Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set up a GPU scheduler based on the ring flag rather
than the ring type.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't want a GPU scheduler for this ring.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows IPs to flag whether a specific ring requires
a GPU scheduler or not. E.g., sometimes instances of an
IP are asymmetric and have different capabilities.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vram lost counter is wrongly increased by two during baco reset.
V2: assumed vram lost for mode1 reset on all ASICs
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prefix RAS message printing in GFX IP with PCI device info,
which assists the debug in multiple GPU case.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If there is no GPU hang, user still can access
debugfs through kiq.
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Monk Liu <Monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prefix ras related kernel message logging with PCI
device info by replacing DRM_INFO/WARN/ERROR with
dev_info/warn/err. This can clearly tell user about
GPU device information where ras is. And add some
other ras message printing to make it more clear
and friendly as well.
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Uncorrectable error count printing is missed when issuing UMC
UE injection. When going to the error count log function in GPU
recover work thread, there is no chance to get correct error count
value by last error injection and print, because the error status
register is automatically cleared after reading in UMC ecc irq
callback. So add such message printing in UMC ecc irq cb to be
consistent with other RAS error interrupt cases.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Under bare metal, there is no more else to take
care of the GPU register access through MMIO.
Under Virtualization, to access GPU register is
implemented through KIQ during run-time due to
world-switch.
Therefore, under SR-IOV user can only access
debugfs to r/w GPU registers when meets all
three conditions below.
- amdgpu_gpu_recovery=0
- TDR happened
- in_gpu_reset=0
v2: merge amdgpu_virt_can_access_debugfs() into
amdgpu_virt_enable_access_debugfs()
v3: drop ret variable in amdgpu_virt_enable_access_debugfs()
and directly return result
Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
legacy:
- fix drm_local_map.offset type
ttm:
- temporarily disable hugepages to debug amdgpu problems.
prime:
- fix sg extraction
amdgpu:
- Various Renoir fixes
- Fix gfx clockgating sequence on gfx10
- RAS fixes
- Avoid MST property creation after registration
- Various cursor/viewport fixes
- Fix a confusing log message about optional firmwares
i915:
- Flush all the reloc_gpu batch (Chris)
- Ignore readonly failures when updating relocs (Chris)
- Fill all the unused space in the GGTT (Chris)
- Return the right vswing table (Jose)
- Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre)
analogix_dp:
- probe fix
virtio:
- oob fix in object create
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJej453AAoJEAx081l5xIa+r7YQAIX48cUROehoNDhzEHnAxJuU
WZXNHKaMCaDPzAs6SyCtHiPFWH6trBR5McE2dXfg6qc33lnzROFNp5PLB7qb4O+q
+3QkJ5cGd1bohT7vn3omP9FxxZeD4H4bE/zat+yUPwMWJYSUz4m6w6Ya0rIPa8HS
d9nRL0Y6wGBhm8/E0WCB6fe5G96D1JOFGLhfbVajGlDW/I+eBiS5WEyrtlIW698K
Q7lTNOXKEi9kFEZiW39RbKwW3YwqOEiQf1k0KbvUqctq4qLskHD3MgJpmkAiGVPH
mSnTYPPyATILVGKcmEHR3oB9wuYsoPhNgGCVhhm1MppI8GVUzfk6uqOfdK8UNfDU
IRAZ05AynmMMFNu/4Fw0SyR1sbj4OtAiG0hWaJ6Ou9MBzhERGXfT3+/BzeHsR4MJ
+fVIbOArSCAeFTAkqcLVbMKjivAJjullpsj36DFn+lXmHnxB98zAkSNT5dQcDjzl
bp6FhJXm7pWYx8SvkGRneESqLdr2WVgyZmP6u+kgzZ5pPubWSDqY1IFu1exb5Sne
bf7HoPzQ6LyD5KgX5WdoJ5++bcvQ9G4/qDF96NY6emCMKcwnOaAzvtErxQLpFeWP
dZwnxHXXtxY4Z4r5bFURPeWX3rWX5f/cCZ8B7mTUDSTa4hgzV8yUX3ZQBc+9Knja
zuvnpm4j1BXFqOg0Xfsu
=ZTAf
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"As expected, more fixes did turn up in the latter part of the week.
The drm_local_map build regression fix is here, along with temporary
disabling of the hugepage work due to some amdgpu related crashes.
Otherwise it's just a bunch of i915, and amdgpu fixes.
legacy:
- fix drm_local_map.offset type
ttm:
- temporarily disable hugepages to debug amdgpu problems.
prime:
- fix sg extraction
amdgpu:
- Various Renoir fixes
- Fix gfx clockgating sequence on gfx10
- RAS fixes
- Avoid MST property creation after registration
- Various cursor/viewport fixes
- Fix a confusing log message about optional firmwares
i915:
- Flush all the reloc_gpu batch (Chris)
- Ignore readonly failures when updating relocs (Chris)
- Fill all the unused space in the GGTT (Chris)
- Return the right vswing table (Jose)
- Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre)
analogix_dp:
- probe fix
virtio:
- oob fix in object create"
* tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm: (34 commits)
drm/ttm: Temporarily disable the huge_fault() callback
drm/bridge: analogix_dp: Split bind() into probe() and real bind()
drm/legacy: Fix type for drm_local_map.offset
drm/amdgpu/display: fix warning when compiling without debugfs
drm/amdgpu: unify fw_write_wait for new gfx9 asics
drm/amd/powerplay: error out on forcing clock setting not supported
drm/amdgpu: fix gfx hang during suspend with video playback (v2)
drm/amd/display: Check for null fclk voltage when parsing clock table
drm/amd/display: Acknowledge wm_optimized_required
drm/amd/display: Make cursor source translation adjustment optional
drm/amd/display: Calculate scaling ratios on every medium/full update
drm/amd/display: Program viewport when source pos changes for DCN20 hw seq
drm/amd/display: Fix incorrect cursor pos on scaled primary plane
drm/amd/display: change default pipe_split policy for DCN1
drm/amd/display: Translate cursor position by source rect
drm/amd/display: Update stream adjust in dc_stream_adjust_vmin_vmax
drm/amd/display: Avoid create MST prop after registration
drm/amdgpu/psp: dont warn on missing optional TA's
drm/amdgpu: update RAS related dmesg print
drm/amdgpu: resolve mGPU RAS query instability
...
This reverts commit b74fb888f4.
Revert the auto alignment mode set of SH MEM config, as it will result
to OCL Conformance Test fail.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Execution will only reach here if the asserted condition is true.
Hence there is no need for the additional check.
Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the fw_write_wait default case true since presumably all new
gfx9 asics will have updated firmware. That is using unique WAIT_REG_MEM
packet with opration=1.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Tested-by: Aaron Liu <aaron.liu@amd.com>
Tested-by: Yuxian Dai <Yuxian.Dai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add indirect access support to registers outside of
mmio bar.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
all the register access through kiq is redirected
to amdgpu_kiq_rreg/amdgpu_kiq_wreg
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
those are not needed anymore
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
not needed anymore
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
all the mmCUR_CONTROL instances are in mmr range and
can be accessd directly by using RREG32/WREG32
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the workaround is not needed for soc15 ASICs except
for vega10. it is even not needed with latest vega10
vbios.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The system will be hang up during S3 suspend because of SMU is pending
for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is at least 1 VG20 DID that does not have an FRU, and trying to read
that will cause a hang. For now, explicitly support reading the FRU for
Arcturus and for the WKS VG20 DIDs, and skip for everything else.
This re-enables serial number reporting for server cards
v2: Add ASIC check
v3: Don't default to true for pre-VG20
v4: Use DID instead of parsing the VBIOS
v5: Sqaush in overflow warning fix
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[PATCH 2/2]
kfd_pre_reset will free mem_objs allocated by kfd_gtt_sa_allocate
Without this change, sriov tdr code path will never free those
allocated memories and get memory leak.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 5161bba4311f in order to split it into two
different patches, and this will make it easier to understand.
[PATCH 1/2]
porting to gfx10 from
commit 1b0bfcff46 ("drm/amdgpu: Avoid destroy hqd when GPU is on reset")
Originally, MEC is touched
without GPU initialized first.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
prefix RAS error related dmesg print with pci device info
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
upon receiving uncorrectable error, query every GPU node for ras errors
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Incorrect CG sequence will cause gfx timedout,
if we keep switching power profile mode
(enter profile mod such as PEAK will disable CG,
exit profile mode EXIT will enable CG)
when run Vulkan test case(case used for test: vkexample).
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add RLC_SPM golden settings
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add RLC_SPM golden settings
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UTCL2 client ID is useful information to get which
UTCL2 client caused the gpuvm fault. Print it out
for debug purpose
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VCN shared memory needs restore after wake up during S3 test.
v2: Allocate shared memory saved_bo at sw_init and free it in sw_fini.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On ARCTURUS and RENOIR, powerplay is not supported yet.
When plug in or unplug power jack, ACPI event will issue.
Then kernel NULL pointer BUG will be triggered.
Check for NULL pointers before calling.
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace dev_warn() with dev_info() and note that they are
optional to avoid confusing users.
The RAS TAs only exist on server boards and the HDCP and DTM
TAs only exist on client boards. They are optional either way.
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.
v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum
v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list
v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use AMDGPU_HW_IP_* to set amdgpu_ring_type enum values
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_pre_reset will free mem_objs allocated by kfd_gtt_sa_allocate
Without this change, sriov tdr code path will never free those allocated
memories and get memory leak.
v2:add a bugfix for kiq ring test fail
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the fw_write_wait default case true since presumably all new
gfx9 asics will have updated firmware. That is using unique WAIT_REG_MEM
packet with opration=1.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Tested-by: Aaron Liu <aaron.liu@amd.com>
Tested-by: Yuxian Dai <Yuxian.Dai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The system will be hang up during S3 suspend because of SMU is pending
for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.
v2: Use disable the GFX CGPG instead of RLC safe mode guard.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
core:
- revert drm_mm atomic patch
- dt binding fixes
fbcon:
- null ptr error fix
i915:
- GVT fixes
nouveau:
- runpm fix
- svm fixes
amdgpu:
- HDCP fixes
- gfx10 fix
- Misc display fixes
- BACO fixes
amdkfd:
- Fix memory leak
vboxvideo:
- remove conflicting fbs
vc4:
- mode validation fix
xen:
- fix PTR_ERR usage
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJejR7RAAoJEAx081l5xIa+heoP/jBjHdwaZsLOkZFqFr8613mR
juL53xEn7RK7lnVSWBRqAdJznRyJqAyCtDrxW5Au3t0x6zetOv3AhS8fQBtqv/Ze
ghSdLZdjuTd4Lm2qyS2aWU3DBNnBlYcGcTD6bxwJn3EXSSR1z8YabT2osOOhj7cz
HwZjhW/XmIOpuhCKDEyyzeTCSLRISBdzgipIbuUXHeqUGB2jt/vpHTiqm+FgTFo5
hrHfM2EPpUK7LDq+REfXFeIwaQvtRDB1AJ3p8JM9iLt2GvTfavjGIDKDG9XReQhC
l8WeaMQD69ZbujmBX+XRuP7qj+vfnVYcV/RuMm8Yr09W2Ac8RZOQknvp9PlVmqVl
xYAvSu1DSD/88/5fvaDxYR2pyCE2ui+3hbFd9WHGFEX8h/cIhGJZ+cZqDO4K0nmY
vw5JmtSr0Eiwrv2dFn/CFYCxxGz0BdGxVOskbUCwZUPHaTLTAbpfFIaKEABW/sbR
k5P+cQFjxUJMbCMQeZKSa7L2GZAjf/K/SKyRNMeBuCsfn8KpEUo1W4hhGijtxL//
akwBFwVeZ0bLTELwB6mFdYGQ987vIcfYjJV0ochPlJCPWJ1OGx6jh1+gYR81W7Np
5mVFApS2FybmBaTQYWH0E+AhJxyxEHE6spgaGCgCHTlknANmCu3lF90NFVPAK7JB
Vvlj3gJnOzoguejWzomE
=HNn4
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This is a set of fixes that have queued up, I think I might have
another pull with some more before rc1 but I'd like to dequeue what I
have now just in case Easter is more eggciting that expected.
The main thing in here is a fix for a longstanding nouveau power
management issues on certain laptops, it should help runtime
suspend/resume for a lot of people.
There is also a reverted patch for some drm_mm behaviour in atomic
contexts.
Summary:
core:
- revert drm_mm atomic patch
- dt binding fixes
fbcon:
- null ptr error fix
i915:
- GVT fixes
nouveau:
- runpm fix
- svm fixes
amdgpu:
- HDCP fixes
- gfx10 fix
- Misc display fixes
- BACO fixes
amdkfd:
- Fix memory leak
vboxvideo:
- remove conflicting fbs
vc4:
- mode validation fix
xen:
- fix PTR_ERR usage"
* tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drm: (41 commits)
drm/nouveau/kms/nv50-: wait for FIFO space on PIO channels
drm/nouveau/nvif: protect waits against GPU falling off the bus
drm/nouveau/nvif: access PTIMER through usermode class, if available
drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during init
drm/nouveau: workaround runpm fail by disabling PCI power management on certain intel bridges
drm/nouveau/svm: remove useless SVM range check
drm/nouveau/svm: check for SVM initialized before migrating
drm/nouveau/svm: fix vma range check for migration
drm/nouveau: remove checks for return value of debugfs functions
drm/nouveau/ttm: evict other IO mappings when running out of BAR1 space
drm/amdkfd: kfree the wrong pointer
drm/amd/display: increase HDCP authentication delay
drm/amd/display: Correctly cancel future watchdog and callback events
drm/amd/display: Don't try hdcp1.4 when content_type is set to type1
drm/amd/powerplay: move the ASIC specific nbio operation out of smu_v11_0.c
drm/amd/powerplay: drop redundant BIF doorbell interrupt operations
drm/amd/display: Fix dcn21 num_states
drm/amd/display: Enable BT2020 in COLOR_ENCODING property
drm/amd/display: LFC not working on 2.0x range monitors (v2)
drm/amd/display: Support plane level CTM
...
Replace dev_warn() with dev_info() and note that they are
optional to avoid confusing users.
The RAS TAs only exist on server boards and the HDCP and DTM
TAs only exist on client boards. They are optional either way.
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
prefix RAS error related dmesg print with pci device info
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
upon receiving uncorrectable error, query every GPU node for ras errors
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Incorrect CG sequence will cause gfx timedout,
if we keep switching power profile mode
(enter profile mod such as PEAK will disable CG,
exit profile mode EXIT will enable CG)
when run Vulkan test case(case used for test: vkexample).
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl6GTQMUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vy3PhAAmqpYBRobOsG8QbmKDjoJEFtkqdvD
z6+4zf/R+hF11RyXjMDwihIe8d+tkQ4eAaYu6Oh5PrTyanz0G0PgeCrivZeytULk
thqQIWzDQMVA5vN/2/Vy8s5s+3HzP8z/MZOFScJ7+xA1MndXptPRTNmFUbjx+GAv
x8/pTp0u9AF6m7itX65DxXvwkzjWamt+Ar4Yx2IcuKAU/M5RtfuZO3PpDnqn7/wk
JFlkRoYeFB6qNnnkPdeyPHl9dALhuhzgdTyklQEnKVW3nf3xThYDhcEwdh6kBQgl
0dH8lL5LXy7PKGN8RES4wB0Vqndw/HlsCF5O4wkkfItbnbJxGJtS139e5973m0ud
sgWvF4yJAT2jCKhIeNz34sePQJMyWALhv0XzZCsJ0YeGHsrV1jrHELkwUT1+eIsT
3UV0iZ6aL06zQJDyKUbbIcQzEQ/wwBC+x9VgsyL54K1quCQZ1N1Nl/dvrb4cRG9m
m9EhJK/brDf4c0uFlOmMTSxV1t5J+z6ZSQnh1ShD/o5yBsxqN6q5brDT6LEs+jbM
LsIkA18jJOd4OyiDs98YiFKvIfFQbQ0LEBQpJwhF0snvfBFMMbUYN/T/NYneWON/
F0TpkFoP7PXDuq55iNaLdnObfzrpC9kdzUyWvePUvjxIl55bkf+/qtUny+H48t4L
dNggvW052d7BHes=
=deWu
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg)
- Add more 32 GT/s link speed decoding and improve the implementation
(Yicong Yang)
Resource management:
- Add support for sizing programmable host bridge apertures and fix a
related alpha Nautilus regression (Ivan Kokshaysky)
Interrupts:
- Add boot interrupt quirk mechanism for Xeon chipsets and document
boot interrupts (Sean V Kelley)
PCIe native device hotplug:
- When possible, disable in-band presence detect and use PDS
(Alexandru Gagniuc)
- Add DMI table for devices that don't use in-band presence detection
but don't advertise that correctly (Stuart Hayes)
- Fix hang when powering slots up/down via sysfs (Lukas Wunner)
- Fix an MSI interrupt race (Stuart Hayes)
Virtualization:
- Add ACS quirks for Zhaoxin devices (Raymond Pang)
Error handling:
- Add Error Disconnect Recover (EDR) support so firmware can report
devices disconnected via DPC and we can try to recover (Kuppuswamy
Sathyanarayanan)
Peer-to-peer DMA:
- Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew
Maier)
ASPM:
- Reduce severity of common clock config message (Chris Packham)
- Clear the correct bits when enabling L1 substates, so we don't go
to the wrong state (Yicong Yang)
Endpoint framework:
- Replace EPF linkup ops with notifier call chain and improve locking
(Kishon Vijay Abraham I)
- Fix concurrent memory allocation in OB address region (Kishon Vijay
Abraham I)
- Move PF function number assignment to EPC core to support multiple
function creation methods (Kishon Vijay Abraham I)
- Fix issue with clearing configfs "start" entry (Kunihiko Hayashi)
- Fix issue with endpoint MSI-X ignoring BAR Indicator and Table
Offset (Kishon Vijay Abraham I)
- Add support for testing DMA transfers (Kishon Vijay Abraham I)
- Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I)
- Add support for tests to clear IRQ (Kishon Vijay Abraham I)
- Add common DT schema for endpoint controllers (Kishon Vijay Abraham I)
Amlogic Meson PCIe controller driver:
- Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi
Pommarel)
- Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi
Pommarel)
Cadence PCIe controller driver:
- Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay
Abraham I)
Intel VMD host bridge driver:
- Add two VMD Device IDs that require bus restriction mode (Sushma
Kalakota)
Mobiveil PCIe controller driver:
- Refactor and modularize mobiveil driver (Hou Zhiqiang)
- Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang)
Microsoft Hyper-V host bridge driver:
- Add support for Hyper-V PCI protocol version 1.3 and
PCI_BUS_RELATIONS2 (Long Li)
- Refactor to prepare for virtual PCI on non-x86 architectures (Boqun
Feng)
- Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui)
NVIDIA Tegra PCIe controller driver:
- Use pci_parse_request_of_pci_ranges() (Rob Herring)
- Add support for endpoint mode and related DT updates (Vidya Sagar)
- Reduce -EPROBE_DEFER error message log level (Thierry Reding)
Qualcomm PCIe controller driver:
- Restrict class fixup to specific Qualcomm devices (Bjorn Andersson)
Synopsys DesignWare PCIe controller driver:
- Refactor core initialization code for endpoint mode (Vidya Sagar)
- Fix endpoint MSI-X to use correct table address (Kishon Vijay
Abraham I)
TI DRA7xx PCIe controller driver:
- Fix MSI IRQ handling (Vignesh Raghavendra)
TI Keystone PCIe controller driver:
- Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I)
Miscellaneous:
- Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng
Feng)
- Use ioremap(), not phys_to_virt(), for platform ROM to fix video
ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)"
* tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits)
misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS
PCI: tegra: Print -EPROBE_DEFER error message at debug level
misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq()
misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
tools: PCI: Add 'e' to clear IRQ
misc: pci_endpoint_test: Add ioctl to clear IRQ
misc: pci_endpoint_test: Avoid using module parameter to determine irqtype
PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt
PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address
PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments
misc: pci_endpoint_test: Add support to get DMA option from userspace
tools: PCI: Add 'd' command line option to support DMA
misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation
PCI: endpoint: functions/pci-epf-test: Print throughput information
PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data
PCI: pciehp: Fix MSI interrupt race
PCI: pciehp: Fix indefinite wait on sysfs requests
PCI: endpoint: Fix clearing start entry in configfs
PCI: tegra: Add support for PCIe endpoint mode in Tegra194
PCI: sysfs: Revert "rescan" file renames
...
On ARCTURUS and RENOIR, powerplay is not supported yet.
When plug in or unplug power jack, ACPI event will issue.
Then kernel NULL pointer BUG will be triggered.
Check for NULL pointers before calling.
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Change SH_MEM_CONFIG Alignment mode to Automatic, as:
1)OGL fn_amd_compute_shader will failed with unaligned mode.
2)The default alignment mode was defined to automatic on gfx10
specification.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Navi ASICs don't require to access through PSP to osssys registers.
This on SR-IOV configuration.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The buffer used when calling psp is a shared buffer. If we have multiple calls
at the same time we can overwrite the buffer.
[How]
Add mutex to guard the shared buffer.
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This series focuses on corner case bug fixes and general clarity
improvements to hmm_range_fault().
- 9 bug fixes
- Allow pgmap to track the 'owner' of a DEVICE_PRIVATE - in this case the
owner tells the driver if it can understand the DEVICE_PRIVATE page or
not. Use this to resolve a bug in nouveau where it could touch
DEVICE_PRIVATE pages from other drivers.
- Remove a bunch of dead, redundant or unused code and flags
- Clarity improvements to hmm_range_fault()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6CUDEACgkQOG33FX4g
mxqXgQ/+I2o4QIq86xTddRCesINab73sA4bdvIYgixojTBvo5a9Tu9JO3nmzJOEi
mmUXsdN+MnyTdnrkcfbEvo0b5ktCGlwGYQmcRv9MrB1QvSChg6Zy3F3cWR+dsv5c
Mo3z0sWRx+//dVSI0p1BMhd5rym0nZCh/rUO94C/XgZ3YoC+MjX4lKeCMvjJpZFc
DDEN3Z+zjOvjeiT9TMNMmkPnZSY+N4RQwTRKek5l95QwcHyFy5SBI+uzOHyE0lVw
KG5+yk+TFRMoiHjZh4BFqkibQbVKG0qMKiOymIncTr3kL4DK0Hwf/4Zk4DMKucEb
Rs2B7C+UShiyIOzdjbnBsqPiHevAhR3nOpozJy3x/Z6fLapVoIlVx3UJJ/djuxZa
7+H2VzxKz1xGRDH4Js/WD0smZ8jisA04vW5THJVFr0Vd2sqKBo3G/5ay2Kw9NX7T
MRII1KkA/luZbmM6WLEQkliVuNkMCsVUU3hiY96tLI7uN73M9fQUe6s3NubeoFxi
UKg0zuMsAlsJxkHGeY+IMHtUR/law+k9/aZoB4idMGwV/i7KiiolbRcB0H5yn4L5
OV+4uxVBFS/n37sCpMwpx5WJtgbPzww3b6Cd0hfIUqnQzSOjP40Pl5ZX/c/2M7ps
bUdZjve5j653YeC/7NBPDprvzbfmyyxJdZZZzzXLQl5kyL2o9LE=
=UEpV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull hmm updates from Jason Gunthorpe:
"This series focuses on corner case bug fixes and general clarity
improvements to hmm_range_fault(). It arose from a review of
hmm_range_fault() by Christoph, Ralph and myself.
hmm_range_fault() is being used by these 'SVM' style drivers to
non-destructively read the page tables. It is very similar to
get_user_pages() except that the output is an array of PFNs and
per-pfn flags, and it has various modes of reading.
This is necessary before RDMA ODP can be converted, as we don't want
to have weird corner case regressions, which is still a looking
forward item. Ralph has a nice tester for this routine, but it is
waiting for feedback from the selftests maintainers.
Summary:
- 9 bug fixes
- Allow pgmap to track the 'owner' of a DEVICE_PRIVATE - in this case
the owner tells the driver if it can understand the DEVICE_PRIVATE
page or not. Use this to resolve a bug in nouveau where it could
touch DEVICE_PRIVATE pages from other drivers.
- Remove a bunch of dead, redundant or unused code and flags
- Clarity improvements to hmm_range_fault()"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits)
mm/hmm: return error for non-vma snapshots
mm/hmm: do not set pfns when returning an error code
mm/hmm: do not unconditionally set pfns when returning EBUSY
mm/hmm: use device_private_entry_to_pfn()
mm/hmm: remove HMM_FAULT_SNAPSHOT
mm/hmm: remove unused code and tidy comments
mm/hmm: return the fault type from hmm_pte_need_fault()
mm/hmm: remove pgmap checking for devmap pages
mm/hmm: check the device private page owner in hmm_range_fault()
mm: simplify device private page handling in hmm_range_fault
mm: handle multiple owners of device private pages in migrate_vma
memremap: add an owner field to struct dev_pagemap
mm: merge hmm_vma_do_fault into into hmm_vma_walk_hole_
mm/hmm: don't handle the non-fault case in hmm_vma_walk_hole_()
mm/hmm: simplify hmm_vma_walk_hugetlb_entry()
mm/hmm: remove the unused HMM_FAULT_ALLOW_RETRY flag
mm/hmm: don't provide a stub for hmm_range_fault()
mm/hmm: do not check pmd_protnone twice in hmm_vma_handle_pmd()
mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL handling
mm/hmm: return -EFAULT when setting HMM_PFN_ERROR on requested valid pages
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJehCE6AAoJEAx081l5xIa+bKAQAJj49Hj3WvSJV6cSl3LmgwPV
IcWpR6LLVrCOQiS600NyXnbv6lTmCtnIfwEneQqm5ltQvJk38QcKQvSua6+ETi9f
0hl7IiytLwv2R/pS0g9jgSsKmbeP2bDwBAR44vK6pcK6WOgCrkpoL1F4YZEDUILT
ewnRb52afF3Hw8PSG0lwgBYd7G0uto49t3nt86LjGftJgB3wPFlluVOzLHTeEh0w
FZEyKuqS8hq8EZFfG1Gu1hS2ylO9y1VgYxiv18jDyRb8jUPq+gzqH6roTPRIronA
whGZgC6SkyZ9NthCLBu4ITbO9wStAHoawzFfax25QwwuoOyrikuvGy3PfEUu+ixL
bW2UtYK6BHLGnvZChH6E7i9J6qQbNYCn3Ty5bB1KsY06sHVoP1jYUBSicPN2ELWc
9KaBI+WROBNEkge/rizwUFfD/u0MZaaSRsVSlGDdHkD7IFj2tDPhfNWdXZQl4EwR
JnndT6cu97htf7tkid7RASpJ/hnwJTb1hg0Cc9kPblOrGqEDF0K5845FLR9VptGl
5c2/KyKM/GaI793fP1TG76uDegBhV9337mUF1ZW3c6gCE7QXncZXM5jrIFRE9q2L
IbvuyUYRof1gW0r1R0WVJYAi2CRBeMd7qgkQjMrbTw0o7FiYDJXB7dc5pYj2um2v
mSVHikC2S1rbUdH0xbWM
=I0vz
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"This is the main drm pull request for 5.7-rc1.
Highlights:
- i915 enables Tigerlake by default
- i915 and amdgpu have initial OLED backlight support
[ Jani Nikula pipes up and points out that we've had a bunch of
"initial support" code for a long time already, but only now
Lyude made it actually work on real world machines ]
- vmwgfx add support to enable OpenGL 4 userspace
- zero length arrays are mostly removed.
Detailed summary:
new driver:
- tidss: TI Keystone platform display subsystem
core:
- new drm device warn macros
- mode config valid for memory constrained devices
- bridge bus format negotation
- consolidated fake vblank event handling
- dma_alloc related cleanups
- drop get_crtc callback
- dp: DP1.4 EDID corruption test
- EDID CEA detailed timings improvements
- relicense some code to dual GPL2/MIT
- convert core vblank support to per-crtc support
- rework drm_global_mutex
- bridge rework to allow omap_dss custom driver removeal
- remove drm_fb_helper connector interrfaces
- zero-length array removal
scheduler:
- support for modifying the sched list
- revert job distribution optimization
- helper to pick least loaded scheduler
- race condition fix
mst:
- various fixes
- remove register_connector callback
i915:
- uapi to allows userspace specific CS ring buffer sizes
- Tigerlake enablement patches + Tigerlake enabled by default
- new sysfs entries for engine properties
- display/logging refactors
- eDP/DP fixes for DPCD
- Gen7 back to aliasing-ppgtt
- Gen8+ irq refactor
- Avoid globals
- GEM locking fixes and simplifications
- Ice Lake and Elkhart Lake fixes and workarounds
- Baytrail/Haswell instability fix
- GVT - VFIO edid better support
amdgpu:
- Rework VM update handling in preparation for HMM support
- drm load/unload removal fixups
- USB-C PD firmware updates
- HDCP srm support
- Navi/renoir PM watermark fixes
- OLED panel support
- Optimize debugging vram access
- Use BACO for runtime pm
- DC clock programming optimizations and fixes
- PSP fw loading sequence updates
- Drop DRIVER_USE_AGP
- Remove legacy drm load and unload callbacks
- ACP Kconfig fix
- Lots of fixes across the driver
amdkfd:
- runtime pm support
- more gfx config details in amdgpu
radeon:
- drop DRIVER_USE_AGP
vmwgfx:
- Disable DMA when SEV encryption in use
- Shader Model 5 support - needed for GL4 support
msm:
- DPU resource manager refactor
- dpu using atomic global state
mediatek:
- MT8183 DPI support
etnaviv:
- out-of-bounds read fix
- expose feature flags for GC400 STM32MP1 SoC
- runtime suspend entry fix
- dma32 zone fix
hisilicon:
- mode selection fixes
meson:
- YUV420 support
lima:
- add support for heap buffers
tinydrm:
- removal of owner field
- explicit DT dependency removal
- YAML schema conversion
tegra:
- misc cleanups
tidss:
- new driver
virtio:
- better batching of notifications to host
- memory handling reworked
- shmem + gpu context fixes
hibmc:
- add gamma_set support
- improve DPMS support
pl111:
- Integrator IM-PD1 support
sun4i:
- LVDS support for A20 + A33
- DSI panel handling improvements"
* tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm: (1537 commits)
drm/i915/display: Fix mode private_flags comparison at atomic_check
drm/i915/gt: Stage the transfer of the virtual breadcrumb
drm/i915/gt: Select the deepest available parking mode for rc6
drm/i915: Avoid live-lock with i915_vma_parked()
drm/i915/gt: Treat idling as a RPS downclock event
drm/i915/gt: Cancel a hung context if already closed
drm/i915: Use explicit flag to mark unreachable intel_context
drm/amdgpu: don't try to reserve training bo for sriov (v2)
drm/amdgpu/smu11: add support for SMU AC/DC interrupts
drm/amdgpu/swSMU: handle manual AC/DC notifications
drm/amdgpu/swSMU: handle DC controlled by GPIO for navi1x
drm/amdgpu/swSMU: set AC/DC mode based on the current system state (v2)
drm/amdgpu/swSMU: correct the bootup power source for Navi1X (v2)
drm/amdgpu/swSMU: use the smu11 power source helper for navi1x
drm/amdgpu/smu11: add a helper to set the power source
drm/amd/swSMU: add callback to set AC/DC power source (v2)
drm/scheduler: fix rare NULL ptr race
drm/amdgpu: fix the coverage issue to clear ArcVPGRs
drm/amd/display: Fix pageflip event race condition for DCN.
drm/[radeon|amdgpu]: Remove HAINAN board from max_sclk override check
...
Pull trivial tree updates from Jiri Kosina:
"My attempt to revitalize trivial queue I've been neglecting for years
(what a disaster that was for this world, right? :) ) with patches
collected from backlog that were still relevant and not applied
elsewhere in the meantime"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
err.h: remove deprecated PTR_RET for good
blk-mq: Fix typo in comment
x86/boot: Fix comment spelling
sh: mach-highlander: Fix comment spelling
s390/dasd: Fix comment spelling
mfd: wm8994: Fix comment spelling
docs: Add reference in binfmt-misc.rst
genirq: fix kerneldoc comment for irq_desc
drm/amdgpu: fix two documentation mismatch issues
HID: fix Kconfig word ordering
list/hashtable: minor documentation corrections.
There is a spelling mistake in a dev_err error message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The problem is that we can't add the clear fence to the BO
when there is an exclusive fence on it since we can't
guarantee the the clear fence will complete after the
exclusive one.
To fix this refactor the function and also add the exclusive
fence as shared to the resv object.
v2: fix warning
v3: add excl fence as shared instead
v4: squash in fix for fence handling in amdgpu_gem_object_close
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable VCN2.5 DPG mode for arcturus after below items are applied.
ASD: 0x21000023
SOS: 0x17003B
VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16
VBIOS: 23
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add firmware write/read point reset sync through shared memory
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add firmware write/read point reset sync through shared memory
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added firmware share memory support for VCN. Current multiple
queue mode is enabled only.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add vcn dpg harware synchronization to fix race condition
issue between vcn driver and hardware.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add vcn dpg harware synchronization to fix race condition
issue between vcn driver and hardware.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Couldn't only rely on enc fence to decide switching to dpg unpaude mode.
Since a enc thread may not schedule a fence in time during multiple
threads running situation.
v3: 1. Rename enc_submission_cnt to dpg_enc_submission_cnt
2. Add dpg_enc_submission_cnt check in idle_work_handler
v4: Remove extra counter check, and reduce counter before idle
work schedule
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix race condition issue when multiple vcn starts are called.
v2: Removed checking the return value of cancel_delayed_work_sync()
to prevent possible races here.
v3: Add total_submission_cnt to avoid gate power unexpectedly.
v4: Remove extra counter check, and reduce counter before idle
work schedule
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to the new L1.0b0c011b policy, many SDMA registers are blocked which raise
the violation warning. There are total 6 pair register needed to be skipped
when driver init and de-init.
mmSDMA0/1_CNTL
mmSDMA0/1_F32_CNTL
mmSDMA0/1_UTCL1_PAGE
mmSDMA0/1_UTCL1_CNTL
mmSDMA0/1_CHICKEN_BITS,
mmSDMA0/1_SEM_WAIT_FAIL_TIMER_CNTL
v2: squash in warning fix
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When we stop the HW for example for GPU reset we should not stop the
front-end scheduler. Otherwise we run into intermediate failures during
command submission.
The scheduler should only be stopped in very few cases:
1. We can't get the hardware working in ring or IB test after a GPU reset.
2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset.
3. In amdgpu_ring_fini() when the driver unloads.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Test-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Due Page faults can easily overwhelm the interrupt handler.
So to make sure that we never lose valuable interrupts on the primary ring
we re-route page faults to IH ring 1.
It also facilitates the recovery page process, since it's already
running from a process context.
This is valid for Arcturus and future Navi generation GPUs.
[How]
Setting IH_CLIENT_CFG_DATA for VMC and UMD IH clients.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
call psp to program ih cntl in SR-IOV if supported on Navi and Arcturus.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support added into IH to enable ring1 and ring2 for navi10_ih.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
nbio v7.4 size of ih doorbell range is 64 bit. This requires 2 DWords per register.
[How]
Change ih doorbell size from 2 to 4. This means two Dwords per ring.
Current configuration uses two ih rings.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Previously these registers were set to 0. This was causing an
infinite retry on the UTCL1 RB, preventing higher priority RB such as paging RB.
[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs on Vega10, Vega12,
Vega20 and Arcturus.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clean up the smu10, smu12, and gfx9 drivers to use headers for
registers instead of hardcoding in the C source files.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We have three ib pools, they are normal, VM, direct pools.
Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.
Any jobs schedule direct VM update IBs should use VM pool.
Any other jobs use NORMAL pool.
v2: squash in coding style fix
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
extend compute lockup timeout to 60000 for SR-IOV.
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiawei <Jiawei.Gu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As no need to support vcn decode feature, so disable the
ring for SR-IOV.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
if host support new handshake we only need to enter
fullaccess_mode in ip_init() part, otherwise we need
to do it before reading vbios (becuase host prepares vbios
for VF only after received REQ_GPU_INIT event under
legacy handshake)
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
what:
1)move timtout setting before ip_early_init to reduce exclusive mode
cost for SRIOV
2)move ip_discovery_init() to inside of amdgpu_discovery_reg_base_init()
it is a prepare for the later upcoming patches.
why:
in later upcoming patches we would use a new mailbox event --
"req_gpu_init_data", which is a callback hooked in adev->virt.ops and
this callback send a new event "REQ_GPU_INIT_DAT" to host to notify
host to do some preparation like "IP discovery/vbios on the VF FB"
and this callback must be:
A) invoked after set_ip_block() because virt.ops is configured during
set_ip_block()
B) invoked before ip_discovery_init() becausen ip_discovery_init()
need host side prepares everything in VF FB first.
current place of ip_discovery_init() is before we can invoke callback
of adev->virt.ops, thus we must move ip_discovery_init() to a place
after the adev->virt.ops all settle done, and the perfect place is in
amdgpu_discovery_reg_base_init()
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
by this new handshake host side can prepare vbios/ip-discovery
and pf&vf exchange data upon recieving this request without
stopping world switch.
this way the world switch is less impacted by VF's exclusive mode
request
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
what:
with the new "req_init_data" handshake we need to use mailbox
before do IP discovery, so in mxgpu_nv.c file the original
SOC15_REG method won'twork because that depends on IP discovery
complete first.
how:
so the solution is to always use static MMIO offset for NV+ mailbox
registers.
HW team confirm us all MAILBOX registers will be at the same
offset for all ASICs, no IP discovery needed for those registers
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1) modify xgpu_nv_send_access_requests to support
new idh request
2) introduce new function: req_gpu_init_data() which
is used to notify host to prepare vbios/ip-discovery/pfvf exchange
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
new idh_request and ihd_event to prepare for the
new handshake protocol implementation later
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1) drop the headers from AI in mxgpu_nv.c, should refer to mxgpu_nv.h
2) the IDH_EVENT_MAX is not used and not aligned with host side
so drop it
3) the IDH_TEXT_MESSAG was provided in host but not defined in guest
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The conversion to bool is not needed, remove it.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the VCN firmware will not use
vf vmr now. And new psp policy won't support set tmr
now.
For driver compatible issue, ignore the not support error.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add 4k resolution for virtual connector.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The crtc num is determined by virtual_display parameter.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
added flag to ras context to indicate if ras query functionality is ready
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
we need to move virt detection much earlier because:
1) HW team confirms us that RCC_IOV_FUNC_IDENTIFIER will always
be at DE5 (dw) mmio offset from vega10, this way there is no
need to implement detect_hw_virt() routine in each nbio/chip file.
for VI SRIOV chip (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at
0x1503
2) we need to acknowledged we are SRIOV VF before we do IP discovery because
the IP discovery content will be updated by host everytime after it recieved
a new coming "REQ_GPU_INIT_DATA" request from guest (there will be patches
for this new handshake soon).
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
those two headers are not needed for ip discovery
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ensure that when we memcpy, we don't end up copying more data than
the struct supports. For now, this is 16 characters for product number
and serial number, and 32 chars for product name
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reporting the fw_version just returns 0, the actual version is kept as
ta_*_ucode_version. This is the same as the feature reported in
the amdgpu_firmware_info debugfs file.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
added asic support checking function to be filled in by supported asic types
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow for reading of information like manufacturer, product number
and serial number from the FRU chip. Report the serial number as
the new sysfs file serial_number. Note that this only works on
server cards, as consumer cards do not feature the FRU chip, which
contains this information.
v2: Add documentation to amdgpu.rst, add helper functions,
rename functions for consistency, fix bad starting offset
v3: Remove testing definitions
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note if a buffer was imported using peer2peer.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/359296
We should be able to do this now after checking all the prerequisites.
v2: fix entrie count in the sgt
v3: manually construct the sg
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/359295
Check if we can do peer2peer on the PCIe bus.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/359294
Importing should work out of the box.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/359293
the HPD bo size calculation error.
the "mem.size" can't present actual BO size all time.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <Christian.Koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl6BIG4eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGlHUH/RCFve2sfHRPjRW+
xR5SaLVAw6XKvtKBq7yvKmHEwqNJnL79IHyqqtSrtfFr2FfaH/KvYiCbbAezvSrM
np0udGu7STKGd21CWuyEZJudyhXkOwMRNiFiCXWp7rs35oh8T0TpJxMzo2Nc1nLk
JFQPqAP6OSvq4IkWEywKQI+Au3Z1IBf83xVjZ1s+MKPQHYD49x2hc4cQntL5/cnm
a3DoR2iBkYiGZCZ9dDqAqJTnMQIiCbACdZXgGjNRUpdyA/dtAjsMl11NPYHm8TA2
3AHBupAK50WBZGad6xv2qKQyScsmoJG2mv92QjlOFz0Tpiu6rLnDlLYREDVB6YH6
qbLDsc8=
=XEIU
-----END PGP SIGNATURE-----
Merge v5.6 into drm-next
msm needed rc6, so I just went and merged release
(msm has been in drm-next outside of this tree)
Signed-off-by: Dave Airlie <airlied@redhat.com>
On some EFI systems, the video BIOS is provided by the EFI firmware. The
boot stub code stores the physical address of the ROM image in pdev->rom.
Currently we attempt to access this pointer using phys_to_virt(), which
doesn't work with CONFIG_HIGHMEM.
On these systems, attempting to load the radeon module on a x86_32 kernel
can result in the following:
BUG: unable to handle page fault for address: 3e8ed03c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 317 Comm: systemd-udevd Not tainted 5.6.0-rc3-next-20200228 #2
Hardware name: Apple Computer, Inc. MacPro1,1/Mac-F4208DC8, BIOS MP11.88Z.005C.B08.0707021221 07/02/07
EIP: radeon_get_bios+0x5ed/0xe50 [radeon]
Code: 00 00 84 c0 0f 85 12 fd ff ff c7 87 64 01 00 00 00 00 00 00 8b 47 08 8b 55 b0 e8 1e 83 e1 d6 85 c0 74 1a 8b 55 c0 85 d2 74 13 <80> 38 55 75 0e 80 78 01 aa 0f 84 a4 03 00 00 8d 74 26 00 68 dc 06
EAX: 3e8ed03c EBX: 00000000 ECX: 3e8ed03c EDX: 00010000
ESI: 00040000 EDI: eec04000 EBP: eef3fc60 ESP: eef3fbe0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206
CR0: 80050033 CR2: 3e8ed03c CR3: 2ec77000 CR4: 000006d0
Call Trace:
r520_init+0x26/0x240 [radeon]
radeon_device_init+0x533/0xa50 [radeon]
radeon_driver_load_kms+0x80/0x220 [radeon]
drm_dev_register+0xa7/0x180 [drm]
radeon_pci_probe+0x10f/0x1a0 [radeon]
pci_device_probe+0xd4/0x140
Fix the issue by updating all drivers which can access a platform provided
ROM. Instead of calling the helper function pci_platform_rom() which uses
phys_to_virt(), call ioremap() directly on the pdev->rom.
radeon_read_platform_bios() previously directly accessed an __iomem
pointer. Avoid this by calling memcpy_fromio() instead of kmemdup().
pci_platform_rom() now has no remaining callers, so remove it.
Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Now that flags are handled on a fine-grained per-page basis this global
flag is redundant and has a confusing overlap with the pfn_flags_mask and
default_flags.
Normalize the HMM_FAULT_SNAPSHOT behavior into one place. Callers needing
the SNAPSHOT behavior should set a pfn_flags_mask and default_flags that
always results in a cleared HMM_PFN_VALID. Then no pages will be faulted,
and HMM_FAULT_SNAPSHOT is not a special flow that overrides the masking
mechanism.
As this is the last flag, also remove the flags argument. If future flags
are needed they can be part of the struct hmm_range function arguments.
Link: https://lore.kernel.org/r/20200327200021.29372-5-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Remove the HMM_PFN_DEVICE_PRIVATE flag, no driver has ever set this flag
on input, and the only place that uses it on output can be trivially
changed to use is_device_private_page().
This removes the ability to request that device_private pages are faulted
back into system memory.
Link: https://lore.kernel.org/r/20200316193216.920734-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
1) SRIOV guest KMD doesn't care training buffer
2) if we resered training buffer that will overlap with IP discovery
reservation because training buffer is at vram_size - 0x8000 and
IP discovery is at ()vram_size - 0x10000 => vram_size -1)
v2: squash in warning fix from Nirmoy
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For boards that do not support automatic AC/DC transitions
in firmware, manually tell the firmware when the status
changes.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set ComputePGMRSRC1.VGPRS as 0x3f to clear all ArcVGPRs.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RLCG is enabled by host driver, no need to enable it in guest for none-PSP load path
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMHub EDC becomes dirty after BACO reset
EDC registers should be cleared early on in reset phase
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are spelling mistakes in pr_err messages and a comment. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
clang warns:
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:754:6: warning: variable 'shadow'
is used uninitialized whenever 'if' condition is
false [-Wsometimes-uninitialized]
if (offset == grbm_cntl || offset == grbm_idx)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:757:6: note: uninitialized use
occurs here
if (shadow) {
^~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:754:2: note: remove the 'if' if
its condition is always true
if (offset == grbm_cntl || offset == grbm_idx)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:738:13: note: initialize the
variable 'shadow' to silence this warning
bool shadow;
^
= 0
1 warning generated.
shadow is only assigned in one condition and used as the condition for
another if statement; combine the two if statements and remove shadow
to make the code cleaner and resolve this warning.
Fixes: 2e0cc4d48b ("drm/amdgpu: revise RLCG access path")
Link: https://github.com/ClangBuiltLinux/linux/issues/936
Suggested-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn2.5/jpeg2.5 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn2/jpeg2 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn1 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CAP fw is for enabling driver compatibility. Currently, it only
enabled for vega10 VF.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Originally, only the PTE valid is taken in consider.
The PRT case is missied when bo update which raise problem.
We need add condition for PRT case.
v2: add PRT condition for amdgpu_vm_bo_update_mapping, too
v3: fix one typo error
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn2.5/jpeg2.5 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn2/jpeg2 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix typo for vcn1 idle check
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function name mentioned in the documentational comments mismatches
the actual one. The mismatch may make trouble for automatic documentation
generation. One of the erronous name has seen to be misused
and fixed in commit bc5ab2d29b ("drm/amdgpu: fix typo in function
sdma_v4_0_page_resume").
There is apparently no functional change in the patch.
Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
VCN HW doesn't support dynamic load balance on multiple instances
for a context. This patch initializes VNC entities with only one
drm_gpu_scheduler picked by drm_sched_pick_best(). Picking a
drm_gpu_scheduler using drm_sched_pick_best() ensures that we
do load balance among multiple contexts but not among multiple
jobs in a context.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Puts the i2c adapter in common place for sharing by RAS
and upcoming data read from FRU EEPROM feature.
v2:
Move i2c adapter to amdgpu_pm and rename it.
v3: Move i2c adapter init to ASIC specific code and get rid
of the switch case in amdgpu_device
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Job fence on page table should be a shared one, so add it to the root
page talbe bo resv.
last_delayed field is not needed anymore. so remove it.
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix switch-case indentation in amdgpu_ctx_init_entity()
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
what changed:
1)provide new implementation interface for the rlcg access path
2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op
function can access reg that need RLCG path help
now even debugfs's reg_op can used to dump wave.
tested-by: Monk Liu <monk.liu@amd.com>
tested-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
AccVGPRs are newly added in arcturus. Before reading these
registers, they should be initialized. Otherwise edc error
happens, when RAS is enabled.
v2: reuse the existing logical to calculate register size
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the warning
"warn: variable dereferenced before check 'obj' (see line 1131)"
by removing unnecessary checks as amdgpu_ras_debugfs_create_all()
is only called from amdgpu_debugfs_init() where obj member in
con->head list is not NULL.
Use list_for_each_entry() instead list_for_each_entry_safe() as obj
do not to be freeing or removing from list during this process.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This can fix the baco reset failure seen on Navi10.
And this should be a low risk fix as the same sequence
is already used for system suspend/resume.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RAS support capability needs to be updated on top of different
memeory ECC enablement, and remove redundant memory ecc check
in gmc module for vega20 and arcturus.
v2: check HBM ECC enablement and set ras mask accordingly.
v3: avoid to invoke atomfirmware interface to query twice.
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
v2: Read from correct offset from internal storage buffer.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
include amdgpu_ras.h head file instead of use extern
ras_debugfs_create_all function
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
disallow the logical to be enabled on platforms that
don't support gfx ras at this stage, like sriov skus,
dgpu with legacy ras.etc
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
invoking an error injection successfully will cause an at_event intterrupt that
will occur before the invoke sequence can complete causing an invalid error
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
refine the assignment for vcn.num_vcn_inst,
vcn.harvest_config, vcn.num_enc_rings in VF
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This can fix the baco reset failure seen on Navi10.
And this should be a low risk fix as the same sequence
is already used for system suspend/resume.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
v2: Read from correct offset from internal storage buffer.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
amd-drm-next-5.7-2020-03-10:
amdgpu:
- SR-IOV fixes
- Fix up fallout from drm load/unload callback removal
- Navi, renoir power management watermark fixes
- Refactor smu parameter handling
- Display FEC fixes
- Display DCC fixes
- HDCP fixes
- Add support for USB-C PD firmware updates
- Pollock detection fix
- Rework compute ring priority handling
- RAS fixes
- Misc cleanups
amdkfd:
- Consolidate more gfx config details in amdgpu
- Consolidate bo alloc flags
- Improve code comments
- SDMA MQD fixes
- Misc cleanups
gpu scheduler:
- Add suport for modifying the sched list
uapi:
- Clarify comments about GEM_CREATE flags that are not used by userspace.
The kernel driver has always prevented userspace from using these.
They are only used internally in the kernel driver.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310212748.4519-1-alexander.deucher@amd.com
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5lkYceHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGpHQH/RJrzcaZHo4lw88m
Jf7vBZ9DYUlRgqE0pxTHWmodNObKRqpwOUGflUcWbb/7GD2LQUfeqhSECVQyTID9
N9y7FcPvx321Qhc3EkZ24DBYk0+DQ0K2FVUrSa/PxO0n7czxxXWaLRDmlSULEd3R
D4pVs3zEWOBXJHUAvUQ5R+lKfkeWKNeeepeh+rezuhpdWFBRNz4Jjr5QUJ8od5xI
sIwobYmESJqTRVBHqW8g2T2/yIsFJ78GCXs8DZLe1wxh40UbxdYDTA0NDDTHKzK6
lxzBgcmKzuge+1OVmzxLouNWMnPcjFlVgXWVerpSy3/SIFFkzzUWeMbqm6hKuhOn
wAlcIgI=
=VQUc
-----END PGP SIGNATURE-----
Merge v5.6-rc5 into drm-next
Requested my mripard for some misc patches that need this as a base.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some framework test will fail if enable runpm on Vega10.
Disable it untill issue fixed.
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Tested-by: Kyle Chen <Kyle.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
and remove each ras IP's own debugfs creation
this is required to fix ras when the driver does not use the drm load
and unload callbacks due to ordering issues with the drm device node.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
centralize all debugfs creation in one place for ras
this is required to fix ras when the driver does not use the drm load
and unload callbacks due to ordering issues with the drm device node.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only kernel bo has kfd eviction fence.
This warning is to give a notice that kfd only remove eviction fence on
individual bos.
Tested-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ALLOC_MEM_FLAGS_* used are the same as the KFD_IOC_ALLOC_MEM_FLAGS_*,
but they are interweavedly used in kernel driver, resulting in bad
readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is not
referenced in kernel, and it functions implicitly in kernel through
ALLOC_MEM_FLAGS_COHERENT, causing unnecessary confusion.
Replace all occurrences of ALLOC_MEM_FLAGS_* with
KFD_IOC_ALLOC_MEM_FLAGS_* to solve the problem.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If there are no high priority compute queues available then set normal
priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH]
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The ROMC_INDEX/DATA offset was changed to e4/e5 since
from smuio_v11 (vega20/arcturus).
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Tested-by: Candice Li <Candice.Li@amd.com>
Reviewed-by: Candice Li <Candice.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
AMDGPU statically sets priority for compute queues
at initialization so remove all the functions
responsible for changing compute queue priority dynamically.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switch to appropriate sched list for an entity on priority override.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We were changing compute ring priority while rings were being used
before every job submission which is not recommended. This patch
sets compute queue priority at mqd initialization for gfx8, gfx9 and
gfx10.
Policy: make queue 0 of each pipe as high priority compute queue
High/normal priority compute sched lists are generated from set of high/normal
priority compute queues. At context creation, entity of compute queue
get a sched list from high or normal priority depending on ctx->priority
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CRTC in DPMS state off calls for low power state entry.
Support both atomic mode setting and pre-atomic mode setting.
v2: move comment
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
and disable MC resum in VCN2.0 as well
those are not concerned by VF driver
Singed-off-by: darlington Opara <darlington.opara@amd.com>
Signed-off-by: Jinage Zhao <jiange.zhao@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
support IB test on dec/enc ring
disable ring test on dec/enc ring (MMSCH limitation)
v2: squash in unused variable warning fix
Singed-off-by: darlington Opara <darlington.opara@amd.com>
Signed-off-by: Jinage Zhao <jiange.zhao@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
something need to do for VCN2.0 enablement on SRIOV:
1)use one dec ring and one enc ring
2)allocate MM table for MMSCH usage
3)implement SRIOV version vcn_start which orgnize vcn programing
with patcket format and implement start mmsch for to run those
packet
4)doorbell is changed for SRIOV
Singed-off-by: darlington Opara <darlington.opara@amd.com>
Signed-off-by: Jinage Zhao <jiange.zhao@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMSCH doesn't support jpeg ring on SRIOV
Signed-off-by: Jinage Zhao <jiange.zhao@amd.com>
Singed-off-by: darlington Opara <darlington.opara@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add "CP" to AMDGPU_GEM_CREATE_MQD_GFX9 to indicate it is only for CP MQD
buffer.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Problem:
During GU reset PSP's sysfs was being wrongly reinitilized
during call to amdgpu_device_ip_late_init which was failing
with duplicate error.
Fix:
Move psp_sysfs_init to psp_sw_init to avoid this. Add guards
in sysfs file's read and write hook agains premature call
if PSP is not finished initialization.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SPM access the video memory according to SPM_VMID. It should be updated
with the job's vmid right before the job is scheduled. SPM_VMID is a
global resource
Signed-off-by: Jacob He <jacob.he@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
check UMC status and exit prior to making and erroneus register access
this resolved unexpected behaviour with UMC indexing mode broadcasting writes
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On arcturus, DF-Cstate needs to be toggled off/on
before and after accessing UMC error counter and
error address registers, otherwise, clearing such
registers may fail.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mitigates race condition on BACO reset between GPU bootcode and driver reload
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add arcturus xgmi/wafl pcs err status group to support
PCS error detection and report on arcturus
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now driver will report XGMI/WAFL PCS error through
sysfs xgmi_wafl_err_count node on Vega20
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since from vega20, hardware supports run-time detect
and report XGMI/WAFL PCS ras error. Add helper functions
to walkthrough every type of ras error and report it if
any.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm_fb_helper_{add,remove}_one_connector() and
drm_fb_helper_single_add_all_connectors() are dummy functions now
and serve no purpose. Hence remove their calls.
This is the preparatory step for removing the
drm_fb_helper_{add,remove}_one_connector() functions from
drm_fb_helper.h
This removal is done using below sementic patch and unused variable
compilation warnings are fixed manually.
@@
@@
- drm_fb_helper_single_add_all_connectors(...);
@@
expression e1;
statement S;
@@
- e1 = drm_fb_helper_single_add_all_connectors(...);
- S
@@
@@
- drm_fb_helper_add_one_connector(...);
@@
@@
- drm_fb_helper_remove_one_connector(...);
Changes since v1:
* Squashed warning fixes into the patch that introduced the
warnings (into 5/7) (Laurent, Emil, Lyude)
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305120434.111091-6-pankaj.laxminarayan.bharadiya@intel.com
The max connector argument for drm_fb_helper_init() isn't used anymore
hence remove it.
All the drm_fb_helper_init() calls are modified with below sementic
patch.
@@
expression E1, E2, E3;
@@
- drm_fb_helper_init(E1,E2, E3)
+ drm_fb_helper_init(E1,E2)
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305120434.111091-2-pankaj.laxminarayan.bharadiya@intel.com
[why]
CP firmware decide to skip setting the state for 3D pipe 1 for Navi1x as there
is no use case.
[how]
Disable 3D pipe 1 on Navi1x.
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org