We need to make sure the various init sequences submitted
to KIQ complete before testing the rings.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KIQ is the Kernel Interface Queue for managing the MEC. Rather than setting
up rings via direct MMIO of ring registers, the rings are configured via
special packets sent to the KIQ. The allows the MEC to better manage shared
resources and certain power events.
v2: squash in s3/s4 fix from Rex
v3: further fixes from Rex
Signed-off-by: David Panariti <David.Panariti@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing chips to the doorbell range setup. These
were missed in the KIQ code. Fixes power and performance
regressions with KIQ. Spotted by Rex.
Tested-and-Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to properly set the MTYPE and ROQ space setting.
This should fix performance regressions with KIQ enabled.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add new RIDs.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some of these paths probably cannot be interrupted by a signal anyway.
Those that can would fail to clean up things if they actually got
interrupted.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This way GFX and MM won't fight for VMIDs any more.
Initially disabled since we need to stop flushing all HUBS
at the same time as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1,KIQ won't touch VRAM so no need to involv HDP flush/invalidate at all.
2,According to CP hw designer KIQ better not use any PM4 package lead to wait behave.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clean up a toggle with ?:.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Swap read/write pattern for WREG32_FIELD()
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Swap read/write pattern for WREG32_FIELD()
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use new WREG32_FIELD_OFFSET() to clean up code.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1) Adapt to vulkan:
Now use double SWITCH BUFFER to replace the 128 nops w/a,
because when vulkan introduced, umd can insert 7 ~ 16 IBs
per submit which makes 256 DW size cannot hold the whole
DMAframe (if we still insert those 128 nops), CP team suggests
use double SWITCH_BUFFERs, instead of tricky 128 NOPs w/a.
2) To fix the CE VM fault issue when MCBP introduced:
Need one more COND_EXEC wrapping IB part (original one us
for VM switch part).
this change can fix vm fault issue caused by below scenario
without this change:
>CE passed original COND_EXEC (no MCBP issued this moment),
proceed as normal.
>DE catch up to this COND_EXEC, but this time MCBP issued,
thus DE treats all following packages as NOP. The following
VM switch packages now looks just as NOP to DE, so DE
dosen't do VM flush at all.
>Now CE proceeds to the first IBc, and triggers VM fault,
because DE didn't do VM flush for this DMAframe.
3) change estimated alloc size for gfx9.
with new DMAframe scheme, we need modify emit_frame_size
for gfx9
4) No need to insert 128 nops after gfx8 vm flush anymore
because there was double SWITCH_BUFFER append to vm flush,
and for gfx7 we already use double SWITCH_BUFFER following
after vm_flush so no change needed for it.
5) Change emit_frame_size for gfx8
v2: squash in BUG removal from Monk
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Apply the new IB during IB emit for SRIOV with MCBP
v2: agd: use define instead of magic number
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
when MCBP enabled for gfx8, the cond_exec must also
be implemented, otherwise there will be odds to meet
cross engine (ce and me) deadlock when world switch
happens.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch introduces a new flag named "amdgpu_firmware_load_type" to
handle different firmware loading method. Since Vega10, there are
three ways to load firmware. It would be better to use a flag and a
fw_load_type kernel parameter to configure it.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The ring structure already has what we need.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoids passing around additional parameters during setup.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Everything we need is in the ring structure. No need to
pass all the bits explicitly.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to loop through the compute queues twice.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If KIQ isn't working, the compute rings won't work either.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To better match where they are used. Called from sw_init
and sw_fini.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Found with scripts/coccinelle/misc/boolconv.cocci.
Signed-off-by: Andrew F. Davis <afd@ti.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Newer asics have a two levels of irq ids now:
client id - the IP
src id - the interrupt src within the IP
v2: integrated Christian's comments.
v3: fix rebase fail in SI and CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Newer asics need 64 bit wptrs. If the wptr is now
smaller than the rptr that doesn't indicate a wrap-around
anymore.
v2: integrate Christian's comments.
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: agd: move apertures to mc structure
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Because different HWs have different definition for CE & DE meta
data, follow mqd design to move the structures to vi_structs.h.
And change the prefix from amdgpu to vi as the structures is only
for VI family.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In some cases, manually insmod/rmmod amdgpu is necessary. When
unloading amdgpu, the KIQ IRQ enable/disable function will case
system hang. The root cause is, in the sequence of function
amdgpu_fini, the sw_fini of IP block AMD_IP_BLOCK_TYPE_GFX will be
invoked earlier than that of AMD_IP_BLOCK_TYPE_IH. So continue to use
the variable freed by AMD_IP_BLOCK_TYPE_GFX will cause system hang.
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Xiangliang Yu < Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: move the config struct to drm_amdgpu_info_device
v3: move the config feature to amdgpu_gca_config
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to free mqd backup when destroying ring.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vi_mqd is only used by VI family but mqd_ptr and mqd_backup is
common for all ASIC, so change the pointer to void.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2:
use in_rest to fix compute ring test failure issue
which occured after FLR/gpu_reset.
we need backup a clean status of MQD which was created in drv load
stage, and use it in resume stage, otherwise KCQ and KIQ all may
faild in ring/ib test.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In resume routine, we need clr RB prior to the
ring test of engine, otherwise some engine hang
duplicated during GPU reset.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this is required for restoring the mqds after GPU reset.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change-Id: Ica8f86577a50d817119de4b4fb95068dc72652a9
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sw part only invoked once during sw_init.
hw part invoked during first drv load and resume later.
that way we cannot alloc mqd in hw/resume, we only keep
mqd allocted in sw_init routine.
and hw_init routine only kmap and set it.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We ultimately want to re-use this for bare metal,
so no need to have vf checks in the KIQ code itself
since kiq itself is currently only used in VF cases.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this is for SRIOV fix:
mqd soft init/fini will be invoked by sw_init to
allocate BO for compute MQD resource, instead of
original scheme that hw_init allocates MQD.
because if hw_init allocates MQD, then resume will
allocate MQD, and that lead to memory leak after
driver recovered from hang.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
introduce a new mqd member in ring is for later usage.
we need keep a clean version of MQD for the purpose
of recovering compute rings from hang.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CG & PG function changes engine clock/gating, which is
not appropriate for VF device, because one vf doesn't know
the whole picture of engine's overall workload.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
gcc-4.8 warns about '{0}' being used an an initializer for nested structures:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function ‘gfx_v8_0_ring_emit_ce_meta_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:7263:2: warning: missing braces around initializer [-Wmissing-braces]
} ce_payload = {0};
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function ‘gfx_v8_0_ring_emit_de_meta_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:7290:2: warning: missing braces around initializer [-Wmissing-braces]
} de_payload = {0};
Using an empty {} initializer however has the same effect and works on all versions.
Fixes: acad2b2a7b ("drm/amdgpu:implement CE/DE meta-init routines")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>