Commit Graph

661 Commits

Author SHA1 Message Date
Likun Gao 3b94fb101f drm/amd/powerplay: add limit of pp_feature for smu (v3)
Move pp_feature from the struct of amd_powerplay to amdgpu_device.
Add pp_feature limit for overdrive interface.

v2: put pp_feature into struct amdgpu_pm.
v3: merge feature_mask with pp_feature.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:04:02 -05:00
Dave Airlie f4bc54b532 Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Updates for 5.1:
- GDS fixes
- Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES interface
- GPUVM fixes
- PCIE DPM switching fixes for vega20
- Vega10 uclk DPM regression fix
- DC Freesync fixes
- DC ABM fixes
- Various DC cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208210214.27666-1-alexander.deucher@amd.com
2019-02-11 14:04:20 +10:00
Harish Kasiviswanathan c53134577c drm/amdgpu: Fix pci platform speed and width
The new Vega series GPU cards have in-built bridges. To get the pcie
speed and width supported by the platform walk the hierarchy and get the
slowest link.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-07 14:03:18 -05:00
Dave Airlie 37fdaa3390 drm-misc-next for 5.1:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - Split out some part of drm_crtc_helper.h into drm_probe_helper.h
   - DRIVER_* flags improvements
   - New tasks on the TODO-list
   - Improvements to the documentation
 
 Driver Changes:
   - Continual of drmP.h removal in multiple drivers
   - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers
   - sun4i: Addition of the A23 support, multiple fixes for the tiled
     formats
   - atmel-hlcdc: Fix of clipping and rotation properties
   - qxl: various BO-related improvements, prime and generic fbdev emulation
     support
   - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output
   - New Sitronix ST7701 panel driver
   - New Kingdisplay KD097D04 panel driver
   - New LeMaker BL035-RGB-002 panel driver
   - New PDA 91-00156-A0 panel driver
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXFRb/wAKCRDj7w1vZxhR
 xXj2AQDFAkH0/tksJ+T6yrHZXQE7q/6fAWTlrDXLo7Nb1hhfswEA1AyaHmZJA0Ar
 u7S+4Gfzs/gisw9Jr4bqQaerEpYJxQA=
 =qumD
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-02-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.1:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - Split out some part of drm_crtc_helper.h into drm_probe_helper.h
  - DRIVER_* flags improvements
  - New tasks on the TODO-list
  - Improvements to the documentation

Driver Changes:
  - Continual of drmP.h removal in multiple drivers
  - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers
  - sun4i: Addition of the A23 support, multiple fixes for the tiled
    formats
  - atmel-hlcdc: Fix of clipping and rotation properties
  - qxl: various BO-related improvements, prime and generic fbdev emulation
    support
  - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output
  - New Sitronix ST7701 panel driver
  - New Kingdisplay KD097D04 panel driver
  - New LeMaker BL035-RGB-002 panel driver
  - New PDA 91-00156-A0 panel driver

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201144749.t3abxvguhstu6bcl@flea
2019-02-04 14:42:34 +10:00
Dave Airlie e09191d360 Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
New stuff for 5.1.
amdgpu:
- DC bandwidth formula updates
- Support for DCC on scanout surfaces
- Support for multiple IH rings on soc15 asics
- Fix xgmi locking
- Add sysfs interface to get pcie usage stats
- Simplify DC i2c/aux code
- Initial support for BACO on vega10/20
- New runtime SMU feature debug interface
- Expand existing sysfs power interfaces to new clock domains
- Handle kexec properly
- Simplify IH programming
- Rework doorbell handling across asics
- Drop old CI DPM implementation
- DC page flipping fixes
- Misc SR-IOV fixes

amdkfd:
- Simplify the interfaces between amdkfd and amdgpu

ttm:
- Add a callback to notify the driver when the lru changes

sched:
- Refactor mirror list handling
- Rework hw fence processing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125231517.26268-1-alexander.deucher@amd.com
2019-02-01 09:34:20 +10:00
Andrey Grodzovsky 222b5f0441 drm/sched: Refactor ring mirror list handling.
Decauple sched threads stop and start and ring mirror
list handling from the policy of what to do about the
guilty jobs.
When stoppping the sched thread and detaching sched fences
from non signaled HW fenes wait for all signaled HW fences
to complete before rerunning the jobs.

v2: Fix resubmission of guilty job into HW after refactoring.

v4:
Full restart for all the jobs, not only from guilty ring.
Extract karma increase into standalone function.

v5:
Rework waiting for signaled jobs without relying on the job
struct itself as those might already be freed for non 'guilty'
job's schedulers.
Expose karma increase to drivers.

v6:
Use list_for_each_entry_safe_continue and drm_sched_process_job
in case fence already signaled.
Call drm_sched_increase_karma only once for amdgpu and add documentation.

v7:
Wait only for the latest job's fence.

Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:36 -05:00
wentalou f14899fd2a drm/amdgpu: sriov should skip asic_reset in device_init
sriov would meet guest driver load failure,
if calling amdgpu_asic_reset in amdgpu_device_init.
sriov should skip asic_reset in device_init.

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:35 -05:00
Daniel Vetter fcd70cd36b drm: Split out drm_probe_helper.h
Having the probe helper stuff (which pretty much everyone needs) in
the drm_crtc_helper.h file (which atomic drivers should never need) is
confusing. Split them out.

To make sure I actually achieved the goal here I went through all
drivers. And indeed, all atomic drivers are now free of
drm_crtc_helper.h includes.

v2: Make it compile. There was so much compile fail on arm drivers
that I figured I'll better not include any of the acks on v1.

v3: Massive rebase because i915 has lost a lot of drmP.h includes, but
not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h
there was still one, which this patch largely removes. Which means
rolling out lots more includes all over.

This will also conflict with ongoing drmP.h cleanup by others I
expect.

v3: Rebase on top of atomic bochs.

v4: Review from Laurent for bridge/rcar/omap/shmob/core bits:
- (re)move some of the added includes, use the better include files in
  other places (all suggested from Laurent adopted unchanged).
- sort alphabetically

v5: Actually try to sort them, and while at it, sort all the ones I
touch.

v6: Rebase onto i915 changes.

v7: Rebase once more.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: CK Hu <ck.hu@mediatek.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: virtualization@lists.linux-foundation.org
Cc: etnaviv@lists.freedesktop.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: spice-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-tegra@vger.kernel.org
Cc: xen-devel@lists.xen.org
Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch
2019-01-24 13:20:42 +01:00
Alex Deucher 95e8e59ec4 drm/amdgpu: check if we need to reset at init time (v2)
To deal with situations like kexec or GPU VM passthrough
where the device may have been used previously without a
proper GPU reset between.

v2: rebase

bug: https://bugs.freedesktop.org/show_bug.cgi?id=108585
bug: https://bugs.freedesktop.org/show_bug.cgi?id=108754
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:57 -05:00
Tom St Denis 22d6575b8d drm/amd/amdgpu: add missing mutex lock to amdgpu_get_xgmi_hive() (v3)
v2: Move locks around in other functions so that this
function can stand on its own.  Also only hold the hive
specific lock for add/remove device instead of the driver
global lock so you can't add/remove devices in parallel from
one hive.

v3: add reset_lock

Acked-by:  Shaoyun.liu < Shaoyun.liu@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:53 -05:00
Emily Deng 72d3f59205 drm/amdgpu/sriov: For finishing routine send rel event after init failed
When init fail, send rel init, req_fini and rel_fini to host for the
finishing routine.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:50 -05:00
wentalou 0aaeefccb4 drm/amdgpu: distinguish early and late re-init log in sriov
distinguish ip_reinit_early_sriov and ip_reinit_late_sriov
by different log RE-INIT-early and RE-INIT-late

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:49 -05:00
Emily Deng d3c117e564 drm/amdgpu/sriov:Correct pfvf exchange logic
The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu
reset.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-By: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:24 -05:00
Emily Deng 91334223b2 drm/amdgpu/virtual_dce: No need to pin the cursor bo
For virtual display feature, no need to pin cursor bo.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:23 -05:00
Daniel Vetter c2d88e06bc drm: Move the legacy kms disable_all helper to crtc helpers
It's not a core function, and the matching atomic functions are also
not in the core. Plus the suspend/resume helper is also already there.

Needs a tiny bit of open-coding, but less midlayer beats that I think.

v2: Rebase onto ast (which gained a new user).

Cc: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Rex Zhu <Rex.Zhu@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shaoyun Liu <Shaoyun.Liu@amd.com>
Cc: Monk Liu <Monk.Liu@amd.com>
Cc: nouveau@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181217194303.14397-4-daniel.vetter@ffwll.ch
2019-01-11 22:54:29 +01:00
wentalou 7b184b0061 drm/amdgpu: kfd_pre_reset outside req_full_gpu cause sriov hang
XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev,
but outside req_full_gpu of sriov.
It would make sriov hang during reset.

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-14 12:04:38 -05:00
Andrey Grodzovsky fc42d47ce0 drm/amdgpu: Enable GPU recovery by default for CI
I retested Bonaire (gfx7 dGPU) and it works fine.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-12 14:26:40 -05:00
Alex Deucher 223577753b drm/amdgpu/si: fix SI after doorbell rework
SI does not use doorbells, move asic doorbell init later
asic check.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=108920
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-05 17:49:50 -05:00
Andrey Grodzovsky d4535e2c01 drm/amdgpu: Implement concurrent asic reset for XGMI.
Use per hive wq to concurrently send reset commands to all nodes
in the hive.

v2:
Switch to system_highpri_wq after dropping dedicated queue.
Fix non XGMI code path KASAN error.
Stop  the hive reset for each node loop if there
is a reset failure on any of the nodes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-03 11:15:14 -05:00
Andrey Grodzovsky a82400b57a drm/amdgpu: Handle xgmi device removal.
XGMI hive has some resources allocted on device init which
needs to be deallocated when the device is unregistered.

v2: Remove creation of dedicated wq for XGMI hive reset.
v3: Use the gmc.xgmi.supported flag

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-03 11:15:08 -05:00
Oak Zeng 88dc26e46b drm/amdgpu: Fix num_doorbell calculation issue
When paging queue is enabled, it use the second page of doorbell.
The AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
kernel doorbells are in the first page. So with paging queue enabled,
the total kernel doorbell range should be original num_doorbell plus
one page (0x400 in dword), not *2.

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-30 12:01:04 -05:00
Andrey Grodzovsky 26bc534094 drm/amdgpu: Refactor GPU reset for XGMI hive case
For XGMI hive case do reset in steps where each step iterates over
all devs in hive. This especially important for asic reset
since all PSP FW in hive must come up within a limited time
(around 1 sec) to properply negotiate the link.
Do this by  refactoring  amdgpu_device_gpu_recover and amdgpu_device_reset
into pre_asic_reset, asic_reset and post_asic_reset functions where is part
is exectued for all the GPUs in the hive before going to the next step.

v2: Update names for amdgpu_device_lock/unlock functions.

v3: Introduce per hive locking to avoid multiple resets for GPUs
    in same hive.
v4:
Remove delayed_workqueue()/ttm_bo_unlock_delayed_workqueue() - they
are copy & pasted over from radeon and on amdgpu there isn't
any reason for that any more.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:55:36 -05:00
Andrey Grodzovsky 5183411b56 drm/amdgpu: Refactor amdgpu_xgmi_add_device
This is prep work for updating each PSP FW in hive after
GPU reset.
Split into build topology SW state and update each PSP FW in the hive.
Save topology and count of XGMI devices for reuse.

v2: Create seperate header for XGMI.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:55:35 -05:00
Oak Zeng 9564f1928e drm/amdgpu: Use asic specific doorbell index instead of macro definition
ASIC specific doorbell layout is used instead of enum definition

Signed-off-by: Oak Zeng <ozeng@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:55:33 -05:00
Oak Zeng 6585661ddd drm/amdgpu: Call doorbell index init on device initialization
Also call functioin amdgpu_device_doorbell_init after
amdgpu_device_ip_early_init because the former depends
on the later to set up asic-specific init_doorbell_index
function

Signed-off-by: Oak Zeng <ozeng@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:55:32 -05:00
Philip Yang ec3db8a63d drm/amdgpu: enable paging queue doorbell support v4
Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for paging queue will
break SRIOV, instead we can reserve and map two doorbell pages for amdgpu, paging
queues doorbell index use same index as SDMA gfx queues index but on second page.

For Vega20, after we change doorbell layout to increase SDMA doorbell for 8 SDMA RLC
queues later, we could use new doorbell index for paging queue.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-20 14:04:58 -05:00
Hawking Zhang 3e2e2ab554 drm/amdgpu/psp: initialize xgmi session (v2)
Setup and tear down xgmi as part of psp.

v2:
- make psp_xgmi_terminate static
- squash in:
drm/amdgpu: only issue xgmi cmd when it is enabled
drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-06 14:02:43 -05:00
Rex Zhu 1e256e2762 drm/amdgpu: Refine CSA related functions
There is no functional changes,
Use function arguments for SRIOV special variables which
is hardcode in those functions.

so we can share those functions in baremetal.

Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:48 -05:00
Andrey Grodzovsky 3ba7b418f1 drm/amdgpu: Enable default GPU reset for dGPU on gfx8/9 v3
After testing looks like these subset of ASICs has GPU reset
working for the most part. Enable reset due to job timeout.

v2: Switch from GFX version to ASIC type.
v3: Fix identation

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:23 -05:00
Christian König 9d064be1e6 drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"
This is still completely breaking my Raven system.

This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038.

Revert until we sort out the sbios and firmware combinations that work
correctly.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=108606
Cc: stable@vger.kernel.org # v4.19

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-01 09:56:56 -05:00
Andrey Grodzovsky 734afd4b21 drm/amdgpu: Fix skipping hangged job reset during gpu recover.
Problem:
During GPU recover DAL would hang in
amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty

Fix:
Turns out there was a typo introduced by
3320b8d drm/amdgpu: remove job->ring which caused skipping
amdgpu_fence_driver_force_completion and so the hangged job
was never force signaled and this would cause the hang later in DAL.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-01 09:51:33 -05:00
Emily Deng 91eec27ebb drm/amdgpu: Fix null pointer amdgpu_device_fw_loading
Need to check adev->powerplay.pp_funcs.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-22 14:40:54 -05:00
Rex Zhu 7a3e0bb2a5 drm/amdgpu: Load fw between hw_init/resume_phase1 and phase2
Extract the function of fw loading out of powerplay.
Do fw loading between hw_init/resuem_phase1 and phase2

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:49:21 -05:00
Rex Zhu 0a4f25205e drm/amdgpu: split ip hw_init into 2 phases
We need to do some IPs earlier to deal with ordering issues
similar to how resume is split into two phases.

Will do fw loading via smu/psp between the two phases.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:49:09 -05:00
Rex Zhu c8963ea4ce drm/amdgpu: Split amdgpu_ucode_init/fini_bo into two functions
1. one is for create/free bo when init/fini
2. one is for fill the bo before fw loading

the ucode bo only need to be created when load driver
and free when driver unload.

when resume/reset, driver only need to re-fill the bo
if the bo is allocated in vram.

Suggested by Christian.

v2: Return error when bo create failed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:48:52 -05:00
Rex Zhu a2d31dc3cf drm/amdgpu: Check late_init status before set cg/pg state
Fix cg/pg unexpected set in hw init failed case.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:48:44 -05:00
Rex Zhu 73f847dbab drm/amdgpu: Refine function amdgpu_device_ip_late_init
1. only call late_init when hw_init successful,
   so check status.hw instand of status.valid in late_init.
2. set status.late_initialized true if late_init was not implemented.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:48:28 -05:00
Rex Zhu 44779b43f1 drm/amdgpu: Move gfx flag in_suspend to adev
Move in_suspend flag to adev from gfx, so
can be used in other ip blocks, also keep
consistent with gpu_in_reset flag.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-09 17:05:33 -05:00
Evan Quan b55c9e7a11 drm/amd/powerplay: helper interfaces for MGPU fan boost feature
MGPU fan boost feature is enabled only when two or more dGPUs
in the system.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-09 16:59:56 -05:00
Christian König 403009bfba drm/amdgpu: fix shadow BO restoring
Don't grab the reservation lock any more and simplify the handling quite
a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-19 12:38:41 -05:00
Christian König c33adbc728 drm/amdgpu: always recover VRAM during GPU recovery
It shouldn't add much overhead and we should make sure that critical
VRAM content is always restored.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-19 12:38:35 -05:00
Alex Deucher 741deade2a drm/amdgpu: simplify Raven, Raven2, and Picasso handling
Treat them all as Raven rather than adding a new picasso
asic type.  This simplifies a lot of code and also handles the
case of rv2 chips with the 0x15d8 pci id.  It also fixes dmcu
fw handling for picasso.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:38:03 -05:00
Feifei Xu 54c4d17e98 drm/amdgpu: add raven2 to gpu_info firmware
Add gpu_info firmware for raven2.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:36:11 -05:00
Kenneth Feng a06c3ee083 drm/amdgpu: enable gfxoff in non-sriov and stutter mode by default
enable gfxoff in non-sriov and stutter mode by default

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:35:57 -05:00
Likun Gao b22ab73314 drm/amd/display/dm: add picasso support
Add support for picasso to the display manager.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:35:03 -05:00
Likun Gao ad5a67a7ea drm/amdgpu: add soc15 support for picasso
Add the IP blocks, clock and powergating flags, and common clockgating support.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:34:57 -05:00
Likun Gao be9699e392 drm/amdgpu: add picasso to asic_type enum
Add picasso to amd_asic_type enum and amdgpu_asic_name[].

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-14 09:34:48 -05:00
Shaoyun Liu fb30fc59a2 drm/amdgpu : Generate XGMI topology info from driver level
Driver will save an array of XGMI hive info, each hive will have a list of devices
that have the same hive ID.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:47:52 -05:00
Emily Deng 39186aefac drm/amdgpu: move PSP init prior to IH in gpu reset
since we use PSP to program IH regs now

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:45:42 -05:00
Christian König 961c75cf20 drm/amdgpu: move amdgpu_device_(vram|gtt)_location
Move that into amdgpu_gmc.c since we are really deadling with GMC
address space here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:35:18 -05:00
Yintian Tao e78196444b drm/amdgpu: move full access into amdgpu_device_ip_suspend
It will be more safe to make full-acess include both phase1 and phase2.
Then accessing special registeris wherever at phase1 or phase2 will not
block any shutdown and suspend process under virtualization.

Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 11:55:20 -05:00
Christian König 12938fad23 drm/amdgpu: cleanup GPU recovery check a bit (v2)
Check if we should call the function instead of providing the forced
flag.

v2: rebase on KFD changes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:16 -05:00
Yintian Tao 9c70d10ae7 drm/amdgpu: remove fulll access for suspend phase1
There is no need for gpu full access for suspend phase1
because under virtualization there is no hw register access
for dce block.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:09 -05:00
Rex Zhu fdd34271a3 drm/amdgpu: Set clock ungate state when suspend/fini
After set power ungate state, set clock ungate state
before when suspend or fini.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:01 -05:00
Rex Zhu 05df1f01b2 drm/amdgpu: Set power ungate state when suspend/fini
Unify to set power ungate state at the begin of suspend/fini.
Remove the workaround code for gfx off feature in
amdgpu_device.c.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:01 -05:00
Rex Zhu 1112a46b48 drm/amdgpu: Refine function name and function args
There are no any logical changes here.

1. change function names:
   amdgpu_device_ip_late_set_pg/cg_state to
   amdgpu_device_set_pg/cg_state.
2. add a function argument cg/pg_state, so
   we can enable/disable cg/pg through those functions

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:00 -05:00
Christian König 3798e9a6e6 drm/amdgpu: use new scheduler load balancing for VMs
Instead of the fixed round robin use let the scheduler balance the load
of page table updates.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:45 -05:00
Rex Zhu a54594752a drm/amdgpu: Cancel the delay work when suspend
Cancel the delay work to avoid the corner case that
ib test was not running when suspend

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:41 -05:00
Rex Zhu 6c1fd99bc6 drm/amdgpu: Cancel gfx off delay work when driver fini/suspend
there may be gfx off delay work pending when suspend/driver
unload, need to cancel them first.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:12 -05:00
Rex Zhu 408acede87 drm/amdgpu: Ctrl gfx off via amdgpu_gfx_off_ctrl
use amdgpu_gfx_off_ctrl function so driver can arbitrate
whether the gfx ip can be power off or power on.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:52 -05:00
Rex Zhu 1e317b99f0 drm/amdgpu: Put enable gfx off feature to a delay thread
delay to enable gfx off feature to avoid gfx on/off frequently
suggested by Alex and Evan.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:52 -05:00
Rex Zhu d23ee13fba drm/amdgpu: Add amdgpu_gfx_off_ctrl function
v2:
   1. drop the special handling for the hw IP
      suggested by hawking and Christian.
   2. refine the variable name suggested by Flora.

This funciton as the entry of gfx off feature.
we arbitrat gfx off feature enable/disable in this
function.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:51 -05:00
Leo (Sunpeng) Li dc37a9a08d Revert "drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86"
This reverts commit 8624c3c4dbfe24fc6740687236a2e196f5f4bfb0.

We need CONFIG_DRM_AMD_DC_DCN1_0 to guard code that is using fp math.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:32:28 -05:00
Dave Airlie 557ce95051 Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-next
More fixes for 4.19:
- Fixes for scheduler
- Fix for SR-IOV
- Fixes for display

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180809200052.2777-1-alexander.deucher@amd.com
2018-08-10 11:43:02 +10:00
Emily Deng b045d3af7d drm/amdgpu/sriov: give 8s for recover vram under RUNTIME
Extend the timeout for recovering vram bos from shadows on sr-iov
to cover the worst case scenario for timeslices and VFs

Under runtime, the wait fence time could be quite long when
other VFs are in exclusive mode. For example, for 4 VF, every
VF's exclusive timeout time is set to 3s, then the worst case is
9s. If the VF number is more than 4,then the worst case time will
be longer.
The 8s is the test data, with setting to 8s, it will pass the TDR
test for 1000 times.

SWDEV-161490

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-09 11:59:17 -05:00
Dave Airlie 3fce461827 BackMerge v4.18-rc7 into drm-next
rmk requested this for armada and I think we've had a few
conflicts build up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-07-30 10:39:22 +10:00
Shirish S 5f8181733f drm/amdgpu: move the amdgpu_fbdev_set_suspend() further up
This patch moves amdgpu_fbdev_set_suspend() to the beginning
of suspend sequence.

This is to ensure fbcon does not to write to the VRAM
after GPU is powerd down.

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-24 15:14:54 -05:00
Alex Deucher fe1053b785 drm/amdgpu: rework suspend and resume to deal with atomic changes
Use the newly split ip suspend functions to do suspend displays
first (to deal with atomic so that FBs can be unpinned before
attempting to evict vram), then evict vram, then suspend the
other IPs.  Also move the non-DC pinning code to only be
called in the non-DC cases since atomic should take care of
DC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Fixes: e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-20 14:24:55 -05:00
Alex Deucher e7854a0380 drm/amdgpu: split ip suspend into 2 phases
We need to do some IPs earlier to deal with ordering issues
similar to how resume is split into two phases. Do DCE first
to deal with atomic, then do the rest.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-20 14:24:50 -05:00
Shirish S ecb8c50382 drm/amdgpu: use drm_fb helper for console_(un)lock
This patch removes the usage of console_(un)lock
by replacing drm_fb_helper_set_suspend() to
drm_fb_helper_set_suspend_unlocked() which locks and
unlocks the console instead of locking ourselves.

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-20 14:24:18 -05:00
Shirish S 4d3b9ae50e drm/amdgpu: lock and unlock console only for amdgpu_fbdev_set_suspend [V5]
[Why]
While the console_lock is held, console output will be buffered, till
its unlocked it wont be emitted, hence its ideal to unlock sooner to enable
debugging/detecting/fixing of any issue in the remaining sequence of events
in resume path.
The concern here is about consoles other than fbcon on the device,
e.g. a serial console

[How]
This patch restructures the console_lock, console_unlock around
amdgpu_fbdev_set_suspend() and moves this new block appropriately.

V2: Kept amdgpu_fbdev_set_suspend after pci_set_power_state
V3: Updated the commit message to clarify the real concern that this patch
    addresses.
V4: code clean-up.
V5: fixed return value

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-20 14:23:33 -05:00
Colin Ian King 3f48c6813f drm/amdgpu: fix spelling mistake "successed" -> "succeeded"
Trivial fix to spelling mistake in dev_err error message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-20 14:23:14 -05:00
Michel Dänzer 4841203102 drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86
Allowing CONFIG_DRM_AMD_DC_DCN1_0 to be disabled on X86 was an
opportunity for display with Raven Ridge accidentally not working.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-19 13:56:43 -05:00
Leo Liu 96a5d8d491 drm/amdgpu: Make sure IB tests flushed after IP resume
Fixes: 2c773de2 (drm/amdgpu: defer test IBs on the rings at boot (V3))

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-17 15:12:50 -05:00
Christian König 3320b8d2ac drm/amdgpu: remove job->ring
We can easily get that from the scheduler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-16 16:11:53 -05:00
Shaoyun Liu 67ccea6059 drm/amdgpu: Check NULL pointer for job before reset job's ring
job could be NULL when amdgpu_device_gpu_recover is called

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:00 -04:00
Shaoyun Liu 5c6dd71e59 drm/amdgpu: Call KFD reset handlers during GPU reset
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:55 -04:00
Junwei Zhang 7b7c6c81b3 drm/amdgpu: separate gpu address from bo pin
It could be got by amdgpu_bo_gpu_offset() if need

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-10 14:15:39 -05:00
Darren Powell 87e3f1366e drm/amd: Remove errors from sphinx documentation
Eliminating the warnings produced by sphinx when processing the sphinx comments in
 amdgpu_device.c & amdgpu_mn.c

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:40:03 -05:00
Alex Deucher 5d9a633040 drm/amdgpu: use pcie functions for link width and speed
Use the newly exported pci functions to get the link width
and speed rather than using the drm duplicated versions.

Also query the GPU link caps directly rather than hardcoding
them.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:39:59 -05:00
Rex Zhu 916ac57ffb drm/amdgpu: Move CG/PG setting out of delay worker thread
Partially revert commit 2dc80b0065
("drm/amdgpu: optimize amdgpu driver load & resume time")'

1. CG/PG enablement are part of gpu hw ip initialize, we should
wait for them complete. otherwise, there are some potential conflicts,
for example, Suspend and CG enablement concurrently.
2. better run ib test after hw initialize completely. That is to say,
   ib test should be after CG/PG enablement. otherwise, the test will
   not cover the cg/pg/poweroff enable case.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:38:50 -05:00
Rex Zhu c9f96fd506 drm/amdgpu: Split set_pg_state into separate function
1. add amdgpu_device_ip_late_set_pg_state function for
   set pg state.
2. delete duplicate pg state setting on gfx_v8_0's late_init.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:38:50 -05:00
Rex Zhu 9134c6d7f2 drm/amdgpu: Add gfx_off support in smu through pp_set_powergating_by_smu
we can take gfx off feature as gfx power gate. gfx off feature is also
controled by smu. so add gfx_off support in pp_set_powergating_by_smu.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:38:49 -05:00
Dave Airlie f29135ee4e Merge v4.18-rc3 into drm-next
Two requests have come in for a backmerge,
and I've got some pull reqs on rc2, so this
just makes sense.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-07-04 10:27:12 +10:00
Stefan Agner a21daa88d4 drm/amdgpu: Use correct enum to set powergating state
Use enum amd_powergating_state instead of enum amd_clockgating_state.
The underlying value stays the same, so there is no functional change
in practise. This fixes a warning seen with clang:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1930:14: warning: implicit
      conversion from enumeration type 'enum amd_clockgating_state' to
      different enumeration type 'enum amd_powergating_state'
      [-Wenum-conversion]
                                                       AMD_CG_STATE_UNGATE);
                                                       ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-19 13:17:39 -05:00
Junwei Zhang 761f58e0e9 drm/amdgpu: correct GART location info
Avoid confusing the GART with the GTT domain.

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-19 13:17:39 -05:00
Harry Wentland d9fda24804 drm/amdgpu: Don't default to DC support for Kaveri and older
We've had a number of users report failures to detect and light up
display with DC with LVDS and VGA. These connector types are not
currently supported with DC. I'd like to add support but unfortunately
don't have a system with LVDS or VGA available.

In order not to cause regressions we should probably fallback to the
non-DC driver for ASICs that support VGA and LVDS.

These ASICs are:
 * Bonaire
 * Kabini
 * Kaveri
 * Mullins

ASIC support can always be force enabled with amdgpu.dc=1

v2: Keep Hawaii on DC
v3: Added Mullins to the list

Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-19 12:43:53 -05:00
Rex Zhu b1ddf54847 drm/amdgpu: Get real power source to initizlize ac_power
driver need to know the real power source to do some power
related configuration when initialize.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-15 12:20:46 -05:00
Huang Rui 06b18f61ee drm/amdgpu: fix CG enabling hang with gfxoff enabled
After defer the execution of clockgating enabling, at that time, gfx already
enter into "off" state. Howerver, clockgating enabling will use MMIO to access
the gfx registers, then get the gfx hung.

So here we should move the gfx powergating and gfxoff enabling behavior at the
end of initialization behind clockgating.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-13 13:45:21 -05:00
Rex Zhu 34319b329f drm/amdgpu: skip CG for VCN when late_init/fini
VCN clockgating is handled manually like VCE and UVD.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-24 00:15:44 -05:00
Andrey Grodzovsky bf83060408 Remove calls to suspend/resume atomic helpers from amdgpu_device_gpu_recover. (v2)
First of all it's already being called from the display code from amd_ip_funcs.suspend/resume hooks.
Second of all, the place in amdgpu_device_gpu_recover it's being called is wrong for GPU stalls since
it is called BEFORE we cancel and force completion of all in flight jobs which were not yet processed.
So, as Bas pointed in the ticket we will try to wait for fence  in amdgpu_pm_compute_clocks but the pipe
is hanged so we end up in deadlock.

v2: remove unused variable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106500
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-23 23:51:20 -05:00
Feifei Xu c6034aa2c4 drm/amdgpu: Add vega20 to dc support check (v2)
v2: fix whitespace

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-17 10:13:19 -05:00
Feifei Xu e4bd817040 drm/amdgpu: set asic family for vega20.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-17 10:13:10 -05:00
Feifei Xu 27c0bc7163 drm/amdgpu: Add gpu_info firmware for vega20. (v2)
vega20_gpu_info firmware stores gpu configuration for vega20.

v2: drop gpu info firmware for vega20

Squash of:
drm/amdgpu: Add gpu_info firmware for vega20.
drm/amdgpu: drop gpu_info firmware for vega20

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-17 10:13:10 -05:00
Feifei Xu 956fcddc0b drm/amdgpu: Add vega20 to asic_type enum.
Add vega20 to amd_asic_type enum and amdgpu_asic_name[].

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-17 10:13:09 -05:00
Emily Deng abc342538c drm/amdgpu: For sriov reset, move IB test into exclusive mode
When put the IB test out of exclusive mode, and do sriov reset,
the IB test will randomly fail. As out of exclusive mode it uses
kiq to do read and write registers, but as it has world switch,
the kiq read and write time will be random, sometimes it will
beyond the MAX_KIQ_REG_WAIT and then the read or write register
will fail, which will result the IB test fail.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:06 -05:00
Leo Liu 675fd32b27 drm/amdgpu: add VEGAM dc support check
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:55 -05:00
Leo Liu 32cc7e536a drm/amdgpu: set VEGAM to ASIC family and ip blocks
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:53 -05:00
Leo Liu cc07f18ddb drm/amdgpu: bypass GPU info firmware load for VEGAM
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:53 -05:00
Leo Liu 48ff108d9d drm/amdgpu: add VEGAM ASIC type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:52 -05:00
Huang Rui b083369621 drm/amdgpu: it should disable gfxoff when system is going to suspend
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:41 -05:00
Huang Rui 00f54b97d7 drm/amdgpu: use pp_feature member to store the mask
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:40 -05:00
Rex Zhu 7951e37670 drm/amdgpu: Reserved vram for smu to save debug info.
v2: check reserved vram size before allocate.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:28 -05:00
Shirish S 2c773de2ec drm/amdgpu: defer test IBs on the rings at boot (V3)
amdgpu_ib_ring_tests() runs test IB's on rings at boot
contributes to ~500 ms of amdgpu driver's boot time.

This patch defers it and ensures that its executed
in amdgpu_info_ioctl() if it wasn't scheduled.

V2: Use queue_delayed_work() & flush_delayed_work().
V3: removed usage of separate wq, ensure ib tests is
    run before enabling clockgating.

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:21 -05:00
Harry Wentland 2fa417324a drm/amd/display: Remove PRE_VEGA flag
We enabled this upstream by default now and no longer need the flag.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:06 -05:00
Alex Deucher 8bc04c2965 drm/amdgpu: use new asic need_full_reset callback
Use the new callback to determine whether to use full
asic reset or per IP soft reset.  Enables reset to
actually proceed on asics which don't support soft
reset yet.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:07:59 -05:00
Daniel Stone e68d14dd4e drm/amdgpu: Move GEM BO to drm_framebuffer
Since drm_framebuffer can now store GEM objects directly, place them
there rather than in our own subclass. As this makes the framebuffer
create_handle and destroy functions the same as the GEM framebuffer
helper, we can reuse those.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:07:56 -05:00
Rex Zhu 43fa561fd0 drm/amdgpu: remove duplicate cg/pg wrapper functions
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König<christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:07:53 -05:00
Andrey Grodzovsky b6356df3eb drm/amdgpu: Fix NULL ptr on driver unload due to init failure.
Problem:
When unloading due to failure amdgpu_device_fini was called twice
which was leading to NULL ptr in amdgpu_irq_disable_all.

Fix:
Call amdgpu_device_fini only once from amdgpu_driver_unload_kms.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-21 15:04:42 -05:00
Alex Deucher dca7b4015c drm/amdgpu: add vega12 to dc support check
DC is used for modesetting on vega12.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
2018-03-21 14:24:51 -05:00
Alex Deucher e48a3cd9cb drm/amdgpu: set asic family and ip blocks for vega12
soc15 just like vega10 and raven.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
2018-03-21 14:23:55 -05:00
Alex Deucher 3f76dcedb3 drm/amdgpu: add gpu_info firmware for vega12
Stores gpu configuration details.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
2018-03-21 14:23:49 -05:00
Feifei Xu 8fab806ad1 drm/amdgpu: add vega12 to asic_type enum
Add vega12 to amd_asic_type enum and amdgpu_asic_name[].

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
2018-03-21 14:23:39 -05:00
Rex Zhu 81ce8bea03 drm/amdgpu: Fix kernel NULL pointer dereference when amdgpu fini
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-20 23:44:21 -05:00
Mikita Lipski e5b03032e3 drm/amdgpu - Disable all irqs before disabling all CRTCs
By moving amdgpu_irq_disable_all earlier in the sequence
fixes an issue with disabling pflip interrupts:

*ERROR* dal_irq_service_dummy_ack: called for non-implemented irq source

Earlier patch fixed a memory corruption and revealed irq
warnings.This way it seems to be there no obvious issues
with unloading the module.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-20 23:44:12 -05:00
Mikita Lipski 088e7c1617 drm/amdgpu: Disable irq on device before destroying it
Disable irq on devices before destroying them. That prevents
use-after-free memory access when unloading the driver.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-20 23:44:03 -05:00
Mikita Lipski ff97cba8c1 drm/amdgpu: Use atomic function to disable crtcs with dc enabled
This change fixes the deadlock when unloading the driver with displays
connected.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-20 23:43:42 -05:00
Alex Deucher e3ecdffac9 drm/amdgpu: add documentation for amdgpu_device.c
Add kernel doc for the functions in amdgpu_device.c

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-20 23:43:05 -05:00
Rex Zhu 2dac5936e5 drm/amdgpu: Call amdgpu_ucode_fini_bo in amd_powerplay.c
make it symmetric with amdgpu_ucode_init_bo in amd_powerplay.c

refine the "commit b22558bb4ff8fc9fe925222f90297d7a03a5fb20"

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-15 09:57:58 -05:00
Rex Zhu 5b2a3d2c15 drm/amdgpu: Don't compared ip_block_type with ip_block_index
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-15 09:57:41 -05:00
Rex Zhu 5771632723 drm/amdgpu: Plus NULL function pointer check
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-15 09:57:32 -05:00
Alex Deucher 5494d8640f drm/amdgpu: move getting pcie info to common code
No need to replicate it in several places.

Reviewed-by: Rex Zhu <rezhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 16:01:16 -05:00
Alex Deucher 19aede7791 drm/amdgpu: move firmware loading type setup to common code
No need to replicate it in several places.

Reviewed-by: Rex Zhu <rezhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 16:01:15 -05:00
Monk Liu 421a2a30c1 drm/amdgpu: implement mmio byte access helper for MB
mailbox registers can be accessed with a byte boundry according
to BIF team, so this patch prepares register byte access
and will be used by following patches.

Actually, for mailbox registers once the byte field is touched even not changed,
the mailbox behaves, so we need the byte width accessing to those sort of regs.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Pixel Ding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:26 -05:00
Emily Deng edc3d27cbb drm/amdgpu: Correct the amdgpu_ucode_fini_bo place for Tonga
The amdgpu_ucode_fini_bo should be called after gfx_v8_0_hw_fini,
or it will have KCQ disable failed issue.

For Tonga, as it firstly finishes SMC block, and the SMC hw fini
will call amdgpu_ucode_fini, which will lead the amdgpu_ucode_fini_bo
called before gfx_v8_0_hw_fini, this is incorrect.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:25 -05:00
Emily Deng 58e955d9de drm/amdgpu: Correct the place of amdgpu_pm_sysfs_fini
The amdgpu_pm_sysfs_fini should call before amdgpu_device_ip_fini,
or the adev->pm.dpm_enabled would be set to 0, then the device files
related to pp won't be removed by amdgpu_pm_sysfs_fini when unload
driver.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:25 -05:00
Monk Liu c41d1cf62d drm/amdgpu: cleanups for vram lost handling
1)create a routine "handle_vram_lost" to do the vram
recovery, and put it into amdgpu_device_reset/reset_sriov,
this way no need of the extra paramter to hold the
VRAM LOST information and the related macros can be removed.

3)show vram_recover failure if time out, and set TMO equal to
lockup_timeout if vram_recover is under SRIOV runtime mode.

4)report error if any ip reset failed for SR-IOV

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-01 11:52:41 -05:00
Monk Liu 711826656b drm/amdgpu: stop all rings before doing gpu recover
found recover_vram_from_shadow sometimes get executed
in paralle with SDMA scheduler, should stop all
schedulers before doing gpu reset/recover

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-01 11:52:23 -05:00
Monk Liu c12aba3acd drm/amdgpu: move WB_FREE to correct place
WB_FREE should be put after all engines's hw_fini
done, otherwise the invalid wptr/rptr_addr would still
be used by engines which trigger abnormal bugs.

This fixes couple DMAR reading error in host side for SRIOV
after guest kmd is unloaded.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-28 14:18:04 -05:00
Monk Liu 7346958551 drm/amdgpu: fix&cleanups for wb_clear
fix:
should do right shift on wb before clearing

cleanups:
1,should memset all wb buffer
2,set max wb number to 128 (total 4KB) is big enough

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-28 14:18:03 -05:00
Mikita Lipski 15b9bc9aa8 drm/amdgpu: Unify the dm resume calls into one
amdgpu_dm_display_resume is now called from dm_resume to
unify DAL resume call into a single function call

There is no more need to separately call 2 resume functions
for DM.

Initially they were separated to resume display state after
cursor is pinned. But because there is no longer any corruption
with the cursor - the calls can be merged into one function hook.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:56 -05:00
Shaoyun Liu 9475a9434f drm/amdgpu: Add place holder for soc15 asic init on emulation
Add common smu_soc_asic_init function to emulate the sillicon post sequence

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:49 -05:00
Shaoyun Liu 593aa2d282 drm/amdgpu: Double the timeout count on emulation mode
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:49 -05:00
Shaoyun Liu 4a2ba39477 drm/amdgpu: Fix none-powerplay issue when load driver on emulation mode
On emulation mode , driver will be loaded with powerplay disabled

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:47 -05:00
Shaoyun Liu bfca028927 drm/amdgpu: Basic emulation support
Add amdgpu_emu_mode module parameter to control the emulation mode
Avoid vbios operation on emulation since there is no vbios post duirng emulation,
use the common hw_init to simulate the post

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Acked-By: Alex Deucher <alexander.deucher@amd.com>
Acked-By: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:47 -05:00
Shaoyun Liu e966a725c0 drm/amdgpu: Enable ip block bit mask print out info by default
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:43 -05:00
Alex Deucher 367e66870e drm/amdgpu: remove DC special casing for KB/ML
It seems to be working now.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102372
Reviewed-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:02 -05:00
Alex Deucher 9950cda2a0 drm/amdgpu: drop the drm irq pre/post/un install callbacks
The preinstall callback didn't do anything because not all
of the IPs were initialized when it was called.

Move the postinstall setup into sequence in the driver.

The uninstall callback disabled all interrupt source, but
it got called too late in the driver sequence and caused problems
with IPs who already freed the relevant data structures.  Move
the call into the right place in the driver sequence.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Tested-By: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:16 -05:00
Christian König 132f34e4b5 drm/amdgpu: move struct gart_funcs into amdgpu_gmc.h
And rename it to struct gmc_funcs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:17:44 -05:00
Christian König 770d13b19f drm/amdgpu: move struct amdgpu_mc into amdgpu_gmc.h
And rename it to amdgpu_gmc as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:17:43 -05:00
Alex Deucher 458d876eb8 drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)
We only support vga_switcheroo and runtime pm on PX/HG systems
so forcing runpm to 1 doesn't do anything useful anyway.

Only call vga_switcheroo_init_domain_pm_ops() for PX/HG so
that the cleanup path is correct as well.  This mirrors what
radeon does as well.

v2: rework the patch originally sent by Lukas (Alex)

Acked-by: Lukas Wunner <lukas@wunner.de>
Reported-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de> (v1)
Cc: stable@vger.kernel.org
2018-01-23 10:24:41 -05:00
Andrey Grodzovsky 54bc1398cc drm/amdgpu: Reenable manual GPU reset from sysfs
Otherwise it keeps rejecting the reset.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-23 01:42:48 -05:00
Christian König 0ebb7c5405 drm/amdgpu: fix 64bit BAR detection
Windows added by the BIOS are not marked as 64bit because they are
usually not changeable anyway.

This fixes large BAR support on my new Ryzen build system.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-10 15:44:54 -05:00
Alex Deucher 041d9d93b5 drm/amdgpu: rename amdgpu_get_pcie_info
add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 11:00:08 -05:00
Alex Deucher 6b8f4ee56f drm/amdgpu: move amdgpu_need_backup to amdgpu_object.c
It's the only place it's used.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 11:00:03 -05:00
Alex Deucher 5f152b5e69 drm/amdgpu: rename amdgpu_gpu_recover
add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:58 -05:00
Alex Deucher 55e0037aab drm/amdgpu: move dummy page functions to amdgpu_gart.c
It's the only place they are used.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:52 -05:00
Alex Deucher 39c640c086 drm/amdgpu: rename amdgpu_need_post
add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:46 -05:00
Alex Deucher 2990a1fc01 drm/amdgpu: rename ip block helper functions
add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:40 -05:00
Alex Deucher f5ec697e37 drm/amdgpu: move fw_reserve functions to amdgpu_ttm.c
It's the only place they are used.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:35 -05:00
Alex Deucher 2543e28a81 drm/amdgpu: rename amdgpu_*_location functions
add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:28 -05:00
Alex Deucher 22cb016437 drm/amdgpu: move amdgpu_doorbell_get_kfd_info to amdgpu_amdkfd.c
It's the only place it's used.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:23 -05:00
Alex Deucher 8111c38727 drm/amdgpu: rename amdgpu_pci_config_reset
add device for consistency with other functions in this file.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:18 -05:00
Alex Deucher 9c3f2b5474 drm/amdgpu: rename amdgpu_program_register_sequence
add device for consistency with other functions in this file.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:13 -05:00
Alex Deucher 131b4b3686 drm/amdgpu: rename amdgpu_wb_* functions
add device for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:07 -05:00
Alex Deucher 75758255dc drm/amdgpu: move debugfs functions to their own file
amdgpu_device.c was getting pretty cluttered.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:59:01 -05:00
Alex Deucher cdd61df614 drm/amdgpu: rename amdgpu_suspend to amdgpu_device_ip_suspend
for consistency with the other functions in that file.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:58:54 -05:00
Alex Deucher 06ec907054 drm/amdgpu: use consistent naming for static funcs in amdgpu_device.c
Prefix the functions with device or device_ip for functions which
deal with ip blocks for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:58:47 -05:00
Alex Deucher 4e89df63c1 drm/amdgpu: move atom functions from amdgpu_device.c
and move them to amdgpu_atombios.c for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18 10:58:35 -05:00
Andrey Grodzovsky 8854695add drm/amdgpu: Simplify amdgpu_lockup_timeout usage.
With introduction of amdgpu_gpu_recovery we don't need any more
to rely on amdgpu_lockup_timeout == 0 for disabling GPU reset.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-15 17:15:00 -05:00
Andrey Grodzovsky dcebf026e6 drm/amdgpu: Add gpu_recovery parameter
Add new parameter to control GPU recovery procedure.

v2:
Add auto logic where reset is disabled for bare metal and enabled
for SR-IOV.
Allow forced reset from debugfs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-15 17:14:50 -05:00
Alex Deucher 88bc1e3c38 drm/amdgpu: drop scratch regs save and restore from GPU reset handling
The expectation is that the base driver doesn't mess with these.
Some components interact with these directly so let the components
handle these directly.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-13 17:28:09 -05:00
Alex Deucher 4ec6ecf48c drm/amdgpu: drop scratch regs save and restore from S3/S4 handling
The expectation is that the base driver doesn't mess with these.
Some components interact with these directly so let the components
handle these directly.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-13 17:28:09 -05:00
Monk Liu b9141cd393 drm/amdgpu: no need to evict VRAM in device_fini
this VRAM evict is not needed and also cost 2seconds
to finish because the IRQ is software side disabled
before it.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12 14:49:56 -05:00
Christian König 79588d21ad drm/amdgpu: add amdgpu_evict_vram debugfs file
Torture test for MM and VM support, can be used to evict all VRAM while
the system is under load.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12 14:45:28 -05:00
Christian König 763efb6c6f drm/amdgpu: cleanup debugfs handling a bit
Remove the superflous .debugfs_init callback and register all files in
amdgpu_device.c in just one function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12 14:45:07 -05:00
Lucas Stach 1b1f42d8fd drm: move amd_gpu_scheduler into common location
This moves and renames the AMDGPU scheduler to a common location in DRM
in order to facilitate re-use by other drivers. This is mostly a straight
forward rename with no code changes.

One notable exception is the function to_drm_sched_fence(), which is no
longer a inline header function to avoid the need to export the
drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07 11:51:56 -05:00
Christian König 97489129c2 drm/amdgpu: allow specifying vm_block_size for multi level PDs v2
This patch allows specifying the vm_block_size even when multi level
page directories are active.

v2: fix signed/unsigned compare warning

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:31 -05:00
Christian König f3368128ba drm/amdgpu: move validation of the VM size into the VM code
This moves validation of the VM size parameter into amdgpu_vm_adjust_size().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:30 -05:00
Christian König 341b759e64 drm/amdgpu: allow non pot VM size values
The VM size actually doesn't need to be a power of two.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:30 -05:00
Christian König c13c55d611 drm/ttm: use an operation context for ttm_bo_mem_space v2
Instead of specifying interruptible and no_wait_gpu manually.

v2: rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:02 -05:00
Christian König b98f1b9e5e drm/amdgpu: align GTT start to 4GB v2
For VCE to work properly the start of the GTT space must be aligned to a
4GB boundary.

v2: add comment why we do this

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:58 -05:00
Christian König 3d647c8f93 drm/amdgpu: remove VRAM size reduction v2
Remove some outdated comments and all code which tries to reduce the VRAM size
mapped into the MC.

This is superfluous and misleading since we never actually program the size.

v2: handle gmc_v6_0.c as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:58 -05:00
Christian König 31b8adab32 drm/amdgpu: require a root bus window above 4GB for BAR resize
Don't even try to resize the BAR when there is no window above 4GB.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:53 -05:00
Monk Liu 2413613506 drm/amdgpu:show error message if fail on event4
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:52 -05:00
Monk Liu 84e5b5161e drm/amdgpu:free CSA in unified place
instead of doing it in each GFX ip's sw_fini

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:51 -05:00
Monk Liu 9921167d90 drm/amdgpu:cleanup unused stack var
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:50 -05:00
Monk Liu f59548c882 drm/amdgpu:fix NULL pointer access during drv remove
NULL pointer is because original logic will step into
set_pde_pte() even after the gart.ptr is freed due to
there are twice gart_unbind() on all gart area.

also, there are other minor fixes:
1,since gart_init only create dummy page, the corresponding
gart_fini shouldn't do more like unbinding all GART, this is
unnecessary because in driver fini stage all GART unbinding
had already been done during each IP's SW_FINI (GMC's
SW_FINI is the last one called), so remove the step
for the GART unbinding in gart_fini().

2,gart_fini() is already invoked during each GMC IP's gart_fini
routine,e.g. gmc_vx_0_gart_fini(), so no need to manually
call it during ttm_fini().

3,amdgpu_gem_force_release() should be put ahead of
amdgpu_vm_manager_fini()

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:50 -05:00
Pixel Ding 1daee8b472 drm/amdgpu: revise retry init to fully cleanup driver
Retry at drm_dev_register instead of amdgpu_device_init.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:18 -05:00
Monk Liu 75bc6099bc drm/amdgpu:read VRAMLOST from gim
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:45 -05:00
pding 0c03b912d7 drm/amdgpu: bypass FB resizing for SRIOV VF
It introduces 900ms latency in exclusive mode which causes failure
of driver loading. Host can resize the BAR before guest staring,
so the resizing is not necessary here.

Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:45 -05:00
pding c6332b97fa drm/amdgpu: release exclusive mode after hw_init
Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:44 -05:00
pding 1884734a03 drm/amdkfd: initialise kfd inside amdgpu_device_init
Also finalize kfd inside amdgpu_device_fini. kfd device_init needs
SRIOV exclusive accessing. Try to gather exclusive accessing to
reduce time consuming.

Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:44 -05:00
Christian König d6895ad39f drm/amdgpu: resize VRAM BAR for CPU access v6
Try to resize BAR0 to let CPU access all of VRAM.

v2: rebased, style cleanups, disable mem decode before resize,
    handle gmc_v9 as well, round size up to power of two.
v3: handle gmc_v6 as well, release and reassign all BARs in the driver.
v4: rename new function to amdgpu_device_resize_fb_bar,
    reenable mem decoding only if all resources are assigned.
v5: reorder resource release, return -ENODEV instead of BUG_ON().
v6: squash in rebase fix

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:42 -05:00
Horace Chen 3c7388936a drm/amdgpu: refine SR-IOV firmware VRAM reservation to protect data
The previous solution will create a zero buffer on the system
domain and then move the zeroes to the VRAM. This will break the
original data on the VRAM.

Refine the code to create bo on VRAM domain directly and then remove
and re-create mem node to the exact position before bo_pin. This can
avoid breaking the data and will not cause eviction.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:42 -05:00
pding 5ffa61c1bd drm/amdgpu: retry init if exclusive mode request is failed
This is caused of that hypervisor fails to handle request, one known
issue is MMIO unblocking timeout. In theory we can retry init here.

Signed-off-by: pding <Pixel.Ding@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:42 -05:00
Christian König c1c7ce8f56 drm/amdgpu: move GART recovery into GTT manager v2
The GTT manager handles the GART address space anyway, so it is
completely pointless to keep the same information around twice.

v2: rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:33 -05:00
Monk Liu 13a752e3a2 drm/amdgpu:cleanup in_sriov_reset and lock_reset
since now gpu reset is unified with gpu_recover
for both bare-metal and SR-IOV:

1)rename in_sriov_reset to in_gpu_reset
2)move lock_reset from adev->virt to adev

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:31 -05:00
Monk Liu 5740682e66 drm/amdgpu:implement new GPU recover(v3)
1,new imple names amdgpu_gpu_recover which gives more hint
on what it does compared with gpu_reset

2,gpu_recover unify bare-metal and SR-IOV, only the asic reset
part is implemented differently

3,gpu_recover will increase hang job karma and mark its entity/context
as guilty if exceeds limit

V2:

4,in scheduler main routine the job from guilty context  will be immedialy
fake signaled after it poped from queue and its fence be set with
"-ECANCELED" error

5,in scheduler recovery routine all jobs from the guilty entity would be
dropped

6,in run_job() routine the real IB submission would be skipped if @skip parameter
equales true or there was VRAM lost occured.

V3:

7,replace deprecated gpu reset, use new gpu recover

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:30 -05:00
pding 8840a3878d drm/amdgpu: retry init if it fails due to exclusive mode timeout (v3)
The exclusive mode has real-time limitation in reality, such like being
done in 300ms. It's easy observed if running many VF/VMs in single host
with heavy CPU workload.

If we find the init fails due to exclusive mode timeout, try it again.

v2:
 - rewrite the condition for readable value.

v3:
 - fix typo, add comments for sleep

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:14 -05:00
pding 9953b72f9c drm/amdgpu: change redundant init logs to debug level
When this VF stays in exclusive mode for long, other VFs will be
impacted.

The redundant messages causes exclusive mode timeout when they're
redirected. That is a normal use case for cloud service to redirect
guest log to virtual serial port.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:12 -05:00
Monk Liu a8a51a7041 drm/amdgpu:cleanup job reset routine(v2)
merge the setting guilty on context into this function
to avoid implement extra routine.

v2:
go through entity list and compare the fence_ctx
before operate on the entity, otherwise the entity
may be just a wild pointer

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:10 -05:00
Monk Liu 2f9d4084ca drm/amdgpu:cleanup force_completion
cleanups, now only operate on the given ring

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:08 -05:00
Alex Deucher b693fc1f83 Revert "drm/amdgpu: fix rmmod KCQ disable failed error"
This reverts commit 446947b44f.

this patch is incorrrect, amdgpu_ucode_bo_fini always
called after gfx_hw_fini.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-28 17:44:13 -05:00
Linus Torvalds c353bfc6eb fixes/cleanups for rc1, non-desktop flags for VR
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaFkpiAAoJEAx081l5xIa+LOcQAJqXyh7vx++oPe5kJFC2rCoX
 MqX1aJ4nH8y04QJqLmKx1SC6eyYsTM92rcg3RfHOThktzonD5l2wSO9TvCkmLtr9
 2n9P/aYMcbPTZntrbJc4mQyzd82U0D4h40i5Cmhr9n4gcLPOsOpau/7eclyuEUds
 PHZSTCRq0Ygk1K5VWQPyKsY1k1TqFes2YE46FJzkD8SQwKDfbWxVZG0BPnvqb5Om
 PMVobnEukruzpsSqnetaEYsW89e0TJ2TW9MSCfVohzWvyCVGzmwSzqaooqOkgFe2
 5ZrzA4aW6qRez4nXN2Zw+p9qhS4DZ8MVEJO8qczrR6BGx5yRlHriGhs+5FQskGBT
 Idqj6YZX3x/qab/AXQy0fzn2lrZdwxTolG6BgnNOwdGhyFEfz7P7p9kcv4QLbyn5
 8MynMUcLmOkpouHD0mpIwn5kS7EU4hbEPGOeBwxy54FbiLFWb81FjlGts2N+/ckI
 69UlmyyFZrpxvTmL9vRzvGCeO0zdfvKtBa1GoYWbzNTs8r50F2EtdJkS64SYOVOf
 o4ApcG5bznx42NfBwa3TBc+NETTYJPS0blFImPVu1qvdQn5AciX137vYbqzwuqac
 2gM2m6Rdfpncw/3VRIePwXYwpNS/3fsa3V6UgzTFlDhrQCtP2XxKPhfru7pFN+te
 Vav1I46Q8pa7ko8dS3A3
 =P4O6
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15-part2' of git://people.freedesktop.org/~airlied/linux

Pull more drm updates from Dave Airlie:
 "Fixes/cleanups for rc1, non-desktop flags for VR

   - remove the MSM dt-bindings file Rob managed to push in the previous
     pull.

   - add a property/edid quirk to denote HMD devices, I had these
     hanging around for a few weeks and Keith had done some work on
     them, they are fairly self contained and small, and only affect
     people using HTC Vive VR headsets so far.

   - amdgpu, tegra, tilcdc, fsl fixes

   - some imx-drm cleanups I missed, these seemed pretty small, and no
     reason to hold off.

  I have one TTM regression fix (fixes bochs-vga in qemu) sitting
  locally awaiting review I'll probably send that in a separate pull
  request tomorrow"

* tag 'drm-for-v4.15-part2' of git://people.freedesktop.org/~airlied/linux: (33 commits)
  dt-bindings: remove file that was added accidentally
  drm/edid: quirk HTC vive headset as non-desktop. [v2]
  drm/fb: add support for not enabling fbcon on non-desktop displays [v2]
  drm: add connector info/property for non-desktop displays [v2]
  drm/amdgpu: fix rmmod KCQ disable failed error
  drm/amdgpu: fix kernel hang when starting VNC server
  drm/amdgpu: don't skip attributes when powerplay is enabled
  drm/amd/pp: fix typecast error in powerplay.
  drm/tilcdc: Remove obsolete "ti,tilcdc,slave" dts binding support
  drm/tegra: sor: Reimplement pad clock
  Revert "drm/radeon: dont switch vt on suspend"
  drm/amd/amdgpu: fix over-bound accessing in amdgpu_cs_wait_any_fence
  drm/amd/powerplay: fix unfreeze level smc message for smu7
  drm/amdgpu:fix memleak
  drm/amdgpu:fix memleak in takedown
  drm/amd/pp: fix dpm randomly failed on Vega10
  drm/amdgpu: set f_mapping on exported DMA-bufs
  drm/amdgpu: Properly allocate VM invalidate eng v2
  drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume()
  drm/fsl-dcu: avoid disabling pixel clock twice on suspend
  ...
2017-11-23 21:04:56 -10:00
Wang Hongcheng 446947b44f drm/amdgpu: fix rmmod KCQ disable failed error
If  gfx_v8_0_hw_fini is called after amdgpu_ucode_fini_bo, we will
hit KCQ disabled failed. Let amdgpu_ucode_fini_bo run after
gfx_v8_0_hw_fini.

BUG: SWDEV-135547
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Wang Hongcheng <Annie.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-21 10:45:05 -05:00
Linus Torvalds f6705bf959 amdgpu DC display code for Vega.
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaDlaqAAoJEAx081l5xIa+VB8P/3tl1kg6gONXBHA89t4aoyaM
 uKyLy2D8//9RCPupnI2nOablbcdXzmZYE5gsLGHcN5G/cf9qHksslqo6P/8cjfIC
 lOz+2AxzFGTP9s6M0jyE7l4Dlk53Chd+7yOTJfm322BUuAZW7nSjWGglkO6rW6RR
 JRyNwIoRLX62nAkD769R9QTh8sh2P7pWvXKUSRtMQVWRRI0fICvUFuqyBbEFjJZN
 4GGkqM5bA6GU+z1W91iqkXoPWz34Zejch7cLBM5pXiZsgXOuzl4V/RwxdKZlWVrf
 9oA9357yKvvvb1bkNRgjNqLLHdOxQUomv1k2RxCbvX2xUecOCTKXKb4/X+AurZEI
 ENfSejTbzj+mP18CI1IsvsQolkighP1xxqjH3zmSu+bS0ivWBywbpDUVN969qKrV
 9kHigMwxxX5YCWGoLswhZ+6OsPm5R2FRKg10QVQAlARjye4Q7ssP+l+KRRP8rvkc
 D4rZiLBMuIDersRhW3ylEym8gXqSO2BoBJZS3+ECSzweIhvwziNgY0q6lpFxfzJa
 fzjW/mfK/uucEshoZrxJVRAEiWwtULvi1KVnTpQ/lm254maj4mOy6atqs7rmdAKK
 Jetfg+Z0Fb+805fHeS2dk/E855qwmTCsBf+TA4hGrxoW3EHB3yNLH1j4MSUxK8es
 6SpuEv7hzeyCiK0QJcSH
 =0JS4
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linux

Pull amdgpu DC display code for Vega from Dave Airlie:
 "This is the pull request for the AMD DC (display code) layer which is
  a requirement to program the display engines on the new Vega and Raven
  based GPUs. It also contains support for all amdgpu supported GPUs
  (CIK, VI, Polaris), which has to be enabled. It is also a kms atomic
  modesetting compatible driver (unlike the current in-tree display
  code).

  I've kept it separate from drm-next because it may have some things
  that cause you to reject it.

  Background story:

  AMD have an internal team creating a shared OS codebase for display at
  hw bring up time using information from their hardware teams. This
  process doesn't lead to the most Linux friendly/looking code but we
  have worked together on cleaning a lot of it up and dealing with
  sparse/smatch/checkpatch, and having their team internally adhere to
  Linux coding standards.

  This tree is a complete history rebased since they started opening it,
  we decided not to squash it down as the history may have some value.
  Some of the commits therefore might not reach kernel standards, and we
  are steadily training people in AMD to better write commit msgs.

  There is a major bunch of generated bandwidth calculation and
  verification code that comes from their hardware team. On Vega and
  before this is float calculations, on Raven (DCN10) this is double
  based. They do the required things to do FP in the kernel, and I could
  understand this might raise some issues. Rewriting the bandwidth would
  be a major undertaken in reverification, it's non-trivial to work out
  if a display can handle the complete set of mode information thrown at
  it.

  Future story:

  There is a TODO list with this, and it address most of the remaining
  things that would be nice to refine/remove. The DCN10 code is still
  under development internally and they push out a lot of patches quite
  regularly and are supporting this code base with their display team. I
  think we've reached the point where keeping it out of tree is going to
  motivate distributions to start carrying the code, so I'd prefer we
  get it in tree. I think this code is slightly better than STAGING
  quality but not massively so, I'd really like to see that float/double
  magic gone and fixed point used, but AMD don't seem to think the
  accuracy and revalidation of the code is worth the effort"

* tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linux: (1110 commits)
  drm/amd/display: fix MST link training fail division by 0
  drm/amd/display: Fix formatting for null pointer dereference fix
  drm/amd/display: Remove dangling planes on dc commit state
  drm/amd/display: add flip_immediate to commit update for stream
  drm/amd/display: Miss register MST encoder cbs
  drm/amd/display: Fix warnings on S3 resume
  drm/amd/display: use num_timing_generator instead of pipe_count
  drm/amd/display: use configurable FBC option in dm
  drm/amd/display: fix AZ clock not enabled before program AZ endpoint
  amdgpu/dm: Don't use DRM_ERROR in amdgpu_dm_atomic_check
  amd/display: Fix potential null dereference in dce_calcs.c
  amdgpu/dm: Remove unused forward declaration
  drm/amdgpu: Remove unused dc_stream from amdgpu_crtc
  amdgpu/dc: Fix double unlock in amdgpu_dm_commit_planes
  amdgpu/dc: Fix missing null checks in amdgpu_dm.c
  amdgpu/dc: Fix potential null dereferences in amdgpu_dm.c
  amdgpu/dc: fix more indentation warnings
  amdgpu/dc: handle allocation failures in dc_commit_planes_to_stream.
  amdgpu/dc: fix indentation warning from smatch.
  amdgpu/dc: fix non-ansi function decls.
  ...
2017-11-17 14:34:42 -08:00
Tom St Denis 0b968650cd drm/amd/amdgpu: Fix wave mask in amdgpu_debugfs_wave_read() (v2)
The bottom two bits of the simd value were being put into
the upper bits of the wave value which was likely working due
to the bits being ignored (or aliased).

Eitherway, now we mask it correctly.

(v2) Touch up using GENMASK_ULL to a couple of other functions too

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-13 14:37:05 -05:00
Monk Liu 63ae07ca4f drm/amdgpu:fix wb_clear
Properly shift the index when clearing so we clear
the right bit

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-20 13:28:55 -04:00
Monk Liu 6867e1b5fb drm/amdgpu:fix vf_error_put
1,it should not work on non-SR-IOV case
2,the NO_VBIOS error is incorrect, should
handle it under detect_sriov_bios.
3,wrap the whole detect_sriov_bios with sriov check

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-20 13:28:44 -04:00