Commit Graph

581 Commits

Author SHA1 Message Date
Chunming Zhou 7e5a547f64 drm/amdgpu: implement the allocation range (v3)
Pass a ttm_placement pointer to amdgpu_bo_create_restricted
add min_offset to amdgpu_bo_pin_restricted.  This makes it
easier to allocate memory with address restrictions.  With
this patch we can also enable 2-ended allocation again.

v2: fix rebase conflicts
v3: memset placements before using

Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:52 -04:00
yanyang1 5fc3aeeb9e drm/amdgpu: rename amdgpu_ip_funcs to amd_ip_funcs (v2)
The structure is renamed and moved to amd_shared.h to make
the component independent.  This makes it easier to add
new components in the future.

v2: fix include path

Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: yanyang1 <young.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:51 -04:00
Christian König 9269a60686 drm/amdgpu: drop AMDGPU_FENCE_SIGNALED_SEQ
It's causing issues with VMID handling and comparing the
fence value two times actually doesn't make handling faster.

Port of radeon commit "d6d5c5b8364bcc4d52cddc68bcb0a330d2af20f3".

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:50 -04:00
Christian König 5fb1941d0c drm/amdgpu: port fault_reserve_notify changes from radeon
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:49 -04:00
Sonny Jiang 564ea7900c drm/amdgpu: enable uvd dpm and powergating
Enable UVD dpm (dynamic power management) and powergating.  UVD dpm dynamically scales the UVD
clocks on demand.  Powergating turns off the power to the block when it's not in use.

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:48 -04:00
Leo Liu 5bbc553a1a drm/amdgpu: implement VCE two instances support
VCE 3.0 has two indentical instances in the engine, they share
the same registers name in differrent memory block distinguished
by the grbm_gfx_index, we set to master instance after init, it
will dispatch task to slave instance. These two instances will
share the same firmware, but have their own stacks and heaps.

v2: add mutex for using grbm_gfx_index

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:48 -04:00
Leo Liu e982262214 drm/amdgpu: recalculate VCE firmware BO size
Firmware required BO size changes in terms of ASIC family

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:47 -04:00
Alex Deucher 15a16ff606 drm/amdgpu: remove unused TRACE_SYSTEM_STRING define
Port of 77cb2fea1e to
amdgpu.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:46 -04:00
Marek Olšák fbd76d59ef drm/amdgpu: rework tiling flags
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:46 -04:00
Marek Olšák 63ab1c2bee drm/amdgpu: don't set unused tiling flags
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:45 -04:00
Christian König 9f7eb5367d drm/amdgpu: actually use the VM map parameters
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:44 -04:00
Christian König 0be52de91c drm/amdgpu: validate amdgpu_vm_bo_map parameters
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:43 -04:00
Christian König 271c812566 drm/amdgpu: enforce AMDGPU_GEM_CREATE_NO_CPU_ACCESS
Deny user and kernel mapping if we said we never want to do so.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:43 -04:00
Christian König 25a595e482 drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:42 -04:00
Alex Deucher 67ed009232 drm/amdgpu: retry dcpd fetch
Retry the dpcd fetch several times.  Some eDP panels
fail several times before the fetch is successful.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73530

Ported from radeon.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:41 -04:00
Alex Deucher 7af93b5069 drm/amdgpu: simplify DPCD debug output
Use %*ph rather than walking the array.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:41 -04:00
Alex Deucher dc5f428d61 drm/amdgpu: make some DP parameters const
Ported from similar radeon patch.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:40 -04:00
Alex Deucher 9e14c65c57 drm/amdgpu: take the mode_config mutex when handling hpds
Since we may modify display state.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:39 -04:00
Marek Olšák d94aed5a6c drm/amdgpu: add and implement the GPU reset status query
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-06-03 21:03:39 -04:00
Alex Deucher 1f8d962513 drm/amdgpu: add some new tonga pci ids
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:38 -04:00
Alex Deucher fb4f173734 drm/amdgpu: add new bonaire pci id
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:37 -04:00
Jammy Zhou 86c2b79062 drm/amdgpu: rewording some left radeons
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:36 -04:00
Jammy Zhou c65444fe05 drm/amdgpu: switch to amdgpu folder for firmware files v2
v2: keep using radeon folder for CIK

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:36 -04:00
Jammy Zhou 4b095304ea drm/amdgpu: do necessary NULL check
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:35 -04:00
Jammy Zhou 02b70c8c9f drm/amdgpu: expose the max virtual address
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:34 -04:00
Christian König 3cb485f340 drm/amdgpu: fix context switch
Properly protect the state and also handle submission failures.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:34 -04:00
Christian König d919ad49ac drm/amdgpu: fix dereference before check
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:33 -04:00
Christian König d2edb07b10 drm/amdgpu: cleanup HDP flush handling
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:32 -04:00
Christian König 66782cec7a drm/amdgpu: always emit GDS switch
Otherwise a process can access the GDS data of another process.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:31 -04:00
Jammy Zhou aa2bdb2476 drm/amdgpu: add CE preamble flag v3
The CE preamble IB can be dropped for the same context

v2: use the flags directly
v3: remove 'CE' for potential preamble usage by other rings

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:31 -04:00
Jammy Zhou de807f818b drm/amdgpu: add flags for amdgpu_ib structure
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:30 -04:00
Jammy Zhou 72efa7ebde drm/amdgpu: check context id for context switching (v2)
check the filp is not robust, and sometimes different contexts may
have same filp value.

v2: check both filp and ctx_id

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:29 -04:00
Jammy Zhou 66b3cf2ab3 drm/amdgpu: add ctx_id to the WAIT_CS IOCTL (v4)
It is required to support fence per context.

v2: add amdgpu_ctx_get/put
v3: improve get/put
v4: squash hlock fix

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:29 -04:00
Jack Xiao 74a5d1656e drm/amdgpu: allow unaligned memory access (v2)
Set up the CP and SDMA for proper unaligned memory access.
Required for OpenCL 2.x

v2: udpate commit message

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-06-03 21:03:28 -04:00
Marek Olšák 0147ee0f59 drm/amdgpu: make the CTX ioctl thread-safe
The existing locks were protecting the list, but not the elements.

v2: rename hlock to lock

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-06-03 21:03:27 -04:00
Marek Olšák f11358daa9 drm/amdgpu: remove unsafe context releasing
If ctx was released between put and get, then "get" would crash.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-06-03 21:03:27 -04:00
Christian König a961ea7349 drm/amdgpu: fix userptr lockup
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
2015-06-03 21:03:26 -04:00
monk.liu dd08fae1e9 drm/amdgpu: fix userptr BO unpin bug (v2)
sg could point to array of contigiouse page*, only free page could lead
to memory leak.

v2: use iterator

Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:25 -04:00
Jammy Zhou 886712881d drm/amdgpu: remove AMDGPU_GEM_CREATE_CPU_GTT_UC
This flag isn't used by user mode drivers, remove it to avoid
confusion. And rename GTT_WC to GTT_USWC to make it clear.

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:24 -04:00
Sonny Jiang 46651cc5db drm/amdgpu fix amdgpu.dpm=0 (v2)
Fix crash when disabling dpm.

v2: agd5f: fix coding style, cleanup commit message

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:24 -04:00
Alex Deucher c92b90ccc2 drm/amdgpu: memset gds_info struct in info ioctl
Avoids possibility that info may leak via the uninitialized
_pad element.

Noticed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:23 -04:00
Alex Deucher 1045745742 drm/amdgpu: fix error handling in cz_dpm_hw_fini/cz_dpm_suspend
Need to unlock the mutex on error.

Noticed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:22 -04:00
monk.liu decee87a88 drm/amdgpu: let bo_list handler start from 1
this could prevent mis-understanding, because libdrm side will consider
no bo_list created if handleis zero

Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:22 -04:00
monk.liu 840d51445f drm/amdgpu: fix bug occurs when bo_list is NULL
Still need to handle ibs BO and validate them even bo_list is NULL

Signed-off-by: Monk.Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:21 -04:00
Jack Xiao 7ab7e8a409 drm/amdgpu: fix error check issue in amdgpu_mn_invalidate_range_start
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:20 -04:00
Alex Deucher 2d8bd23a05 drm/amdgpu: drop ttm two ended allocation
amdgpu_bo_create() calls amdgpu_ttm_placement_from_domain()
before ttm_bo_init() is called.  amdgpu_ttm_placement_from_domain()
uses the ttm bo size to determine when to select top down
allocation but since the ttm bo is not initialized yet the
check is always false.  It only took affect when buffers
were validated later.  It also seemed to regress suspend
and resume on some systems possibly due to it not
taking affect in amdgpu_bo_create().

amdgpu_bo_create() and amdgpu_ttm_placement_from_domain()
need to be reworked substantially for this to be optimally
effective.  Re-enable it at that point.

Ported from radeon commit:
a239118a24

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-06-03 21:03:20 -04:00
Alex Deucher 1256a8b89e drm/amdgpu: add VI pci ids
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:19 -04:00
Alex Deucher 89330c391b drm/amdgpu: add CIK pci ids
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:18 -04:00
Alex Deucher aaa36a976b drm/amdgpu: Add initial VI support
This adds initial support for VI asics.  This
includes Iceland, Tonga, and Carrizo.  Our inital
focus as been Carrizo, so there are still gaps in
support for Tonga and Iceland, notably power
management.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:17 -04:00
Alex Deucher a2e73f56fa drm/amdgpu: Add support for CIK parts
This patch adds support for CIK parts.  These parts
are also supported by radeon which is the preferred
option, so there is a config option to enable support
for CIK parts in amdgpu for testing.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:17 -04:00
Alex Deucher 18da4340e6 drm/amdgpu: Do not directly dereference pointers to BIOS area.
Use readb() and memcpy_fromio() accessors instead.

Ported from radeon commit:
f2c9e560b4

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:16 -04:00
Alex Deucher 17b10f941f drm/amdgpu: fix const warnings in amdgpu_connectors.c
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:15 -04:00
Alex Deucher d38ceaf99e drm/amdgpu: add core driver (v4)
This adds the non-asic specific core driver code.

v2: remove extra kconfig option
v3: implement minor fixes from Fengguang Wu
v4: fix cast in amdgpu_ucode.c

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:15 -04:00
Alex Deucher 97b2e202fb drm/amdgpu: add amdgpu.h (v2)
This is the main header file for amdgpu.

v2: remove stable comments

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:14 -04:00
Alex Deucher 8a94f39580 drm/amdgpu: add amdgpu_family.h
This header defines asic families and attributes.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:13 -04:00
Alex Deucher b111f7e4d2 drm/amdgpu: add ppsmc.h
This header provides the smc message interface for the driver.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:13 -04:00
Alex Deucher bd098eb0ee drm/amdgpu: add clearstate_defs.h
This header provides for format for the GCA blocks
clear state (i.e., default state).  Each GCA version
has a specific clear state.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:12 -04:00
Alex Deucher a02860aa2b drm/amdgpu: add atombios headers
These headers define the atombios table structure and
driver interface.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:11 -04:00
Alex Deucher c481a6802e drm/amdgpu: add VCE 3.0 register headers
These are register headers for the VCE (Video Codec Engine)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:10 -04:00
Alex Deucher 683595a6f3 drm/amdgpu: add VCE 2.0 register headers
These are register headers for the VCE (Video Codec Engine)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:09 -04:00
Alex Deucher 3b1e08cb29 drm/amdgpu: add UVD 6.0 register headers
These are register headers for the UVD (Universal Video Decoder)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:08 -04:00
Alex Deucher 7aa27c3773 drm/amdgpu: add UVD 5.0 register headers
These are register headers for the UVD (Universal Video Decoder)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:08 -04:00
Alex Deucher 8630f839e0 drm/amdgpu: add UVD 4.2 register headers
These are register headers for the UVD (Universal Video Decoder)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:07 -04:00
Alex Deucher 47e6898750 drm/amdgpu: add SMU 8.0 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:06 -04:00
Alex Deucher bc136e1329 drm/amdgpu: add SMU 7.1.2 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:05 -04:00
Alex Deucher c4712a10e7 drm/amdgpu: add SMU 7.1.1 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:05 -04:00
Alex Deucher 90593ac0da drm/amdgpu: add SMU 7.1.0 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:04 -04:00
Alex Deucher a4efaabae5 drm/amdgpu: add SMU 7.0.1 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:03 -04:00
Alex Deucher 9b289c2610 drm/amdgpu: add SMU 7.0.0 register headers
These are register headers for the SMU (System Management Unit)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:03 -04:00
Alex Deucher a1ef4a8aa1 drm/amdgpu: add OSS 3.0.1 register headers
These are register headers for the OSS (OS Services)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:02 -04:00
Alex Deucher 6d5506b617 drm/amdgpu: add OSS 3.0 register headers
These are register headers for the OSS (OS Services)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:01 -04:00
Alex Deucher 3f2ec6f51d drm/amdgpu: add OSS 2.4 register headers
These are register headers for the OSS (OS Services)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:00 -04:00
Alex Deucher 599bd21552 drm/amdgpu: add OSS 2.0 register headers
These are register headers for the OSS (OS Services)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:03:00 -04:00
Alex Deucher 8f54b7c9eb drm/amdgpu: add GMC 8.2 register headers
These are register headers for the GMC (Graphics Memory Controller)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:59 -04:00
Alex Deucher bd6a6b43fd drm/amdgpu: add GMC 8.1 register headers
These are register headers for the GMC (Graphics Memory Controller)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:58 -04:00
Alex Deucher 973305270b drm/amdgpu: add GMC 7.1 register headers
These are register headers for the GMC (Graphics Memory Controller)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:57 -04:00
Alex Deucher 52fb57e7ee drm/amdgpu: add GMC 7.0 register headers
These are register headers for the GMC (Graphics Memory Controller)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:57 -04:00
Alex Deucher 675892a184 drm/amdgpu: add GCA 8.0 register headers
These are register headers for the GCA (Graphics and Compute Array)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:56 -04:00
Alex Deucher 46d5a27269 drm/amdgpu: add GCA 7.2 register headers
These are register headers for the GCA (Graphics and Compute Array)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:55 -04:00
Alex Deucher 9f24d8ce25 drm/amdgpu: add GCA 7.0 register headers
These are register headers for the GCA (Graphics and Compute Array)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:55 -04:00
Alex Deucher d180bab3a8 drm/amdgpu: add DCE 11.0 register headers
These are register headers for the DCE (Display and Composition Engine)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:54 -04:00
Alex Deucher 36cfed855d drm/amdgpu: add DCE 10.0 register headers
These are register headers for the DCE (Display and Composition Engine)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:53 -04:00
Alex Deucher 26159c86dd drm/amdgpu: add DCE 8.0 register headers
These are register headers for the DCE (Display and Composition Engine)
block on the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:52 -04:00
Alex Deucher 3e5343bd7c drm/amdgpu: add BIF 5.1 register headers
These are register headers for the BIF (Bus InterFace) block on
the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:51 -04:00
Alex Deucher 848ebfd731 drm/amdgpu: add BIF 5.0 register headers
These are register headers for the BIF (Bus InterFace) block on
the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:51 -04:00
Alex Deucher 054e4c60fe drm/amdgpu: add BIF 4.1 register headers
These are register headers for the BIF (Bus InterFace) block on
the GPU.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-03 21:02:50 -04:00
Alexey Skidanov 826f5de84c drm/amdkfd: fix topology bug with capability attr.
This patch fixes a bug where the number of watch points
was shown before it was actually calculated

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 21:45:54 +03:00
Ben Goz c3447e8150 drm/amdkfd: Enforce kill all waves on process termination
This commit makes sure that on process termination, after
we're destroying all the active queues, we're killing all the
existing wave front of the current process.

By doing this we're making sure that if any of the CUs were blocked
by infinite loop we're enforcing it to end the shader explicitly.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:47 +03:00
Alexey Skidanov edad40239f drm/radeon: Add ATC VMID<-->PASID functions to kfd->kgd
This patch adds three new interfaces to kfd2kgd interface file of radeon.

The interfaces are:

- Check if a specific VMID has a valid PASID mapping
- Retrieve the PASID which is mapped to a specific VMID
- Issue a VMID invalidation request to the ATC

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:46 +03:00
Yair Shachar f8bd13338a drm/amdkfd: Implement address watch debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:35 +03:00
Yair Shachar 9448458998 drm/amdkfd: Implement wave control debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:26 +03:00
Yair Shachar 037ed9a2ac drm/amdkfd: Implement (un)register debugger IOCTLs
v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:07 +03:00
Yair Shachar e2e9afc4a3 drm/amdkfd: Add address watch operation to debugger
The address watch operation gives the ability to specify watch points
which will generate a shader breakpoint, based on a specified single
address or range of addresses.

There is support for read/write/any access modes.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar 788bf83db3 drm/amdkfd: Add wave control operation to debugger
The wave control operation supports several command types executed upon
existing wave fronts that belong to the currently debugged process.

The available commands are:

HALT   - Freeze wave front(s) execution
RESUME - Resume freezed wave front(s) execution
KILL   - Kill existing wave front(s)

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar fbeb661bfa drm/amdkfd: Add skeleton H/W debugger module support
This patch adds the skeleton H/W debugger module support. This code
enables registration and unregistration of a single HSA process at a
time.

The module saves the process's pasid and use it to verify that only the
registered process is allowed to execute debugger operations through the
kernel driver.

v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar 992839ad64 drm/amdkfd: Add static user-mode queues support
This patch adds support for static user-mode queues in QCM.
Queues which are designated as static can NOT be preempted by
the CP microcode when it is executing its scheduling algorithm.

This is needed for supporting the debugger feature, because we
can't allow the CP to preempt queues which are currently being debugged.

The number of queues that can be designated as static is limited by the
number of HQDs (Hardware Queue Descriptors).

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar aef11009c4 drm/amdkfd: add H/W debugger IOCTL set definitions
This patch adds four new IOCTLs to amdkfd. These IOCTLs expose a H/W
debugger functionality to the userspace.

The IOCTLs are:

- AMDKFD_IOC_DBG_REGISTER:

The purpose of this IOCTL is to notify amdkfd that a process wants to use
GPU debugging facilities on itself only.
It is expected that this IOCTL would be called before any other H/W
debugger requests are sent to amdkfd and for each GPU where the H/W
debugging needs to be enabled. The use of this IOCTL ensures that only
one instance of a debugger is active in the system.

- AMDKFD_IOC_DBG_UNREGISTER:

This IOCTL detaches the debugger/debugged process from the H/W
Debug which was established by the AMDKFD_IOC_DBG_REGISTER IOCTL.

- AMDKFD_IOC_DBG_ADDRESS_WATCH:

This IOCTL allows to set different watchpoints with various conditions as
indicated by the IOCTL's arguments. The available number of watchpoints
is retrieved from topology. This operation is confined to the current
debugged process, which was registered through AMDKFD_IOC_DBG_REGISTER.

- AMDKFD_IOC_DBG_WAVE_CONTROL:

This IOCTL allows to control a wavefront as indicated by the IOCTL's
arguments. For example, you can halt/resume or kill either a
single wavefront or a set of wavefronts. This operation is confined to
the current debugged process, which was registered through
AMDKFD_IOC_DBG_REGISTER.

Because the arguments for the address watch IOCTL and wave control IOCTL
are dynamic, meaning that they could vary in size, the userspace passes a
pointer to a structure (in userspace) that contains the value of the
arguments. The kernel driver is responsible to parse this structure and
validate its contents.

v2: change void* to uint64_t inside ioctl arguments

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:07 +03:00
Yair Shachar a6186f4d6f drm/radeon: Add H/W debugger kfd->kgd functions
This patch adds new interface functions to the kfd2kgd interface file. The
new functions allow to perform H/W debugger operations by writing to GPU
registers.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:31:12 +03:00
Joe Perches f761d8bd80 drm/amdkfd: Use DECLARE_BITMAP
Use the generic mechanism to declare a bitmap instead of unsigned long.

It seems that "struct kfd_process.allocated_queue_bitmap" is unused.
Maybe it could be deleted instead.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:31:12 +03:00
Dave Airlie bdcddf95e8 Linux 4.1-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVWh3TAAoJEHm+PkMAQRiG/kwH/2c9irodp2+M9OUnX2bfsBb6
 LnChiDpvkF5BB8jhP6d/XmvPp4NJzAbTxByhjdfb2E2HkorCUHCOIn2tI1TE2pUs
 2qjkOVH+XCzoV0goGtQjzK1ht8f2IrtlDiEjyRekK5cJHzhggb22QPtWL4npyd0O
 reDmG2jsRaF9POr9uLSFEv4CEnkksmRLUU0vuQX0TZeCJ41O7TXrkN/wKrLZ5mj4
 IWpqXQaSlrffq/T5HnVbXBxk3/T8QmhrIoppiMpV1mUVj0uTqlFRNi5qwT2Nit1h
 FVljWI4+WgOk3bf7fUlp+ahopjkTgu+GuXkiRP/pdgWNQO0cxCWSAzSndAlIIAE=
 =uOoJ
 -----END PGP SIGNATURE-----

Backmerge v4.1-rc4 into into drm-next

We picked up a silent conflict in amdkfd with drm-fixes and drm-next,
backmerge v4.1-rc5 and fix the conflicts

Signed-off-by: Dave Airlie <airlied@redhat.com>

Conflicts:
	drivers/gpu/drm/drm_irq.c
2015-05-20 16:23:53 +10:00
Oded Gabbay 7591cd2cd5 drm/amdkfd: change driver version to 0.7.2
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:30 +03:00
Andrew Lewycky 8377396b5d drm/amdkfd: Implement events IOCTLs
Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:29 +03:00
Oded Gabbay 81663016db drm/amdkfd: Add module parameter of send_sigterm
This patch adds a new kernel module parameter to amdkfd,
called send_sigterm.

This parameter specifies whether amdkfd should send the
SIGTERM signal to an HSA process, when the following conditions
occur:

1. The GPU triggers an exception regarding a kernel that was
   issued by this process.

2. The HSA process isn't waiting on an event that handles
   this exception.

The default behavior is not to send a SIGTERM and suffice
with a dmesg error print.

Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:28 +03:00
Alexey Skidanov 930c5ff439 drm/amdkfd: Add bad opcode exception handling
Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:28 +03:00
Alexey Skidanov 59d3e8be87 drm/amdkfd: Add memory exception handling
This patch adds Peripheral Page Request (PPR) failure processing
and reporting.

Bad address or pointer to a system memory block with inappropriate
read/write permission cause such PPR failure during a user queue
processing. PPR request handling is done by IOMMU driver notifying
AMDKFD module on PPR failure.

The process triggering a PPR failure will be notified by
appropriate event or SIGTERM signal will be sent to it.

v3:
- Change all bool fields in struct kfd_memory_exception_failure to
  uint32_t

Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:27 +03:00
Andrew Lewycky f3a398183f drm/amdkfd: Add the events module
This patch adds the events module (kfd_events.c) and the interrupt
handle module for Kaveri (cik_event_interrupt.c).

The patch updates the interrupt_is_wanted(), so that it now calls the
interrupt isr function specific for the device that received the
interrupt. That function(implemented in cik_event_interrupt.c)
returns whether this interrupt is of interest to us or not.

The patch also updates the interrupt_wq(), so that it now calls the
device's specific wq function, which checks the interrupt source
and tries to signal relevant events.

v2:

Increase limit of signal events to 4096 per process
Remove bitfields from struct cik_ih_ring_entry
Rename radeon_kfd_event_mmap to kfd_event_mmap
Add debug prints to allocate_free_slot and allocate_signal_page
Make allocate_event_notification_slot return a correct value
Add warning prints to create_signal_event
Remove error print from IOCTL path
Reformatted debug prints in kfd_event_mmap
Map correct size (as received from mmap) in kfd_event_mmap

v3:

Reduce limit of signal events back to 256 per process
Fix allocation of kernel memory for signal events

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:26 +03:00
Andrew Lewycky 29a5d3eb9a drm/amdkfd: add events IOCTL set definitions
- AMDKFD_IOC_CREATE_EVENT:
	Creates a new event of a specified type

- AMDKFD_IOC_DESTROY_EVENT:
	Destroys an existing event

- AMDKFD_IOC_SET_EVENT:
	Signal an existing event

- AMDKFD_IOC_RESET_EVENT:
	Reset an existing event

- AMDKFD_IOC_WAIT_EVENTS:
	Wait on event(s) until they are signaled

v2:

- Move the limit of the signal events to kfd_ioctl.h so it
  can be used by userspace

v3:
- Change all bool fields in struct kfd_memory_exception_failure
to uint32_t

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:00 +03:00
Andrew Lewycky 2249d55827 drm/amdkfd: Add interrupt handling module
This patch adds the interrupt handling module, kfd_interrupt.c, and its
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring
per amdkfd device. The internal interrupt ring contains interrupts
that needs further handling. The extra handling is deferred to
a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware
simply queues a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.

However, only interrupts that are "wanted" by amdkfd, are copied into
the amdkfd s/w interrupt ring, in order to minimize the chances
for overflow of the ring.

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 12:13:39 +03:00
Oded Gabbay d36b94fcf0 drm/radeon: Add init interrupt kfd->kgd interface
This patch adds a new interface function to the kfd->kgd interface.
The function is kgd_init_interrupts() and its function is to
initialize a pipe's interrupts.

The function currently enables the timestamp interrupt and the
bad opcode interrupt.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 12:13:39 +03:00
Oded Gabbay 3e3f6e1a90 drm/amdkfd: make the sdma vm init to be asic specific
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-19 12:13:39 +03:00
Oded Gabbay d42af779fb drm/amdkfd: Use new struct for asic specific ops
This patch creates a new structure for asic specific operations, instead
of using the existing structure of operations.

This is done to make the code flow more logic, readable and maintainable.

The change is done only to the device queue manager module at this point.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-19 12:13:38 +03:00
Oded Gabbay 8856d8e048 drm/amdkfd: reformat some debug prints
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 12:13:38 +03:00
Firo Yang 1549fcd15c drm/amdkfd: Remove unessary void pointer cast
kmalloc() returns a void pointer - no need to cast it in
drivers/gpu/drm/amd/amdkfd/kfd_process.c::kfd_process_destroy_delayed()

Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-05-19 12:13:38 +03:00
Xihan Zhang 79b066bd76 drm/amdkfd: Initialize sdma vm when creating sdma queue
This patch fixes a bug where sdma vm wasn't initialized when
an sdma queue was created in HWS mode.

This caused GPUVM faults to appear on dmesg and it is one of the
causes that SDMA queues are not working.

Signed-off-by: Xihan Zhang <xihan.zhang@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.comt>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-07 17:38:06 +03:00
Oded Gabbay 42e08c7836 drm/amdkfd: Don't report local memory size
This patch sets the local memory size that is reported to userspace to 0.
This is done to make sure that userspace won't try to allocate local memory
for HSA.

As long as amdkfd doesn't support allocating local memory for HSA,
we need this patch.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-07 17:37:52 +03:00
Oded Gabbay 1e5ec956a0 drm/amdkfd: allow unregister process with queues
Sometimes we might unregister process that have queues, because we couldn't
preempt the queues. Until now we blocked it with BUG_ON but instead just
print it as debug.

Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-07 17:37:41 +03:00
Dave Airlie 9e87e48f8e Merge tag 'drm-intel-next-2015-03-27-merge' of git://anongit.freedesktop.org/drm-intel into drm-next
This backmerges 4.0-rc6 due to the recent fixes in rc5/6

- DP link rate refactoring from Ville
- byt/bsw rps tuning from Chris
- kerneldoc for the shrinker code
- more dynamic ppgtt pte work (Michel, Ben, ...)
- vlv dpll code refactoring to prep fro bxt (Imre)
- refactoring the sprite colorkey code (Ville)
- rotated ggtt view support from Tvrtko
- roll out struct drm_atomic_state to prep for atomic update (Ander)

* tag 'drm-intel-next-2015-03-27-merge' of git://anongit.freedesktop.org/drm-intel: (473 commits)
  Linux 4.0-rc6
  arm64: juno: Fix misleading name of UART reference clock
  drm/i915: Update DRIVER_DATE to 20150327
  drm/i915: Skip allocating shadow batch for 0-length batches
  drm/i915: Handle error to get connector state when staging config
  drm/i915: Compare GGTT view structs instead of types
  drm/i915: fix simple_return.cocci warnings
  drm/i915: Add module param to test the load detect code
  drm/i915: Remove usage of encoder->new_crtc from clock computations
  drm/i915: Don't look at staged config crtc when changing DRRS state
  drm/i915: Convert intel_pipe_will_have_type() to using atomic state
  drm/i915: Pass an atomic state to modeset_global_resources() functions
  drm/i915: Add dynamic page trace events
  drm/i915: Finish gen6/7 dynamic page table allocation
  drm/i915: Remove unnecessary gen6_ppgtt_unmap_pages
  drm/i915: Fix i915_dma_map_single positive error code
  drm/i915: Prevent out of range pt in gen6_for_each_pde
  drm/i915: fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl
  drm/i915: Rip out GET_SPRITE_COLORKEY ioctl
  watchdog: imgpdc: Fix default heartbeat
  ...
2015-04-01 08:21:46 +10:00
Xihan Zhang cea405b172 drm/amdkfd: Add multiple kgd support
The current code can only support one kgd instance. We have to
support multiple kgd instances in one system. i.e two amdgpu or two
radeon or one amdgpu + one radeon or more than two kgd instances.

Signed-off-by: Xihan Zhang <xihan.zhang@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-03-25 14:02:05 +02:00
John Stultz affa7d8644 drm/amdkfd: Convert timestamping to use 64bit time accessors
Convert the timestamping in the amdkfd driver to use a timespec64 and 64bit
time accessors.

Although the existing code is completely safe beyond y2038 because it deals
with monotonic time, this patch is still needed in order to kill off all uses
of struct timespec.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-03-25 14:02:05 +02:00
Oded Gabbay 94a1ee0923 drm/amdkfd: add debug prints for process teardown
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-03-25 14:02:05 +02:00
Oded Gabbay 0d9200874c drm/amdkfd: Remove unused field from struct qcm_process_device
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-03-25 14:02:05 +02:00
Oded Gabbay a9243ede5d drm/amdkfd: rename fence_wait_timeout
fence_wait_timeout() is an exported kernel symbol, so we should rename our
local function to something different.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-03-25 14:02:05 +02:00
Ben Goz 4fadf6b657 drm/amdkfd: Fix SDMA queue init. in non-HWS mode
This patch fixes the SDMA queue initialization, when running in non-HWS mode.

The first fix is to move the initialization of SDMA VM parameters before the
initialization of the SDMA MQD.

The second fix is to load the MQD to an HQD after the initialization of the MQD.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-03-16 23:36:58 +02:00
Ben Goz aaad2d8c7b drm/amdkfd: destroy mqd when destroying kernel queue
This patch adds a missing destruction of mqd, when destroying a kernel queue.
Without the destruction, there is a memory leakage when repeatedly creating and
destroying kernel queues.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2015-03-16 23:36:58 +02:00
Oded Gabbay 64ea8f4af5 drm/amdkfd: don't set get_pipes_num() as inline
get_pipes_num() calls BUG_ON so we can't set it as inline because it produces a
warning as BUG_ON() uses static variables when it is expanded.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-02-23 10:48:02 +02:00
Oded Gabbay 1365aa6266 drm/amdkfd: Initialize only amdkfd's assigned pipelines
This patch fixes a bug in the initialization of the pipelines. The
init_pipelines() function was called with a constant value of 0 in the
first_pipe argument. This is an error because amdkfd doesn't handle pipe 0.

The correct way is to pass the value that get_first_pipe() returns as the
argument for first_pipe.

This bug appeared in 3.19 (first version with amdkfd) and it causes around 15%
drop in CPU performance of Kaveri (A10-7850).

v2: Don't set get_first_pipe() as inline because it calls BUG_ON()

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-23 10:47:56 +02:00
Linus Torvalds 796e1c5571 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main drm pull, it has a shared branch with some alsa
  crossover but everything should be acked by relevant people.

  New drivers:
     - ATMEL HLCDC driver
     - designware HDMI core support (used in multiple SoCs).

  core:
     - lots more atomic modesetting work, properties and atomic ioctl
       (hidden under option)
     - bridge rework allows support for Samsung exynos chromebooks to
       work finally.
     - some more panels supported

  i915:
     - atomic plane update support
     - DSI uses shared DSI infrastructure
     - Skylake basic support is all merged now
     - component framework used for i915/snd-hda interactions
     - write-combine cpu memory mappings
     - engine init code refactored
     - full ppgtt enabled where execlists are enabled.
     - cherryview rps/gpu turbo and pipe CRC support.

  radeon:
     - indirect draw support for evergreen/cayman
     - SMC and manual fan control for SI/CI
     - Displayport audio support

  amdkfd:
     - SDMA usermode queue support
     - replace suballocator usage with more suitable one
     - rework for allowing interfacing to more than radeon

  nouveau:
     - major renaming in prep for later splitting work
     - merge arm platform driver into nouveau
     - GK20A reclocking support

  msm:
     - conversion to atomic modesetting
     - YUV support for mdp4/5
     - eDP support
     - hw cursor for mdp5

  tegra:
     - conversion to atomic modesetting
     - better suspend/resume support for child devices

  rcar-du:
     - interlaced support

  imx:
     - move to using dw_hdmi shared support
     - mode_fixup support

  sti:
     - DVO support
     - HDMI infoframe support

  exynos:
     - refactoring and cleanup, removed lots of internal unnecessary
       abstraction
     - exynos7 DECON display controller support

  Along with the usual bunch of fixes, cleanups etc"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (724 commits)
  drm/radeon: fix voltage setup on hawaii
  drm/radeon/dp: Set EDP_CONFIGURATION_SET for bridge chips if necessary
  drm/radeon: only enable kv/kb dpm interrupts once v3
  drm/radeon: workaround for CP HW bug on CIK
  drm/radeon: Don't try to enable write-combining without PAT
  drm/radeon: use 0-255 rather than 0-100 for pwm fan range
  drm/i915: Clamp efficient frequency to valid range
  drm/i915: Really ignore long HPD pulses on eDP
  drm/exynos: Add DECON driver
  drm/i915: Correct the base value while updating LP_OUTPUT_HOLD in MIPI_PORT_CTRL
  drm/i915: Insert a command barrier on BLT/BSD cache flushes
  drm/i915: Drop vblank wait from intel_dp_link_down
  drm/exynos: fix NULL pointer reference
  drm/exynos: remove exynos_plane_dpms
  drm/exynos: remove mode property of exynos crtc
  drm/exynos: Remove exynos_plane_dpms() call with no effect
  drm/i915: Squelch overzealous uncore reset WARN_ON
  drm/i915: Take runtime pm reference on hangcheck_info
  drm/i915: Correct the IOSF Dev_FN field for IOSF transfers
  drm/exynos: fix DMA_ATTR_NO_KERNEL_MAPPING usage
  ...
2015-02-16 15:48:00 -08:00
Oded Gabbay b9dce23ddc drm/amdkfd: Don't create BUG due to incorrect user parameter
This patch changes a BUG_ON() statement to pr_debug, in case the user tries to
update a non-existing queue.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02 09:45:24 +02:00
Oded Gabbay ca400b2a1a drm/amdkfd: max num of queues can't be 0
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02 09:45:24 +02:00
Oded Gabbay 8b58f26111 drm/amdkfd: Fix bug in accounting of queues
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02 09:45:24 +02:00
Oded Gabbay 9fa843e76d drm/amdkfd: Fix bug in call to init_pipelines()
This patch fixes a bug where the first_pipe index passed into init_pipelines()
was a #define instead of the value that is passed into amdkfd by radeon

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-01-22 12:50:37 +02:00
Oded Gabbay 749042b012 drm/amdkfd: Fix bug in pipelines initialization
This patch fixes a bug when calling to init_pipeline() interface.
The index that was passed to that function didn't take into account the
first_pipe value, which represents the first pipe index that is under amdkfd's
responsibility.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-01-22 11:03:42 +02:00
Jay Cornwall d752f95e55 drm/amdkfd: Preserve CP_MQD_IQ_RPTR internal state
CP microcode uses undocumented bits in this register to record queue
state information. The KFD zeroes these bits in update_mqd, when invoked
through the UPDATE_QUEUE ioctl, causing incoherent state when the ioctl
is used to successively unmap and map a queue.

Since the queue type cannot be changed in this path, move the MQD write
to init_mqd.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-19 11:47:34 -06:00
Jay Cornwall b6819cec29 drm/amdkfd: Fix dqm->queue_count tracking
dqm->queue_count tracks queues in the active state only. In a few
places this count is modified unconditionally, leading to an incorrect
value when the UPDATE_QUEUE ioctl is used to make a queue inactive.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-19 16:08:14 -06:00
Dave Airlie b3869b17fd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
This backmerges drm-fixes into drm-next mainly for the amdkfd
stuff, I'm not 100% confident, but it builds and the amdkfd
folks can fix anything up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Conflicts:
	drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
	drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
2015-01-29 11:45:31 +10:00
Oded Gabbay f9dcced8d4 drm/amdkfd: change amdkfd version to 0.7.1
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-22 17:53:03 +02:00
Oded Gabbay 0b3674ae1c drm/amdkfd: Fix sparse errors
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-22 17:52:50 +02:00
Oded Gabbay 7113cd6529 drm/amdkfd: Handle case of invalid queue type
This patch handles a case where amdkfd tries to destroy a queue but the queue
type is invalid.
This case occurs in non-HWS path.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-01-22 12:43:42 +02:00
Oded Gabbay 300dec9578 drm/amdkfd: Add break at the end of case
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-01-22 12:43:37 +02:00
Oded Gabbay 010b82e754 drm/amdkfd: Remove negative check of uint variable
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-01-22 12:43:28 +02:00
Dave Airlie 281d1bbd34 Merge remote-tracking branch 'origin/master' into drm-next
Backmerge Linus tree after rc5 + drm-fixes went in.

There were a few amdkfd conflicts I wanted to avoid,
and Ben requested this for nouveau also.

Conflicts:
	drivers/gpu/drm/amd/amdkfd/Makefile
	drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
	drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
	drivers/gpu/drm/amd/amdkfd/kfd_priv.h
	drivers/gpu/drm/amd/include/kgd_kfd_interface.h
	drivers/gpu/drm/i915/intel_runtime_pm.c
	drivers/gpu/drm/radeon/radeon_kfd.c
2015-01-22 10:44:41 +10:00
Oded Gabbay b8cbab042c drm/amdkfd: Allow user to limit only queues per device
This patch replaces the two current amdkfd module parameters with a new one.

The current parameters that are being replaced are:

- Maximum number of HSA processes
- Maximum number of queues per process

The new parameter that replaces them is called "Maximum queues per device"

This replacement achieves two goals:

- Allows the user to have as many HSA processes as it wants (until
  a maximum of 512 HSA processes in Kaveri).

- Removes the limitation the user had on maximum number of queues per HSA
  process. E.g. the user can now have processes which only have one queue and
  other processes which have hundreds of queues, while before the user
  couldn't have more than 128 queues per process (as default).

The default value of the new parameter is 4096 (32 * 128, which were the
defaults of the old parameters). There is almost no additional GART memory
required for the default case. As a reminder, this amount of queues requires a
little bit below 4MB of GART memory.

v2:
In addition, This patch defines a new counter for queues accounting in the DQM
structure. This is done because the current counter only counts active queues
which allows the user to create more queues than the
max_num_of_queues_per_device module parameter allows.

However, we need the current counter for the runlist packet build process, so
the solution is to have a dedicated counter for this accounting.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
2015-01-18 13:18:01 +02:00
Ben Goz cb2ac44128 drm/amdkfd: Fix description of sched_policy module parameter
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-18 13:18:01 +02:00
Ben Goz f046bfdf73 drm/amdkfd: PQM handle queue creation fault
If the first queue created was failed on DQM then PQM should
unregister the process from DQM.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-15 17:14:47 +02:00
Oded Gabbay 939f4a20a7 drm/amdkfd: Remove sync_with_hw() from amdkfd
This patch completely removes the sync_with_hw() because it was broken and
actually there is no point of using it.

This function was used to:

- Make sure that the submitted packet to the HIQ (which is a kernel queue) was
  read by the CP. However, it was discovered that the method this function used
  to do that (checking wptr == rptr) is not consistent with how the actual CP
  firmware works in all cases.

- Make sure that the queue is empty before issuing the next packet. To achieve
  that, the function blocked amdkfd from continuing until the recently
  submitted packet was consumed. However, the acquire_packet_buffer() already
  checks if there is enough room for a new packet so calling sync_with_hw() is
  redundant.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-15 12:07:48 +02:00
Oded Gabbay c51841fbbb drm/amdkfd: Remove unused function busy_wait()
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-15 12:04:10 +02:00
Oded Gabbay 99331a51cc drm/amdkfd: Replace cpu_relax() with schedule() in DQM
In order not to occupy the current core and thus prevent the core from
servicing IOMMU PPR requests, this patch replaces the call in DQM to
cpu_relax() with a call to schedule().

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-15 12:01:10 +02:00
Ben Goz f0ec5b9905 drm/amdkfd: Fix for-loop when allocating HQD (non-HWS)
This patch fixes a minor bug in allocate_hqd(), where the loop run from the
next-to-allocate pipe until the number of pipes.

This is wrong because we need to consider the possibility where
next-to-allocate pipe is not 0, and thus, the for-loop only checks part of the
pipes and doesn't wrap-around, as it supposed to do.

Therefore, we add another counting variable to make sure we go over all the
pipes, regardless of where we start to look at the first iteration of the loop.

This bug only affected non-HWS mode. In HWS mode, the CP fw is responsible for
allocating the HQD.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-13 11:18:06 +02:00
Oded Gabbay 8dfe58b206 drm/amdkfd: Fix sparse warning (different address space)
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-08 16:46:16 +02:00
Michel Dänzer 6ee0ad2a7f drm/amdkfd: Drop interrupt SW ring buffer
The work queue couldn't reliably prevent the SW ring buffer from
overflowing, so dmesg was spammed by

 kfd kfd: Interrupt ring overflow, dropping interrupt.

messages when running e.g. the Atlantis Substance demo from
https://wiki.unrealengine.com/Linux_Demos on Kaveri.

Since the SW ring buffer doesn't actually do anything at this point, just
remove it for now. When actual interrupt processing code is added to
amdkfd, it should try to do things immediately and only defer to work
queues when necessary.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-08 13:27:15 +09:00
Oded Gabbay 76baee6c73 drm/amdkfd: rewrite kfd_ioctl() according to drm_ioctl()
This patch changes kfd_ioctl() to be very similar to drm_ioctl().

The patch defines an array of amdkfd_ioctls, which maps IOCTL definition to the
ioctl function.

The kfd_ioctl() uses that mapping to call the appropriate ioctl function,
through a function pointer.

This patch also declares a new typedef for the ioctl function pointer.

v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_...

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:36 +02:00
Oded Gabbay b81c55db10 drm/amdkfd: reformat IOCTL definitions to drm-style
This patch reformats the ioctl definitions in kfd_ioctl.h to be similar to the
drm ioctls definition style.

v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_...

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:36 +02:00
Oded Gabbay 524a640444 drm/amdkfd: Do copy_to/from_user in general kfd_ioctl()
This patch moves the copy_to_user() and copy_from_user() calls from the
different ioctl functions in amdkfd to the general kfd_ioctl() function, as
this is a common code for all ioctls.

This was done according to example taken from drm_ioctl.c

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:26 +02:00
Ben Goz 2030664b70 drm/amdkfd: unmap VMID<-->PASID when relesing VMID (non-HWS)
This patch fixes a bug where deallocate_vmid() didn't actually unmap the
VMID<-->PASID mapping (in the registers).
That can cause undefined behavior.

This bug only occurs in non-HWS mode.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-05 15:48:28 +02:00
Ben Goz 030e416b4f drm/amdkfd: Load mqd to hqd in non-HWS mode
This patch fixes a bug in DQM, where the MQD of a newly created compute queue
is not loaded to an HQD slot. As a result, the CP never reads packets from this
queue.

This bug happens only in non-HWS (hardware scheduling) mode. In HWS mode, the
CP is responsible of loading MQDs to HQDs slots.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-04 21:46:44 +02:00
Ben Goz b64b8afcca drm/amd: Fixing typos in kfd<->kgd interface
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-09 12:00:09 +02:00
Sasha Levin 68d0cb49f8 amdkfd: actually allocate longs for the pasid bitmask
Commit "amdkfd: use sizeof(long) granularity for the pasid bitmask" calculated
the number of longs it will need, but ended up allocating that number of
bytes rather than longs.

Fix that silly error and allocate the amount of data really required.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-28 11:44:37 -05:00
Oded Gabbay 0cb989c0c6 amdkfd: Remove duplicate include
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-05 10:40:34 +02:00
Ben Goz 8dfead6c28 amdkfd: Fixing topology bug in building sysfs nodes
Original code sent always 0 as the index number of the node. This patch fixes
this bug by sending a variable which is incremented per node.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-02 16:41:08 +02:00
Oded Gabbay b6ffbab813 amdkfd: Fix accounting of device queues
This patch fixes a device QCM bug, where the number of queues were not
counted correctly for the operation of update queue. The count was incorrect
as there was no regard to the previous state of the queue.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-07 22:27:24 +02:00
Ben Goz 6898f0a568 drm/amdkfd: Add initial VI support for KQ
This patch starts to add support for the VI APU in the KQ (kernel queue)
module.

Because most (more than 90%) of the KQ code is shared among AMD's APUs, we
chose a design that performs most/all the code in the shared KQ file
(kfd_kernel_queue.c). If there is H/W specific code to be executed,
than it is written in an asic-specific extension function for that H/W.

That asic-specific extension function is called from the shared function at the
appropriate time. This requires that for every asic-specific extension function
that is implemented in a specific ASIC, there will be an equivalent
implementation in ALL ASICs, even if those implementations are just stubs.

That way we achieve:

- Maintainability: by having one copy of most of the code, we only need to
  fix bugs at one locations

- Readability: very clear what is the shared code and what is done per ASIC

- Extensibility: very easy to add new H/W specific files/functions

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-12-02 16:38:57 +02:00
Oded Gabbay 443fbd5f11 drm/amdkfd: Encapsulate KQ functions in ops structure
This patch does some re-org on the kernel_queue structure. It takes out
all the function pointers from the structure and puts them in a new structure,
called kernel_queue_ops. Then, it puts an instance of that structure
inside kernel_queue.

This re-org is done to prepare the KQ module to support more than one AMD APU
(Kaveri).

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-12 15:53:44 +02:00
Ben Goz a22fc85495 drm/amdkfd: Add initial VI support for DQM
This patch starts to add support for the VI APU in the DQM module.

Because most (more than 90%) of the DQM code is shared among AMD's APUs, we
chose a design that performs most/all the code in the shared DQM file
(kfd_device_queue_manager.c). If there is H/W specific code to be executed,
than it is written in an asic-specific extension function for that H/W.

That asic-specific extension function is called from the shared function at the
appropriate time. This requires that for every asic-specific extension function
that is implemented in a specific ASIC, there will be an equivalent
implementation in ALL ASICs, even if those implementations are just stubs.

That way we achieve:

- Maintainability: by having one copy of most of the code, we only need to
  fix bugs at one locations

- Readability: very clear what is the shared code and what is done per ASIC

- Extensibility: very easy to add new H/W specific files/functions

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-12 14:28:46 +02:00
Oded Gabbay 45c9a5e429 drm/amdkfd: Encapsulate DQM functions in ops structure
This patch does some re-org on the device_queue_manager structure. It takes out
all the function pointers from the structure and puts them in a new structure,
called device_queue_manager_ops. Then, it puts an instance of that structure
inside device_queue_manager.

This re-org is done to prepare the DQM module to support more than one AMD APU
(Kaveri).

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-12 14:26:10 +02:00
Oded Gabbay 9216ed2940 drm/amdkfd: Don't BUG on freeing GART sub-allocation
Instead of creating a BUG if trying to free a NULL GART sub-allocation object,
just return 0 (success).

This is done to mirror behavior of kfree.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-12 22:34:21 +02:00
Alexey Skidanov dd59239a98 amdkfd: init aperture once per process
Since the user space may call open() more that once from the same process,
the aperture initialization should be moved from kfd_open()

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-18 13:56:23 +02:00
Oded Gabbay f1386fbc2b amdkfd: Display MEC fw version in topology node
This patch displays the firmware version of the microcode that is currently
running in the MEC.
This is needed for the HSA RT, so it could differentiate its behavior based on
fw version. e.g. workarounds for bugs in fw

v2: Send the KGD_ENGINE_MEC1 as a parameter to the get_fw_version()

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-09 12:46:56 +02:00
Oded Gabbay 61466c651f drm/amd: Add get_fw_version to kfd-->kgd interface
This patch adds a new interface to the kfd-->kgd interface.
The new interface function retrieves the firmware version that is currently in
use by the MEC engine. The firmware was uploaded to the MEC engine by the kgd
(radeon).

v2: Added parameter of engine type to interface function

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-09 12:42:22 +02:00
Oded Gabbay a18069c132 amdkfd: Disable support for 32-bit user processes
This patch checks if the process that opens the /dev/kfd device is 32-bit
process. If so, it returns -EPERM and prints a warning message in dmesg.

This is done to prevent 32-bit user processes from using amdkfd, and hence, HSA
features.

AMD's HSA userspace stack will also support only 64-bit processes on Linux.

Reviewed-by: Alexey Skidanov <alexey.skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-05 22:01:35 +02:00
Oded Gabbay a550bb3d53 amdkfd: Set *buffer_ptr to NULL in case of error
In function acquire_packet_buffer() we may return -ENOMEM. In that case, we
should set the *buffer_ptr to NULL, so that calling functions which check the
*buffer_ptr value as a criteria for success, will know that
acquire_packet_buffer() failed.

Reviewed-by: Alexey Skidanov <alexey.skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-04 14:09:02 +02:00
Sasha Levin c448a142a7 amdkfd: use atomic allocations within srcu callbacks
srcu callbacks are running in atomic context, we can't allocate using
__GFP_WAIT.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-03 10:19:36 -05:00
Sasha Levin aeda036c37 amdkfd: use sizeof(long) granularity for the pasid bitmask
All the bit operations (such as find_first_zero_bit()) read sizeof(long) bytes
at a time. If we allocated less than sizeof(long) bytes for the bitmask we
would be accessing invalid memory when working with the bitmask.

Change the allocator to allocate sizeof(long) multiples for the bitmask.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-12-03 09:26:25 -05:00
Dan Carpenter 9cf4a28131 amdkfd: delete some dead code
This is dead code.  We don't need to unbind here, we can just return
directly.

Reviewed-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-25 19:43:29 +03:00
Oded Gabbay 6f9d54fd6e amdkfd: Fix memory leak of mqds on dqm fini
The mqds array members are not freed when dqm is uninitialized.

Reviewed-by: Ben Goz <Ben.Goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-25 15:16:38 +02:00
Dan Carpenter e048a0b260 amdkfd: fix an error handling bug in pqm_create_queue()
The call to kernel_queue_uninit(NULL) will trigger a BUG(), and also the
error code is incorrect.

Fixes: 45102048f7 ('amdkfd: Add process queue manager module')
Reviewed-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-25 13:24:51 +03:00
Dan Carpenter 66333cb3d7 amdkfd: fix some error handling in ioctl
There is a typo here so the errors from kfd_bind_process_to_device()
are not detected.

Reviewed-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-25 13:21:30 +03:00
Oded Gabbay a49493b548 amdkfd: Remove DRM_AMDGPU dependency from Kconfig
This patch removes the dependency of amdkfd upon DRM_AMDGPU symbol in amdkfd's
Kconfig file.

This is done because amdgpu driver is not yet upstreamed and therefore,
DRM_AMDGPU symbol is not present in any Kconfig file.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-21 22:36:09 +02:00
Oded Gabbay 824cb7d136 amdkfd: explicitely include io.h in kfd_doorbell.c
This patch fixes a compilation error when using certain configuration by
including the file io.h in kfd_doorbell.c

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-21 22:04:44 +02:00
Oded Gabbay abc9d3e3b9 amdkfd: Clear ctx cb before suspend
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-09 22:36:22 +02:00
Alexey Skidanov 52a5fdce13 amdkfd: Instead of using get function, use container_of
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-19 17:07:00 +02:00
Oded Gabbay 9a5634a729 amdkfd: use schedule() in sync_with_hw
amdkfd uses cpu_relax() in its sync_with_hw() function. Because cpu_relax() is
defined as 'REP; NOP' on x86_64, it will block the CPU from servicing
IOMMU PPR requests.

This may cause a deadlock, because sync_with_hw() won't be completed
until the PPR request has been served.

Therefore, we need to use schedule() instead of cpu_relax() as it is the
minimum requirement to allow other threads to execute.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-17 13:18:32 +02:00
Jay Cornwall f5d896bbd0 amdkfd: Fix memory leak on process deregistration
struct device_process_node was allocated during process registration but
not released at process deregistration.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 11:52:16 -06:00
Oded Gabbay 5cd78de526 amdkfd: add __iomem attribute to doorbell_ptr
This patch was done due to sparse warning. It changes the definition of
doorbell_ptr in queue_properties to be with __iomem attribute, so it would
match the type which the doorbell module functions are returning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 16:14:56 +02:00
Oded Gabbay d80d19bd50 amdkfd: fence_wait_timeout() can be static
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 15:54:05 +02:00
Oded Gabbay 20981e6801 amdkfd: is_occupied() can be static
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 15:50:53 +02:00
Oded Gabbay 585dbf3842 amdkfd: Fix sparse warnings in kfd_flat_memory.c
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 15:49:49 +02:00
kbuild test robot 7347a6cbf1 amdkfd: pqm_get_kernel_queue() can be static
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 17:16:23 +08:00
kbuild test robot 5ef360eab7 amdkfd: test_kq() can be static
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 16:08:14 +08:00
Oded Gabbay 16b9201c62 amdkfd: Fix sparse warnings in kfd_topology.c
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 15:41:33 +02:00
Oded Gabbay 4307d8f6e5 amdkfd: Fix sparse warnings in kfd_chardev.c
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-20 15:37:13 +02:00
Oded Gabbay ecd5c9821c amdkfd: Implement the Get Version IOCTL
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-02 12:18:29 +02:00
Ben Goz c2e1b3a496 drm/amdkfd: Fix logic of destroy_queue_nocpsch()
This patch rewrites destroy_queue_nocpsch() as the current logic that is
implemented in the function is completely flawed.

This function is used only in non-HWS mode.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-18 14:55:59 +03:00
Ben Goz 4b8f589b05 drm/amdkfd: Change MQD manager to be H/W specific
The MQDs for CI and VI are different. Therefore, the MQD manager module need to
be H/W specific.

This patch splits the current MQD manager into three files:

- kfd_mqd_manager.c, which contains common functions and initializes the
  specific mqd manager module according to the H/W

- kfd_mqd_manager_cik.c, which contains Kaveri specific functions. This is
  basically the old kfd_mqd_manager.c

- kfd_mqd_manager_vi.c, which will contain VI specific functions. Currently it
  is not implemented except for returning NULL on initialization.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-04 11:24:25 +02:00
Ben Goz 0da7558c69 drm/amdkfd: Add asic property to kfd_device_info
This patch adds a new property to kfd_device_info structure. That structure
holds information that is H/W specific.

The new property is called asic_family and its purpose is to distinguish
between different asic families in amdkfd operations, mainly in QCM (queue
control & management)

This patch also adds a new enum, to select different ASICs. We set the current
kfd_device_info instance as Kaveri and create a new instance which describes
the new AMD APU, codenamed 'Carrizo'.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-01 17:10:01 +02:00
Ben Goz 85d258f9a7 drm/amdkfd: Make KFD_MQD_TYPE enum types H/W agnostic
As the MQD types are common across all AMD GPUs/APUs, let's remove the CIK part
from the name.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-04 10:36:30 +02:00
Ben Goz ff3d04a171 drm/amdkfd: Add new VI-specific queue properties
This patch adds new fields to the queue_properties structure. The new fields
are relevant only for queues running on AMD GPU VI architecture.

The eop_ring_buffer_address and eop_ring_buffer_size describe an
end-of-pipe queue which is assigned to the MQD. In CI, the EOP queue was per
pipeline and in VI it is per queue.

The ctx_save_restore_area_address and ctx_save_restore_area_size describe a
memory area that is designated to allow the CP to do context save/restore in
mid-wave state.

This patch also modifies the set_queue_properties_from_user() (called from
kfd_ioctl_create_queue()) to check and copy those new parameters.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-04 10:37:18 +02:00
Oded Gabbay 71273adc52 drm/amdkfd: Don't include header files from radeon
Because amdkfd will need to work both with radeon and amdgpu, don't include
header files that are in radeon's folder.

Instead, use the common amd include folder and move amdkfd specific defines to
amdkfd header files.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-02 23:18:54 +02:00
Ben Goz bd7fbd38e5 drm/amd: Put cik structures in a common place
This patch creates a new file, cik_structs.h, and puts the cik_mqd and
cik_sdma_rlc_registers structures in that file.

The new file is placed in a common include folder under the drm/amd folder, so
it will be shared among all amd drm drivers.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-02 23:18:39 +02:00
Ben Goz fe50280420 drm/amdkfd: Remove call to deprecated init_memory interface
This patch removes a call to kfd-->kgd interface function that is doing H/W
initialization. That function is moved into radeon to be part of the common
H/W initialization sequence. The interface function will be deleted.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-26 18:07:34 +02:00
Ben Goz 08dcc57fcd drm/radeon: Initialize compute vmid
This patch moves to radeon the initialization of compute vmid.

That initializations was done in kfd-->kgd interface, but doing it in radeon
as part of radeon's H/W initialization routines is more appropriate.

In addition, this simplifies the kfd-->kgd interface.

The patch removes the function from the interface file and from the interface
declaration file.

The function initializes memory apertures to fixed base/limit address and non
cached memory types.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-02 23:43:19 +02:00
Oded Gabbay 6bbcde9803 drm/amd: Remove old radeon_sa funcs from kfd-->kgd interface
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:11 +02:00
Oded Gabbay a86aa3ca5a drm/amdkfd: Using new gtt sa in amdkfd
This patch change the calls throughout the amdkfd driver from the old kfd-->kgd
interface to the new kfd gtt sa inside amdkfd

v2: change the new call in sdma code that appeared because of the sdma feature

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:10 +02:00
Oded Gabbay 73a1da0bb3 drm/amdkfd: Allocate gart memory using new interface
This patch changes the calls to allocate the gart memory for amdkfd from the
old interface (radeon_sa) to the new one (kfd_gtt_sa)

The new gart sub-allocator is initialized with chunk size equal to 512 bytes.
This is because the KV MQD is 512 Bytes and most of the sub-allocations are
MQDs.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay e18e794e6b drm/amdkfd: Fixed calculation of gart buffer size
This patch makes the gart's buffer size calculation more accurate. This buffer
is needed per GPU.

It takes into account maximum number of MQDs, runlist packets, kernel queues
and reserves 512KB for other misc allocations.

The total size is just shy of 4MB, for 32 processes and 128 queues per
process, which are the defaults for amdkfd kernel module parameters.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay 6e81090b2e drm/amdkfd: Add kfd gtt sub-allocator functions
This patch adds new kfd gtt sub-allocator functions that service the amdkfd
driver when it wants to use gtt memory.

The sub-allocator uses a bitmap to handle the memory area that was transferred
to it during init. It divides the memory area into chunks, according to chunk
size parameter.

The allocation function will allocate contiguous chunks from that memory area,
according to the requested size. If the requested size is smaller than the
chunk size, a single chunk will be allocated.

v2: Do some more verifications on parameters that are passed into
kfd_gtt_sa_init()

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay 36b5c08f09 drm/amdkfd: Add gtt sa related data to kfd_dev struct
This patch adds new fields to kfd_dev struct that are necessary for the new kfd
gtt sa module

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay e27ade73fd drm/amd: Add new kfd-->kgd interface for gart usage
This patch adds two new functions to the kfd-->kgd interface:

init_gtt_mem_allocation, which allocate a large enough buffer on the amdkfd
needs, such as mqds, hpds, kernel queue, fence and runlists. This function
is only called once per GPU device. The size of the allocated buffer is
based on the maximum number of HSA processes and maximum number of queues
per HSA process (two amdkfd kernel module parameters).

free_gtt_mem, which frees a buffer that was allocated on the gart aperture.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:07 +02:00
Ben Goz 85dfaef341 drm/amdkfd: Pass queue type to pqm_create_queue()
This patch passes the correct queue type to pqm_create_queue() instead of a
fixed KFD_QUEUE_TYPE_COMPUTE type.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:06 +02:00
Ben Goz 3385f9dd64 drm/amdkfd: Identify SDMA queue in create queue ioctl
This patch adds a check to the create queue ioctl path, which identifies SDMA
queue type that is sent by userspace.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz bcea308175 drm/amdkfd: Add SDMA user-mode queues support to QCM
This patch adds support for SDMA user-mode queues to the QCM - the Queue
management system that manages queues-per-device and queues-per-process.

v2: Remove calls to interface function that initializes sdma engines.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz 77669eb87a drm/amdkfd: Add SDMA mqd support
This patch adds support for SDMA mqd operations:
- init_mqd_sdma
- uninit_mqd_sdma
- load_mqd_sdma
- update_mqd_sdma
- destroy_mqd_sdma
- is_occupied_sdma

It also adds SDMA queue information to some private structures of amdkfd.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:04 +02:00
Ben Goz 85ea7d07e1 drm/amd: Add SDMA functions to kfd-->kgd interface
This patch adds three new functions to the kfd2kgd interface:

- hqd_sdma_load() - Loads SDMA mqd to a H/W SDMA hqd slot. Used only in no HWS
                    mode.

- hqd_sdma_is_occupied() - Checks if an SDMA hqd slot is occupied. Used only
                           in no HWS mode.

- hqd_sdma_destroy() - Destructs and preempts the SDMA queue assigned to
                       that SDMA hqd slot. Used only in no HWS mode.

These functions are needed to support SDMA queues scheduling when using no HWS
mode (used for debug or bring-up).

v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:03 +02:00
Alexey Skidanov 093c7d8cfd drm/amdkfd: Process-device data creation and lookup split
This patch splits the current kfd_get_process_device_data() to two
functions, one that specifically creates a pdd and another one which
just do lookup.

This is done to enhance the readability and maintainability of the code.

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:58 +02:00
Alexey Skidanov f7c826ad38 drm/amdkfd: Add number of watch points to topology
This patch adds the number of watch points to the node capabilities in the
topology module

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:55 +02:00
Alexey Skidanov 775921edc1 amdkfd: Implement the Get Process Aperture IOCTL
v3: Fixed debug messages

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:49:36 +03:00
Evgeny Pinchuk 4fac47c820 amdkfd: Implement the Get Clock Counters IOCTL
Signed-off-by: Evgeny Pinchuk <evgeny.pinchuk@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:47:58 +03:00
Andrew Lewycky 41a286fa54 amdkfd: Implement the Set Memory Policy IOCTL
Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:46:17 +03:00
Oded Gabbay 39b027d957 amdkfd: Implement the create/destroy/update queue IOCTLs
v3: Removed the use of internal typedefs, fixed debug prints, added checks
    for parameters and moved to using doorbell address from user

v4: Extracted some of the code in the create queue ioctl to a different
    function that may be also called from other ioctls in the future.
    Also fixed the check of the ring size argument.

v5:

Add support for AQL queues creation to enable working with open-source HSA
runtime

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-10-19 23:46:40 +03:00
Andrew Lewycky b3f5e6b441 amdkfd: Add interrupt handling module
This patch adds the interrupt handling module, in kfd_interrupt.c, and its
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring per amdkfd
device. The internal interrupt ring contains interrupts that needs further
handling. The extra handling is deferred to a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware simply queues
a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.

v3:

Move amdkfd from drm/radeon/ to drm/amd/
Change device init
Made sure spin lock is taken only if init is complete
Moved bool field to the end of the structure

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:37:30 +03:00
Ben Goz 64c7f8cf79 amdkfd: Add device queue manager module
The queue scheduler divides into two sections, one section is process bounded
and the other section is device bounded.
The device bounded section is handled by this module.
The DQM module handles queue setup, update and tear-down from the device side.
It also supports suspend/resume operation.

v3: Changed device_init, added the use of the new gart allocation functions an
Added documentation.

v4:

Fixed a race in DQM queue scheduler where dqm->lock must be held when accessing
dqm->queue_count and dqm->processes_count. This fixes runlist IB allocation
failures when DQM is under load.

Fixed race in DQM queue destruction where queues being destroyed must be
removed from qpd->queues_list prior to preemption, or concurrent queue
creation activity may reschedule them while their MQD is destroyed.

Fixed EOP queue size setting in CP_HPD_EOP_CONTROL, because the size is
specified as (log2(size_dwords)-1). The previous calculation assumed the
size was specified in bytes, which caused interference between EOP queues
when multiple MEC pipelines were active.

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Remove unused unmap_queue function
Various fixes (Style, typos)

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:27:00 +03:00
Ben Goz 45102048f7 amdkfd: Add process queue manager module
The queue scheduler divides into two sections, one section is process bounded
and the other section is device bounded.
The process bounded section is handled by this module. The PQM handles usermode
queue setup, updates and tear-down.

v3:

Used kernel parameter to limit queues per process instead of define
Added use of doorbell address from user

v4:

Modified pqm_create_queue so that only when creating usermode queues the
driver should return the queue properties to the userspace.

Added an info message print when no more queues can be opened because of the
queue per process limitation

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:04:10 +03:00
Ben Goz 241f24f823 amdkfd: Add packet manager module
The packet manager module builds PM4 packets for the sole use of the CP
scheduler. Those packets are used by the HIQ to submit runlists to the CP.

v3:

Removed include of cik_mqds.h
Changed lower_32/upper_32 calls to use linux macros
Used new gart allocation functions
Added documentation

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Always chain runlist if you have more than 1 process or if you have
over-subscription over the number of queues.
Various fixes (typos, style)

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:55:28 +03:00
Ben Goz 31c21fece7 amdkfd: Add module parameter of scheduling policy
This patch adds a new parameter to the amdkfd driver. This parameter enables
the user to select the scheduling policy of the CP. The choices are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling

Note that the third option (Without CP scheduling) is only for debug purposes
and bringup of new H/W. As such, it is _not_ guaranteed to work at all times on
all H/W versions.

v3: Fixed description of parameter, changed the permissions to read_only, added
a verification of the value and added documentation

v5: Set default sched_policy to HWS as it is now supported by firmware

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:48:28 +03:00
Ben Goz ed6e6a3487 amdkfd: Add kernel queue module
The kernel queue module enables the amdkfd to establish kernel queues, not
exposed to user space.

The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug
Interface Queue) operations

v3: Removed use of internal typedefs and added use of the new gart allocation
functions

v4: Fixed a miscalculation in kernel queue wrapping

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Add define for kernel queue size
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:45:35 +03:00
Ben Goz 6e99df5741 amdkfd: Add mqd_manager module
The mqd_manager module handles MQD data structures.
MQD stands for Memory Queue Descriptor, which is used by the H/W to
keep the usermode queue state in memory.

v3:

Removed new typedefs
Removed pragma pack 4
Remove cik_mqds.h file
Changed lower_32/upper_32 calls to use linux macros
Used new gart allocation functions
Added documentation

v4:

Added missing initialization of the addr field in init_mqd()

Setting the hqd persistent.preload_req bit ON so that when queues switches
on/off, their context will kept and read from the mqd when the cp reassign
them, and thus the dispatched workload context kept consistent without any
interrupts.

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime.
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:36:17 +03:00
Ben Goz ed8aab4594 amdkfd: Add queue module
The queue module enables allocating and initializing queues uniformly.

v3: Removed typedef and redundant memset call. Broke long pr_debug print to one
liners and Added documentation.

v5: Move amdkfd from drm/radeon/ to drm/amd/

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:18:51 +03:00
Oded Gabbay b17f068a09 amdkfd: Add binding/unbinding calls to amd_iommu driver
This patch adds the functions to bind and unbind pasid
from a device through the amd_iommu driver.

The unbind function is called when the mm_struct of the
process is released.

The bind function is not called here because it is called
only in the IOCTLs which are not yet implemented at this
stage of the patchset.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:06:27 +03:00
Oded Gabbay 19f6d2a660 amdkfd: Add basic modules to amdkfd
This patch adds the process module and three helper modules:

- kfd_process, which handles process which open /dev/kfd

- kfd_doorbell, which provides helper functions for doorbell allocation,
  release and mapping to userspace

- kfd_pasid, which provides helper functions for pasid allocation and release

- kfd_aperture, which provides helper functions for managing the LDS, Local GPU
  memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain
the reference to the queue scheduler. This was done to allow easier code review.

Also, this patch doesn't contain the calls to the IOMMU driver for binding the
pasid to the device. Again, this was done to allow easier code review

The kfd_process object is created when a process opens /dev/kfd and is closed
when the mm_struct of that process is teared-down.

v3:

Removed kfd_vidmem.c file
Replaced direct mmput call to mmu_notifier release
Removed typedefs
Moved bool field to end of the structure
Added new kernel params for gart usage limitation
Added initialization of sa manager
Fixed debug messages
Remove support for LDS in 32 bit
Changed code to support mmap of doorbell pages from userspace
Added documentation for apertures

v4: Replaced RCU by SRCU for kfd_process list management

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Rename kfd_aperture.c to kfd_flat_memory.c
Protect against multiple init calls
MQD size is H/W dependent so moved it to device info structure
Rename kfd_mem_obj structure's members
Use delayed function for process tear-down

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 23:25:31 +03:00
Evgeny Pinchuk 5b5c4e40a3 amdkfd: Add topology module to amdkfd
This patch adds the topology module to the driver. The topology is exposed to
userspace through the sysfs.

The calls to add and remove a device to/from topology are done by the radeon
driver.

v3:

The CPU information, that is provided in the topology section of the amdkfd
driver, is extracted from the CRAT table. Unlike the CPU information located
in /sys/devices/system/cpu/cpu*, which is extracted from the SRAT table.

While the CPU information provided by the CRAT and the SRAT tables might be
identical, the node topology might be different. The SRAT table contains the
topology of CPU nodes only. The CRAT table contains the topology of CPU and GPU
nodes together (and can be interleaved). For example CPU node 1 in SRAT can be
CPU node 3 in CRAT. Furthermore it's worth to mention that the CRAT table
contains only HSA compatible nodes (nodes which are compliant with the HSA
spec).

To recap, amdkfd exposes a different kind of topology than the one exposed by
/sys/devices/system/cpu/cpu even though it may contain similar information.

v4:

The topology module doesn't support uevent handling and doesn't notify the
userspace about runtime modifications. It is up to the userspace to acquire
snapshots of the topology information created by the amdkfd and exposed
in sysfs.

The following is an example of how the topology looks on a Kaveri A10-7850K
system with amdkfd installed:

/sys/devices/virtual/kfd/kfd/
|
--- topology/
      |
      |--- generation_id
      |--- system_properties
      |--- nodes/
            |
            |--- 0/
                 |
                 |--- gpu_id
                 |--- name
                 |--- properties
                 |--- caches/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                 |--- io_links/
                      |
                 |--- mem_banks/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                      |--- 3/
                           |
                           |--- properties

v5:

Move amdkfd from drm/radeon/ to drm/amd/

Add a check if dev->gpu pointer is null before accessing it in the
node_show function in kfd_topology.c
This situation may occur when amdkfd is loaded and there is a GPU with a CRAT
table, but that GPU isn't supported by amdkfd

Signed-off-by: Evgeny Pinchuk <evgeny.pinchuk@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:22:32 +03:00
Oded Gabbay 4a488a7ad7 amdkfd: Add amdkfd skeleton driver
This patch adds the amdkfd skeleton driver. The driver does nothing except
define a /dev/kfd device.

It returns -ENODEV on all amdkfd IOCTLs.

v3: Move bool field to the end of structure, removed the pmc ioctls and added
a meaningful error message for ioctl error.

v5:

Create a new folder drm/amd and move amdkfd from drm/radeon/ to drm/amd/
Remove scheduler_class from kfd_priv.h as it was never used
Add skeleton implementation of the Get Version IOCTL

v6:
Update module version to the correct number and remove the "default m" from the
Kconfig file

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:08:55 +03:00
Oded Gabbay e28740ece3 drm/radeon: Add radeon <--> amdkfd interface
This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd need are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper,  while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells that are used in both of the drivers.
However, because they are located in separate pci bar pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells as well to radeon.

v3:

Add interface for sa manager init and fini. The init function will allocate a
buffer on system memory and pin it to the GART address space via the radeon sa
manager.

All mappings of buffers to GART address space are done via the radeon sa
manager. The interface of allocate memory will use the radeon sa manager to sub
allocate from the single buffer that was allocated during the init function.

Change lower_32/upper_32 calls to use linux macros

Add documentation for the interface

v4:

Change ptr field type in kgd_mem from uint32_t* to void* to match to type that
is returned by radeon_sa_bo_cpu_addr

v5:

Change format of mqd structure to work with latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime.
Move generic kfd-->kgd interface and other generic kgd definitions to a generic
header file that will be used by AMD's radeon and amdgpu drivers

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-15 13:53:32 +03:00