Commit Graph

581 Commits

Author SHA1 Message Date
Oded Gabbay 6bbcde9803 drm/amd: Remove old radeon_sa funcs from kfd-->kgd interface
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:11 +02:00
Oded Gabbay a86aa3ca5a drm/amdkfd: Using new gtt sa in amdkfd
This patch changes the calls throughout the amdkfd driver from the old kfd-->kgd
interface to the new kfd gtt sa inside amdkfd.

v2: change the new call in sdma code that appeared because of the sdma feature

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:10 +02:00
Oded Gabbay 73a1da0bb3 drm/amdkfd: Allocate gart memory using new interface
This patch changes the calls to allocate the gart memory for amdkfd from the
old interface (radeon_sa) to the new one (kfd_gtt_sa)

The new gart sub-allocator is initialized with chunk size equal to 512 bytes.
This is because the KV MQD is 512 Bytes and most of the sub-allocations are
MQDs.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay e18e794e6b drm/amdkfd: Fixed calculation of gart buffer size
This patch makes the gart's buffer size calculation more accurate. This buffer
is needed per GPU.

It takes into account maximum number of MQDs, runlist packets, kernel queues
and reserves 512KB for other misc allocations.

The total size is just shy of 4MB, for 32 processes and 128 queues per
process, which are the defaults for amdkfd kernel module parameters.
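
The arithmetic behind that total can be sketched as follows. This is only an illustration, not the driver's code: the 512-byte MQD size and the 512KB reserve come from the messages in this log, while the function names are hypothetical and the runlist and kernel-queue sizes are left as inputs because the log does not state them.

```c
#include <stddef.h>

#define MQD_SIZE_KV  512u            /* KV MQD size, as stated in this log */
#define MISC_RESERVE (512u * 1024u)  /* 512KB reserved for misc allocations */

/* Bytes needed to hold one MQD for every possible queue. */
static size_t mqd_bytes(unsigned max_processes, unsigned queues_per_process)
{
    return (size_t)max_processes * queues_per_process * MQD_SIZE_KV;
}

/*
 * Total per-GPU buffer size. runlist_bytes and kernel_queue_bytes are
 * caller-supplied placeholders: their real values are not given here.
 */
static size_t gart_buffer_bytes(unsigned max_processes,
                                unsigned queues_per_process,
                                size_t runlist_bytes,
                                size_t kernel_queue_bytes)
{
    return mqd_bytes(max_processes, queues_per_process) +
           runlist_bytes + kernel_queue_bytes + MISC_RESERVE;
}
```

With the defaults of 32 processes and 128 queues per process, the MQDs alone take 32 * 128 * 512 bytes = 2MB, and the misc reserve brings that to 2.5MB. If "just shy of 4MB" is read in binary units, the runlist packets and kernel queues would then account for somewhat less than 1.5MB.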

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:09 +02:00
Oded Gabbay 6e81090b2e drm/amdkfd: Add kfd gtt sub-allocator functions
This patch adds new kfd gtt sub-allocator functions that service the amdkfd
driver when it wants to use gtt memory.

The sub-allocator uses a bitmap to handle the memory area that was transferred
to it during init. It divides the memory area into chunks, according to chunk
size parameter.

The allocation function will allocate contiguous chunks from that memory area,
according to the requested size. If the requested size is smaller than the
chunk size, a single chunk will be allocated.
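
A minimal user-space sketch of the scheme described above, assuming a first-fit search and a bitmap that fits in one 64-bit word. The names (gtt_sa, gtt_sa_init, gtt_sa_alloc, gtt_sa_free) are hypothetical and are not the kfd_gtt_sa functions themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SA_MAX_CHUNKS 64u /* hypothetical limit: one 64-bit bitmap word */

struct gtt_sa {
    size_t chunk_size;   /* bytes per chunk, e.g. 512 */
    unsigned num_chunks; /* number of chunks in the managed area */
    uint64_t bitmap;     /* bit i set means chunk i is in use */
};

/* Validates the parameters and marks every chunk as free. */
static bool gtt_sa_init(struct gtt_sa *sa, size_t buf_size, size_t chunk_size)
{
    if (!sa || chunk_size == 0 || buf_size < chunk_size)
        return false;
    if (buf_size / chunk_size > SA_MAX_CHUNKS)
        return false;
    sa->chunk_size = chunk_size;
    sa->num_chunks = (unsigned)(buf_size / chunk_size);
    sa->bitmap = 0;
    return true;
}

/*
 * First-fit search for a run of contiguous free chunks.
 * Returns the byte offset of the allocation, or -1 on failure.
 */
static long gtt_sa_alloc(struct gtt_sa *sa, size_t size)
{
    if (size == 0)
        return -1;
    /* Round up: a request smaller than one chunk still takes one chunk. */
    size_t need = (size + sa->chunk_size - 1) / sa->chunk_size;
    if (need > sa->num_chunks)
        return -1;

    for (unsigned start = 0; start + need <= sa->num_chunks; start++) {
        unsigned run = 0;
        while (run < need && !(sa->bitmap & (UINT64_C(1) << (start + run))))
            run++;
        if (run == need) {
            for (unsigned i = 0; i < need; i++)
                sa->bitmap |= UINT64_C(1) << (start + i);
            return (long)(start * sa->chunk_size);
        }
        start += run; /* resume just past the used chunk that ended the run */
    }
    return -1;
}

/* Clears the bits of the chunks covered by a previous allocation. */
static void gtt_sa_free(struct gtt_sa *sa, long offset, size_t size)
{
    size_t first = (size_t)offset / sa->chunk_size;
    size_t count = (size + sa->chunk_size - 1) / sa->chunk_size;
    for (size_t i = 0; i < count; i++)
        sa->bitmap &= ~(UINT64_C(1) << (first + i));
}
```

For example, with a 4096-byte area and 512-byte chunks, a 100-byte request occupies one chunk at offset 0 and a following 1024-byte request occupies two chunks starting at offset 512.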

v2: Do some more verifications on parameters that are passed into
kfd_gtt_sa_init()

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay 36b5c08f09 drm/amdkfd: Add gtt sa related data to kfd_dev struct
This patch adds new fields to kfd_dev struct that are necessary for the new kfd
gtt sa module

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:08 +02:00
Oded Gabbay e27ade73fd drm/amd: Add new kfd-->kgd interface for gart usage
This patch adds two new functions to the kfd-->kgd interface:

init_gtt_mem_allocation, which allocates a buffer large enough for amdkfd's
needs, such as MQDs, HPDs, kernel queues, fences and runlists. This function
is only called once per GPU device. The size of the allocated buffer is
based on the maximum number of HSA processes and maximum number of queues
per HSA process (two amdkfd kernel module parameters).

free_gtt_mem, which frees a buffer that was allocated on the gart aperture.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:07 +02:00
Ben Goz 85dfaef341 drm/amdkfd: Pass queue type to pqm_create_queue()
This patch passes the correct queue type to pqm_create_queue() instead of a
fixed KFD_QUEUE_TYPE_COMPUTE type.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:06 +02:00
Ben Goz 3385f9dd64 drm/amdkfd: Identify SDMA queue in create queue ioctl
This patch adds a check to the create queue ioctl path, which identifies SDMA
queue type that is sent by userspace.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz bcea308175 drm/amdkfd: Add SDMA user-mode queues support to QCM
This patch adds support for SDMA user-mode queues to the QCM - the Queue
management system that manages queues-per-device and queues-per-process.

v2: Remove calls to interface function that initializes sdma engines.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:05 +02:00
Ben Goz 77669eb87a drm/amdkfd: Add SDMA mqd support
This patch adds support for SDMA mqd operations:
- init_mqd_sdma
- uninit_mqd_sdma
- load_mqd_sdma
- update_mqd_sdma
- destroy_mqd_sdma
- is_occupied_sdma

It also adds SDMA queue information to some private structures of amdkfd.

v3: Use the new names of some of the defines.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:04 +02:00
Ben Goz 85ea7d07e1 drm/amd: Add SDMA functions to kfd-->kgd interface
This patch adds three new functions to the kfd2kgd interface:

- hqd_sdma_load() - Loads SDMA mqd to a H/W SDMA hqd slot. Used only in no HWS
                    mode.

- hqd_sdma_is_occupied() - Checks if an SDMA hqd slot is occupied. Used only
                           in no HWS mode.

- hqd_sdma_destroy() - Destructs and preempts the SDMA queue assigned to
                       that SDMA hqd slot. Used only in no HWS mode.

These functions are needed to support SDMA queues scheduling when using no HWS
mode (used for debug or bring-up).

v2: Removed init_sdma_engines() from interface. Initialization is done in
radeon.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-01-09 22:26:03 +02:00
Alexey Skidanov 093c7d8cfd drm/amdkfd: Process-device data creation and lookup split
This patch splits the current kfd_get_process_device_data() into two
functions: one that specifically creates a pdd and another that
only does the lookup.

This is done to enhance the readability and maintainability of the code.

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:58 +02:00
Alexey Skidanov f7c826ad38 drm/amdkfd: Add number of watch points to topology
This patch adds the number of watch points to the node capabilities in the
topology module

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-01-09 22:25:55 +02:00
Alexey Skidanov 775921edc1 amdkfd: Implement the Get Process Aperture IOCTL
v3: Fixed debug messages

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:49:36 +03:00
Evgeny Pinchuk 4fac47c820 amdkfd: Implement the Get Clock Counters IOCTL
Signed-off-by: Evgeny Pinchuk <evgeny.pinchuk@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:47:58 +03:00
Andrew Lewycky 41a286fa54 amdkfd: Implement the Set Memory Policy IOCTL
Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:46:17 +03:00
Oded Gabbay 39b027d957 amdkfd: Implement the create/destroy/update queue IOCTLs
v3: Removed the use of internal typedefs, fixed debug prints, added checks
    for parameters and moved to using doorbell address from user

v4: Extracted some of the code in the create queue ioctl to a different
    function that may be also called from other ioctls in the future.
    Also fixed the check of the ring size argument.

v5:

Add support for AQL queues creation to enable working with open-source HSA
runtime

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-10-19 23:46:40 +03:00
Andrew Lewycky b3f5e6b441 amdkfd: Add interrupt handling module
This patch adds the interrupt handling module, in kfd_interrupt.c, and its
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring per amdkfd
device. The internal interrupt ring contains interrupts that need further
handling. The extra handling is deferred to a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware simply queues
a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.
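
The behaviour described above can be illustrated with a small fixed-size ring that drops entries when it is full. This is a single-threaded sketch with hypothetical names and a hypothetical ring size; locking and the workqueue hand-off are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define IH_RING_ENTRIES 8u /* hypothetical ring size */

struct ih_ring {
    uint32_t entries[IH_RING_ENTRIES];
    unsigned rptr;    /* next slot to read */
    unsigned wptr;    /* next slot to write */
    unsigned dropped; /* interrupts lost because the ring was full */
};

/*
 * Producer side (interrupt context in a real driver). One slot is kept
 * empty so that rptr == wptr always means "ring is empty".
 */
static bool ih_ring_enqueue(struct ih_ring *r, uint32_t entry)
{
    unsigned next = (r->wptr + 1) % IH_RING_ENTRIES;
    if (next == r->rptr) {
        /* No back-pressure is possible, so the interrupt is lost. */
        r->dropped++;
        return false;
    }
    r->entries[r->wptr] = entry;
    r->wptr = next;
    return true;
}

/* Consumer side (the deferred work). Returns false when the ring is empty. */
static bool ih_ring_dequeue(struct ih_ring *r, uint32_t *entry)
{
    if (r->rptr == r->wptr)
        return false;
    *entry = r->entries[r->rptr];
    r->rptr = (r->rptr + 1) % IH_RING_ENTRIES;
    return true;
}
```

If ten interrupts arrive before the deferred work runs, this ring keeps the first seven and counts the other three as dropped.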

v3:

Move amdkfd from drm/radeon/ to drm/amd/
Change device init
Made sure spin lock is taken only if init is complete
Moved bool field to the end of the structure

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:37:30 +03:00
Ben Goz 64c7f8cf79 amdkfd: Add device queue manager module
The queue scheduler is divided into two sections: one is process-bound
and the other is device-bound.
The device-bound section is handled by this module.
The DQM module handles queue setup, update and tear-down from the device side.
It also supports suspend/resume operation.

v3: Changed device_init, added the use of the new gart allocation functions and
added documentation.

v4:

Fixed a race in DQM queue scheduler where dqm->lock must be held when accessing
dqm->queue_count and dqm->processes_count. This fixes runlist IB allocation
failures when DQM is under load.

Fixed race in DQM queue destruction where queues being destroyed must be
removed from qpd->queues_list prior to preemption, or concurrent queue
creation activity may reschedule them while their MQD is destroyed.

Fixed EOP queue size setting in CP_HPD_EOP_CONTROL, because the size is
specified as (log2(size_dwords)-1). The previous calculation assumed the
size was specified in bytes, which caused interference between EOP queues
when multiple MEC pipelines were active.
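
A worked sketch of the encoding described above, assuming the EOP buffer size is a power of two of at least 8 bytes. The helper names and the 2048-byte example size are hypothetical; only the formula log2(size_dwords)-1 comes from the message.

```c
#include <stdint.h>

/* Integer log2 for a non-zero value (position of the highest set bit). */
static unsigned ilog2_u32(uint32_t v)
{
    unsigned n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Size field as described above: log2(size in dwords) - 1. */
static unsigned eop_size_field(uint32_t size_bytes)
{
    uint32_t size_dwords = size_bytes / 4;
    return ilog2_u32(size_dwords) - 1;
}

/* Inverse mapping: the buffer size in bytes that a field value describes. */
static uint32_t eop_field_to_bytes(unsigned field)
{
    return (UINT32_C(1) << (field + 1)) * 4;
}
```

For a 2048-byte buffer (512 dwords) the field is 8. Feeding the byte count into the same formula gives 10, which describes an 8192-byte buffer, four times larger than what was allocated. That kind of mismatch is consistent with the overlap between EOP queues that the fix mentions.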

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Remove unused unmap_queue function
Various fixes (Style, typos)

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:27:00 +03:00
Ben Goz 45102048f7 amdkfd: Add process queue manager module
The queue scheduler is divided into two sections: one is process-bound
and the other is device-bound.
The process-bound section is handled by this module. The PQM handles usermode
queue setup, updates and tear-down.

v3:

Used kernel parameter to limit queues per process instead of define
Added use of doorbell address from user

v4:

Modified pqm_create_queue so that the driver returns the queue properties to
userspace only when creating usermode queues.

Added an info message print when no more queues can be opened because of the
queue per process limitation

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 01:04:10 +03:00
Ben Goz 241f24f823 amdkfd: Add packet manager module
The packet manager module builds PM4 packets for the sole use of the CP
scheduler. Those packets are used by the HIQ to submit runlists to the CP.

v3:

Removed include of cik_mqds.h
Changed lower_32/upper_32 calls to use linux macros
Used new gart allocation functions
Added documentation

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Always chain runlist if you have more than 1 process or if you have
over-subscription over the number of queues.
Various fixes (typos, style)

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:55:28 +03:00
Ben Goz 31c21fece7 amdkfd: Add module parameter of scheduling policy
This patch adds a new parameter to the amdkfd driver. This parameter enables
the user to select the scheduling policy of the CP. The choices are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling

Note that the third option (Without CP scheduling) is only for debug purposes
and bringup of new H/W. As such, it is _not_ guaranteed to work at all times on
all H/W versions.
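
The three choices and the value check mentioned in v3 could look roughly like this. The enum names and numeric values are assumptions made for illustration, and the validation helper is hypothetical rather than the driver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative names and values for the three policies listed above. */
enum sched_policy {
    SCHED_POLICY_HWS = 0,                 /* CP scheduling, over-subscription allowed */
    SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION, /* CP scheduling, no over-subscription */
    SCHED_POLICY_NO_HWS                   /* no CP scheduling (debug/bring-up only) */
};

/* Rejects any value that does not name one of the three policies. */
static bool sched_policy_is_valid(int value)
{
    return value >= SCHED_POLICY_HWS && value <= SCHED_POLICY_NO_HWS;
}

/* Human-readable description, or NULL for an invalid value. */
static const char *sched_policy_name(int value)
{
    switch (value) {
    case SCHED_POLICY_HWS:
        return "CP scheduling with over-subscription";
    case SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION:
        return "CP scheduling without over-subscription";
    case SCHED_POLICY_NO_HWS:
        return "without CP scheduling";
    default:
        return NULL;
    }
}
```

A module init path would typically run a check like sched_policy_is_valid() once on the parameter and refuse to load, or log an error, when it fails.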

v3: Fixed description of parameter, changed the permissions to read_only, added
a verification of the value and added documentation

v5: Set default sched_policy to HWS as it is now supported by firmware

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:48:28 +03:00
Ben Goz ed6e6a3487 amdkfd: Add kernel queue module
The kernel queue module enables the amdkfd to establish kernel queues, not
exposed to user space.

The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug
Interface Queue) operations

v3: Removed use of internal typedefs and added use of the new gart allocation
functions

v4: Fixed a miscalculation in kernel queue wrapping

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime
Add define for kernel queue size
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:45:35 +03:00
Ben Goz 6e99df5741 amdkfd: Add mqd_manager module
The mqd_manager module handles MQD data structures.
MQD stands for Memory Queue Descriptor, which is used by the H/W to
keep the usermode queue state in memory.

v3:

Removed new typedefs
Removed pragma pack 4
Remove cik_mqds.h file
Changed lower_32/upper_32 calls to use linux macros
Used new gart allocation functions
Added documentation

v4:

Added missing initialization of the addr field in init_mqd()

Set the hqd persistent.preload_req bit ON so that when queues are switched
on/off, their context is kept and read back from the mqd when the CP reassigns
them. The dispatched workload context is thus kept consistent without any
interrupts.

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Change format of mqd structure to match latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime.
Various fixes

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:36:17 +03:00
Ben Goz ed8aab4594 amdkfd: Add queue module
The queue module enables allocating and initializing queues uniformly.

v3: Removed typedef and redundant memset call. Broke long pr_debug print into
one-liners and added documentation.

v5: Move amdkfd from drm/radeon/ to drm/amd/

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:18:51 +03:00
Oded Gabbay b17f068a09 amdkfd: Add binding/unbinding calls to amd_iommu driver
This patch adds the functions to bind and unbind pasid
from a device through the amd_iommu driver.

The unbind function is called when the mm_struct of the
process is released.

The bind function is not called here because it is called
only in the IOCTLs which are not yet implemented at this
stage of the patchset.

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17 00:06:27 +03:00
Oded Gabbay 19f6d2a660 amdkfd: Add basic modules to amdkfd
This patch adds the process module and three helper modules:

- kfd_process, which handles processes that open /dev/kfd

- kfd_doorbell, which provides helper functions for doorbell allocation,
  release and mapping to userspace

- kfd_pasid, which provides helper functions for pasid allocation and release

- kfd_aperture, which provides helper functions for managing the LDS, Local GPU
  memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain
the reference to the queue scheduler. This was done to allow easier code review.

Also, this patch doesn't contain the calls to the IOMMU driver for binding the
pasid to the device. Again, this was done to allow easier code review.

The kfd_process object is created when a process opens /dev/kfd and is destroyed
when the mm_struct of that process is torn down.

v3:

Removed kfd_vidmem.c file
Replaced direct mmput call to mmu_notifier release
Removed typedefs
Moved bool field to end of the structure
Added new kernel params for gart usage limitation
Added initialization of sa manager
Fixed debug messages
Remove support for LDS in 32 bit
Changed code to support mmap of doorbell pages from userspace
Added documentation for apertures

v4: Replaced RCU by SRCU for kfd_process list management

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Rename kfd_aperture.c to kfd_flat_memory.c
Protect against multiple init calls
MQD size is H/W dependent so moved it to device info structure
Rename kfd_mem_obj structure's members
Use delayed function for process tear-down

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 23:25:31 +03:00
Evgeny Pinchuk 5b5c4e40a3 amdkfd: Add topology module to amdkfd
This patch adds the topology module to the driver. The topology is exposed to
userspace through the sysfs.

The calls to add and remove a device to/from topology are done by the radeon
driver.

v3:

The CPU information that is provided in the topology section of the amdkfd
driver is extracted from the CRAT table, unlike the CPU information located
in /sys/devices/system/cpu/cpu*, which is extracted from the SRAT table.

While the CPU information provided by the CRAT and the SRAT tables might be
identical, the node topology might be different. The SRAT table contains the
topology of CPU nodes only. The CRAT table contains the topology of CPU and GPU
nodes together (and can be interleaved). For example CPU node 1 in SRAT can be
CPU node 3 in CRAT. It is also worth mentioning that the CRAT table
contains only HSA compatible nodes (nodes which are compliant with the HSA
spec).

To recap, amdkfd exposes a different kind of topology than the one exposed by
/sys/devices/system/cpu/cpu*, even though it may contain similar information.

v4:

The topology module doesn't support uevent handling and doesn't notify the
userspace about runtime modifications. It is up to the userspace to acquire
snapshots of the topology information created by the amdkfd and exposed
in sysfs.

The following is an example of how the topology looks on a Kaveri A10-7850K
system with amdkfd installed:

/sys/devices/virtual/kfd/kfd/
|
--- topology/
      |
      |--- generation_id
      |--- system_properties
      |--- nodes/
            |
            |--- 0/
                 |
                 |--- gpu_id
                 |--- name
                 |--- properties
                 |--- caches/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                 |--- io_links/
                      |
                 |--- mem_banks/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                      |--- 3/
                           |
                           |--- properties

v5:

Move amdkfd from drm/radeon/ to drm/amd/

Add a check for whether the dev->gpu pointer is null before accessing it in the
node_show function in kfd_topology.c.
This situation may occur when amdkfd is loaded and there is a GPU with a CRAT
table, but that GPU isn't supported by amdkfd.

Signed-off-by: Evgeny Pinchuk <evgeny.pinchuk@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:22:32 +03:00
Oded Gabbay 4a488a7ad7 amdkfd: Add amdkfd skeleton driver
This patch adds the amdkfd skeleton driver. The driver does nothing except
define a /dev/kfd device.

It returns -ENODEV on all amdkfd IOCTLs.

v3: Move bool field to the end of structure, removed the pmc ioctls and added
a meaningful error message for ioctl error.

v5:

Create a new folder drm/amd and move amdkfd from drm/radeon/ to drm/amd/
Remove scheduler_class from kfd_priv.h as it was never used
Add skeleton implementation of the Get Version IOCTL

v6:
Update module version to the correct number and remove the "default m" from the
Kconfig file

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:08:55 +03:00
Oded Gabbay e28740ece3 drm/radeon: Add radeon <--> amdkfd interface
This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd needs are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper, while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells that are used in both of the drivers.
However, because they are located in separate pci bar pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells as well to radeon.

v3:

Add interface for sa manager init and fini. The init function will allocate a
buffer on system memory and pin it to the GART address space via the radeon sa
manager.

All mappings of buffers to GART address space are done via the radeon sa
manager. The interface of allocate memory will use the radeon sa manager to sub
allocate from the single buffer that was allocated during the init function.

Change lower_32/upper_32 calls to use linux macros

Add documentation for the interface

v4:

Change ptr field type in kgd_mem from uint32_t* to void* to match the type that
is returned by radeon_sa_bo_cpu_addr

v5:

Change format of mqd structure to work with latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime.
Move generic kfd-->kgd interface and other generic kgd definitions to a generic
header file that will be used by AMD's radeon and amdgpu drivers

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-15 13:53:32 +03:00