If the queues are full and we return -EAGAIN to the user, there is no need
to print an error, as that case isn't an error and the user is expected to
re-submit the work.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
This patch adds a support for a new H/W queue type.
This type of queue is for DMA and compute engines jobs, for which
completion notification are sent by H/W.
Command buffer for this queue can be created either through the CB
IOCTL and using the retrieved CB handle, or by preparing a buffer on the
host or device SRAM/DRAM, and using the device address to that buffer.
The patch includes the handling of the 2 options, as well as the
initialization of the H/W queue and its jobs scheduling.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Jobs on some queues must be provided with a handle to a driver command
buffer object, while for other queues, jobs must be provided with an
address to a command buffer.
Currently the distinction is done based on the queue type, which is less
flexible if the same queue type behaves differently on different
types of ASICs.
This patch adds a new queue property for this target, which is
configured per queue type per ASIC type.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
We want to stop using the acronym KMD. Therefore, replace all locations
(except for register names we can't modify) where KMD is written to other
terms such as "Linux kernel driver" or "Host kernel driver", etc.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Users and sysadmins usually want to know what is the device utilization as
a level 0 indication if they are efficiently using the device.
Add a new opcode to the INFO IOCTL that will return the device utilization
over the last period of 100-1000ms. The return value is 0-100,
representing as percentage the total utilization rate.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
When rejecting CS because of too many in-flight CS, print a debug message
about it as it useful to know when the user is debugging (it indicates a
back-pressure from the driver as the device is not fast enough to consume
the CS)
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
This patch fix a bug in the host memory polling macro. The bug is that the
memory being polled can be written by the device, which always writes it
in LE. However, if the host is running Linux in BE mode, we need to
convert the value that was written by the device before matching it to the
required value that the caller has given to the macro.
Signed-off-by: Ben Segal <bpsegal20@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch changes two polling functions to macros, in order to make their
API the same as the standard readl_poll_timeout so we would be able to
define the "condition for exit" when calling these macros.
This will simplify the code as it will eliminate the need to check both
for timeout and for the (cond) in the calling function.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
use_virt_addr member was used for telling whether to treat the
addresses in the CB as virtual during parsing. We disabled it only
when calling the parser from the driver memset device function,
and since this call had been removed, it should always be enabled.
Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch only does renaming of certain variables and structure members,
and their accompanied comments.
This is done to better reflect the actions these variables and members
represent.
There is no functional change in this patch.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
At the start of some IOCTLs we check if the device is disabled or in reset.
If it is, we return -EBUSY and print a message to kernel log.
Because these IOCTLs can be called at very high frequency, use ratelimit
to avoid spamming the kernel log. Also use the same type of message -
dev_warn - in all the relevant IOCTLs.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds accounting for active CS. Active means that the CS was
submitted to the H/W queues and was not completed yet.
This is necessary to support suspend operation. Because the device will be
reset upon suspend, we can only suspend after all active CS have been
completed. Hence, we need to perform accounting on their number.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch fix a bug in the driver, where if the TPC or MME remains in
non-IDLE even after all the command submissions are done (due to user bug
or malicious user), then future command submissions will fail in the
context-switch stage and the driver will remain in "stuck" mode.
The fix is to do a soft-reset of the device in case the context-switch
fails, because the device should be IDLE during context-switch. If it is
not IDLE, then something is wrong and we should reset the compute engines.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds debugfs support to the driver. It allows the user-space to
display information that is contained in the internal structures of the
driver, such as:
- active command submissions
- active user virtual memory mappings
- number of allocated command buffers
It also enables the user to perform reads and writes through Goya's PCI
bars.
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the main flow for the user to submit work to the device.
Each work is described by a command submission object (CS). The CS contains
3 arrays of command buffers: One for execution, and two for context-switch
(store and restore).
For each CB, the user specifies on which queue to put that CB. In case of
an internal queue, the entry doesn't contain a pointer to the CB but the
address in the on-chip memory that the CB resides at.
The driver parses some of the CBs to enforce security restrictions.
The user receives a sequence number that represents the CS object. The user
can then query the driver regarding the status of the CS, using that
sequence number.
In case the CS doesn't finish before the timeout expires, the driver will
perform a soft-reset of the device.
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>