Commit Graph

593 Commits

Author SHA1 Message Date
Ofir Bitton 5a2998f46c habanalabs/gaudi: fetch HBM ecc info from FW
Once FW security is enabled there is no access to HBM ecc registers,
need to read values from FW using a dedicated interface.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Ofir Bitton d611b9f0b1 habanalabs: fetch hard reset capability from FW
Driver must fetch FW hard reset capability during boot time,
in order to skip the hard reset flow if necessary.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Oded Gabbay 7f070c913c habanalabs: move asic property to correct structure
Whether an ASIC has MMU towards its DRAM is an ASIC property, so
move it to the asic fixed properties structure.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Ofir Bitton be91b91fa4 habanalabs: use host va range for internal pools
Instead of using a dedicated va range for each internal pool,
we introduce a new way for reserving a va block from an existing
va range. This is a more generic way of reserving va blocks for
future use.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Ofir Bitton adb51298fd habanalabs: improve hard reset procedure
We want to handle the scenario in which the driver was not able
to kill all user processes due to many memory mappings.
We need to retry again after some period while releasing the cores.
The devices will be unusable and "in-reset" status during that time.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Tomer Tayar 804a72276c habanalabs: Rename hw_queues_mirror to cs_mirror
Future command submission types might be submitted to HW not via the
QMAN queues path. However, it would be still required to have the TDR
mechanism for these CS, and thus the patch renames the TDR fields and
replaces the hw_queues_ prefix with cs_.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Ofir Bitton 784b916dad habanalabs: refactor mmu va_range db structure
Use an array of va_ranges instead of keeping each va_range separately,
we do this for better readability and in order to support access to
a specific range in a much elegant manner.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:34 +02:00
Ofir Bitton d1ddd90551 habanalabs: move HW dirty check to a proper location
Driver must verify if HW is dirty before trying to fetch preboot
information. Hence, we move this validation to a prior stage of
the boot sequence.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Oded Gabbay 28e052c952 habanalabs: restore vm_pgoff after mmap
Due to using dma_mmap_coherent() to perform mmap of dma memory, we
had to clear the vm_pgoff field before calling that function.

However, that broke the userspace (profiler tool) as they relied
on searching the /proc/self/maps for these values to correctly
"disassemble" the topology recipe.

To re-enable that functionality, the driver can simply restore the
value of vm_pgoff before returning to userspace but after calling
dma_mmap_coherent().

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Ofir Bitton 66a76401c5 habanalabs: add 'needs reset' state in driver
The new state indicates that device should be reset in order
to re-gain funcionality.
This unique state can occur if reset_on_lockup is disabled
and an actual lockup has occurred.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Omer Shpigelman f2d032ee13 habanalabs: fix hard reset print and comment
One of the first steps of a hard reset flow is to close all open user
contexts. This user process teradown might take some time due to long
cleanup in our driver or some other reason even before our cleanup flow.
Hence fix the relevant print and comment to be more accurate.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Igor Grinberg b726a2f7c0 habanalabs/gaudi: remove pcie_en strap toggle
Since the very large grace period is over and this functionality
prevents us to implement the new reset sequence and apply security
settings, we need to remove the code toggling the PCIE_EN bit in the
straps register.
Remove it for good.

Signed-off-by: Igor Grinberg <igrinberg@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Oded Gabbay 66bfcccdb8 habanalabs: remove duplicate print
We print twice the firmware status regarding security, once in
common code and once in asic code. Remove the print in asic code
and leave the common code print.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Tomer Tayar 649c459212 habanalabs: Separate CS job completion from its deallocation
Current CS jobs are no longer needed after their completion.
However, jobs of future workload might be in use even after they are
completed. To allow that, the patch adds a refcount to the job object,
and decouples its completion handling from its deallocation.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:33 +02:00
Oded Gabbay 0da5698bf4 habanalabs/gaudi: increase MAX CS to 16K
We need to have the MAX CS be much larger than the size of the
different queues. In GAUDI we have around 8 groups of queues, and each
group has 1K queue size. To prevent head-of-the-line blocking, we need
to make sure there is sufficient number of available CS allocations
even if one or more of those queues are full.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
farah kassabri eb10b897e4 habanalabs: reset device upon fw read failure
failure in reading pre-boot verion is not handled correctly,
upon failure we need to reset the device in order to be able
to reinstall the driver.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Tomer Tayar ba7e389c30 habanalabs: Move repeatedly included headers to habanalabs.h
Several header files are repeatedly included in many files.
Move these files to habanalabs.h which is included by all.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Ofir Bitton c1d505a922 habanalabs: release signal if collective wait was dropped
As in standard wait cs, we must release a signal fence once
a collective wait cs was dropped and not submitted.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Tomer Tayar 4ba1b227b6 habanalabs: Skip updating CI of internal queues if not in use
There are no internal queues if H/W queues are being used.
In this case we can skip the redundant traversal over the queues array,
looking for internal queues.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Tomer Tayar ea6ee260cb habanalabs: Small refactoring of cs_do_release()
Slightly refactor the cs_do_release() function, to reduce nesting level
and to ease the handling of future CS types.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Tomer Tayar 6de3d769fd habanalabs: Small refactoring of CS IOCTL handling
Refactor the CS IOCTL handling by gathering common code into
sub-functions, in order to ease future additions of new CS types.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:32 +02:00
Ofir Bitton 1cbca899fa habanalabs/gaudi: fetch PLL info from FW
Once FW security is enabled there is no access to PLL registers,
need to read values from FW using a dedicated interface.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Moti Haimovski ccf979ee33 habanalabs: refactor MMU to support dual residency MMU
This commit refactors the MMU code to support PCI MMU page tables
residing on host and DCORE MMU residing on the device DRAM at the
same time.

This is needed for future devices as on GAUDI and GOYA we have
a single MMU where its page tables always reside on DRAM.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Moti Haimovski a6722d6a97 habanalabs: fix MMU print message
This commit fixes an incorrect error message

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
farah kassabri 03df136bc5 habanalabs/gaudi: scrub all memory upon closing FD
In cases of multi-tenants, administrators may want to prevent data
leakage between users running on the same device one after another.

To do that the driver can scrub the internal memory (both SRAM and
DRAM) after a user finish to use the memory.

Because in GAUDI the driver allows only one application to use the
device at a time, it can scrub the memory when user app close FD.

In future devices where we have MMU on the DRAM, we can scrub the DRAM
memory with a finer granularity (page granularity) when the user
allocates the memory.

This feature is not supported in Goya.

To allow users that want to debug their applications, we add a kernel
module parameter to load the driver with this feature disabled.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Ofir Bitton c692dec703 habanalabs/gaudi: add support for FW security
Skip relevant HW configurations once FW security is enabled
because these configurations are being performed by FW.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Ofir Bitton 323b726706 habanalabs: fetch security indication from FW
Add support for fetching security indication from FW.
This indication is needed in order to skip unnecessary
initializations done by FW.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
farah kassabri e753643d51 habanalabs: fix cs counters structure
Fix cs counters structure in uapi to be one flat structure instead
of two instances of the same other structure.
use atomic read/increment for context counters so we could use
one structure for both aggregated and context counters.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Ofir Bitton 9bb86b63d8 habanalabs: advanced FW loading
Today driver is able to load a whole FW binary into a specific
location on ASIC. We add support for loading sections from the
same FW binary into different loactions.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:31 +02:00
Oded Gabbay 977d53a614 habanalabs: initialize variable before use
GCC 7.3.1 20180303 (Red Hat 7.3.1-5) complains that collective_engine_id
might be used uninitialized.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Ofir Bitton 71a984f9ae habanalabs/gaudi: remove unreachable code
Remove unreachable code in gaudi collective flow.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Oded Gabbay e716ad3c76 habanalabs: make sure cs type is valid in cs_ioctl_signal_wait
Although we get a valid cs type from the callee, in case new values
will be added in the future, it is best to check the expected values
in that function.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Oded Gabbay 3e62299657 habanalabs/gaudi: monitor device memory usage
In GAUDI we don't have an MMU towards the HBM device memory. Therefore,
the user access that memory directly through physical address (via the
different engines) without the need to go through the driver to
allocate/free memory on the HBM.

For system monitoring purposes, the driver will keep track of the HBM
usage. This can be done as long as the user accurately reports the
allocations and releases of HBM memory, through the existing MEMORY
IOCTL uapi.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Ofir Bitton 5de406c0b5 habanalabs: sync stream collective support
Implement sync stream collective for GAUDI. Need to allocate additional
resources for that and add ctx_fini() to clean up those resources.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Ofir Bitton 0940cabafd habanalabs/gaudi: Set DMA5 QMAN internal
DMA5 QMAN is designated to be used for reduction process, hence it will
be no longer configured as external queue.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Ofir Bitton 5fe1c17ddf habanalabs: sync stream collective infrastructure
Define new API for collective wait support and modify sync stream
common flow. In addition add kernel CB allocation support for
internal queues.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:30 +02:00
Tal Cohen 4bb1f2f3fb habanalabs: use enum for CB allocation options
In the future there will be situations where queues can accept either
kernel allocated CBs or user allocated CBs, depending on different
states.

Therefore, instead of using a boolean variable of kernel/user allocated
CB, we need to use a bitmask to indicate that, which will allow to
combine the two options.

Add a flag to the uapi so the user will be able to indicate whether
the CB was allocated by kernel or by user. Of course the driver
validates that.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:29 +02:00
Oded Gabbay 3c68157fb8 habanalabs/gaudi: add support for NIC QMANs
Initialize the QMANs that are responsible to submit doorbells to the NIC
engines. Add support for stopping and disabling them, and reset them as
part of the hard-reset procedure of GAUDI. This will allow the user to
submit work to the NICs.

Add support for receiving events on QMAN errors from the firmware.

However, the nic_ports_mask is still initialized to 0. That means this code
won't initialize the QMANs just yet. That will be in a later patch.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:29 +02:00
Oded Gabbay 11dcb8c712 habanalabs/gaudi: add NIC security configuration
Configure the security properties of the NIC IP. This is to prevent the
user process from doing something with the NIC that he shouldn't do. e.g.
crash the server, steal data, etc.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:29 +02:00
Oded Gabbay b3a9c0bd2f habanalabs/gaudi: add NIC firmware-related definitions
Add new structures and messages that the driver use to interact with the
firmware to receive information and events (errors) about GAUDI's NIC.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:29 +02:00
Oded Gabbay 16ac365045 habanalabs/gaudi: add NIC QMAN H/W and registers definitions
Add auto-generated header files that describe the NIC QMANs registers
used by the driver.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Oded Gabbay becce5f994 habanalabs: remove duplicate check
We already check if queue index is smaller than max queues a few lines
above this check so no need to check this again.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Ofir Bitton 06f791f74f habanalabs: sync stream refactor functions
Refactor sync stream implementation by reducing function length
for better readability.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Ofir Bitton 2992c1dcd3 habanalabs: add support for multiple SOBs per monitor
Support advanced monitor functionality to monitor more than a
single SOB. In addition expand all CB generation functions
with buffer offset in order to put in them multiple packets that are
generated by different functions.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Ofir Bitton 3cf74b3656 habanalabs: sync stream structures refactor
Refactor sync stream implementation by adding more structures for
better readability. In addition reducing allocated resources.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Oded Gabbay f3a965c250 habanalabs: don't init vm module if no MMU
In case we are running without MMU enabled (debug mode), no need to
initialize the VM module in the driver.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Oded Gabbay 8f50314674 habanalabs: minimize prints when everything is fine
No need to print when the driver starts to initialize the H/W. Drivers
should be silent when everything is OK.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:28 +02:00
Oded Gabbay 596553dbf9 habanalabs: support multiple types of firmwares
The driver now loads the firmware in two stages. For debugging purposes
we need to support situations where only the first stage firmware is
loaded.

Therefore, use a bitmask to determine which F/W is loaded

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:27 +02:00
Oded Gabbay 28958207e9 habanalabs: we need CPU queues for hwmon
F/W can be loaded but device CPU queues disabled. In that case, HWMON
should be disabled. This is only relevant when debugging

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:27 +02:00
Ofir Bitton 20b7525dc4 habanalabs/gaudi: move mmu_prepare to context init
Currently mmu_prepare is located at context switch.
Since we support a single context, no reason to reconfigure
the MMU registers every context switch.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:27 +02:00
Oded Gabbay 23c15ae615 habanalabs: change aggregate cs counters to atomic
In case we will have multiple contexts/processes, we can't just
increment aggregated counters. We need to make them atomic as they can
be incremented by multiple processes

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:47:27 +02:00
Ofir Bitton 5555b7c56b habanalabs: put devices before driver removal
Driver never puts its device and control_device objects, hence
a memory leak is introduced every driver removal.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:30:16 +02:00
Ofir Bitton c8c39fbd01 habanalabs: free host huge va_range if not used
If huge range is not valid, driver uses the host range also for
huge page allocations, but driver never frees its allocation.
This introduces a memory leak every time a user closes its context.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 10:30:16 +02:00
Oded Gabbay 652b44453e habanalabs/gaudi: fix missing code in ECC handling
There is missing statement and missing "break;" in the ECC handling
code in gaudi.c
This will cause a wrong behavior upon certain ECC interrupts.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-23 20:08:43 +02:00
Oded Gabbay f83f3a31b2 habanalabs/gaudi: mask WDT error in QMAN
This interrupt cause is not relevant because of how the user use the
QMAN arbitration mechanism. We must mask it as the log explodes with it.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-04 08:56:07 +02:00
Ofir Bitton 1137e1ead9 habanalabs/gaudi: move coresight mmu config
We must relocate the coresight mmu configuration to the coresight
flow to make it work in case the first submission is to configure
the profiler.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-04 08:56:06 +02:00
Arnd Bergmann 82948e6e1d habanalabs: fix kernel pointer type
All throughout the driver, normal kernel pointers are
stored as 'u64' struct members, which is kind of silly
and requires casting through a uintptr_t to void* every
time they are used.

There is one line that missed the intermediate uintptr_t
case, which leads to a compiler warning:

drivers/misc/habanalabs/common/command_buffer.c: In function 'hl_cb_mmap':
drivers/misc/habanalabs/common/command_buffer.c:512:44: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
  512 |  rc = hdev->asic_funcs->cb_mmap(hdev, vma, (void *) cb->kernel_address,

Rather than adding one more cast, just fix the type and
remove all the other casts.

Fixes: 0db575350c ("habanalabs: make use of dma_mmap_coherent")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-04 08:56:06 +02:00
Oded Gabbay 5b94d6e476 habanalabs/gaudi: use correct define for qman init
There was a copy-paste error, and the wrong define was used for
initializing the QMAN.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200925171415.25663-1-oded.gabbay@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-30 08:38:18 +02:00
Ofir Bitton 25121d9804 habanalabs/gaudi: configure QMAN LDMA registers properly
LDMA registers are configured with a fixed value.
We add new define set which gives the configuration
a proper meaning.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:21 +03:00
Oded Gabbay eab1f6e7b0 habanalabs: add notice of device not idle
The device should be idle after a context is closed. If not, print a
notice.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:21 +03:00
Oded Gabbay 3c3aa5dbd6 habanalabs: add debug messages for opening/closing context
During debugging of error we sometimes need to know whether the error
happened when a user context was open. Add debug prints when opening and
closing user contexts.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Oded Gabbay 9e2e8fc7d6 habanalabs: release kernel context after hw_fini
Some engines use resources that belong to the kernel context (e.g. MMU
mappings). In case the halt-engines doesn't work properly due to H/W
restriction, we need to make sure the kernel context lives on until after
the hw_fini. The hw_fini resets the ASIC after that no engine is alive and
we can safely close the kernel context.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Oded Gabbay fc6121e961 habanalabs: correct an error message
We don't try to allocate huge pages here so remove the huge word.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Oded Gabbay f279e5cd95 habanalabs: update scratchpad register map
Our firmware use some scratchpad registers in the device for different
roles. Update the file to the latest version of the firmware code.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Oded Gabbay 57799ce9f8 habanalabs: add indication of security-enabled F/W
Future F/W versions will have enhanced security measures and the driver
won't be able to do certain configurations that it always did and those
configurations will be done by the firmware.

We use the firmware's preboot version to determine whether security
measures are enabled or not. Because we need this very early in our code,
the read of the preboot version is moved to the earliest possible place,
right after the device's PCI initialization.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Oded Gabbay d1f3633599 habanalabs/gaudi: fix DMA completions max outstanding to 15
This is a workaround for H/W bug H3-2116, where if there are more than 16
outstanding completions in the DMA transpose engine, there can be a
deadlock in the engine.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Oded Gabbay dbf053c429 habanalabs/gaudi: remove axi drain support
AXI drain is broken in GAUDI so remove support for enabling it.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Oded Gabbay 219b8f2ff0 habanalabs: update firmware interface file
Add new packet to fetch PLL information from firmware. This will be needed
in the future when the driver won't be able to access the PLL registers
directly

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Tomer Tayar ef6a0f6caa habanalabs: Add an option to map CB to device MMU
There are cases in which the device should access the host memory of a
CB through the device MMU, and thus this memory should be mapped.
The patch adds a flag to the CB IOCTL, in which a user can ask the
driver to perform the mapping when creating a CB.
The mapping is allowed only if a dedicated VA range was allocated for
the specific ASIC.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Tomer Tayar fa8641a14f habanalabs: Save context in a command buffer object
Future changes require using a context while handling a command buffer,
and thus need to save the context in the command buffer object.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:54 +03:00
Oded Gabbay 448f63badc habanalabs: no need for DMA_SHARED_BUFFER
Now that the driver no longer uses dma_buf, we can remove the select of
DMA_SHARED_BUFFER from kconfig.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Oded Gabbay 681a22f55f habanalabs: allow to wait on CS without sleep
The user sometimes wants to check if a CS has completed to clean resources.
In that case, the user doesn't want to sleep but just to check if the CS
has finished and continue with his code.

Add a new definition to the API of the wait on CS. The new definition says
that if the timeout is 0, the driver won't sleep at all but return
immediately after checking if the CS has finished.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Oded Gabbay 230b9b7d45 habanalabs/gaudi: increase timeout for boot fit load
The firmware running in the boot stage takes more time to execute due to
increased security mechanisms. Therefore, we need to increase the timeout
we wait for the boot fit to finish loading.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Moti Haimovski 214afa974d habanalabs: add debugfs support for MMU with 6 HOPs
This commit modify the existing debugfs code to support future devices that
have a 6 HOPs MMU implementation instead of 5 HOPs implementation.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Moti Haimovski 7edf341b9e habanalabs: add num_hops to hl_mmu_properties
This commit adds the number of HOPs supported by the device to the
device MMU properties.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Moti Haimovski d83fe66928 habanalabs: refactor MMU as device-oriented
As preparation to MMU v2, rework MMU to be device oriented
instantiated according to the device in hand.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Moti Haimovski c91324f41b habanalabs: rename mmu.c to mmu_v1.c
In the future we will have MMU v2 code, so we need to prepare the
driver for it. The first step is to rename the current MMU file to
mmu_v1.c.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Omer Shpigelman 7c52fb0a09 habanalabs: use smallest possible alignment for virtual addresses
Change the acquiring of a device virtual address for mapping by using the
smallest possible alignment, rather than the biggest, depending on the
page size used by the user for allocating the memory. This will lower the
virtual space memory consumption.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:53 +03:00
Oded Gabbay 1fb2f37437 habanalabs: check flag before reset because of f/w event
For consistency with GAUDI code, add check of the relevant flag in the
device structure before resetting the GOYA device in case of firmware
event.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Oded Gabbay 5a1b861daa habanalabs: increase PQ COMP_OFFSET by one nibble
For future ASICs, we increase this field by one nibble. This field was not
used by the current ASICs so this change doesn't break anything.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Ofir Bitton 763a0b4d81 habanalabs: Fix alignment issue in cpucp_info structure
Because the device CPU compiler aligns structures to 8 bytes,
struct cpucp_info has an alignment issue as some parts
in the structure are not aligned to 8 bytes.
It is preferred that we explicitly insert placeholders inside
the structure to avoid confusion

in order to validate this scenario, we printed both pointers:

__u8 cpucp_version[VERSION_MAX_LEN]; (0xffff899c67ed4cbc)
__le64 dram_size;                    (0xffff899c67ed4d40)

we see difference of 132 bytes although the first array
is only 128 bytes long, Meaning compiler added a 4 byte padding.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Oded Gabbay ae926514dd habanalabs: remove unused define
Cleanup the code.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Oded Gabbay b01a971f80 habanalabs: remove unused ASIC function pointer
Old function pointer that was left when the call to this function pointer
was removed.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Oded Gabbay 6138bbe911 habanalabs: rename ArmCP to CPU-CP
There were a couple of comments where the name ArmCP was still used. Rename
it to CPU-CP.

In addition, rename ArmCP or ARM in log messages to "device CPU".

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Oded Gabbay 975ab7b32b habanalabs: count dropped CS because max CS in-flight
There is a case where the user reaches the maximum number of CS in-flight.
In that case, the driver rejects the new CS of the user with EAGAIN. Count
that event so the user can query the driver later to see if it happened.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:52 +03:00
Hillf Danton 0db575350c habanalabs: make use of dma_mmap_coherent
Add dma_mmap_coherent() for goya and gaudi to match their use of
dma_alloc_coherent(), see the Link tag for why.

Link: https://lore.kernel.org/lkml/20200609091727.GA23814@lst.de/
Cc: Christoph Hellwig <hch@lst.de>
Cc: Zhang Li <li.zhang@bitmain.com>
Cc: Ding Z Nan <oshack@hotmail.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Oded Gabbay c5e0ec66f0 habanalabs: clear vm_pgoff before doing the mmap
The driver use vm_pgoff to hold the CB idr handle. Before we actually call
the mapping function, we need to clear the handle so there won't be any
garbage left in vm_pgoff.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Oded Gabbay 3174ac9bb1 habanalabs: restructure hl_mmap
Arrange the hl_mmap code to be more structured and expandable for the
future. Add better defines that describe our usage of the vm_pgoff.

Note that I shamelessly took the code and defines from the amdkfd driver
(my previous driver).

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Oded Gabbay f763946aef habanalabs: cast to u64 before shift > 31 bits
When shifting a boolean variable by more than 31 bits and putting the
result into a u64 variable, we need to cast the boolean into unsigned 64
bits to prevent possible overflow.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Oded Gabbay 2f55342c5e habanalabs: replace armcp with the generic cpucp
ArmCP mandates that the device CPU is always an ARM processor, which might
be wrong in the future.

Most of this change is an internal renaming of variables, functions and
defines but there are two entries in sysfs which have armcp in their
names. Add identical cpucp entries but don't remove yet the armcp entries.
Those will be deprecated next year. Add the documentation about it in sysfs
documentation.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Oded Gabbay 42b0698add habanalabs: update GAUDI hardware specs
Add define for the 2 MME slave engines.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
farah kassabri 9f3064913e habanalabs: add support for getting device total energy
Add driver implementation for reading the total energy consumption
from the device ARM FW.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
Tomer Tayar 56004701f5 habanalabs: Include linux/bitfield.h only in habanalabs.h
Include linux/bitfield.h only in habanalabs.h, instead of in each and
every file that needs it, as habanalabs.h is already included by all.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:51 +03:00
farah kassabri d90416c84d habanalabs: extend busy engines mask to 64 bits
change busy engines bitmask to 64 bits in order to represent
more engines, needed for future ASIC support.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay 107dd31465 habanalabs: use 1U when shifting bits
Eliminate following warning:
warning: Shifting signed 32-bit value by 31 bits is undefined behavior

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay 31ac1f1a57 habanalabs: check TPC vector pipe is empty
The driver waits for the TPC vector pipe to be empty before checking if the
TPC kernel has finished executing, but the code doesn't validate that the
pipe was indeed empty, it just wait for it without checking the return
value.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay 0358372bbe habanalabs: remove redundant assignment to variable
new_dma_pkt->ctl is assigned a value and then is reassigned a new value
without the first value ever being used.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay 65887291c6 habanalabs: use FIELD_PREP() instead of <<
Use the standard FIELD_PREP() macro instead of << operator to perform
bitmask operations. This ensures type check safety and eliminate compiler
warnings.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay a0e072f5a1 habanalabs: use standard BIT() and GENMASK()
Use the standard macros to define bitmasks.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay bd4ef37292 habanalabs: eliminate redundant else condition
If both parts of if-else are goto statements, we can remove the else and
put the else goto statement after the if statement.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:50 +03:00
Oded Gabbay f907af183b habanalabs: cast int to u32 before printing it with %u
%u is used for unsigned so we need to cast the int variable to u32 to avoid
compiler warning.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Oded Gabbay f5b9c8cf25 habanalabs: change CB's ID to be 64 bits
Although the possible values for CB's ID are only 32 bits, there are a few
places in the code where this field is shifted and passed into a function
which expects 64 bits.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Dotan Barak d6b045c083 habanalabs: print the queue id in case of an error
If there is a failure during the testing of a queue,
to ease up debugging - print the queue id.

Signed-off-by: Dotan Barak <dbarak@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
farah kassabri acd330c141 habanalabs: remove security from ARB_MST_QUIET register
Allow user application to write to this register in order
to be able to configure the quiet period of the QMAN between grants.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Ofir Bitton 2e5eda4681 habanalabs: PCIe Advanced Error Reporting support
driver will now get notified upon any PCI error occurred and
will respond according to the severity of the error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Ofir Bitton 843839bec3 habanalabs: expose sync manager resources allocation in INFO IOCTL
Although the driver defines the first user-available sync manager object
and monitor in habanalabs.h, we would like to also expose this information
via the INFO IOCTL so the runtime can get this information dynamically.
This is because in future ASICs we won't need to define it statically.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Ofir Bitton 0a068adde5 habanalabs: add information about PCIe controller
Update firmware header with new API for getting pcie info
such as tx/rx throughput and replay counter.
These counters are needed by customers for monitor and maintenance
of multiple devices.
Add new opcodes to the INFO ioctl to retrieve these counters.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:49 +03:00
Ofir Bitton a98d73c7fa habanalabs: Replace dma-fence mechanism with completions
habanalabs driver uses dma-fence mechanism for synchronization.
dma-fence mechanism was designed solely for GPUs, hence we purpose
a simpler mechanism based on completions to replace current
dma-fence objects.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 18:49:48 +03:00
Oded Gabbay b71590efb2 habanalabs: increase length of ASIC name
Future ASIC names are longer than 15 chars so increase the variable length
to 32 chars.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-09-22 18:49:48 +03:00
Ofir Bitton 69c6e18d0c habanalabs: fix report of RAZWI initiator coordinates
All initiator coordinates received upon an 'MMU page fault RAZWI
event' should be the routers coordinates, the only exception is the
DMA initiators for which the reported coordinates correspond to
their actual location.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-31 15:10:27 +03:00
Moti Haimovski 6396feabf7 habanalabs: prevent user buff overflow
This commit fixes a potential debugfs issue that may occur when
reading the clock gating mask into the user buffer since the
user buffer size was not taken into consideration.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-31 15:10:27 +03:00
Ofir Bitton 5aba368893 habanalabs: correctly report inbound pci region cfg error
During inbound iATU configuration we can get errors while
configuring PCI registers, there is a certain scenario in which these
errors are not reflected and driver is loaded with wrong configuration.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:58 +03:00
Ofir Bitton 0839152f8c habanalabs: check correct vmalloc return code
vmalloc can return different return code than NULL and a valid
pointer. We must validate it in order to dereference a non valid
pointer.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:58 +03:00
Ofir Bitton bce382a8bb habanalabs: validate FW file size
We must validate FW size in order not to corrupt memory in case
a malicious FW file will be present in system.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:58 +03:00
Colin Ian King 804d057cfa habanalabs: fix incorrect check on failed workqueue create
The null check on a failed workqueue create is currently null checking
hdev->cq_wq rather than the pointer hdev->cq_wq[i] and so the test
will never be true on a failed workqueue create. Fix this by checking
hdev->cq_wq[i].

Addresses-Coverity: ("Dereference before null check")
Fixes: 5574cb2194 ("habanalabs: Assign each CQ with its own work queue")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:58 +03:00
Oded Gabbay 58361aae4b habanalabs: set max power according to card type
In Gaudi, the default max power setting is different between PCI and PMC
cards. Therefore, the driver need to set the default after knowing what is
the card type.

The current code has a bug where it limits the maximum power of the PMC
card to 200W after a reset occurs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Ofir Bitton 36545279f0 habanalabs: proper handling of alloc size in coresight
Allocation size can go up to 64bit but truncated to 32bit,
we should make sure it is not truncated and validate no address
overflow.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Ofir Bitton f44d23b909 habanalabs: set clock gating according to mask
Once clock gating is set we enable clock gating according to mask,
we should also disable clock gating according to relevant bits.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Ofir Bitton 1cff119740 habanalabs: verify user input in cs_ioctl_signal_wait
User input must be validated before using it to
access internal structures.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Dan Carpenter b0353540ff habanalabs: Fix a loop in gaudi_extract_ecc_info()
The condition was reversed.  It should have been less than instead of
greater than.  The result is that we never enter the loop.

Fixes: fcc6a4e606 ("habanalabs: Extract ECC information from FW")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Dan Carpenter eeec23cd32 habanalabs: Fix memory corruption in debugfs
This has to be a long instead of a u32 because we write a long value.
On 64 bit systems, this will cause memory corruption.

Fixes: c216477363 ("habanalabs: add debugfs support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Ofir Bitton bc75be24fa habanalabs: validate packet id during CB parse
During command buffer parsing, driver extracts packet id
from user buffer. Driver must validate this packet id, since it is
being used in order to extract information from internal structures.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:57 +03:00
Ofir Bitton bf6d10963e habanalabs: Validate user address before mapping
User address must be validated before driver performs address map.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:56 +03:00
Ofir Bitton f1aae40e8d habanalabs: unmap PCI bars upon iATU failure
In case the driver fails to configure the PCI controller iATU, it needs to
unmap the PCI bars before exiting so if the driver is removed, the bars
won't be left mapped.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-22 12:47:56 +03:00
Wei Yongjun 22362aa30b habanalabs: remove unused but set variable 'ctx_asid'
Gcc report warning as follows:

drivers/misc/habanalabs/common/command_submission.c:373:6: warning:
 variable 'ctx_asid' set but not used [-Wunused-but-set-variable]
  373 |  int ctx_asid, rc;
      |      ^~~~~~~~

This variable is not used in function cs_timedout(), this commit
remove it to fix the warning.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20200729155902.33976-1-weiyongjun1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29 18:02:21 +02:00
kernel test robot bb34bf798c habanalabs: goya_ctx_init() can be static
Signed-off-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20200729000313.GA14680@e442e3f624c4
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29 09:46:05 +02:00
Greg Kroah-Hartman 7b16a15524 habanalabs: fix up absolute include instructions
There's no need to try to be cute with the include file locations in the
Makefile, so just specify exactly where the files are.

Bonus is this fixes the problem of building with O= as well as trying to
just build the subdirectory alone.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Omer Shpigelman <oshpigelman@habana.ai>
Cc: Tomer Tayar <ttayar@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Ben Segal <bpsegal20@gmail.com>
Cc: Christine Gharzuzi <cgharzuzi@habana.ai>
Cc: Pawel Piskorski <ppiskorski@habana.ai>
Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29 08:15:50 +02:00
Greg Kroah-Hartman 65a9bde6ed Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge 5.8-rc7 into char-misc-next

This should resolve the merge/build issues reported when trying to
create linux-next.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-27 11:49:37 +02:00
Tomer Tayar 94f8be9eb0 habanalabs: Fix memory leak in error flow of context initialization
Add a missing free of the cs_pending array in the error flow of context
initialization.

Fixes: c16d45f42b ("habanalabs: Use pending CS amount per ASIC")

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:40:06 +03:00
Tomer Tayar 644883ef1a habanalabs: use no flags on MMU cache invalidation
gaudi_mmu_invalidate_cache() doesn't use the flags parameter, and thus
it can be set to 0 when the function is called in the gaudi only files.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Oded Gabbay 8df8cb1efc habanalabs: enable device before hw_init()
Device is now enabled before the hw_init() because part of the
initialization requires communication with the device firmware to get
information that is required for the initialization itself

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:37 +03:00
Ofir Bitton a04b7cd97e habanalabs: create internal CB pool
Create a device MMU-mapped internal command buffer pool, in order to allow
the driver to allocate CBs for the signal/wait operations
that are fetched by the queues when they are configured with the user's
address space ID.

We must pre-map this internal pool due to performance issues.

This pool is needed for future ASIC support and it is currently unused in
GOYA and GAUDI.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Oded Gabbay eb8b293e79 habanalabs: update hl_boot_if.h from firmware
Update the boot interface file from the latest version from firmware.
Defines for secure boot were added.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-24 20:31:37 +03:00
Oded Gabbay 70b2f993ea habanalabs: create common folder
For internal needs of our CI we need to move all the common code into a
common folder instead of putting them in the root folder of the driver.

Same applies to the common header files under include/

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-24 20:31:37 +03:00
Moti Haimovski a9855a2d91 habanalabs: check for DMA errors when clearing memory
In GAUDI we use QMAN0 DMA for clearing the MMU memory region
at initialization. if this operation fails it places the DMA in an error
state and then when trying to initialize QMAN0 we fail and erroneously
assume its the QMAN that failed.

This commit adds a check and clear of such DMA errors at initialization so
we will have a better understanding of what went wrong.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Ofir Bitton 22cb855598 habanalabs: verify queue can contain all cs jobs
In order for the user to be aware of wrong inputs, we must return
error in case the amount of jobs per cs exceeds the corresponding
queue size.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Ofir Bitton 5574cb2194 habanalabs: Assign each CQ with its own work queue
We identified a possible race during job completion when working
with a single multi-threaded work queue. In order to overcome this
race we suggest using a single threaded work queue per completion
queue, hence we guarantee jobs completion in order.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:37 +03:00
Oded Gabbay c83c417193 habanalabs: halt device CPU only upon certain reset
Currently the driver halts the device CPU in the halt engines function,
which halts all the engines of the ASIC. The problem is that if later on we
stop the reset process (due to inability to clean memory mappings in time),
the CPU will remain in halt mode. This creates many issues, such as
thermal/power control and FLR handling.

Therefore, move the halting of the device CPU to the very end of the reset
process, just before writing to the registers to initiate the reset. In
addition, the driver now needs to send a message to the device F/W to
disable it from sending interrupts to the host machine because during halt
engines function the driver disables the MSI/MSI-X interrupts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:36 +03:00
Omer Shpigelman 9158c47e20 habanalabs: remove unused hash
Remove an old hash that is not in use anymore.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton 79b1894c41 habanalabs: use queue pi/ci in order to determine queue occupancy
Instead of using the free slots amount on the compute CQ to determine
whether we can submit work to queues, use the queues pi/ci.

This is needed in future ASICs where we don't have CQ per queue.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton 3abc99bb7d habanalabs: configure maximum queues per asic
Currently the amount of maximum queues is statically configured.
Using a static value is causing redundunt cycles when traversing
all queues and consumes more memory than actually needed.
In this patch we configure each asic with the exact number of
queues needed.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Oded Gabbay 12ae3133d2 habanalabs: remove soft-reset support from GAUDI
Soft-reset isn't supported in GAUDI. Remove the code that performs it and
print error in case the user wants to do it via sysfs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:36 +03:00
Ofir Bitton f4cbfd2445 habanalabs: PCIe iATU refactoring
Divide iATU initialization into inbound/outbound methods.
We must separate it in order to enable different match mode
per PCIe region.
In addition, added support for PCI address match mode.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Oded Gabbay fcc6a4e606 habanalabs: Extract ECC information from FW
ECC (Error Correcting Code) interrupts are going to be handled
by the FW. Hence, we define an interface in which the driver can
obtain the relevant ECC information.
This information is needed for monitoring and can also lead
to a hard reset if ECC error is not correctable.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Ofir Bitton db491e4f08 habanalabs: Add dropped cs statistics info struct
Add command submission statistics structure which can be obtained
through the info ioctl. Each drop counter describes the reason for
which the command submission was dropped.
This information is needed for the user to be aware of the specific
reason for which the submitted work was dropped. The user can then
utilize the driver more efficiently.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Christine Gharzuzi c8f9b49d2d habanalabs: extract cpu boot status lookup
Extract detection of the cpu boot status to a function
to allow code reuse

Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Oded Gabbay 0eab4f89d6 habanalabs: rephrase error messages
rephrase some error/warning/notice messages to make them more accessible to
ordinary users.

There is no need to print context ASID as the driver currently doesn't
support multiple contexts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 20:31:35 +03:00
Ofir Bitton dd9efabd0a habanalabs: Increase queues depth
After recent concurrent cs amount increase, we must also
increase queues depth since much more concurrent work can be done.
All external queue depths were increased to 4096 as gaudi's
internal queue depths were also increased to 1024.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Omer Shpigelman 917b79b096 habanalabs: rephrase error message
Rephrase F/W error message to make it more understandable to ordinary
users.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Adam Aharon e8edded693 habanalabs: calculate trace frequency from PLL
The profiler needs to know the PLL values for correctly showing the
profiling data. Because our firmware can use different PLL configurations,
we need to read the PLL values from the ASIC to pass them to the profiler.

Signed-off-by: Adam Aharon <aaharon@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Oded Gabbay 6ced91170d habanalabs: align armcp_packet structure to 8 bytes
Once there is a 64-bit field in a structure, GCC compiler for ARM aligns
the structure to 8 bytes. In order to avoid confusion when these
structures are being passed between CPUs from different architectures, we
explicitly align the structure to 8 bytes.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Ofir Bitton 6c07bab34b habanalabs: Use mask instead of shift in sync stream registers
Use proper bitfield masks instead of shifting values when configuring
packets sent to device.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Ofir Bitton 21e7a34634 habanalabs: sync stream generic functionality
Currently sync stream is limited only for external queues. We want to
remove this constraint by adding a new queue property dedicated for sync
stream. In addition we move the initialization and reset methods to the
common code since we can re-use them with slight changes.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Ofir Bitton c16d45f42b habanalabs: Use pending CS amount per ASIC
Training schemes requires much more concurrent command submissions than
inference does. In addition, training command submissions can be completed
in a non serialized manner. Hence, we add support in which each ASIC will
be able to configure the amount of concurrent pending command submissions,
rather than use a predefined amount. This change will enhance performance
by allowing the user to add more concurrent work without waiting for the
previous work to be completed.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Oded Gabbay 0b168c8f1d habanalabs: remove rate limiters from GAUDI
We no longer need to initialize the rate limiters in GAUDI A1.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:34 +03:00
Oded Gabbay cea7a0449e habanalabs: prevent possible out-of-bounds array access
Queue index is received from the user. Therefore, we must validate it
before using it to access the queue props array.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-19 08:15:36 +03:00
Oded Gabbay 788cacf308 habanalabs: set 4s timeout for message to device CPU
We see that sometimes the CPU in GOYA and GAUDI is occupied by the
power/thermal loop and can't answer requests from the driver fast enough.

Therefore, to avoid false notifications on timeouts, increase the timeout
to 4 seconds on each message sent to the device CPU.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-10 19:53:03 +03:00
Oded Gabbay e38bfd30e0 habanalabs: set clock gating per engine
For debugging purposes, we need to allow the root user better control of
the clock gating feature of the DMA and compute engines. Therefore, change
the clock gating debugfs interface to be bitmask instead of true/false.
Each bit represents a different engine, according to gaudi_engine_id enum.

See debugfs documentation for more details.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-10 19:53:03 +03:00
Oded Gabbay 2edc66e22b habanalabs: block WREG_BULK packet on PDMA
WREG_BULK is a special packet that has a variable length. Therefore, we
can't parse it when validating CBs that go to the PCI DMA queue. In case
the user needs to use it, it can put multiple WREG32 packets instead.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-10 19:53:03 +03:00
Lee Jones 14395c6fb1 misc: habanalabs: gaudi: gaudi_security: Repair incorrectly named function arg
gaudi_pb_set_block()'s argument 'base' was incorrectly named 'block' in
its function header.

Fixes the following W=1 kernel build warning(s):

 drivers/misc/habanalabs/gaudi/gaudi_security.c:454: warning: Function parameter or member 'base' not described in 'gaudi_pb_set_block'
 drivers/misc/habanalabs/gaudi/gaudi_security.c:454: warning: Excess function parameter 'block' description in 'gaudi_pb_set_block'

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-11-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:37 +02:00
Lee Jones f7d227c306 misc: habanalabs: gaudi: Remove ill placed asterisk from kerneldoc header
W=1 kernel builds report a lack of description of gaudi_set_asic_funcs()'s
'hdev' argument.  In reality it is documented, but the formatting
was not as expected '@.*:'.  Instead, there was a misplaced asterisk
which was confusing the kerneldoc validator.

Squashes the following W=1 warning:

 drivers/misc/habanalabs/gaudi/gaudi.c:6746: warning: Function parameter or member 'hdev' not described in 'gaudi_set_asic_funcs'

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-10-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:37 +02:00
Lee Jones 67db05cea6 misc: habanalabs: goya: goya_coresight: Remove set but unused variable 'val'
No attempt to check the return value of RREG32() has been made
since the call was introduced a year ago.

Fixes W=1 kernel build warning:

 drivers/misc/habanalabs/goya/goya_coresight.c: In function ‘goya_debug_coresight’:
 drivers/misc/habanalabs/goya/goya_coresight.c:643:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable]
 643 | u32 val;
 | ^~~

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-9-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:37 +02:00
Lee Jones e0712c600e misc: habanalabs: pci: Scrub documentation for non-present function argument
'dma_mask' is not passed directly into hl_pci_set_dma_mask() as
an argument.  Instead, it is pulled from struct hl_device *hdev.

Fixed the following W=1 warning:

 drivers/misc/habanalabs/pci.c:328: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-8-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:36 +02:00
Lee Jones 2557f27fd4 misc: habanalabs: goya: Omit pointless check ensuring addr is >=0
Seeing as 'addr' is unsigned, it would be impossible for the assigned
value to be anything other than zero or positive.

Squashes the following W=1 warnings:

 drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read32’:
 drivers/misc/habanalabs/goya/goya.c:3945:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
 3945 | } else if ((addr >= DRAM_PHYS_BASE) &&
 | ^~
 drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write32’:
 drivers/misc/habanalabs/goya/goya.c:4002:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
 4002 | } else if ((addr >= DRAM_PHYS_BASE) &&
 | ^~
 drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read64’:
 drivers/misc/habanalabs/goya/goya.c:4047:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
 4047 | } else if ((addr >= DRAM_PHYS_BASE) &&
 | ^~
 drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write64’:
 drivers/misc/habanalabs/goya/goya.c:4091:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
 4091 | } else if ((addr >= DRAM_PHYS_BASE) &&
 | ^~
 drivers/misc/habanalabs/pci.c:328: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask'
 drivers/misc/habanalabs/goya/goya_coresight.c: In function ‘goya_debug_coresight’:
 drivers/misc/habanalabs/goya/goya_coresight.c:643:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable]
 643 | u32 val;
 | ^~~

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-7-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:36 +02:00
Lee Jones 3db99f000b misc: habanalabs: irq: Repair kerneldoc formatting issues
W=1 kernel builds report a lack of descriptions for various
function arguments.  In reality they are documented, but the
formatting was not as expected '@.*:'.  Instead, '-'s were
used as separators.

While we're here, the headers for functions various functions
were written in kerneldoc format, but lack the kerneldoc
identifier '/**'.  Let's promote them so they can gain access
to the checker.

This change fixes the following W=1 warnings:

 drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'eq_work' not described in 'hl_eqe_work'
 drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'hdev' not described in 'hl_eqe_work'
 drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'eq_entry' not described in 'hl_eqe_work'

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-6-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:36 +02:00
Lee Jones df123c9dcd misc: habanalabs: pci: Fix a variety of kerneldoc issues
hl_pci_bars_map() has a miss-typed argument name in the function
description.  hl_pci_elbi_write() was missing documented arguments.
The headers for functions hl_pci_bars_unmap(), hl_pci_elbi_write()
and hl_pci_reset_link_through_bridge() were written in kerneldoc
format, but lack the kerneldoc identifier '/**'.  Let's promote
them so they can gain access to the checker.

These changes fix the following W=1 kernel build warnings:

 drivers/misc/habanalabs/pci.c:27: warning: Function parameter or member 'name' not described in 'hl_pci_bars_map'
 drivers/misc/habanalabs/pci.c:27: warning: Excess function parameter 'bar_name' description in 'hl_pci_bars_map'
 drivers/misc/habanalabs/pci.c:147: warning: Function parameter or member 'addr' not described in 'hl_pci_iatu_write'
 drivers/misc/habanalabs/pci.c:147: warning: Function parameter or member 'data' not described in 'hl_pci_iatu_write'
 drivers/misc/habanalabs/pci.c:324: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask'

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-5-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:36 +02:00
Lee Jones a0c11b3c91 misc: habanalabs: firmware_if: Add missing 'fw_name' and 'dst' entries to function header
Looks as though documentation for these function arguments have
been missing since the driver's inception last year.

Fixes the following W=1 kernel build warnings:

 drivers/misc/habanalabs/firmware_if.c:26: warning: Function parameter or member 'fw_name' not described in 'hl_fw_load_fw_to_device'
 drivers/misc/habanalabs/firmware_if.c:26: warning: Function parameter or member 'dst' not described in 'hl_fw_load_fw_to_device'

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200701085853.164358-4-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-01 15:05:36 +02:00
Lee Jones 9eea2a499f misc: habanalabs: irq: Add missing struct identifier for 'struct hl_eqe_work'
In kerneldoc format, data structures have to start with 'struct'
else the kerneldoc tooling/parsers/validators get confused.

Squashes the following W=1 warning:

 drivers/misc/habanalabs/irq.c:19: warning: cannot understand function prototype: 'struct hl_eqe_work '

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20200626130525.389469-10-lee.jones@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-29 18:45:53 +02:00
Omer Shpigelman ce04326edd habanalabs: increase h/w timer when checking idle
In GAUDI the current timer value for the hardware to check if it is
in IDLE state is too low. As a result, there are occasions where the H/W
wrongly reports it is not IDLE. The driver checks that before submitting
work on behalf of the driver during initialization, so a false report might
cause the driver to fail during device initialization.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 12:35:23 +03:00
Ofir Bitton 3292055c85 habanalabs: Correct handling when failing to enqueue CB
The fence release flow is different if the CS was never submitted. In that
case, we don't have an hw_sob object attached that we need to "put". While
if the CS was aborted, we do need to "put" the hw_sob.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 09:09:10 +03:00
Oded Gabbay 647e835e67 habanalabs: increase GAUDI QMAN ARB WDT timeout
The current timeout is too low for some of the workloads and we see false
errors as a result.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 09:09:10 +03:00
Oded Gabbay dd2fde1093 habanalabs: rename mmu_write() to mmu_asid_va_write()
The function name conflicts with a static inline function in
arch/m68k/include/asm/mcfmmu.h

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 09:09:10 +03:00
Omer Shpigelman cfd4176dc0 habanalabs: use PI in MMU cache invalidation
The PS flow for MMU cache invalidation caused timeouts in stress tests.
Use PS + PI flow so no timeouts should happen whatsoever.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 09:09:10 +03:00
Oded Gabbay 64536abc62 habanalabs: block scalar load_and_exe on external queue
In Gaudi, the user can't execute scalar load_and_exe on external queue
because it can be a security hole. The driver doesn't parse the commands
being loaded and it can be msg_prot, which the user isn't allowed to use.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24 09:09:10 +03:00
Oded Gabbay 05c8a4fc44 habanalabs: correctly cast u64 to void*
Use the u64_to_user_ptr(x) kernel macro to correctly cast u64 to void*

Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200601065648.8775-2-oded.gabbay@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-01 09:08:10 +02:00
Tomer Tayar c68f1baeaf habanalabs: initialize variable to default value
Fix the following smatch error in unmap_device_va():
error: uninitialized symbol 'rc'.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200601065648.8775-1-oded.gabbay@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-01 09:08:10 +02:00
Omer Shpigelman 8ff5f4fd40 habanalabs: handle MMU cache invalidation timeout
MMU cache invalidation timeout indicates that the device is unstable and
therefore unusable.
Hence in such case do hard reset and return an error to the user if was
called from ioctl.
In addition, change the print to error level and rephrase its text.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-25 08:17:57 +03:00
Omer Shpigelman 36fafe87ed habanalabs: don't allow hard reset with open processes
When the MMU is heavily used by the engines, unmapping might take a lot of
time due to a full MMU cache invalidation done as part of the unmap flow.
Hence we might not be able to kill all open processes before going to hard
reset the device, as it involves unmapping of all user memory.
In case of a failure in killing all open processes, we should stop the
hard reset flow as it might lead to a kernel crash - one thread (killing
of a process) is updating MMU structures that other thread (hard reset) is
freeing.
Stopping a hard reset flow leaves the device as nonoperational and the
user can then initiate a hard reset via sysfs to reinitialize the device.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-25 08:15:33 +03:00
Oded Gabbay 66446820df habanalabs: GAUDI does not support soft-reset
GAUDI does not support soft-reset as it leaves the NIC ports in an awkward
state, where their QMANs were reset but the NIC itself is still working.

In addition, there is not much sense in doing soft-reset when training is
done on multiple GAUDIs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-05-25 08:15:33 +03:00
Omer Shpigelman d798507988 habanalabs: add print for soft reset due to event
Print the event name that caused the soft reset.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-25 08:15:33 +03:00
Omer Shpigelman 42d0b0b95f habanalabs: improve MMU cache invalidation code
A new sequence is introduced to invalidate the MMU cache in order to avoid
timeouts.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-25 08:15:33 +03:00
Daniel Vetter ed65bfd9fd habanalabs: don't set default fence_ops->wait
It's the default.

Also so much for "we're not going to tell the graphics people how to
review their code", dma_fence is a pretty core piece of gpu driver
infrastructure. And it's very much uapi relevant, including piles of
corresponding userspace protocols and libraries for how to pass these
around.

Would be great if habanalabs would not use this (from a quick look
it's not needed at all), since open source the userspace and playing
by the usual rules isn't on the table. If that's not possible (because
it's actually using the uapi part of dma_fence to interact with gpu
drivers) then we have exactly what everyone promised we'd want to
avoid.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-25 08:15:33 +03:00
Rachel Stahl 87eaea1cf8 habanalabs: update patched_cb_size for Wreg32
The patch_cb_size is not updated for Wreg32 in its validate function, so
updated in goya_validate_cb.

Signed-off-by: Rachel Stahl <rstahl@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Ofir Bitton ebd8d12251 habanalabs: move event handling to common firmware file
Instead of writing similar event handling code for each ASIC, move the code
to the common firmware file. This code will be used for GAUDI and all
future ASICs.

In addition, add two new fields to the auto-generated events file: valid
and description. This will save the need to manually write the events
description in the source code and simplify the code.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay af57cb81a6 habanalabs: enable gaudi code in driver
Enable the GAUDI ASIC code in the pci probe callback of the driver so the
driver will handle GAUDI ASICs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 79fc7a9fff habanalabs: add gaudi profiler module
Add the GAUDI code to initialize the ASIC's profiler. The profile receives
its initialization values from the user, same as in Goya, but the code to
initialize is in the driver because the configuration space of the
device is not directly exposed to the user.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 3a3a5bf196 habanalabs: add gaudi security module
Add the code to initialize the security module of GAUDI. Similar to Goya,
we have two dedicated mechanisms for security: Range Registers and
Protection bits. Those mechanisms protect sensitive memory and
configuration areas inside the device.

In addition, in Gaudi we moved to a 3-level security scheme, where the F/W
runs with the highest security level (Privileged), the driver runs with a
less secured level (Secured) and the user is neither privileged nor
secured. The security module in the driver configures the Secured parts so
the user won't be able to access them. The Privileged parts are configured
by the F/W.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay bcaf415204 habanalabs: add hwmgr module for gaudi
The hwmgr module is responsible for messages sent to GAUDI F/W that are
not common to all habanalabs ASICs.

In GAUDI, we provide the user a simplified mode of controlling the ASIC
clock frequency. Instead of three different clocks, we present a single
clock property that the user can configure via sysfs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay ac0ae6a96a habanalabs: add gaudi asic-dependent code
Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function
callbacks that the driver's common code need to initialize, finalize and
submit workloads to the GAUDI ASIC.

It also contains the code to initialize the F/W of the GAUDI ASIC and to
receive events from the F/W.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 2aad2bf81c habanalabs: add gaudi asic registers header files
Add the relevant GAUDI ASIC registers header files. These files are
generated automatically from a tool maintained by the VLSI engineers.

There are more files which are not upstreamed because only very few defines
from those files are used in the driver. For those files, we copied the
relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of
this patch.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman fca72fbb66 habanalabs: get card type, location from F/W
For Gaudi the driver gets two new additional properties from the F/W:
1. The card's type - PCI or PMC
2. The card's location in the Gaudi's box (relevant only for PMC).

The card's location is also passed to the user in the HW IP info structure
as it needs this property for establishing communication between Gaudis.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay ca62433f53 habanalabs: support clock gating enable/disable
In Gaudi there is a feature of clock gating certain engines.
Therefore, add this property to the device structure.

In addition, due to a limitation of this feature, the driver needs to
dynamically enable or disable this feature during run-time. Therefore, add
ASIC interface functions to enable/disable this function from the common
code.

Moreover, this feature must be turned off when the user wishes to debug the
ASIC by reading/writing registers and/or memory through the driver's
debugfs. Therefore, add an option to enable/disable clock gating via the
debugfs interface.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 803917f960 habanalabs: set PM profile to auto only for goya
For Gaudi, the driver doesn't change the PM profile automatically due to
device-controlled PM capabilities. Therefore, set the PM profile to auto
only for Goya so the driver's code to automatically change the profile
won't run on Gaudi.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman e09498b078 habanalabs: add dedicated define for hard reset
Gaudi requires longer waiting during reset due to closing of network ports.
Add this explanation to the relevant comment in the code and add a
dedicated define for this reset timeout period, instead of multiplying
another define.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 9e5e49cd5b habanalabs: check if CoreSight is supported
Coresight is not supported on simulator, therefore add a boolean for
checking that (currently used by un-upstreamed code).

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman b75f22505a habanalabs: add signal/wait to CS IOCTL operations
Add the following two operations to the CS IOCTL:

Signal:

The signal operation is basically a command submission, that is created by
the driver upon user request. It will be implemented using a dedicated PQE
that will increment a specific SOB. There will be a new flag:
HL_CS_FLAGS_SIGNAL. When the user set this flag in the CS IOCTL structure,
the driver will execute a dedicated code path that will prepare this
special PQE and submit it. The user only needs to provide a queue index on
which to put the signal.

Wait:

The wait operation is also a command submission that is created by the
driver upon user request. It will be implemented using a dedicated PQE that
will contain packets of "ARM a monitor" + FENCE packet. There will be a new
flag: HL_CS_FLAGS_WAIT. When the user set this flag in the CS structure,
the driver will execute a dedicated code path that will prepare this
special PQE and submit it.

The user needs to provide the following parameters:
1. queue ID
2. an array of signal_seq numbers and the number of signals to wait on
   (the length of signal_seq_arr).

The IOCTL will return the CS sequence number of the wait it put on the
queue ID.

Currently, the code supports signal_seq_nr==1. But this API definition will
allow us to put a single PQE that waits on multiple signals.

To correctly configure the monitor and fence, the driver will need to
retrieve the specified signal CS object that contains the relevant SOB and
its expected value. In case the signal CS has already been completed, there
is no point of adding a wait operation. In this case, the driver will
return to the user *without* putting anything on the PQ. The return code
should reflect to the user that the signal was completed, as we won't
return a CS sequence number for this wait.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman b0b5d92579 habanalabs: handle the h/w sync object
Define a structure representing the h/w sync object (SOB).

a SOB can contain up to 2^15 values. Each signal CS will increment the SOB
by 1, so after some time we will reach the maximum number the SOB can
represent. When that happens, the driver needs to move to a different SOB
for the signal operation.

A SOB can be in 1 of 4 states:

1. Working state with value < 2^15

2. We reached a value of 2^15, but the signal operations weren't completed
yet OR there are pending waits on this signal. For the next submission, the
driver will move to another SOB.

3. ALL the signal operations on the SOB have finished AND there are no more
pending waits on the SOB AND we reached a value of 2^15 (This basically
means the refcnt of the SOB is 0 - see explanation below). When that
happens, the driver can clear the SOB by simply doing WREG32 0 to it and
set the refcnt back to 1.

4. The SOB is cleared and can be used next time by the driver when it needs
to reuse an SOB.

Per SOB, the driver will maintain a single refcnt, that will be initialized
to 1. When a signal or wait operation on this SOB is submitted to the PQ,
the refcnt will be incremented. When a signal or wait operation on this SOB
completes, the refcnt will be decremented. After the submission of the
signal operation that increments the SOB to a value of 2^15, the refcnt is
also decremented.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman ec2f8a306a habanalabs: define ASIC-dependent interface for signal/wait
This feature requires handling h/w resources which are a bit different from
one ASIC to the other. Therefore, we need to define a set of interfaces the
ASIC code provides to the common code to signal, wait, reset sync object
and to reset and init a queue.

As this feature is not supported in Goya, provide an empty implementation
of those functions.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman f9e5f29518 uapi: habanalabs: add signal/wait operations
This is a pre-requisite to upstreaming GAUDI support.

Signal/wait operations are done by the user to perform sync between two
Primary Queues (PQs). The sync is done using the sync manager and it is
usually resolved inside the device, but sometimes it can be resolved in the
host, i.e. the user should be able to wait in the host until a signal has
been completed.

The mechanism to define signal and wait operations is done by the driver
because it needs atomicity and serialization, which is already done in the
driver when submitting work to the different queues.

To implement this feature, the driver "takes" a couple of h/w resources,
and this is reflected by the defines added to the uapi file.

The signal/wait operations are done via the existing CS IOCTL, and they use
the same data structure. There is a difference in the meaning of some of
the parameters, and for that we added unions to make the code more
readable.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 824b457839 habanalabs: add missing MODULE_DEVICE_TABLE
PCI drivers should use this define to declare their PCI ID table.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00