OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Oded Gabbay	23c15ae615	habanalabs: change aggregate cs counters to atomic In case we will have multiple contexts/processes, we can't just increment aggregated counters. We need to make them atomic as they can be incremented by multiple processes Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-30 10:47:27 +02:00
Ofir Bitton	5555b7c56b	habanalabs: put devices before driver removal Driver never puts its device and control_device objects, hence a memory leak is introduced every driver removal. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-30 10:30:16 +02:00
Ofir Bitton	c8c39fbd01	habanalabs: free host huge va_range if not used If huge range is not valid, driver uses the host range also for huge page allocations, but driver never frees its allocation. This introduces a memory leak every time a user closes its context. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-30 10:30:16 +02:00
Oded Gabbay	652b44453e	habanalabs/gaudi: fix missing code in ECC handling There is missing statement and missing "break;" in the ECC handling code in gaudi.c This will cause a wrong behavior upon certain ECC interrupts. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-23 20:08:43 +02:00
Oded Gabbay	f83f3a31b2	habanalabs/gaudi: mask WDT error in QMAN This interrupt cause is not relevant because of how the user use the QMAN arbitration mechanism. We must mask it as the log explodes with it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-04 08:56:07 +02:00
Ofir Bitton	1137e1ead9	habanalabs/gaudi: move coresight mmu config We must relocate the coresight mmu configuration to the coresight flow to make it work in case the first submission is to configure the profiler. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-04 08:56:06 +02:00
Arnd Bergmann	82948e6e1d	habanalabs: fix kernel pointer type All throughout the driver, normal kernel pointers are stored as 'u64' struct members, which is kind of silly and requires casting through a uintptr_t to void* every time they are used. There is one line that missed the intermediate uintptr_t case, which leads to a compiler warning: drivers/misc/habanalabs/common/command_buffer.c: In function 'hl_cb_mmap': drivers/misc/habanalabs/common/command_buffer.c:512:44: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] 512 \| rc = hdev->asic_funcs->cb_mmap(hdev, vma, (void *) cb->kernel_address, Rather than adding one more cast, just fix the type and remove all the other casts. Fixes: `0db575350c` ("habanalabs: make use of dma_mmap_coherent") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2020-11-04 08:56:06 +02:00
Oded Gabbay	5b94d6e476	habanalabs/gaudi: use correct define for qman init There was a copy-paste error, and the wrong define was used for initializing the QMAN. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200925171415.25663-1-oded.gabbay@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-30 08:38:18 +02:00
Ofir Bitton	25121d9804	habanalabs/gaudi: configure QMAN LDMA registers properly LDMA registers are configured with a fixed value. We add new define set which gives the configuration a proper meaning. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-25 14:44:21 +03:00
Oded Gabbay	eab1f6e7b0	habanalabs: add notice of device not idle The device should be idle after a context is closed. If not, print a notice. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-25 14:44:21 +03:00
Oded Gabbay	3c3aa5dbd6	habanalabs: add debug messages for opening/closing context During debugging of error we sometimes need to know whether the error happened when a user context was open. Add debug prints when opening and closing user contexts. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-25 14:44:20 +03:00
Oded Gabbay	9e2e8fc7d6	habanalabs: release kernel context after hw_fini Some engines use resources that belong to the kernel context (e.g. MMU mappings). In case the halt-engines doesn't work properly due to H/W restriction, we need to make sure the kernel context lives on until after the hw_fini. The hw_fini resets the ASIC after that no engine is alive and we can safely close the kernel context. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-25 14:44:20 +03:00
Oded Gabbay	fc6121e961	habanalabs: correct an error message We don't try to allocate huge pages here so remove the huge word. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-25 14:44:20 +03:00
Oded Gabbay	f279e5cd95	habanalabs: update scratchpad register map Our firmware use some scratchpad registers in the device for different roles. Update the file to the latest version of the firmware code. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Oded Gabbay	57799ce9f8	habanalabs: add indication of security-enabled F/W Future F/W versions will have enhanced security measures and the driver won't be able to do certain configurations that it always did and those configurations will be done by the firmware. We use the firmware's preboot version to determine whether security measures are enabled or not. Because we need this very early in our code, the read of the preboot version is moved to the earliest possible place, right after the device's PCI initialization. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Oded Gabbay	d1f3633599	habanalabs/gaudi: fix DMA completions max outstanding to 15 This is a workaround for H/W bug H3-2116, where if there are more than 16 outstanding completions in the DMA transpose engine, there can be a deadlock in the engine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Oded Gabbay	dbf053c429	habanalabs/gaudi: remove axi drain support AXI drain is broken in GAUDI so remove support for enabling it. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Oded Gabbay	219b8f2ff0	habanalabs: update firmware interface file Add new packet to fetch PLL information from firmware. This will be needed in the future when the driver won't be able to access the PLL registers directly Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Tomer Tayar	ef6a0f6caa	habanalabs: Add an option to map CB to device MMU There are cases in which the device should access the host memory of a CB through the device MMU, and thus this memory should be mapped. The patch adds a flag to the CB IOCTL, in which a user can ask the driver to perform the mapping when creating a CB. The mapping is allowed only if a dedicated VA range was allocated for the specific ASIC. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Tomer Tayar	fa8641a14f	habanalabs: Save context in a command buffer object Future changes require using a context while handling a command buffer, and thus need to save the context in the command buffer object. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:54 +03:00
Oded Gabbay	448f63badc	habanalabs: no need for DMA_SHARED_BUFFER Now that the driver no longer uses dma_buf, we can remove the select of DMA_SHARED_BUFFER from kconfig. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Oded Gabbay	681a22f55f	habanalabs: allow to wait on CS without sleep The user sometimes wants to check if a CS has completed to clean resources. In that case, the user doesn't want to sleep but just to check if the CS has finished and continue with his code. Add a new definition to the API of the wait on CS. The new definition says that if the timeout is 0, the driver won't sleep at all but return immediately after checking if the CS has finished. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Oded Gabbay	230b9b7d45	habanalabs/gaudi: increase timeout for boot fit load The firmware running in the boot stage takes more time to execute due to increased security mechanisms. Therefore, we need to increase the timeout we wait for the boot fit to finish loading. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Moti Haimovski	214afa974d	habanalabs: add debugfs support for MMU with 6 HOPs This commit modify the existing debugfs code to support future devices that have a 6 HOPs MMU implementation instead of 5 HOPs implementation. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Moti Haimovski	7edf341b9e	habanalabs: add num_hops to hl_mmu_properties This commit adds the number of HOPs supported by the device to the device MMU properties. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Moti Haimovski	d83fe66928	habanalabs: refactor MMU as device-oriented As preparation to MMU v2, rework MMU to be device oriented instantiated according to the device in hand. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Moti Haimovski	c91324f41b	habanalabs: rename mmu.c to mmu_v1.c In the future we will have MMU v2 code, so we need to prepare the driver for it. The first step is to rename the current MMU file to mmu_v1.c. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Omer Shpigelman	7c52fb0a09	habanalabs: use smallest possible alignment for virtual addresses Change the acquiring of a device virtual address for mapping by using the smallest possible alignment, rather than the biggest, depending on the page size used by the user for allocating the memory. This will lower the virtual space memory consumption. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:53 +03:00
Oded Gabbay	1fb2f37437	habanalabs: check flag before reset because of f/w event For consistency with GAUDI code, add check of the relevant flag in the device structure before resetting the GOYA device in case of firmware event. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Oded Gabbay	5a1b861daa	habanalabs: increase PQ COMP_OFFSET by one nibble For future ASICs, we increase this field by one nibble. This field was not used by the current ASICs so this change doesn't break anything. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Ofir Bitton	763a0b4d81	habanalabs: Fix alignment issue in cpucp_info structure Because the device CPU compiler aligns structures to 8 bytes, struct cpucp_info has an alignment issue as some parts in the structure are not aligned to 8 bytes. It is preferred that we explicitly insert placeholders inside the structure to avoid confusion in order to validate this scenario, we printed both pointers: __u8 cpucp_version[VERSION_MAX_LEN]; (0xffff899c67ed4cbc) __le64 dram_size; (0xffff899c67ed4d40) we see difference of 132 bytes although the first array is only 128 bytes long, Meaning compiler added a 4 byte padding. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Oded Gabbay	ae926514dd	habanalabs: remove unused define Cleanup the code. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Oded Gabbay	b01a971f80	habanalabs: remove unused ASIC function pointer Old function pointer that was left when the call to this function pointer was removed. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Oded Gabbay	6138bbe911	habanalabs: rename ArmCP to CPU-CP There were a couple of comments where the name ArmCP was still used. Rename it to CPU-CP. In addition, rename ArmCP or ARM in log messages to "device CPU". Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Oded Gabbay	975ab7b32b	habanalabs: count dropped CS because max CS in-flight There is a case where the user reaches the maximum number of CS in-flight. In that case, the driver rejects the new CS of the user with EAGAIN. Count that event so the user can query the driver later to see if it happened. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:52 +03:00
Hillf Danton	0db575350c	habanalabs: make use of dma_mmap_coherent Add dma_mmap_coherent() for goya and gaudi to match their use of dma_alloc_coherent(), see the Link tag for why. Link: https://lore.kernel.org/lkml/20200609091727.GA23814@lst.de/ Cc: Christoph Hellwig <hch@lst.de> Cc: Zhang Li <li.zhang@bitmain.com> Cc: Ding Z Nan <oshack@hotmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Oded Gabbay	c5e0ec66f0	habanalabs: clear vm_pgoff before doing the mmap The driver use vm_pgoff to hold the CB idr handle. Before we actually call the mapping function, we need to clear the handle so there won't be any garbage left in vm_pgoff. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Oded Gabbay	3174ac9bb1	habanalabs: restructure hl_mmap Arrange the hl_mmap code to be more structured and expandable for the future. Add better defines that describe our usage of the vm_pgoff. Note that I shamelessly took the code and defines from the amdkfd driver (my previous driver). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Oded Gabbay	f763946aef	habanalabs: cast to u64 before shift > 31 bits When shifting a boolean variable by more than 31 bits and putting the result into a u64 variable, we need to cast the boolean into unsigned 64 bits to prevent possible overflow. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Oded Gabbay	2f55342c5e	habanalabs: replace armcp with the generic cpucp ArmCP mandates that the device CPU is always an ARM processor, which might be wrong in the future. Most of this change is an internal renaming of variables, functions and defines but there are two entries in sysfs which have armcp in their names. Add identical cpucp entries but don't remove yet the armcp entries. Those will be deprecated next year. Add the documentation about it in sysfs documentation. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Oded Gabbay	42b0698add	habanalabs: update GAUDI hardware specs Add define for the 2 MME slave engines. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
farah kassabri	9f3064913e	habanalabs: add support for getting device total energy Add driver implementation for reading the total energy consumption from the device ARM FW. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
Tomer Tayar	56004701f5	habanalabs: Include linux/bitfield.h only in habanalabs.h Include linux/bitfield.h only in habanalabs.h, instead of in each and every file that needs it, as habanalabs.h is already included by all. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:51 +03:00
farah kassabri	d90416c84d	habanalabs: extend busy engines mask to 64 bits change busy engines bitmask to 64 bits in order to represent more engines, needed for future ASIC support. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	107dd31465	habanalabs: use 1U when shifting bits Eliminate following warning: warning: Shifting signed 32-bit value by 31 bits is undefined behavior Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	31ac1f1a57	habanalabs: check TPC vector pipe is empty The driver waits for the TPC vector pipe to be empty before checking if the TPC kernel has finished executing, but the code doesn't validate that the pipe was indeed empty, it just wait for it without checking the return value. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	0358372bbe	habanalabs: remove redundant assignment to variable new_dma_pkt->ctl is assigned a value and then is reassigned a new value without the first value ever being used. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	65887291c6	habanalabs: use FIELD_PREP() instead of << Use the standard FIELD_PREP() macro instead of << operator to perform bitmask operations. This ensures type check safety and eliminate compiler warnings. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	a0e072f5a1	habanalabs: use standard BIT() and GENMASK() Use the standard macros to define bitmasks. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	bd4ef37292	habanalabs: eliminate redundant else condition If both parts of if-else are goto statements, we can remove the else and put the else goto statement after the if statement. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:50 +03:00
Oded Gabbay	f907af183b	habanalabs: cast int to u32 before printing it with %u %u is used for unsigned so we need to cast the int variable to u32 to avoid compiler warning. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Oded Gabbay	f5b9c8cf25	habanalabs: change CB's ID to be 64 bits Although the possible values for CB's ID are only 32 bits, there are a few places in the code where this field is shifted and passed into a function which expects 64 bits. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Dotan Barak	d6b045c083	habanalabs: print the queue id in case of an error If there is a failure during the testing of a queue, to ease up debugging - print the queue id. Signed-off-by: Dotan Barak <dbarak@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
farah kassabri	acd330c141	habanalabs: remove security from ARB_MST_QUIET register Allow user application to write to this register in order to be able to configure the quiet period of the QMAN between grants. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Ofir Bitton	2e5eda4681	habanalabs: PCIe Advanced Error Reporting support driver will now get notified upon any PCI error occurred and will respond according to the severity of the error. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Ofir Bitton	843839bec3	habanalabs: expose sync manager resources allocation in INFO IOCTL Although the driver defines the first user-available sync manager object and monitor in habanalabs.h, we would like to also expose this information via the INFO IOCTL so the runtime can get this information dynamically. This is because in future ASICs we won't need to define it statically. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Ofir Bitton	0a068adde5	habanalabs: add information about PCIe controller Update firmware header with new API for getting pcie info such as tx/rx throughput and replay counter. These counters are needed by customers for monitor and maintenance of multiple devices. Add new opcodes to the INFO ioctl to retrieve these counters. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:49 +03:00
Ofir Bitton	a98d73c7fa	habanalabs: Replace dma-fence mechanism with completions habanalabs driver uses dma-fence mechanism for synchronization. dma-fence mechanism was designed solely for GPUs, hence we purpose a simpler mechanism based on completions to replace current dma-fence objects. Signed-off-by: Ofir Bitton <obitton@habana.ai> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-09-22 18:49:48 +03:00
Oded Gabbay	b71590efb2	habanalabs: increase length of ASIC name Future ASIC names are longer than 15 chars so increase the variable length to 32 chars. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-09-22 18:49:48 +03:00
Ofir Bitton	69c6e18d0c	habanalabs: fix report of RAZWI initiator coordinates All initiator coordinates received upon an 'MMU page fault RAZWI event' should be the routers coordinates, the only exception is the DMA initiators for which the reported coordinates correspond to their actual location. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-31 15:10:27 +03:00
Moti Haimovski	6396feabf7	habanalabs: prevent user buff overflow This commit fixes a potential debugfs issue that may occur when reading the clock gating mask into the user buffer since the user buffer size was not taken into consideration. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-31 15:10:27 +03:00
Ofir Bitton	5aba368893	habanalabs: correctly report inbound pci region cfg error During inbound iATU configuration we can get errors while configuring PCI registers, there is a certain scenario in which these errors are not reflected and driver is loaded with wrong configuration. Signed-off-by: Ofir Bitton <obitton@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:58 +03:00
Ofir Bitton	0839152f8c	habanalabs: check correct vmalloc return code vmalloc can return different return code than NULL and a valid pointer. We must validate it in order to dereference a non valid pointer. Signed-off-by: Ofir Bitton <obitton@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:58 +03:00
Ofir Bitton	bce382a8bb	habanalabs: validate FW file size We must validate FW size in order not to corrupt memory in case a malicious FW file will be present in system. Signed-off-by: Ofir Bitton <obitton@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:58 +03:00
Colin Ian King	804d057cfa	habanalabs: fix incorrect check on failed workqueue create The null check on a failed workqueue create is currently null checking hdev->cq_wq rather than the pointer hdev->cq_wq[i] and so the test will never be true on a failed workqueue create. Fix this by checking hdev->cq_wq[i]. Addresses-Coverity: ("Dereference before null check") Fixes: `5574cb2194` ("habanalabs: Assign each CQ with its own work queue") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:58 +03:00
Oded Gabbay	58361aae4b	habanalabs: set max power according to card type In Gaudi, the default max power setting is different between PCI and PMC cards. Therefore, the driver need to set the default after knowing what is the card type. The current code has a bug where it limits the maximum power of the PMC card to 200W after a reset occurs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Ofir Bitton	36545279f0	habanalabs: proper handling of alloc size in coresight Allocation size can go up to 64bit but truncated to 32bit, we should make sure it is not truncated and validate no address overflow. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Ofir Bitton	f44d23b909	habanalabs: set clock gating according to mask Once clock gating is set we enable clock gating according to mask, we should also disable clock gating according to relevant bits. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Ofir Bitton	1cff119740	habanalabs: verify user input in cs_ioctl_signal_wait User input must be validated before using it to access internal structures. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Dan Carpenter	b0353540ff	habanalabs: Fix a loop in gaudi_extract_ecc_info() The condition was reversed. It should have been less than instead of greater than. The result is that we never enter the loop. Fixes: `fcc6a4e606` ("habanalabs: Extract ECC information from FW") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Dan Carpenter	eeec23cd32	habanalabs: Fix memory corruption in debugfs This has to be a long instead of a u32 because we write a long value. On 64 bit systems, this will cause memory corruption. Fixes: `c216477363` ("habanalabs: add debugfs support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Ofir Bitton	bc75be24fa	habanalabs: validate packet id during CB parse During command buffer parsing, driver extracts packet id from user buffer. Driver must validate this packet id, since it is being used in order to extract information from internal structures. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:57 +03:00
Ofir Bitton	bf6d10963e	habanalabs: Validate user address before mapping User address must be validated before driver performs address map. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:56 +03:00
Ofir Bitton	f1aae40e8d	habanalabs: unmap PCI bars upon iATU failure In case the driver fails to configure the PCI controller iATU, it needs to unmap the PCI bars before exiting so if the driver is removed, the bars won't be left mapped. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-08-22 12:47:56 +03:00
Wei Yongjun	22362aa30b	habanalabs: remove unused but set variable 'ctx_asid' Gcc report warning as follows: drivers/misc/habanalabs/common/command_submission.c:373:6: warning: variable 'ctx_asid' set but not used [-Wunused-but-set-variable] 373 \| int ctx_asid, rc; \| ^~~~~~~~ This variable is not used in function cs_timedout(), this commit remove it to fix the warning. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20200729155902.33976-1-weiyongjun1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-29 18:02:21 +02:00
kernel test robot	bb34bf798c	habanalabs: goya_ctx_init() can be static Signed-off-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20200729000313.GA14680@e442e3f624c4 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-29 09:46:05 +02:00
Greg Kroah-Hartman	7b16a15524	habanalabs: fix up absolute include instructions There's no need to try to be cute with the include file locations in the Makefile, so just specify exactly where the files are. Bonus is this fixes the problem of building with O= as well as trying to just build the subdirectory alone. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Omer Shpigelman <oshpigelman@habana.ai> Cc: Tomer Tayar <ttayar@habana.ai> Cc: Moti Haimovski <mhaimovski@habana.ai> Cc: Ofir Bitton <obitton@habana.ai> Cc: Ben Segal <bpsegal20@gmail.com> Cc: Christine Gharzuzi <cgharzuzi@habana.ai> Cc: Pawel Piskorski <ppiskorski@habana.ai> Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-29 08:15:50 +02:00
Greg Kroah-Hartman	65a9bde6ed	Linux 5.8-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G 171s5Hs= =BQIl -----END PGP SIGNATURE----- Merge 5.8-rc7 into char-misc-next This should resolve the merge/build issues reported when trying to create linux-next. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-27 11:49:37 +02:00
Tomer Tayar	94f8be9eb0	habanalabs: Fix memory leak in error flow of context initialization Add a missing free of the cs_pending array in the error flow of context initialization. Fixes: `c16d45f42b` ("habanalabs: Use pending CS amount per ASIC") Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:40:06 +03:00
Tomer Tayar	644883ef1a	habanalabs: use no flags on MMU cache invalidation gaudi_mmu_invalidate_cache() doesn't use the flags parameter, and thus it can be set to 0 when the function is called in the gaudi only files. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:37 +03:00
Oded Gabbay	8df8cb1efc	habanalabs: enable device before hw_init() Device is now enabled before the hw_init() because part of the initialization requires communication with the device firmware to get information that is required for the initialization itself Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-24 20:31:37 +03:00
Ofir Bitton	a04b7cd97e	habanalabs: create internal CB pool Create a device MMU-mapped internal command buffer pool, in order to allow the driver to allocate CBs for the signal/wait operations that are fetched by the queues when they are configured with the user's address space ID. We must pre-map this internal pool due to performance issues. This pool is needed for future ASIC support and it is currently unused in GOYA and GAUDI. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:37 +03:00
Oded Gabbay	eb8b293e79	habanalabs: update hl_boot_if.h from firmware Update the boot interface file from the latest version from firmware. Defines for secure boot were added. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2020-07-24 20:31:37 +03:00
Oded Gabbay	70b2f993ea	habanalabs: create common folder For internal needs of our CI we need to move all the common code into a common folder instead of putting them in the root folder of the driver. Same applies to the common header files under include/ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2020-07-24 20:31:37 +03:00
Moti Haimovski	a9855a2d91	habanalabs: check for DMA errors when clearing memory In GAUDI we use QMAN0 DMA for clearing the MMU memory region at initialization. if this operation fails it places the DMA in an error state and then when trying to initialize QMAN0 we fail and erroneously assume its the QMAN that failed. This commit adds a check and clear of such DMA errors at initialization so we will have a better understanding of what went wrong. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:37 +03:00
Ofir Bitton	22cb855598	habanalabs: verify queue can contain all cs jobs In order for the user to be aware of wrong inputs, we must return error in case the amount of jobs per cs exceeds the corresponding queue size. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:37 +03:00
Ofir Bitton	5574cb2194	habanalabs: Assign each CQ with its own work queue We identified a possible race during job completion when working with a single multi-threaded work queue. In order to overcome this race we suggest using a single threaded work queue per completion queue, hence we guarantee jobs completion in order. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:37 +03:00
Oded Gabbay	c83c417193	habanalabs: halt device CPU only upon certain reset Currently the driver halts the device CPU in the halt engines function, which halts all the engines of the ASIC. The problem is that if later on we stop the reset process (due to inability to clean memory mappings in time), the CPU will remain in halt mode. This creates many issues, such as thermal/power control and FLR handling. Therefore, move the halting of the device CPU to the very end of the reset process, just before writing to the registers to initiate the reset. In addition, the driver now needs to send a message to the device F/W to disable it from sending interrupts to the host machine because during halt engines function the driver disables the MSI/MSI-X interrupts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-24 20:31:36 +03:00
Omer Shpigelman	9158c47e20	habanalabs: remove unused hash Remove an old hash that is not in use anymore. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Ofir Bitton	79b1894c41	habanalabs: use queue pi/ci in order to determine queue occupancy Instead of using the free slots amount on the compute CQ to determine whether we can submit work to queues, use the queues pi/ci. This is needed in future ASICs where we don't have CQ per queue. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Ofir Bitton	3abc99bb7d	habanalabs: configure maximum queues per asic Currently the amount of maximum queues is statically configured. Using a static value is causing redundunt cycles when traversing all queues and consumes more memory than actually needed. In this patch we configure each asic with the exact number of queues needed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Oded Gabbay	12ae3133d2	habanalabs: remove soft-reset support from GAUDI Soft-reset isn't supported in GAUDI. Remove the code that performs it and print error in case the user wants to do it via sysfs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-24 20:31:36 +03:00
Ofir Bitton	f4cbfd2445	habanalabs: PCIe iATU refactoring Divide iATU initialization into inbound/outbound methods. We must separate it in order to enable different match mode per PCIe region. In addition, added support for PCI address match mode. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Oded Gabbay	fcc6a4e606	habanalabs: Extract ECC information from FW ECC (Error Correcting Code) interrupts are going to be handled by the FW. Hence, we define an interface in which the driver can obtain the relevant ECC information. This information is needed for monitoring and can also lead to a hard reset if ECC error is not correctable. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Ofir Bitton	db491e4f08	habanalabs: Add dropped cs statistics info struct Add command submission statistics structure which can be obtained through the info ioctl. Each drop counter describes the reason for which the command submission was dropped. This information is needed for the user to be aware of the specific reason for which the submitted work was dropped. The user can then utilize the driver more efficiently. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:36 +03:00
Christine Gharzuzi	c8f9b49d2d	habanalabs: extract cpu boot status lookup Extract detection of the cpu boot status to a function to allow code reuse Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:35 +03:00
Oded Gabbay	0eab4f89d6	habanalabs: rephrase error messages rephrase some error/warning/notice messages to make them more accessible to ordinary users. There is no need to print context ASID as the driver currently doesn't support multiple contexts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-24 20:31:35 +03:00
Ofir Bitton	dd9efabd0a	habanalabs: Increase queues depth After recent concurrent cs amount increase, we must also increase queues depth since much more concurrent work can be done. All external queue depths were increased to 4096 as gaudi's internal queue depths were also increased to 1024. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:35 +03:00
Omer Shpigelman	917b79b096	habanalabs: rephrase error message Rephrase F/W error message to make it more understandable to ordinary users. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:35 +03:00
Adam Aharon	e8edded693	habanalabs: calculate trace frequency from PLL The profiler needs to know the PLL values for correctly showing the profiling data. Because our firmware can use different PLL configurations, we need to read the PLL values from the ASIC to pass them to the profiler. Signed-off-by: Adam Aharon <aaharon@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:35 +03:00
Oded Gabbay	6ced91170d	habanalabs: align armcp_packet structure to 8 bytes Once there is a 64-bit field in a structure, GCC compiler for ARM aligns the structure to 8 bytes. In order to avoid confusion when these structures are being passed between CPUs from different architectures, we explicitly align the structure to 8 bytes. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:35 +03:00
Ofir Bitton	6c07bab34b	habanalabs: Use mask instead of shift in sync stream registers Use proper bitfield masks instead of shifting values when configuring packets sent to device. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:34 +03:00
Ofir Bitton	21e7a34634	habanalabs: sync stream generic functionality Currently sync stream is limited only for external queues. We want to remove this constraint by adding a new queue property dedicated for sync stream. In addition we move the initialization and reset methods to the common code since we can re-use them with slight changes. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:34 +03:00
Ofir Bitton	c16d45f42b	habanalabs: Use pending CS amount per ASIC Training schemes requires much more concurrent command submissions than inference does. In addition, training command submissions can be completed in a non serialized manner. Hence, we add support in which each ASIC will be able to configure the amount of concurrent pending command submissions, rather than use a predefined amount. This change will enhance performance by allowing the user to add more concurrent work without waiting for the previous work to be completed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:34 +03:00
Oded Gabbay	0b168c8f1d	habanalabs: remove rate limiters from GAUDI We no longer need to initialize the rate limiters in GAUDI A1. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-07-24 20:31:34 +03:00
Oded Gabbay	cea7a0449e	habanalabs: prevent possible out-of-bounds array access Queue index is received from the user. Therefore, we must validate it before using it to access the queue props array. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-19 08:15:36 +03:00
Oded Gabbay	788cacf308	habanalabs: set 4s timeout for message to device CPU We see that sometimes the CPU in GOYA and GAUDI is occupied by the power/thermal loop and can't answer requests from the driver fast enough. Therefore, to avoid false notifications on timeouts, increase the timeout to 4 seconds on each message sent to the device CPU. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-07-10 19:53:03 +03:00
Oded Gabbay	e38bfd30e0	habanalabs: set clock gating per engine For debugging purposes, we need to allow the root user better control of the clock gating feature of the DMA and compute engines. Therefore, change the clock gating debugfs interface to be bitmask instead of true/false. Each bit represents a different engine, according to gaudi_engine_id enum. See debugfs documentation for more details. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2020-07-10 19:53:03 +03:00
Oded Gabbay	2edc66e22b	habanalabs: block WREG_BULK packet on PDMA WREG_BULK is a special packet that has a variable length. Therefore, we can't parse it when validating CBs that go to the PCI DMA queue. In case the user needs to use it, it can put multiple WREG32 packets instead. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2020-07-10 19:53:03 +03:00
Lee Jones	14395c6fb1	misc: habanalabs: gaudi: gaudi_security: Repair incorrectly named function arg gaudi_pb_set_block()'s argument 'base' was incorrectly named 'block' in its function header. Fixes the following W=1 kernel build warning(s): drivers/misc/habanalabs/gaudi/gaudi_security.c:454: warning: Function parameter or member 'base' not described in 'gaudi_pb_set_block' drivers/misc/habanalabs/gaudi/gaudi_security.c:454: warning: Excess function parameter 'block' description in 'gaudi_pb_set_block' Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-11-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:37 +02:00
Lee Jones	f7d227c306	misc: habanalabs: gaudi: Remove ill placed asterisk from kerneldoc header W=1 kernel builds report a lack of description of gaudi_set_asic_funcs()'s 'hdev' argument. In reality it is documented, but the formatting was not as expected '@.*:'. Instead, there was a misplaced asterisk which was confusing the kerneldoc validator. Squashes the following W=1 warning: drivers/misc/habanalabs/gaudi/gaudi.c:6746: warning: Function parameter or member 'hdev' not described in 'gaudi_set_asic_funcs' Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-10-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:37 +02:00
Lee Jones	67db05cea6	misc: habanalabs: goya: goya_coresight: Remove set but unused variable 'val' No attempt to check the return value of RREG32() has been made since the call was introduced a year ago. Fixes W=1 kernel build warning: drivers/misc/habanalabs/goya/goya_coresight.c: In function ‘goya_debug_coresight’: drivers/misc/habanalabs/goya/goya_coresight.c:643:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable] 643 \| u32 val; \| ^~~ Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-9-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:37 +02:00
Lee Jones	e0712c600e	misc: habanalabs: pci: Scrub documentation for non-present function argument 'dma_mask' is not passed directly into hl_pci_set_dma_mask() as an argument. Instead, it is pulled from struct hl_device *hdev. Fixed the following W=1 warning: drivers/misc/habanalabs/pci.c:328: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-8-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:36 +02:00
Lee Jones	2557f27fd4	misc: habanalabs: goya: Omit pointless check ensuring addr is >=0 Seeing as 'addr' is unsigned, it would be impossible for the assigned value to be anything other than zero or positive. Squashes the following W=1 warnings: drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read32’: drivers/misc/habanalabs/goya/goya.c:3945:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 3945 \| } else if ((addr >= DRAM_PHYS_BASE) && \| ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write32’: drivers/misc/habanalabs/goya/goya.c:4002:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4002 \| } else if ((addr >= DRAM_PHYS_BASE) && \| ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read64’: drivers/misc/habanalabs/goya/goya.c:4047:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4047 \| } else if ((addr >= DRAM_PHYS_BASE) && \| ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write64’: drivers/misc/habanalabs/goya/goya.c:4091:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4091 \| } else if ((addr >= DRAM_PHYS_BASE) && \| ^~ drivers/misc/habanalabs/pci.c:328: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask' drivers/misc/habanalabs/goya/goya_coresight.c: In function ‘goya_debug_coresight’: drivers/misc/habanalabs/goya/goya_coresight.c:643:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable] 643 \| u32 val; \| ^~~ Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-7-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:36 +02:00
Lee Jones	3db99f000b	misc: habanalabs: irq: Repair kerneldoc formatting issues W=1 kernel builds report a lack of descriptions for various function arguments. In reality they are documented, but the formatting was not as expected '@.:'. Instead, '-'s were used as separators. While we're here, the headers for functions various functions were written in kerneldoc format, but lack the kerneldoc identifier '/*'. Let's promote them so they can gain access to the checker. This change fixes the following W=1 warnings: drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'eq_work' not described in 'hl_eqe_work' drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'hdev' not described in 'hl_eqe_work' drivers/misc/habanalabs/irq.c:24: warning: Function parameter or member 'eq_entry' not described in 'hl_eqe_work' Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-6-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:36 +02:00
Lee Jones	df123c9dcd	misc: habanalabs: pci: Fix a variety of kerneldoc issues hl_pci_bars_map() has a miss-typed argument name in the function description. hl_pci_elbi_write() was missing documented arguments. The headers for functions hl_pci_bars_unmap(), hl_pci_elbi_write() and hl_pci_reset_link_through_bridge() were written in kerneldoc format, but lack the kerneldoc identifier '/**'. Let's promote them so they can gain access to the checker. These changes fix the following W=1 kernel build warnings: drivers/misc/habanalabs/pci.c:27: warning: Function parameter or member 'name' not described in 'hl_pci_bars_map' drivers/misc/habanalabs/pci.c:27: warning: Excess function parameter 'bar_name' description in 'hl_pci_bars_map' drivers/misc/habanalabs/pci.c:147: warning: Function parameter or member 'addr' not described in 'hl_pci_iatu_write' drivers/misc/habanalabs/pci.c:147: warning: Function parameter or member 'data' not described in 'hl_pci_iatu_write' drivers/misc/habanalabs/pci.c:324: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask' Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-5-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:36 +02:00
Lee Jones	a0c11b3c91	misc: habanalabs: firmware_if: Add missing 'fw_name' and 'dst' entries to function header Looks as though documentation for these function arguments have been missing since the driver's inception last year. Fixes the following W=1 kernel build warnings: drivers/misc/habanalabs/firmware_if.c:26: warning: Function parameter or member 'fw_name' not described in 'hl_fw_load_fw_to_device' drivers/misc/habanalabs/firmware_if.c:26: warning: Function parameter or member 'dst' not described in 'hl_fw_load_fw_to_device' Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-4-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-01 15:05:36 +02:00
Lee Jones	9eea2a499f	misc: habanalabs: irq: Add missing struct identifier for 'struct hl_eqe_work' In kerneldoc format, data structures have to start with 'struct' else the kerneldoc tooling/parsers/validators get confused. Squashes the following W=1 warning: drivers/misc/habanalabs/irq.c:19: warning: cannot understand function prototype: 'struct hl_eqe_work ' Cc: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20200626130525.389469-10-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-29 18:45:53 +02:00
Omer Shpigelman	ce04326edd	habanalabs: increase h/w timer when checking idle In GAUDI the current timer value for the hardware to check if it is in IDLE state is too low. As a result, there are occasions where the H/W wrongly reports it is not IDLE. The driver checks that before submitting work on behalf of the driver during initialization, so a false report might cause the driver to fail during device initialization. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 12:35:23 +03:00
Ofir Bitton	3292055c85	habanalabs: Correct handling when failing to enqueue CB The fence release flow is different if the CS was never submitted. In that case, we don't have an hw_sob object attached that we need to "put". While if the CS was aborted, we do need to "put" the hw_sob. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 09:09:10 +03:00
Oded Gabbay	647e835e67	habanalabs: increase GAUDI QMAN ARB WDT timeout The current timeout is too low for some of the workloads and we see false errors as a result. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 09:09:10 +03:00
Oded Gabbay	dd2fde1093	habanalabs: rename mmu_write() to mmu_asid_va_write() The function name conflicts with a static inline function in arch/m68k/include/asm/mcfmmu.h Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 09:09:10 +03:00
Omer Shpigelman	cfd4176dc0	habanalabs: use PI in MMU cache invalidation The PS flow for MMU cache invalidation caused timeouts in stress tests. Use PS + PI flow so no timeouts should happen whatsoever. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 09:09:10 +03:00
Oded Gabbay	64536abc62	habanalabs: block scalar load_and_exe on external queue In Gaudi, the user can't execute scalar load_and_exe on external queue because it can be a security hole. The driver doesn't parse the commands being loaded and it can be msg_prot, which the user isn't allowed to use. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-06-24 09:09:10 +03:00
Oded Gabbay	05c8a4fc44	habanalabs: correctly cast u64 to void* Use the u64_to_user_ptr(x) kernel macro to correctly cast u64 to void* Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200601065648.8775-2-oded.gabbay@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-01 09:08:10 +02:00
Tomer Tayar	c68f1baeaf	habanalabs: initialize variable to default value Fix the following smatch error in unmap_device_va(): error: uninitialized symbol 'rc'. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200601065648.8775-1-oded.gabbay@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-01 09:08:10 +02:00
Omer Shpigelman	8ff5f4fd40	habanalabs: handle MMU cache invalidation timeout MMU cache invalidation timeout indicates that the device is unstable and therefore unusable. Hence in such case do hard reset and return an error to the user if was called from ioctl. In addition, change the print to error level and rephrase its text. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:17:57 +03:00
Omer Shpigelman	36fafe87ed	habanalabs: don't allow hard reset with open processes When the MMU is heavily used by the engines, unmapping might take a lot of time due to a full MMU cache invalidation done as part of the unmap flow. Hence we might not be able to kill all open processes before going to hard reset the device, as it involves unmapping of all user memory. In case of a failure in killing all open processes, we should stop the hard reset flow as it might lead to a kernel crash - one thread (killing of a process) is updating MMU structures that other thread (hard reset) is freeing. Stopping a hard reset flow leaves the device as nonoperational and the user can then initiate a hard reset via sysfs to reinitialize the device. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Oded Gabbay	66446820df	habanalabs: GAUDI does not support soft-reset GAUDI does not support soft-reset as it leaves the NIC ports in an awkward state, where their QMANs were reset but the NIC itself is still working. In addition, there is not much sense in doing soft-reset when training is done on multiple GAUDIs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-05-25 08:15:33 +03:00
Omer Shpigelman	d798507988	habanalabs: add print for soft reset due to event Print the event name that caused the soft reset. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Omer Shpigelman	42d0b0b95f	habanalabs: improve MMU cache invalidation code A new sequence is introduced to invalidate the MMU cache in order to avoid timeouts. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Daniel Vetter	ed65bfd9fd	habanalabs: don't set default fence_ops->wait It's the default. Also so much for "we're not going to tell the graphics people how to review their code", dma_fence is a pretty core piece of gpu driver infrastructure. And it's very much uapi relevant, including piles of corresponding userspace protocols and libraries for how to pass these around. Would be great if habanalabs would not use this (from a quick look it's not needed at all), since open source the userspace and playing by the usual rules isn't on the table. If that's not possible (because it's actually using the uapi part of dma_fence to interact with gpu drivers) then we have exactly what everyone promised we'd want to avoid. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Rachel Stahl	87eaea1cf8	habanalabs: update patched_cb_size for Wreg32 The patch_cb_size is not updated for Wreg32 in its validate function, so updated in goya_validate_cb. Signed-off-by: Rachel Stahl <rstahl@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Ofir Bitton	ebd8d12251	habanalabs: move event handling to common firmware file Instead of writing similar event handling code for each ASIC, move the code to the common firmware file. This code will be used for GAUDI and all future ASICs. In addition, add two new fields to the auto-generated events file: valid and description. This will save the need to manually write the events description in the source code and simplify the code. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	af57cb81a6	habanalabs: enable gaudi code in driver Enable the GAUDI ASIC code in the pci probe callback of the driver so the driver will handle GAUDI ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	79fc7a9fff	habanalabs: add gaudi profiler module Add the GAUDI code to initialize the ASIC's profiler. The profile receives its initialization values from the user, same as in Goya, but the code to initialize is in the driver because the configuration space of the device is not directly exposed to the user. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	3a3a5bf196	habanalabs: add gaudi security module Add the code to initialize the security module of GAUDI. Similar to Goya, we have two dedicated mechanisms for security: Range Registers and Protection bits. Those mechanisms protect sensitive memory and configuration areas inside the device. In addition, in Gaudi we moved to a 3-level security scheme, where the F/W runs with the highest security level (Privileged), the driver runs with a less secured level (Secured) and the user is neither privileged nor secured. The security module in the driver configures the Secured parts so the user won't be able to access them. The Privileged parts are configured by the F/W. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	bcaf415204	habanalabs: add hwmgr module for gaudi The hwmgr module is responsible for messages sent to GAUDI F/W that are not common to all habanalabs ASICs. In GAUDI, we provide the user a simplified mode of controlling the ASIC clock frequency. Instead of three different clocks, we present a single clock property that the user can configure via sysfs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	ac0ae6a96a	habanalabs: add gaudi asic-dependent code Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function callbacks that the driver's common code need to initialize, finalize and submit workloads to the GAUDI ASIC. It also contains the code to initialize the F/W of the GAUDI ASIC and to receive events from the F/W. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	2aad2bf81c	habanalabs: add gaudi asic registers header files Add the relevant GAUDI ASIC registers header files. These files are generated automatically from a tool maintained by the VLSI engineers. There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, we copied the relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	fca72fbb66	habanalabs: get card type, location from F/W For Gaudi the driver gets two new additional properties from the F/W: 1. The card's type - PCI or PMC 2. The card's location in the Gaudi's box (relevant only for PMC). The card's location is also passed to the user in the HW IP info structure as it needs this property for establishing communication between Gaudis. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	ca62433f53	habanalabs: support clock gating enable/disable In Gaudi there is a feature of clock gating certain engines. Therefore, add this property to the device structure. In addition, due to a limitation of this feature, the driver needs to dynamically enable or disable this feature during run-time. Therefore, add ASIC interface functions to enable/disable this function from the common code. Moreover, this feature must be turned off when the user wishes to debug the ASIC by reading/writing registers and/or memory through the driver's debugfs. Therefore, add an option to enable/disable clock gating via the debugfs interface. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	803917f960	habanalabs: set PM profile to auto only for goya For Gaudi, the driver doesn't change the PM profile automatically due to device-controlled PM capabilities. Therefore, set the PM profile to auto only for Goya so the driver's code to automatically change the profile won't run on Gaudi. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	e09498b078	habanalabs: add dedicated define for hard reset Gaudi requires longer waiting during reset due to closing of network ports. Add this explanation to the relevant comment in the code and add a dedicated define for this reset timeout period, instead of multiplying another define. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	9e5e49cd5b	habanalabs: check if CoreSight is supported Coresight is not supported on simulator, therefore add a boolean for checking that (currently used by un-upstreamed code). Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	b75f22505a	habanalabs: add signal/wait to CS IOCTL operations Add the following two operations to the CS IOCTL: Signal: The signal operation is basically a command submission, that is created by the driver upon user request. It will be implemented using a dedicated PQE that will increment a specific SOB. There will be a new flag: HL_CS_FLAGS_SIGNAL. When the user set this flag in the CS IOCTL structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user only needs to provide a queue index on which to put the signal. Wait: The wait operation is also a command submission that is created by the driver upon user request. It will be implemented using a dedicated PQE that will contain packets of "ARM a monitor" + FENCE packet. There will be a new flag: HL_CS_FLAGS_WAIT. When the user set this flag in the CS structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user needs to provide the following parameters: 1. queue ID 2. an array of signal_seq numbers and the number of signals to wait on (the length of signal_seq_arr). The IOCTL will return the CS sequence number of the wait it put on the queue ID. Currently, the code supports signal_seq_nr==1. But this API definition will allow us to put a single PQE that waits on multiple signals. To correctly configure the monitor and fence, the driver will need to retrieve the specified signal CS object that contains the relevant SOB and its expected value. In case the signal CS has already been completed, there is no point of adding a wait operation. In this case, the driver will return to the user without putting anything on the PQ. The return code should reflect to the user that the signal was completed, as we won't return a CS sequence number for this wait. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	b0b5d92579	habanalabs: handle the h/w sync object Define a structure representing the h/w sync object (SOB). a SOB can contain up to 2^15 values. Each signal CS will increment the SOB by 1, so after some time we will reach the maximum number the SOB can represent. When that happens, the driver needs to move to a different SOB for the signal operation. A SOB can be in 1 of 4 states: 1. Working state with value < 2^15 2. We reached a value of 2^15, but the signal operations weren't completed yet OR there are pending waits on this signal. For the next submission, the driver will move to another SOB. 3. ALL the signal operations on the SOB have finished AND there are no more pending waits on the SOB AND we reached a value of 2^15 (This basically means the refcnt of the SOB is 0 - see explanation below). When that happens, the driver can clear the SOB by simply doing WREG32 0 to it and set the refcnt back to 1. 4. The SOB is cleared and can be used next time by the driver when it needs to reuse an SOB. Per SOB, the driver will maintain a single refcnt, that will be initialized to 1. When a signal or wait operation on this SOB is submitted to the PQ, the refcnt will be incremented. When a signal or wait operation on this SOB completes, the refcnt will be decremented. After the submission of the signal operation that increments the SOB to a value of 2^15, the refcnt is also decremented. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	ec2f8a306a	habanalabs: define ASIC-dependent interface for signal/wait This feature requires handling h/w resources which are a bit different from one ASIC to the other. Therefore, we need to define a set of interfaces the ASIC code provides to the common code to signal, wait, reset sync object and to reset and init a queue. As this feature is not supported in Goya, provide an empty implementation of those functions. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	f9e5f29518	uapi: habanalabs: add signal/wait operations This is a pre-requisite to upstreaming GAUDI support. Signal/wait operations are done by the user to perform sync between two Primary Queues (PQs). The sync is done using the sync manager and it is usually resolved inside the device, but sometimes it can be resolved in the host, i.e. the user should be able to wait in the host until a signal has been completed. The mechanism to define signal and wait operations is done by the driver because it needs atomicity and serialization, which is already done in the driver when submitting work to the different queues. To implement this feature, the driver "takes" a couple of h/w resources, and this is reflected by the defines added to the uapi file. The signal/wait operations are done via the existing CS IOCTL, and they use the same data structure. There is a difference in the meaning of some of the parameters, and for that we added unions to make the code more readable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	824b457839	habanalabs: add missing MODULE_DEVICE_TABLE PCI drivers should use this define to declare their PCI ID table. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Dotan Barak	0a62c3926e	habanalabs: print all CB handles as hex numbers Make all the CB handles printed in the same way and not some as decimal and some as hex numbers. Signed-off-by: Dotan Barak <dbarak@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	010a118cfe	habanalabs: update F/W register map Update the mapping to the latest one used by the Firmware. No impact on the driver in this update. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Adam Aharon	aa9dd58bcc	habanalabs: enable trace data compression (profiler) Set the STMTCSR.COMPEN bit to enable leading-zero trace data compression functionality for the extended stimulus ports. Signed-off-by: Adam Aharon <aaharon@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Ofir Bitton	47f6b41cdd	habanalabs: load CPU device boot loader from host Load CPU device boot loader during driver boot time in order to avoid flash write for every boot loader update. To preserve backward-compatibility, skip the device boot load if the device doesn't request it. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	39b425170d	habanalabs: leave space for 2xMSG_PROT in CB The user must leave space for 2xMSG_PROT in the external CB, so adjust the define of max size accordingly. The driver, however, can still create a CB with the maximum size of 2MB. Therefore, we need to add a check specifically for the user requested size. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Christine Gharzuzi	8e708af284	habanalabs: support hwmon_reset_history attribute Support hwmon_temp_reset_histroy, hwmon_in_reset_history and hwmon_curr_reset attribute which resets the historical highest value. Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	79c823c57e	habanalabs: Align protection bits configuration of all TPCs Align the protection bits configuration of all TPC cores to be as of TPC core 0. Fixes: `a513f9a7ec` ("habanalabs: make tpc registers secured") Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	eef544f746	habanalabs: Allow access to TPC LFSR register Allow user access to TPC LFSR register, as it might be accessed by TPC kernels. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	25e7aeba60	habanalabs: Add INFO IOCTL opcode for time sync information Add a new opcode to the INFO IOCTL that retrieves the device time alongside the host time, to allow a user application that want to measure device time together with host time (such as a profiler) to synchronize these times. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
kbuild test robot	ba7193c952	habanalabs: hl_pci_set_dma_mask() can be static set function to be static as it is not called from outside its file. Signed-off-by: kbuild test robot <lkp@intel.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	926ba4cce1	habanalabs: handle barriers in DMA QMAN streams When we have DMA QMAN with multiple streams, we need to know whether the command buffer contains at least one DMA packet in order to configure the barriers correctly when adding the 2xMSG_PROT at the end of the JOB. If there is no DMA packet, then there is no need to put engine barrier. This is relevant only for GAUDI as GOYA doesn't have streams so the engine can't be busy by another stream. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	cb056b9fd5	habanalabs: retrieve DMA mask indication from firmware Retrieve from the firmware the DMA mask value we need to set according to the device's PCI controller configuration. This is needed when working on POWER9 machines, as the device's PCI controller is configured in a different way in those machines. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	c8aee597bb	habanalabs: update firmware definitions Add comments for the various errors and states of the firmware during boot. Add a mapping of a new register that will tell the driver whether the firmware executed the request from the driver or if it has encountered an error. Add a new enum for the possible values of this register. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	7a65ee046b	habanalabs: increase timeout during reset When doing training, the DL framework (e.g. tensorflow) performs hundreds of thousands of memory allocations and mappings. In case the driver needs to perform hard-reset during training, the driver kills the application and unmaps all those memory allocations. Unfortunately, because of that large amount of mappings, the driver isn't able to do that in the current timeout (5 seconds). Therefore, increase the timeout significantly to 30 seconds to avoid situation where the driver resets the device with active mappings, which sometime can cause a kernel bug. BTW, it doesn't mean we will spend all the 30 seconds because the reset thread checks every one second if the unmap operation is done. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	49aba0bbab	habanalabs: print warning when reset is requested When the system administrator asks the driver to soft or hard reset the device through sysfs, the driver should display a warning in the kernel log to explain why it suddenly resets the device. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	7e1c07dd35	habanalabs: unify and improve device cpu init Move the code of device CPU initialization from being ASIC-Dependent to common code. In addition, add support for the new error reporting feature of the firmware boot code. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Omer Shpigelman	1fa185c656	habanalabs: re-factor H/W queues initialization We want to remove the following restrictions/assumptions in our driver: 1. The H/W queue index is also the completion queue index. 2. The H/W queue index is also the IRQ number of the completion queue. 3. All queues of the same type have consecutive indexes. Therefore we add the support for H/W queues of the same type with nonconsecutive indexes and completion queue index and IRQ number different than the H/W queue index. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Omer Shpigelman	76cedc739d	habanalabs: remove stop-on-error flag from DMA Stop-on-error mode in DMA is useful as it stops the transaction immediately upon error e.g. page fault. But it may cause the next command submission to fail as is leaves the DMA in unstable state. Therefore we remove the stop-on-error configuration from the DMA. Stop-on-err is still available for debug. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	3ec499c967	habanalabs: don't wait for ASIC CPU after reset Upon reset of the ASIC, the driver would have waited for the CPU to come out of reset before finishing the reset process. This was done for the purpose of making the CPU available to answer FLR requests. However, when a VM shuts down, the driver isn't removed so a reset never happens. Therefore, remove this waiting period as we don't need it. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	1184550155	habanalabs: fix pm manual->auto in GOYA When moving from manual to automatic power management mode in GOYA, the driver didn't correctly place the device in LOW power mode. As a result, if an application was run immediately after the move, it would have run with low frequencies. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Oded Gabbay	6966d9e1f2	habanalabs: show unsupported message for GAUDI If a GAUDI device is present in the system, display an error message that it is not supported by the current kernel. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Omer Shpigelman	4f0e6ab78a	habanalabs: add print upon clock change Add print upon clock slow down due to power consumption or overheating. In addition, add print when back to optimal clock. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Oded Gabbay	bc6ed3aa92	habanalabs: update goya firmware register map Use specific values in enum of register map to be able to deprecate old values. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Jules Irenge	8a7a88c10c	habanalabs: Add missing annotation for goya_hw_queues_unlock() Sparse reports a warning at goya_hw_queues_unlock() warning: context imbalance in goya_hw_queues_unlock() - unexpected unlock The root cause is a missing annotation at goya_hw_queues_unlock() Add the missing __releases(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Jules Irenge	cf87f966d2	habanalabs: Add missing annotation for goya_hw_queues_lock() Sparse reports a warning at goya_hw_queues_lock() warning: context imbalance in goya_hw_queues_lock() - wrong count at exit The root cause is a missing annotation at goya_hw_queues_lock() Add the missing __acquires(&goya->hw_queues_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Tomer Tayar	b41e9728d8	habanalabs: Remove unused parse_cnt variable The "parse_cnt" variable is incremented while validating the CS chunks, but it is actually not being used. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:17 +02:00
Christine Gharzuzi	0da10e683e	habanalabs: provide historical maximum of various sensors Add support for hwmon_in_highest, hwmon_temp_highest and hwmon_curr_highest attributes. These attributes retrieve the historical maximum voltage, temperature and current that were sampled, respectively. Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Moti Haimovski	d57b83c3df	habanalabs: modify the return values of hl_read/write routines The hl read and write routines implement the hwmon_ops read and write interface routines respectively. These routines are expected to return a completion status when called, which was not the case until this commit. This commit modifies these routines to return 0 upon success and a negative error value upon failure. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Moti Haimovski	5557b138dc	habanalabs: support temperature offset via sysfs This commit adds support for offsetting the temperatures reading by a specified value as defined in https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface using the standard sysfs defined for hwmon. This is required by system administrators to inject errors to test their monitoring applications in data centers. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Oded Gabbay	e5509d5279	habanalabs: ratelimit error prints of IRQs The compute engines can perform millions of transactions per second. If there is a bug in the S/W stack, we could get a lot of interrupts and spam the kernel log. Therefore, ratelimit these prints Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Moti Haimovski	5cce51464c	habanalabs: add debugfs write64/read64 Allow debug user to write/read 64-bit data through debugfs. This will expedite the dump process of the (large) internal memories of the device done during debug. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Omer Shpigelman	0c002ceb39	habanalabs: fix DDR bar address setting DRAM_PHYS_BASE is already taken into account in MMU_PAGE_TABLES_ADDR. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Oded Gabbay	7491c036cb	habanalabs: removing extra ; There is an extra ; after the end of a function, which needs to be removed Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-03-24 10:54:16 +02:00
Tomer Tayar	1718a45b28	habanalabs: Avoid running restore chunks if no execute chunks CS with no chunks for execute phase is invalid, so its context_switch/restore phase should not be run. Hence, move the check of the execute chunks number to the beginning of hl_cs_ioctl(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Tomer Tayar	f3a838c0c7	habanalabs: Modify CS jobs counter to u16 As HL_MAX_JOBS_PER_CS is 512, it is possible that more than 255 CS jobs will be submitted for a certain queue. Hence, modify the "jobs_in_queue_cnt" parameter of the "hl_cs" structure to be u16 instead of u8. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Omer Shpigelman	64a7e2955d	habanalabs: split the host MMU properties Host memory may be allocated with huge pages. A different virtual range may be used for mapping in this case. Add Huge PCI MMU (HPMMU) properties to support it. This patch is a prerequisite for future ASICs support and has no effect on Goya ASIC as currently a single virtual host range is used for all page sizes. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Omer Shpigelman	240c92fd04	habanalabs: use the user CB size as a default job size When no patched command buffer (CB) is created, use the user CB size as the job size. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Pawel Piskorski	7fc40bcaa6	habanalabs: flush only at the end of the map/unmap Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx) within per-page loop. Signed-off-by: Pawel Piskorski <ppiskorski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-03-24 10:54:16 +02:00
Oded Gabbay	cf01514c5c	habanalabs: patched cb equals user cb in device memset During device memory memset, the driver allocates and use a CB (command buffer). To reuse existing code, it keeps a pointer to the CB in two variables, user_cb and patched_cb. Therefore, there is no need to "put" both the user_cb and patched_cb, as it will cause an underflow of the refcnt of the CB. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-02-11 11:12:47 +02:00
Omer Shpigelman	a37e47192d	habanalabs: do not halt CoreSight during hard reset During hard reset we must not write to the device. Hence avoid halting CoreSight during user context close if it is done during hard reset. In addition, we must not re-enable clock gating afterwards as it was deliberately disabled in the beginning of the hard reset flow. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-02-11 11:12:47 +02:00
Oded Gabbay	908087ffbe	habanalabs: halt the engines before hard-reset The driver must halt the engines before doing hard-reset, otherwise the device can go into undefined state. There is a place where the driver didn't do that and this patch fixes it. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-02-11 11:12:47 +02:00
Chen Wandun	68a1fdf245	habanalabs: remove variable 'val' set but not used Fixes gcc '-Wunused-but-set-variable' warning: drivers/misc/habanalabs/goya/goya.c: In function goya_pldm_init_cpu: drivers/misc/habanalabs/goya/goya.c:2195:6: warning: variable val set but not used [-Wunused-but-set-variable] drivers/misc/habanalabs/goya/goya.c: In function goya_hw_init: drivers/misc/habanalabs/goya/goya.c:2505:6: warning: variable val set but not used [-Wunused-but-set-variable] Fixes: `9494a8dd8d` ("habanalabs: add h/w queues module") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-12-14 15:12:21 +02:00
Oded Gabbay	018e0e3594	habanalabs: rate limit error msg on waiting for CS In case a user submits a CS, and the submission fails, and the user doesn't check the return value and instead use the error return value as a valid sequence number of a CS and ask to wait on it, the driver will print an error and return an error code for that wait. The real problem happens if now the user ignores the error of the wait, and try to wait again and again. This can lead to a flood of error messages from the driver and even soft lockup event. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-12-14 15:12:21 +02:00
Oded Gabbay	5feccddcf9	habanalabs: add more protection of device during reset Prevent accesses to the device (register read/write) from debugfs entries during reset as that can cause the device to get stuck. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:47 +02:00
Oded Gabbay	55f6d68097	habanalabs: flush EQ workers in hard reset During hard-reset, there can be multiple events received from the H/W. For each event, the driver opens a worker thread to handle it. For some of the events, the driver will read/write registers in the code that handles the event. In case of hard-reset, we must prevent reads/writes to the registers during the reset operation because the device might get stuck if that happens. Therefore, flush the EQ workers before resetting the device (in hard-reset only). Additional events won't arrive as we synced and disabled the interrupts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:47 +02:00
Oded Gabbay	1af69d30c4	habanalabs: make the reset code more consistent In the hl_device_reset we ask about the hard_reset argument when we want to differentiate between soft and hard reset, except for three places where we use "from_hard_reset_thread". Replace one of those locations with the hard_reset argument as it is guaranteed that if we reached to that line in the code during hard_reset, it is from a kernel thread. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:47 +02:00
Moti Haimovski	52c01b0137	habanalabs: expose reset counters via existing INFO IOCTL Expose both soft and hard reset counts via INFO IOCTL. This will allow system management applications to easily check if the device has undergone reset. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:47 +02:00
Oded Gabbay	e16ee41037	habanalabs: make code more concise Instead of doing if inside if, just write them with && operator. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:47 +02:00
Oded Gabbay	da1342a0ee	habanalabs: use defines for F/W files Make the code more concise and maintainable by using defines for the F/W files. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:46 +02:00
Oded Gabbay	7fbdc12b91	habanalabs: remove prints on successful device initialization Successful device initialization is mentioned in kernel log with the message "Successfully added device to habanalabs driver". There is no point of spamming the log with additional messages about successful queue testing, which are implied by the above mentioned message. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	e604f551cd	habanalabs: remove unnecessary checks Now that the VA block free list is not updated on context close in order to optimize this flow, no need in the sanity checks of the list contents as these will fail for sure. In addition, remove the "context closing with VA in use" print during hard reset as this situation is a side effect of the failure that caused the hard reset. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	bea84c4d67	habanalabs: invalidate MMU cache only once Reduce context close time by performing MMU cache invalidation once at the end of the unmap loop rather in each iteration, in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	71c5e55e7c	habanalabs: skip VA block list update in reset flow Reduce context close time by skipping the VA block free list update in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	1b98d8b23f	habanalabs: optimize MMU unmap Reduce context close time by skipping hash table lookup if possible in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. This commit affect mainly when running on simulator. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	bc75d799f9	habanalabs: prevent read/write from/to the device during hard reset During hard reset we should not access the device except of necessary reset operations because the device might be stuck or unresponsive. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	54bb67444e	habanalabs: split MMU properties to PCI/DRAM Split the properties used for MMU mappings to DRAM and PCI (host) types. This is a prerequisite for future ASICs support. Note that in Goya ASIC, the PMMU and DMMU are the same (except of page sizes) as only one MMU mechanism is used for both of the mapping types. Hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	30919edef2	habanalabs: re-factor MMU masks and documentation Some cosmetics around the MMU code to make it more self-explanatory. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	7b6e4ea0f7	habanalabs: type specific MMU cache invalidation Add the ability to invalidate the necessary MMU cache only. This ability is a prerequisite for future ASICs support. Note that in Goya ASIC, a single cache is used for both host/DRAM mappings and hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	7f74d4d335	habanalabs: re-factor memory module code Some of the functions in the memory module code were too long and/or contained multiple operations that are not always done together. Re-factor the code by dividing those functions to smaller functions which are more readable and maintainable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Oded Gabbay	5d1012576d	habanalabs: export uapi defines to user-space The two defines that control the maximum size of a command buffer and the maximum number of JOBS per CS need to be exported to the user as they are part of the API towards user-space. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:46 +02:00
Oded Gabbay	eda58bf786	habanalabs: don't print error when queues are full If the queues are full and we return -EAGAIN to the user, there is no need to print an error, as that case isn't an error and the user is expected to re-submit the work. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	bd4c8cb17d	habanalabs: increase max jobs number to 512 In training, there is a need for a large amount of patching to the recipe. This results in many command buffers contains a lot of DMA packets. The number of command buffers per CS is larger than the current maximum of 64, which is an arbitrary number that is enough for inference, but it has no real affect on the code and/or resources of the host machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	6476b47243	habanalabs: set ETR as non-secured ETR should always be non-secured as it is used by the users to record profiling/trace data. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	e1a84d56fc	habanalabs: use registers name defines for ETR block We have a single ETR block in the SOC, so use explicit register name defines for initializing this block. This makes it more readable and maintainable. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	f05912d8f1	habanalabs: read F/W versions before failure Move the read of the F/W boot versions before exiting on possible failures of the F/W boot. This will help debug boot failures as we will be able to know the F/W boot version. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	91edbf2cf8	habanalabs: expose card name in INFO IOCTL To enable userspace processes, e.g. management utilities, to display the card name to the user, add the card name property to the HW_IP structure that is copied to the user in the INFO IOCTL. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
YueHaibing	8d6de52866	habanalabs: remove set but not used variable 'qman_base_addr' Fixes gcc '-Wunused-but-set-variable' warning: drivers/misc/habanalabs/goya/goya.c: In function 'goya_init_mme_cmdq': drivers/misc/habanalabs/goya/goya.c:1536:6: warning: variable 'qman_base_addr' set but not used [-Wunused-but-set-variable] It is never used, so can be removed. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Oded Gabbay	62c1e124a9	habanalabs: add opcode to INFO IOCTL to return clock rate Add a new opcode to the INFO IOCTL to allow the user application to retrieve the ASIC's current and maximum clock rate. The rate is returned in MHz. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	8fdacf2a53	habanalabs: set TPC Icache to 16 cache lines Reduce latency to memory during TPC kernel execution. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:45 +02:00
Tomer Tayar	cb596aee88	habanalabs: Add a new H/W queue type This patch adds a support for a new H/W queue type. This type of queue is for DMA and compute engines jobs, for which completion notification are sent by H/W. Command buffer for this queue can be created either through the CB IOCTL and using the retrieved CB handle, or by preparing a buffer on the host or device SRAM/DRAM, and using the device address to that buffer. The patch includes the handling of the 2 options, as well as the initialization of the H/W queue and its jobs scheduling. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Tomer Tayar	df762375f1	habanalabs: Mark queue as expecting CB handle or address Jobs on some queues must be provided with a handle to a driver command buffer object, while for other queues, jobs must be provided with an address to a command buffer. Currently the distinction is done based on the queue type, which is less flexible if the same queue type behaves differently on different types of ASICs. This patch adds a new queue property for this target, which is configured per queue type per ASIC type. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Tomer Tayar	f435614ff5	habanalabs: Fix typos s/paerser/parser/ s/requeusted/requested/ s/an JOB/a JOB/ Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
YueHaibing	1e295d4dd5	habanalabs: remove set but not used variable 'ctx' Fixes gcc '-Wunused-but-set-variable' warning: drivers/misc/habanalabs/device.c: In function hpriv_release: drivers/misc/habanalabs/device.c:45:17: warning: variable ctx set but not used [-Wunused-but-set-variable] It is never used since commit `eb7caf84b0` ("habanalabs: maintain a list of file private data objects") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:44 +02:00
Oded Gabbay	abb7e16fb6	habanalabs: handle F/W failure for sensor initialization In case the F/W fails to initialize the thermal sensors, print an appropriate error message to kernel log and fail the device initialization. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:44 +02:00
Oded Gabbay	6dc66f7c26	habanalabs: correctly cast variable to __le32 When using the macro le32_to_cpu(x), we need to correctly convert x to be __le32 in case it is defined as u32 variable. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-09-05 14:55:28 +03:00
Oded Gabbay	307eae93d5	habanalabs: show correct id in error print If the initialization of a device failed, the driver prints an error message with the id of the device. The device index on the file system is that id divided by 2. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:28 +03:00
Oded Gabbay	4c172bbfaa	habanalabs: stop using the acronym KMD We want to stop using the acronym KMD. Therefore, replace all locations (except for register names we can't modify) where KMD is written to other terms such as "Linux kernel driver" or "Host kernel driver", etc. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	0996bd1c74	habanalabs: display card name as sensors header To allow the user to use a custom file for the HWMON lm-sensors library per card type, the driver needs to register the HWMON sensors with the specific card type name. The card name is supplied by the F/W running on the device. If the F/W is old and doesn't supply a card name, a default card name is displayed as the sensors group name. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	e9730763a2	habanalabs: add uapi to retrieve aggregate H/W events Add a new opcode to INFO IOCTL to retrieve aggregate H/W events. i.e. the events counters are NOT cleared upon device reset, but count from the loading of the driver. Add the code to support it in the device event handling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	75b3cb2bb0	habanalabs: add uapi to retrieve device utilization Users and sysadmins usually want to know what is the device utilization as a level 0 indication if they are efficiently using the device. Add a new opcode to the INFO IOCTL that will return the device utilization over the last period of 100-1000ms. The return value is 0-100, representing as percentage the total utilization rate. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Tomer Tayar	413cf576fd	habanalabs: Make the Coresight timestamp perpetual The Coresight timestamp is enabled for a specific debug session using the HL_DEBUG_OP_TIMESTAMP opcode of the debug IOCTL. In order to have a perpetual timestamp that would be comparable between various debug sessions, this patch moves the timestamp enablement to be part of the HW initialization. The HL_DEBUG_OP_TIMESTAMP opcode turns to be deprecated and shouldn't be used. Old user-space that will call it won't see any change in the behavior of the debug session. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Oded Gabbay	867b58ac94	habanalabs: print to kernel log when reset is finished Now that we don't print the queue testing messages, we need to print when the reset is finished so whoever looks at the kernel log will know the reset process was finished successfully and the driver is not stuck. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:27 +03:00
Oded Gabbay	fe9a52c97f	habanalabs: replace __le32_to_cpu with le32_to_cpu In some files the driver uses __le32_to_cpu while in other it uses le32_to_cpu. Replace all __le32_to_cpu instances with le32_to_cpu for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Oded Gabbay	abca3a8224	habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64 In some files the code use __cpu_to_le32/64 while in other it use cpu_to_le32/64. Replace all __cpu_to_le32/64 instances with cpu_to_le32/64 for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Tomer Tayar	129b6a9324	habanalabs: Handle HW_IP_INFO if device disabled or in reset The HW IP information is relevant even if the device is disabled or in reset, so always handle the corresponding INFO IOCTL opcode. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Tomer Tayar	ea451f88ef	habanalabs: Expose devices after initialization is done The char devices are currently exposed to user before the device and driver initialization are done. This patch moves the cdev and device adding to the system to the end of the initialization sequence, while keeping the creation of the structures at the beginning to allow the usage of dev_*(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Omer Shpigelman	9b50f539ff	habanalabs: improve security in Debug IOCTL This patch improves the security in the Debug IOCTL. It adds checks that: - The register index value is in the allowed range for all opcodes. - The event types number is in the allowed range in SPMU enable. - The events number is in the allowed range in SPMU disable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Omer Shpigelman	8d1759329d	habanalabs: use default structure for user input in Debug IOCTL This patch fixes a possible kernel crash when a user provides a too small input structure to the Debug IOCTL. The fix sets a default input structure and copies to it the user data. In case the user provided as input a too small structure, the code will use the default values taken from the default structure. Note that in contrary to the input structure, the user can provide an output structure with changing size or no size at all. Therefore the user output structure validation is already done in the Debug logic later on. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Tomer Tayar	10d7de2cdb	habanalabs: Add descriptive name to PSOC app status register Add a meaningful name to the general PSOC application status register which better describes its usage in keeping the HW state. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:26 +03:00
Tomer Tayar	4095a17657	habanalabs: Add descriptive names to PSOC scratch-pad registers The PSOC scratch-pad registers are used for communication with the device CPU. This patch adds new definitions for these registers which are more descriptive than their general names. The new set of definitions also gathers and documents the current usage of the scratch-pad registers by the driver and the device CPU. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:26 +03:00
Oded Gabbay	4d6a7751f6	habanalabs: create two char devices per ASIC This patch changes the driver to create two char devices for each ASIC it discovers. This is done to allow system/monitoring applications to query the device for stats, information, idle state and more, while also allowing the deep-learning application to send work to the ASIC. One char device is the original device, hlX. IOCTL calls through this device file can perform any task on the device (compute, memory, queries). The open function for this device will fail if it was called before but the file-descriptor it created was not completely released yet (the release callback function is not called from the kernel until all instances of that FD are closed). The driver needs to keep this behavior to support backward compatibility with existing userspace, which count that the open will fail if the device is "occupied". The second char device is called "hl_controlDx", where x is the same index of the main device with a minor number of the original char device + 1. Applications that open this device can only call the INFO IOCTL. There is no limitation on the number of applications opening this device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	b968eb1a84	habanalabs: change device_setup_cdev() to be more generic This patch re-factors the device_setup_cdev() function to make it more generic. It doesn't manipulate members of the driver's internal device structure but instead works only on the arguments that are sent to it. This is in preparation for using this function to create an additional char device per ASIC. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	eb7caf84b0	habanalabs: maintain a list of file private data objects This patch adds a new list to the driver's device structure. The list will keep the file private data structures that the driver creates when a user process opens the device. This change is needed because it is useless to try to count how many FD are open. Instead, track our own private data structure per open file and once it is released, remove it from the list. As long as the list is not empty, it means we have a user that can do something with our device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	86d5307a6d	habanalabs: rename user_ctx as compute_ctx This patch renames the "user_ctx" field in the device structure to "compute_ctx". This better reflects the meaning of this context. In addition, we also check in the ctx_fini() that the debug mode should be disabled only if the context being destroyed is the compute context. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	02e921e42b	habanalabs: show the process context dram usage When the user query the dram usage of a context, show it the dram usage of its context, not the user context that is currently running on the device. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	4aecb05e52	habanalabs: kill user process after CS rollback This patch calls the kill user process function after we rollback the in-flight CSs. This is because the user process can't be closed while there are open CSs. Therefore, there is no point of sending it a SIGKILL before we do the rollback CS part. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	b888751a02	habanalabs: add handle field to context structure This patch adds a field to the context's structure that will hold a unique handle for the context. This will be needed when the user will create the context. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Chuhong Yuan	30f273222c	habanalabs: Use dev_get_drvdata Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:26 +03:00
Oded Gabbay	209257feab	habanalabs: power management through sysfs is only for GOYA The ability of setting power management properties by the system administrator (through sysfs properties) is only relevant for the GOYA ASIC. Therefore, move the relevant sysfs properties to the GOYA sysfs specific file, to make the properties appear in sysfs only for GOYA cards. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:26 +03:00
Oded Gabbay	ed0fc50535	habanalabs: cap simulator timeout In the driver timeout functions, we give the simulator a factor of 10 in the timeout. This was necessary when the requested timeout is small but if it was a few seconds, this can result in a very large timeout which is unnecessary. This patch caps the maximum timeout of the simulator to 10 seconds, which is our largest timeout in the code. That is more then enough for anything the simulator is doing. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:26 +03:00

... 3 4 5 6 7 ...

593 Commits