In order to allow user to have larger amount of submissions, we
increase the DMA and NIC queue depth to 4K.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order for the user to know if he can try and open device, we
expose the compute ctx state. The user can now know if the context
is used by another process or whether the device is still ongoing
through cleanup or reset and will be available soon.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to be more informative during device open, we are adding a
new return code -EAGAIN that indicates device is still going through
resource reclaiming and hence it cannot be used yet.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Future devices will support multiple device memory page sizes.
In addition, an API for the user was added for it to be able to control
the device memory allocation page size.
This patch is a complementary patch to inform the user of the available
page size supported by the device.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to hold each MMU mask/shift as a denoted structure
member (e.g. hop0_mask).
Instead converting it to array will result in smaller and more readable
code.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch breaks the cumbersome implementation of "get real page size"
along with it's multiple inner conditions and implement each case
(according to the real complexity) inside an ASIC function.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using the device memory allocation API the user ought to know what
is the default allocation page size.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Looking forward we will need to report to the user what is the default
page size used.
This will be done more conveniently by explicitly updating the property
rather than to rely on a "0 meaning default" value.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
allmodconfig builds on 32-bit architectures fail with the following error.
drivers/misc/habanalabs/common/memory.c: In function 'alloc_device_memory':
drivers/misc/habanalabs/common/memory.c:153:49: error:
cast from pointer to integer of different size
Fix the typecast. While at it, drop other unnecessary typecasts associated
with the same commit.
Fixes: e8458e20e0 ("habanalabs: make sure device mem alloc is page aligned")
Cc: Ohad Sharabi <osharabi@habana.ai>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220404134859.3278599-1-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the big set of char/misc and other small driver subsystem
updates for 5.18-rc1.
Included in here are merges from driver subsystems which contain:
- iio driver updates and new drivers
- fsi driver updates
- fpga driver updates
- habanalabs driver updates and support for new hardware
- soundwire driver updates and new drivers
- phy driver updates and new drivers
- coresight driver updates
- icc driver updates
Individual changes include:
- mei driver updates
- interconnect driver updates
- new PECI driver subsystem added
- vmci driver updates
- lots of tiny misc/char driver updates
There will be two merge conflicts with your tree, one in MAINTAINERS
which is obvious to fix up, and one in drivers/phy/freescale/Kconfig
which also should be easy to resolve.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYkG3fQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykNEgCfaRG8CRxewDXOO4+GSeA3NGK+AIoAnR89donC
R4bgCjfg8BWIBcVVXg3/
=WWXC
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the big set of char/misc and other small driver subsystem
updates for 5.18-rc1.
Included in here are merges from driver subsystems which contain:
- iio driver updates and new drivers
- fsi driver updates
- fpga driver updates
- habanalabs driver updates and support for new hardware
- soundwire driver updates and new drivers
- phy driver updates and new drivers
- coresight driver updates
- icc driver updates
Individual changes include:
- mei driver updates
- interconnect driver updates
- new PECI driver subsystem added
- vmci driver updates
- lots of tiny misc/char driver updates
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits)
firmware: google: Properly state IOMEM dependency
kgdbts: fix return value of __setup handler
firmware: sysfb: fix platform-device leak in error path
firmware: stratix10-svc: add missing callback parameter on RSU
arm64: dts: qcom: add non-secure domain property to fastrpc nodes
misc: fastrpc: Add dma handle implementation
misc: fastrpc: Add fdlist implementation
misc: fastrpc: Add helper function to get list and page
misc: fastrpc: Add support to secure memory map
dt-bindings: misc: add fastrpc domain vmid property
misc: fastrpc: check before loading process to the DSP
misc: fastrpc: add secure domain support
dt-bindings: misc: add property to support non-secure DSP
misc: fastrpc: Add support to get DSP capabilities
misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP
misc: fastrpc: separate fastrpc device from channel context
dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells
dt-bindings: nvmem: make "reg" property optional
nvmem: brcm_nvram: parse NVRAM content into NVMEM cells
nvmem: dt-bindings: Fix the error of dt-bindings check
...
During driver and F/W handshake, driver waits for F/W to reach
certain states in order to progress with the boot flow.
Some of the states were deprecated a long time ago and were never
present on official firmwares. Therefore, let's remove them from
the handshake process.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Several H/W events can be sent adjacently, even due to a single error.
If a hard-reset is triggered as part of handling one of these events,
the following events won't be handled.
The debug info from these missed events is important, sometimes even
more important than the one that was handled.
To allow handling these close events, add an option to delay a device
reset and use it when resetting due to H/W events.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As the potential failure of the pci_enable_device(),
it should be better to check the return value and return
error if fails.
Fixes: 70b2f993ea ("habanalabs: create common folder")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case user application was interrupted while some cs still in-flight
or in the middle of completion handling in driver, the
last refcount of the kernel private data for the user process
will not be put in the fd close flow, but in the cs completion
workqueue context.
This means that the device reset-upon-device-release will be called
from that context. During the reset flow, the driver flushes all the cs
workqueue to ensure that any scheduled work has run to completion,
and since we are running from the completion context we will
have deadlock.
Therefore, we need to skip flushing the workqueue in those cases.
It is safe to do it because the user won't be able to release the device
unless the workqueues are already empty.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Working with MMU that supports multiple page sizes requires that mapping
of a page of a certain size will be aligned to the same size (e.g. the
physical address of 32MB page shall be aligned to 32MB).
To achieve this the gen_poll allocation is now using the "align" variant
to comply with the alignment requirements.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There are a few events that can arrive from the f/w and without proper
handling can cause errors to appear in the kernel log without reason.
Add the relevant handling that was missing.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Various AXI errors can occur in the NIC engines and are reported to
the driver by the f/w. Add code to print the errors and ack them to
the f/w.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In future ASICs the MMU will be able to work with multiple page sizes,
thus a new flag is added to allow the user to set the requested page
size.
This flag is added since the whole DRAM is allocated for the user and
the user also should be familiar with the memory usage use case.
As such, the user may choose to "over allocate" memory in favor of
performance (for instance- large page allocations covers more memory
in less TLB entries).
For example: say available page sizes are of 1MB and 32MB. If user
wants to allocate 40MB the user can either set page size to 1MB and
allocate the exact amount of memory (but will result in 40 TLB entries)
or the user can use 32MB pages, "waste" 8MB of physical memory but
occupy only 2 TLB entries.
Note that this feature will be available only to ASIC that supports
multiple DRAM page sizes.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix the following compilation warning in
hl_cb_ioctl() @ command_buffer.c:
warning: ‘device_va’ may be used uninitialized in this function
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For current devices there is a need to send the max power value to F/W
during device init, for example because there might be several card
types.
In future devices, this info will be programmed in the device's EEPROM
and will be read by F/W, and hence the driver should not send it.
Modify the sending of the relevant message to be done only for ASIC
types that need it.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The max_power variable which is used for calculating the device
utilization is the ASIC specific property which is set during init.
However, the max value can be modified via sysfs, and thus the updated
value in the device structure should be used instead.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
On Goya and Gaudi, the stop-on-error configuration can be set via
debugfs. However, in future devices, this configuration will always be
enabled.
Modify the debugfs node to be allowed only for ASICs that support this
dynamic configuration.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Use of vfree(), vmalloc_user(), vmalloc() and remap_vmalloc_range()
requires this include in some architectures.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When the code iterates over the free list of physical pages nodes, it
deletes the physical page node which is used as the iterator.
Therefore, we need to use the safe version of the iteration to prevent
use-after-free.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The iATU is an internal h/w machine inside Habana's PCI controller.
Mentioning it by name doesn't say anything to the user. It is better
to say the PCI controller initialization was not done successfully.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Timestamp registration API allows the user to register
a timestamp record event which will make the driver set
timestamp when CQ counter reaches the target value
and write it to a specific location specified
by the user.
This is a non blocking API, unlike the wait_for_interrupt
which is a blocking one.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Scenario:
1. CS which is part of encaps signal has been completed and now
executing kref_put to its encaps signal handle. The refcount of the
handle decremented to 0, and called the encaps signal handle
release function - hl_encaps_handle_do_release.
2. At this point the user starts waiting on the signal, and finds the
encaps signal handle in the handlers list and increment the habdle
refcount to 1.
3. Immediately after, hl_encaps_handle_do_release removed the handle
from the list and free its memory.
4. Wait function using the handle although it has been freed.
This scenario caused the slab area which was previously allocated
for the handle to be poison overwritten which triggered kernel bug
the next time the OS needed to allocate this slab.
Fixed by getting the refcount of the handle only in case it is not
zero.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Smatch warns that:
drivers/misc/habanalabs/common/command_buffer.c:471 hl_cb_ioctl()
error: uninitialized symbol 'device_va'.
Which is true, but harmless. Anyway, it's easy to silence this by
adding a error check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
We print detailed messages inside the internal ioctl functions. No need
to print a generic message at the end, it doesn't add any information.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The heartbeat thread is active during soft-reset, and it tries to send
messages to CPU-CP core.
Within the soft-reset, in the time window in which the device is marked
as disabled, any CPU-CP command is "silently" skipped and a success
value it returned.
However, in addition to the return value, the heartbeat function also
checks the F/W result, but because no command is sent in this time
window, the result variable won't hold the expected value and we will
have a false heartbeat failure.
To avoid it, modify the "silent" skip to be done only in hard-reset.
The CPU-CP should be able to handle messages during soft-reset.
In addition to the heartbeat problem, this should also solve other
issues in other flows that send messages during soft-reset and use the
F/W result as it w/o being aware to the reset.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a race in the user interrupts code, where between checking
the target value and adding the new pend to the list, there is a chance
the interrupt happened.
In that case, no one will complete the node, and we will get a timeout
on it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When timeout is 0, we need to return the busy status in case the
target value wasn't reached upon entry to the ioctl.
Also return the correct timestamp.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This is not something we can do a workaround. It is clearly an error
and we should notify the user that it is an error.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently we only expose to the user the ID of the first available
user interrupt. To make user interrupts allocation truly dynamic, we
need to also expose the number of user interrupts.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add a missing error check in the sysfs show function for max_power.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of soft reset failure, hard reset should be initiated, but
reset flags were not set to enable it, which caused another soft reset
followed by another failure.
Updated reset flags to enable hard reset flow in case of soft reset
failure.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add a missing error check in the sysfs show functions for
clk_max_freq_mhz and clk_cur_freq_mhz_show.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If reading PLL info from F/W fails, the PLL info is not set in the
"result" variable, and hence shouldn't be copied to the caller's array.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Freeing phys_pg_pack includes calling to scrubbing functions of the
device's memory, taking locks and possibly even calling reset.
This is not something that should be done while holding a device-wide
spinlock.
Therefore, save the relevant objects on a local linked-list and after
releasing the spinlock, traverse that list and free the phys_pg_pack
objects.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to support several device MMU blocks with different
architectures (e.g. different HOP table size) we need to move to
per-MMU properties rather than keeping those properties as ASIC
properties.
Refactoring the code to use "per-MMU proprties" is a major effort.
To start making the transition towards this goal but still support
taking the properties from ASIC properties (for code that currently
uses them) this patch copies some of the properties to the "per-MMU"
properties and later, when implementing the per-MMU properties, we
would be able to delete the MMU props from the ASIC props.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In future ASICs, there is no kernel TDR for new workloads that are
submitted directly from user-space to the device.
Therefore, the driver can NEVER know that a workload has timed-out.
So, when the user asks us to wait for interrupt on the workload's
completion, and the wait has timed-out, it doesn't mean the workload
has timed-out. It only means the wait has timed-out, which is NOT an
error from driver's perspective.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Adds new sysfs entry to display firmware os version
/sys/class/habanalabs/hl<n>/fw_os_ver
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
We have a common function that wraps the call to the MMU cache
invalidation function, which is ASIC-specific. The wrapper checks
the return value and prints error if necessary. For consistency, try
to use the wrapper when possible.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
infineon version is only applicable to GOYA and GAUDI. For later
ASICs, we display the Voltage Regulator Monitor f/w version.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In this attribute group we are only adding clocks. This is in
preparation for adding a device specific attribute group which is
not related to clocks.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Setting PLL profile is the same for all ASICs, except for GOYA.
However, because this function is never called from common code, there
is no need to have an asic-specific callback function.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For better maintainability, try to concentrate all the common functions
that communicate with the f/w in firmware_if.c
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The two remaining functions in this file belong to firmware_if.c,
as they communicate with the firmware.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Retrieving the clock from the f/w is done exactly the same in ALL our
ASICs. Therefore, no real justification for doing it as an
ASIC-specific function.
The only thing is we need to check if we are running on simulator,
which doesn't require ASIC-specific callback.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Move common sysfs store/show functions to sysfs.c file for
consistency.
This is part of a patch-set to remove hwmgr.c
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Some MMU functions can be used by different versions of our MMUs, so
move them to be common.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Now that clock gating is permanently disabled in GAUDI, no need for
the ASIC functions of setting and disabling clock gating, as this
was a unique scenario in GAUDI.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Due to the need of SynapseAI to configure all TPC engines from a single
QMAN, the driver must disable CGM and never allow the user to enable
it. Otherwise, the configuration of the TPC engines will fail.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This patch fixes what seems to be copy paste error.
We will have a memory leak if the host-resident shadow is NULL (which
will likely happen as the DR and HR are not dependent).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As part of handling of the multi-CS wait ioctl, hl_cs_poll_fences() is
called in a "while (true)" loop. This function can fail, but the
checking of its return value was missed.
Add this check and exit the loop in case of a failure.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].
This code was transformed with the help of Coccinelle:
(next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)
@@
identifier S, member, array;
type T1, T2;
@@
struct S {
...
T1 member;
T2 array[
- 0
];
};
UAPI and wireless changes were intentionally excluded from this patch
and will be sent out separately.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/78
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
As hard-reset can be requested during soft-reset, driver must allow
it or else critical events received during soft-reset will be
ignored.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Atomic operations during reset are replaced by a spinlock in order
to have the ability to protect more than a single variable.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Unify variables related to device reset, which will help us to
add some new reset functionality in future patches.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This patch fixes issue in which we have timeout for multi-CS although
the CS in the list actually completed.
Example scenario (the two threads marked as WAIT for the thread that
handles the wait_for_multi_cs and CMPL as the thread that signal
completion for both CS and multi-CS):
1. Submit CS with sequence X
2. [WAIT]: call wait_for_multi_cs with single CS X
3. [CMPL]: CS X do invoke complete_all for both CS and multi-CS
(multi_cs_completion_done still false)
4. [WAIT]: enter poll_fences, reinit the completion and find the CS
as completed when asking on the fence but multi_cs_done is
still false it returns that no CS actually completed
5. [CMPL]: set multi_cs_handling_done as true
6. [WAIT]: wait for completion but no CS to awake the wait context
and hence wait till timeout
Solution: if CS detected as completed in poll_fences but multi_cs_done
is still false invoke complete_all to the multi-CS completion
and so it will not go to sleep in wait_for_completion but
rather will have a "second chance" to wait for
multi_cs_completion_done.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In some cases the driver cannot configure ASID of some engines due to
the security level of the relevant registers.
For this a new CPU-CP packet is introduced, which will allow the driver
to ask the F/W to do this configuration instead.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-ENOTTY is returned in case of error in the ioctl arguments themselves,
such as function that doesn't exists.
In all other cases, where the error is in the arguments of the custom
data structures that we define that are passed in the various ioctls,
we need to return -EINVAL.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Current sysfs implementation does not take endianness into
consideration when dumping the cpld version.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently the cq counters are allocated in userspace memory,
and mapped by the driver to the device address space.
A new requirement that is part of new future API related to this one,
requires that cq counters will be allocated in kernel memory.
We leverage the existing cb_create API with KERNEL_MAPPED flag set to
allocate this memory.
That way we gain two things:
1. The memory cannot be freed while in use since it's protected
by refcount in driver.
2. No need to wake up the user thread upon each interrupt from CQ,
because the kernel has direct access to the counter. Therefore,
it can make comparison with the target value in the interrupt
handler and wake up the user thread only if the counter reaches the
target value. This is instead of waking the thread up to copy counter
value from user then go sleep again if target value wasn't reached.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
By the original design we assumed that if we "miss" multi CS completion
it is of no severe consequence as we'll just call wait_for_multi_cs
again.
Sequence of events for such scenario:
1. user submit CS with sequence N
2. user calls wait for multi-CS with only CS #N in the list
3. the multi CS call starts with poll of the CSs but find that none
completed (while CS #N did not completed yet)
4. now, multi CS #N complete but multi CS CTX was not yet created for
the above multi-CS. so, attempt to complete multi-CS fails (as no
multi CS CTX exist)
5. wait_for_multi_cs call now does init_wait_multi_cs_completion (and
for this create the multi-CS CTX)
6. wait_for_multi_cs wits on completion but will not get one as CS #N
already completed
To fix the issue we initialize the multi-CS CTX prior polling the
fences.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As BTL can be replaced by ROM we should modify relevant error print.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
During the MMU development the MMU header files were left with unclean
definitions:
- MMU "version specific" definitions that were left in the mmu_general
file
- unused definitions
This patch attempts, where possible, to keep definitions that can serve
multiple MMU versions (but that are not tightly bound with specific MMU
arch) in the mmu_general header file (e.g. different definitions for
number of HOPs).
Otherwise, move MMU version specific definitions (e.g. HOPs masks and
shifts) to the specific MMU version file.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As we allow soft-reset to be performed only on inference devices,
having the sysfs nodes may cause a confusion. Hence, we remove those
nodes on training ASICs.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently sysfs support dumping a single infineon version, in
future asics we will have two infineon versions.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Need to allow user retrieve data during reset and afterwards without
the need to reopen the device.
Did it by seperating the user peocesses list into two lists:
1. fpriv_list which contains list of user processes that opened
the device (currently only one).
2. fpriv_ctrl_list which contains list of user processes that opened
the control device. This processes in this list shall not be
killed during reset, only when the device is suddenly removed from
PCI chain.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In legacy f/w that use old hwmon.h file, the values of the hwmon
enums are different than the values that are in newer kernels (5.6
and above).
Therefore, to support working with those f/w, we need to do some
fixup before registering with the hwmon subsystem and also when
calling the functions that communicate with the f/w to retrieve
sensors information.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to increase cpucp messaging reliability we will add
the current PI value to the descriptor sent to F/W.
F/W will wait for the PI value as an indication of a valid packet.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The driver supports only a single user anyway, so there is no point
in checking whether we are in_debug state when a user tries to open
the device, because if we are in_debug, it means a user is already
using the device.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Current clock throttling period returned from driver was wrong due
to wrong time comparison.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The original multi-CS design assumption that stream masters are used
exclusively (i.e. multi-CS with set of stream master QIDs will not get
completed by CS not from the multi-CS set) is inaccurate.
Thus multi-CS behavior is now modified not to treat such case as an
error.
Instead, if we have multi-CS completion but we detect that no CS from
the list is actually completed we will do another multi-CS wait (with
modified timeout).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
It was an error to save the compute context's pointer in the device
structure, as it allowed its use without proper ref-cnt.
Change the variable to a flag that only indicates whether there is
an active compute context. Code that needs the pointer will now
be forced to use proper internal APIs to get the pointer.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There are multiple places where the code needs to get the context's
pointer and increment its ref cnt. This is the proper way instead
of using the compute context pointer in the device structure.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Pass the user's context pointer into the etr configuration function
to extract its ASID.
Using the compute_ctx pointer is an error as it is just an indication
of whether a user has opened the compute device.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Compute context pointer in hdev shouldn't be used for fetching the
context's pointer.
If an object needs the context's pointer, it should get it while
incrementing its kref, and when the object is released, put it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The driver supports only a single context. Therefore, no need to check
if the user context that is closed is the compute context. The user
context, if exists, is always the compute context.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix a bug where in case of failure to allocate idr, the handle's
memory wasn't freed as part of the error handling code.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The reset flags used by the reset thread are currently a mix of
hard-coded values and a specific flag which is passed from the context
that initiates the reset.
To make it easier to pass more flags in future from this context to the
reset thread, modify it to pass all the original reset flags to the
thread.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because info ioctl is used to retrieve data, some of its opcodes may be
used during hard reset.
Other ioctls should be blocked while device is not operational.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For debug purpose, add SOB address and SOB initial counter value
before current submission to uAPI output.
Using SOB address and initial counter, user can calculate how much of
the submmision has been completed.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reporting FW errors involves reading of the error registers.
In case we have a corrupted FW descriptor we cannot do that since the
dynamic scratchpad is potentially corrupted as well and may cause kernel
crush when attempting access to a corrupted register offset.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Driver should handle events during soft-reset as F/W is not
going through reset and it keeps sending events towards host.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently we dump the physical IRQ line index in host if an event
is received during reset. This ID is confusing as it means nothing
to the user.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In new f/w versions, it is required to explicitly indicate the power
information type when querying the F/W for power info.
When getting the current power level it should be set to power_input.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Some info ioctls can be served even if the device is disabled or
in reset. Hence, we enable more info ioctls during reset, as these
ioctls do not require any H/W nor F/W communication.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Race example scenario:
1. User have 2 threads that waits on multi CS:
- thread_0 waits on QID 0 and uses multi CS context 0.
- thread_1 waits on QID 1 and uses multi CS context 1.
2. thread_1 got completion and release multi CS context 1.
3. CS related to multi CS of thread_0 starts executing
complete_multi_cs function, the first iteration of the loop
completes the multi CS of thread_0, hence multi CS context 0
is released.
4. thread_1 waits on QID 1 and uses multi CS context 0.
5. thread_0 waits on QID 0 and uses multi CS context 1.
6. The second iterattion of the loop (from step 3) starts, which
means, start checking multi CS context 1:
- multi CS contetxt is being used by thread_0 waiting on QID 0.
- The fence of the CS (still CS from step 3) has QID map the same
as the multi CS context 1.
- multi CS context 1 (thread_0) gets completion on CS that triggered
already thread_0 (with multi CS context 0) and is no longer
being waited on.
Fixed by exiting the loop in complete_multi_cs after getting completion
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As device boot warnings clears the indication from the error mask,
they must be located together before the unknown error validation.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
GAUDI supports only hard-reset. Therefore, this function should
return an error of operation not permitted.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The ASIC-specific soft_reset_late_init() is now called after either
soft-reset or reset-upon-device-release. Therefore, it needs a more
appropriate name.
No need to split it to two functions, as an ASIC either supports
soft-reset or reset-upon-device-release.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reset upon device release is not a soft-reset from user/system point
of view. As such, we shouldn't count that reset in the statistics we
gather and expose to the monitoring applications.
We also shouldn't print soft-reset when doing the reset upon device
release.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Changing the frequency automatically is only done in Goya. In future
ASICs this is done inside the firmware. Therefore, move the common code
into the Goya specific files.
Main changes as part of the commit are:
1. The thread for setting frequency is moved from device_late_init
to goya_late_init
2. hl_device_set_frequency is removed from hl_device_open as it is
not relevant for other ASICs and for Goya it is taken care by
the thread
3. hl_device_set_frequency is renamed as goya_set_frequency
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Hard-reset is mutually exclusive with reset-on-device-release.
Therefore, if such a request arrives to the reset function, abort
the reset and return an error to the callee.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently there is a deadlock in driver in scenarios where MMU
cache invalidation fails. The issue is basically device reset
being performed without releasing the MMU mutex.
The solution is to skip device reset as it is not necessary.
In addition we introduce a slight code refactor that prints the
invalidation error from a single location.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Getting the used PLL index with which to send the CPUPU packet relies on
the CPUCP info packet.
In case CPU queues are not enabled getting the PLL index will issue an
error and in some ASICs will also fail the driver load.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If a device reset has started, there is a chance that the heartbeat
function will fail because the device is disabled at the beginning
of the reset function.
In that case, we don't want the error message to appear in the log.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
A new uAPI is added for debug purposes of the user-space to retrieve
errors related data from previous session (before device reset was
performed).
Inforamtion is filled when a razwi or CS timeout happens and can
contain one of the following:
1. Retrieve timestamp of last time the device was opened and razwi or
CS timeout happened.
2. Retrieve information about last CS timeout.
3. Retrieve information about last razwi error.
This information doesn't contain user data, so no danger of data
leakage between users.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
AS TPM error indication is not fatal, driver should dump a warning
and continue booting.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
I2C debugfs support is limited to 1 byte. We extend functionality
to more than 1 byte by using one of the pad fields as a length.
No backward compatibility issues as new F/W versions will treat 0
length as a 1 byte length transaction.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Divide the code into 3 different parts:
- Copy kernel parameters
- Setting device behaivor per asic
- Fixup of various device parameters according to the device behaivor.
In addition, remove non-relevant code for upstream (simulator support).
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add implementation for new opcodes in the INFO IOCTL:
1. Retrieve the replaced DRAM rows from f/w.
2. Retrieve the pending DRAM rows from f/w.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Refactor the wait-for-user-interrupt routine to make it more
generic for re-use for other user exposed h/w interfaces in future
ASICs.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In Signaling-From-Graph case, the driver didn't set the hw_sob pointer
at the right place, which is needed for the cs completion
check prior to start sending all the master/slaves jobs to device.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In addition to the clock throttling reason, user should be able
to obtain also the start time and the duration of the throttling
event.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to increase maximum wait-for-interrupt timeout, change it
to 64 bit variable. This wait is used only by newer ASICs, so no
problem in changing this interface at this time.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
CPUCP_PACKET_POWER_GET packet type was used for both
hl_get_power() and hl_set_power().
To align with other sensor functions hl_set_power()
should use CPUCP_PACKET_POWER_SET.
This packet will only be used with newer ASICs, so need to add
a compatibility flag to the asic properties to indicate whether to use
this packet or the GET packet.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of device reset, the driver does a force trigger on all waiting
users to release them from waiting. However, the driver does not handle
error scenario while waiting.
hl_interrupt_wait_ioctl() now exits the wait in case of an error with
abort status.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Once we read indication of whether f/w is doing the reset, we don't
want to clear it, until the next time we read this indication.
Otherwise, we might be in a state of wrong indication.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Using a variable poll interval for fw loading allows us to support
much slower environments (emulation) while changing only a single
line in the code, instead of choosing a different interval in each
function that polls.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Up until now the driver stored indication if Linux was loaded on the
device CPU. This was needed in order to coordinate some tasks that are
performed by the Linux.
In future ASICs, many of those tasks will be performed by the boot
fit, so now we need the same indication of boot fit load status.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The PCI MMU cache is two layered. The upper layer, memcache, uses cache
lines, the bottom layer doesn't.
Hence, after PMMU map operation we have to invalidate memcache, to avoid
the situation where the new entry is already in the cache due to its
cache line being fully in the cache.
However, we do not have to invalidate the lower cache, and here we can
optimize, since cache invalidation is time consuming.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The enum vm_type was abused, used once as a value (indication
memory type for map) and once as a flag (for cache invalidation).
This makes it hard to add new and still keep it meaningful, hence it
is better to split into one enum for values and one for flags.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently LAST_MASK is a global, but really it is an MMU implementation
specific. We need this change for future ASICs.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
VA blocks are currently stored in an inconsistent way. Sometimes block
end is inclusive, sometimes exclusive. This leads to wrong size
calculations in certain cases, plus could lead to a segmentation fault
in case mapping process fails in the middle and we try to roll it back.
Need to make this consistent - start inclusive till end inclusive.
For example, the regions table may now look like this:
0x0000 - 0x1fff : allocated
0x2000 - 0x2fff : free
0x3000 - 0x3fff : allocated
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Do not use a dma channel for debugfs requested transfer if it's
QM is not idle.
Signed-off-by: Guy Zadicario <gzadicario@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The boot status flag "SRAM available" can be set by f/w Linux (in the
general case) or by f/w uboot (in some specific debug scenario) but
never by f/w preboot.
Hence, when polling the boot status flags in the preboot stage we do not
want to poll on "SRAM Avialable".
The special case in which uboot set this flag is when we are running
special debug scenario without Linux. In this case, at some point during
the boot, the uboot relocates its code to the DRAM and then set the
specified flag.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
VA range info could assist in debugging VA allocation bugs.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There are rare cases where the device CPU's watchdog has expired and as
a result, the watchdog reset has happened and the CPU will now move to
running its preboot f/w.
When that happens, the driver will only know that a heartbeat failure
occurred. As a result, the driver will send a message to the CPU's main
f/w asking it to reset the device, but because the CPU is now running
preboot, it won't respond and the re-initialization process will later
fail when trying to load the f/w.
The solution is to send the request to the preboot as well, only if the
reset was caused because of HB failure.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In the dynamic FW load protocol the boot status is updated to
"Ready to Boot" once uboot is active.
Polling on other boot status values is a residue of code duplication
from the static protocol and should be removed.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to better track where in the kernel the dma-buf code is used,
put the symbols in the namespace DMA_BUF and modify all users of the
symbols to properly import the namespace to not break the build at the
same time.
Now the output of modinfo shows the use of these symbols, making it
easier to watch for users over time:
$ modinfo drivers/misc/fastrpc.ko | grep import
import_ns: DMA_BUF
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To avoid checking if fence exists multipled times, changed fence
handling to depend only on the fence status field:
Busy, which means CS still did not completed :
Add its QID so multi CS wait on its completion.
Finished, which means CS completed and fence exists:
Raise its completion bit if it finished mcs handling and
update if necessary the earliest timestamp.
Gone, which means CS already completed and fence deleted:
Update multi CS data to ignore timestamp and raise its
completion bit.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
No need to check the return value if the following action is the same for
both cases. In addition, now that hl_ctx_free() doesn't print if the
context is not released, its name can be misleading as the context might
stay alive after it is executed with no indication for that.
Hence we can discard it and simply put the refcount.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Remove the flag that determines whether to take a timestamp once the
interrupt arrives.
Instead, always take the timestamp once per interrupt.
This is a must for the user-space to measure its graph operations
to evaluate the graph computation time.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When adding a new node to the hpriv list, the driver should
initialize its fields before adding the new node.
Otherwise, there may be some small chance of another thread traversing
that list and accessing the new node's fields without them being
initialized.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Make the frequency set/get functionality common to all ASICs.
This makes more code reusable when adding support for newer ASICs.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix the following build/link error by adding a dependency on the CRC32
routines:
ld: drivers/misc/habanalabs/common/firmware_if.o: in function `hl_fw_dynamic_request_descriptor':
firmware_if.c:(.text.unlikely+0xc89): undefined reference to `crc32_le'
Fixes: 8a43c83fec ("habanalabs: load boot fit to device")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Implement the calls to the dma-buf kernel api to create a dma-buf
object backed by FD.
We block the option to mmap the DMA-BUF object because we don't support
DIRECT_IO and implicit P2P. We only implement support for explicit P2P
through importing the FD of the DMA-BUF.
In the export phase, we provide to the DMA-BUF object an array of pages
that represent the device's memory area. During the map callback,
we convert the array of pages into an SGT. We split/merge the pages
according to the dma max segment size of the importer.
To get the DMA address of the PCI bar, we use the dma_map_resources()
kernel API, because our device memory is not backed by page struct
and this API doesn't need page struct to map the physical address to
a DMA address.
We set the orig_nents member of the SGT to be 0, to indicate to other
drivers that we don't support CPU mappings.
Note that in Habanalabs's ASICs, the device memory is pinned and
immutable. Therefore, there is no need for dynamic mappings and pinning
callbacks.
Also note that in GAUDI we don't have an MMU towards the device memory
and the user works on physical addresses. Therefore, the user doesn't
pass through the kernel driver to allocate memory there. As a result,
only for GAUDI we receive from the user a device memory physical address
(instead of a handle) and a size.
We check the p2p distance using pci_p2pdma_distance_many() and refusing
to map dmabuf in case the distance doesn't allow p2p.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When polling fences for multi CS, it is possible that fence is
no longer exists (its corresponding CS completed and the fence was
deleted) but we still accessing its parameters, causing NULL pointer
dereference.
Fixed by checking if fence exits before accessing its parameters.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Race condition occurs when CS fence completes and multi CS did not
completed yet, while waiting for multi CS ends and returns indication
to user that the CS completed. Next wait for multi CS may be triggered
by previous multi CS completion without any current CS completed,
causing an error.
Example scenario :
1. User do multi CS wait for CSs 1 and 2 on master QID 0
2. CS 1 and 2 reached the "cs release" code. The thread of CS 1
completed both the CS and multi CS handling but the completion
thread of CS 2 completed the CS but still did not executed
complete_multi_cs (note that in CS completion the sequence is to
first do complete all for the CS and then another complete all to
signal the multi_cs)
3. User received indication that CS 1 and 2 completed (since we check
the CS fence and both indicated as completed) and immediately waits
on CS 3 and 4, also on master QID 0.
4. Completion thread of CS2 executed complete_multi_cs before
completion of CS 3 and 4 and so will trigger the multi CS wait of
CSs 3 and 4 as they wait on master QID 0.
This will trigger multi CS completion although none of its
current CS has been completed.
Fixed by adding multi CS complete handling indication for each CS.
CS will be marked to the user as completed only if its fence completed
and multi CS handling is done.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There may be a situation where drivers receives continuous fatal H/W
error events from FW immediately post reset cycle.
This may be due to some fault on the silicon itself.
In such case its better to bypass reset cycle so we won't be stuck in
endless loop of resets.
This commit bypasses reset request in case driver received two back to
back FW fatal error before first occurrence of heartbeat event.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Taking an accurate timestamp in a close proximity of the interrupt is
required for user side statistics management.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The driver allows only a single process to open a device's FD at any
single time. This is done by checking "hdev->compute_ctx" under mutex.
Therefore, to prevent a race between the moment a user closes it's FD
and when another user tries to open the device, we need to make sure
that clearing this variable is the very last thing that is done in the
code of the FD's release.
I'm moving the idle check before clearing this variable and the
"reset on device release". btw, if the reset happens it will prevent
any other user from opening the device until the reset is finished.
An important thing to note is that we need to remove the user process
that is closing the device from the process list BEFORE calling the
reset function. That is to prevent a case where the reset code will
try to kill that user process and it is unnecessary as the process
doesn't hold any device/driver resources anymore.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reset to the device is not necessarily due to an error, so print it
as info instead of error.
In addition, print the type of reset we are doing:
- reset of the entire device (aka hard reset)
- reset of the device after user have released it (less than hard reset)
- lighter reset of an inference device (aka soft reset)
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Soft-reset is the procedure where we reset only the compute/DMA engines
of the device, without requiring the current user-space process to
release the device.
This type of reset can happen if TDR event occurred (a workload got
stuck) or by a root request through sysfs.
This is only relevant for inference ASICs, as there is no real-world
use-case to do that in training, because training runs on multiple
devices.
In addition, we also do (in certain ASICs) a reset upon device release.
That reset uses the same code as the soft-reset.
Therefore, to better differentiate between the two resets, it is better
to rename the soft-reset support as "inference soft-reset", to make
the code more self-explanatory.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The translation in debugfs of device memory MMU virtual addresses was
wrong as it did not take into consideration the fact that the page
sizes there can be not power of 2.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to avoid user target value wraparound, we modify the
current interface so user will be able to wait for an 8-byte
target value rather than a 4-byte value.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
During TDR handling, we check multiple times if CS is valid.
No need to perform this check as CS must be valid at all time
during the TDR handling.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add support to retrieve following power info via HWMON:
- instantaneous power value
- highest value since last reset
- reset the highest place holder
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Instead of having dedicated function per message that we want to send
to the firmware in COMMS protocol, have a generic function that we can
call to from other parts of the driver
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Instead of using the Linux kernel HWMON enums definition when
communicating with the firmware, use proprietary HWMON based enums
i.e. map hwmon.h header enum to cpucp_if.h based enum while.
This is needed because the HWMON enums are not forcing backward
compatibility and therefore changes can break compatibility between
newer driver and older firmware.
The driver will check for CPU_BOOT_DEV_STS0_MAP_HWMON_EN bit to
validate if f/w supports cpucp->hwmon enum mapping to support older
firmware where this mapping won't be available.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Command submission timeout is currently determined during driver
loading time. As some environments requires this timeout to be
modified in runtime, we introduce a new debugfs node that controls
the timeout value without the need to reload the driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In wait for CS IOCTL code, the driver resets the incoming args structure
before returning to the user, regardless of the return value of the
IOCTL.
In case the IOCTL returns EINTR, resetting the args will result in error
in case the userspace will repeat the ioctl call immediately (which is
the behavior in the hl-thunk userspace library).
The solution is to reset the args only if the driver returns success (0)
as a return value for the IOCTL.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Staged submission consists of multiple command submissions.
In order to be explicit, driver should return a single cs sequence
for every cs in the submission, or else user may try to wait on
an internal CS rather than waiting for the whole submission.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add handling for case where the user doesn't set wait offset,
and keeps it as 0. In such a case the driver will decrement one
from this zero value which will cause the code to wait for
wrong number of signals.
The solution is to treat this case as in legacy wait cs,
and wait for the next signal.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As user can send wrong arguments to multi CS API, we rate limit
the amount of errors dumped to dmesg, in addition we change the
severity to warning.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Couple of fixes to the LBW RR configuration:
1. Add missing configuration of the SM RR registers in the DMA_IF.
2. Remove HBW range that doesn't belong.
3. Add entire gap + DBG area, from end of TPC7 to end of entire
DBG space.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a spelling mistake in a literal string. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As collective wait operation is required only when NIC ports are
available, we disable the option to submit a CS in case all the ports
are disabled, which is the current situation in the upstream driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Due to FLR scenario when running inside a VM, we must not use indirect
MSI because it might cause some issues on VM destroy.
In a VM we use single MSI mode in contrary to multi MSI mode which is
used in bare-metal.
Hence direct MSI should be used in single MSI mode only.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of single staged cs with both first/last indications
set, we reach a scenario where in cs_release function flow
we don't cancel the TDR work before freeing the cs memory,
this lead to kernel OOPs since when the timer expires
the work pointer will be freed already.
In addition treat wait encaps cs "not found" handle
as "OK" for the user in order to keep the user interface
for both legacy and encpas signal/wait features the same.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
We have a potential race where a user interrupt can be received
in between user thread value comparison and before request was
added to wait list. This means that if no consecutive interrupt
will be received, user thread will timeout and fail.
The solution is to add the request to wait list before we
perform the comparison.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This commit corrects CARD NAME for Gaudi as "HL205"
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When the f/w runs in secured mode, it can reset the ASIC when certain
events occur. In unsecured mode, the driver asks the f/w to reset the
ASIC for those events.
We need to perform the entire reset procedure but without accessing the
ASIC. i.e. without halting the engines and without sending messages
to the f/w.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix 2 areas in the code where it's possible the code will
go to sleep while holding a spinlock.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
copy_from_user might sleep so we can never call it when we have
a spinlock.
Moreover, it is not necessary in waiting for user interrupt, because
if multiple threads will call this function on the same interrupt,
each one will have it's own fence object inside the driver. The
user address might be the same, but it doesn't really matter to us,
as we only read from it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Checking if the device is operational when entering the function to
wait for user interrupt is not something that is useful or necessary.
It is not done in any other wait_for_cs ioctl path.
If the device becomes non-operational during the wait, the reset
function will make sure the process wait is interrupted.
Instead, move the check to the beginning of hl_wait_ioctl(). It will
block any attempt to wait on CS or user interrupt once the device
is already marked as non-operational.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because this spinlock is taken in an interrupt handler, we must use
the spin_lock_irqsave/irqrestore version to disable the interrupts
on the local CPU. Otherwise, we can have a potential deadlock (if
the interrupt handler is scheduled to run on the same cpu that the
code who took the lock was running on).
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
On init, the disabled state is cleared right before hw_init and that
causes the device to report on "Operational" state before the device
initialization is finished. Although the char device is not yet exposed
to the user at this stage, the sysfs entries are exposed.
This can cause errors in monitoring applications that use the sysfs
entries.
In order to avoid this, a new state "in device creation" is introduced
to ne reported when the device is not disabled but is still in init
flow.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This must be done to clear the internal mem cache so we won't get
ecc errors on the first invalidation.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In secured mode, the CGM is disabled. Therefore, the DC power is
higher. Without taking it into consideration, the utilization is
12-15% at idle.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The out of bounds SLM access TPC interrupt indicates a severe compiler
bug and needs to be informed to user.
This interrupt is currently masked so unmask it.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
It is useful to have the ability to see which user address was pinned
to which physical address during the initial mapping. We already have
all that info stored, but no means to search this data (which may be
quite large).
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case F/W security is enabled driver cannot access ECC registers,
hence driver must fetch the ECC info from F/W.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
During the integration, the multi-CS requirements were refined:
- The multi CS call shall wait on "per-ASIC" predefined stream masters
instead of set of streams.
- Stream masters are set of QIDs used by the upper SW layers (synapse)
for completion (must be an external/HW queue).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Current "state dump" is lacking of monitored SOB IDs. Add for
convenience.
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because we don't have multiple contexts in GAUDI, and to minimize
calls to is_idle function (which uses many register reads), move
the call to clear the user registers to the opening of the single
user context.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Various f/w versions have different timeouts, so increase the default
timeout to accommodate all the options.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add several new packets between driver and firmware.
Add matching compatibility bits for backward compatibility.
Add support for 4K event types.
Add information about pcie errors.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because the register reads might be trapped by the hypervisor in
certain deployments, minimize the number of reads during runtime by
moving static initializations to functions that occur during device
initialization instead of context open.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The address resolution via debugfs was not taking into consideration the
page offset, resulting in a wrong address.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently userptr endpoint in debugfs prints out virtual addresses
in the user process memory space, without specifying their owner process
ID. User space virtual address is meaningless without knowing the owner
process.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
HW init is mostly about configuring registers. Therefore, it is better
to activate DMAs only in late init and afterwards.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to enhance debuggability, we will scrub the whole HBM to
a specific value, in case HBM scrubbing is enabled. Scrubbing will be
performed after reset and after user closes the FD.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently there is no validity check for event ID received from F/W,
Thus exposing driver to memory overrun.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For some ASICs, the f/w reads the msg_to_cpu_reg value after
reset, and for some it doesn't.
Therefore, to be sure f/w doesn't read a wrong value after reset, we
need to clear this register before the reset occurs.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to better support variants of the same ASIC
the set_pci_regions function is now an ASIC function which
allows each ASIC to implement it internally, thus keeping
all definitions static to the file.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Done as the bar size can exceed 4GB.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add the server type property to the hl_info_hw_ip_info structure
that is exposed to the user via the INFO IOCTL.
This is needed by the userspace s/w stack to know the connections map
of the internal links that connect the ASIC among themselves inside the
server.
The F/W will tell us, as part of the NIC information, the server type
that the GAUDI is located in.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This warning is redundant as we will print a notice in case the device
is still in use after the FD was closed. No need to print the same
message per context.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This commit is the second part of the encapsulated signals feature.
It contains the driver support for submission of cs with encapsulated
signals and the wait for them.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The signaling from within encapsulated OP capability is merged into the
existing stream architecture, such that one can trigger multiple
signaling from an encapsulated op, according to the time the event
was done in the graph execution and avoid the need to wait for the
whole encapsulated OP execution to be complete before the stream can
signal.
This commit implements only the reserve/unreserve part.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently the SOB reset was in fence release function which happens
only at the CS wraparound during the CS allocation time.
In order to support the new encapsulated signals reservation feature,
we need to move the SOB reset to an earlier phase because this SOB
could reach it's max value very fast using the signal reservation.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When user sends multiple CSs, waiting for each CS is not efficient
as it involves many user-kernel context switches.
In order to address this issue we add support to "wait on multiple CSs"
using a new uAPI which can wait on maximum of 32 CSs. The new uAPI is
defined using a new flag - WAIT_FOR_MULTI_CS - in the wait_for_cs IOCTL.
The input parameters for this uAPI will be:
@seq: user pointer to an array of up to 32 CS's sequence numbers.
@seq_array_len: length of sequence array.
@timeout_us: timeout for waiting for any CS.
The output paramateres for this API will be:
@status: multi CS ioctl completion status (dedicated status was added as
well).
@flags: bitmap of output flags of the CS.
@cs_completion_map: bitmap for multi CS, if CS sequence that was placed
in index N in input seq array has completed- the N-th
bit in cs_completion_map will be 1, otherwise it will
be 0.
@timestamp_nsec: timestamp of the first completed CS
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
To add proper support for wait-for-multi-CS, locking the CS lock
for each CS fence in the list is not efficient.
Instead, this patch add support to lock the CS lock once to get all
required fences.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The driver quietly handles memory mappings that were not freed so no
need to print a warning about that when user closes the FD.
Accordingly, revise the text that is printed in case the device is
still in use after the user process closed the FD.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a scenario where an ongoing soft reset would race with an
ongoing heartbeat routine, eventually causing heartbeat to fail and
thus to escalate into a hard reset.
With this fix, soft-reset procedure will disable heartbeat CPU messages
and flush the (ongoing) current one before continuing with reset code.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Print the SM name instead of index because it is more informational for
the user to know the SM name instead of id when a SM interrupt occurs.
In addition, the index that is printed is of the SOB group, not
a specific SOB.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
State dump is relevant to the user in case of Sync Manager error, so
we need to trigger it in that case as well.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Each ASIC can have a different offset to add to a host dma address,
to enable the ASIC to access that host memory.
The usage for this can be common code so add this to the asic
property structure.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Recently, the size parameter in userptr structure was change to u64.
As a result, we need to change the type of the local range_size
in device_va_to_pa() to u64 to avoid overflow.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If hard reset fails after the call to hw_fini and before loading the
linux image to the device, a subsequent call to hw_fini should
communicate via COMMS (or MSG_TO_CPU regs for old FW versions).
However, the driver still tries in this case to communicate via the GIC,
and thus no hard reset is actually done.
To avoid that, the patch clears the linux_loaded flag after every call
to hw_fini.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of host-resident MMU, when the page tables pool is destroyed,
its pointer is not nullified correctly.
As a result, on a device fini which happens after a failing reset, the
already destroyed pool is accessed, which leads to a kernel panic.
The patch fixes the setting of the pool pointer to NULL.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This function will be used for more mmap operations than just
mmaping CBs.
Signed-off-by: Zvika Yehudai <zyehudai@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
At the first stage, only gaudi core dump shall be implemented, not
including the status registers.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
With the infrastructure in place, monitors and fences dump shall be
implemented.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
To improve the user's ability to debug the case where a workload that
is part of executing training/inference of a topology is getting stuck,
we need to add a 'core dump' each time a CS times-out. The 'core dump'
shall contain all relevant Sync Manager information and corresponding
fence values.
The most recent dumps shall be accessible via debugfs, under
'state_dump' node. Reading from the node will provide the oldest dump
available. Writing an integer value X will discard X dumps, starting
with the oldest one, i.e. subsequent read will now return newer
dumps.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The previous function we used, find_get_pid(), wasn't good in case
the user process was run inside docker.
As a result, we didn't had the PID and we couldn't kill the user
process in case the device got stuck and we needed to reset the
device.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Sometimes we may need to disable optimization of using huge pages
in our memory management code. Add such a flag to the function that
creates the list of physical pages that would be programmed into the
device MMU.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Increase the size variable in the userptr structure to 64-bit. That
variable describes the size of the memory allocation of the user that
is now being mapped into the device. The mapping can be larger than
4GB, so we need to support it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Same as we handle it in the regular wait for CS, we need to handle the
case where the waiting for user interrupt was interrupted. In that case,
we need to return correct error code to the user.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In device fini there was missing a call to release all pending user
interrupts. That can cause a process to be stuck inside the driver's
IOCTL of wait for interrupts, in case the device is removed or
simulator is killed at the same time.
In addition, also call to remove inactive codec job was missing.
Moreover, to prevent such errors in the future (where code is added
to reset path but not to device fini), we moved some common parts
to two dedicated functions:
cleanup_resources
take_release_locks
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case user interrupt arrived but the completion value is less than
the target value, we want to retry the wait.
However, before the retry we must reinitialize the completion object,
under spin-lock, so the wait function won't exit immediately because
the completion object is already completed (from the previous
interrupt).
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update recent changes made in firmware header files, which contain
a minor COMMS protocol change and new error status definitions.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
A new user flag is required to make memory map hint mandatory, in
contrast to the current situation where it is best effort.
This is due to the requirement to map certain data to specific
pre-determined device virtual address ranges.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add support for pre-determined driver-reserved device VA address ranges.
This is needed for future ASIC support where some contents must be
mapped into these pre-determined ranges because the H/W will be
configured using these ranges.
In case the user asks to map a VA without a hint address, avoid
allocating the device VA from the reserved ranges.
Make sure the validation checks of the hint address take into account
situation where the DRAM page size is not pow of 2.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is code related to hard-reset, which is done in gaudi specific
code. However, this code can be used by future ASICs and therefore it
is better to move it to the common code section.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
We add support for NIC DERR ECC error events, in case this error
is received a device reset will be performed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In preparation for a new feature that allows the user to reserve
signals ahead of submissions, we need to change a current assumption
in the code.
Currently, the driver uses 2 SOBs to support signal CS. When the first
SOB reaches max value, the driver switches to the other one and assumes
that when it will need to switch back to the first one, all of the
signals have already been handled.
This assumption won't hold when the new feature will be added, because
using signal reservation, the driver can reach the max SOB value very
fast.
The change is to add a validity check when submitting a signal CS, to
make sure the previous SOB is available (all the signals attached to
it indeed finished).
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
We introduce a new type of reset which is reset upon device release.
This reset is very similar to soft reset except the fact it is
performed only upon device release and not upon user sysfs request
nor TDR.
The purpose of this reset is to make sure the device is returned to
IDLE state after the current user has finished working with the device.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
To be able to debug long-running CS better, without changing the
userspace code, we are adding a new option through debugfs interface
to skip the reset of the device in case of CS timeout.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently driver sends fc interrupt id to FW instead of using
cpu interrupt id. We intend to fix that and keep backward
compatibility by using the same interrupt values.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There was a rogue #ifdef that crept into the upstream code for
backwards compatibility which isn't needed of course.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case QMAN has an error and stop_on_err is true, print specific
information of the "offending" command buffer batch.
If the error occurred on one of the higher CPs, the CQ pointer and size
will be printed along with (up to) last 8 PQEs of the stream.
If the error occurred in the lower CP, the CQ pointer and size will be
printed along with (up to) last 8 PQEs of ALL upper CPs as we have no
way to know which upper CP sent the job there.
This is done so higher SW levels will be able to debug their CS by
extracting the raw data of the offending command buffer batch and
examine those offline to detect the issue.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In a system with multiple ASICs, there is a need to provide monitoring
tools with information on how long a device was opened and how many
times a device was opened.
Therefore, we add a new opcode to the INFO ioctl to provide that
information.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update STMTCSR and STMSYNCR values in order to reduce amount of sync
packets
Signed-off-by: Tal Albo <talbo@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
fix the following smatch warnings:
goya_pin_memory_before_cs()
warn: '&userptr->job_node' not removed from list
gaudi_pin_memory_before_cs()
warn: '&userptr->job_node' not removed from list
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
pin_user_pages_fast() might fail and return a negative number, or pin
less pages than requested and return the number of the pages that were
pinned.
For the latter, it is informative to print also the memory size and the
number of requested pages.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it
must be undone by a corresponding 'pci_disable_pcie_error_reporting()'
call, as already done in the remove function.
Fixes: 2e5eda4681 ("habanalabs: PCIe Advanced Error Reporting support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Firmware in habanalabs devices is composed of several components.
During device initialization, we read these versions from the device.
Print them during device initialization to allow better visibility in
automated systems.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Hard reset flow on PLDM might take more than 2 minutes.
Hence add a dedicated hard reset timeout of 6 minutes for PLDM.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In current code, for dynamic f/w loading flow, DRAM scrambling is
enabled post Linux fit image is loaded to the card. This can cause the
device CPU to go into reset state.
The correct sequence should be:
1. Load boot fit image
2. Enable scrambling
3. Load Linux fit image
This commit aligns the DRAM scrambling enabling with the static f/w load
flow.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If there is an error in the QMAN/engine, there is no point of trying
to continue running the workload. It is better to stop to allow the
user to debug the program.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case we have EQ fault we would like to know about it.
For this, a status bitmask was added in which EQ_FAULT bit is
set by FW in case of EQ fault.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When there is an ECC error in the HBM, return a standard error code,
-EIO in this case, and not a positive value.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>