Age | Commit message (Collapse) | Author |
|
GAUDI supports only hard-reset. Therefore, this function should
return an error of operation not permitted.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The ASIC-specific soft_reset_late_init() is now called after either
soft-reset or reset-upon-device-release. Therefore, it needs a more
appropriate name.
No need to split it to two functions, as an ASIC either supports
soft-reset or reset-upon-device-release.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Changing the frequency automatically is only done in Goya. In future
ASICs this is done inside the firmware. Therefore, move the common code
into the Goya specific files.
Main changes as part of the commit are:
1. The thread for setting frequency is moved from device_late_init
to goya_late_init
2. hl_device_set_frequency is removed from hl_device_open as it is
not relevant for other ASICs and for Goya it is taken care by
the thread
3. hl_device_set_frequency is renamed as goya_set_frequency
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Currently there is a deadlock in driver in scenarios where MMU
cache invalidation fails. The issue is basically device reset
being performed without releasing the MMU mutex.
The solution is to skip device reset as it is not necessary.
In addition we introduce a slight code refactor that prints the
invalidation error from a single location.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Getting the used PLL index with which to send the CPUPU packet relies on
the CPUCP info packet.
In case CPU queues are not enabled getting the PLL index will issue an
error and in some ASICs will also fail the driver load.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
A new uAPI is added for debug purposes of the user-space to retrieve
errors related data from previous session (before device reset was
performed).
Inforamtion is filled when a razwi or CS timeout happens and can
contain one of the following:
1. Retrieve timestamp of last time the device was opened and razwi or
CS timeout happened.
2. Retrieve information about last CS timeout.
3. Retrieve information about last razwi error.
This information doesn't contain user data, so no danger of data
leakage between users.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In Signaling-From-Graph case, the driver didn't set the hw_sob pointer
at the right place, which is needed for the cs completion
check prior to start sending all the master/slaves jobs to device.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In addition to the clock throttling reason, user should be able
to obtain also the start time and the duration of the throttling
event.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Rename reset flags for better readability as compared to
HL_RESET_CAUSE* enum shared with the f/w.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
CPUCP_PACKET_POWER_GET packet type was used for both
hl_get_power() and hl_set_power().
To align with other sensor functions hl_set_power()
should use CPUCP_PACKET_POWER_SET.
This packet will only be used with newer ASICs, so need to add
a compatibility flag to the asic properties to indicate whether to use
this packet or the GET packet.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Up until now the driver stored indication if Linux was loaded on the
device CPU. This was needed in order to coordinate some tasks that are
performed by the Linux.
In future ASICs, many of those tasks will be performed by the boot
fit, so now we need the same indication of boot fit load status.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The enum vm_type was abused, used once as a value (indication
memory type for map) and once as a flag (for cache invalidation).
This makes it hard to add new and still keep it meaningful, hence it
is better to split into one enum for values and one for flags.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Currently LAST_MASK is a global, but really it is an MMU implementation
specific. We need this change for future ASICs.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Do not use a dma channel for debugfs requested transfer if it's
QM is not idle.
Signed-off-by: Guy Zadicario <gzadicario@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
There are rare cases where the device CPU's watchdog has expired and as
a result, the watchdog reset has happened and the CPU will now move to
running its preboot f/w.
When that happens, the driver will only know that a heartbeat failure
occurred. As a result, the driver will send a message to the CPU's main
f/w asking it to reset the device, but because the CPU is now running
preboot, it won't respond and the re-initialization process will later
fail when trying to load the f/w.
The solution is to send the request to the preboot as well, only if the
reset was caused because of HB failure.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Make the frequency set/get functionality common to all ASICs.
This makes more code reusable when adding support for newer ASICs.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Implement the calls to the dma-buf kernel api to create a dma-buf
object backed by FD.
We block the option to mmap the DMA-BUF object because we don't support
DIRECT_IO and implicit P2P. We only implement support for explicit P2P
through importing the FD of the DMA-BUF.
In the export phase, we provide to the DMA-BUF object an array of pages
that represent the device's memory area. During the map callback,
we convert the array of pages into an SGT. We split/merge the pages
according to the dma max segment size of the importer.
To get the DMA address of the PCI bar, we use the dma_map_resources()
kernel API, because our device memory is not backed by page struct
and this API doesn't need page struct to map the physical address to
a DMA address.
We set the orig_nents member of the SGT to be 0, to indicate to other
drivers that we don't support CPU mappings.
Note that in Habanalabs's ASICs, the device memory is pinned and
immutable. Therefore, there is no need for dynamic mappings and pinning
callbacks.
Also note that in GAUDI we don't have an MMU towards the device memory
and the user works on physical addresses. Therefore, the user doesn't
pass through the kernel driver to allocate memory there. As a result,
only for GAUDI we receive from the user a device memory physical address
(instead of a handle) and a size.
We check the p2p distance using pci_p2pdma_distance_many() and refusing
to map dmabuf in case the distance doesn't allow p2p.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In the kernel it is common to use u32 and not uint32_t.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
There may be a situation where drivers receives continuous fatal H/W
error events from FW immediately post reset cycle.
This may be due to some fault on the silicon itself.
In such case its better to bypass reset cycle so we won't be stuck in
endless loop of resets.
This commit bypasses reset request in case driver received two back to
back FW fatal error before first occurrence of heartbeat event.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Couple of fixes to the LBW RR configuration:
1. Add missing configuration of the SM RR registers in the DMA_IF.
2. Remove HBW range that doesn't belong.
3. Add entire gap + DBG area, from end of TPC7 to end of entire
DBG space.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
There is a spelling mistake in a literal string. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Due to FLR scenario when running inside a VM, we must not use indirect
MSI because it might cause some issues on VM destroy.
In a VM we use single MSI mode in contrary to multi MSI mode which is
used in bare-metal.
Hence direct MSI should be used in single MSI mode only.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This commit corrects CARD NAME for Gaudi as "HL205"
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
When the f/w runs in secured mode, it can reset the ASIC when certain
events occur. In unsecured mode, the driver asks the f/w to reset the
ASIC for those events.
We need to perform the entire reset procedure but without accessing the
ASIC. i.e. without halting the engines and without sending messages
to the f/w.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This register shouldn't be modified by user. Prefetch is disabled
in Gaudi.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This must be done to clear the internal mem cache so we won't get
ecc errors on the first invalidation.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
It's more readable for the size to be in decimal.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In secured mode, the CGM is disabled. Therefore, the DC power is
higher. Without taking it into consideration, the utilization is
12-15% at idle.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The out of bounds SLM access TPC interrupt indicates a severe compiler
bug and needs to be informed to user.
This interrupt is currently masked so unmask it.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In case F/W security is enabled driver cannot access ECC registers,
hence driver must fetch the ECC info from F/W.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
During the integration, the multi-CS requirements were refined:
- The multi CS call shall wait on "per-ASIC" predefined stream masters
instead of set of streams.
- Stream masters are set of QIDs used by the upper SW layers (synapse)
for completion (must be an external/HW queue).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Current "state dump" is lacking of monitored SOB IDs. Add for
convenience.
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Because we don't have multiple contexts in GAUDI, and to minimize
calls to is_idle function (which uses many register reads), move
the call to clear the user registers to the opening of the single
user context.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Various f/w versions have different timeouts, so increase the default
timeout to accommodate all the options.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Because the register reads might be trapped by the hypervisor in
certain deployments, minimize the number of reads during runtime by
moving static initializations to functions that occur during device
initialization instead of context open.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
HW init is mostly about configuring registers. Therefore, it is better
to activate DMAs only in late init and afterwards.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In order to enhance debuggability, we will scrub the whole HBM to
a specific value, in case HBM scrubbing is enabled. Scrubbing will be
performed after reset and after user closes the FD.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Currently there is no validity check for event ID received from F/W,
Thus exposing driver to memory overrun.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In order to better support variants of the same ASIC
the set_pci_regions function is now an ASIC function which
allows each ASIC to implement it internally, thus keeping
all definitions static to the file.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Add the server type property to the hl_info_hw_ip_info structure
that is exposed to the user via the INFO IOCTL.
This is needed by the userspace s/w stack to know the connections map
of the internal links that connect the ASIC among themselves inside the
server.
The F/W will tell us, as part of the NIC information, the server type
that the GAUDI is located in.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This commit is the second part of the encapsulated signals feature.
It contains the driver support for submission of cs with encapsulated
signals and the wait for them.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The signaling from within encapsulated OP capability is merged into the
existing stream architecture, such that one can trigger multiple
signaling from an encapsulated op, according to the time the event
was done in the graph execution and avoid the need to wait for the
whole encapsulated OP execution to be complete before the stream can
signal.
This commit implements only the reserve/unreserve part.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Currently the SOB reset was in fence release function which happens
only at the CS wraparound during the CS allocation time.
In order to support the new encapsulated signals reservation feature,
we need to move the SOB reset to an earlier phase because this SOB
could reach it's max value very fast using the signal reservation.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
When user sends multiple CSs, waiting for each CS is not efficient
as it involves many user-kernel context switches.
In order to address this issue we add support to "wait on multiple CSs"
using a new uAPI which can wait on maximum of 32 CSs. The new uAPI is
defined using a new flag - WAIT_FOR_MULTI_CS - in the wait_for_cs IOCTL.
The input parameters for this uAPI will be:
@seq: user pointer to an array of up to 32 CS's sequence numbers.
@seq_array_len: length of sequence array.
@timeout_us: timeout for waiting for any CS.
The output paramateres for this API will be:
@status: multi CS ioctl completion status (dedicated status was added as
well).
@flags: bitmap of output flags of the CS.
@cs_completion_map: bitmap for multi CS, if CS sequence that was placed
in index N in input seq array has completed- the N-th
bit in cs_completion_map will be 1, otherwise it will
be 0.
@timestamp_nsec: timestamp of the first completed CS
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Print the SM name instead of index because it is more informational for
the user to know the SM name instead of id when a SM interrupt occurs.
In addition, the index that is printed is of the SOB group, not
a specific SOB.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
State dump is relevant to the user in case of Sync Manager error, so
we need to trigger it in that case as well.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Each ASIC can have a different offset to add to a host dma address,
to enable the ASIC to access that host memory.
The usage for this can be common code so add this to the asic
property structure.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
This function will be used for more mmap operations than just
mmaping CBs.
Signed-off-by: Zvika Yehudai <zyehudai@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
At the first stage, only gaudi core dump shall be implemented, not
including the status registers.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
With the infrastructure in place, monitors and fences dump shall be
implemented.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|