Age | Commit message (Collapse) | Author |
|
Isolate the ABI based on raw SMCs. Code specific to the raw SMC ABI is
moved into smc_abi.c. This makes room for other ABIs with a clear
separation.
The driver changes to use module_init()/module_exit() instead of
module_platform_driver(). The platform_driver_register() and
platform_driver_unregister() functions called directly to keep the same
behavior. This is needed because module_platform_driver() is based on
module_driver() which can only be used once in a module.
A function optee_rpc_cmd() is factored out from the function
handle_rpc_func_cmd() to handle the ABI independent part of RPC
processing.
This patch is not supposed to change the driver behavior, it's only a
matter of reorganizing the code.
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
To avoid checking if fence exists multipled times, changed fence
handling to depend only on the fence status field:
Busy, which means CS still did not completed :
Add its QID so multi CS wait on its completion.
Finished, which means CS completed and fence exists:
Raise its completion bit if it finished mcs handling and
update if necessary the earliest timestamp.
Gone, which means CS already completed and fence deleted:
Update multi CS data to ignore timestamp and raise its
completion bit.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
No need to check the return value if the following action is the same for
both cases. In addition, now that hl_ctx_free() doesn't print if the
context is not released, its name can be misleading as the context might
stay alive after it is executed with no indication for that.
Hence we can discard it and simply put the refcount.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Remove the flag that determines whether to take a timestamp once the
interrupt arrives.
Instead, always take the timestamp once per interrupt.
This is a must for the user-space to measure its graph operations
to evaluate the graph computation time.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
When adding a new node to the hpriv list, the driver should
initialize its fields before adding the new node.
Otherwise, there may be some small chance of another thread traversing
that list and accessing the new node's fields without them being
initialized.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Make the frequency set/get functionality common to all ASICs.
This makes more code reusable when adding support for newer ASICs.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Fix the following build/link error by adding a dependency on the CRC32
routines:
ld: drivers/misc/habanalabs/common/firmware_if.o: in function `hl_fw_dynamic_request_descriptor':
firmware_if.c:(.text.unlikely+0xc89): undefined reference to `crc32_le'
Fixes: 8a43c83fec12 ("habanalabs: load boot fit to device")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Implement the calls to the dma-buf kernel api to create a dma-buf
object backed by FD.
We block the option to mmap the DMA-BUF object because we don't support
DIRECT_IO and implicit P2P. We only implement support for explicit P2P
through importing the FD of the DMA-BUF.
In the export phase, we provide to the DMA-BUF object an array of pages
that represent the device's memory area. During the map callback,
we convert the array of pages into an SGT. We split/merge the pages
according to the dma max segment size of the importer.
To get the DMA address of the PCI bar, we use the dma_map_resources()
kernel API, because our device memory is not backed by page struct
and this API doesn't need page struct to map the physical address to
a DMA address.
We set the orig_nents member of the SGT to be 0, to indicate to other
drivers that we don't support CPU mappings.
Note that in Habanalabs's ASICs, the device memory is pinned and
immutable. Therefore, there is no need for dynamic mappings and pinning
callbacks.
Also note that in GAUDI we don't have an MMU towards the device memory
and the user works on physical addresses. Therefore, the user doesn't
pass through the kernel driver to allocate memory there. As a result,
only for GAUDI we receive from the user a device memory physical address
(instead of a handle) and a size.
We check the p2p distance using pci_p2pdma_distance_many() and refusing
to map dmabuf in case the distance doesn't allow p2p.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
When polling fences for multi CS, it is possible that fence is
no longer exists (its corresponding CS completed and the fence was
deleted) but we still accessing its parameters, causing NULL pointer
dereference.
Fixed by checking if fence exits before accessing its parameters.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Race condition occurs when CS fence completes and multi CS did not
completed yet, while waiting for multi CS ends and returns indication
to user that the CS completed. Next wait for multi CS may be triggered
by previous multi CS completion without any current CS completed,
causing an error.
Example scenario :
1. User do multi CS wait for CSs 1 and 2 on master QID 0
2. CS 1 and 2 reached the "cs release" code. The thread of CS 1
completed both the CS and multi CS handling but the completion
thread of CS 2 completed the CS but still did not executed
complete_multi_cs (note that in CS completion the sequence is to
first do complete all for the CS and then another complete all to
signal the multi_cs)
3. User received indication that CS 1 and 2 completed (since we check
the CS fence and both indicated as completed) and immediately waits
on CS 3 and 4, also on master QID 0.
4. Completion thread of CS2 executed complete_multi_cs before
completion of CS 3 and 4 and so will trigger the multi CS wait of
CSs 3 and 4 as they wait on master QID 0.
This will trigger multi CS completion although none of its
current CS has been completed.
Fixed by adding multi CS complete handling indication for each CS.
CS will be marked to the user as completed only if its fence completed
and multi CS handling is done.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In the kernel it is common to use u32 and not uint32_t.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Update the firmware headers to the latest version
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
There may be a situation where drivers receives continuous fatal H/W
error events from FW immediately post reset cycle.
This may be due to some fault on the silicon itself.
In such case its better to bypass reset cycle so we won't be stuck in
endless loop of resets.
This commit bypasses reset request in case driver received two back to
back FW fatal error before first occurrence of heartbeat event.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Taking an accurate timestamp in a close proximity of the interrupt is
required for user side statistics management.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The driver allows only a single process to open a device's FD at any
single time. This is done by checking "hdev->compute_ctx" under mutex.
Therefore, to prevent a race between the moment a user closes it's FD
and when another user tries to open the device, we need to make sure
that clearing this variable is the very last thing that is done in the
code of the FD's release.
I'm moving the idle check before clearing this variable and the
"reset on device release". btw, if the reset happens it will prevent
any other user from opening the device until the reset is finished.
An important thing to note is that we need to remove the user process
that is closing the device from the process list BEFORE calling the
reset function. That is to prevent a case where the reset code will
try to kill that user process and it is unnecessary as the process
doesn't hold any device/driver resources anymore.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Reset to the device is not necessarily due to an error, so print it
as info instead of error.
In addition, print the type of reset we are doing:
- reset of the entire device (aka hard reset)
- reset of the device after user have released it (less than hard reset)
- lighter reset of an inference device (aka soft reset)
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Soft-reset is the procedure where we reset only the compute/DMA engines
of the device, without requiring the current user-space process to
release the device.
This type of reset can happen if TDR event occurred (a workload got
stuck) or by a root request through sysfs.
This is only relevant for inference ASICs, as there is no real-world
use-case to do that in training, because training runs on multiple
devices.
In addition, we also do (in certain ASICs) a reset upon device release.
That reset uses the same code as the soft-reset.
Therefore, to better differentiate between the two resets, it is better
to rename the soft-reset support as "inference soft-reset", to make
the code more self-explanatory.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
The translation in debugfs of device memory MMU virtual addresses was
wrong as it did not take into consideration the fact that the page
sizes there can be not power of 2.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
In order to avoid user target value wraparound, we modify the
current interface so user will be able to wait for an 8-byte
target value rather than a 4-byte value.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
During TDR handling, we check multiple times if CS is valid.
No need to perform this check as CS must be valid at all time
during the TDR handling.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Add support to retrieve following power info via HWMON:
- instantaneous power value
- highest value since last reset
- reset the highest place holder
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Instead of having dedicated function per message that we want to send
to the firmware in COMMS protocol, have a generic function that we can
call to from other parts of the driver
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Instead of using the Linux kernel HWMON enums definition when
communicating with the firmware, use proprietary HWMON based enums
i.e. map hwmon.h header enum to cpucp_if.h based enum while.
This is needed because the HWMON enums are not forcing backward
compatibility and therefore changes can break compatibility between
newer driver and older firmware.
The driver will check for CPU_BOOT_DEV_STS0_MAP_HWMON_EN bit to
validate if f/w supports cpucp->hwmon enum mapping to support older
firmware where this mapping won't be available.
Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Command submission timeout is currently determined during driver
loading time. As some environments requires this timeout to be
modified in runtime, we introduce a new debugfs node that controls
the timeout value without the need to reload the driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
|
Describe better which driver applies to which SoC, to make configuring
kernel for Samsung SoC easier.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://lore.kernel.org/r/20210924133624.112593-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
|
|
We need the driver-core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the usb fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the serial/tty fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the staging fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the char/misc fixes in here for merging and testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Unfortunately, the architecture provides no means to determine the bit
width of the system counter. However, we do know the following from the
specification:
- the system counter is at least 56 bits wide
- Roll-over time of not less than 40 years
To date, the arch timer driver has depended on the first property,
assuming any system counter to be 56 bits wide and masking off the rest.
However, combining a narrow clocksource mask with a high frequency
counter could result in prematurely wrapping the system counter by a
significant margin. For example, a 56 bit wide, 1GHz system counter
would wrap in a mere 2.28 years!
This is a problem for two reasons: v8.6+ implementations are required to
provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
implementers may select a counter frequency of their choosing.
Fix the issue by deriving a valid clock mask based on the second
property from above. Set the floor at 56 bits, since we know no system
counter is narrower than that.
[maz: fixed width computation not to lose the last bit, added
max delta generation for the timer]
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210807191428.3488948-1-oupton@google.com
Link: https://lore.kernel.org/r/20211017124225.3018098-13-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Space prohibited between function name and
open parenthesis '('
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Link: https://lore.kernel.org/r/20210928151833.589843-4-f.suligoi@asem.it
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Alignment should match open parenthesis.
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Link: https://lore.kernel.org/r/20210928151833.589843-3-f.suligoi@asem.it
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The "if" conditional statement is not a single statement,
so both branches require braces.
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Link: https://lore.kernel.org/r/20210928151833.589843-2-f.suligoi@asem.it
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Braces {} are not necessary for single statement blocks.
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Link: https://lore.kernel.org/r/20210928151833.589843-1-f.suligoi@asem.it
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add support for setting dma coherent mask, dma mask is set to 64 bit
Signed-off-by: Pandith N <pandith.n@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211001140812.24977-4-pandith.n@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Added hardware handshake selection in channel config,
for mem2per and per2mem case.
The peripheral specific handshake interface needs to be
programmed in src_per, dst_per bits of CHx_CFG register.
Signed-off-by: Pandith N <pandith.n@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211001140812.24977-3-pandith.n@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Added support for DMA controller with more than 8 channels.
DMAC register map changes based on number of channels.
Enabling DMAC channel:
DMAC_CHENREG has to be used when number of channels <= 8
DMAC_CHENREG2 has to be used when number of channels > 8
Configuring DMA channel:
CHx_CFG has to be used when number of channels <= 8
CHx_CFG2 has to be used when number of channels > 8
Suspending and resuming channel:
DMAC_CHENREG has to be used when number of channels <= 8 DMAC_CHSUSPREG
has to be used for suspending a channel > 8
Signed-off-by: Pandith N <pandith.n@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211001140812.24977-2-pandith.n@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Theorically, address pointers used by STM32 DMA must be chosen so as to
ensure that all transfers within a burst block are aligned on the address
boundary equal to the size of the transfer.
If this is always the case for peripheral addresses on STM32, it is not for
memory addresses if the user doesn't respect this alignment constraint.
To avoid a weird behavior of the DMA controller in this case (no error
triggered but data are not transferred as expected), force no burst.
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211011094259.315023-4-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
buf_addr parameter of stm32_dma_set_xfer_param function is a dma_addr_t.
We only need to check the remainder of buf_addr/max_width, so, no need to
use do_div and extra u64 addr. Use '%' instead.
Fixes: e0ebdbdcb42a ("dmaengine: stm32-dma: take address into account when computing max width")
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211011094259.315023-3-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
To prevent accidental repeated completion, mark pending descriptor
complete in terminate_all. It can be the case when terminate_all is called
while no end of transfer interrupt occurs.
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211011094259.315023-2-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
'head' is unused, remove it.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/46e071be21fbc5ac5c35d4796a7e4249e94c3a77.1633847306.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Total amount of SG list entries executed in a single burst is limited by
the number of available DMA descriptors.
This information is useful for device drivers utilizing this DMA engine.
Signed-off-by: Artur Rojek <contact@artur-rojek.eu>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20210829195805.148964-1-contact@artur-rojek.eu
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Currently, DMA clocks are turned on by the bootloader.
This patch adds support for DMA clock handling so that
the driver manages the DMA clocks.
Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20210923102451.11403-1-biju.das.jz@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Remove max_descs_per_engine field. The recently released DSA spec 1.2 [1]
has removed this field and made it reserved.
[1]: https://software.intel.com/content/dam/develop/external/us/en/documents-tps/341204-intel-data-streaming-accelerator-spec.pdf
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163406167978.1303649.1798682437841822837.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
DSA spec 1.2 has moved the GENCFG register under the GENCAP configuration
support with respect to writability. Add check in driver before writing to
GENCFG register.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163406171896.1303830.11217958011385656998.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
memset() and memcpy() on an MMIO region like here results in a
lockup at startup on mpc5200 platform (since this first happens
during probing of the ATA and Ethernet drivers). Use memset_io()
and memcpy_toio() instead.
Fixes: 2f9ea1bde0d1 ("bestcomm: core bestcomm support for Freescale MPC5200")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Link: https://lore.kernel.org/r/20211014094012.21286-1-agust@denx.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use pm_ptr() macro to fill at_xdmac_driver.driver.pm.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20211007111230.2331837-5-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use __maybe_unused for pm functions.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20211007111230.2331837-4-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
AT_XDMAC_CC_PERID() should be used to setup bits 24..30 of XDMAC_CC
register. Using it without parenthesis around 0x7f & (i) will lead to
setting all the time zero for bits 24..30 of XDMAC_CC as the << operator
has higher precedence over bitwise &. Thus, add paranthesis around
0x7f & (i).
Fixes: 15a03850ab8f ("dmaengine: at_xdmac: fix macro typo")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20211007111230.2331837-3-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|