summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-11-30habanalabs: release signal if collective wait was droppedOfir Bitton
As in standard wait cs, we must release a signal fence once a collective wait cs was dropped and not submitted. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: Skip updating CI of internal queues if not in useTomer Tayar
There are no internal queues if H/W queues are being used. In this case we can skip the redundant traversal over the queues array, looking for internal queues. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: Small refactoring of cs_do_release()Tomer Tayar
Slightly refactor the cs_do_release() function, to reduce nesting level and to ease the handling of future CS types. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: Small refactoring of CS IOCTL handlingTomer Tayar
Refactor the CS IOCTL handling by gathering common code into sub-functions, in order to ease future additions of new CS types. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: fetch PLL info from FWOfir Bitton
Once FW security is enabled there is no access to PLL registers, need to read values from FW using a dedicated interface. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: refactor MMU to support dual residency MMUMoti Haimovski
This commit refactors the MMU code to support PCI MMU page tables residing on host and DCORE MMU residing on the device DRAM at the same time. This is needed for future devices as on GAUDI and GOYA we have a single MMU where its page tables always reside on DRAM. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: fix MMU print messageMoti Haimovski
This commit fixes an incorrect error message Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: scrub all memory upon closing FDfarah kassabri
In cases of multi-tenants, administrators may want to prevent data leakage between users running on the same device one after another. To do that the driver can scrub the internal memory (both SRAM and DRAM) after a user finish to use the memory. Because in GAUDI the driver allows only one application to use the device at a time, it can scrub the memory when user app close FD. In future devices where we have MMU on the DRAM, we can scrub the DRAM memory with a finer granularity (page granularity) when the user allocates the memory. This feature is not supported in Goya. To allow users that want to debug their applications, we add a kernel module parameter to load the driver with this feature disabled. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add support for FW securityOfir Bitton
Skip relevant HW configurations once FW security is enabled because these configurations are being performed by FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: fetch security indication from FWOfir Bitton
Add support for fetching security indication from FW. This indication is needed in order to skip unnecessary initializations done by FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: fix cs counters structurefarah kassabri
Fix cs counters structure in uapi to be one flat structure instead of two instances of the same other structure. use atomic read/increment for context counters so we could use one structure for both aggregated and context counters. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: advanced FW loadingOfir Bitton
Today driver is able to load a whole FW binary into a specific location on ASIC. We add support for loading sections from the same FW binary into different loactions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: initialize variable before useOded Gabbay
GCC 7.3.1 20180303 (Red Hat 7.3.1-5) complains that collective_engine_id might be used uninitialized. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: remove unreachable codeOfir Bitton
Remove unreachable code in gaudi collective flow. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: make sure cs type is valid in cs_ioctl_signal_waitOded Gabbay
Although we get a valid cs type from the callee, in case new values will be added in the future, it is best to check the expected values in that function. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: monitor device memory usageOded Gabbay
In GAUDI we don't have an MMU towards the HBM device memory. Therefore, the user access that memory directly through physical address (via the different engines) without the need to go through the driver to allocate/free memory on the HBM. For system monitoring purposes, the driver will keep track of the HBM usage. This can be done as long as the user accurately reports the allocations and releases of HBM memory, through the existing MEMORY IOCTL uapi. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: sync stream collective supportOfir Bitton
Implement sync stream collective for GAUDI. Need to allocate additional resources for that and add ctx_fini() to clean up those resources. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: Set DMA5 QMAN internalOfir Bitton
DMA5 QMAN is designated to be used for reduction process, hence it will be no longer configured as external queue. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: sync stream collective infrastructureOfir Bitton
Define new API for collective wait support and modify sync stream common flow. In addition add kernel CB allocation support for internal queues. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: use enum for CB allocation optionsTal Cohen
In the future there will be situations where queues can accept either kernel allocated CBs or user allocated CBs, depending on different states. Therefore, instead of using a boolean variable of kernel/user allocated CB, we need to use a bitmask to indicate that, which will allow to combine the two options. Add a flag to the uapi so the user will be able to indicate whether the CB was allocated by kernel or by user. Of course the driver validates that. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add support for NIC QMANsOded Gabbay
Initialize the QMANs that are responsible to submit doorbells to the NIC engines. Add support for stopping and disabling them, and reset them as part of the hard-reset procedure of GAUDI. This will allow the user to submit work to the NICs. Add support for receiving events on QMAN errors from the firmware. However, the nic_ports_mask is still initialized to 0. That means this code won't initialize the QMANs just yet. That will be in a later patch. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add NIC security configurationOded Gabbay
Configure the security properties of the NIC IP. This is to prevent the user process from doing something with the NIC that he shouldn't do. e.g. crash the server, steal data, etc. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add NIC firmware-related definitionsOded Gabbay
Add new structures and messages that the driver use to interact with the firmware to receive information and events (errors) about GAUDI's NIC. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add NIC QMAN H/W and registers definitionsOded Gabbay
Add auto-generated header files that describe the NIC QMANs registers used by the driver. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: remove duplicate checkOded Gabbay
We already check if queue index is smaller than max queues a few lines above this check so no need to check this again. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: sync stream refactor functionsOfir Bitton
Refactor sync stream implementation by reducing function length for better readability. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: add support for multiple SOBs per monitorOfir Bitton
Support advanced monitor functionality to monitor more than a single SOB. In addition expand all CB generation functions with buffer offset in order to put in them multiple packets that are generated by different functions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: sync stream structures refactorOfir Bitton
Refactor sync stream implementation by adding more structures for better readability. In addition reducing allocated resources. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: don't init vm module if no MMUOded Gabbay
In case we are running without MMU enabled (debug mode), no need to initialize the VM module in the driver. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: minimize prints when everything is fineOded Gabbay
No need to print when the driver starts to initialize the H/W. Drivers should be silent when everything is OK. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: support multiple types of firmwaresOded Gabbay
The driver now loads the firmware in two stages. For debugging purposes we need to support situations where only the first stage firmware is loaded. Therefore, use a bitmask to determine which F/W is loaded Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: we need CPU queues for hwmonOded Gabbay
F/W can be loaded but device CPU queues disabled. In that case, HWMON should be disabled. This is only relevant when debugging Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: move mmu_prepare to context initOfir Bitton
Currently mmu_prepare is located at context switch. Since we support a single context, no reason to reconfigure the MMU registers every context switch. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: change aggregate cs counters to atomicOded Gabbay
In case we will have multiple contexts/processes, we can't just increment aggregated counters. We need to make them atomic as they can be incremented by multiple processes Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30MAINTAINERS: update email, git repo of habanalabs driverOded Gabbay
Update the email to my kernel.org email address and update the git repository address to the git.kernel.org Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-30habanalabs: put devices before driver removalOfir Bitton
Driver never puts its device and control_device objects, hence a memory leak is introduced every driver removal. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: free host huge va_range if not usedOfir Bitton
If huge range is not valid, driver uses the host range also for huge page allocations, but driver never frees its allocation. This introduces a memory leak every time a user closes its context. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30speakup: Reject setting the speakup line discipline outside of speakupSamuel Thibault
Speakup exposing a line discipline allows userland to try to use it, while it is deemed to be useless, and thus uselessly exposes potential bugs. One of them is simply that in such a case if the line sends data, spk_ttyio_receive_buf2 is called and crashes since spk_ttyio_synth is NULL. This change restricts the use of the speakup line discipline to speakup drivers, thus avoiding such kind of issues altogether. Cc: stable@vger.kernel.org Reported-by: Shisong Qin <qinshisong1205@gmail.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Shisong Qin <qinshisong1205@gmail.com> Link: https://lore.kernel.org/r/20201129193523.hm3f6n5xrn6fiyyc@function Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-30Merge tag 'usb-fixes-v5.10-rc6' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus Peter writes: - Fixed hardware role switch issue at TI platform - Fixed scatter-list buffer handling - Fixed error goto label issue * tag 'usb-fixes-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb: usb: cdns3: core: fix goto label for error path usb: cdns3: gadget: clear trb->length as zero after preparing every trb usb: cdns3: Fix hardware based role switch
2020-11-30Merge 5.10-rc6 into char-misc-nextGreg Kroah-Hartman
We need the fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-30usb: cdns3: core: fix goto label for error pathPeter Chen
The usb_role_switch_register has been already called, so if the devm_request_irq has failed, it needs to call usb_role_switch_unregister. Fixes: b1234e3b3b26 ("usb: cdns3: add runtime PM support") Signed-off-by: Peter Chen <peter.chen@nxp.com>
2020-11-30usb: cdns3: gadget: clear trb->length as zero after preparing every trbPeter Chen
It clears trb->length as zero before preparing td, but if scatter buffer is used for td, there are several trbs within td, it needs to clear every trb->length as zero, otherwise, the default value for trb->length may not be zero after it begins to use the second round of trb rings. Fixes: abc6b579048e ("usb: cdns3: gadget: using correct sg operations") Signed-off-by: Peter Chen <peter.chen@nxp.com>
2020-11-30usb: cdns3: Fix hardware based role switchRoger Quadros
Hardware based role switch is broken as the driver always skips it. Fix this by registering for SW role switch only if 'usb-role-switch' property is present in the device tree. Fixes: 50642709f659 ("usb: cdns3: core: quit if it uses role switch class") Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Peter Chen <peter.chen@nxp.com>
2020-11-29Linux 5.10-rc6v5.10-rc6Linus Torvalds
2020-11-29drm/panel: sony-acx565akm: Fix race condition in probeSebastian Reichel
The probe routine acquires the reset GPIO using GPIOD_OUT_LOW. Directly afterwards it calls acx565akm_detect(), which sets the GPIO value to HIGH. If the bootloader initialized the GPIO to HIGH before the probe routine was called, there is only a very short time period of a few instructions where the reset signal is LOW. Exact time depends on compiler optimizations, kernel configuration and alignment of the stars, but I expect it to be always way less than 10us. There are no public datasheets for the panel, but acx565akm_power_on() has a comment with timings and reset period should be at least 10us. So this potentially brings the panel into a half-reset state. The result is, that panel may not work after boot and can get into a working state by re-enabling it (e.g. by blanking + unblanking), since that does a clean reset cycle. This bug has recently been hit by Ivaylo Dimitrov, but there are some older reports which are probably the same bug. At least Tony Lindgren, Peter Ujfalusi and Jarkko Nikula have experienced it in 2017 describing the blank/unblank procedure as possible workaround. Note, that the bug really goes back in time. It has originally been introduced in the predecessor of the omapfb driver in commit 3c45d05be382 ("OMAPDSS: acx565akm panel: handle gpios in panel driver") in 2012. That driver eventually got replaced by a newer one, which had the bug from the beginning in commit 84192742d9c2 ("OMAPDSS: Add Sony ACX565AKM panel driver") and still exists in fbdev world. That driver has later been copied to omapdrm and then was used as a basis for this driver. Last but not least the omapdrm specific driver has been removed in commit 45f16c82db7e ("drm/omap: displays: Remove unused panel drivers"). Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com> Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reported-by: Tony Lindgren <tony@atomide.com> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Fixes: 1c8fc3f0c5d2 ("drm/panel: Add driver for the Sony ACX565AKM panel") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Jarkko Nikula <jarkko.nikula@bitmer.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20201127200429.129868-1-sebastian.reichel@collabora.com
2020-11-29Merge tag 'locking-urgent-2020-11-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two more places which invoke tracing from RCU disabled regions in the idle path. Similar to the entry path the low level idle functions have to be non-instrumentable" * tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: intel_idle: Fix intel_idle() vs tracing sched/idle: Fix arch_cpu_idle() vs tracing
2020-11-29Merge tag 'irq-urgent-2020-11-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two fixes for irqchip drivers: - Save and restore the GICV3 ITS state unconditionally on suspend/resume to handle firmware which fails to do so. - Use the correct index into the fwspec parameters to read the irq trigger type in the EXIU chip driver" * tag 'irq-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Unconditionally save/restore the ITS state on suspend irqchip/exiu: Fix the index of fwspec for IRQ type
2020-11-29Merge tag 'efi-urgent-for-v5.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Borislav Petkov: "More EFI fixes forwarded from Ard Biesheuvel: - revert efivarfs kmemleak fix again - it was a false positive - make CONFIG_EFI_EARLYCON depend on CONFIG_EFI explicitly so it does not pull in other dependencies unnecessarily if CONFIG_EFI is not set - defer attempts to load SSDT overrides from EFI vars until after the efivar layer is up" * tag 'efi-urgent-for-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: EFI_EARLYCON should depend on EFI efivarfs: revert "fix memory leak in efivarfs_create()" efi/efivars: Set generic ops before loading SSDT
2020-11-29Merge tag 'x86_urgent_for_v5.10-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "A couple of urgent fixes which accumulated this last week: - Two resctrl fixes to prevent refcount leaks when manipulating the resctrl fs (Xiaochen Shen) - Correct prctl(PR_GET_SPECULATION_CTRL) reporting (Anand K Mistry) - A fix to not lose already seen MCE severity which determines whether the machine can recover (Gabriele Paoloni)" * tag 'x86_urgent_for_v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Do not overwrite no_way_out if mce_end() fails x86/speculation: Fix prctl() when spectre_v2_user={seccomp,prctl},ibpb x86/resctrl: Add necessary kernfs_put() calls to prevent refcount leak x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak
2020-11-29drm/rockchip: Avoid uninitialized use of endpoint id in LVDSPaul Kocialkowski
In the Rockchip DRM LVDS component driver, the endpoint id provided to drm_of_find_panel_or_bridge is grabbed from the endpoint's reg property. However, the property may be missing in the case of a single endpoint. Initialize the endpoint_id variable to 0 to avoid using an uninitialized variable in that case. Fixes: 34cc0aa25456 ("drm/rockchip: Add support for Rockchip Soc LVDS") Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20201110200430.1713467-1-paul.kocialkowski@bootlin.com