summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-09-05ice: Check for DCB capability before initializing DCBAnirudh Venkataramanan
Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: report link down for VF when PF's queues are not enabledLukasz Czapnik
This is port of a fix from i40e commit 2ad1274fa35a ("i40e: don't report link up for a VF who hasn't enabled queues") Older VF drivers do not respond well to receiving a link up notification before queues are enabled. This can cause their state machine to think that it is safe to send traffic. This results in a Tx hang on the VF. Record whether the PF has actually enabled queues for the VF. When reporting link status, always report link down if the queues aren't enabled. In this way, the VF driver will never receive a link up notification until after its queues are enabled. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: Reliably reset VFsMitch Williams
When a PFR (or bigger reset) occurs, the device clears the VF_MBX_ARQLEN register for all VFs. But if a VFR is triggered by a VF, the device does NOT clear this register, and the VF driver will never see the reset. When this happens, the VF driver will eventually timeout and attempt recovery, and usually it will be successful. But this makes resets take a long time and there are occasional failures. We cannot just blithely clear this register on every reset; this has been shown to cause synchronization problems when a PFR is triggered with a large number of VFs. Fix this by clearing VF_MBX_ARQLEN when the reset source is not PFR. GlobR will trigger PFR, so this test catches that occurrence as well. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: change work limit to a constantJesse Brandeburg
The driver has supported a transmit work limit that was configurable from ethtool for a long time, but there are no good use cases for having it be a variable that can be changed at run time. In addition, this variable was noted to be causing performance overhead due to cache misses. Just remove the variable and let the code use a constant so that the functionality is maintained (a limit on the number of transmits that will be cleaned in any one call to the clean routines) without the cache miss. Removes code, removes a variable, removes testing surface. Yay. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: small efficiency fixesJesse Brandeburg
Add a small bit of efficiency to the code by adding a prefetch of the port_info structure in order to help avoid a cache miss a little later on in execution. Also add an unlikely statement to a branch which generally will never happen in normal operation. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: move code closer togetherJesse Brandeburg
This is a simple patch to move the assignment to a local variable closer to the site where the local variable is used. This can help readability and also maybe performance, although the performance enhancement is really dependent upon the compiler. No functional change. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: clean up argumentsJesse Brandeburg
There are a couple of functions that don't need two arguments passed in when the second argument already had access to the pointer pointed to by the first. Remove the unnecessary arguments. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: Check root pointer for validityAnirudh Venkataramanan
ice_sched_get_tc_node uses pi->root without checking for NULL. Add a check to prevent NULL pointer dereference. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: Add ice_get_main_vsi to get PF/main VSIAnirudh Venkataramanan
There are multiple places where we currently use ice_find_vsi_by_type to get the PF (a.k.a. main) VSI. The PF VSI by definition is always the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead add and use a new helper function ice_get_main_vsi, which just returns pf->vsi[0]. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05ice: Update fields in ice_vsi_set_num_qs when reconfiguringBrett Creeley
Currently when vsi->req_txqs or vsi->req_rxqs are set we don't correctly set the number of vsi->num_q_vectors. Fix this by setting the number of queue vectors based on the max between the vsi->alloc_txqs and vsi->alloc_rxqs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05irqchip/gic-v3-its: Fix LPI release for Multi-MSI devicesMarc Zyngier
When allocating a range of LPIs for a Multi-MSI capable device, this allocation extended to the closest power of 2. But on the release path, the interrupts are released one by one. This results in not releasing the "extra" range, leaking the its_device. Trying to reprobe the device will then fail. Fix it by releasing the LPIs the same way we allocate them. Fixes: 8208d1708b88 ("irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size") Reported-by: Jiaxing Luo <luojiaxing@huawei.com> Tested-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/f5e948aa-e32f-3f74-ae30-31fee06c2a74@huawei.com
2019-09-05parisc: Drop comments which are already in pci.hHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05parisc: Convert eisa_enumerator to use pr_cont()Helge Deller
Clean up and beautify kernel output by using pr_cont() and printk formats like %pR for resources. This was noticed on a HP D350/2 machine. Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05bus: ti-sysc: Fix clock handling for no-idle quirksTony Lindgren
NFSroot can fail on dra7 when cpsw is probed using ti-sysc interconnect target module driver as reported by Keerthy. Device clocks and the interconnect target module may or may not be enabled by the bootloader on init, but we currently assume the clocks and module are on from the bootloader for "ti,no-idle" and "ti,no-idle-on-init" quirks as reported by Grygorii Strashko. Let's fix the issue by always enabling clocks init, and never disable them for "ti,no-idle" quirk. For "ti,no-idle-on-init" quirk, we must decrement the usage count later on to allow PM runtime to idle the module if requested. Fixes: 1a5cd7c23cc5 ("bus: ti-sysc: Enable all clocks directly during init to read revision") Cc: Keerthy <j-keerthy@ti.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Reported-by: Keerthy <j-keerthy@ti.com> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-09-05parisc: Avoid warning when loading hppb driverHelge Deller
ccio_request_resource() may fail to allocate regions for the hppb driver. Do not print a misleading warning in this case. This was noticed on a HP D350/2 machine. Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05s390/crypto: xts-aes-s390 fix extra run-time crypto self tests findingHarald Freudenberger
With 'extra run-time crypto self tests' enabled, the selftest for s390-xts fails with alg: skcipher: xts-aes-s390 encryption unexpectedly succeeded on test vector "random: len=0 klen=64"; expected_error=-22, cfg="random: inplace use_digest nosimd src_divs=[2.61%@+4006, 84.44%@+21, 1.55%@+13, 4.50%@+344, 4.26%@+21, 2.64%@+27]" This special case with nbytes=0 is not handled correctly and this fix now makes sure that -EINVAL is returned when there is en/decrypt called with 0 bytes to en/decrypt. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05vfio-ccw: fix error return code in vfio_ccw_sch_init()Wei Yongjun
Fix to return negative error code -ENOMEM from the memory alloc failed error handling case instead of 0, as done elsewhere in this function. Fixes: 60e05d1cf087 ("vfio-ccw: add some logging") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link https://lore.kernel.org/kvm/20190904083315.105600-1-weiyongjun1@huawei.com/ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05s390: vfio-ap: fix warning reset not completedHalil Pasic
The intention seems to be to warn once when we don't wait enough for the reset to complete. Let's use the right retry counter to accomplish that semantic. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/20190903133618.9122-1-pasic@linux.ibm.com Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05HID: prodikeys: Fix general protection fault during probeAlan Stern
The syzbot fuzzer provoked a general protection fault in the hid-prodikeys driver: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.3.0-rc5+ #28 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event RIP: 0010:pcmidi_submit_output_report drivers/hid/hid-prodikeys.c:300 [inline] RIP: 0010:pcmidi_set_operational drivers/hid/hid-prodikeys.c:558 [inline] RIP: 0010:pcmidi_snd_initialise drivers/hid/hid-prodikeys.c:686 [inline] RIP: 0010:pk_probe+0xb51/0xfd0 drivers/hid/hid-prodikeys.c:836 Code: 0f 85 50 04 00 00 48 8b 04 24 4c 89 7d 10 48 8b 58 08 e8 b2 53 e4 fc 48 8b 54 24 20 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 13 04 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b The problem is caused by the fact that pcmidi_get_output_report() will return an error if the HID device doesn't provide the right sort of output report, but pcmidi_set_operational() doesn't bother to check the return code and assumes the function call always succeeds. This patch adds the missing check and aborts the probe operation if necessary. Reported-and-tested-by: syzbot+1088533649dafa1c9004@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05HID: wacom: add new MobileStudio Pro 13 supportPing Cheng
wacom_wac_pad_event is the only routine we need to update. Signed-off-by: Ping Cheng <ping.cheng@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05drm/vmwgfx: Fix double free in vmw_recv_msg()Dan Carpenter
We recently added a kfree() after the end of the loop: if (retries == RETRIES) { kfree(reply); return -EINVAL; } There are two problems. First the test is wrong and because retries equals RETRIES if we succeed on the last iteration through the loop. Second if we fail on the last iteration through the loop then the kfree is a double free. When you're reading this code, please note the break statement at the end of the while loop. This patch changes the loop so that if it's not successful then "reply" is NULL and we can test for that afterward. Cc: <stable@vger.kernel.org> Fixes: 6b7c3b86f0b6 ("drm/vmwgfx: fix memory leak when too many retries have occurred") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-09-05HID: sony: Fix memory corruption issue on cleanup.Roderick Colenbrander
The sony driver is not properly cleaning up from potential failures in sony_input_configured. Currently it calls hid_hw_stop, while hid_connect is still running. This is not a good idea, instead hid_hw_stop should be moved to sony_probe. Similar changes were recently made to Logitech drivers, which were also doing improper cleanup. Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com> CC: stable@vger.kernel.org Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-09-05Merge branch 'bpf-af-xdp-barrier-fixes'Daniel Borkmann
Björn Töpel says: ==================== This is a four patch series of various barrier, {READ, WRITE}_ONCE cleanups in the AF_XDP socket code. More details can be found in the corresponding commit message. Previous revisions: v1 [4] and v2 [5]. For an AF_XDP socket, most control plane operations are done under the control mutex (struct xdp_sock, mutex), but there are some places where members of the struct is read outside the control mutex. The dev, queue_id members are set in bind() and cleared at cleanup. The umem, fq, cq, tx, rx, and state member are all assigned in various places, e.g. bind() and setsockopt(). When the members are assigned, they are protected by the control mutex, but since they are read outside the mutex, a WRITE_ONCE is required to avoid store-tearing on the read-side. Prior the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above does not need READ_ONCE statements if used in conjunction with state check. To summarize: The members struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. After this series umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE are only used when read outside the control mutex (e.g. mmap) or, not synchronized with the state member (XSK_BOUND plus smp_rmb()) [1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/ [2] https://lwn.net/Articles/793253/ [3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE [4] https://lore.kernel.org/bpf/20190822091306.20581-1-bjorn.topel@gmail.com/ [5] https://lore.kernel.org/bpf/20190826061053.15996-1-bjorn.topel@gmail.com/ v2->v3: Minor restructure of commits. Improve cover and commit messages. (Daniel) v1->v2: Removed redundant dev check. (Jonathan) ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05xsk: lock the control mutex in sock_diag interfaceBjörn Töpel
When accessing the members of an XDP socket, the control mutex should be held. This commit fixes that. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05xsk: use state member for socket synchronizationBjörn Töpel
Prior the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above does not need READ_ONCE if used in conjunction with state check. In all syscalls and the xsk_rcv path we check if state is XSK_BOUND. If that is the case we do a SMP read barrier, and this implies that the dev, umem and all rings are correctly setup. Note that no READ_ONCE are needed for these variable if used when state is XSK_BOUND (plus the read barrier). To summarize: The members struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE are only used when read outside the control mutex (e.g. mmap) or, not synchronized with the state member (XSK_BOUND plus smp_rmb()) Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due to the introduce state synchronization (XSK_BOUND plus smp_rmb()). Introducing the state check also fixes a race, found by syzcaller, in xsk_poll() where umem could be accessed when stale. Suggested-by: Hillf Danton <hdanton@sina.com> Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05xsk: avoid store-tearing when assigning umemBjörn Töpel
The umem member of struct xdp_sock is read outside of the control mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid potential store-tearing. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05xsk: avoid store-tearing when assigning queuesBjörn Töpel
Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid potential store-tearing. These members are read outside of the control mutex in the mmap implementation. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: 37b076933a8e ("xsk: add missing write- and data-dependency barrier") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05bpf: fix precision tracking of stack slotsAlexei Starovoitov
The problem can be seen in the following two tests: 0: (bf) r3 = r10 1: (55) if r3 != 0x7b goto pc+0 2: (7a) *(u64 *)(r3 -8) = 0 3: (79) r4 = *(u64 *)(r10 -8) .. 0: (85) call bpf_get_prandom_u32#7 1: (bf) r3 = r10 2: (55) if r3 != 0x7b goto pc+0 3: (7b) *(u64 *)(r3 -8) = r0 4: (79) r4 = *(u64 *)(r10 -8) When backtracking need to mark R4 it will mark slot fp-8. But ST or STX into fp-8 could belong to the same block of instructions. When backtracing is done the parent state may have fp-8 slot as "unallocated stack". Which will cause verifier to warn and incorrectly reject such programs. Writes into stack via non-R10 register are rare. llvm always generates canonical stack spill/fill. For such pathological case fall back to conservative precision tracking instead of rejecting. Reported-by: syzbot+c8d66267fd2b5955287e@syzkaller.appspotmail.com Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05selftests/bpf: precision tracking testsAlexei Starovoitov
Add two tests to check that stack slot marking during backtracking doesn't trigger 'spi > allocated_stack' warning. One test is using BPF_ST insn. Another is using BPF_STX. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05habanalabs: correctly cast variable to __le32Oded Gabbay
When using the macro le32_to_cpu(x), we need to correctly convert x to be __le32 in case it is defined as u32 variable. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-09-05habanalabs: show correct id in error printOded Gabbay
If the initialization of a device failed, the driver prints an error message with the id of the device. The device index on the file system is that id divided by 2. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: stop using the acronym KMDOded Gabbay
We want to stop using the acronym KMD. Therefore, replace all locations (except for register names we can't modify) where KMD is written to other terms such as "Linux kernel driver" or "Host kernel driver", etc. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: display card name as sensors headerOded Gabbay
To allow the user to use a custom file for the HWMON lm-sensors library per card type, the driver needs to register the HWMON sensors with the specific card type name. The card name is supplied by the F/W running on the device. If the F/W is old and doesn't supply a card name, a default card name is displayed as the sensors group name. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: add uapi to retrieve aggregate H/W eventsOded Gabbay
Add a new opcode to INFO IOCTL to retrieve aggregate H/W events. i.e. the events counters are NOT cleared upon device reset, but count from the loading of the driver. Add the code to support it in the device event handling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: add uapi to retrieve device utilizationOded Gabbay
Users and sysadmins usually want to know what is the device utilization as a level 0 indication if they are efficiently using the device. Add a new opcode to the INFO IOCTL that will return the device utilization over the last period of 100-1000ms. The return value is 0-100, representing as percentage the total utilization rate. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: Make the Coresight timestamp perpetualTomer Tayar
The Coresight timestamp is enabled for a specific debug session using the HL_DEBUG_OP_TIMESTAMP opcode of the debug IOCTL. In order to have a perpetual timestamp that would be comparable between various debug sessions, this patch moves the timestamp enablement to be part of the HW initialization. The HL_DEBUG_OP_TIMESTAMP opcode turns to be deprecated and shouldn't be used. Old user-space that will call it won't see any change in the behavior of the debug session. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: explicitly set the queue-id enumerated numbersDotan Barak
When looking at kernel log messages and when debugging user applications, we only see the queue id. This patch explicitly set the queue id in the queue enumeration which will be helpful for finding the queue name when we have its id. Signed-off-by: Dotan Barak <dbarak@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: print to kernel log when reset is finishedOded Gabbay
Now that we don't print the queue testing messages, we need to print when the reset is finished so whoever looks at the kernel log will know the reset process was finished successfully and the driver is not stuck. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: replace __le32_to_cpu with le32_to_cpuOded Gabbay
In some files the driver uses __le32_to_cpu while in other it uses le32_to_cpu. Replace all __le32_to_cpu instances with le32_to_cpu for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64Oded Gabbay
In some files the code use __cpu_to_le32/64 while in other it use cpu_to_le32/64. Replace all __cpu_to_le32/64 instances with cpu_to_le32/64 for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: Handle HW_IP_INFO if device disabled or in resetTomer Tayar
The HW IP information is relevant even if the device is disabled or in reset, so always handle the corresponding INFO IOCTL opcode. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: Expose devices after initialization is doneTomer Tayar
The char devices are currently exposed to user before the device and driver initialization are done. This patch moves the cdev and device adding to the system to the end of the initialization sequence, while keeping the creation of the structures at the beginning to allow the usage of dev_*(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: improve security in Debug IOCTLOmer Shpigelman
This patch improves the security in the Debug IOCTL. It adds checks that: - The register index value is in the allowed range for all opcodes. - The event types number is in the allowed range in SPMU enable. - The events number is in the allowed range in SPMU disable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: use default structure for user input in Debug IOCTLOmer Shpigelman
This patch fixes a possible kernel crash when a user provides a too small input structure to the Debug IOCTL. The fix sets a default input structure and copies to it the user data. In case the user provided as input a too small structure, the code will use the default values taken from the default structure. Note that in contrary to the input structure, the user can provide an output structure with changing size or no size at all. Therefore the user output structure validation is already done in the Debug logic later on. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: Add descriptive name to PSOC app status registerTomer Tayar
Add a meaningful name to the general PSOC application status register which better describes its usage in keeping the HW state. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: Add descriptive names to PSOC scratch-pad registersTomer Tayar
The PSOC scratch-pad registers are used for communication with the device CPU. This patch adds new definitions for these registers which are more descriptive than their general names. The new set of definitions also gathers and documents the current usage of the scratch-pad registers by the driver and the device CPU. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: create two char devices per ASICOded Gabbay
This patch changes the driver to create two char devices for each ASIC it discovers. This is done to allow system/monitoring applications to query the device for stats, information, idle state and more, while also allowing the deep-learning application to send work to the ASIC. One char device is the original device, hlX. IOCTL calls through this device file can perform any task on the device (compute, memory, queries). The open function for this device will fail if it was called before but the file-descriptor it created was not completely released yet (the release callback function is not called from the kernel until all instances of that FD are closed). The driver needs to keep this behavior to support backward compatibility with existing userspace, which count that the open will fail if the device is "occupied". The second char device is called "hl_controlDx", where x is the same index of the main device with a minor number of the original char device + 1. Applications that open this device can only call the INFO IOCTL. There is no limitation on the number of applications opening this device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: change device_setup_cdev() to be more genericOded Gabbay
This patch re-factors the device_setup_cdev() function to make it more generic. It doesn't manipulate members of the driver's internal device structure but instead works only on the arguments that are sent to it. This is in preparation for using this function to create an additional char device per ASIC. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: maintain a list of file private data objectsOded Gabbay
This patch adds a new list to the driver's device structure. The list will keep the file private data structures that the driver creates when a user process opens the device. This change is needed because it is useless to try to count how many FD are open. Instead, track our own private data structure per open file and once it is released, remove it from the list. As long as the list is not empty, it means we have a user that can do something with our device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: rename user_ctx as compute_ctxOded Gabbay
This patch renames the "user_ctx" field in the device structure to "compute_ctx". This better reflects the meaning of this context. In addition, we also check in the ctx_fini() that the debug mode should be disabled only if the context being destroyed is the compute context. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>