linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-03-25	Merge tag 'wireless-drivers-2020-03-25' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.6 Fourth, and last, set of fixes for v5.6. Just two important fixes to iwlwifi regressions. iwlwifi * fix GEO_TX_POWER_LIMIT command on certain devices which caused firmware to crash during initialisation * add back device ids for three devices which were accidentally removed ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26	nvme: cleanup namespace identifier reporting in nvme_init_ns_head	Christoph Hellwig
	Lift the common namespace identifier reporting between the shared namespace and new nshead cases into common code. This also means one less lock is held while doing I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: rename __nvme_find_ns_head to nvme_find_ns_head	Christoph Hellwig
	There is no non __-prefixed version, so make the name a little more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: refactor nvme_identify_ns_descs error handling	Christoph Hellwig
	Move the handling of an error into the function from the caller, and only do it for an actual error on the admin command itself, not the command parsing, as that should be enough to deal with devices claiming a bogus version compliance. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl	Israel Rukshin
	The transition to LIVE state should not fail in case of a new controller. Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the resources may leads to NULL dereference at teardown flow (e.g., IO tagset, admin_q, connect_q). Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl	Israel Rukshin
	The transition to LIVE state should not fail in case of a new controller. Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the resources may leads to NULL dereference at teardown flow (e.g., IO tagset, admin_q, connect_q). Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Fix controller creation races with teardown flow	Israel Rukshin
	Calling nvme_sysfs_delete() when the controller is in the middle of creation may cause several bugs. If the controller is in NEW state we remove delete_controller file and don't delete the controller. The user will not be able to use nvme disconnect command on that controller again, although the controller may be active. Other bugs may happen if the controller is in the middle of create_ctrl callback and nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at nvme_do_delete_ctrl() before it was allocated at create_ctrl callback. To fix all those races don't allow the user to delete the controller before it was fully created. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl	Israel Rukshin
	Put the ctrl reference count at nvme_uninit_ctrl as opposed to nvme_init_ctrl which takes it. This decrease the reference count at the core layer instead of decreasing it on each transport separately. Also move the call of nvme_uninit_ctrl at PCI driver after calling to nvme_release_prp_pools and nvme_dev_unmap, in order to put the reference count after using the dev. This is safe because those functions use nvme_dev which is freed only later at nvme_pci_free_ctrl. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Fix ctrl use-after-free during sysfs deletion	Israel Rukshin
	In case nvme_sysfs_delete() is called by the user before taking the ctrl reference count, the ctrl may be freed during the creation and cause the bug. Take the reference as soon as the controller is externally visible, which is done by cdev_device_add() in nvme_init_ctrl(). Also take the reference count at the core layer instead of taking it on each transport separately. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-pci: Re-order nvme_pci_free_ctrl	Israel Rukshin
	Destroy the resources in the same order like in nvme_probe error flow to improve code readability. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Remove unused return code from nvme_delete_ctrl_sync	Israel Rukshin
	The return code of nvme_delete_ctrl_sync is never used, so change it to void. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Use nvme_state_terminal helper	Israel Rukshin
	Improve code readability. Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: release ida resources	Max Gurtovoy
	ida instances allocate some internal memory in addition to the base 'struct ida'. Use ida_destroy() to release that memory at module_exit(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO	masahiro31.yamada@kioxia.com
	Currently 32 bit application gets ENOTTY when it calls compat_ioctl with NVME_IOCTL_SUBMIT_IO in 64 bit kernel. The cause is that the results of sizeof(struct nvme_user_io), which is used to define NVME_IOCTL_SUBMIT_IO, are not same between 32 bit compiler and 64 bit compiler. * 32 bit: the result of sizeof nvme_user_io is 44. * 64 bit: the result of sizeof nvme_user_io is 48. 64 bit compiler seems to add 32 bit padding for multiple of 8 bytes. This patch adds a compat_ioctl handler. The handler replaces NVME_IOCTL_SUBMIT_IO32 with NVME_IOCTL_SUBMIT_IO in case 32 bit application calls compat_ioctl for submit in 64 bit kernel. Then, it calls nvme_ioctl as usual. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada (KIOXIA) <masahiro31.yamada@kioxia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvmet-tcp: optimize tcp stack TX when data digest is used	Sagi Grimberg
	If we have a 4-byte data digest to send to the wire, but we have more data to send, set MSG_MORE to tell the stack that more is coming. Reviewed-by: Mark Wunderlich <mark.wunderlich@intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-multipath: do not reset on unknown status	John Meneghini
	The nvme multipath error handling defaults to controller reset if the error is unknown. There are, however, no existing nvme status codes that indicate a reset should be used, and resetting causes unnecessary disruption to the rest of IO. Change nvme's error handling to first check if failover should happen. If not, let the normal error handling take over rather than reset the controller. Based-on-a-patch-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Meneghini <johnm@netapp.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvmet-rdma: allocate RW ctxs according to mdts	Max Gurtovoy
	Current nvmet-rdma code allocates MR pool budget based on queue size, assuming both host and target use the same "max_pages_per_mr" count. After limiting the mdts value for RDMA controllers, we know the factor of maximum MR's per IO operation. Thus, make sure MR pool will be sufficient for the required IO depth and IO size. That is, say host's SQ size is 100, then the MR pool budget allocated currently at target will also be 100 MRs. But 100 IO WRITE Requests with 256 sg_count(IO size above 1MB) require 200 MRs when target's "max_pages_per_mr" is 128. Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
2020-03-26	nvmet-rdma: Implement get_mdts controller op	Max Gurtovoy
	Set the maximal data transfer size to be 1MB (currently mdts is unlimited). This will allow calculating the amount of MR's that one ctrl should allocate to fulfill it's capabilities. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
2020-03-26	nvmet: Add get_mdts op for controllers	Max Gurtovoy
	Some transports, such as RDMA, would like to set the Maximum Data Transfer Size (MDTS) according to device/port/ctrl characteristics. This will enable the transport to set the optimal MDTS according to controller needs and device capabilities. Add a new nvmet transport op that is called during ctrl identification. This will not effect transports that don't implement this option. The return value of the new op is according to the NVMe spec definition for MDTS. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
2020-03-26	nvme-pci: properly print controller address	Max Gurtovoy
	Align PCI address print with fabrics address that is printed with newline character. Before: [root@server40 linux]# cat /sys/class/nvme/nvme2/address 0000:0b:00.0[root@server40 linux]# After: [root@server40 linux]# cat /sys/class/nvme/nvme2/address 0000:0b:00.0 [root@server40 linux]# Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
2020-03-26	nvme-tcp: break from io_work loop if recv failed	Sagi Grimberg
	If we failed to receive data from the socket, don't try to further process it, we will for sure be handling a queue error at this point. While no issue was seen with the current behavior thus far, its safer to cease socket processing if we detected an error. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-tcp: move send failure to nvme_tcp_try_send	Sagi Grimberg
	Consolidate the request failure handling code to where it is being fetched (nvme_tcp_try_send). Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvmet-tcp: fix maxh2cdata icresp parameter	Sagi Grimberg
	MAXH2CDATA is not zero based. Also no reason to limit ourselves to 1M transfers as we can do more easily. Make this an arbitrary limit of 16M. Reported-by: Wenhua Liu <liuw@vmware.com> Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-tcp: optimize queue io_cpu assignment for multiple queue maps	Sagi Grimberg
	Currently, queue io_cpu assignment is done sequentially for default, read and poll queues based on queue id. This causes miss-alignment between context of CPU initiating I/O and the I/O worker thread processing queued requests or completions. Change to modify queue io_cpu assignment to take into account queue maps offset. Each queue io_cpu will start at zero for each queue map. This essentially aligns read/poll queues to start over the same range as default queues. Testing performed by Mark with: - ram device (nvmet) - single CPU core (pinned) - 100% 4k reads - engine io_uring (not using sq_thread option) - hipri flag set Micro-benchmark results show a net gain of: - increase of 18%-29% in IOPs - reduction of 16%-22% in average latency - reduction of 7%-23% in 99.99% latency Baseline: ======== QDepth/Batch \| IOPs [k] \| Avg. Lat [us] \| 99.99% Lat [us] ----------------------------------------------------------------- 1/1 \| 32.4 \| 30.11 \| 50.94 32/8 \| 179 \| 168.20 \| 371 CPU alignment: ============= QDepth/Batch \| IOPs [k] \| Avg. Lat [us] \| 99.99% Lat [us] ----------------------------------------------------------------- 1/1 \| 38.5 \| 25.18 \| 39.16 32/8 \| 231 \| 130.75 \| 343 Reported-by: Mark Wunderlich <mark.wunderlich@intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-pci: Simplify nvme_poll_irqdisable	Keith Busch
	The timeout handler can use the existing nvme_poll() if it needs to check a polled queue, allowing nvme_poll_irqdisable() to handle only irq driven queues for the remaining callers. Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-pci: Remove two-pass completions	Keith Busch
	Completion handling had been done in two steps: find all new completions under a lock, then handle those completions outside the lock. This was done to make the locked section as short as possible so that other threads using the same lock wait less time. The driver no longer shares locks during completion, and is in fact lockless for interrupt driven queues, so the optimization no longer serves its original purpose. Replace the two-pass completion queue handler with a single pass that completes entries immediately. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-pci: Remove tag from process cq	Keith Busch
	The only user for tagged completion was for timeout handling. That user, though, really only cares if the timed out command is completed, which we can safely check within the timeout handler. Remove the tag check to simplify completion handling. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme-pci: slimmer CQ head update	Alexey Dobriyan
	Update CQ head with pre-increment operator. This saves subtraction of 1 and a few registers. Also update phase with "^= 1". This generates only one RMW instruction. ffffffff815ba150 <nvme_update_cq_head>: ffffffff815ba150: 0f b7 47 70 movzx eax,WORD PTR [rdi+0x70] ffffffff815ba154: 83 c0 01 add eax,0x1 ffffffff815ba157: 66 89 47 70 mov WORD PTR [rdi+0x70],ax ffffffff815ba15b: 66 3b 47 68 cmp ax,WORD PTR [rdi+0x68] ffffffff815ba15f: 74 01 je ffffffff815ba162 <nvme_update_cq_head+0x12> ffffffff815ba161: c3 ret ffffffff815ba162: 31 c0 xor eax,eax ffffffff815ba164: 80 77 74 01 ===> xor BYTE PTR [rdi+0x74],0x1 ffffffff815ba168: 66 89 47 70 mov WORD PTR [rdi+0x70],ax ffffffff815ba16c: c3 ret add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-119 (-119) Function old new delta nvme_poll 690 678 -12 nvme_dev_disable 1230 1177 -53 nvme_irq 613 559 -54 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2020-03-26	nvmet: check ncqr & nsqr for set-features cmd	Amit Engel
	For set feature command when setting up NVME_FEAT_NUM_QUEUES, check Number of I/O Completion Queues Requested (NCQR) and Number of I/O Submission Queues Requested (NSQR) before we proceed, for invalid values (i.e. 65535) return an appropriate NVMe invalid field status. Signed-off-by: Amit Engel <Amit.Engel@dell.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Check for readiness more quickly, to speed up boot time	Josh Triplett
	After initialization, nvme_wait_ready checks for readiness every 100ms, even though the drive may be ready far sooner than that. This delays system boot by hundreds of milliseconds. Reduce the delay, checking for readiness every millisecond instead. Boot-time tests on an AWS c5.12xlarge: Before: [ 0.546936] initcall nvme_init+0x0/0x5b returned 0 after 37 usecs ... [ 0.764178] nvme nvme0: 2/0/0 default/read/poll queues [ 0.768424] nvme0n1: p1 [ 0.774132] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null) [ 0.774146] VFS: Mounted root (ext4 filesystem) on device 259:1. ... [ 0.788141] Run /sbin/init as init process After: [ 0.537088] initcall nvme_init+0x0/0x5b returned 0 after 37 usecs ... [ 0.543457] nvme nvme0: 2/0/0 default/read/poll queues [ 0.548473] nvme0n1: p1 [ 0.554339] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null) [ 0.554344] VFS: Mounted root (ext4 filesystem) on device 259:1. ... [ 0.567931] Run /sbin/init as init process Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: log additional message for controller status	Rupesh Girase
	Log the controller status to know more about issue if it lies within kernel nvme subsytem or controller is unhealthy. Signed-off-by: Rupesh Girase <rgirase@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulakrni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: code cleanup nvme_identify_ns_desc()	Chaitanya Kulkarni
	The function nvme_identify_ns_desc() has 3 levels of nesting which make error message to exceeded > 80 char per line which is not aligned with the kernel code standards and rest of the NVMe subsystem code. Add a helper function to move the processing of the log when the command is successful by reducing the nesting and keeping the code < 80 char per line. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: Don't deter users from enabling hwmon support	Jean Delvare
	I see no good reason for the "If unsure, say N" advice in the description of the NVME_HWMON configuration option. It is not dangerous, it does not select any other option, and has a fairly low overhead. As the option is already not enabled by default, further suggesting hesitant users to not enable it is not useful anyway. Unlike some other options where the description alone may not be sufficient for users to make a decision, NVME_HWMON is pretty simple to grasp in my opinion, so just let the user do what they want. Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: expose hostid via sysfs for fabrics controllers	Sagi Grimberg
	We allow userspace to connect with a custom hostid which is useful for certain use-cases. However there is is no way to tell what is the hostid used to connect to a given controller. Expose this so userspace can correlate controllers based on hostid. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-26	nvme: expose hostnqn via sysfs for fabrics controllers	Sagi Grimberg
	We allow userspace to connect with a custom hostnqn which is useful for certain use-cases. However there is no way to tell what is the hostnqn used to connect to a given controller. Expose this so userspace can correlate controllers based on hostnqn. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-25	net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build	Pablo Neira Ayuso
	net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’: net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’ pkt->skb->tc_redirected = 1; ^~ net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’ pkt->skb->tc_from_ingress = 1; ^~ To avoid a direct dependency with tc actions from netfilter, wrap the redirect bits around CONFIG_NET_REDIRECT and move helpers to include/linux/skbuff.h. Turn on this toggle from the ifb driver, the only existing client of these bits in the tree. This patch adds skb_set_redirected() that sets on the redirected bit on the skbuff, it specifies if the packet was redirect from ingress and resets the timestamp (timestamp reset was originally missing in the netfilter bugfix). Fixes: bcfabee1afd99484 ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress") Reported-by: noreply@ellerman.id.au Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	cxgb4: Add support to catch bits set in INT_CAUSE5	Rahul Kundu
	This commit adds support to catch any bits set in SGE_INT_CAUSE5 for Parity Errors. F_ERR_T_RXCRC flag is used to ignore that particular bit as it is not considered as fatal. So, we clear out the bit before looking for error. This patch now read and report separately all three registers(Cause1, Cause2, Cause5). Also, checks for errors if any. Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Rahul Kundu <rahul.kundu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	octeontx2-pf: Fix ndo_set_rx_mode	Sunil Goutham
	Since set_rx_mode takes a mutex lock for sending mailbox message to admin function to set the mode, moved logic to a workqueue. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	octeontx2-pf: Fix rx buffer page refcount	Sunil Goutham
	Fixed an issue wherein while refilling receive buffers for the last page allocated, recount is not being updated. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: modernize two list helpers	Julian Wiedmann
	Replace list_for_each() with list_for_each_entry(), and list_entry(head.next) with list_first_entry(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: keep track of fixed prio-queue configuration	Julian Wiedmann
	When a device is configured in prio-queue mode to pin all traffic onto a specific HW queue, treat this as a distinct variant of prio-queueing instead of QETH_NO_PRIO_QUEUEING. This corrects an error message from qeth_osa_set_output_queues() for devices configured in such a mode. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: fine-tune MAC Address-related errnos	Julian Wiedmann
	Return the correct errnos when .ndo_set_mac_address fails to set a new MAC address. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: add TX IRQ coalescing support for IQD devices	Julian Wiedmann
	Since IQD devices complete (most of) their transmissions synchronously, they don't offer TX completion IRQs and have no HW coalescing controls. But we can fake the easy parts in SW, and give the user some control wrt to how often the TX NAPI code should be triggered to process the TX completions. Having per-queue controls can in particular help the dedicated mcast queue, as it likely benefits from different fine-tuning than what the ucast queues need. CC: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: collect more TX statistics	Julian Wiedmann
	Count the number of TX doorbells we issue to the qdio layer. Also count the number of actual frames in a TX buffer, and then use this data along with the byte count during TX completion. We'll make additional use of the frame count in a subsequent patch. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: clean up the mac_bits	Julian Wiedmann
	We're down to a single bit flag for MAC-address related status, reflect that in the info struct. Also set up the flag during initialization instead of clearing it during shutdown - one more little step towards unifying the shutdown code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: simplify L3 dev_id logic	Julian Wiedmann
	The logic that deals with errors from qeth_l3_get_unique_id() is quite complex: it sets card->unique_id to 0xfffe, additionally flags it as UNIQUE_ID_NOT_BY_CARD and later takes this flag as cue to not propagate card->unique_id to dev->dev_id. With dev->dev_id thus holding 0, addrconf_ifid_eui48() applies its default behaviour. Get rid of all the special bit masks, and just return the old uid in case of an error. For the vast majority of cases this will be 0 (and so we still get the desired default behaviour) - with the rare exception where qeth_l3_get_unique_id() might have been called earlier but the initialization then failed at a later point. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qdio: extend polling support to multiple queues	Julian Wiedmann
	When the support for polling drivers was initially added, it only considered Input Queue 0. But as QDIO interrupts are actually for the full device and not a single queue, this doesn't really fit for configurations where multiple Input Queues are used. Rework the qdio code so that interrupts for a polling driver are not split up into actions for each queue. Instead deliver the interrupt as a single event, and let the driver decide which queue needs what action. When re-enabling the QDIO interrupt via qdio_start_irq(), this means that the qdio code needs to (1) put _all_ eligible queues back into a state where they raise IRQs, (2) and afterwards check _all_ eligible queues for new work to bridge the race window. On the qeth side of things (as the only qdio polling driver), we can now add CQ polling support to the main NAPI poll routine. It doesn't consume NAPI budget, and to avoid hogging the CPU we yield control after completing one full queue worth of buffers. The subsequent qdio_start_irq() will check for any additional work, and have us re-schedule the NAPI instance accordingly. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: remove redundant if-clause in RX poll code	Julian Wiedmann
	Whenever all completed RX buffers have been processed (ie. rx->b_count == 0), we call down to the HW layer to scan for additional buffers. If no further buffers are available, the code breaks out of the while-loop. So we never reach the 'process an RX buffer' step with rx->b_count == 0, eliminate that check and one level of indentation. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25	s390/qeth: split out RX poll code	Julian Wiedmann
	The main NAPI poll routine should eventually handle more types of work, beyond just the RX ring. Split off the RX poll logic into a separate function, and simplify the nested while-loop. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>