git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message	Author
2019-09-12	staging: dt-bindings: wilc1000: add optional rtc_clk property	Eugen Hristev
	Add bindings for optional rtc clock pin. Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com> Acked-by: Ajay Singh <ajay.kathat@microchip.com> Link: https://lore.kernel.org/r/1568037993-4646-1-git-send-email-eugen.hristev@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	staging: nvec: make use of devm_platform_ioremap_resource	Hariprasad Kelam
	fix below issue reported by coccicheck drivers/staging//nvec/nvec.c:794:1-5: WARNING: Use devm_platform_ioremap_resource for base Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Acked-by: Marc Dietrich <marvin24@gmx.de> Link: https://lore.kernel.org/r/1567935662-8006-1-git-send-email-hariprasad.kelam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	staging: exfat: drop unused function parameter	Valentin Vidic
	sbi parameter not used inside the function so remove it. Also cleanup unused variables generated by this change. Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Link: https://lore.kernel.org/r/20190908173539.26963-1-vvidic@valentin-vidic.from.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	gpiolib: of: add a fallback for wlf,reset GPIO name	Dmitry Torokhov
	The old Arizona binding did not use -gpio or -gpios suffix, so devm_gpiod_get() does not work for it. As it is the one of a few users of devm_gpiod_get_from_of_node() API that I want to remove, I'd rather have a small quirk in the gpiolib OF handler, and switch Arizona driver to devm_gpiod_get(). Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/20190911075215.78047-2-dmitry.torokhov@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-09-12	Staging: exfat: Avoid use of strcpy	Sandro Volery
	Use strscpy instead of strcpy in exfat_core.c, and add a check for length that will return already known FFS_INVALIDPATH. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Sandro Volery <sandro@volery.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20190912082559.GA5043@volery Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
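The bounded-copy pattern this commit describes can be sketched in userspace C. This is a minimal illustration, not the exfat code: `bounded_copy()`, `copy_path()` and `ERR_INVALID_PATH` are hypothetical stand-ins for the kernel's strscpy() and for FFS_INVALIDPATH.

```c
#include <stddef.h>
#include <string.h>

#define ERR_INVALID_PATH (-1)  /* hypothetical stand-in for FFS_INVALIDPATH */

/*
 * Userspace approximation of a strscpy()-style copy: dst is always
 * NUL-terminated when size > 0. Returns the number of bytes copied,
 * or -1 if src did not fit and was truncated.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -1;

    len = strlen(src);
    if (len >= size) {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);
    return (long)len;
}

/* Copy a path into a fixed buffer, rejecting paths that would be truncated. */
static int copy_path(char *buf, size_t buf_size, const char *path)
{
    if (bounded_copy(buf, path, buf_size) < 0)
        return ERR_INVALID_PATH;
    return 0;
}
```

Unlike strcpy(), the copy can never write past the end of the destination, and the caller gets an explicit signal when the source was too long.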
2019-09-12	staging: exfat: use integer constants	Valentin Vidic
	Replace manually generated values with predefined constants. Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Link: https://lore.kernel.org/r/20190908152616.25459-3-vvidic@valentin-vidic.from.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	staging: exfat: cleanup spacing for casts	Valentin Vidic
	Fix checkpatch.pl warnings: CHECK: No space is necessary after a cast Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Link: https://lore.kernel.org/r/20190908152616.25459-2-vvidic@valentin-vidic.from.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	staging: exfat: cleanup spacing for operators	Valentin Vidic
	Fixes checkpatch.pl warnings: CHECK: spaces preferred around that '-' (ctx:VxV) CHECK: spaces preferred around that '+' (ctx:VxV) CHECK: spaces preferred around that '*' (ctx:VxV) CHECK: spaces preferred around that '|' (ctx:VxV) Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Link: https://lore.kernel.org/r/20190908152616.25459-1-vvidic@valentin-vidic.from.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-12	gpio: htc-egpio: Remove unused exported htc_egpio_get_wakeup_irq()	Geert Uytterhoeven
	This function was never used upstream, and is a relic of the original handhelds.org code the htc-egpio driver was based on. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20190910141529.21030-1-geert+renesas@glider.be Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-09-12	Merge tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio	Linus Torvalds
	Pull GPIO fixes from Linus Walleij: "I don't really like to send so many fixes at the very last minute, but the bug-sport activity is unpredictable. Four fixes, three are -stable material that will go everywhere, one is for the current cycle: - An ACPI DSDT error fixup of the type we always see and Hans invariably gets to fix. - A OF quirk fix for the current release (v5.3) - Some consistency checks on the userspace ABI. - A memory leak" * tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist gpiolib: of: fix fallback quirks handling gpio: fix line flag validation in lineevent_create gpio: fix line flag validation in linehandle_create gpio: mockup: add missing single_release()
2019-09-12	pinctrl: aspeed: Fix spurious mux failures on the AST2500	Andrew Jeffery
	Commit 674fa8daa8c9 ("pinctrl: aspeed-g5: Delay acquisition of regmaps") was determined to be a partial fix to the problem of acquiring the LPC Host Controller and GFX regmaps: The AST2500 pin controller may need to fetch syscon regmaps during expression evaluation as well as when setting mux state. For example, this case is hit by attempting to export pins exposing the LPC Host Controller as GPIOs. An optional eval() hook is added to the Aspeed pinmux operation struct and called from aspeed_sig_expr_eval() if the pointer is set by the SoC-specific driver. This enables the AST2500 to perform the custom action of acquiring its regmap dependencies as required. John Wang tested the fix on an Inspur FP5280G2 machine (AST2500-based) where the issue was found, and I've booted the fix on Witherspoon (AST2500) and Palmetto (AST2400) machines, and poked at relevant pins under QEMU by forcing mux configurations via devmem before exporting GPIOs to exercise the driver. Fixes: 7d29ed88acbb ("pinctrl: aspeed: Read and write bits in LPC and GFX controllers") Fixes: 674fa8daa8c9 ("pinctrl: aspeed-g5: Delay acquisition of regmaps") Reported-by: John Wang <wangzqbj@inspur.com> Tested-by: John Wang <wangzqbj@inspur.com> Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/20190829071738.2523-1-andrew@aj.id.au Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-09-12	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue	David S. Miller
	Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-09-11 This series contains fixes to ixgbe. Alex fixes up the adaptive ITR scheme for ixgbe which could result in a value that was either 0 or something less than 10 which was causing issues with hardware features, like RSC, that do not function well with ITR values that low. Ilya Maximets fixes the ixgbe driver to limit the number of transmit descriptors to clean by the number of transmit descriptors used in the transmit ring, so that the driver does not try to "double" clean the same descriptors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-12	gpio: remove explicit comparison with 0	Saiyam Doshi
	No need to compare return value with 0. In case of non-zero return value, the if condition will be true. This makes intent a bit more clear to the reader. "if (x) then", compared to "if (x is not zero) then". Signed-off-by: Saiyam Doshi <saiyamdoshi.in@gmail.com> Link: https://lore.kernel.org/r/20190907173910.GA9547@SD Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-09-12	nfp: read chip model from the PluDevice register	Dirk van der Merwe
	The PluDevice register provides the authoritative chip model/revision. Since the model number is purely used for reporting purposes, follow the hardware team convention of subtracting 0x10 from the PluDevice register to obtain the chip model/revision number. Suggested-by: Francois H. Theron <francois.theron@netronome.com> Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	tcp: force a PSH flag on TSO packets	Eric Dumazet
	When tcp sends a TSO packet, adding a PSH flag on it reduces the sojourn time of GRO packet in GRO receivers. This is particularly the case under pressure, since RX queues receive packets for many concurrent flows. A sender can give a hint to GRO engines when it is appropriate to flush a super-packet, especially when pacing is in the picture, since next packet is probably delayed by one ms. Having less packets in GRO engine reduces chance of LRU eviction or inflated RTT, and reduces GRO cost. We found recently that we must not set the PSH flag on individual full-size MSS segments [1] : Under pressure (CWR state), we better let the packet sit for a small delay (depending on NAPI logic) so that the ACK packet is delayed, and thus next packet we send is also delayed a bit. Eventually the bottleneck queue can be drained. DCTCP flows with CWND=1 have demonstrated the issue. This patch allows to slowdown the aggregate traffic without involving high resolution timers on senders and/or receivers. It has been used at Google for about four years, and has been discussed at various networking conferences. [1] segments smaller than MSS already have PSH flag set by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE has been requested by the user. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Tariq Toukan <tariqt@mellanox.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR	Neal Cardwell
	Fix tcp_ecn_withdraw_cwr() to clear the correct bit: TCP_ECN_QUEUE_CWR. Rationale: basically, TCP_ECN_DEMAND_CWR is a bit that is purely about the behavior of data receivers, and deciding whether to reflect incoming IP ECN CE marks as outgoing TCP th->ece marks. The TCP_ECN_QUEUE_CWR bit is purely about the behavior of data senders, and deciding whether to send CWR. The tcp_ecn_withdraw_cwr() function is only called from tcp_undo_cwnd_reduction() by data senders during an undo, so it should zero the sender-side state, TCP_ECN_QUEUE_CWR. It does not make sense to stop the reflection of incoming CE bits on incoming data packets just because outgoing packets were spuriously retransmitted. The bug has been reproduced with packetdrill to manifest in a scenario with RFC3168 ECN, with an incoming data packet with CE bit set and carrying a TCP timestamp value that causes cwnd undo. Before this fix, the IP CE bit was ignored and not reflected in the TCP ECE header bit, and sender sent a TCP CWR ('W') bit on the next outgoing data packet, even though the cwnd reduction had been undone. After this fix, the sender properly reflects the CE bit and does not set the W bit. Note: the bug actually predates 2005 git history; this Fixes footer is chosen to be the oldest SHA1 I have tested (from Sep 2007) for which the patch applies cleanly (since before this commit the code was in a .h file). Fixes: bdf1ee5d3bd3 ("[TCP]: Move code from tcp_ecn.h to tcp*.c and tcp.h & remove it") Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
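The distinction between the two bits can be sketched as a small flag operation. This is an illustrative sketch only: the flag names follow the commit message, but the numeric values and the `ecn_withdraw_cwr()` helper are assumptions made for the example, not the kernel's definitions.

```c
#include <stdint.h>

/* Illustrative flag values (assumed); the names follow the commit message. */
#define TCP_ECN_OK          1u
#define TCP_ECN_QUEUE_CWR   2u  /* sender side: a CWR is queued to be sent */
#define TCP_ECN_DEMAND_CWR  4u  /* receiver side: keep echoing ECE until CWR arrives */

/*
 * Sender-side undo: drop only the queued CWR. The receiver-side
 * DEMAND_CWR bit is left alone, so incoming CE marks keep being reflected.
 */
static uint8_t ecn_withdraw_cwr(uint8_t ecn_flags)
{
    return ecn_flags & (uint8_t)~TCP_ECN_QUEUE_CWR;
}
```

Clearing TCP_ECN_DEMAND_CWR here instead would silence the receiver-side echo, which is the behaviour the commit message describes as the bug.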
2019-09-11	ipv6: Don't use dst gateway directly in ip6_confirm_neigh()	Stefano Brivio
	This is the equivalent of commit 2c6b55f45d53 ("ipv6: fix neighbour resolution with raw socket") for ip6_confirm_neigh(): we can send a packet with MSG_CONFIRM on a raw socket for a connected route, so the gateway would be :: here, and we should pick the next hop using rt6_nexthop() instead. This was found by code review and, to the best of my knowledge, doesn't actually fix a practical issue: the destination address from the packet is not considered while confirming a neighbour, as ip6_confirm_neigh() calls choose_neigh_daddr() without passing the packet, so there are no similar issues as the one fixed by said commit. A possible source of issues with the existing implementation might come from the fact that, if we have a cached dst, we won't consider it, while rt6_nexthop() takes care of that. I might just not be creative enough to find a practical problem here: the only way to affect this with cached routes is to have one coming from an ICMPv6 redirect, but if the next hop is a directly connected host, there should be no topology for which a redirect applies here, and tests with redirected routes show no differences for MSG_CONFIRM (and MSG_PROBE) packets on raw sockets destined to a directly connected host. However, directly using the dst gateway here is not consistent anymore with neighbour resolution, and, in general, as we want the next hop, using rt6_nexthop() looks like the only sane way to fetch it. Reported-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: Guillaume Nault <gnault@redhat.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	net: stmmac: pci: Add HAPS support using GMAC5	Jose Abreu
	Add the support for Synopsys HAPS board that uses GMAC5. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	net: phy: dp83867: Add SGMII mode type switching	Vitaly Gaiduk
	This patch adds ability to switch between two PHY SGMII modes. Some hardware, for example, FPGA IP designs may use 6-wire mode which enables differential SGMII clock to MAC. Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	net: phy: dp83867: Add documentation for SGMII mode type	Vitaly Gaiduk
	Add documentation of ti,sgmii-ref-clock-output-enable which can be used to select SGMII mode type (4 or 6-wire). Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11	null_blk: validate the number of devices	André Almeida
	A negative number of devices is nonsensical, so change the type to unsigned. If the number of devices is 0, it is impossible for userspace to interact with the module, so refuse loading the driver for that case. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-11	null_blk: fix module name at log message	André Almeida
	The name of the module is "null_blk", not "null". Make `pr_info()` follow the pattern of `pr_err()` log messages. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-11	docs: block: null_blk: enhance document style	André Almeida
	Use proper ReST syntax for chapters. Add more information to enhance standardization in the file and to make the rendering more homogeneous. Add a SPDX identifier. Mark single-queue mode as deprecated. Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-11	vhost: make sure log_num < in_num	yongduan
	The code assumes log_num < in_num everywhere, and that is true as long as in_num is incremented by descriptor iov count, and log_num by 1. However this breaks if there's a zero sized descriptor. As a result, if a malicious guest creates a vring desc with desc.len = 0, it may cause the host kernel to crash by overflowing the log array. This bug can be triggered during the VM migration. There's no need to log when desc.len = 0, so just don't increment log_num in this case. Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") Cc: stable@vger.kernel.org Reviewed-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: ruippan <ruippan@tencent.com> Signed-off-by: yongduan <yongduan@tencent.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
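The counting rule behind this fix can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the vhost code: `struct desc`, `iov_count()` and `walk_descs()` are hypothetical, and each non-empty descriptor is assumed to translate to exactly one iov entry.

```c
#include <stddef.h>

struct desc {
    unsigned int len;
};

struct counts {
    unsigned int in_num;   /* iov entries produced so far */
    unsigned int log_num;  /* log entries produced so far */
};

/* Hypothetical model: a zero-length descriptor contributes no iov entries. */
static unsigned int iov_count(const struct desc *d)
{
    return d->len ? 1 : 0;
}

static struct counts walk_descs(const struct desc *descs, size_t n)
{
    struct counts c = {0, 0};

    for (size_t i = 0; i < n; i++) {
        c.in_num += iov_count(&descs[i]);
        /* Only log non-empty descriptors so log_num cannot outrun in_num. */
        if (descs[i].len)
            c.log_num++;
    }
    return c;
}
```

Without the `len` check, every zero-length descriptor would bump log_num while leaving in_num unchanged, which is how the log array could be overrun.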
2019-09-11	vhost: block speculation of translated descriptors	Michael S. Tsirkin
	iovec addresses coming from vhost are assumed to be pre-validated, but in fact can be speculated to a value out of range. Userspace address are later validated with array_index_nospec so we can be sure kernel info does not leak through these addresses, but vhost must also not leak userspace info outside the allowed memory table to guests. Following the defence in depth principle, make sure the address is not validated out of node range. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org Acked-by: Jason Wang <jasowang@redhat.com> Tested-by: Jason Wang <jasowang@redhat.com>
2019-09-11	software node: Initialize the return value in software_node_find_by_name()	Heikki Krogerus
	The software node is searched from a list that may be empty when the function is called. This makes sure that the function returns NULL if the list is empty. Fixes: 1666faedb567 ("software node: Add software_node_find_by_name()") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
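The general shape of the fix, initializing the result before searching a possibly empty list, looks roughly like this. The `struct node` list and `find_by_name()` below are hypothetical and much simpler than the kernel's software node list.

```c
#include <stddef.h>
#include <string.h>

struct node {
    const char *name;
    struct node *next;
};

/* Return the first node called name, or NULL if there is none. */
static struct node *find_by_name(struct node *head, const char *name)
{
    struct node *found = NULL;  /* initialized, so an empty list yields NULL */

    for (struct node *n = head; n; n = n->next) {
        if (strcmp(n->name, name) == 0) {
            found = n;
            break;
        }
    }
    return found;
}
```

If `found` were left uninitialized, a call on an empty list would skip the loop body entirely and return an indeterminate pointer.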
2019-09-11	ixgbe: fix double clean of Tx descriptors with xdp	Ilya Maximets
	Tx code doesn't clear the descriptors' status after cleaning. So, if the budget is larger than number of used elems in a ring, some descriptors will be accounted twice and xsk_umem_complete_tx will move prod_tail far beyond the prod_head breaking the completion queue ring. Fix that by limiting the number of descriptors to clean by the number of used descriptors in the Tx ring. 'ixgbe_clean_xdp_tx_irq()' function refactored to look more like 'ixgbe_xsk_clean_tx_ring()' since we're allowed to directly use 'next_to_clean' and 'next_to_use' indexes. CC: stable@vger.kernel.org Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Tested-by: William Tu <u9012063@gmail.com> Tested-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
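The limit described above amounts to clamping the cleaning budget to the number of descriptors currently in use. A minimal sketch, assuming both indexes are always smaller than the ring size and that equal indexes mean an empty ring; `ring_used()` and `clean_limit()` are hypothetical helpers, not driver functions.

```c
#include <stdint.h>

/* Descriptors in use between next_to_clean and next_to_use, with wrap-around. */
static uint32_t ring_used(uint32_t next_to_clean, uint32_t next_to_use,
                          uint32_t ring_size)
{
    return (next_to_use + ring_size - next_to_clean) % ring_size;
}

/* Never clean more descriptors than are actually in use. */
static uint32_t clean_limit(uint32_t budget, uint32_t next_to_clean,
                            uint32_t next_to_use, uint32_t ring_size)
{
    uint32_t used = ring_used(next_to_clean, next_to_use, ring_size);

    return budget < used ? budget : used;
}
```

With a budget of 256 and only four descriptors in use, the cleaner stops after four entries instead of walking on into descriptors it has already completed.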
2019-09-11	ixgbe: Prevent u8 wrapping of ITR value to something less than 10us	Alexander Duyck
	There were a couple cases where the ITR value generated via the adaptive ITR scheme could exceed 126. This resulted in the value becoming either 0 or something less than 10. Switching back and forth between a value less than 10 and a value greater than 10 can cause issues as certain hardware features such as RSC to not function well when the ITR value has dropped that low. CC: stable@vger.kernel.org Fixes: b4ded8327fea ("ixgbe: Update adaptive ITR algorithm") Reported-by: Gregg Leventhal <gleventhal@janestreet.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
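The wrap-around can be illustrated with plain integer arithmetic. This sketch assumes the ITR is held in a u8 whose top bit is a mode flag, leaving seven bits for the microsecond value; that layout and the helper names are assumptions made for illustration, while the limit of 126 comes from the commit message.

```c
#include <stdint.h>

#define ITR_MAX_USECS 126u   /* limit named in the commit message */
#define ITR_FLAG_BIT  0x80u  /* assumed: top bit of the u8 is a mode flag */

/* Unclamped store: values above 127 spill into the flag bit and are lost. */
static uint8_t itr_usecs_unclamped(unsigned int usecs)
{
    uint8_t stored = (uint8_t)usecs;

    return stored & (uint8_t)~ITR_FLAG_BIT;
}

/* Clamped store: the value is limited before it is narrowed to a u8. */
static uint8_t itr_usecs_clamped(unsigned int usecs)
{
    if (usecs > ITR_MAX_USECS)
        usecs = ITR_MAX_USECS;
    return (uint8_t)usecs;
}
```

Under these assumptions a computed value of 128 reads back as 0 and 130 reads back as 2, matching the "either 0 or something less than 10" symptom; clamping first keeps the value at 126.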
2019-09-11	KVM: x86: Fix INIT signal handling in various CPU states	Liran Alon
	Commit cd7764fe9f73 ("KVM: x86: latch INITs while in system management mode") changed code to latch INIT while vCPU is in SMM and process latched INIT when leaving SMM. It left a subtle remark in commit message that similar treatment should also be done while vCPU is in VMX non-root-mode. However, INIT signals should actually be latched in various vCPU states: (*) For both Intel and AMD, INIT signals should be latched while vCPU is in SMM. (*) For Intel, INIT should also be latched while vCPU is in VMX operation and later processed when vCPU leaves VMX operation by executing VMXOFF. (*) For AMD, INIT should also be latched while vCPU runs with GIF=0 or in guest-mode with intercept defined on INIT signal. To fix this: 1) Add kvm_x86_ops->apic_init_signal_blocked() such that each CPU vendor can define the various CPU states in which INIT signals should be blocked and modify kvm_apic_accept_events() to use it. 2) Modify vmx_check_nested_events() to check for pending INIT signal while vCPU in guest-mode. If so, emulate vmexit on EXIT_REASON_INIT_SIGNAL. Note that nSVM should have similar behaviour but is currently left as a TODO comment to implement in the future because nSVM don't yet implement svm_check_nested_events(). Note: Currently KVM nVMX implementation don't support VMX wait-for-SIPI activity state as specified in MSR_IA32_VMX_MISC bits 6:8 exposed to guest (See nested_vmx_setup_ctls_msrs()). If and when support for this activity state will be implemented, kvm_check_nested_events() would need to avoid emulating vmexit on INIT signal in case activity-state is wait-for-SIPI. In addition, kvm_apic_accept_events() would need to be modified to avoid discarding SIPI in case VMX activity-state is wait-for-SIPI but instead delay SIPI processing to vmx_check_nested_events() that would clear pending APIC events and emulate vmexit on SIPI. Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Co-developed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	i40e: fix potential RX buffer starvation for AF_XDP	Magnus Karlsson
	When the RX rings are created they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode, these are user-space buffers and in this case the application might not have sent down any buffers to the driver at this point. And if no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated so the NAPI poll function that allocates buffers to the rings will never get executed. To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code. Once after an XDP program has loaded and once after the umem is registered. This takes care of both cases: XDP program gets loaded first then AF_XDP socket is created, and the reverse, AF_XDP socket is created first, then XDP program is loaded. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	net/ixgbevf: make array api static const, makes object smaller	Colin Ian King
	Don't populate the array API on the stack but instead make it static const. Makes the object code smaller by 58 bytes. Before: text 82969, data 9763, bss 256, dec 92988, hex 16b3c (ixgbevf/ixgbevf_main.o). After: text 82815, data 9859, bss 256, dec 92930, hex 16b02 (ixgbevf/ixgbevf_main.o). (gcc version 9.2.1, amd64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	iavf: fix MAC address setting for VFs when filter is rejected	Stefan Assmann
	Currently iavf unconditionally applies MAC address change requests. This brings the VF in a state where it is no longer able to pass traffic if the PF rejects a MAC filter change for the VF. A typical scenario for a rejected MAC filter is for an untrusted VF to request to change the MAC address when an administratively set MAC is present. To keep iavf working in this scenario the MAC filter handling in iavf needs to act on the PF reply regarding the MAC filter change. In the case of an ack the new MAC address gets set, whereas in the case of a nack the previous MAC address needs to stay in place. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate	Stefan Assmann
	In the case of an invalid min Tx rate being requested i40e_ndo_set_vf_bw() immediately returns -EINVAL instead of releasing __I40E_VIRTCHNL_OP_PENDING first. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: use BIT macro to specify the cloud filter field flags	Jacob Keller
	The macros used to specify the cloud filter fields are intended to be individual bits. Declare them using the BIT() macro to make their intention a little more clear. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
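The difference is purely one of readability: a BIT() macro makes it obvious that each define is a single bit. A minimal sketch, with a locally defined `BIT()` and hypothetical field names that are not the i40e definitions:

```c
/* Local equivalent of a BIT() helper: a mask with only bit n set. */
#define BIT(n) (1UL << (n))

/* Hypothetical filter field flags, one bit each. */
#define FILTER_FIELD_A BIT(0)
#define FILTER_FIELD_B BIT(1)
#define FILTER_FIELD_C BIT(2)

/* Returns non-zero when the given field bit is set in flags. */
static int has_field(unsigned long flags, unsigned long field)
{
    return (flags & field) != 0;
}
```

Written as raw hex (0x0001, 0x0002, 0x0004), the same values give no hint that 0x0003 would be a combination rather than a separate field.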
2019-09-11	i40e: Fix message for other card without FEC.	Czeslaw Zagorski
	When variable "req_fec, fec, an" are empty, dmesg shows log with "Requested FEC: , Negotiated FEC: , Autoneg:". Add link dmesg log for cards without FEC. Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: fix missed "Negotiated" string in i40e_print_link_message()	Aleksandr Loktionov
	The "Negotiated" string in i40e_print_link_message() function was missed. This string has been added to the dmesg and small refactoring done removing common substrings and unifying link status message format. Without this patch it was not clear that FEC is related to negotiated FEC. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: mark additional missing bits as reserved	Jacob Keller
	Mark bits 0xD through 0xF for the command flags of a cloud filter as reserved. These bits are not yet defined and are considered as reserved in the data sheet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP	Jacob Keller
	The bit 0x0001 used in the cloud filters adminq command is reserved, and is not actually a valid type. The Linux driver has never used this type, and it's not clear if any driver ever has. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: use ktime_get_real_ts64 instead of ktime_to_timespec64	Jacob Keller
	Remove a call to ktime_to_timespec64 by calling ktime_get_real_ts64 directly. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	ixgbe: use skb_get_queue_mapping in tx path	Tonghao Zhang
	Use the common api, and don't access queue_mapping directly. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask	Stefan Assmann
	While testing VF spawn/destroy the following panic occurred. BUG: unable to handle kernel NULL pointer dereference at 0000000000000029 [...] Workqueue: i40e i40e_service_task [i40e] RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e] [...] Call Trace: ? __switch_to_asm+0x35/0x70 ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x35/0x70 ? _cond_resched+0x15/0x30 i40e_sync_filters_subtask+0x56/0x70 [i40e] i40e_service_task+0x382/0x11b0 [i40e] ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x41/0x70 process_one_work+0x1a7/0x3b0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_bind+0x30/0x30 ret_from_fork+0x35/0x40 Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get accessed by the watchdog via i40e_sync_filters_subtask() although i40e_free_vfs() already free'd pf->vf. To avoid this the call to i40e_sync_vsi_filters() in i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE, which is also used by i40e_free_vfs(). Note: put the __I40E_VF_DISABLE check after the __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to trigger. CC: stable@vger.kernel.org Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-11	ixgbe: fix memory leaks	Wenwen Wang
	In ixgbe_configure_clsu32(), 'jump', 'input', and 'mask' are allocated through kzalloc() respectively in a for loop body. Then, ixgbe_clsu32_build_input() is invoked to build the input. If this process fails, next iteration of the for loop will be executed. However, the allocated 'jump', 'input', and 'mask' are not deallocated on this execution path, leading to memory leaks. Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
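The leak pattern, and the shape of its fix, can be reproduced in a small userspace loop. This is a sketch under assumptions: `build_input()` is a hypothetical stand-in that fails for odd indexes, and the `outstanding` counter exists only to make a leak observable.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the build step: fails for odd indexes. */
static int build_input(int index, int *input)
{
    if (index % 2)
        return -1;
    *input = index;
    return 0;
}

/*
 * Try n entries and return how many were built. *outstanding counts
 * buffers that were allocated but not yet freed, so a leak would show
 * up as a non-zero value after the call.
 */
static int configure_entries(int n, int *outstanding)
{
    int built = 0;

    for (int i = 0; i < n; i++) {
        int *input = calloc(1, sizeof(*input));

        if (!input)
            return built;
        (*outstanding)++;

        if (build_input(i, input) < 0) {
            /* The fix: release the buffer before the next iteration. */
            free(input);
            (*outstanding)--;
            continue;
        }

        built++;
        /* A real driver would keep the buffer; the sketch releases it here. */
        free(input);
        (*outstanding)--;
    }
    return built;
}
```

Removing the `free()` in the failure branch reproduces the bug: every failed iteration would leave one allocation behind.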
2019-09-11	KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode	Liran Alon
	According to Intel SDM section 25.2 "Other Causes of VM Exits", When INIT signal is received on a CPU that is running in VMX non-root mode it should cause an exit with exit-reason of 3. (See Intel SDM Appendix C "VMX BASIC EXIT REASONS") This patch introduce the exit-reason definition. Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Co-developed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	Merge tag 'kvm-s390-next-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD	Paolo Bonzini
	* More selftests * Improved KVM_S390_MEM_OP ioctl input checking * Add kvm_valid_regs and kvm_dirty_regs invalid bit checking
2019-09-11	KVM: VMX: Stop the preemption timer during vCPU reset	Wanpeng Li
	The hrtimer which is used to emulate lapic timer is stopped during vcpu reset, preemption timer should do the same. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	KVM: LAPIC: Micro optimize IPI latency	Wanpeng Li
	This patch optimizes the virtual IPI emulation sequence. Before: write ICR2, write ICR, read ICR, read ICR2, send virtual IPI. After: write ICR2, read ICR2, send virtual IPI, write ICR. It can reduce kvm-unit-tests/vmexit.flat IPI testing latency (from sender send IPI to sender receive the ACK) from 3319 cycles to 3203 cycles on Skylake server. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	kvm: Nested KVM MMUs need PAE root too	Jiří Paleček
	On AMD processors, in PAE 32bit mode, nested KVM instances don't work. The L0 host get a kernel OOPS, which is related to arch.mmu->pae_root being NULL. The reason for this is that when setting up nested KVM instance, arch.mmu is set to &arch.guest_mmu (while normally, it would be &arch.root_mmu). However, the initialization and allocation of pae_root only creates it in root_mmu. KVM code (ie. in mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the unallocated arch.guest_mmu->pae_root. This fix just allocates (and frees) pae_root in both guest_mmu and root_mmu (and also lm_root if it was allocated). The allocation is subject to previous restrictions ie. it won't allocate anything on 64-bit and AFAIK not on Intel. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=203923 Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu") Signed-off-by: Jiri Palecek <jpalecek@web.de> Tested-by: Jiri Palecek <jpalecek@web.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	KVM: x86: set ctxt->have_exception in x86_decode_insn()	Jan Dakinevich
	x86_emulate_instruction() takes into account ctxt->have_exception flag during instruction decoding, but in practice this flag is never set in x86_decode_insn(). Fixes: 6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn") Cc: stable@vger.kernel.org Cc: Denis Lunev <den@virtuozzo.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	KVM: x86: always stop emulation on page fault	Jan Dakinevich
	inject_emulated_exception() returns true if and only if nested page fault happens. However, page fault can come from guest page tables walk, either nested or not nested. In both cases we should stop an attempt to read under RIP and give guest to step over its own page fault handler. This is also visible when an emulated instruction causes a #GP fault and the VMware backdoor is enabled. To handle the VMware backdoor, KVM intercepts #GP faults; with only the next patch applied, x86_emulate_instruction() injects a #GP but returns EMULATE_FAIL instead of EMULATE_DONE. EMULATE_FAIL causes handle_exception_nmi() (or gp_interception() for SVM) to re-inject the original #GP because it thinks emulation failed due to a non-VMware opcode. This patch prevents the issue as x86_emulate_instruction() will return EMULATE_DONE after injecting the #GP. Fixes: 6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn") Cc: stable@vger.kernel.org Cc: Denis Lunev <den@virtuozzo.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-11	cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available	Wanpeng Li
	The downside of guest side polling is that polling is performed even with other runnable tasks in the host. However, even if polling in kvm can be aware of whether or not there are other runnable tasks on the same pCPU, it can still incur extra overhead in over-subscribed scenarios. Now we can just enable guest polling when dedicated pCPUs are available. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>