linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-09-01	rxrpc: Fix an insufficiently large sglist in rxkad_verify_packet_2()	David Howells
	rxkad_verify_packet_2() has a small stack-allocated sglist of 4 elements, but if that isn't sufficient for the number of fragments in the socket buffer, we try to allocate an sglist large enough to hold all the fragments. However, for large packets with a lot of fragments, this isn't sufficient and we need at least one additional fragment. The problem manifests as skb_to_sgvec() returning -EMSGSIZE and this then getting returned by userspace. Most of the time, this isn't a problem as rxrpc sets a limit of 5692, big enough for 4 jumbo subpackets to be glued together; occasionally, however, the server will ignore the reported limit and give a packet that's a lot bigger - say 19852 bytes with ->nr_frags being 7. skb_to_sgvec() then tries to return a "zeroth" fragment that seems to occur before the fragments counted by ->nr_frags and we hit the end of the sglist too early. Note that __skb_to_sgvec() also has an skb_walk_frags() loop that is recursive up to 24 deep. I'm not sure if I need to take account of that too - or if there's an easy way of counting those frags too. Fix this by counting an extra frag and allocating a larger sglist based on that. Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org
2022-09-01	rxrpc: Fix ICMP/ICMP6 error handling	David Howells
	Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing it to siphon off UDP packets early in the handling of received UDP packets thereby avoiding the packet going through the UDP receive queue, it doesn't get ICMP packets through the UDP ->sk_error_report() callback. In fact, it doesn't appear that there's any usable option for getting hold of ICMP packets. Fix this by adding a new UDP encap hook to distribute error messages for UDP tunnels. If the hook is set, then the tunnel driver will be able to see ICMP packets. The hook provides the offset into the packet of the UDP header of the original packet that caused the notification. An alternative would be to call the ->error_handler() hook - but that requires that the skbuff be cloned (as ip_icmp_error() or ipv6_cmp_error() do, though isn't really necessary or desirable in rxrpc's case is we want to parse them there and then, not queue them). Changes ======= ver #3) - Fixed an uninitialised variable. ver #2) - Fixed some missing CONFIG_AF_RXRPC_IPV6 conditionals. Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook") Signed-off-by: David Howells <dhowells@redhat.com>
2022-09-01	mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding ↵	Waiman Long
	slab_mutex/cpu_hotplug_lock A circular locking problem is reported by lockdep due to the following circular locking dependency. +--> cpu_hotplug_lock --> slab_mutex --> kn->active --+ \| \| +-----------------------------------------------------+ The forward cpu_hotplug_lock ==> slab_mutex ==> kn->active dependency happens in kmem_cache_destroy(): cpus_read_lock(); mutex_lock(&slab_mutex); ==> sysfs_slab_unlink() ==> kobject_del() ==> kernfs_remove() ==> __kernfs_remove() ==> kernfs_drain(): rwsem_acquire(&kn->dep_map, ...); The backward kn->active ==> cpu_hotplug_lock dependency happens in kernfs_fop_write_iter(): kernfs_get_active(); ==> slab_attr_store() ==> cpu_partial_store() ==> flush_all(): cpus_read_lock() One way to break this circular locking chain is to avoid holding cpu_hotplug_lock and slab_mutex while deleting the kobject in sysfs_slab_unlink() which should be equivalent to doing a write_lock and write_unlock pair of the kn->active virtual lock. Since the kobject structures are not protected by slab_mutex or the cpu_hotplug_lock, we can certainly release those locks before doing the delete operation. Move sysfs_slab_unlink() and sysfs_slab_release() to the newly created kmem_cache_release() and call it outside the slab_mutex & cpu_hotplug_lock critical sections. There will be a slight delay in the deletion of sysfs files if kmem_cache_release() is called indirectly from a work function. Fixes: 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context") Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: David Rientjes <rientjes@google.com> Link: https://lore.kernel.org/all/YwOImVd+nRUsSAga@hyeyoo/ Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-09-01	Merge branch 'rk3588-ethernet-support'	Paolo Abeni
	Sebastian Reichel says: ==================== RK3588 Ethernet Support This adds ethernet support for the RK3588(s) SoCs. Changes since PATCHv2: * Rebased to v6.0-rc1 * Wrap DT bindings additions at 80 characters * Add Acked-by from Krzysztof Changes since PATCHv1: * Drop first patch (not required) and rebase second patch accordingly * Rename DT property rockchip,php_grf -> rockchip,php-grf ==================== Link: https://lore.kernel.org/r/20220830154559.61506-1-sebastian.reichel@collabora.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-01	dt-bindings: net: rockchip-dwmac: add rk3588 gmac compatible	Sebastian Reichel
	Add compatible string for RK3588 gmac, which is similar to the RK3568 one, but needs another syscon device for clock selection. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-01	net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588	David Wu
	Add constants and callback functions for the dwmac on RK3588 soc. As can be seen, the base structure is the same, only registers and the bits in them moved slightly. Signed-off-by: David Wu <david.wu@rock-chips.com> [rebase, squash fixes] Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-01	Merge tag 'usb-serial-6.0-rc4' of ↵	Greg Kroah-Hartman
	https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: "USB-serial fixes for 6.0-rc4 Here are a couple of fixes for two long-standing issues with some older ch341 devices and a number of new device ids. All have been in linux-next with no reported issues." * tag 'usb-serial-6.0-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: ch341: fix disabled rx timer on older devices USB: serial: ch341: fix lost character on LCR updates USB: serial: cp210x: add Decagon UCA device id USB: serial: option: add support for Cinterion MV32-WA/WB RmNet mode USB: serial: ftdi_sio: add Omron CS1W-CIF31 device id USB: serial: option: add Quectel EM060K modem USB: serial: option: add support for OPPO R11 diag port
2022-09-01	soundwire: qcom: fix device status array range	Srinivas Kandagatla
	This patch updates device status array range from 11 to 12 as we will be reading status from device number 0 to device number 11 inclusive. Without this patch we can potentially access status array out of range during auto-enumeration. Fixes: aa1262ca6695 ("soundwire: qcom: Check device status before reading devid") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20220708104747.8722-1-srinivas.kandagatla@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-01	net/smc: Remove redundant refcount increase	Yacan Liu
	For passive connections, the refcount increment has been done in smc_clcsock_accept()-->smc_sock_alloc(). Fixes: 3b2dec2603d5 ("net/smc: restructure client and server code in af_smc") Signed-off-by: Yacan Liu <liuyacan@corp.netease.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Link: https://lore.kernel.org/r/20220830152314.838736-1-liuyacan@corp.netease.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-01	octeontx2-pf: Add egress PFC support	Suman Ghosh
	As of now all transmit queues transmit packets out of same scheduler queue hierarchy. Due to this PFC frames sent by peer are not handled properly, either all transmit queues are backpressured or none. To fix this when user enables PFC for a given priority map relavant transmit queue to a different scheduler queue hierarcy, so that backpressure is applied only to the traffic egressing out of that TXQ. Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20220830120304.158060-1-sumang@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-01	Merge tag 'drm-intel-fixes-2022-08-26' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - GVT fixes including fix for a CommetLake regression in mmio table and misc doc and typo fixes - Fix CCS handling (Matt) - Fix for guc requests after reset (Daniele) - Display DSI related fixes (Jani) - Display backlight related fixes (Arun, Jouni) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YwjCTDFm7clXPgEu@intel.com
2022-09-01	net: sched: remove redundant NULL check in change hook function	Zhengchao Shao
	Currently, the change function can be called by two ways. The one way is that qdisc_change() will call it. Before calling change function, qdisc_change() ensures tca[TCA_OPTIONS] is not empty. The other way is that .init() will call it. The opt parameter is also checked before calling change function in .init(). Therefore, it's no need to check the input parameter opt in change function. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://lore.kernel.org/r/20220829071219.208646-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-31	Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"	Jakub Kicinski
	This reverts commit 90fabae8a2c225c4e4936723c38857887edde5cc. Patch was applied hastily, revert and let the v2 be reviewed. Fixes: 90fabae8a2c2 ("sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb") Link: https://lore.kernel.org/all/87wnao2ha3.fsf@toke.dk/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	Merge branch 'tcp-tcp-challenge-ack-fixes'	Jakub Kicinski
	Eric Dumazet says: ==================== tcp: tcp challenge ack fixes syzbot found a typical data-race addressed in the first patch. While we are at it, second patch makes the global rate limit per net-ns and disabled by default. ==================== Link: https://lore.kernel.org/r/20220830185656.268523-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	tcp: make global challenge ack rate limitation per net-ns and default disabled	Eric Dumazet
	Because per host rate limiting has been proven problematic (side channel attacks can be based on it), per host rate limiting of challenge acks ideally should be per netns and turned off by default. This is a long due followup of following commits: 083ae308280d ("tcp: enable per-socket rate limiting of all 'challenge acks'") f2b2c582e824 ("tcp: mitigate ACK loops for connections as tcp_sock") 75ff39ccc1bd ("tcp: make challenge acks less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	tcp: annotate data-race around challenge_timestamp	Eric Dumazet
	challenge_timestamp can be read an written by concurrent threads. This was expected, but we need to annotate the race to avoid potential issues. Following patch moves challenge_timestamp and challenge_count to per-netns storage to provide better isolation. Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: dsa: hellcreek: Print warning only once	Kurt Kanzenbach
	In case the source port cannot be decoded, print the warning only once. This still brings attention to the user and does not spam the logs at the same time. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220830163448.8921-1-kurt@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net-next: Fix IP_UNICAST_IF option behavior for connected sockets	Richard Gobert
	The IP_UNICAST_IF socket option is used to set the outgoing interface for outbound packets. The IP_UNICAST_IF socket option was added as it was needed by the Wine project, since no other existing option (SO_BINDTODEVICE socket option, IP_PKTINFO socket option or the bind function) provided the needed characteristics needed by the IP_UNICAST_IF socket option. [1] The IP_UNICAST_IF socket option works well for unconnected sockets, that is, the interface specified by the IP_UNICAST_IF socket option is taken into consideration in the route lookup process when a packet is being sent. However, for connected sockets, the outbound interface is chosen when connecting the socket, and in the route lookup process which is done when a packet is being sent, the interface specified by the IP_UNICAST_IF socket option is being ignored. This inconsistent behavior was reported and discussed in an issue opened on systemd's GitHub project [2]. Also, a bug report was submitted in the kernel's bugzilla [3]. To understand the problem in more detail, we can look at what happens for UDP packets over IPv4 (The same analysis was done separately in the referenced systemd issue). When a UDP packet is sent the udp_sendmsg function gets called and the following happens: 1. The oif member of the struct ipcm_cookie ipc (which stores the output interface of the packet) is initialized by the ipcm_init_sk function to inet->sk.sk_bound_dev_if (the device set by the SO_BINDTODEVICE socket option). 2. If the IP_PKTINFO socket option was set, the oif member gets overridden by the call to the ip_cmsg_send function. 3. If no output interface was selected yet, the interface specified by the IP_UNICAST_IF socket option is used. 4. If the socket is connected and no destination address is specified in the send function, the struct ipcm_cookie ipc is not taken into consideration and the cached route, that was calculated in the connect function is being used. Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken into consideration. This patch corrects the behavior of the IP_UNICAST_IF socket option for connect()ed sockets by taking into consideration the IP_UNICAST_IF sockopt when connecting the socket. In order to avoid reconnecting the socket, this option is still ignored when applied on an already connected socket until connect() is called again by the Richard Gobert. Change the __ip4_datagram_connect function, which is called during socket connection, to take into consideration the interface set by the IP_UNICAST_IF socket option, in a similar way to what is done in the udp_sendmsg function. [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/ [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018 [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255 Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829111554.GA1771@debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	ip: fix triggering of 'icmp redirect'	Nicolas Dichtel
	__mkroute_input() uses fib_validate_source() to trigger an icmp redirect. My understanding is that fib_validate_source() is used to know if the src address and the gateway address are on the same link. For that, fib_validate_source() returns 1 (same link) or 0 (not the same network). __mkroute_input() is the only user of these positive values, all other callers only look if the returned value is negative. Since the below patch, fib_validate_source() didn't return anymore 1 when both addresses are on the same network, because the route lookup returns RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right. Let's adapat the test to return 1 again when both addresses are on the same link. CC: stable@vger.kernel.org Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop") Reported-by: kernel test robot <yujie.liu@intel.com> Reported-by: Heng Qi <hengqi@linux.alibaba.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	Merge branch 'net-sched-remove-unused-variables'	Jakub Kicinski
	Zhengchao Shao says: ==================== net: sched: remove unused variables The variable "other" is unused, remove it. ==================== Link: https://lore.kernel.org/r/20220830092255.281330-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: sched: gred/red: remove unused variables in struct red_stats	Zhengchao Shao
	The variable "other" in the struct red_stats is not used. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: sched: choke: remove unused variables in struct choke_sched_data	Zhengchao Shao
	The variable "other" in the struct choke_sched_data is not used. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: axienet: Switch to 64-bit RX/TX statistics	Robert Hancock
	The RX and TX byte/packet statistics in this driver could be overflowed relatively quickly on a 32-bit platform. Switch these stats to use the u64_stats infrastructure to avoid this. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Link: https://lore.kernel.org/r/20220829233901.3429419-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net/rds: Pass a pointer to virt_to_page()	Linus Walleij
	Functions that work on a pointer to virtual memory such as virt_to_pfn() and users of that function such as virt_to_page() are supposed to pass a pointer to virtual memory, ideally a (void ) or other pointer. However since many architectures implement virt_to_pfn() as a macro, this function becomes polymorphic and accepts both a (unsigned long) and a (void ). If we instead implement a proper virt_to_pfn(void *addr) function the following happens (occurred on arch/arm): net/rds/message.c:357:56: warning: passing argument 1 of 'virt_to_pfn' makes pointer from integer without a cast [-Wint-conversion] Fix this with an explicit cast. Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: rds-devel@oss.oracle.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220829132001.114858-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01	netfilter: nf_conntrack_irc: Fix forged IP logic	David Leadbeater
	Ensure the match happens in the right direction, previously the destination used was the server, not the NAT host, as the comment shows the code intended. Additionally nf_nat_irc uses port 0 as a signal and there's no valid way it can appear in a DCC message, so consider port 0 also forged. Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port") Signed-off-by: David Leadbeater <dgl@dgl.cx> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-31	mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse	Jann Horn
	anon_vma->degree tracks the combined number of child anon_vmas and VMAs that use the anon_vma as their ->anon_vma. anon_vma_clone() then assumes that for any anon_vma attached to src->anon_vma_chain other than src->anon_vma, it is impossible for it to be a leaf node of the VMA tree, meaning that for such VMAs ->degree is elevated by 1 because of a child anon_vma, meaning that if ->degree equals 1 there are no VMAs that use the anon_vma as their ->anon_vma. This assumption is wrong because the ->degree optimization leads to leaf nodes being abandoned on anon_vma_clone() - an existing anon_vma is reused and no new parent-child relationship is created. So it is possible to reuse an anon_vma for one VMA while it is still tied to another VMA. This is an issue because is_mergeable_anon_vma() and its callers assume that if two VMAs have the same ->anon_vma, the list of anon_vmas attached to the VMAs is guaranteed to be the same. When this assumption is violated, vma_merge() can merge pages into a VMA that is not attached to the corresponding anon_vma, leading to dangling page->mapping pointers that will be dereferenced during rmap walks. Fix it by separately tracking the number of child anon_vmas and the number of VMAs using the anon_vma as their ->anon_vma. Fixes: 7a3ef208e662 ("mm: prevent endless growth of anon_vma hierarchy") Cc: stable@kernel.org Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-31	r8152: allow userland to disable multicast	Sven van Ashbrook
	The rtl8152 driver does not disable multicasting when userspace asks it to. For example: $ ifconfig eth0 -multicast -allmulti $ tcpdump -p -i eth0 # will still capture multicast frames Fix by clearing the device multicast filter table when multicast and allmulti are both unset. Tested as follows: - Set multicast on eth0 network interface - verify that multicast packets are coming in: $ tcpdump -p -i eth0 - Clear multicast and allmulti on eth0 network interface - verify that no more multicast packets are coming in: $ tcpdump -p -i eth0 Signed-off-by: Sven van Ashbrook <svenva@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20220830045923.net-next.v1.1.I4fee0ac057083d4f848caf0fa3a9fd466fc374a0@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb	Toke Høiland-Jørgensen
	When the GSO splitting feature of sch_cake is enabled, GSO superpackets will be broken up and the resulting segments enqueued in place of the original skb. In this case, CAKE calls consume_skb() on the original skb, but still returns NET_XMIT_SUCCESS. This can confuse parent qdiscs into assuming the original skb still exists, when it really has been freed. Fix this by adding the __NET_XMIT_STOLEN flag to the return value in this case. Fixes: 0c850344d388 ("sch_cake: Conditionally split GSO segments") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231 Link: https://lore.kernel.org/r/20220831092103.442868-1-toke@toke.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: ethernet: move from strlcpy with unused retval to strscpy	Wolfram Sang
	Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4\|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: move from strlcpy with unused retval to strscpy	Wolfram Sang
	Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	Merge branch 'fixes for concurrent htab updates'	Martin KaFai Lau
	Hou Tao says: ==================== From: Hou Tao <houtao1@huawei.com> Hi, The patchset aims to fix the issues found during investigating the syzkaller problem reported in [0]. It seems that the concurrent updates to the same hash-table bucket may fail as shown in patch 1. Patch 1 uses preempt_disable() to fix the problem for htab_use_raw_lock() case. For !htab_use_raw_lock() case, the problem is left to "BPF specific memory allocator" patchset [1] in which !htab_use_raw_lock() will be removed. Patch 2 fixes the out-of-bound memory read problem reported in [0]. The problem has the root cause as patch 1 and it is fixed by handling -EBUSY from htab_lock_bucket() correctly. Patch 3 add two cases for hash-table update: one for the reentrancy of bpf_map_update_elem(), and another one for concurrent updates of the same hash-table bucket. Comments are always welcome. Regards, Tao [0]: https://lore.kernel.org/bpf/CACkBjsbuxaR6cv0kXJoVnBfL9ZJXjjoUcMpw_Ogc313jSrg14A@mail.gmail.com/ [1]: https://lore.kernel.org/bpf/20220819214232.18784-1-alexei.starovoitov@gmail.com/ Change Log: v4: * rebased on bpf-next * add htab_update to DENYLIST.s390x v3: https://lore.kernel.org/bpf/20220829023709.1958204-1-houtao@huaweicloud.com/ * patch 1: update commit message and add Fixes tag * patch 2: add Fixes tag * patch 3: elaborate the description of test cases v2: https://lore.kernel.org/bpf/bd60ef93-1c6a-2db2-557d-b09b92ad22bd@huaweicloud.com/ * Note the fix is for CONFIG_PREEMPT case in commit message and add Reviewed-by tag for patch 1 * Drop patch "bpf: Allow normally concurrent map updates for !htab_use_raw_lock() case" v1: https://lore.kernel.org/bpf/20220821033223.2598791-1-houtao@huaweicloud.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31	selftests/bpf: Add test cases for htab update	Hou Tao
	One test demonstrates the reentrancy of hash map update on the same bucket should fail, and another one shows concureently updates of the same hash map bucket should succeed and not fail due to the reentrancy checking for bucket lock. There is no trampoline support on s390x, so move htab_update to denylist. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31	bpf: Propagate error from htab_lock_bucket() to userspace	Hou Tao
	In __htab_map_lookup_and_delete_batch() if htab_lock_bucket() returns -EBUSY, it will go to next bucket. Going to next bucket may not only skip the elements in current bucket silently, but also incur out-of-bound memory access or expose kernel memory to userspace if current bucket_cnt is greater than bucket_size or zero. Fixing it by stopping batch operation and returning -EBUSY when htab_lock_bucket() fails, and the application can retry or skip the busy batch as needed. Fixes: 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked") Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-3-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31	bpf: Disable preemption when increasing per-cpu map_locked	Hou Tao
	Per-cpu htab->map_locked is used to prohibit the concurrent accesses from both NMI and non-NMI contexts. But since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT"), migrate_disable() is also preemptible under CONFIG_PREEMPT case, so now map_locked also disallows concurrent updates from normal contexts (e.g. userspace processes) unexpectedly as shown below: process A process B htab_map_update_elem() htab_lock_bucket() migrate_disable() /* return 1 / __this_cpu_inc_return() / preempted by B / htab_map_update_elem() / the same bucket as A / htab_lock_bucket() migrate_disable() / return 2, so lock fails */ __this_cpu_inc_return() return -EBUSY A fix that seems feasible is using in_nmi() in htab_lock_bucket() and only checking the value of map_locked for nmi context. But it will re-introduce dead-lock on bucket lock if htab_lock_bucket() is re-entered through non-tracing program (e.g. fentry program). One cannot use preempt_disable() to fix this issue as htab_use_raw_lock being false causes the bucket lock to be a spin lock which can sleep and does not work with preempt_disable(). Therefore, use migrate_disable() when using the spinlock instead of preempt_disable() and defer fixing concurrent updates to when the kernel has its own BPF memory allocator. Fixes: 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT") Reviewed-by: Hao Luo <haoluo@google.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-2-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31	drm/amd/amdgpu: skip ucode loading if ucode_size == 0	Chengming Gui
	Restrict the ucode loading check to avoid frontdoor loading error. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-31	selftest/bpf: Ensure no module loading in bpf_setsockopt(TCP_CONGESTION)	Martin KaFai Lau
	This patch adds a test to ensure bpf_setsockopt(TCP_CONGESTION, "not_exist") will not trigger the kernel module autoload. Before the fix: [ 40.535829] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274 [...] [ 40.552134] tcp_ca_find_autoload.constprop.0+0xcb/0x200 [ 40.552689] tcp_set_congestion_control+0x99/0x7b0 [ 40.553203] do_tcp_setsockopt+0x3ed/0x2240 [...] [ 40.556041] __bpf_setsockopt+0x124/0x640 Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231953.792412-1-martin.lau@linux.dev
2022-08-31	bpf, net: Avoid loading module when calling bpf_setsockopt(TCP_CONGESTION)	Martin KaFai Lau
	When bpf prog changes tcp-cc by calling bpf_setsockopt(TCP_CONGESTION), it should not try to load module which may be a blocking operation. This details was correct in the v1 [0] but missed by mistake in the later revision in commit cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()"). This patch fixes it by checking the has_current_bpf_ctx(). [0] https://lore.kernel.org/bpf/20220727060921.2373314-1-kafai@fb.com/ Fixes: cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()") Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231946.791504-1-martin.lau@linux.dev
2022-08-31	selftests: net: sort .gitignore file	Axel Rasmussen
	This is the result of `sort tools/testing/selftests/net/.gitignore`, but preserving the comment at the top. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Link: https://lore.kernel.org/r/20220829184748.1535580-1-axelrasmussen@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	Documentation: networking: correct possessive "its"	Randy Dunlap
	Change occurrences of "it's" that are possessive to "its" so that they don't read as "it is". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20220829235414.17110-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: phy: smsc: use device-managed clock API	Heiner Kallweit
	Simplify the code by using the device-managed clock API. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/b222be68-ba7e-999d-0a07-eca0ecedf74e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	kcm: fix strp_init() order and cleanup	Cong Wang
	strp_init() is called just a few lines above this csk->sk_user_data check, it also initializes strp->work etc., therefore, it is unnecessary to call strp_done() to cancel the freshly initialized work. And if sk_user_data is already used by KCM, psock->strp should not be touched, particularly strp->work state, so we need to move strp_init() after the csk->sk_user_data check. This also makes a lockdep warning reported by syzbot go away. Reported-and-tested-by: syzbot+9fc084a4348493ef65d2@syzkaller.appspotmail.com Reported-by: syzbot+e696806ef96cdd2d87cd@syzkaller.appspotmail.com Fixes: e5571240236c ("kcm: Check if sk_user_data already set in kcm_attach") Fixes: dff8baa26117 ("kcm: Call strp_stop before strp_done in kcm_attach") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220827181314.193710-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	mlxbf_gige: compute MDIO period based on i1clk	David Thompson
	This patch adds logic to compute the MDIO period based on the i1clk, and thereafter write the MDIO period into the YU MDIO config register. The i1clk resource from the ACPI table is used to provide addressing to YU bootrecord PLL registers. The values in these registers are used to compute MDIO period. If the i1clk resource is not present in the ACPI table, then the current default hardcorded value of 430Mhz is used. The i1clk clock value of 430MHz is only accurate for boards with BF2 mid bin and main bin SoCs. The BF2 high bin SoCs have i1clk = 500MHz, but can support a slower MDIO period. Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver") Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com> Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/20220826155916.12491-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	Revert "clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops"	Stephen Boyd
	This reverts commit 35b0fac808b95eea1212f8860baf6ad25b88b087. Alexander reports that it causes boot failures on i.MX8M Plus based boards (specifically imx8mp-tqma8mpql-mba8mpxl.dts). Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com> Cc: Chen-Yu Tsai <wenst@chromium.org> Fixes: 35b0fac808b9 ("clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops") Link: https://lore.kernel.org/r/12115951.O9o76ZdvQC@steina-w Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20220831175326.2523912-1-sboyd@kernel.org
2022-08-31	libbpf: Add GCC support for bpf_tail_call_static	James Hilliard
	The bpf_tail_call_static function is currently not defined unless using clang >= 8. To support bpf_tail_call_static on GCC we can check if __clang__ is not defined to enable bpf_tail_call_static. We need to use GCC assembly syntax when the compiler does not define __clang__ as LLVM inline assembly is not fully compatible with GCC. Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829210546.755377-1-james.hilliard1@gmail.com
2022-08-31	Merge tag 'fscache-fixes-20220831' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache/cachefiles fixes from David Howells: - Fix kdoc on fscache_use/unuse_cookie(). - Fix the error returned by cachefiles_ondemand_copen() from an upcall result. - Fix the distribution of requests in on-demand mode in cachefiles to be fairer by cycling through them rather than picking the one with the lowest ID each time (IDs being reused). * tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: cachefiles: make on-demand request distribution fairer cachefiles: fix error return code in cachefiles_ondemand_copen() fscache: fix misdocumented parameter
2022-08-31	Merge tag 'for-linus-2022083101' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - NULL pointer dereference fix for Steam driver (Lee Jones) - memory leak fix for hidraw (Karthik Alapati) - regression fix for functionality of some UCLogic tables (Benjamin Tissoires) - a few new device IDs and device-specific quirks * tag 'for-linus-2022083101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: nintendo: fix rumble worker null pointer deref HID: intel-ish-hid: ipc: Add Meteor Lake PCI device ID HID: input: fix uclogic tablets HID: Add Apple Touchbar on T2 Macs in hid_have_special_driver list HID: add Lenovo Yoga C630 battery quirk HID: AMD_SFH: Add a DMI quirk entry for Chromebooks HID: thrustmaster: Add sparco wheel and fix array length hid: intel-ish-hid: ishtp: Fix ishtp client sending disordered message HID: ishtp-hid-clientHID: ishtp-hid-client: Fix comment typo HID: asus: ROG NKey: Ignore portion of 0x5a report HID: hidraw: fix memory leak in hidraw_release() HID: steam: Prevent NULL pointer dereference in steam_{recv,send}_report
2022-08-31	Merge tag 'v6.0-p2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a boot performance regression due to an unnecessary dependency on XOR_BLOCKS" * tag 'v6.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: lib - remove unneeded selection of XOR_BLOCKS
2022-08-31	Merge tag 'lsm-pr-20220829' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM support for IORING_OP_URING_CMD from Paul Moore: "Add SELinux and Smack controls to the io_uring IORING_OP_URING_CMD. These are necessary as without them the IORING_OP_URING_CMD remains outside the purview of the LSMs (Luis' LSM patch, Casey's Smack patch, and my SELinux patch). They have been discussed at length with the io_uring folks, and Jens has given his thumbs-up on the relevant patches (see the commit descriptions). There is one patch that is not strictly necessary, but it makes testing much easier and is very trivial: the /dev/null IORING_OP_URING_CMD patch." * tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: Smack: Provide read control for io_uring_cmd /dev/null: add IORING_OP_URING_CMD support selinux: implement the security_uring_cmd() LSM hook lsm,io_uring: add LSM hooks for the new uring_cmd file op
2022-08-31	gpio: realtek-otto: switch to 32-bit I/O	Sander Vanheule
	By using 16-bit I/O on the GPIO peripheral, which is apparently not safe on MIPS, the IMR can end up containing garbage. This then results in interrupt triggers for lines that don't have an interrupt handler associated. The irq_desc lookup fails, and the ISR will not be cleared, keeping the CPU busy until reboot, or until another IMR operation restores the correct value. This situation appears to happen very rarely, for < 0.5% of IMR writes. Instead of using 8-bit or 16-bit I/O operations on the 32-bit memory mapped peripheral registers, switch to using 32-bit I/O only, operating on the entire bank for all single bit line settings. For 2-bit line settings, with 16-bit port values, stick to manual (un)packing. This issue has been seen on RTL8382M (HPE 1920-16G), RTL8391M (Netgear GS728TP v2), and RTL8393M (D-Link DGS-1210-52 F3, Zyxel GS1900-48). Reported-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> # DGS-1210-52 Reported-by: Birger Koblitz <mail@birger-koblitz.de> # GS728TP Reported-by: Jan Hoffmann <jan@3e8.eu> # 1920-16G Fixes: 0d82fb1127fb ("gpio: Add Realtek Otto GPIO support") Signed-off-by: Sander Vanheule <sander@svanheule.net> Cc: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-08-31	cachefiles: make on-demand request distribution fairer	Xin Yin
	For now, enqueuing and dequeuing on-demand requests all start from idx 0, this makes request distribution unfair. In the weighty concurrent I/O scenario, the request stored in higher idx will starve. Searching requests cyclically in cachefiles_ondemand_daemon_read, makes distribution fairer. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Reported-by: Yongqing Li <liyongqing@bytedance.com> Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220817065200.11543-1-yinxin.x@bytedance.com/ # v1 Link: https://lore.kernel.org/r/20220825020945.2293-1-yinxin.x@bytedance.com/ # v2