summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-07net: switchdev: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: watchdog: add net device refcount trackerEric Dumazet
Add a netdevice_tracker inside struct net_device, to track the self reference when a device has an active watchdog timer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: bridge: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07vlan: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: eql: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07cifs: Fix crash on unload of cifs_arc4.koVincent Whitchurch
The exit function is wrongly placed in the __init section and this leads to a crash when the module is unloaded. Just remove both the init and exit functions since this module does not need them. Fixes: 71c02863246167b3d ("cifs: fork arc4 and create a separate module...") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: stable@vger.kernel.org # 5.15 Signed-off-by: Steve French <stfrench@microsoft.com>
2021-12-07MAINTAINERS: net: mlxsw: Remove Jiri as a maintainer, add myselfPetr Machata
Jiri has moved on and will not carry out the mlxsw maintainership duty any longer. Add myself as a co-maintainer instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/45b54312cdebaf65c5d110b15a5dd2df795bf2be.1638807297.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch 'net-tls-cover-all-ciphers-with-tests'Jakub Kicinski
Vadim Fedorenko says: ==================== net: tls: cover all ciphers with tests Recent patches to Kernel TLS showed that some ciphers are not covered with tests. Let's cover missed. ==================== Link: https://lore.kernel.org/r/20211206213932.7508-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES256-GCM cipherVadim Fedorenko
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES-CCM cipher testsVadim Fedorenko
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07clk: Don't parent clks until the parent is fully registeredMike Tipton
Before commit fc0c209c147f ("clk: Allow parents to be specified without string names") child clks couldn't find their parent until the parent clk was added to a list in __clk_core_init(). After that commit, child clks can reference their parent clks directly via a clk_hw pointer, or they can lookup that clk_hw pointer via DT if the parent clk is registered with an OF clk provider. The common clk framework treats hw->core being non-NULL as "the clk is registered" per the logic within clk_core_fill_parent_index(): parent = entry->hw->core; /* * We have a direct reference but it isn't registered yet? * Orphan it and let clk_reparent() update the orphan status * when the parent is registered. */ if (!parent) Therefore we need to be extra careful to not set hw->core until the clk is fully registered with the clk framework. Otherwise we can get into a situation where a child finds a parent clk and we move the child clk off the orphan list when the parent isn't actually registered, wrecking our enable accounting and breaking critical clks. Consider the following scenario: CPU0 CPU1 ---- ---- struct clk_hw clkBad; struct clk_hw clkA; clkA.init.parent_hws = { &clkBad }; clk_hw_register(&clkA) clk_hw_register(&clkBad) ... __clk_register() hw->core = core ... __clk_register() __clk_core_init() clk_prepare_lock() __clk_init_parent() clk_core_get_parent_by_index() clk_core_fill_parent_index() if (entry->hw) { parent = entry->hw->core; At this point, 'parent' points to clkBad even though clkBad hasn't been fully registered yet. Ouch! A similar problem can happen if a clk controller registers orphan clks that are referenced in the DT node of another clk controller. Let's fix all this by only setting the hw->core pointer underneath the clk prepare lock in __clk_core_init(). This way we know that clk_core_fill_parent_index() can't see hw->core be non-NULL until the clk is fully registered. Fixes: fc0c209c147f ("clk: Allow parents to be specified without string names") Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com> Link: https://lore.kernel.org/r/20211109043438.4639-1-quic_mdtipton@quicinc.com [sboyd@kernel.org: Reword commit text, update comment] Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-12-08netfilter: conntrack: annotate data-races around ct->timeoutEric Dumazet
(struct nf_conn)->timeout can be read/written locklessly, add READ_ONCE()/WRITE_ONCE() to prevent load/store tearing. BUG: KCSAN: data-race in __nf_conntrack_alloc / __nf_conntrack_find_get write to 0xffff888132e78c08 of 4 bytes by task 6029 on cpu 0: __nf_conntrack_alloc+0x158/0x280 net/netfilter/nf_conntrack_core.c:1563 init_conntrack+0x1da/0xb30 net/netfilter/nf_conntrack_core.c:1635 resolve_normal_ct+0x502/0x610 net/netfilter/nf_conntrack_core.c:1746 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline] tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680 __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864 tcp_push_pending_frames include/net/tcp.h:1897 [inline] tcp_data_snd_check+0x62/0x2e0 net/ipv4/tcp_input.c:5452 tcp_rcv_established+0x880/0x10e0 net/ipv4/tcp_input.c:5947 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 sk_stream_wait_memory+0x435/0x700 net/core/stream.c:145 tcp_sendmsg_locked+0xb85/0x25a0 net/ipv4/tcp.c:1402 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1440 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:644 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] __sys_sendto+0x21e/0x2c0 net/socket.c:2036 __do_sys_sendto net/socket.c:2048 [inline] __se_sys_sendto net/socket.c:2044 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2044 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888132e78c08 of 4 bytes by task 17446 on cpu 1: nf_ct_is_expired include/net/netfilter/nf_conntrack.h:286 [inline] ____nf_conntrack_find net/netfilter/nf_conntrack_core.c:776 [inline] __nf_conntrack_find_get+0x1c7/0xac0 net/netfilter/nf_conntrack_core.c:807 resolve_normal_ct+0x273/0x610 net/netfilter/nf_conntrack_core.c:1734 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 __tcp_send_ack+0x1fd/0x300 net/ipv4/tcp_output.c:3956 tcp_send_ack+0x23/0x30 net/ipv4/tcp_output.c:3962 __tcp_ack_snd_check+0x2d8/0x510 net/ipv4/tcp_input.c:5478 tcp_ack_snd_check net/ipv4/tcp_input.c:5523 [inline] tcp_rcv_established+0x8c2/0x10e0 net/ipv4/tcp_input.c:5948 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 tcp_sendpage+0x94/0xb0 net/ipv4/tcp.c:1114 inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833 rds_tcp_xmit+0x376/0x5f0 net/rds/tcp_send.c:118 rds_send_xmit+0xbed/0x1500 net/rds/send.c:367 rds_send_worker+0x43/0x200 net/rds/threads.c:200 process_one_work+0x3fc/0x980 kernel/workqueue.c:2298 worker_thread+0x616/0xa70 kernel/workqueue.c:2445 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x00027cc2 -> 0x00000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G W 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: krdsd rds_send_worker Note: I chose an arbitrary commit for the Fixes: tag, because I do not think we need to backport this fix to very old kernels. Fixes: e37542ba111f ("netfilter: conntrack: avoid possible false sharing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: switch zone stress to socatFlorian Westphal
centos9 has nmap-ncat which doesn't like the '-q' option, use socat. While at it, mark test skipped if needed tools are missing. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08netfilter: nft_exthdr: break evaluation if setting TCP option failsPablo Neira Ayuso
Break rule evaluation on malformed TCP options. Fixes: 99d1712bc41c ("netfilter: exthdr: tcp option set support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: Add correctness test for mac,net set typeStefano Brivio
The existing net,mac test didn't cover the issue recently reported by Nikita Yushchenko, where MAC addresses wouldn't match if given as first field of a concatenated set with AVX2 and 8-bit groups, because there's a different code path covering the lookup of six 8-bit groups (MAC addresses) if that's the first field. Add a similar mac,net test, with MAC address and IPv4 address swapped in the set specification. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groupsStefano Brivio
The sixth byte of packet data has to be looked up in the sixth group, not in the seventh one, even if we load the bucket data into ymm6 (and not ymm5, for convenience of tracking stalls). Without this fix, matching on a MAC address as first field of a set, if 8-bit groups are selected (due to a small set size) would fail, that is, the given MAC address would never match. Reported-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Cc: <stable@vger.kernel.org> # 5.6.x Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-By: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08vrf: don't run conntrack on vrf with !dflt qdiscNicolas Dichtel
After the below patch, the conntrack attached to skb is set to "notrack" in the context of vrf device, for locally generated packets. But this is true only when the default qdisc is set to the vrf device. When changing the qdisc, notrack is not set anymore. In fact, there is a shortcut in the vrf driver, when the default qdisc is set, see commit dcdd43c41e60 ("net: vrf: performance improvements for IPv4") for more details. This patch ensures that the behavior is always the same, whatever the qdisc is. To demonstrate the difference, a new test is added in conntrack_vrf.sh. Fixes: 8c9c296adfae ("vrf: run conntrack only in context of lower/physdev for locally generated packets") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-07Merge tag 'perf-tools-fixes-for-v5.16-2021-12-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix SMT detection fast read path on sysfs. - Fix memory leaks when processing feature headers in perf.data files. - Fix 'Simple expression parser' 'perf test' on arch without CPU die topology info, such as s/390. - Fix building perf with BUILD_BPF_SKEL=1. - Fix 'perf bench' by reverting "perf bench: Fix two memory leaks detected with ASan". - Fix itrace space allowed for new attributes in 'perf script'. - Fix the build feature detection fast path, that was always failing on systems with python3 development packages, speeding up the build. - Reset shadow counts before loading, fixing metrics using duration_time. - Sync more kernel headers changed by the new futex_waitv syscall: s390 and powerpc. * tag 'perf-tools-fixes-for-v5.16-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf_skel: Do not use typedef to avoid error on old clang perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros perf header: Fix memory leaks when processing feature headers perf test: Reset shadow counts before loading perf test: Fix 'Simple expression parser' test on arch without CPU die topology info tools build: Remove needless libpython-version feature check that breaks test-all fast path perf tools: Fix SMT detection fast read path tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall perf inject: Fix itrace space allowed for new attributes tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall Revert "perf bench: Fix two memory leaks detected with ASan"
2021-12-07block: fix single bio async DIO error handlingPavel Begunkov
BUG: KASAN: use-after-free in io_submit_one+0x496/0x2fe0 fs/aio.c:1882 CPU: 2 PID: 15100 Comm: syz-executor873 Not tainted 5.16.0-rc1-syzk #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014 Call Trace: [...] refcount_dec_and_test include/linux/refcount.h:333 [inline] iocb_put fs/aio.c:1161 [inline] io_submit_one+0x496/0x2fe0 fs/aio.c:1882 __do_sys_io_submit fs/aio.c:1938 [inline] __se_sys_io_submit fs/aio.c:1908 [inline] __x64_sys_io_submit+0x1c7/0x4a0 fs/aio.c:1908 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae __blkdev_direct_IO_async() returns errors from bio_iov_iter_get_pages() directly, in which case upper layers won't be expecting ->ki_complete to be called by the block layer and will terminate the request. However, there is also bio_endio() leading to a second ->ki_complete and a double free. Fixes: 54a88eb838d37 ("block: add single bio async direct IO helper") Reported-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c9eb786f6cef041e159e6287de131bec0719ad5c.1638907997.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-07ice: fix adding different tunnelsMichal Swiatkowski
Adding filters with the same values inside for VXLAN and Geneve causes HW error, because it looks exactly the same. To choose between different type of tunnels new recipe is needed. Add storing tunnel types in creating recipes function and start checking it in finding function. Change getting open tunnels function to return port on correct tunnel type. This is needed to copy correct port to dummy packet. Block user from adding enc_dst_port via tc flower, because VXLAN and Geneve filters can be created only with destination port which was previously opened. Fixes: 8b032a55c1bd5 ("ice: low level support for tunnels") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: fix choosing UDP header typeMichal Swiatkowski
In tunnels packet there can be two UDP headers: - outer which for hw should be mark as ICE_UDP_OF - inner which for hw should be mark as ICE_UDP_ILOS or as ICE_TCP_IL if inner header is of TCP type In none tunnels packet header can be: - UDP, which for hw should be mark as ICE_UDP_ILOS - TCP, which for hw should be mark as ICE_TCP_IL Change incorrect ICE_UDP_OF for none tunnel packets to ICE_UDP_ILOS. ICE_UDP_OF is incorrect for none tunnel packets and setting it leads to error from hw while adding this kind of recipe. In summary, for tunnel outer port type should always be set to ICE_UDP_OF, for none tunnel outer and tunnel inner it should always be set to ICE_UDP_ILOS. Fixes: 9e300987d4a8 ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: ignore dropped packets during initJesse Brandeburg
If the hardware is constantly receiving unicast or broadcast packets during driver load, the device previously counted many GLV_RDPC (VSI dropped packets) events during init. This causes confusing dropped packet statistics during driver load. The dropped packets counter incrementing does stop once the driver finishes loading. Avoid this problem by baselining our statistics at the end of driver open instead of the end of probe. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: Fix problems with DSCP QoS implementationDave Ertman
The patch that implemented DSCP QoS implementation removed a bandwidth check that was used to check for a specific condition caused by some corner cases. This check should not of been removed. The same patch also added a check for when the DCBx state could be changed in relation to DSCP, but the check was erroneously added nested in a check for CEE mode, which made the check useless. Fix these problems by re-adding the bandwidth check and relocating the DSCP mode check earlier in the function that changes DCBx state in the driver. Fixes: 2a87bd73e50d ("ice: Add DSCP support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: rearm other interrupt cause register after enabling VFsPaul Greenwalt
The other interrupt cause register (OICR), global interrupt 0, is disabled when enabling VFs to prevent handling VFLR. If the OICR is not rearmed then the VF cannot communicate with the PF. Rearm the OICR after enabling VFs. Fixes: 916c7fdf5e93 ("ice: Separate VF VSI initialization/creation from reset flow") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: fix FDIR init missing when reset VFYahui Cao
When VF is being reset, ice_reset_vf() will be called and FDIR resource should be released and initialized again. Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF") Signed-off-by: Yahui Cao <yahui.cao@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07PCI: apple: Fix PERST# polarityMarc Zyngier
Now that PERST# is properly defined as active-low in the device tree, fix the driver to correctly drive the line independently of the implied polarity. Suggested-by: Pali Rohár <pali@kernel.org> Fixes: 1e33888fbe44 ("PCI: apple: Add initial hardware bring-up") Link: https://lore.kernel.org/r/20211123180636.80558-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
2021-12-07arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DTMarc Zyngier
As the name indicates, PERST# is active low. Fix the DT description to match the HW behaviour. Fixes: ff2a8d91d80c ("arm64: apple: Add PCIe node") Link: https://lore.kernel.org/r/20211123180636.80558-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Reviewed-by: Mark Kettenis <kettenis@openbsd.org> Acked-by: Arnd Bergmann <arnd@arndb.de>
2021-12-07clk: versatile: clk-icst: use after free on error pathDan Carpenter
This frees "name" and then tries to display in as part of the error message on the next line. Swap the order. Fixes: 1b2189f3aa50 ("clk: versatile: clk-icst: Ensure clock names are unique") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211117072604.GC5237@kili Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-12-07Merge branch 'mptcp-new-features-for-mptcp-sockets-and-netlink-pm'Jakub Kicinski
Mat Martineau says: ==================== mptcp: New features for MPTCP sockets and netlink PM This collection of patches adds MPTCP socket support for a few socket options, ioctls, and one ancillary data type (specifics for each are listed below). There's also a patch modifying the netlink MPTCP path manager API to allow setting the backup flag on a configured interface using the endpoint ID instead of the full IP address. Patches 1 & 2: TCP_INQ cmsg and selftests. Patches 2 & 3: SIOCINQ, OUTQ, and OUTQNSD ioctls and selftests. Patch 5: Change backup flag using endpoint ID. Patches 6 & 7: IP_TOS socket option and selftests. Patches 8-10: TCP_CORK and TCP_NODELAY socket options. Includes a tcp change to expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for use by MPTCP. ==================== Link: https://lore.kernel.org/r/20211203223541.69364-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: support TCP_CORK and TCP_NODELAYMaxim Galaganov
First, add cork and nodelay fields to the mptcp_sock structure so they can be used in sync_socket_options(), and fill them on setsockopt while holding the msk socket lock. Then, on setsockopt set proper tcp_sk(ssk)->nonagle values for subflows by calling __tcp_sock_set_cork() or __tcp_sock_set_nodelay() on the ssk while holding the ssk socket lock. tcp_push_pending_frames() will be invoked on the ssk if a cork was cleared or nodelay was set. Also set MPTCP_PUSH_PENDING bit by calling mptcp_check_and_set_pending(). This will lead to __mptcp_push_pending() being called inside mptcp_release_cb() with new tcp_sk(ssk)->nonagle. Also add getsockopt support for TCP_CORK and TCP_NODELAY. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: expose mptcp_check_and_set_pendingMaxim Galaganov
Expose the mptcp_check_and_set_pending() function for use inside MPTCP sockopt code. The next patch will call it when TCP_CORK is cleared or TCP_NODELAY is set on the MPTCP socket in order to push pending data from mptcp_release_cb(). Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07tcp: expose __tcp_sock_set_cork and __tcp_sock_set_nodelayMaxim Galaganov
Expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for use in MPTCP setsockopt code -- namely for syncing MPTCP socket options with subflows inside sync_socket_options() while already holding the subflow socket lock. Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: mptcp: check IP_TOS in/out are the sameFlorian Westphal
Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set right before. Also check that socklen_t == 0 and -1 input values match those of normal tcp sockets. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: getsockopt: add support for IP_TOSFlorian Westphal
earlier patch added IP_TOS setsockopt support, this allows to get the value set by earlier setsockopt. Extends mptcp_put_int_option to handle u8 input/output by adding required cast. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: allow changing the "backup" bit by endpoint idDavide Caratti
a non-zero 'id' is sufficient to identify MPTCP endpoints: allow changing the value of 'backup' bit by simply specifying the endpoint id. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/158 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: mptcp: add inq test caseFlorian Westphal
client & server use a unix socket connection to communicate outside of the mptcp connection. This allows the consumer to know in advance how many bytes have been (or will be) sent by the peer. This allows stricter checks on the bytecounts reported by TCP_INQ cmsg. Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctlsFlorian Westphal
Allows to query in-sequence data ready for read(), total bytes in write queue and total bytes in write queue that have not yet been sent. v2: remove unneeded READ_ONCE() (Paolo Abeni) v3: check for new data unconditionally in SIOCINQ ioctl (Mat Martineau) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: mptcp: add TCP_INQ supportFlorian Westphal
Do checks on the returned inq counter. Fail on: 1. Huge value (> 1 kbyte, test case files are 1 kb) 2. last hint larger than returned bytes when read was short 3. erronenous indication of EOF. 3) happens when a hint of X bytes reads X-1 on next call but next recvmsg returns more data (instead of EOF). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07mptcp: add TCP_INQ cmsg supportFlorian Westphal
Support the TCP_INQ setsockopt. This is a boolean that tells recvmsg path to include the remaining in-sequence bytes in the cmsg data. v2: do not use CB(skb)->offset, increment map_seq instead (Paolo Abeni) v3: adjust CB(skb)->map_seq when taking skb from ofo queue (Paolo Abeni) Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/224 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07vrf: use dev_replace_track() for better trackingEric Dumazet
vrf_rt6_release() and vrf_rtable_release() changes dst->dev Instead of dev_hold(ndev); dev_put(odev); We should use dev_replace_track(odev, ndev, &dst->dev_tracker, GFP_KERNEL); If we do not transfer dst->dev_tracker to the new device, we will get warnings from ref_tracker_dir_exit() when odev is finally dismantled. Fixes: 9038c320001d ("net: dst: add net device refcount tracking to dst_entry") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211207055603.1926372-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net/qla3xxx: fix an error code in ql_adapter_up()Dan Carpenter
The ql_wait_for_drvr_lock() fails and returns false, then this function should return an error code instead of returning success. The other problem is that the success path prints an error message netdev_err(ndev, "Releasing driver lock\n"); Delete that and re-order the code a little to make it more clear. Fixes: 5a4faa873782 ("[PATCH] qla3xxx NIC driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211207082416.GA16110@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge tag 'linux-can-fixes-for-5.16-20211207' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2021-12-07 The 1st patch is by Vincent Mailhol and fixes a use after free in the pch_can driver. Dan Carpenter fixes a use after free in the ems_pcmcia sja1000 driver. The remaining 7 patches target the m_can driver. Brian Silverman contributes a patch to disable and ignore the ELO interrupt, which is currently not handled in the driver and may lead to an interrupt storm. Vincent Mailhol's patch fixes a memory leak in the error path of the m_can_read_fifo() function. The remaining patches are contributed by Matthias Schiffer, first a iomap_read_fifo() and iomap_write_fifo() functions are fixed in the PCI glue driver, then the clock rate for the Intel Ekhart Lake platform is fixed, the last 3 patches add support for the custom bit timings on the Elkhart Lake platform. * tag 'linux-can-fixes-for-5.16-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: m_can: pci: use custom bit timings for Elkhart Lake can: m_can: make custom bittiming fields const Revert "can: m_can: remove support for custom bit timing" can: m_can: pci: fix incorrect reference clock rate can: m_can: pci: fix iomap_read_fifo() and iomap_write_fifo() can: m_can: m_can_read_fifo: fix memory leak in error branch can: m_can: Disable and ignore ELO interrupt can: sja1000: fix use after free in ems_pcmcia_add_card() can: pch_can: pch_can_rx_normal: fix use after free ==================== Link: https://lore.kernel.org/r/20211207102420.120131-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07iwlwifi: work around reverse dependency on MEIArnd Bergmann
If the iwlmei code is a loadable module, the main iwlwifi driver cannot be built-in: x86_64-linux-ld: drivers/net/wireless/intel/iwlwifi/pcie/trans.o: in function `iwl_pcie_prepare_card_hw': trans.c:(.text+0x4158): undefined reference to `iwl_mei_is_connected' Unfortunately, Kconfig enforces the opposite, forcing the MEI driver to not be built-in if iwlwifi is a module. To work around this, decouple iwlmei from iwlwifi and add the dependency in the other direction. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211207151447.3338818-1-arnd@kernel.org Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2021-12-07xfs: remove all COW fork extents when remounting readonlyDarrick J. Wong
As part of multiple customer escalations due to file data corruption after copy on write operations, I wrote some fstests that use fsstress to hammer on COW to shake things loose. Regrettably, I caught some filesystem shutdowns due to incorrect rmap operations with the following loop: mount <filesystem> # (0) fsstress <run only readonly ops> & # (1) while true; do fsstress <run all ops> mount -o remount,ro # (2) fsstress <run only readonly ops> mount -o remount,rw # (3) done When (2) happens, notice that (1) is still running. xfs_remount_ro will call xfs_blockgc_stop to walk the inode cache to free all the COW extents, but the blockgc mechanism races with (1)'s reader threads to take IOLOCKs and loses, which means that it doesn't clean them all out. Call such a file (A). When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which walks the ondisk refcount btree and frees any COW extent that it finds. This function does not check the inode cache, which means that incore COW forks of inode (A) is now inconsistent with the ondisk metadata. If one of those former COW extents are allocated and mapped into another file (B) and someone triggers a COW to the stale reservation in (A), A's dirty data will be written into (B) and once that's done, those blocks will be transferred to (A)'s data fork without bumping the refcount. The results are catastrophic -- file (B) and the refcount btree are now corrupt. Solve this race by forcing the xfs_blockgc_free_space to run synchronously, which causes xfs_icwalk to return to inodes that were skipped because the blockgc code couldn't take the IOLOCK. This is safe to do here because the VFS has already prohibited new writer threads. Fixes: 10ddf64e420f ("xfs: remove leftover CoW reservations when remounting ro") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
2021-12-07Merge tag 'platform-drivers-x86-v5.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Various bug-fixes and hardware-id additions" * tag 'platform-drivers-x86-v5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel: hid: add quirk to support Surface Go 3 platform/x86: amd-pmc: Fix s2idle failures on certain AMD laptops platform/x86: touchscreen_dmi: Add TrekStor SurfTab duo W1 touchscreen info platform/x86: lg-laptop: Recognize more models platform/x86: thinkpad_acpi: Add lid_logo_dot to the list of safe LEDs platform/x86: thinkpad_acpi: Restore missing hotkey_tablet_mode and hotkey_radio_sw sysfs-attr
2021-12-07iwlwifi: mvm: optionally suppress assert logJohannes Berg
Normally, when we hit an assert, we print out all the assert data. However, in certain tests, when we trigger it from debugfs intentionally, that can be useless and confusing. Allow writing the string "nolog\n" to the fw_nmi and fw_restart files suppressing the assert dump as well as - in the case of fw_restart - the iwlwifi 0000:00:00.0: FW error in SYNC CMD REPLY_ERROR message. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211204174546.75e29a2ab68d.Id3064feda2ce7a77c116c6d6e71ce5ff447c6e86@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2021-12-07iwlwifi: add new ax1650 killer deviceYaara Baruch
Add new Qu-Hr killer device id. Signed-off-by: Yaara Baruch <yaara.baruch@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211204174546.997c250b9edc.Id50730e3e342297432eed47cdf9678ee16cf6d17@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2021-12-07iwlwifi: fw: correctly detect HW-SMEM region subtypeJohannes Berg
This is part of the "device memory" type, but with the subtypes we can now detect it properly, rather than having to make assumptions on the ID. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211204174546.91d33aa9dd3d.Ifb48e21fbb92ea25360856b5cc2afbb9b485d6b3@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2021-12-07iwlwifi: implement reset flow for Bz devicesJohannes Berg
On Bz devices, UREG_DOORBELL_TO_ISR6_NMI_BIT no longer actually triggers an NMI. So instead of setting BIT(0) | BIT(1) for the reset flow, we need to just set BIT(1) and then force the NMI in the new way. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211204174546.6b56e7ee1773.I71cba66e17cc0daabc5ad7abd88763674b625c82@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2021-12-07iwlwifi: add new Qu-Hr deviceYaara Baruch
Add new Qu-Hr device ID. Signed-off-by: Yaara Baruch <yaara.baruch@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211204174546.c68af5f8d7ce.I37894e98080161c3bca6f33b99a5b8812166ee41@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com>