summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-08-19Merge tag 'net-5.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from bpf, wireless and mac80211 trees. Current release - regressions: - tipc: call tipc_wait_for_connect only when dlen is not 0 - mac80211: fix locking in ieee80211_restart_work() Current release - new code bugs: - bpf: add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() - ethernet: ice: fix perout start time rounding - wwan: iosm: prevent underflow in ipc_chnl_cfg_get() Previous releases - regressions: - bpf: clear zext_dst of dead insns - sch_cake: fix srchost/dsthost hashing mode - vrf: reset skb conntrack connection on VRF rcv - net/rds: dma_map_sg is entitled to merge entries Previous releases - always broken: - ethernet: bnxt: fix Tx path locking and races, add Rx path barriers" * tag 'net-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits) net: dpaa2-switch: disable the control interface on error path Revert "flow_offload: action should not be NULL when it is referenced" iavf: Fix ping is lost after untrusted VF had tried to change MAC i40e: Fix ATR queue selection r8152: fix the maximum number of PLA bp for RTL8153C r8152: fix writing USB_BP2_EN mptcp: full fully established support after ADD_ADDR mptcp: fix memory leak on address flush net/rds: dma_map_sg is entitled to merge entries net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU port net: asix: fix uninit value bugs ovs: clear skb->tstamp in forwarding path net: mdio-mux: Handle -EPROBE_DEFER correctly net: mdio-mux: Don't ignore memory allocation errors net: mdio-mux: Delete unnecessary devm_kfree net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse sch_cake: fix srchost/dsthost hashing mode ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error path net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32 mac80211: fix locking in ieee80211_restart_work() ...
2021-08-19Merge tag 'platform-drivers-x86-v5.14-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Enable SW_TABLET_MODE support for the TP200s - Enable WMI on two more Gigabyte motherboards * tag 'platform-drivers-x86-v5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: gigabyte-wmi: add support for B450M S2H V2 platform/x86: gigabyte-wmi: add support for X570 GAMING X platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200s platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a module option
2021-08-19net: dpaa2-switch: disable the control interface on error pathVladimir Oltean
Currently dpaa2_switch_takedown has a funny name and does not do the opposite of dpaa2_switch_init, which makes probing fail when we need to handle an -EPROBE_DEFER. A sketch of what dpaa2_switch_init does: dpsw_open dpaa2_switch_detect_features dpsw_reset for (i = 0; i < ethsw->sw_attr.num_ifs; i++) { dpsw_if_disable dpsw_if_set_stp dpsw_vlan_remove_if_untagged dpsw_if_set_tci dpsw_vlan_remove_if } dpsw_vlan_remove alloc_ordered_workqueue dpsw_fdb_remove dpaa2_switch_ctrl_if_setup When dpaa2_switch_takedown is called from the error path of dpaa2_switch_probe(), the control interface, enabled by dpaa2_switch_ctrl_if_setup from dpaa2_switch_init, remains enabled, because dpaa2_switch_takedown does not call dpaa2_switch_ctrl_if_teardown. Since dpaa2_switch_probe might fail due to EPROBE_DEFER of a PHY, this means that a second probe of the driver will happen with the control interface directly enabled. This will trigger a second error: [ 93.273528] fsl_dpaa2_switch dpsw.0: dpsw_ctrl_if_set_pools() failed [ 93.281966] fsl_dpaa2_switch dpsw.0: fsl_mc_driver_probe failed: -13 [ 93.288323] fsl_dpaa2_switch: probe of dpsw.0 failed with error -13 Which if we investigate the /dev/dpaa2_mc_console log, we find out is caused by: [E, ctrl_if_set_pools:2211, DPMNG] ctrl_if must be disabled So make dpaa2_switch_takedown do the opposite of dpaa2_switch_init (in reasonable limits, no reason to change STP state, re-add VLANs etc), and rename it to something more conventional, like dpaa2_switch_teardown. Fixes: 613c0a5810b7 ("staging: dpaa2-switch: enable the control interface") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20210819141755.1931423-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19Revert "flow_offload: action should not be NULL when it is referenced"Ido Schimmel
This reverts commit 9ea3e52c5bc8bb4a084938dc1e3160643438927a. Cited commit added a check to make sure 'action' is not NULL, but 'action' is already dereferenced before the check, when calling flow_offload_has_one_action(). Therefore, the check does not make any sense and results in a smatch warning: include/net/flow_offload.h:322 flow_action_mixed_hw_stats_check() warn: variable dereferenced before check 'action' (see line 319) Fix by reverting this commit. Cc: gushengxian <gushengxian@yulong.com> Fixes: 9ea3e52c5bc8 ("flow_offload: action should not be NULL when it is referenced") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20210819105842.1315705-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19Merge branch 'intel-wired-lan-driver-updates-2021-08-18'Jakub Kicinski
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-08-18 This series contains updates to i40e and iavf drivers. Arkadiusz fixes Flow Director not using the correct queue due to calling the wrong pick Tx function for i40e. Sylwester resolves traffic loss for iavf when it attempts to change its MAC address when it does not have permissions to do so. ==================== Link: https://lore.kernel.org/r/20210818174217.4138922-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19iavf: Fix ping is lost after untrusted VF had tried to change MACSylwester Dziedziuch
Make changes to MAC address dependent on the response of PF. Disallow changes to HW MAC address and MAC filter from untrusted VF, thanks to that ping is not lost if VF tries to change MAC. Add a new field in iavf_mac_filter, to indicate whether there was response from PF for given filter. Based on this field pass or discard the filter. If untrusted VF tried to change it's address, it's not changed. Still filter was changed, because of that ping couldn't go through. Fixes: c5c922b3e09b ("iavf: fix MAC address setting for VFs when filter is rejected") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <Gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19i40e: Fix ATR queue selectionArkadiusz Kubalewski
Without this patch, ATR does not work. Receive/transmit uses queue selection based on SW DCB hashing method. If traffic classes are not configured for PF, then use netdev_pick_tx function for selecting queue for packet transmission. Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx, which ensures that packet is transmitted/received from CPU that is running the application. Reproduction steps: 1. Load i40e driver 2. Map each MSI interrupt of i40e port for each CPU 3. Disable ntuple, enable ATR i.e.: ethtool -K $interface ntuple off ethtool --set-priv-flags $interface flow-director-atr 4. Run application that is generating traffic and is bound to a single CPU, i.e.: taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10 5. Observe behavior: Application's traffic should be restricted to the CPU provided in taskset. Fixes: 89ec1f0886c1 ("i40e: Fix queue-to-TC mapping on Tx") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf 2021-08-19 We've added 3 non-merge commits during the last 3 day(s) which contain a total of 3 files changed, 29 insertions(+), 6 deletions(-). The main changes are: 1) Fix to clear zext_dst for dead instructions which was causing invalid program rejections on JITs with bpf_jit_needs_zext such as s390x, from Ilya Leoshkevich. 2) Fix RCU splat in bpf_get_current_{ancestor_,}cgroup_id() helpers when they are invoked from sleepable programs, from Yonghong Song. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests, bpf: Test that dead ldx_w insns are accepted bpf: Clear zext_dst of dead insns bpf: Add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() helpers ==================== Link: https://lore.kernel.org/r/20210819144904.20069-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19Merge branch 'r8152-bp-settings'David S. Miller
Hayes Wang says: ==================== r8152: fix bp settings Fix the wrong bp settings of the firmware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19r8152: fix the maximum number of PLA bp for RTL8153CHayes Wang
The maximum PLA bp number of RTL8153C is 16, not 8. That is, the bp 0 ~ 15 are at 0xfc28 ~ 0xfc46, and the bp_en is at 0xfc48. Fixes: 195aae321c82 ("r8152: support new chips") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19r8152: fix writing USB_BP2_ENHayes Wang
The register of USB_BP2_EN is 16 bits, so we should use ocp_write_word(), not ocp_write_byte(). Fixes: 9370f2d05a2a ("support request_firmware for RTL8153") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19Merge branch 'mptcp-fixes'David S. Miller
Mat Martineau says: ==================== mptcp: Bug fixes Here are two bug fixes for the net tree: Patch 1 fixes a memory leak that could be encountered when clearing the list of advertised MPTCP addresses. Patch 2 fixes a protocol issue early in an MPTCP connection, to ensure both peers correctly understand that the full MPTCP connection handshake has completed even when the server side quickly sends an ADD_ADDR option. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19mptcp: full fully established support after ADD_ADDRMatthieu Baerts
If directly after an MP_CAPABLE 3WHS, the client receives an ADD_ADDR with HMAC from the server, it is enough to switch to a "fully established" mode because it has received more MPTCP options. It was then OK to enable the "fully_established" flag on the MPTCP socket. Still, best to check if the ADD_ADDR looks valid by looking if it contains an HMAC (no 'echo' bit). If an ADD_ADDR echo is received while we are not in "fully established" mode, it is strange and then we should not switch to this mode now. But that is not enough. On one hand, the path-manager has be notified the state has changed. On the other hand, the "fully_established" flag on the subflow socket should be turned on as well not to re-send the MP_CAPABLE 3rd ACK content with the next ACK. Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19mptcp: fix memory leak on address flushPaolo Abeni
The endpoint cleanup path is prone to a memory leak, as reported by syzkaller: BUG: memory leak unreferenced object 0xffff88810680ea00 (size 64): comm "syz-executor.6", pid 6191, jiffies 4295756280 (age 24.138s) hex dump (first 32 bytes): 58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de Xu}<...."....... 01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00 ................ backtrace: [<0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline] [<0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170 [<00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731 [<00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] [<00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792 [<0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504 [<00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803 [<000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340 [<0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929 [<00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline] [<00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724 [<000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403 [<000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457 [<00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486 [<00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae We should not require an allocation to cleanup stuff. Rework the code a bit so that the additional RCU work is no more needed. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net/rds: dma_map_sg is entitled to merge entriesGerd Rausch
Function "dma_map_sg" is entitled to merge adjacent entries and return a value smaller than what was passed as "nents". Subsequently "ib_map_mr_sg" needs to work with this value ("sg_dma_len") rather than the original "nents" parameter ("sg_len"). This old RDS bug was exposed and reliably causes kernel panics (using RDMA operations "rds-stress -D") on x86_64 starting with: commit c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Simply put: Linux 5.11 and later. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Link: https://lore.kernel.org/r/60efc69f-1f35-529d-a7ef-da0549cad143@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU portVladimir Oltean
Currently we are unable to ping a bridge on top of a felix switch which uses the ocelot-8021q tagger. The packets are dropped on the ingress of the user port and the 'drop_local' counter increments (the counter which denotes drops due to no valid destinations). Dumping the PGID tables, it becomes clear that the PGID_SRC of the user port is zero, so it has no valid destinations. But looking at the code, the cpu_fwd_mask (the bit mask of DSA tag_8021q ports) is clearly missing from the forwarding mask of ports that are under a bridge. So this has always been broken. Looking at the version history of the patch, in v7 https://patchwork.kernel.org/project/netdevbpf/patch/20210125220333.1004365-12-olteanv@gmail.com/ the code looked like this: /* Standalone ports forward only to DSA tag_8021q CPU ports */ unsigned long mask = cpu_fwd_mask; (...) } else if (ocelot->bridge_fwd_mask & BIT(port)) { mask |= ocelot->bridge_fwd_mask & ~BIT(port); while in v8 (the merged version) https://patchwork.kernel.org/project/netdevbpf/patch/20210129010009.3959398-12-olteanv@gmail.com/ it looked like this: unsigned long mask; (...) } else if (ocelot->bridge_fwd_mask & BIT(port)) { mask = ocelot->bridge_fwd_mask & ~BIT(port); So the breakage was introduced between v7 and v8 of the patch. Fixes: e21268efbe26 ("net: dsa: felix: perform switch setup for tag_8021q") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20210817160425.3702809-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18Merge tag 'for-5.14-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix for cross-rename, adding a missing check for directory and subvolume, this could lead to a crash" * tag 'for-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
2021-08-18Merge tag 'sound-5.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Only a few regression fixes and trivial device quirks" * tag 'sound-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/via: Apply runtime PM workaround for ASUS B23E ALSA: hda: Fix hang during shutdown due to link reset ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15 9510 laptop ALSA: oxfw: fix functioal regression for silence in Apogee Duet FireWire ALSA: hda - fix the 'Capture Switch' value change notifications
2021-08-18Merge tag 'cfi-v5.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull clang cfi fix from Kees Cook: - Use rcu_read_{un}lock_sched_notrace to avoid recursion (Elliot Berman) * tag 'cfi-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: cfi: Use rcu_read_{un}lock_sched_notrace
2021-08-18pipe: avoid unnecessary EPOLLET wakeups under normal loadsLinus Torvalds
I had forgotten just how sensitive hackbench is to extra pipe wakeups, and commit 3a34b13a88ca ("pipe: make pipe writes always wake up readers") ended up causing a quite noticeable regression on larger machines. Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's not clear that this matters in real life all that much, but as Mel points out, it's used often enough when comparing kernels and so the performance regression shows up like a sore thumb. It's easy enough to fix at least for the common cases where pipes are used purely for data transfer, and you never have any exciting poll usage at all. So set a special 'poll_usage' flag when there is polling activity, and make the ugly "EPOLLET has crazy legacy expectations" semantics explicit to only that case. I would love to limit it to just the broken EPOLLET case, but the pipe code can't see the difference between epoll and regular select/poll, so any non-read/write waiting will trigger the extra wakeup behavior. That is sufficient for at least the hackbench case. Apart from making the odd extra wakeup cases more explicitly about EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe write, not at the first write chunk. That is actually much saner semantics (as much as you can call any of the legacy edge-triggered expectations for EPOLLET "sane") since it means that you know the wakeup will happen once the write is done, rather than possibly in the middle of one. [ For stable people: I'm putting a "Fixes" tag on this, but I leave it up to you to decide whether you actually want to backport it or not. It likely has no impact outside of synthetic benchmarks - Linus ] Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/ Fixes: 3a34b13a88ca ("pipe: make pipe writes always wake up readers") Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Sandeep Patil <sspatil@android.com> Tested-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-18platform/x86: gigabyte-wmi: add support for B450M S2H V2Thomas Weißschuh
Reported as working here: https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-901207693 Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20210818164435.99821-1-linux@weissschuh.net Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-08-18net: asix: fix uninit value bugsPavel Skripkin
Syzbot reported uninit-value in asix_mdio_read(). The problem was in missing error handling. asix_read_cmd() should initialize passed stack variable smsr, but it can fail in some cases. Then while condidition checks possibly uninit smsr variable. Since smsr is uninitialized stack variable, driver can misbehave, because smsr will be random in case of asix_read_cmd() failure. Fix it by adding error handling and just continue the loop instead of checking uninit value. Added helper function for checking Host_En bit, since wrong loop was used in 4 functions and there is no need in copy-pasting code parts. Cc: Robert Foss <robert.foss@collabora.com> Fixes: d9fe64e51114 ("net: asix: Add in_pm parameter") Reported-by: syzbot+a631ec9e717fb0423053@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18ovs: clear skb->tstamp in forwarding pathkaixi.fan
fq qdisc requires tstamp to be cleared in the forwarding path. Now ovs doesn't clear skb->tstamp. We encountered a problem with linux version 5.4.56 and ovs version 2.14.1, and packets failed to dequeue from qdisc when fq qdisc was attached to ovs port. Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: kaixi.fan <fankaixi.li@bytedance.com> Signed-off-by: xiexiaohui <xiexiaohui.xxh@bytedance.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18Merge branch 'mdio-fixes'David S. Miller
Saravana Kannan says: ==================== Clean up and fix error handling in mdio_mux_init() This patch series was started due to -EPROBE_DEFER not being handled correctly in mdio_mux_init() and causing issues [1]. While at it, I also did some more error handling fixes and clean ups. The -EPROBE_DEFER fix is the last patch. Ideally, in the last patch we'd treat any error similar to -EPROBE_DEFER but I'm not sure if it'll break any board/platforms where some child mdiobus never successfully registers. If we treated all errors similar to -EPROBE_DEFER, then none of the child mdiobus will work and that might be a regression. If people are sure this is not a real case, then I can fix up the last patch to always fail the entire mdio-mux init if any of the child mdiobus registration fails. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net: mdio-mux: Handle -EPROBE_DEFER correctlySaravana Kannan
When registering mdiobus children, if we get an -EPROBE_DEFER, we shouldn't ignore it and continue registering the rest of the mdiobus children. This would permanently prevent the deferring child mdiobus from working instead of reattempting it in the future. So, if a child mdiobus needs to be reattempted in the future, defer the entire mdio-mux initialization. This fixes the issue where PHYs sitting under the mdio-mux aren't initialized correctly if the PHY's interrupt controller is not yet ready when the mdio-mux is being probed. Additional context in the link below. Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.") Link: https://lore.kernel.org/lkml/CAGETcx95kHrv8wA-O+-JtfH7H9biJEGJtijuPVN0V5dUKUAB3A@mail.gmail.com/#t Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net: mdio-mux: Don't ignore memory allocation errorsSaravana Kannan
If we are seeing memory allocation errors, don't try to continue registering child mdiobus devices. It's unlikely they'll succeed. Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose") Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net: mdio-mux: Delete unnecessary devm_kfreeSaravana Kannan
The whole point of devm_* APIs is that you don't have to undo them if you are returning an error that's going to get propagated out of a probe() function. So delete unnecessary devm_kfree() call in the error return path. Fixes: b60161668199 ("mdio: mux: Correct mdio_mux_init error path issues") Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, ↵Vladimir Oltean
or worse It seems that of_find_compatible_node has a weird calling convention in which it calls of_node_put() on the "from" node argument, instead of leaving that up to the caller. This comes from the fact that of_find_compatible_node with a non-NULL "from" argument it only supposed to be used as the iterator function of for_each_compatible_node(). OF iterator functions call of_node_get on the next OF node and of_node_put() on the previous one. When of_find_compatible_node calls of_node_put, it actually never expects the refcount to drop to zero, because the call is done under the atomic devtree_lock context, and when the refcount drops to zero it triggers a kobject and a sysfs file deletion, which assume blocking context. So any driver call to of_find_compatible_node is probably buggy because an unexpected of_node_put() takes place. What should be done is to use the of_get_compatible_child() function. Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX") Link: https://lore.kernel.org/netdev/20210814010139.kzryimmp4rizlznt@skbuf/ Suggested-by: Frank Rowand <frowand.list@gmail.com> Suggested-by: Rob Herring <robh+dt@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18sch_cake: fix srchost/dsthost hashing modeToke Høiland-Jørgensen
When adding support for using the skb->hash value as the flow hash in CAKE, I accidentally introduced a logic error that broke the host-only isolation modes of CAKE (srchost and dsthost keywords). Specifically, the flow_hash variable should stay initialised to 0 in cake_hash() in pure host-based hashing mode. Add a check for this before using the skb->hash value as flow_hash. Fixes: b0c19ed6088a ("sch_cake: Take advantage of skb->hash where appropriate") Reported-by: Pete Heist <pete@heistp.net> Tested-by: Pete Heist <pete@heistp.net> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18platform/x86: gigabyte-wmi: add support for X570 GAMING XThomas Weißschuh
Reported as working here: https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-900263115 Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20210817154628.84992-1-linux@weissschuh.net Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-08-17ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error pathWang Hai
In ixgbe_xsk_pool_enable(), if ixgbe_xsk_wakeup() fails, We should restore the previous state and clean up the resources. Add the missing clear af_xdp_zc_qps and unmap dma to fix this bug. Fixes: d49e286d354e ("ixgbe: add tracking of AF_XDP zero-copy state for each queue pair") Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20210817203736.3529939-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17Merge tag 'wireless-drivers-2021-08-17' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.14 First set of fixes for v5.14 and nothing major this time. New devices for iwlwifi and one fix for a compiler warning. iwlwifi * support for new devices mt76 * fix compiler warning about MT_CIPHER_NONE * tag 'wireless-drivers-2021-08-17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers: mt76: fix enum type mismatch iwlwifi: add new so-jf devices iwlwifi: add new SoF with JF devices iwlwifi: pnvm: accept multiple HW-type TLVs ==================== Link: https://lore.kernel.org/r/20210817171027.EC1E6C43460@smtp.codeaurora.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17Merge tag 'trace-v5.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Limit the shooting in the foot of tp_printk The "tp_printk" option redirects the trace event output to printk at boot up. This is useful when a machine crashes before boot where the trace events can not be retrieved by the in kernel ring buffer. But it can be "dangerous" because trace events can be located in high frequency locations such as interrupts and the scheduler, where a printk can slow it down that it live locks the machine (because by the time the printk finishes, the next event is triggered). Thus tp_printk must be used with care. It was discovered that the filter logic to trace events does not apply to the tp_printk events. This can cause a surprise and live lock when the user expects it to be filtered to limit the amount of events printed to the console when in fact it still prints everything" * tag 'trace-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Apply trace filters on all output channels
2021-08-17net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32Dinghao Liu
qlcnic_83xx_unlock_flash() is called on all paths after we call qlcnic_83xx_lock_flash(), except for one error path on failure of QLCRD32(), which may cause a deadlock. This bug is suggested by a static analysis tool, please advise. Fixes: 81d0aeb0a4fff ("qlcnic: flash template based firmware reset recovery") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20210816131405.24024-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17mac80211: fix locking in ieee80211_restart_work()Johannes Berg
Ilan's change to move locking around accidentally lost the wiphy_lock() during some porting, add it back. Fixes: 45daaa131841 ("mac80211: Properly WARN on HW scan before restart") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210817121210.47bdb177064f.Ib1ef79440cd27f318c028ddfc0c642406917f512@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17virtio-net: use NETIF_F_GRO_HW instead of NETIF_F_LROJason Wang
Commit a02e8964eaf92 ("virtio-net: ethtool configurable LRO") maps LRO to virtio guest offloading features and allows the administrator to enable and disable those features via ethtool. This leads to several issues: - For a device that doesn't support control guest offloads, the "LRO" can't be disabled triggering WARN in dev_disable_lro() when turning off LRO or when enabling forwarding bridging etc. - For a device that supports control guest offloads, the guest offloads are disabled in cases of bridging, forwarding etc slowing down the traffic. Fix this by using NETIF_F_GRO_HW instead. Though the spec does not guarantee packets to be re-segmented as the original ones, we can add that to the spec, possibly with a flag for devices to differentiate between GRO and LRO. Further, we never advertised LRO historically before a02e8964eaf92 ("virtio-net: ethtool configurable LRO") and so bridged/forwarded configs effectively always relied on virtio receive offloads behaving like GRO - thus even if this breaks any configs it is at least not a regression. Fixes: a02e8964eaf92 ("virtio-net: ethtool configurable LRO") Acked-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: Ivan <ivan@prestigetransportation.com> Tested-by: Ivan <ivan@prestigetransportation.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17ALSA: hda/via: Apply runtime PM workaround for ASUS B23ETakashi Iwai
ASUS B23E requires the same workaround like other machines with VT1802, otherwise it looses the codec power on a few nodes and the sound kept silence. Fixes: a0645daf1610 ("ALSA: HDA: Early Forbid of runtime PM") Link: https://lore.kernel.org/r/ac2232f142efcd67fe6ac38897f704f7176bd200.camel@gmail.com Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210817052432.14751-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-17ALSA: hda: Fix hang during shutdown due to link resetImre Deak
During system shutdown codecs may be still active, and resetting the controller->codec HW link in this state - based on the bug reporter's tests - leads to the shutdown sequence to get stuck. This happens at least on the reporter's KBL system with an ALC662 codec. For now fix the issue by skipping the link reset step. Fixes: 472e18f63c42 ("ALSA: hda: Release controller display power during shutdown/reboot") References: https://bugzilla.kernel.org/show_bug.cgi?id=214045 References: https://gitlab.freedesktop.org/drm/intel/-/issues/3618#note_1024665 Reported-and-tested-by: youling257@gmail.com Cc: youling257@gmail.com Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20210816174259.2759103-1-imre.deak@intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-16Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This contains a fix for a potential boot failure due to a missing Kconfig dependency for people upgrading with the DRBG enabled" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: drbg - select SHA512
2021-08-16vrf: Reset skb conntrack connection on VRF rcvLahav Schlesinger
To fix the "reverse-NAT" for replies. When a packet is sent over a VRF, the POST_ROUTING hooks are called twice: Once from the VRF interface, and once from the "actual" interface the packet will be sent from: 1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct() This causes the POST_ROUTING hooks to run. 2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again. Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and second vrf_l3_rcv() calls them again. As an example, consider the following SNAT rule: > iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1 In this case sending over a VRF will create 2 conntrack entries. The first is from the VRF interface, which performs the IP SNAT. The second will run the SNAT, but since the "expected reply" will remain the same, conntrack randomizes the source port of the packet: e..g With a socket bound to 1.1.1.1:10000, sending to 3.3.3.3:53, the conntrack rules are: udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1 udp 17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1 i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is SNAT-ed from 10000 --> 61033. But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later conntrack entry is matched: udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1 udp 17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1 And a "port 61033 unreachable" ICMP packet is sent back. The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(), the skb already has a conntrack flow attached to it, which means nf_conntrack_in() will not resolve the flow again. This means only the dest port is "reverse-NATed" (61033 -> 10000) but the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's not received. This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'. The fix is then to reset the flow when skb is received on a VRF, to let conntrack resolve the flow again (which now will hit the earlier flow). To reproduce: (Without the fix "Got pkt_to_nat_port" will not be printed by running 'bash ./repro'): $ cat run_in_A1.py import logging logging.getLogger("scapy.runtime").setLevel(logging.ERROR) from scapy.all import * import argparse def get_packet_to_send(udp_dst_port, msg_name): return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \ IP(src='3.3.3.3', dst='2.2.2.2')/ \ UDP(sport=53, dport=udp_dst_port)/ \ Raw(f'{msg_name}\x0012345678901234567890') parser = argparse.ArgumentParser() parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True, help="From run_in_A3.py") parser.add_argument('-socket_port', dest="socket_port", type=str, required=True, help="From run_in_A3.py") parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True, help="From script") args, _ = parser.parse_known_args() iface_mac = args.iface_mac socket_port = int(args.socket_port) v1_mac = args.v1_mac print(f'Source port before NAT: {socket_port}') while True: pkts = sniff(iface='_v0', store=True, count=1, timeout=10) if 0 == len(pkts): print('Something failed, rerun the script :(', flush=True) break pkt = pkts[0] if not pkt.haslayer('UDP'): continue pkt_sport = pkt.getlayer('UDP').sport print(f'Source port after NAT: {pkt_sport}', flush=True) pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port') sendp(pkt_to_send, '_v0', verbose=False) # Will not be received pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port') sendp(pkt_to_send, '_v0', verbose=False) break $ cat run_in_A2.py import socket import netifaces print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}", flush=True) s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, str('vrf_1' + '\0').encode('utf-8')) s.connect(('3.3.3.3', 53)) print(f'{s. getsockname()[1]}', flush=True) s.settimeout(5) while True: try: # Periodically send in order to keep the conntrack entry alive. s.send(b'a'*40) resp = s.recvfrom(1024) msg_name = resp[0].decode('utf-8').split('\0')[0] print(f"Got {msg_name}", flush=True) except Exception as e: pass $ cat repro.sh ip netns del A1 2> /dev/null ip netns del A2 2> /dev/null ip netns add A1 ip netns add A2 ip -n A1 link add _v0 type veth peer name _v1 netns A2 ip -n A1 link set _v0 up ip -n A2 link add e00000 type bond ip -n A2 link add lo0 type dummy ip -n A2 link add vrf_1 type vrf table 10001 ip -n A2 link set vrf_1 up ip -n A2 link set e00000 master vrf_1 ip -n A2 addr add 1.1.1.1/24 dev e00000 ip -n A2 link set e00000 up ip -n A2 link set _v1 master e00000 ip -n A2 link set _v1 up ip -n A2 link set lo0 up ip -n A2 addr add 2.2.2.2/32 dev lo0 ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000 ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001 ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \ SNAT --to-source 2.2.2.2 -o vrf_1 sleep 5 ip netns exec A2 python3 run_in_A2.py > x & XPID=$! sleep 5 IFACE_MAC=`sed -n 1p x` SOCKET_PORT=`sed -n 2p x` V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2'}` ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \ ${SOCKET_PORT} -v1_mac ${SOCKET_PORT} sleep 5 kill -9 $XPID wait $XPID 2> /dev/null ip netns del A1 ip netns del A2 tail x -n 2 rm x set +x Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device") Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-16Merge tag 'mtd/fixes-for-5.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Miquel Raynal: "MTD core fixes: - Fix lock hierarchy in deregister_mtd_blktrans - Handle flashes without OTP gracefully - Break circular locks in register_mtd_blktrans MTD device fixes: - mchp48l640: - Fix memory leak on cmd - Silence some uninitialized variable warnings - blkdevs: - Initialize rq.limits.discard_granularity CFI fixes: - Fix crash when erasing/writing AMD cards Raw NAND fixes: - Fix of_get_nand_secure_regions(): - Add a missing check - Avoid an unwanted probe failure when a DT property is missing" * tag 'mtd/fixes-for-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: Fix probe failure due to of_get_nand_secure_regions() mtd: fix lock hierarchy in deregister_mtd_blktrans mtd: devices: mchp48l640: Fix memory leak on cmd mtd: cfi_cmdset_0002: fix crash when erasing/writing AMD cards mtd: core: handle flashes without OTP gracefully mtd: mchp48l640: silence some uninitialized variable warnings mtd: break circular locks in register_mtd_blktrans mtd: rawnand: Add a check in of_get_nand_secure_regions() mtd: mtd_blkdevs: Initialize rq.limits.discard_granularity
2021-08-16Merge tag 'trace-v5.14-rc5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Fixes and clean ups to tracing: - Fix header alignment when PREEMPT_RT is enabled for osnoise tracer - Inject "stop" event to see where osnoise stopped the trace - Define DYNAMIC_FTRACE_WITH_ARGS as some code had an #ifdef for it - Fix erroneous message for bootconfig cmdline parameter - Fix crash caused by not found variable in histograms" * tag 'trace-v5.14-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name init: Suppress wrong warning for bootconfig cmdline parameter tracing: define needed config DYNAMIC_FTRACE_WITH_ARGS trace/osnoise: Print a stop tracing message trace/timerlat: Add a header with PREEMPT_RT additional fields trace/osnoise: Add a header with PREEMPT_RT additional fields
2021-08-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Two nested virtualization fixes for AMD processors" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656) KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
2021-08-16Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Fixes in virtio, vhost, and vdpa drivers" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/mlx5: Fix queue type selection logic vdpa/mlx5: Avoid destroying MR on empty iotlb tools/virtio: fix build virtio_ring: pull in spinlock header vringh: pull in spinlock header virtio-blk: Add validation for block size in config space vringh: Use wiov->used to check for read/write desc order virtio_vdpa: reject invalid vq indices vdpa: Add documentation for vdpa_alloc_device() macro vDPA/ifcvf: Fix return value check for vdpa_alloc_device() vp_vdpa: Fix return value check for vdpa_alloc_device() vdpa_sim: Fix return value check for vdpa_alloc_device() vhost: Fix the calculation in vhost_overflow() vhost-vdpa: Fix integer overflow in vhost_vdpa_process_iotlb_update() virtio_pci: Support surprise removal of virtio pci device virtio: Protect vqs list access virtio: Keep vring_del_virtqueue() mirror of VQ create virtio: Improve vq->broken access to avoid any compiler optimization
2021-08-16tracing: Apply trace filters on all output channelsPingfan Liu
The event filters are not applied on all of the output, which results in the flood of printk when using tp_printk. Unfolding event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so the filters can be applied on every output. Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com Cc: stable@vger.kernel.org Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()") Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-16KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656)Maxim Levitsky
If L1 disables VMLOAD/VMSAVE intercepts, and doesn't enable Virtual VMLOAD/VMSAVE (currently not supported for the nested hypervisor), then VMLOAD/VMSAVE must operate on the L1 physical memory, which is only possible by making L0 intercept these instructions. Failure to do so allowed the nested guest to run VMLOAD/VMSAVE unintercepted, and thus read/write portions of the host physical memory. Fixes: 89c8a4984fc9 ("KVM: SVM: Enable Virtual VMLOAD VMSAVE feature") Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-16KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)Maxim Levitsky
* Invert the mask of bits that we pick from L2 in nested_vmcb02_prepare_control * Invert and explicitly use VIRQ related bits bitmask in svm_clear_vintr This fixes a security issue that allowed a malicious L1 to run L2 with AVIC enabled, which allowed the L2 to exploit the uninitialized and enabled AVIC to read/write the host physical memory at some offsets. Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-16net: iosm: Prevent underflow in ipc_chnl_cfg_get()Dan Carpenter
The bounds check on "index" doesn't catch negative values. Using ARRAY_SIZE() directly is more readable and more robust because it prevents negative values for "index". Fortunately we only pass valid values to ipc_chnl_cfg_get() so this patch does not affect runtime. Reported-by: Solomon Ucko <solly.ucko@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16btrfs: prevent rename2 from exchanging a subvol with a directory from ↵NeilBrown
different parents Cross-rename lacks a check when that would prevent exchanging a directory and subvolume from different parent subvolume. This causes data inconsistencies and is caught before commit by tree-checker, turning the filesystem to read-only. Calling the renameat2 with RENAME_EXCHANGE flags like renameat2(AT_FDCWD, namesrc, AT_FDCWD, namedest, (1 << 1)) on two paths: namesrc = dir1/subvol1/dir2 namedest = subvol2/subvol3 will cause key order problem with following write time tree-checker report: [1194842.307890] BTRFS critical (device loop1): corrupt leaf: root=5 block=27574272 slot=10 ino=258, invalid previous key objectid, have 257 expect 258 [1194842.322221] BTRFS info (device loop1): leaf 27574272 gen 8 total ptrs 11 free space 15444 owner 5 [1194842.331562] BTRFS info (device loop1): refs 2 lock_owner 0 current 26561 [1194842.338772] item 0 key (256 1 0) itemoff 16123 itemsize 160 [1194842.338793] inode generation 3 size 16 mode 40755 [1194842.338801] item 1 key (256 12 256) itemoff 16111 itemsize 12 [1194842.338809] item 2 key (256 84 2248503653) itemoff 16077 itemsize 34 [1194842.338817] dir oid 258 type 2 [1194842.338823] item 3 key (256 84 2363071922) itemoff 16043 itemsize 34 [1194842.338830] dir oid 257 type 2 [1194842.338836] item 4 key (256 96 2) itemoff 16009 itemsize 34 [1194842.338843] item 5 key (256 96 3) itemoff 15975 itemsize 34 [1194842.338852] item 6 key (257 1 0) itemoff 15815 itemsize 160 [1194842.338863] inode generation 6 size 8 mode 40755 [1194842.338869] item 7 key (257 12 256) itemoff 15801 itemsize 14 [1194842.338876] item 8 key (257 84 2505409169) itemoff 15767 itemsize 34 [1194842.338883] dir oid 256 type 2 [1194842.338888] item 9 key (257 96 2) itemoff 15733 itemsize 34 [1194842.338895] item 10 key (258 12 256) itemoff 15719 itemsize 14 [1194842.339163] BTRFS error (device loop1): block=27574272 write time tree block corruption detected [1194842.339245] ------------[ cut here ]------------ [1194842.443422] WARNING: CPU: 6 PID: 26561 at fs/btrfs/disk-io.c:449 csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.511863] CPU: 6 PID: 26561 Comm: kworker/u17:2 Not tainted 5.14.0-rc3-git+ #793 [1194842.511870] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [1194842.511876] Workqueue: btrfs-worker-high btrfs_work_helper [btrfs] [1194842.511976] RIP: 0010:csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.512068] RSP: 0018:ffffa2c284d77da0 EFLAGS: 00010282 [1194842.512074] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff928867bd9978 [1194842.512078] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff928867bd9970 [1194842.512081] RBP: ffff92876b958000 R08: 0000000000000001 R09: 00000000000c0003 [1194842.512085] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [1194842.512088] R13: ffff92875f989f98 R14: 0000000000000000 R15: 0000000000000000 [1194842.512092] FS: 0000000000000000(0000) GS:ffff928867a00000(0000) knlGS:0000000000000000 [1194842.512095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1194842.512099] CR2: 000055f5384da1f0 CR3: 0000000102fe4000 CR4: 00000000000006e0 [1194842.512103] Call Trace: [1194842.512128] ? run_one_async_free+0x10/0x10 [btrfs] [1194842.631729] btree_csum_one_bio+0x1ac/0x1d0 [btrfs] [1194842.631837] run_one_async_start+0x18/0x30 [btrfs] [1194842.631938] btrfs_work_helper+0xd5/0x1d0 [btrfs] [1194842.647482] process_one_work+0x262/0x5e0 [1194842.647520] worker_thread+0x4c/0x320 [1194842.655935] ? process_one_work+0x5e0/0x5e0 [1194842.655946] kthread+0x135/0x160 [1194842.655953] ? set_kthread_struct+0x40/0x40 [1194842.655965] ret_from_fork+0x1f/0x30 [1194842.672465] irq event stamp: 1729 [1194842.672469] hardirqs last enabled at (1735): [<ffffffffbd1104f5>] console_trylock_spinning+0x185/0x1a0 [1194842.672477] hardirqs last disabled at (1740): [<ffffffffbd1104cc>] console_trylock_spinning+0x15c/0x1a0 [1194842.672482] softirqs last enabled at (1666): [<ffffffffbdc002e1>] __do_softirq+0x2e1/0x50a [1194842.672491] softirqs last disabled at (1651): [<ffffffffbd08aab7>] __irq_exit_rcu+0xa7/0xd0 The corrupted data will not be written, and filesystem can be unmounted and mounted again (all changes since the last commit will be lost). Add the missing check for new_ino so that all non-subvolumes must reside under the same parent subvolume. There's an exception allowing to exchange two subvolumes from any parents as the directory representing a subvolume is only a logical link and does not have any other structures related to the parent subvolume, unlike files, directories etc, that are always in the inode namespace of the parent subvolume. Fixes: cdd1fedf8261 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.7+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-16Merge branch 'bnxt_en-fixes'David S. Miller
Michael Chan says: ==================== bnxt_en: 2 bug fixes The first one disables aRFS/NTUPLE on an older broken firmware version. The second one adds missing memory barriers related to completion ring handling. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>