summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-15net/ism: Remove extra includeStefan Raspl
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/smc: Introduce explicit check for v2 supportStefan Raspl
Previously, v2 support was derived from a very specific format of the SEID as part of the SMC-D codebase. Make this part of the SMC-D device API, so implementers do not need to adhere to a specific SEID format. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Reviewed-and-tested-by: Jan Karcher <jaka@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15Merge branch 'net-smc-fixes'David S. Miller
Wenjia Zhang says: ==================== net/smc: Fixes 2023-03-01 The 1st patch solves the problem that CLC message initialization was not properly reversed in error handling path. And the 2nd one fixes the possible deadlock triggered by cancel_delayed_work_sync(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/smc: Fix device de-init sequenceStefan Raspl
CLC message initialization was not properly reversed in error handling path. Reported-and-suggested-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/smc: fix deadlock triggered by cancel_delayed_work_syn()Wenjia Zhang
The following LOCKDEP was detected: Workqueue: events smc_lgr_free_work [smc] WARNING: possible circular locking dependency detected 6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted ------------------------------------------------------ kworker/3:0/176251 is trying to acquire lock: 00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}, at: __flush_workqueue+0x7a/0x4f0 but task is already holding lock: 0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_work+0x76/0xf0 __cancel_work_timer+0x170/0x220 __smc_lgr_terminate.part.0+0x34/0x1c0 [smc] smc_connect_rdma+0x15e/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> #3 (smc_client_lgr_pending){+.+.}-{3:3}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __mutex_lock+0x96/0x8e8 mutex_lock_nested+0x32/0x40 smc_connect_rdma+0xa4/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> #2 (sk_lock-AF_SMC){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 lock_sock_nested+0x46/0xa8 smc_tx_work+0x34/0x50 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 process_one_work+0x2bc/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}: check_prev_add+0xd8/0xe88 validate_chain+0x70c/0xb20 __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_workqueue+0xaa/0x4f0 drain_workqueue+0xaa/0x158 destroy_workqueue+0x44/0x2d8 smc_lgr_free+0x9e/0xf8 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 other info that might help us debug this: Chain exists of: (wq_completion)smc_tx_wq-00000000#2 --> smc_client_lgr_pending --> (work_completion)(&(&lgr->free_work)->work) Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((work_completion)(&(&lgr->free_work)->work)); lock(smc_client_lgr_pending); lock((work_completion) (&(&lgr->free_work)->work)); lock((wq_completion)smc_tx_wq-00000000#2); *** DEADLOCK *** 2 locks held by kworker/3:0/176251: #0: 0000000080183548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x232/0x730 #1: 0000037fffe97dc8 ((work_completion) (&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 stack backtrace: CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted Hardware name: IBM 8561 T01 701 (z/VM 7.2.0) Call Trace: [<000000002983c3e4>] dump_stack_lvl+0xac/0x100 [<0000000028b477ae>] check_noncircular+0x13e/0x160 [<0000000028b48808>] check_prev_add+0xd8/0xe88 [<0000000028b49cc4>] validate_chain+0x70c/0xb20 [<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8 [<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248 [<0000000028b4d17c>] lock_acquire+0xac/0x1c8 [<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0 [<0000000028addf9a>] drain_workqueue+0xaa/0x158 [<0000000028ae303c>] destroy_workqueue+0x44/0x2d8 [<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc] [<0000000028adf3d4>] process_one_work+0x30c/0x730 [<0000000028adf85a>] worker_thread+0x62/0x420 [<0000000028aeac50>] kthread+0x138/0x150 [<0000000028a63914>] __ret_from_fork+0x3c/0x58 [<00000000298503da>] ret_from_fork+0xa/0x40 INFO: lockdep is turned off. =================================================================== This deadlock occurs because cancel_delayed_work_sync() waits for the work(&lgr->free_work) to finish, while the &lgr->free_work waits for the work(lgr->tx_wq), which needs the sk_lock-AF_SMC, that is already used under the mutex_lock. The solution is to use cancel_delayed_work() instead, which kills off a pending work. Fixes: a52bcc919b14 ("net/smc: improve termination processing") Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Jan Karcher <jaka@linux.ibm.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: geneve: set IFF_POINTOPOINT with IFLA_GENEVE_INNER_PROTO_INHERITJosef Miegl
The GENEVE tunnel used with IFLA_GENEVE_INNER_PROTO_INHERIT is point-to-point, so set IFF_POINTOPOINT to reflect that. Signed-off-by: Josef Miegl <josef@miegl.cz> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: ca8210: drop owner from driverKrzysztof Kozlowski
Core already sets owner in spi_driver. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: adf7242: drop owner from driverKrzysztof Kozlowski
Core already sets owner in spi_driver. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: ca8210: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/ieee802154/ca8210.c:3174:34: error: ‘ca8210_of_ids’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: at86rf230: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/ieee802154/at86rf230.c:1644:34: error: ‘at86rf230_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: mcr20a: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/ieee802154/mcr20a.c:1340:34: error: ‘mcr20a_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ieee802154: adf7242: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/ieee802154/adf7242.c:1322:34: error: ‘adf7242_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: phy: ks8995: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/phy/spi_ks8995.c:156:34: error: ‘ks8895_spi_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: ocelot: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/dsa/ocelot/ocelot_ext.c:143:34: error: ‘ocelot_ext_switch_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: ksz9477: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/dsa/microchip/ksz9477_i2c.c:84:34: error: ‘ksz9477_dt_ids’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: seville_vsc9953: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/dsa/ocelot/seville_vsc9953.c:1070:34: error: ‘seville_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: lan9303: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly or only by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). drivers/net/dsa/lan9303_i2c.c:97:34: error: ‘lan9303_i2c_of_match’ defined but not used [-Werror=unused-const-variable=] drivers/net/dsa/lan9303_mdio.c:157:34: error: ‘lan9303_mdio_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: lantiq_gswip: mark OF related data as maybe unusedKrzysztof Kozlowski
The driver can be compile tested with !CONFIG_OF making certain data unused: drivers/net/dsa/lantiq_gswip.c:1888:34: error: ‘xway_gphy_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15nfc: trf7970a: mark OF related data as maybe unusedKrzysztof Kozlowski
The driver can be compile tested with !CONFIG_OF making certain data unused: drivers/nfc/trf7970a.c:2232:34: error: ‘trf7970a_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: ni: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it is not relevant here). drivers/net/ethernet/ni/nixge.c:1253:34: error: ‘nixge_dt_ids’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: samsung: sxgbe: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it is not relevant here). drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c:220:34: error: ‘sxgbe_dt_ids’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: marvell: pxa168_eth: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it is not relevant here). drivers/net/ethernet/marvell/pxa168_eth.c:1575:34: error: ‘pxa168_eth_of_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: stmmac: generic: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it is not relevant here). drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c:72:34: error: ‘dwmac_generic_match’ defined but not used [-Werror=unused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: stmmac: qcom: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver is specific to ARCH_QCOM which depends on OF thus the driver is OF-only. Its of_device_id table is built unconditionally, thus of_match_ptr() for ID table does not make sense. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15Merge branch 'dsa-microchip-tc-ets'David S. Miller
Oleksij Rempel says: ==================== net: dsa: microchip: tc-ets support changes v3: - add tc_ets_supported to match supported devices - dynamically regenerated default TC to queue map. - add Acked-by to the first patch changes v2: - run egress limit configuration on all queue separately. Otherwise configuration may not apply correctly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: microchip: add ETS Qdisc support for KSZ9477 seriesOleksij Rempel
Add ETS Qdisc support for KSZ9477 of switches. Current implementation is limited to strict priority mode. Tested on KSZ8563R with following configuration: tc qdisc replace dev lan2 root handle 1: ets strict 4 \ priomap 3 3 2 2 1 1 0 0 ip link add link lan2 name v1 type vlan id 1 \ egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 and patched iperf3 version: https://github.com/esnet/iperf/pull/1476 iperf3 -c 172.17.0.1 -b100M -l1472 -t100 -u -R --sock-prio 2 Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net: dsa: microchip: add ksz_setup_tc_mode() functionOleksij Rempel
Add ksz_setup_tc_mode() to make queue scheduling and shaping configuration more visible. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15mlxsw: spectrum: Fix incorrect parsing depth after reloadIdo Schimmel
Spectrum ASICs have a configurable limit on how deep into the packet they parse. By default, the limit is 96 bytes. There are several cases where this parsing depth is not enough and there is a need to increase it. For example, timestamping of PTP packets and a FIB multipath hash policy that requires hashing on inner fields. The driver therefore maintains a reference count that reflects the number of consumers that require an increased parsing depth. During reload_down() the parsing depth reference count does not necessarily drop to zero, but the parsing depth itself is restored to the default during reload_up() when the firmware is reset. It is therefore possible to end up in situations where the driver thinks that the parsing depth was increased (reference count is non-zero), when it is not. Fix by making sure that all the consumers that increase the parsing depth reference count also decrease it during reload_down(). Specifically, make sure that when the routing code is de-initialized it drops the reference count if it was increased because of a FIB multipath hash policy that requires hashing on inner fields. Add a warning if the reference count is not zero after the driver was de-initialized and explicitly reset it to zero during initialization for good measures. Fixes: 2d91f0803b84 ("mlxsw: spectrum: Add infrastructure for parsing configuration") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/9c35e1b3e6c1d8f319a2449d14e2b86373f3b3ba.1678727526.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15veth: rely on rtnl_dereference() instead of on rcu_dereference() in ↵Lorenzo Bianconi
veth_set_xdp_features() Fix the following kernel warning in veth_set_xdp_features routine relying on rtnl_dereference() instead of on rcu_dereference(): ============================= WARNING: suspicious RCU usage 6.3.0-rc1-00144-g064d70527aaa #149 Not tainted ----------------------------- drivers/net/veth.c:1265 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/135: (net/core/rtnetlink.c:6172) stack backtrace: CPU: 1 PID: 135 Comm: ip Not tainted 6.3.0-rc1-00144-g064d70527aaa #149 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107) lockdep_rcu_suspicious (include/linux/context_tracking.h:152) veth_set_xdp_features (drivers/net/veth.c:1265 (discriminator 9)) veth_newlink (drivers/net/veth.c:1892) ? veth_set_features (drivers/net/veth.c:1774) ? kasan_save_stack (mm/kasan/common.c:47) ? kasan_save_stack (mm/kasan/common.c:46) ? kasan_set_track (mm/kasan/common.c:52) ? alloc_netdev_mqs (include/linux/slab.h:737) ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) ? trace_kmalloc (include/trace/events/kmem.h:54) ? __xdp_rxq_info_reg (net/core/xdp.c:188) ? alloc_netdev_mqs (net/core/dev.c:10657) ? rtnl_create_link (net/core/rtnetlink.c:3312) rtnl_newlink_create (net/core/rtnetlink.c:3440) ? rtnl_link_get_net_capable.constprop.0 (net/core/rtnetlink.c:3391) __rtnl_newlink (net/core/rtnetlink.c:3657) ? lock_downgrade (kernel/locking/lockdep.c:5321) ? rtnl_link_unregister (net/core/rtnetlink.c:3487) rtnl_newlink (net/core/rtnetlink.c:3671) rtnetlink_rcv_msg (net/core/rtnetlink.c:6174) ? rtnl_link_fill (net/core/rtnetlink.c:6070) ? mark_usage (kernel/locking/lockdep.c:4914) ? mark_usage (kernel/locking/lockdep.c:4914) netlink_rcv_skb (net/netlink/af_netlink.c:2574) ? rtnl_link_fill (net/core/rtnetlink.c:6070) ? netlink_ack (net/netlink/af_netlink.c:2551) ? lock_acquire (kernel/locking/lockdep.c:467) ? net_generic (include/linux/rcupdate.h:805) ? netlink_deliver_tap (include/linux/rcupdate.h:805) netlink_unicast (net/netlink/af_netlink.c:1340) ? netlink_attachskb (net/netlink/af_netlink.c:1350) netlink_sendmsg (net/netlink/af_netlink.c:1942) ? netlink_unicast (net/netlink/af_netlink.c:1861) ? netlink_unicast (net/netlink/af_netlink.c:1861) sock_sendmsg (net/socket.c:727) ____sys_sendmsg (net/socket.c:2501) ? kernel_sendmsg (net/socket.c:2448) ? __copy_msghdr (net/socket.c:2428) ___sys_sendmsg (net/socket.c:2557) ? mark_usage (kernel/locking/lockdep.c:4914) ? do_recvmmsg (net/socket.c:2544) ? lock_acquire (kernel/locking/lockdep.c:467) ? find_held_lock (kernel/locking/lockdep.c:5159) ? __lock_release (kernel/locking/lockdep.c:5345) ? __might_fault (mm/memory.c:5625) ? lock_downgrade (kernel/locking/lockdep.c:5321) ? __fget_light (include/linux/atomic/atomic-arch-fallback.h:227) __sys_sendmsg (include/linux/file.h:31) ? __sys_sendmsg_sock (net/socket.c:2572) ? rseq_get_rseq_cs (kernel/rseq.c:275) ? lockdep_hardirqs_on_prepare.part.0 (kernel/locking/lockdep.c:4263) do_syscall_64 (arch/x86/entry/common.c:50) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7f0d1aadeb17 Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/netdev/cover.1678364612.git.lorenzo@kernel.org/T/#me4c9d8e985ec7ebee981cfdb5bc5ec651ef4035d Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reported-by: syzbot+c3d0d9c42d59ff644ea6@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/dfd6a9a7d85e9113063165e1f47b466b90ad7b8a.1678748579.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15Merge branch 'ipv6-optimize-rt6_score_route'Jakub Kicinski
Eric Dumazet says: ==================== ipv6: optimize rt6_score_route() This patch series remove an expensive rwlock acquisition in rt6_check_neigh()/rt6_score_route(). First patch adds missing annotations, and second patch implements the optimization. ==================== Link: https://lore.kernel.org/r/20230313201732.887488-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15ipv6: remove one read_lock()/read_unlock() pair in rt6_check_neigh()Eric Dumazet
rt6_check_neigh() uses read_lock() to protect n->nud_state reading. This seems overkill and causes false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15neighbour: annotate lockless accesses to n->nud_stateEric Dumazet
We have many lockless accesses to n->nud_state. Before adding another one in the following patch, add annotations to readers and writers. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15net: lan966x: Change lan966x_police_del return typeHoratiu Vultur
As the function always returns 0 change the return type to be void instead of int. In this way also remove a wrong message in case of error which would never happen. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230312195155.1492881-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15nfc: st-nci: Fix use after free bug in ndlc_remove due to race conditionZheng Wang
This bug influences both st_nci_i2c_remove and st_nci_spi_remove. Take st_nci_i2c_remove as an example. In st_nci_i2c_probe, it called ndlc_probe and bound &ndlc->sm_work with llt_ndlc_sm_work. When it calls ndlc_recv or timeout handler, it will finally call schedule_work to start the work. When we call st_nci_i2c_remove to remove the driver, there may be a sequence as follows: Fix it by finishing the work before cleanup in ndlc_remove CPU0 CPU1 |llt_ndlc_sm_work st_nci_i2c_remove | ndlc_remove | st_nci_remove | nci_free_device| kfree(ndev) | //free ndlc->ndev | |llt_ndlc_rcv_queue |nci_recv_frame |//use ndlc->ndev Fixes: 35630df68d60 ("NFC: st21nfcb: Add driver for STMicroelectronics ST21NFCB NFC chip") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230312160837.2040857-1-zyytlz.wz@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15Merge branch 'tcp-fix-bind-regression-for-dual-stack-wildcard-address'Jakub Kicinski
Kuniyuki Iwashima says: ==================== tcp: Fix bind() regression for dual-stack wildcard address. The first patch fixes the regression reported in [0], and the second patch adds a test for similar cases to catch future regression. [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/ ==================== Link: https://lore.kernel.org/r/20230312031904.4674-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15selftest: Add test for bind() conflicts.Kuniyuki Iwashima
The test checks if (IPv4, IPv6) address pair properly conflict or not. * IPv4 * 0.0.0.0 * 127.0.0.1 * IPv6 * :: * ::1 If the IPv6 address is [::], the second bind() always fails. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15tcp: Fix bind() conflict check for dual-stack wildcard address.Kuniyuki Iwashima
Paul Holzinger reported [0] that commit 5456262d2baa ("net: Fix incorrect address comparison when searching for a bind2 bucket") introduced a bind() regression. Paul also gave a nice repro that calls two types of bind() on the same port, both of which now succeed, but the second call should fail: bind(fd1, ::, port) + bind(fd2, 127.0.0.1, port) The cited commit added address family tests in three functions to fix the uninit-value KMSAN report. [1] However, the test added to inet_bind2_bucket_match_addr_any() removed a necessary conflict check; the dual-stack wildcard address no longer conflicts with an IPv4 non-wildcard address. If tb->family is AF_INET6 and sk->sk_family is AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check if tb has the dual-stack wildcard address. Note that the IPv4 wildcard address does not conflict with IPv6 non-wildcard addresses. [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/ [1]: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/ Fixes: 5456262d2baa ("net: Fix incorrect address comparison when searching for a bind2 bucket") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reported-by: Paul Holzinger <pholzing@redhat.com> Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/ Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Paul Holzinger <pholzing@redhat.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status failsHeiner Kallweit
If genphy_read_status fails then further access to the PHY may result in unpredictable behavior. To prevent this bail out immediately if genphy_read_status fails. Fixes: 4223dbffed9f ("net: phy: smsc: Re-enable EDPD mode for LAN87xx") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/026aa4f2-36f5-1c10-ab9f-cdb17dda6ac4@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15net: tunnels: annotate lockless accesses to dev->needed_headroomEric Dumazet
IP tunnels can apparently update dev->needed_headroom in their xmit path. This patch takes care of three tunnels xmit, and also the core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA() helpers. More changes might be needed for completeness. BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1: ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0: ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline] ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227 dst_output include/net/dst.h:444 [inline] NF_HOOK include/linux/netfilter.h:302 [inline] mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820 mld_send_cr net/ipv6/mcast.c:2121 [inline] mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653 process_one_work+0x3e6/0x750 kernel/workqueue.c:2390 worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537 kthread+0x1ac/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 value changed: 0x0dd4 -> 0x0e14 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5fa354-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 Workqueue: mld mld_ifc_work Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15ice: avoid bonding causing auxiliary plug/unplug under RTNL lockDave Ertman
RDMA is not supported in ice on a PF that has been added to a bonded interface. To enforce this, when an interface enters a bond, we unplug the auxiliary device that supports RDMA functionality. This unplug currently happens in the context of handling the netdev bonding event. This event is sent to the ice driver under RTNL context. This is causing a deadlock where the RDMA driver is waiting for the RTNL lock to complete the removal. Defer the unplugging/re-plugging of the auxiliary device to the service task so that it is not performed under the RTNL lock context. Cc: stable@vger.kernel.org # 6.1.x Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Link: https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/ Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave") Fixes: 4eace75e0853 ("RDMA/irdma: Report the correct link speed") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230310194833.3074601-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-14cifs: use DFS root session instead of tcon sesPaulo Alcantara
Use DFS root session whenever possible to get new DFS referrals otherwise we might end up with an IPC tcon (tcon->ses->tcon_ipc) that doesn't respond to them. It should be safe accessing @ses->dfs_root_ses directly in cifs_inval_name_dfs_link_error() as it has same lifetime as of @tcon. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-14cifs: return DFS root session id in DebugDataPaulo Alcantara
Return the DFS root session id in /proc/fs/cifs/DebugData to make it easier to track which IPC tcon was used to get new DFS referrals for a specific connection, and aids in debugging. A simple output of it would be Sessions: 1) Address: 192.168.1.13 Uses: 1 Capability: 0x300067 Session Status: 1 Security type: RawNTLMSSP SessionId: 0xd80000000009 User: 0 Cred User: 0 DFS root session id: 0x128006c000035 Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-14sched_getaffinity: don't assume 'cpumask_size()' is fully initializedLinus Torvalds
The getaffinity() system call uses 'cpumask_size()' to decide how big the CPU mask is - so far so good. It is indeed the allocation size of a cpumask. But the code also assumes that the whole allocation is initialized without actually doing so itself. That's wrong, because we might have fixed-size allocations (making copying and clearing more efficient), but not all of it is then necessarily used if 'nr_cpu_ids' is smaller. Having checked other users of 'cpumask_size()', they all seem to be ok, either using it purely for the allocation size, or explicitly zeroing the cpumask before using the size in bytes to copy it. See for example the ublk_ctrl_get_queue_affinity() function that uses the proper 'zalloc_cpumask_var()' to make sure that the whole mask is cleared, whether the storage is on the stack or if it was an external allocation. Fix this by just zeroing the allocation before using it. Do the same for the compat version of sched_getaffinity(), which had the same logic. Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to access the bits. For a cpumask_var_t, it ends up being a pointer to the same data either way, but it's just a good idea to treat it like you would a 'cpumask_t'. The compat case already did that. Reported-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/ Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-14RISC-V: mm: Support huge page in vmalloc_fault()Dylan Jhong
Since RISC-V supports ioremap() with huge page (pud/pmd) mapping, However, vmalloc_fault() assumes that the vmalloc range is limited to pte mappings. To complete the vmalloc_fault() function by adding huge page support. Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT") Cc: stable@vger.kernel.org Signed-off-by: Dylan Jhong <dylan@andestech.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230310075021.3919290-1-dylan@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14cifs: fix use-after-free bug in refresh_cache_worker()Paulo Alcantara
The UAF bug occurred because we were putting DFS root sessions in cifs_umount() while DFS cache refresher was being executed. Make DFS root sessions have same lifetime as DFS tcons so we can avoid the use-after-free bug is DFS cache refresher and other places that require IPCs to get new DFS referrals on. Also, get rid of mount group handling in DFS cache as we no longer need it. This fixes below use-after-free bug catched by KASAN [ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56 [ 379.948096] [ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku #23 [ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014 [ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs] [ 379.949942] Call Trace: [ 379.950113] <TASK> [ 379.950260] dump_stack_lvl+0x50/0x67 [ 379.950510] print_report+0x16a/0x48e [ 379.950759] ? __virt_addr_valid+0xd8/0x160 [ 379.951040] ? __phys_addr+0x41/0x80 [ 379.951285] kasan_report+0xdb/0x110 [ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs] [ 379.953637] ? __pfx___mutex_lock+0x10/0x10 [ 379.953915] ? lock_release+0xb6/0x720 [ 379.954167] ? __pfx_lock_acquire+0x10/0x10 [ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs] [ 379.954960] ? __pfx_wb_workfn+0x10/0x10 [ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs] [ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs] [ 379.956323] ? __pfx_lock_acquired+0x10/0x10 [ 379.956615] ? read_word_at_a_time+0xe/0x20 [ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220 [ 379.957235] process_one_work+0x535/0x990 [ 379.957509] ? __pfx_process_one_work+0x10/0x10 [ 379.957812] ? lock_acquired+0xb7/0x5f0 [ 379.958069] ? __list_add_valid+0x37/0xd0 [ 379.958341] ? __list_add_valid+0x37/0xd0 [ 379.958611] worker_thread+0x8e/0x630 [ 379.958861] ? __pfx_worker_thread+0x10/0x10 [ 379.959148] kthread+0x17d/0x1b0 [ 379.959369] ? __pfx_kthread+0x10/0x10 [ 379.959630] ret_from_fork+0x2c/0x50 [ 379.959879] </TASK> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-14cifs: set DFS root session in cifs_get_smb_ses()Paulo Alcantara
Set the DFS root session pointer earlier when creating a new SMB session to prevent racing with smb2_reconnect(), cifs_reconnect_tcon() and DFS cache refresher. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-14blk-mq: fix "bad unlock balance detected" on q->srcu in ↵Chris Leech
__blk_mq_run_dispatch_ops The 'q' parameter of the macro __blk_mq_run_dispatch_ops may not be one local variable, such as, it is rq->q, then request queue pointed by this variable could be changed to another queue in case of BLK_MQ_F_TAG_QUEUE_SHARED after 'dispatch_ops' returns, then 'bad unlock balance' is triggered. Fixes the issue by adding one local variable for doing srcu lock/unlock. Fixes: 2a904d00855f ("blk-mq: remove hctx_lock and hctx_unlock") Cc: Marco Patalano <mpatalan@redhat.com> Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230310010913.1014789-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-14loop: Fix use-after-free issuesBart Van Assche
do_req_filebacked() calls blk_mq_complete_request() synchronously or asynchronously when using asynchronous I/O unless memory allocation fails. Hence, modify loop_handle_cmd() such that it does not dereference 'cmd' nor 'rq' after do_req_filebacked() finished unless we are sure that the request has not yet been completed. This patch fixes the following kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000054 Call trace: css_put.42938+0x1c/0x1ac loop_process_work+0xc8c/0xfd4 loop_rootcg_workfn+0x24/0x34 process_one_work+0x244/0x558 worker_thread+0x400/0x8fc kthread+0x16c/0x1e0 ret_from_fork+0x10/0x20 Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Schatzberg <schatzberg.dan@gmail.com> Fixes: c74d40e8b5e2 ("loop: charge i/o to mem and blk cg") Fixes: bc07c10a3603 ("block: loop: support DIO & AIO") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230314182155.80625-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-14Merge tag 'mm-hotfixes-stable-2023-03-14-16-51' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Eleven hotfixes. Four of these are cc:stable and the remainder address post-6.2 issues or aren't considered suitable for backporting. Seven of these fixes are for MM" * tag 'mm-hotfixes-stable-2023-03-14-16-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/damon/paddr: fix folio_nr_pages() after folio_put() in damon_pa_mark_accessed_or_deactivate() mm/damon/paddr: fix folio_size() call after folio_put() in damon_pa_young() ocfs2: fix data corruption after failed write migrate_pages: try migrate in batch asynchronously firstly migrate_pages: move split folios processing out of migrate_pages_batch() migrate_pages: fix deadlock in batched migration .mailmap: add Alexandre Ghiti personal email address mailmap: correct Dikshita Agarwal's Qualcomm email address mailmap: updates for Jarkko Sakkinen mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage mm: teach mincore_hugetlb about pte markers
2023-03-14Merge tag 'trace-v6.3-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Do not allow histogram values to have modifies. They can cause a NULL pointer dereference if they do. - Warn if hist_field_name() is passed a NULL. Prevent the NULL pointer dereference mentioned above. - Fix invalid address look up race in lookup_rec() - Define ftrace_stub_graph conditionally to prevent linker errors - Always check if RCU is watching at all tracepoint locations * tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Make tracepoint lockdep check actually test something ftrace,kcfi: Define ftrace_stub_graph conditionally ftrace: Fix invalid address access in lookup_rec() when index is 0 tracing: Check field value in hist_field_name() tracing: Do not let histogram values have some modifiers