summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-09drm/mediatek: dp: Enable event interrupt only when bridge attachedAngeloGioacchino Del Regno
It is useless and error-prone to enable the DisplayPort event interrupt before finishing to probe and install the driver, as the DP training cannot happen before the entire pipeline is correctly set up, as the interrupt handler also requires the full hardware to be initialized by mtk_dp_bridge_attach(). Anyway, depending in which state the controller is left from the bootloader, this may cause an interrupt storm and consequently hang the kernel during boot, so, avoid enabling the interrupt until we reach a clean state by adding the IRQ_NOAUTOEN flag before requesting it at probe time and manage the enablement of the ISR in the .attach() and .detach() handlers for the DP bridge. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-7-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: dp: Move AUX_P0 setting to mtk_dp_initialize_aux_settings()AngeloGioacchino Del Regno
Move the register write to MTK_DP_AUX_P0_3690 to set the AUX reply mode to function mtk_dp_initialize_aux_settings(), as this is effectively part of the DPTX AUX setup sequence. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-6-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: dp: Use devm variant of drm_bridge_add()AngeloGioacchino Del Regno
In preparation for adding support for aux-bus, which will add a code path that may fail after the drm_bridge_add() call, change that to devm_drm_bridge_add() to simplify failure paths later. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-5-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: dp: Change logging to dev for mtk_dp_aux_transfer()AngeloGioacchino Del Regno
Change logging from drm_{err,info}() to dev_{err,info}() in functions mtk_dp_aux_transfer() and mtk_dp_aux_do_transfer(): this will be essential to avoid getting NULL pointer kernel panics if any kind of error happens during AUX transfers happening before the bridge is attached. This may potentially start happening in a later commit implementing aux-bus support, as AUX transfers will be triggered from the panel driver (for EDID) before the mtk-dp bridge gets attached, and it's done in preparation for the same. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-4-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: dp: Move AUX and panel poweron/off sequence to functionAngeloGioacchino Del Regno
Everytime we run bridge detection and/or EDID read we run a poweron and poweroff sequence for both the AUX and the panel; moreover, this is also done when enabling the bridge in the .atomic_enable() callback. Move this power on/off sequence to a new mtk_dp_aux_panel_poweron() function as to commonize it. Note that, before this commit, in mtk_dp_bridge_atomic_enable() only the AUX was getting powered on but the panel was left powered off if the DP cable wasn't plugged in while now we unconditionally send a D0 request and this is done for two reasons: - First, whether this request fails or not, it takes the same time and anyway the DP hardware won't produce any error (or, if it does, it's ignorable because it won't block further commands) - Second, training the link between a sleeping/standby/unpowered display makes little sense. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-3-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: dp: Add missing error checks in mtk_dp_parse_capabilitiesAngeloGioacchino Del Regno
If reading the RX capabilities fails the training pattern will be set wrongly: add error checking for drm_dp_read_dpcd_caps() and return if anything went wrong with it. While at it, also add a less critical error check when writing to clear the ESI0 IRQ vector. Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230725073234.55892-2-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Compress struct of_device_id entriesAngeloGioacchino Del Regno
Reduce line count by compressing the entries of struct of_device_id; while at it, also add the usual /* sentinel */ comment to the last entry. This commit brings no functional changes. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-7-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Use devm_platform_ioremap_resource()AngeloGioacchino Del Regno
Instead of the open-coded platform_get_resource, devm_ioremap_resource switch to devm_platform_ioremap_resource(), also dropping the useless struct resource pointer, which becomes unused. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-6-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Switch to .remove_new() void callbackAngeloGioacchino Del Regno
The .remove() callback cannot fail: switch to .remove_new() and change mtk_dpi_remove() to void. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-5-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Switch to devm_drm_of_get_bridge()AngeloGioacchino Del Regno
Function drm_of_find_panel_or_bridge() is marked as deprecated: since the usage of that in this driver exactly corresponds to the new function devm_drm_of_get_bridge(), switch to it. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-4-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Simplify with dev_err_probe()AngeloGioacchino Del Regno
Use dev_err_probe() across the entire probe function of this driver to shrink the size. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-3-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09drm/mediatek: mtk_dpi: Simplify with devm_drm_bridge_add()AngeloGioacchino Del Regno
Change drm_bridge_add() to its devm variant to slightly simplify the probe function. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20230726082245.550929-2-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2023-08-09ipv6: adjust ndisc_is_useropt() to also return true for PIOMaciej Żenczykowski
The upcoming (and nearly finalized): https://datatracker.ietf.org/doc/draft-collink-6man-pio-pflag/ will update the IPv6 RA to include a new flag in the PIO field, which will serve as a hint to perform DHCPv6-PD. As we don't want DHCPv6 related logic inside the kernel, this piece of information needs to be exposed to userspace. The simplest option is to simply expose the entire PIO through the already existing mechanism. Even without this new flag, the already existing PIO R (router address) flag (from RFC6275) cannot AFAICT be handled entirely in kernel, and provides useful information that should be exposed to userspace (the router's global address, for use by Mobile IPv6). Also cc'ing stable@ for inclusion in LTS, as while technically this is not quite a bugfix, and instead more of a feature, it is absolutely trivial and the alternative is manually cherrypicking into all Android Common Kernel trees - and I know Greg will ask for it to be sent in via LTS instead... Cc: Jen Linkova <furry@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> Cc: stable@vger.kernel.org Signed-off-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20230807102533.1147559-1-maze@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09drm/amdgpu: Use local64_try_cmpxchg in amdgpu_perf_readUros Bizjak
Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old in amdgpu_perf_read. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-09Merge tag 'wireless-2023-08-09' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Just a few small updates: * fix an integer overflow in nl80211 * fix rtw89 8852AE disconnections * fix a buffer overflow in ath12k * fix AP_VLAN configuration lookups * fix allocation failure handling in brcm80211 * update MAINTAINERS for some drivers * tag 'wireless-2023-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: ath12k: Fix buffer overflow when scanning with extraie wifi: nl80211: fix integer overflow in nl80211_parse_mbssid_elems() wifi: cfg80211: fix sband iftype data lookup for AP_VLAN wifi: rtw89: fix 8852AE disconnection caused by RX full flags MAINTAINERS: Remove tree entry for rtl8180 MAINTAINERS: Update entry for rtl8187 wifi: brcm80211: handle params_v1 allocation failure ==================== Link: https://lore.kernel.org/r/20230809124818.167432-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09blk-iocost: fix queue stats accountingChengming Zhou
The q->stats->accounting is not only used by iocost, but iocost only increase this counter, never decrease it. So queue stats accounting will always enabled after using iocost once. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20230804070609.31623-1-chengming.zhou@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-09block: don't make REQ_POLLED imply REQ_NOWAITJens Axboe
Normally these two flags do go together, as the issuer of polled IO generally cannot wait for resources that will get freed as part of IO completion. This is because that very task is the one that will complete the request and free those resources, hence that would introduce a deadlock. But it is possible to have someone else issue the polled IO, eg via io_uring if the request is punted to io-wq. For that case, it's fine to have the task block on IO submission, as it is not the same task that will be completing the IO. It's completely up to the caller to ask for both polled and nowait IO separately! If we don't allow polled IO where IOCB_NOWAIT isn't set in the kiocb, then we can run into repeated -EAGAIN submissions and not make any progress. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-09Merge branch 'selftests-forwarding-various-fixes'Jakub Kicinski
Ido Schimmel says: ==================== selftests: forwarding: Various fixes Fix various problems with forwarding selftests. See individual patches for problem description and solution. ==================== Link: https://lore.kernel.org/r/20230808141503.4060661-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb: Make test more robustIdo Schimmel
Some test cases check that the group timer is (or isn't) 0. Instead of grepping for "0.00" grep for " 0.00" as the former can also match "260.00" which is the default group membership interval. Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-18-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb_max: Fix failing test with old libnetIdo Schimmel
As explained in commit 8bcfb4ae4d97 ("selftests: forwarding: Fix failing tests with old libnet"), old versions of libnet (used by mausezahn) do not use the "SO_BINDTODEVICE" socket option. For IP unicast packets, this can be solved by prefixing mausezahn invocations with "ip vrf exec". However, IP multicast packets do not perform routing and simply egress the bound device, which does not exist in this case. Fix by specifying the source and destination MAC of the packet which will cause mausezahn to use a packet socket instead of an IP socket. Fixes: 3446dcd7df05 ("selftests: forwarding: bridge_mdb_max: Add a new selftest") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-17-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb: Fix failing test with old libnetIdo Schimmel
As explained in commit 8bcfb4ae4d97 ("selftests: forwarding: Fix failing tests with old libnet"), old versions of libnet (used by mausezahn) do not use the "SO_BINDTODEVICE" socket option. For IP unicast packets, this can be solved by prefixing mausezahn invocations with "ip vrf exec". However, IP multicast packets do not perform routing and simply egress the bound device, which does not exist in this case. Fix by specifying the source and destination MAC of the packet which will cause mausezahn to use a packet socket instead of an IP socket. Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-16-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: tc_flower_l2_miss: Fix failing test with old libnetIdo Schimmel
As explained in commit 8bcfb4ae4d97 ("selftests: forwarding: Fix failing tests with old libnet"), old versions of libnet (used by mausezahn) do not use the "SO_BINDTODEVICE" socket option. For IP unicast packets, this can be solved by prefixing mausezahn invocations with "ip vrf exec". However, IP multicast packets do not perform routing and simply egress the bound device, which does not exist in this case. Fix by specifying the source and destination MAC of the packet which will cause mausezahn to use a packet socket instead of an IP socket. Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-15-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: tc_tunnel_key: Make filters more specificIdo Schimmel
The test installs filters that match on various IP fragments (e.g., no fragment, first fragment) and expects a certain amount of packets to hit each filter. This is problematic as the filters are not specific enough and can match IP packets (e.g., IGMP) generated by the stack, resulting in failures [1]. Fix by making the filters more specific and match on more fields in the IP header: Source IP, destination IP and protocol. [1] # timeout set to 0 # selftests: net/forwarding: tc_tunnel_key.sh # TEST: tunnel_key nofrag (skip_hw) [FAIL] # packet smaller than MTU was not tunneled # INFO: Could not test offloaded functionality not ok 89 selftests: net/forwarding: tc_tunnel_key.sh # exit=1 Fixes: 533a89b1940f ("selftests: forwarding: add tunnel_key "nofrag" test case") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Acked-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-14-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: tc_flower: Relax success criterionIdo Schimmel
The test checks that filters that match on source or destination MAC were only hit once. A host can send more than one packet with a given source or destination MAC, resulting in failures. Fix by relaxing the success criterion and instead check that the filters were not hit zero times. Using tc_check_at_least_x_packets() is also an option, but it is not available in older kernels. Fixes: 07e5c75184a1 ("selftests: forwarding: Introduce tc flower matching tests") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-13-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: tc_actions: Use ncat instead of ncIdo Schimmel
The test relies on 'nc' being the netcat version from the nmap project. While this seems to be the case on Fedora, it is not the case on Ubuntu, resulting in failures such as [1]. Fix by explicitly using the 'ncat' utility from the nmap project and the skip the test in case it is not installed. [1] # timeout set to 0 # selftests: net/forwarding: tc_actions.sh # TEST: gact drop and ok (skip_hw) [ OK ] # TEST: mirred egress flower redirect (skip_hw) [ OK ] # TEST: mirred egress flower mirror (skip_hw) [ OK ] # TEST: mirred egress matchall mirror (skip_hw) [ OK ] # TEST: mirred_egress_to_ingress (skip_hw) [ OK ] # nc: invalid option -- '-' # usage: nc [-46CDdFhklNnrStUuvZz] [-I length] [-i interval] [-M ttl] # [-m minttl] [-O length] [-P proxy_username] [-p source_port] # [-q seconds] [-s sourceaddr] [-T keyword] [-V rtable] [-W recvlimit] # [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]] # [destination] [port] # nc: invalid option -- '-' # usage: nc [-46CDdFhklNnrStUuvZz] [-I length] [-i interval] [-M ttl] # [-m minttl] [-O length] [-P proxy_username] [-p source_port] # [-q seconds] [-s sourceaddr] [-T keyword] [-V rtable] [-W recvlimit] # [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]] # [destination] [port] # TEST: mirred_egress_to_ingress_tcp (skip_hw) [FAIL] # server output check failed # INFO: Could not test offloaded functionality not ok 80 selftests: net/forwarding: tc_actions.sh # exit=1 Fixes: ca22da2fbd69 ("act_mirred: use the backlog for nested calls to mirred ingress") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-12-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: ethtool_mm: Skip when MAC Merge is not supportedIdo Schimmel
MAC Merge cannot be tested with veth pairs, resulting in failures: # ./ethtool_mm.sh [...] TEST: Manual configuration with verification: swp1 to swp2 [FAIL] Verification did not succeed Fix by skipping the test when the interfaces do not support MAC Merge. Fixes: e6991384ace5 ("selftests: forwarding: add a test for MAC Merge layer") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-11-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: hw_stats_l3_gre: Skip when using veth pairsIdo Schimmel
Layer 3 hardware stats cannot be used when the underlying interfaces are veth pairs, resulting in failures: # ./hw_stats_l3_gre.sh TEST: ping gre flat [ OK ] TEST: Test rx packets: [FAIL] Traffic not reflected in the counter: 0 -> 0 TEST: Test tx packets: [FAIL] Traffic not reflected in the counter: 0 -> 0 Fix by skipping the test when used with veth pairs. Fixes: 813f97a26860 ("selftests: forwarding: Add a tunnel-based test for L3 HW stats") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-10-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: ethtool_extended_state: Skip when using veth pairsIdo Schimmel
Ethtool extended state cannot be tested with veth pairs, resulting in failures: # ./ethtool_extended_state.sh TEST: Autoneg, No partner detected [FAIL] Expected "Autoneg", got "Link detected: no" [...] Fix by skipping the test when used with veth pairs. Fixes: 7d10bcce98cd ("selftests: forwarding: Add tests for ethtool extended state") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: ethtool: Skip when using veth pairsIdo Schimmel
Auto-negotiation cannot be tested with veth pairs, resulting in failures: # ./ethtool.sh TEST: force of same speed autoneg off [FAIL] error in configuration. swp1 speed Not autoneg off [...] Fix by skipping the test when used with veth pairs. Fixes: 64916b57c0b1 ("selftests: forwarding: Add speed and auto-negotiation test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Add a helper to skip test when using veth pairsIdo Schimmel
A handful of tests require physical loopbacks to be used instead of veth pairs. Add a helper that these tests will invoke in order to be skipped when executed with veth pairs. Fixes: 64916b57c0b1 ("selftests: forwarding: Add speed and auto-negotiation test") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Set default IPv6 traceroute utilityIdo Schimmel
The test uses the 'TROUTE6' environment variable to encode the name of the IPv6 traceroute utility. By default (without a configuration file), this variable is not set, resulting in failures: # ./ip6_forward_instats_vrf.sh TEST: ping6 [ OK ] TEST: Ip6InTooBigErrors [ OK ] TEST: Ip6InHdrErrors [FAIL] TEST: Ip6InAddrErrors [ OK ] TEST: Ip6InDiscards [ OK ] Fix by setting a default utility name and skip the test if the utility is not present. Fixes: 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-6-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb_max: Check iproute2 versionIdo Schimmel
The selftest relies on iproute2 changes present in version 6.3, but the test does not check for it, resulting in errors: # ./bridge_mdb_max.sh INFO: 802.1d tests TEST: cfg4: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected TEST: ctl4: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected TEST: cfg6: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected [...] Fix by skipping the test if iproute2 is too old. Fixes: 3446dcd7df05 ("selftests: forwarding: bridge_mdb_max: Add a new selftest") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/6b04b2ba-2372-6f6b-3ac8-b7cba1cfae83@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb: Check iproute2 versionIdo Schimmel
The selftest relies on iproute2 changes present in version 6.3, but the test does not check for it, resulting in error: # ./bridge_mdb.sh INFO: # Host entries configuration tests TEST: Common host entries configuration tests (IPv4) [FAIL] Managed to add IPv4 host entry with a filter mode TEST: Common host entries configuration tests (IPv6) [FAIL] Managed to add IPv6 host entry with a filter mode TEST: Common host entries configuration tests (L2) [FAIL] Managed to add L2 host entry with a filter mode INFO: # Port group entries configuration tests - (*, G) Command "replace" is unknown, try "bridge mdb help". [...] Fix by skipping the test if iproute2 is too old. Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/6b04b2ba-2372-6f6b-3ac8-b7cba1cfae83@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Switch off timeoutIdo Schimmel
The default timeout for selftests is 45 seconds, but it is not enough for forwarding selftests which can takes minutes to finish depending on the number of tests cases: # make -C tools/testing/selftests TARGETS=net/forwarding run_tests TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # TEST: IGMPv2 report 239.10.10.10 [ OK ] # TEST: IGMPv2 leave 239.10.10.10 [ OK ] # TEST: IGMPv3 report 239.10.10.10 is_include [ OK ] # TEST: IGMPv3 report 239.10.10.10 include -> allow [ OK ] # not ok 1 selftests: net/forwarding: bridge_igmp.sh # TIMEOUT 45 seconds Fix by switching off the timeout and setting it to 0. A similar change was done for BPF selftests in commit 6fc5916cc256 ("selftests: bpf: Switch off timeout"). Fixes: 81573b18f26d ("selftests/net/forwarding: add Makefile to install tests") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/8d149f8c-818e-d141-a0ce-a6bae606bc22@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Skip test when no interfaces are specifiedIdo Schimmel
As explained in [1], the forwarding selftests are meant to be run with either physical loopbacks or veth pairs. The interfaces are expected to be specified in a user-provided forwarding.config file or as command line arguments. By default, this file is not present and the tests fail: # make -C tools/testing/selftests TARGETS=net/forwarding run_tests [...] TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # Command line is not complete. Try option "help" # Failed to create netif not ok 1 selftests: net/forwarding: bridge_igmp.sh # exit=1 [...] Fix by skipping a test if interfaces are not provided either via the configuration file or command line arguments. # make -C tools/testing/selftests TARGETS=net/forwarding run_tests [...] TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # SKIP: Cannot create interface. Name not specified ok 1 selftests: net/forwarding: bridge_igmp.sh # SKIP [1] tools/testing/selftests/net/forwarding/README Fixes: 81573b18f26d ("selftests/net/forwarding: add Makefile to install tests") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/856d454e-f83c-20cf-e166-6dc06cbc1543@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09tcp: add missing family to tcp_set_ca_state() tracepointEric Dumazet
Before this code is copied, add the missing family, as we did in commit 3dd344ea84e1 ("net: tracepoint: exposing sk_family in all tcp:tracepoints") Fixes: 15fcdf6ae116 ("tcp: Add tracepoint for tcp_set_ca_state") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ping Gan <jacky_gam_2001@163.com> Cc: Manjusaka <me@manjusaka.me> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230808084923.2239142-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09Merge branch 'nexthop-nexthop-dump-fixes'Jakub Kicinski
Ido Schimmel says: ==================== nexthop: Nexthop dump fixes Patches #1 and #3 fix two problems related to nexthops and nexthop buckets dump, respectively. Patch #2 is a preparation for the third patch. The pattern described in these patches of splitting the NLMSG_DONE to a separate response is prevalent in other rtnetlink dump callbacks. I don't know if it's because I'm missing something or if this was done intentionally to ensure the message is delivered to user space. After commit 0642840b8bb0 ("af_netlink: ensure that NLMSG_DONE never fails in dumps") this is no longer necessary and I can improve these dump callbacks assuming this analysis is correct. No regressions in existing tests: # ./fib_nexthops.sh [...] Tests passed: 230 Tests failed: 0 ==================== Link: https://lore.kernel.org/r/20230808075233.3337922-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09nexthop: Fix infinite nexthop bucket dump when using maximum nexthop IDIdo Schimmel
A netlink dump callback can return a positive number to signal that more information needs to be dumped or zero to signal that the dump is complete. In the second case, the core netlink code will append the NLMSG_DONE message to the skb in order to indicate to user space that the dump is complete. The nexthop bucket dump callback always returns a positive number if nexthop buckets were filled in the provided skb, even if the dump is complete. This means that a dump will span at least two recvmsg() calls as long as nexthop buckets are present. In the last recvmsg() call the dump callback will not fill in any nexthop buckets because the previous call indicated that the dump should restart from the last dumped nexthop ID plus one. # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id 10 group 1 type resilient buckets 2 # strace -e sendto,recvmsg -s 5 ip nexthop bucket sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396980, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 128 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128 id 10 index 0 idle_time 6.66 nhid 1 id 10 index 1 idle_time 6.66 nhid 1 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 +++ exited with 0 +++ This behavior is both inefficient and buggy. If the last nexthop to be dumped had the maximum ID of 0xffffffff, then the dump will restart from 0 (0xffffffff + 1) and never end: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2 # ip nexthop bucket id 4294967295 index 0 idle_time 5.55 nhid 1 id 4294967295 index 1 idle_time 5.55 nhid 1 id 4294967295 index 0 idle_time 5.55 nhid 1 id 4294967295 index 1 idle_time 5.55 nhid 1 [...] Fix by adjusting the dump callback to return zero when the dump is complete. After the fix only one recvmsg() call is made and the NLMSG_DONE message is appended to the RTM_NEWNEXTHOPBUCKET responses: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2 # strace -e sendto,recvmsg -s 5 ip nexthop bucket sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396737, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 148 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 148 id 4294967295 index 0 idle_time 6.61 nhid 1 id 4294967295 index 1 idle_time 6.61 nhid 1 +++ exited with 0 +++ Note that if the NLMSG_DONE message cannot be appended because of size limitations, then another recvmsg() will be needed, but the core netlink code will not invoke the dump callback and simply reply with a NLMSG_DONE message since it knows that the callback previously returned zero. Add a test that fails before the fix: # ./fib_nexthops.sh -t basic_res [...] TEST: Maximum nexthop ID dump [FAIL] [...] And passes after it: # ./fib_nexthops.sh -t basic_res [...] TEST: Maximum nexthop ID dump [ OK ] [...] Fixes: 8a1bbabb034d ("nexthop: Add netlink handlers for bucket dump") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230808075233.3337922-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09nexthop: Make nexthop bucket dump more efficientIdo Schimmel
rtm_dump_nexthop_bucket_nh() is used to dump nexthop buckets belonging to a specific resilient nexthop group. The function returns a positive return code (the skb length) upon both success and failure. The above behavior is problematic. When a complete nexthop bucket dump is requested, the function that walks the different nexthops treats the non-zero return code as an error. This causes buckets belonging to different resilient nexthop groups to be dumped using different buffers even if they can all fit in the same buffer: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id 10 group 1 type resilient buckets 1 # ip nexthop add id 20 group 1 type resilient buckets 1 # strace -e recvmsg -s 0 ip nexthop bucket [...] recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64 id 10 index 0 idle_time 10.27 nhid 1 [...] recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64 id 20 index 0 idle_time 6.44 nhid 1 [...] Fix by only returning a non-zero return code when an error occurred and restarting the dump from the bucket index we failed to fill in. This allows buckets belonging to different resilient nexthop groups to be dumped using the same buffer: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id 10 group 1 type resilient buckets 1 # ip nexthop add id 20 group 1 type resilient buckets 1 # strace -e recvmsg -s 0 ip nexthop bucket [...] recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128 id 10 index 0 idle_time 30.21 nhid 1 id 20 index 0 idle_time 26.7 nhid 1 [...] While this change is more of a performance improvement change than an actual bug fix, it is a prerequisite for a subsequent patch that does fix a bug. Fixes: 8a1bbabb034d ("nexthop: Add netlink handlers for bucket dump") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230808075233.3337922-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09nexthop: Fix infinite nexthop dump when using maximum nexthop IDIdo Schimmel
A netlink dump callback can return a positive number to signal that more information needs to be dumped or zero to signal that the dump is complete. In the second case, the core netlink code will append the NLMSG_DONE message to the skb in order to indicate to user space that the dump is complete. The nexthop dump callback always returns a positive number if nexthops were filled in the provided skb, even if the dump is complete. This means that a dump will span at least two recvmsg() calls as long as nexthops are present. In the last recvmsg() call the dump callback will not fill in any nexthops because the previous call indicated that the dump should restart from the last dumped nexthop ID plus one. # ip nexthop add id 1 blackhole # strace -e sendto,recvmsg -s 5 ip nexthop sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394315, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 36 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 1], {nla_len=4, nla_type=NHA_BLACKHOLE}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 36 id 1 blackhole recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 +++ exited with 0 +++ This behavior is both inefficient and buggy. If the last nexthop to be dumped had the maximum ID of 0xffffffff, then the dump will restart from 0 (0xffffffff + 1) and never end: # ip nexthop add id $((2**32-1)) blackhole # ip nexthop id 4294967295 blackhole id 4294967295 blackhole [...] Fix by adjusting the dump callback to return zero when the dump is complete. After the fix only one recvmsg() call is made and the NLMSG_DONE message is appended to the RTM_NEWNEXTHOP response: # ip nexthop add id $((2**32-1)) blackhole # strace -e sendto,recvmsg -s 5 ip nexthop sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394080, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 56 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 4294967295], {nla_len=4, nla_type=NHA_BLACKHOLE}]], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 56 id 4294967295 blackhole +++ exited with 0 +++ Note that if the NLMSG_DONE message cannot be appended because of size limitations, then another recvmsg() will be needed, but the core netlink code will not invoke the dump callback and simply reply with a NLMSG_DONE message since it knows that the callback previously returned zero. Add a test that fails before the fix: # ./fib_nexthops.sh -t basic [...] TEST: Maximum nexthop ID dump [FAIL] [...] And passes after it: # ./fib_nexthops.sh -t basic [...] TEST: Maximum nexthop ID dump [ OK ] [...] Fixes: ab84be7e54fc ("net: Initial nexthop code") Reported-by: Petr Machata <petrm@nvidia.com> Closes: https://lore.kernel.org/netdev/87sf91enuf.fsf@nvidia.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230808075233.3337922-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09vlan: Fix VLAN 0 memory leakVlad Buslov
The referenced commit intended to fix memleak of VLAN 0 that is implicitly created on devices with NETIF_F_HW_VLAN_CTAG_FILTER feature. However, it doesn't take into account that the feature can be re-set during the netdevice lifetime which will cause memory leak if feature is disabled during the device deletion as illustrated by [0]. Fix the leak by unconditionally deleting VLAN 0 on NETDEV_DOWN event. [0]: > modprobe 8021q > ip l set dev eth2 up > ethtool -K eth2 rx-vlan-filter off > modprobe -r mlx5_ib > modprobe -r mlx5_core > cat /sys/kernel/debug/kmemleak unreferenced object 0xffff888103dcd900 (size 256): comm "ip", pid 1490, jiffies 4294907305 (age 325.364s) hex dump (first 32 bytes): 00 80 5d 03 81 88 ff ff 00 00 00 00 00 00 00 00 ..]............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000899f3bb9>] kmalloc_trace+0x25/0x80 [<000000002889a7a2>] vlan_vid_add+0xa0/0x210 [<000000007177800e>] vlan_device_event+0x374/0x760 [8021q] [<000000009a0716b1>] notifier_call_chain+0x35/0xb0 [<00000000bbf3d162>] __dev_notify_flags+0x58/0xf0 [<0000000053d2b05d>] dev_change_flags+0x4d/0x60 [<00000000982807e9>] do_setlink+0x28d/0x10a0 [<0000000058c1be00>] __rtnl_newlink+0x545/0x980 [<00000000e66c3bd9>] rtnl_newlink+0x44/0x70 [<00000000a2cc5970>] rtnetlink_rcv_msg+0x29c/0x390 [<00000000d307d1e4>] netlink_rcv_skb+0x54/0x100 [<00000000259d16f9>] netlink_unicast+0x1f6/0x2c0 [<000000007ce2afa1>] netlink_sendmsg+0x232/0x4a0 [<00000000f3f4bb39>] sock_sendmsg+0x38/0x60 [<000000002f9c0624>] ____sys_sendmsg+0x1e3/0x200 [<00000000d6ff5520>] ___sys_sendmsg+0x80/0xc0 unreferenced object 0xffff88813354fde0 (size 32): comm "ip", pid 1490, jiffies 4294907305 (age 325.364s) hex dump (first 32 bytes): a0 d9 dc 03 81 88 ff ff a0 d9 dc 03 81 88 ff ff ................ 81 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000899f3bb9>] kmalloc_trace+0x25/0x80 [<000000002da64724>] vlan_vid_add+0xdf/0x210 [<000000007177800e>] vlan_device_event+0x374/0x760 [8021q] [<000000009a0716b1>] notifier_call_chain+0x35/0xb0 [<00000000bbf3d162>] __dev_notify_flags+0x58/0xf0 [<0000000053d2b05d>] dev_change_flags+0x4d/0x60 [<00000000982807e9>] do_setlink+0x28d/0x10a0 [<0000000058c1be00>] __rtnl_newlink+0x545/0x980 [<00000000e66c3bd9>] rtnl_newlink+0x44/0x70 [<00000000a2cc5970>] rtnetlink_rcv_msg+0x29c/0x390 [<00000000d307d1e4>] netlink_rcv_skb+0x54/0x100 [<00000000259d16f9>] netlink_unicast+0x1f6/0x2c0 [<000000007ce2afa1>] netlink_sendmsg+0x232/0x4a0 [<00000000f3f4bb39>] sock_sendmsg+0x38/0x60 [<000000002f9c0624>] ____sys_sendmsg+0x1e3/0x200 [<00000000d6ff5520>] ___sys_sendmsg+0x80/0xc0 Fixes: efc73f4bbc23 ("net: Fix memory leak - vlan_info struct") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20230808093521.1468929-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09x86/mm: Fix VDSO and VVAR placement on 5-level paging machinesKirill A. Shutemov
Yingcong has noticed that on the 5-level paging machine, VDSO and VVAR VMAs are placed above the 47-bit border: 8000001a9000-8000001ad000 r--p 00000000 00:00 0 [vvar] 8000001ad000-8000001af000 r-xp 00000000 00:00 0 [vdso] This might confuse users who are not aware of 5-level paging and expect all userspace addresses to be under the 47-bit border. So far problem has only been triggered with ASLR disabled, although it may also occur with ASLR enabled if the layout is randomized in a just right way. The problem happens due to custom placement for the VMAs in the VDSO code: vdso_addr() tries to place them above the stack and checks the result against TASK_SIZE_MAX, which is wrong. TASK_SIZE_MAX is set to the 56-bit border on 5-level paging machines. Use DEFAULT_MAP_WINDOW instead. Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Reported-by: Yingcong Wu <yingcong.wu@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230803151609.22141-1-kirill.shutemov%40linux.intel.com
2023-08-09platform/x86: ISST: Reduce noise for missing numa information in logsSrinivas Pandruvada
On platforms with no numa support and with several CPUs, logs have lots of noise for message "Fail to get numa node for CPU:.." Change pr_info() to pr_info_once() as one print is enough to show the issue. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20230808174359.50602-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-08-09platform/x86: msi-ec: Fix the buildJean Delvare
The msi-ec driver fails to build for me (gcc 7.5): CC [M] drivers/platform/x86/msi-ec.o drivers/platform/x86/msi-ec.c:72:6: error: initializer element is not constant { SM_ECO_NAME, 0xc2 }, ^~~~~~~~~~~ drivers/platform/x86/msi-ec.c:72:6: note: (near initialization for ‘CONF0.shift_mode.modes[0].name’) drivers/platform/x86/msi-ec.c:73:6: error: initializer element is not constant { SM_COMFORT_NAME, 0xc1 }, ^~~~~~~~~~~~~~~ drivers/platform/x86/msi-ec.c:73:6: note: (near initialization for ‘CONF0.shift_mode.modes[1].name’) drivers/platform/x86/msi-ec.c:74:6: error: initializer element is not constant { SM_SPORT_NAME, 0xc0 }, ^~~~~~~~~~~~~ drivers/platform/x86/msi-ec.c:74:6: note: (near initialization for ‘CONF0.shift_mode.modes[2].name’) (...) Don't try to be smart, just use defines for the constant strings. The compiler will recognize it's the same string and will store it only once in the data section anyway. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 392cacf2aa10 ("platform/x86: Add new msi-ec driver") Cc: stable@vger.kernel.org Cc: Nikita Kravets <teackot@gmail.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mark Gross <markgross@kernel.org> Link: https://lore.kernel.org/r/20230805101010.54d49e91@endymion.delvare Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-08-09ACPI: resource: Honor MADT INT_SRC_OVR settings for IRQ1 on AMD ZenHans de Goede
On AMD Zen acpi_dev_irq_override() by default prefers the DSDT IRQ 1 settings over the MADT settings. This causes the keyboard to malfunction on some laptop models (see Links), all models from the Links have an INT_SRC_OVR MADT entry for IRQ 1. Fixes: a9c4a912b7dc ("ACPI: resource: Remove "Zen" specific match and quirks") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217336 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217394 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217406 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-09ACPI: resource: Always use MADT override IRQ settings for all legacy non ↵Hans de Goede
i8042 IRQs All the cases, were the DSDT IRQ settings should be used instead of the MADT override, are for IRQ 1 or 12, the PS/2 kbd resp. mouse IRQs. Simplify things by always honering the override for other legacy IRQs (for non DMI quirked cases). This allows removing the DMI quirks to honor the override for some non i8042 IRQs on some AMD ZEN based Lenovo models. Fixes: a9c4a912b7dc ("ACPI: resource: Remove "Zen" specific match and quirks") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-09ACPI: resource: revert "Remove "Zen" specific match and quirks"Hans de Goede
Commit a9c4a912b7dc ("ACPI: resource: Remove "Zen" specific match and quirks") is causing keyboard problems for quite a log of AMD based laptop users, leading to many bug reports. Revert this change for now, until we can come up with a better fix for the PS/2 IRQ trigger-type/polarity problems on some x86 laptops. Fixes: a9c4a912b7dc ("ACPI: resource: Remove "Zen" specific match and quirks") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2228891 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2229165 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2229317 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217718 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217726 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217731 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-09rust: macros: vtable: fix `HAS_*` redefinition (`gen_const_name`)Qingsong Chen
If we define the same function name twice in a trait (using `#[cfg]`), the `vtable` macro will redefine its `gen_const_name`, e.g. this will define `HAS_BAR` twice: #[vtable] pub trait Foo { #[cfg(CONFIG_X)] fn bar(); #[cfg(not(CONFIG_X))] fn bar(x: usize); } Fixes: b44becc5ee80 ("rust: macros: add `#[vtable]` proc macro") Signed-off-by: Qingsong Chen <changxian.cqs@antgroup.com> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Sergio González Collado <sergio.collado@gmail.com> Link: https://lore.kernel.org/r/20230808025404.2053471-1-changxian.cqs@antgroup.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-08-10alpha: remove __init annotation from exported page_is_ram()Masahiro Yamada
EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Commit c5a130325f13 ("ACPI/APEI: Add parameter check before error injection") exported page_is_ram(), hence the __init annotation should be removed. This fixes the modpost warning in ARCH=alpha builds: WARNING: modpost: vmlinux: page_is_ram: EXPORT_SYMBOL used for init symbol. Remove __init or EXPORT_SYMBOL. Fixes: c5a130325f13 ("ACPI/APEI: Add parameter check before error injection") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2023-08-09tpm: Add a helper for checking hwrng enabledMario Limonciello
The same checks are repeated in three places to decide whether to use hwrng. Consolidate these into a helper. Also this fixes a case that one of them was missing a check in the cleanup path. Fixes: 554b841d4703 ("tpm: Disable RNG for all AMD fTPMs") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>