summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-13Bluetooth: hci_sock: fix slab oob read in create_monitor_eventEdward AD
When accessing hdev->name, the actual string length should prevail Reported-by: syzbot+c90849c50ed209d77689@syzkaller.appspotmail.com Fixes: dcda165706b9 ("Bluetooth: hci_core: Fix build warnings") Signed-off-by: Edward AD <twuufnxlz@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-10-13Bluetooth: btrtl: Ignore error return for hci_devcd_register()Max Chou
If CONFIG_DEV_COREDUMP was not set, it would return -EOPNOTSUPP for hci_devcd_register(). In this commit, ignore error return for hci_devcd_register(). Otherwise Bluetooth initialization will be failed. Fixes: 044014ce85a1 ("Bluetooth: btrtl: Add Realtek devcoredump support") Cc: stable@vger.kernel.org Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Closes: https://lore.kernel.org/all/ZRyqIn0_qqEFBPdy@debian.me/T/ Signed-off-by: Max Chou <max.chou@realtek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-10-13Bluetooth: hci_event: Fix coding styleLuiz Augusto von Dentz
This fixes the following code style problem: ERROR: that open brace { should be on the previous line + if (!bacmp(&hdev->bdaddr, &ev->bdaddr)) + { Fixes: 1ffc6f8cc332 ("Bluetooth: Reject connection with the device which has same BD_ADDR") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-10-13Bluetooth: hci_event: Fix using memcmp when comparing keysLuiz Augusto von Dentz
memcmp is not consider safe to use with cryptographic secrets: 'Do not use memcmp() to compare security critical data, such as cryptographic secrets, because the required CPU time depends on the number of equal bytes.' While usage of memcmp for ZERO_KEY may not be considered a security critical data, it can lead to more usage of memcmp with pairing keys which could introduce more security problems. Fixes: 455c2ff0a558 ("Bluetooth: Fix BR/EDR out-of-band pairing with only initiator data") Fixes: 33155c4aae52 ("Bluetooth: hci_event: Ignore NULL link key") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-10-13Merge tag 'mlx5-fixes-2023-10-12' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-10-12 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command net/mlx5e: Don't offload internal port if filter device is out device net/mlx5e: Take RTNL lock before triggering netdev notifiers net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown net/mlx5e: RX, Fix page_pool allocation failure recovery for legacy rq net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq net/mlx5: Handle fw tracer change ownership event based on MTRC net/mlx5: Bridge, fix peer entry ageing in LAG mode net/mlx5: E-switch, register event handler before arming the event net/mlx5: Perform DMA operations in the right locations ==================== Link: https://lore.kernel.org/r/20231012195127.129585-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13appletalk: remove special handling code for ipddpLukas Bulwahn
After commit 1dab47139e61 ("appletalk: remove ipddp driver") removes the config IPDDP, there is some minor code clean-up possible in the appletalk network layer. Remove some code in appletalk layer after the ipddp driver is gone. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231012063443.22368-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Input: xpad - add PXN V900 supportMatthias Berndt
Add VID and PID to the xpad_device table to allow driver to use the PXN V900 steering wheel, which is XTYPE_XBOX360 compatible in xinput mode. Signed-off-by: Matthias Berndt <matthias_berndt@gmx.de> Link: https://lore.kernel.org/r/4932699.31r3eYUQgx@fedora Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2023-10-13Input: synaptics-rmi4 - handle reset delay when using SMBus trsnsportDmitry Torokhov
Touch controllers need some time after receiving reset command for the firmware to finish re-initializing and be ready to respond to commands from the host. The driver already had handling for the post-reset delay for I2C and SPI transports, this change adds the handling to SMBus-connected devices. SMBus devices are peculiar because they implement legacy PS/2 compatibility mode, so reset is actually issued by psmouse driver on the associated serio port, after which the control is passed to the RMI4 driver with SMBus companion device. Note that originally the delay was added to psmouse driver in 92e24e0e57f7 ("Input: psmouse - add delay when deactivating for SMBus mode"), but that resulted in an unwanted delay in "fast" reconnect handler for the serio port, so it was decided to revert the patch and have the delay being handled in the RMI4 driver, similar to the other transports. Tested-by: Jeffery Miller <jefferymiller@google.com> Link: https://lore.kernel.org/r/ZR1yUFJ8a9Zt606N@penguin Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2023-10-13Input: psmouse - fix fast_reconnect function for PS/2 modeJeffery Miller
When the SMBus connection is attempted psmouse_smbus_init() sets the fast_reconnect pointer to psmouse_smbus_reconnecti(). If SMBus initialization fails, elantech_setup_ps2() and synaptics_init_ps2() will fallback to PS/2 mode, replacing the psmouse private data. This can cause issues on resume, since psmouse_smbus_reconnect() expects to find an instance of struct psmouse_smbus_dev in psmouse->private. The issue was uncovered when in 92e24e0e57f7 ("Input: psmouse - add delay when deactivating for SMBus mode") psmouse_smbus_reconnect() started attempting to use more of the data structure. The commit was since reverted, not because it was at fault, but because there was found a better way of doing what it was attempting to do. Fix the problem by resetting the fast_reconnect pointer in psmouse structure in elantech_setup_ps2() and synaptics_init_ps2() when the PS/2 mode is used. Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Jeffery Miller <jefferymiller@google.com> Fixes: bf232e460a35 ("Input: psmouse-smbus - allow to control psmouse_deactivate") Link: https://lore.kernel.org/r/20231005002249.554877-1-jefferymiller@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2023-10-13Merge branch 'intel-wired-lan-driver-updates-2023-10-11-i40e-ice'Jakub Kicinski
Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2023-10-11 (i40e, ice) This series contains fixes for the i40e and ice drivers. Jesse adds handling to the ice driver which resetis the device when loading on a crash kernel, preventing stale transactions from causing machine check exceptions which could prevent capturing crash data. Mateusz fixes a bug in the ice driver 'Safe mode' logic for handling the device when the DDP is missing. Michal fixes a crash when probing the i40e driver in the event that HW registers are reporting invalid/unexpected values. The following are changes since commit a950a5921db450c74212327f69950ff03419483a: net/smc: Fix pos miscalculation in statistics I'm covering for Tony Nguyen while he's out, and don't have access to create a pull request branch on his net-queue, so these are sent via mail only. ==================== Link: https://lore.kernel.org/r/20231011233334.336092-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13ice: Fix safe mode when DDP is missingMateusz Pacuszka
One thing is broken in the safe mode, that is ice_deinit_features() is being executed even that ice_init_features() was not causing stack trace during pci_unregister_driver(). Add check on the top of the function. Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-4-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13ice: reset first in crash dump kernelsJesse Brandeburg
When the system boots into the crash dump kernel after a panic, the ice networking device may still have pending transactions that can cause errors or machine checks when the device is re-enabled. This can prevent the crash dump kernel from loading the driver or collecting the crash data. To avoid this issue, perform a function level reset (FLR) on the ice device via PCIe config space before enabling it on the crash kernel. This will clear any outstanding transactions and stop all queues and interrupts. Restore the config space after the FLR, otherwise it was found in testing that the driver wouldn't load successfully. The following sequence causes the original issue: - Load the ice driver with modprobe ice - Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs - Trigger a crash with echo c > /proc/sysrq-trigger - Load the ice driver again (or let it load automatically) with modprobe ice - The system crashes again during pcim_enable_device() Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series") Reported-by: Vishal Agrawal <vagrawal@redhat.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13i40e: prevent crash on probe if hw registers have invalid valuesMichal Schmidt
The hardware provides the indexes of the first and the last available queue and VF. From the indexes, the driver calculates the numbers of queues and VFs. In theory, a faulty device might say the last index is smaller than the first index. In that case, the driver's calculation would underflow, it would attempt to write to non-existent registers outside of the ioremapped range and crash. I ran into this not by having a faulty device, but by an operator error. I accidentally ran a QE test meant for i40e devices on an ice device. The test used 'echo i40e > /sys/...ice PCI device.../driver_override', bound the driver to the device and crashed in one of the wr32 calls in i40e_clear_hw. Add checks to prevent underflows in the calculations of num_queues and num_vfs. With this fix, the wrong device probing reports errors and returns a failure without crashing. Fixes: 838d41d92a90 ("i40e: clear all queues and interrupts") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-2-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Merge tag 'nf-23-10-12' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net Patch 1, from Pablo Neira Ayuso, fixes a performance regression (since 6.4) when a large pending set update has to be canceled towards the end of the transaction. Patch 2 from myself, silences an incorrect compiler warning reported with a few (older) compiler toolchains. Patch 3, from Kees Cook, adds __counted_by annotation to nft_pipapo set backend type. I took this for net instead of -next given infra is already in place and no actual code change is made. Patch 4, from Pablo Neira Ayso, disables timeout resets on stateful element reset. The rest should only affect internal object state, e.g. reset a quota or counter, but not affect a pending timeout. Patches 5 and 6 fix NULL dereferences in 'inner header' match, control plane doesn't test for netlink attribute presence before accessing them. Broken since feature was added in 6.2, fixes from Xingyuan Mo. Last patch, from myself, fixes a bogus rule match when skb has a 0-length mac header, in this case we'd fetch data from network header instead of canceling rule evaluation. This is a day 0 bug, present since nftables was merged in 3.13. * tag 'nf-23-10-12' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_payload: fix wrong mac header matching nf_tables: fix NULL pointer dereference in nft_expr_inner_parse() nf_tables: fix NULL pointer dereference in nft_inner_init() netfilter: nf_tables: do not refresh timeout when resetting element netfilter: nf_tables: Annotate struct nft_pipapo_match with __counted_by netfilter: nfnetlink_log: silence bogus compiler warning netfilter: nf_tables: do not remove elements if set backend implements .abort ==================== Link: https://lore.kernel.org/r/20231012085724.15155-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13qed: replace uses of strncpyJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. This patch eliminates three uses of strncpy(): Firstly, `dest` is expected to be NUL-terminated which is evident by the manual setting of a NUL-byte at size - 1. For this use specifically, strscpy() is a viable replacement due to the fact that it guarantees NUL-termination on the destination buffer. The next two cases should simply be memcpy() as the size of the src string is always 3 and the destination string just wants the first 3 bytes changed. To be clear, there are no buffer overread bugs in the current code as the sizes and offsets are carefully managed such that buffers are NUL-terminated. However, with these changes, the code is now more robust and less ambiguous (and hopefully easier to read). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231012-strncpy-drivers-net-ethernet-qlogic-qed-qed_debug-c-v2-1-16d2c0162b80@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13r8169: fix rare issue with broken rx after link-down on RTL8125Heiner Kallweit
In very rare cases (I've seen two reports so far about different RTL8125 chip versions) it seems the MAC locks up when link goes down and requires a software reset to get revived. Realtek doesn't publish hw errata information, therefore the root cause is unknown. Realtek vendor drivers do a full hw re-initialization on each link-up event, the slimmed-down variant here was reported to fix the issue for the reporting user. It's not fully clear which parts of the NIC are reset as part of the software reset, therefore I can't rule out side effects. Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Martin Kjær Jørgensen <me@lagy.org> Link: https://lore.kernel.org/netdev/97ec2232-3257-316c-c3e7-a08192ce16a6@gmail.com/T/ Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/9edde757-9c3b-4730-be3b-0ef3a374ff71@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: ti: icssg-prueth: Fix tx_total_bytes countMD Danish Anwar
ICSSG HW stats on TX side considers 8 preamble bytes as data bytes. Due to this the tx_bytes of ICSSG interface doesn't match the rx_bytes of the link partner. There is no public errata available yet. As a workaround to fix this, decrease tx_bytes by 8 bytes for every tx frame. Fixes: c1e10d5dc7a1 ("net: ti: icssg-prueth: Add ICSSG Stats") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20231012064626.977466-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13docs: fix info about representor identificationMateusz Polchlopek
Update the "How are representors identified?" documentation subchapter. For newer kernels driver should use SET_NETDEV_DEVLINK_PORT instead of ndo_get_devlink_port() callback. Fixes: 7712b3e966ea ("Merge branch 'net-fix-netdev-to-devlink_port-linkage-and-expose-to-user'") Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20231012123144.15768-1-mateusz.polchlopek@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13netlink: specs: devlink: fix reply command valuesJiri Pirko
Make sure that the command values used for replies are correct. This is only affecting generated userspace helpers, no change on kernel code. Fixes: 7199c86247e9 ("netlink: specs: devlink: add commands that do per-instance dump") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231012115811.298129-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Merge branch 'net-netconsole-configfs-entries-for-boot-target'Jakub Kicinski
Breno Leitao says: ==================== net: netconsole: configfs entries for boot target There is a limitation in netconsole, where it is impossible to disable or modify the target created from the command line parameter. (netconsole=...). "netconsole" cmdline parameter sets the remote IP, and if the remote IP changes, the machine needs to be rebooted (with the new remote IP set in the command line parameter). This allows the user to modify a target without the need to restart the machine. This functionality sits on top of the dynamic target reconfiguration that is already implemented in netconsole. The way to modify a boot time target is creating special named configfs directories, that will be associated with the targets coming from `netconsole=...`. Example: Let's suppose you have two netconsole targets defined at boot time:: netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc;4444@10.0.0.1/eth1,9353@10.0.0.3/12:34:56:78:9a:bc You can modify these targets in runtime by creating the following targets:: $ mkdir cmdline1 $ cat cmdline1/remote_ip 10.0.0.3 $ echo 0 > cmdline1/enabled $ echo 10.0.0.4 > cmdline1/remote_ip $ echo 1 > cmdline1/enabled ==================== Link: https://lore.kernel.org/r/20231012111401.333798-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Documentation: netconsole: add support for cmdline targetsBreno Leitao
With the previous patches, there is no more limitation at modifying the targets created at boot time (or module load time). Document the way on how to create the configfs directories to be able to modify these netconsole targets. The design discussion about this topic could be found at: https://lore.kernel.org/all/ZRWRal5bW93px4km@gmail.com/ Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20231012111401.333798-5-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13netconsole: Attach cmdline target to dynamic targetBreno Leitao
Enable the attachment of a dynamic target to the target created during boot time. The boot-time targets are named as "cmdline\d", where "\d" is a number starting at 0. If the user creates a dynamic target named "cmdline0", it will attach to the first target created at boot time (as defined in the `netconsole=...` command line argument). `cmdline1` will attach to the second target and so forth. If there is no netconsole target created at boot time, then, the target name could be reused. Relevant design discussion: https://lore.kernel.org/all/ZRWRal5bW93px4km@gmail.com/ Suggested-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20231012111401.333798-4-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13netconsole: Initialize configfs_item for default targetsBreno Leitao
For netconsole targets allocated during the boot time (passing netconsole=... argument), netconsole_target->item is not initialized. That is not a problem because it is not used inside configfs. An upcoming patch will be using it, thus, initialize the targets with the name 'cmdline' plus a counter starting from 0. This name will match entries in the configfs later. Suggested-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20231012111401.333798-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13netconsole: move init/cleanup functions lowerBreno Leitao
Move alloc_param_target() and its counterpart (free_param_target()) to the bottom of the file. These functions are called mostly at initialization/cleanup of the module, and they should be just above the callers, at the bottom of the file. From a practical perspective, having alloc_param_target() at the bottom of the file will avoid forward declaration later (in the following patch). Nothing changed other than the functions location. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20231012111401.333798-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13sfc: replace deprecated strncpy with strscpyJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. `desc` is expected to be NUL-terminated as evident by the manual NUL-byte assignment. Moreover, NUL-padding does not seem to be necessary. The only caller of efx_mcdi_nvram_metadata() is efx_devlink_info_nvram_partition() which provides a NULL for `desc`: | rc = efx_mcdi_nvram_metadata(efx, partition_type, NULL, version, NULL, 0); Due to this, I am not sure this code is even reached but we should still favor something other than strncpy. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231012-strncpy-drivers-net-ethernet-sfc-mcdi-c-v1-1-478c8de1039d@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: phy: tja11xx: replace deprecated strncpy with ethtool_sprintfJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. ethtool_sprintf() is designed specifically for get_strings() usage. Let's replace strncpy in favor of this dedicated helper function. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231012-strncpy-drivers-net-phy-nxp-tja11xx-c-v1-1-5ad6c9dff5c4@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13ionic: replace deprecated strncpy with strscpyJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. NUL-padding is not needed due to `ident` being memset'd to 0 just before the copy. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231011-strncpy-drivers-net-ethernet-pensando-ionic-ionic_main-c-v1-1-23c62a16ff58@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: sparx5: replace deprecated strncpy with ethtool_sprintfJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. ethtool_sprintf() is designed specifically for get_strings() usage. Let's replace strncpy() in favor of this more robust and easier to understand interface. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231011-strncpy-drivers-net-ethernet-microchip-sparx5-sparx5_ethtool-c-v1-1-410953d07f42@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net/mlx4_core: replace deprecated strncpy with strscpyJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect `dst` to be NUL-terminated based on its use with format strings: | mlx4_dbg(dev, "Reporting Driver Version to FW: %s\n", dst); Moreover, NUL-padding is not required. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20231011-strncpy-drivers-net-ethernet-mellanox-mlx4-fw-c-v1-1-4d7b5d34c933@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13nfp: replace deprecated strncpy with strscpyJustin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect res->name to be NUL-terminated based on its usage with format strings: | dev_err(cpp->dev.parent, "Dangling area: %d:%d:%d:0x%0llx-0x%0llx%s%s\n", | NFP_CPP_ID_TARGET_of(res->cpp_id), | NFP_CPP_ID_ACTION_of(res->cpp_id), | NFP_CPP_ID_TOKEN_of(res->cpp_id), | res->start, res->end, | res->name ? " " : "", | res->name ? res->name : ""); ... and with strcmp() | if (!strcmp(res->name, NFP_RESOURCE_TBL_NAME)) { Moreover, NUL-padding is not required as `res` is already zero-allocated: | res = kzalloc(sizeof(*res), GFP_KERNEL); Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Let's also opt to use the more idiomatic strscpy() usage of (dest, src, sizeof(dest)) rather than (dest, src, SOME_LEN). Typically the pattern of 1) allocate memory for string, 2) copy string into freshly-allocated memory is a candidate for kmemdup_nul() but in this case we are allocating the entirety of the `res` struct and that should stay as is. As mentioned above, simple 1:1 replacement of strncpy -> strscpy :) Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Louis Peens <louis.peens@corigine.com> Link: https://lore.kernel.org/r/20231011-strncpy-drivers-net-ethernet-netronome-nfp-nfpcore-nfp_resource-c-v1-1-7d1c984f0eba@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13mlxsw: pci: Allocate skbs using GFP_KERNEL during initializationIdo Schimmel
The driver allocates skbs during initialization and during Rx processing. Take advantage of the fact that the former happens in process context and allocate the skbs using GFP_KERNEL to decrease the probability of allocation failure. Tested with CONFIG_DEBUG_ATOMIC_SLEEP=y. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/dfa6ed0926e045fe7c14f0894cc0c37fee81bf9d.1697034729.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13octeontx2-af: Enable hardware timestamping for VFsSubbaraya Sundeep
Currently for VFs, mailbox returns ENODEV error when hardware timestamping enable is requested. This patch fixes this issue. Modified this patch to return EPERM error for the PF/VFs which are not attached to CGX/RPM. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231011121551.1205211-1-saikrishnag@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Merge branch 'wangxun-ethtool-stats'Jakub Kicinski
Jiawen Wu says: ==================== Wangxun ethtool stats Support to show ethtool stats for txgbe/ngbe. ==================== Link: https://lore.kernel.org/r/20231011091906.70486-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: ngbe: add ethtool stats supportJiawen Wu
Support to show ethtool statistics. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20231011091906.70486-4-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: txgbe: add ethtool stats supportJiawen Wu
Support to show ethtool statistics. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20231011091906.70486-3-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: libwx: support hardware statisticsJiawen Wu
Implement update and clear Rx/Tx statistics. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20231011091906.70486-2-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net/smc: fix smc clc failed issue when netdevice not in init_netAlbert Huang
If the netdevice is within a container and communicates externally through network technologies such as VxLAN, we won't be able to find routing information in the init_net namespace. To address this issue, we need to add a struct net parameter to the smc_ib_find_route function. This allow us to locate the routing information within the corresponding net namespace, ensuring the correct completion of the SMC CLC interaction. Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment") Signed-off-by: Albert Huang <huangjie.albert@bytedance.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Link: https://lore.kernel.org/r/20231011074851.95280-1-huangjie.albert@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13tcp: allow again tcp_disconnect() when threads are waitingPaolo Abeni
As reported by Tom, .NET and applications build on top of it rely on connect(AF_UNSPEC) to async cancel pending I/O operations on TCP socket. The blamed commit below caused a regression, as such cancellation can now fail. As suggested by Eric, this change addresses the problem explicitly causing blocking I/O operation to terminate immediately (with an error) when a concurrent disconnect() is executed. Instead of tracking the number of threads blocked on a given socket, track the number of disconnect() issued on such socket. If such counter changes after a blocking operation releasing and re-acquiring the socket lock, error out the current operation. Fixes: 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting") Reported-by: Tom Deseyn <tdeseyn@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886305 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/f3b95e47e3dbed840960548aebaa8d954372db41.1697008693.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13ice: fix over-shifted variableJesse Brandeburg
Since the introduction of the ice driver the code has been double-shifting the RSS enabling field, because the define already has shifts in it and can't have the regular pattern of "a << shiftval & mask" applied. Most places in the code got it right, but one line was still wrong. Fix this one location for easy backports to stable. An in-progress patch fixes the defines to "standard" and will be applied as part of the regular -next process sometime after this one. Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> CC: stable@vger.kernel.org Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231010203101.406248-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: dsa: bcm_sf2: Fix possible memory leak in bcm_sf2_mdio_register()Jinjie Ruan
In bcm_sf2_mdio_register(), the class_find_device() will call get_device() to increment reference count for priv->master_mii_bus->dev if of_mdio_find_bus() succeeds. If mdiobus_alloc() or mdiobus_register() fails, it will call get_device() twice without decrement reference count for the device. And it is the same if bcm_sf2_mdio_register() succeeds but fails in bcm_sf2_sw_probe(), or if bcm_sf2_sw_probe() succeeds. If the reference count has not decremented to zero, the dev related resource will not be freed. So remove the get_device() in bcm_sf2_mdio_register(), and call put_device() if mdiobus_alloc() or mdiobus_register() fails and in bcm_sf2_mdio_unregister() to solve the issue. And as Simon suggested, unwind from errors for bcm_sf2_mdio_register() and just return 0 if it succeeds to make it cleaner. Fixes: 461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231011032419.2423290-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13net: dsa: vsc73xx: replace deprecated strncpy with ethtool_sprintfJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. ethtool_sprintf() is designed specifically for get_strings() usage. Let's replace strncpy in favor of this more robust and easier to understand interface. This change could result in misaligned strings when if(cnt) fails. To combat this, use ternary to place empty string in buffer and properly increment pointer to next string slot. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231010-strncpy-drivers-net-dsa-vitesse-vsc73xx-core-c-v2-1-ba4416a9ff23@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13Merge branch 'Open-coded task_vma iter'Andrii Nakryiko
Dave Marchevsky says: ==================== At Meta we have a profiling daemon which periodically collects information on many hosts. This collection usually involves grabbing stacks (user and kernel) using perf_event BPF progs and later symbolicating them. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In those cases we must fall back to digging around in /proc/PID/maps to map virtual address to (binary, offset). The /proc/PID/maps digging does not occur synchronously with stack collection, so the process might already be gone, in which case it won't have /proc/PID/maps and we will fail to symbolicate. This 'exited process problem' doesn't occur very often as most of the prod services we care to profile are long-lived daemons, but there are enough usecases to warrant a workaround: a BPF program which can be optionally loaded at data collection time and essentially walks /proc/PID/maps. Currently this is done by walking the vma list: struct vm_area_struct* mmap = BPF_CORE_READ(mm, mmap); mmap_next = BPF_CORE_READ(rmap, vm_next); /* in a loop */ Since commit 763ecb035029 ("mm: remove the vma linked list") there's no longer a vma linked list to walk. Walking the vma maple tree is not as simple as hopping struct vm_area_struct->vm_next. Luckily, commit f39af05949a4 ("mm: add VMA iterator"), another commit in that series, added struct vma_iterator and for_each_vma macro for easy vma iteration. If similar functionality was exposed to BPF programs, it would be perfect for our usecase. This series adds such functionality, specifically a BPF equivalent of for_each_vma using the open-coded iterator style. Notes: * This approach was chosen after discussion on a previous series [0] which attempted to solve the same problem by adding a BPF_F_VMA_NEXT flag to bpf_find_vma. * Unlike the task_vma bpf_iter, the open-coded iterator kfuncs here do not drop the vma read lock between iterations. See Alexei's response in [0]. * The [vsyscall] page isn't really part of task->mm's vmas, but /proc/PID/maps returns information about it anyways. The vma iter added here does not do the same. See comment on selftest in patch 3. * bpf_iter_task_vma allocates a _data struct which contains - among other things - struct vma_iterator, using BPF allocator and keeps a pointer to the bpf_iter_task_vma_data. This is done in order to prevent changes to struct ma_state - which is wrapped by struct vma_iterator - from necessitating changes to uapi struct bpf_iter_task_vma. Changelog: v6 -> v7: https://lore.kernel.org/bpf/20231010185944.3888849-1-davemarchevsky@fb.com/ Patch numbers correspond to their position in v6 Patch 2 ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c") * Add Andrii ack Patch 3 ("bpf: Introduce task_vma open-coded iterator kfuncs") * Add Andrii ack * Add missing __diag_ignore_all for -Wmissing-prototypes (Song) Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter") * Remove two unnecessary header includes (Andrii) * Remove extraneous !vmas_seen check (Andrii) New Patch ("bpf: Add BPF_KFUNC_{START,END}_defs macros") * After talking to Andrii, this is an attempt to clean up __diag_ignore_all spam everywhere kfuncs are defined. If nontrivial changes are needed, let's apply the other 4 and I'll respin as a standalone patch. v5 -> v6: https://lore.kernel.org/bpf/20231010175637.3405682-1-davemarchevsky@fb.com/ Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter") * Remove extraneous blank line. I did this manually to the .patch file for v5, which caused BPF CI to complain about failing to apply the series v4 -> v5: https://lore.kernel.org/bpf/20231002195341.2940874-1-davemarchevsky@fb.com/ Patch numbers correspond to their position in v4 New Patch ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c") * Patch 2's renaming of this selftest, and associated changes in the userspace runner, are split out into this separate commit (Andrii) Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs") * Remove bpf_iter_task_vma kfuncs from libbpf's bpf_helpers.h, they'll be added to selftests' bpf_experimental.h in selftests patch below (Andrii) * Split bpf_iter_task_vma.c renaming into separate commit (Andrii) Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter") * Add bpf_iter_task_vma kfuncs to bpf_experimental.h (Andrii) * Remove '?' from prog SEC, open_and_load the skel in one operation (Andrii) * Ensure that fclose() always happens in test runner (Andrii) * Use global var w/ 1000 (vm_start, vm_end) structs instead of two MAP_TYPE_ARRAY's w/ 1k u64s each (Andrii) v3 -> v4: https://lore.kernel.org/bpf/20230822050558.2937659-1-davemarchevsky@fb.com/ Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num") * Add Andrii ack Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs") * Mark bpf_iter_task_vma_new args KF_RCU and remove now-unnecessary !task check (Yonghong) * Although KF_RCU is a function-level flag, in reality it only applies to the task_struct *task parameter, as the other two params are a scalar int and a specially-handled KF_ARG_PTR_TO_ITER * Remove struct bpf_iter_task_vma definition from uapi headers, define in kernel/bpf/task_iter.c instead (Andrii) Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter") * Use a local var when looping over vmas to track map idx. Update vmas_seen global after done iterating. Don't start iterating or update vmas_seen if vmas_seen global is nonzero. (Andrii) * Move getpgid() call to correct spot - above skel detach. (Andrii) v2 -> v3: https://lore.kernel.org/bpf/20230821173415.1970776-1-davemarchevsky@fb.com/ Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num") * Add Yonghong ack Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs") * UAPI bpf header and tools/ version should match * Add bpf_iter_task_vma_kern_data which bpf_iter_task_vma_kern points to, bpf_mem_alloc/free it instead of just vma_iterator. (Alexei) * Inner data ptr == NULL implies initialization failed v1 -> v2: https://lore.kernel.org/bpf/20230810183513.684836-1-davemarchevsky@fb.com/ * Patch 1 * Now removes the unnecessary BTF_TYPE_EMIT instead of changing the type (Yonghong) * Patch 2 * Don't do unnecessary BTF_TYPE_EMIT (Yonghong) * Bump task refcount to prevent ->mm reuse (Yonghong) * Keep a pointer to vma_iterator in bpf_iter_task_vma, alloc/free via BPF mem allocator (Yonghong, Stanislav) * Patch 3 [0]: https://lore.kernel.org/bpf/20230801145414.418145-1-davemarchevsky@fb.com/ ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-10-13selftests/bpf: Add tests for open-coded task_vma iterDave Marchevsky
The open-coded task_vma iter added earlier in this series allows for natural iteration over a task's vmas using existing open-coded iter infrastructure, specifically bpf_for_each. This patch adds a test demonstrating this pattern and validating correctness. The vma->vm_start and vma->vm_end addresses of the first 1000 vmas are recorded and compared to /proc/PID/maps output. As expected, both see the same vmas and addresses - with the exception of the [vsyscall] vma - which is explained in a comment in the prog_tests program. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231013204426.1074286-5-davemarchevsky@fb.com
2023-10-13bpf: Introduce task_vma open-coded iterator kfuncsDave Marchevsky
This patch adds kfuncs bpf_iter_task_vma_{new,next,destroy} which allow creation and manipulation of struct bpf_iter_task_vma in open-coded iterator style. BPF programs can use these kfuncs directly or through bpf_for_each macro for natural-looking iteration of all task vmas. The implementation borrows heavily from bpf_find_vma helper's locking - differing only in that it holds the mmap_read lock for all iterations while the helper only executes its provided callback on a maximum of 1 vma. Aside from locking, struct vma_iterator and vma_next do all the heavy lifting. A pointer to an inner data struct, struct bpf_iter_task_vma_data, is the only field in struct bpf_iter_task_vma. This is because the inner data struct contains a struct vma_iterator (not ptr), whose size is likely to change under us. If bpf_iter_task_vma_kern contained vma_iterator directly such a change would require change in opaque bpf_iter_task_vma struct's size. So better to allocate vma_iterator using BPF allocator, and since that alloc must already succeed, might as well allocate all iter fields, thereby freezing struct bpf_iter_task_vma size. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231013204426.1074286-4-davemarchevsky@fb.com
2023-10-13selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.cDave Marchevsky
Further patches in this series will add a struct bpf_iter_task_vma, which will result in a name collision with the selftest prog renamed in this patch. Rename the selftest to avoid the collision. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231013204426.1074286-3-davemarchevsky@fb.com
2023-10-13bpf: Don't explicitly emit BTF for struct btf_iter_numDave Marchevsky
Commit 6018e1f407cc ("bpf: implement numbers iterator") added the BTF_TYPE_EMIT line that this patch is modifying. The struct btf_iter_num doesn't exist, so only a forward declaration is emitted in BTF: FWD 'btf_iter_num' fwd_kind=struct That commit was probably hoping to ensure that struct bpf_iter_num is emitted in vmlinux BTF. A previous version of this patch changed the line to emit the correct type, but Yonghong confirmed that it would definitely be emitted regardless in [0], so this patch simply removes the line. This isn't marked "Fixes" because the extraneous btf_iter_num FWD wasn't causing any issues that I noticed, aside from mild confusion when I looked through the code. [0]: https://lore.kernel.org/bpf/25d08207-43e6-36a8-5e0f-47a913d4cda5@linux.dev/ Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231013204426.1074286-2-davemarchevsky@fb.com
2023-10-13Merge branch 'selftests-fib_tests-fixes-for-multipath-list-receive-tests'Jakub Kicinski
Ido Schimmel says: ==================== selftests: fib_tests: Fixes for multipath list receive tests Fix two issues in recently added FIB multipath list receive tests. ==================== Link: https://lore.kernel.org/r/20231010132113.3014691-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13selftests: fib_tests: Count all trace point invocationsIdo Schimmel
The tests rely on the IPv{4,6} FIB trace points being triggered once for each forwarded packet. If receive processing is deferred to the ksoftirqd task these invocations will not be counted and the tests will fail. Fix by specifying the '-a' flag to avoid perf from filtering on the mausezahn task. Before: # ./fib_tests.sh -t ipv4_mpath_list IPv4 multipath list receive tests TEST: Multipath route hit ratio (.68) [FAIL] # ./fib_tests.sh -t ipv6_mpath_list IPv6 multipath list receive tests TEST: Multipath route hit ratio (.27) [FAIL] After: # ./fib_tests.sh -t ipv4_mpath_list IPv4 multipath list receive tests TEST: Multipath route hit ratio (1.00) [ OK ] # ./fib_tests.sh -t ipv6_mpath_list IPv6 multipath list receive tests TEST: Multipath route hit ratio (.99) [ OK ] Fixes: 8ae9efb859c0 ("selftests: fib_tests: Add multipath list receive tests") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/netdev/202309191658.c00d8b8-oliver.sang@intel.com/ Tested-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Tested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Link: https://lore.kernel.org/r/20231010132113.3014691-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13selftests: fib_tests: Disable RP filter in multipath list receive testIdo Schimmel
The test relies on the fib:fib_table_lookup trace point being triggered once for each forwarded packet. If RP filter is not disabled, the trace point will be triggered twice for each packet (for source validation and forwarding), potentially masking actual bugs. Fix by explicitly disabling RP filter. Before: # ./fib_tests.sh -t ipv4_mpath_list IPv4 multipath list receive tests TEST: Multipath route hit ratio (1.99) [ OK ] After: # ./fib_tests.sh -t ipv4_mpath_list IPv4 multipath list receive tests TEST: Multipath route hit ratio (.99) [ OK ] Fixes: 8ae9efb859c0 ("selftests: fib_tests: Add multipath list receive tests") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/netdev/202309191658.c00d8b8-oliver.sang@intel.com/ Tested-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Tested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Link: https://lore.kernel.org/r/20231010132113.3014691-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13bpf: Change syscall_nr type to int in struct syscall_tp_tArtem Savkov
linux-rt-devel tree contains a patch (b1773eac3f29c ("sched: Add support for lazy preemption")) that adds an extra member to struct trace_entry. This causes the offset of args field in struct trace_event_raw_sys_enter be different from the one in struct syscall_trace_enter: struct trace_event_raw_sys_enter { struct trace_entry ent; /* 0 12 */ /* XXX last struct has 3 bytes of padding */ /* XXX 4 bytes hole, try to pack */ long int id; /* 16 8 */ long unsigned int args[6]; /* 24 48 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ char __data[]; /* 72 0 */ /* size: 72, cachelines: 2, members: 4 */ /* sum members: 68, holes: 1, sum holes: 4 */ /* paddings: 1, sum paddings: 3 */ /* last cacheline: 8 bytes */ }; struct syscall_trace_enter { struct trace_entry ent; /* 0 12 */ /* XXX last struct has 3 bytes of padding */ int nr; /* 12 4 */ long unsigned int args[]; /* 16 0 */ /* size: 16, cachelines: 1, members: 3 */ /* paddings: 1, sum paddings: 3 */ /* last cacheline: 16 bytes */ }; This, in turn, causes perf_event_set_bpf_prog() fail while running bpf test_profiler testcase because max_ctx_offset is calculated based on the former struct, while off on the latter: 10488 if (is_tracepoint || is_syscall_tp) { 10489 int off = trace_event_get_offsets(event->tp_event); 10490 10491 if (prog->aux->max_ctx_offset > off) 10492 return -EACCES; 10493 } What bpf program is actually getting is a pointer to struct syscall_tp_t, defined in kernel/trace/trace_syscalls.c. This patch fixes the problem by aligning struct syscall_tp_t with struct syscall_trace_(enter|exit) and changing the tests to use these structs to dereference context. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20231013054219.172920-1-asavkov@redhat.com