summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-05-03spi: pxa2xx: Don't provide struct chip_data for othersAndy Shevchenko
Now the struct chip_data is local to spi-pxa2xx.c, move its definition to the C file. This will slightly speed up a build and also hide badly named data type (too generic). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-10-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Remove timeout field from struct chip_dataAndy Shevchenko
The timeout field is used only once and assigned to a predefined constant. Replace all that by using the constant directly. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-9-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Remove DMA parameters from struct chip_dataAndy Shevchenko
The DMA related fields are set once and never modified. It effectively repeats the content of the same fields in struct pxa2xx_spi_controller. With that, remove DMA parameters from struct chip_data. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-8-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Drop struct pxa2xx_spi_chipAndy Shevchenko
No more users. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-7-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Don't use "proxy" headersAndy Shevchenko
Update header inclusions to follow IWYU (Include What You Use) principle. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-6-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Remove outdated documentationAndy Shevchenko
The documentation is referring to the legacy enumeration of the SPI host controllers and target devices. It has nothing to do with the modern way, which is the only supported in kernel right now. Hence, remove outdated documentation file. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-5-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Move contents of linux/spi/pxa2xx_spi.h to a local oneAndy Shevchenko
There is no user of the linux/spi/pxa2xx_spi.h. Move its contents to the drivers/spi/spi-pxa2xx.h. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-4-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Provide num-cs for Sharp PDAs via device propertiesAndy Shevchenko
Since driver can parse num-cs device property, replace platform data with this new approach. This pursues the following objectives: - getting rid of the public header that barely used outside of the SPI subsystem (more specifically the SPI PXA2xx drivers) - making a trampoline for the driver to support non-default number of the chip select pins in case the original code is going to be converted to Device Tree model It's not expected to have more users in board files except this one. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-3-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: pxa2xx: Allow number of chip select pins to be read from propertyAndy Shevchenko
In some cases the number of the chip select pins might come from the device property. Allow driver to use it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240417110334.2671228-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: dt-bindings: ti,qspi: convert to dtschemaKousik Sanagavarapu
Convert txt binding of TI's qspi controller (found on their omap SoCs) to dtschema to allow for validation. The changes, w.r.t. the original txt binding, are: - Introduce "clocks" and "clock-names" which was never mentioned. - Reflect that "ti,hwmods" is deprecated and is not a "required" property anymore. - Introduce "num-cs" which allows for setting the number of chip selects. - Drop "qspi_ctrlmod". Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240501165203.13763-1-five231003@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: bitbang: Add missing MODULE_DESCRIPTION()Andy Shevchenko
The modpost script is not happy WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spi/spi-bitbang.o because there is a missing module description. Add it to the module. While at it, update the terminology in Kconfig section to be in align with added description along with the code comments. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240502171518.2792895-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: bitbang: Use NSEC_PER_*SEC rather than hard codingAndy Shevchenko
Use NSEC_PER_*SEC rather than the hard coded value of 1000s. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240502154825.2752464-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: dw: Drop default number of CS settingSerge Semin
DW APB/AHB SSI core now supports the procedure automatically detecting the number of native chip-select lines. Thus there is no longer point in defaulting to four CS if the platform doesn't specify the real number especially seeing the default number didn't correspond to any original DW APB/AHB databook. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-5-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: dw: Convert dw_spi::num_cs to u32Serge Semin
Number of native chip-select lines is either retrieved from the "num-cs" DT-property or auto-detected in the generic DW APB/AHB SSI probe method. In the former case the property is supposed to be of the "u32" size. Convert the field type to being u32 then to be able to drop the temporary variable afterwards. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-4-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: dw: Add a number of native CS auto-detectionSerge Semin
Aside with the FIFO depth and DFS field size it's possible to auto-detect a number of native chip-select synthesized in the DW APB/AHB SSI IP-core. It can be done just by writing ones to the SER register. The number of writable flags in the register is limited by the SSI_NUM_SLAVES IP-core synthesize parameter. All the upper flags are read-only and wired to zero. Based on that let's add the number of native CS auto-detection procedure so the low-level platform drivers wouldn't need to manually set it up unless it's required to set a constraint due to platform-specific reasons (for instance, due to a hardware bug). Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240424150657.9678-3-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: dw: Convert to using BITS_TO_BYTES() macroSerge Semin
Since commit dd3e7cba1627 ("ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use") there is a generic helper available to calculate a number of bytes needed to accommodate the specified number of bits. Let's use it instead of the hard-coded DIV_ROUND_UP() macro function. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240424150657.9678-2-fancer.lancer@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03hwmon: (da9052) Use devm_regulator_get_enable_read_voltage()David Lechner
We can reduce boilerplate code by using devm_regulator_get_enable_read_voltage(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-3-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03hwmon: (adc128d818) Use devm_regulator_get_enable_read_voltage()David Lechner
We can reduce boilerplate code and eliminate the driver remove() function by using devm_regulator_get_enable_read_voltage(). A new external_vref flag is added since we no longer have the handle to the regulator to check if it is present. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-2-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03regulator: devres: add API for reference voltage suppliesDavid Lechner
A common use case for regulators is to supply a reference voltage to an analog input or output device. This adds a new devres API to get, enable, and get the voltage in a single call. This allows eliminating boilerplate code in drivers that use reference supplies in this way. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20240429-regulator-get-enable-get-votlage-v2-1-b1f11ab766c1@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-03spi: stm32: enable controller before asserting CSBen Wolsieffer
On the STM32F4/7, the MOSI and CLK pins float while the controller is disabled. CS is a regular GPIO, and therefore always driven. Currently, the controller is enabled in the transfer_one() callback, which runs after CS is asserted. Therefore, there is a period where the SPI pins are floating while CS is asserted, making it possible for stray signals to disrupt communications. An analogous problem occurs at the end of the transfer when the controller is disabled before CS is released. This problem can be reliably observed by enabling the pull-up (if CPOL=0) or pull-down (if CPOL=1) on the clock pin. This will cause two extra unintended clock edges per transfer, when the controller is enabled and disabled. Note that this bug is likely not present on the STM32H7, because this driver sets the AFCNTR bit (not supported on F4/F7), which keeps the SPI pins driven even while the controller is disabled. Enabling/disabling the controller as part of runtime PM was suggested as an alternative approach, but this breaks the driver on the STM32MP1 (see [1]). The following quote from the manual may explain this: > To restart the internal state machine properly, SPI is strongly > suggested to be disabled and re-enabled before next transaction starts > despite its setting is not changed. This patch has been tested on an STM32F746 with a MAX14830 UART expander. [1] https://lore.kernel.org/lkml/ZXzRi_h2AMqEhMVw@dell-precision-5540/T/ Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Link: https://lore.kernel.org/r/20240424135237.1329001-2-ben.wolsieffer@hefring.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-02tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().Kuniyuki Iwashima
Anderson Nascimento reported a use-after-free splat in tcp_twsk_unique() with nice analysis. Since commit ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance"), inet_twsk_hashdance() sets TIME-WAIT socket's sk_refcnt after putting it into ehash and releasing the bucket lock. Thus, there is a small race window where other threads could try to reuse the port during connect() and call sock_hold() in tcp_twsk_unique() for the TIME-WAIT socket with zero refcnt. If that happens, the refcnt taken by tcp_twsk_unique() is overwritten and sock_put() will cause underflow, triggering a real use-after-free somewhere else. To avoid the use-after-free, we need to use refcount_inc_not_zero() in tcp_twsk_unique() and give up on reusing the port if it returns false. [0]: refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1039313 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110 CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1 Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023 RIP: 0010:refcount_warn_saturate+0xe5/0x110 Code: 42 8e ff 0f 0b c3 cc cc cc cc 80 3d aa 13 ea 01 00 0f 85 5e ff ff ff 48 c7 c7 f8 8e b7 82 c6 05 96 13 ea 01 01 e8 7b 42 8e ff <0f> 0b c3 cc cc cc cc 48 c7 c7 50 8f b7 82 c6 05 7a 13 ea 01 01 e8 RSP: 0018:ffffc90006b43b60 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff888009bb3ef0 RCX: 0000000000000027 RDX: ffff88807be218c8 RSI: 0000000000000001 RDI: ffff88807be218c0 RBP: 0000000000069d70 R08: 0000000000000000 R09: ffffc90006b439f0 R10: ffffc90006b439e8 R11: 0000000000000003 R12: ffff8880029ede84 R13: 0000000000004e20 R14: ffffffff84356dc0 R15: ffff888009bb3ef0 FS: 00007f62c10926c0(0000) GS:ffff88807be00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020ccb000 CR3: 000000004628c005 CR4: 0000000000f70ef0 PKRU: 55555554 Call Trace: <TASK> ? refcount_warn_saturate+0xe5/0x110 ? __warn+0x81/0x130 ? refcount_warn_saturate+0xe5/0x110 ? report_bug+0x171/0x1a0 ? refcount_warn_saturate+0xe5/0x110 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? refcount_warn_saturate+0xe5/0x110 tcp_twsk_unique+0x186/0x190 __inet_check_established+0x176/0x2d0 __inet_hash_connect+0x74/0x7d0 ? __pfx___inet_check_established+0x10/0x10 tcp_v4_connect+0x278/0x530 __inet_stream_connect+0x10f/0x3d0 inet_stream_connect+0x3a/0x60 __sys_connect+0xa8/0xd0 __x64_sys_connect+0x18/0x20 do_syscall_64+0x83/0x170 entry_SYSCALL_64_after_hwframe+0x78/0x80 RIP: 0033:0x7f62c11a885d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007f62c1091e58 EFLAGS: 00000296 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020ccb004 RCX: 00007f62c11a885d RDX: 0000000000000010 RSI: 0000000020ccb000 RDI: 0000000000000003 RBP: 00007f62c1091e90 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000296 R12: 00007f62c10926c0 R13: ffffffffffffff88 R14: 0000000000000000 R15: 00007ffe237885b0 </TASK> Fixes: ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") Reported-by: Anderson Nascimento <anderson@allelesecurity.com> Closes: https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/ Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240501213145.62261-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV socketsEric Dumazet
TCP_SYN_RECV state is really special, it is only used by cross-syn connections, mostly used by fuzzers. In the following crash [1], syzbot managed to trigger a divide by zero in tcp_rcv_space_adjust() A socket makes the following state transitions, without ever calling tcp_init_transfer(), meaning tcp_init_buffer_space() is also not called. TCP_CLOSE connect() TCP_SYN_SENT TCP_SYN_RECV shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) TCP_FIN_WAIT1 To fix this issue, change tcp_shutdown() to not perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, which makes no sense anyway. When tcp_rcv_state_process() later changes socket state from TCP_SYN_RECV to TCP_ESTABLISH, then look at sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, and send a FIN packet from a sane socket state. This means tcp_send_fin() can now be called from BH context, and must use GFP_ATOMIC allocations. [1] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 Call Trace: <TASK> tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x109/0x280 net/socket.c:1068 ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x474/0xae0 net/socket.c:2939 __sys_recvmmsg net/socket.c:3018 [inline] __do_sys_recvmmsg net/socket.c:3041 [inline] __se_sys_recvmmsg net/socket.c:3034 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7faeb6363db9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20240501125448.896529-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02net_sched: sch_sfq: annotate data-races around q->perturb_periodEric Dumazet
sfq_perturbation() reads q->perturb_period locklessly. Add annotations to fix potential issues. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240430180015.3111398-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02net: dsa: mv88e6xxx: Correct check for empty listSimon Horman
Since commit a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") mv88e6xxx_default_mdio_bus() has checked that the return value of list_first_entry() is non-NULL. This appears to be intended to guard against the list chip->mdios being empty. However, it is not the correct check as the implementation of list_first_entry is not designed to return NULL for empty lists. Instead, use list_first_entry_or_null() which does return NULL if the list is empty. Flagged by Smatch. Compile tested only. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240430-mv88e6xx-list_empty-v3-1-c35c69d88d2e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02selftests/net: skip partial checksum packets in csum testWillem de Bruijn
Detect packets with ip_summed CHECKSUM_PARTIAL and skip these. These should not exist, as the test sends individual packets between two hosts. But if (HW) GRO is on, with randomized content sometimes subsequent packets can be coalesced. In this case the GSO packet checksum is converted to a pseudo checksum in anticipation of sending out as TSO/USO. So the field will not match the expected value. Do not count these as test errors. Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240501193156.3627344-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02selftests: net: py: check process exit code in bkg() and background cmd()Jakub Kicinski
We're a bit too loose with error checking for background processes. cmd() completely ignores the fail argument passed to the constructor if background is True. Default to checking for errors if process is not terminated explicitly. Caller can override with True / False. For bkg() the processing step is called magically by __exit__ so record the value passed in the constructor. Reported-by: Willem de Bruijn <willemb@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240502025325.1924923-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02IB/hfi1: allocate dummy net_device dynamicallyBreno Leitao
Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from struct hfi1_netdev_rx by converting it into a pointer. Then use the leverage alloc_netdev() to allocate the net_device object at hfi1_alloc_rx(). [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://lore.kernel.org/r/20240430162213.746492-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02net/mlx5e: flower: check for unsupported control flagsAsbjørn Sloth Tønnesen
Use flow_rule_is_supp_control_flags() to reject filters with unsupported control flags. In case any unsupported control flags are masked, flow_rule_is_supp_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Remove FLOW_DIS_FIRST_FRAG specific error message, and treat it as any other unsupported control flag. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240422152728.175677-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-03Merge tag 'drm-misc-fixes-2024-05-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: imagination: - fix page-count macro nouveau: - avoid page-table allocation failures - fix firmware memory allocation panel: - ili9341: avoid OF for device properties; respect deferred probe; fix usage of errno codes ttm: - fix status output vmwgfx: - fix legacy display unit - fix read length in fence signalling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240502192117.GA12158@linux.fritz.box
2024-05-03Merge tag 'drm-xe-fixes-2024-05-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix UAF on rebind worker - Fix ADL-N display integration Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6bontwst3mbxozs6u3ad5n3g5zmaucrngbfwv4hkfhpscnwlym@wlwjgjx6pwue
2024-05-03Merge tag 'amd-drm-fixes-6.9-2024-05-01' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-05-01: amdgpu: - Fix VRAM memory accounting - DCN 3.1 fixes - DCN 2.0 fix - DCN 3.1.5 fix - DCN 3.5 fix - DCN 3.2.1 fix - DP fixes - Seamless boot fix - Fix call order in amdgpu_ttm_move() - Fix doorbell regression - Disable panel replay temporarily amdkfd: - Flush wq before creating kfd process Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240501135054.1919108-1-alexander.deucher@amd.com
2024-05-02libbpf: fix ring_buffer__consume_n() return result logicAndrii Nakryiko
Add INT_MAX check to ring_buffer__consume_n(). We do the similar check to handle int return result of all these ring buffer APIs in other APIs and ring_buffer__consume_n() is missing one. This patch fixes this omission. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02libbpf: fix potential overflow in ring__consume_n()Andrii Nakryiko
ringbuf_process_ring() return int64_t, while ring__consume_n() assigns it to int. It's highly unlikely, but possible for ringbuf_process_ring() to return value larger than INT_MAX, so use int64_t. ring__consume_n() does check INT_MAX before returning int result to the user. Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02Merge branch 'Add new args into tcp_congestion_ops' cong_control'Martin KaFai Lau
Miao Xu says: ==================== This patchset attempts to add two new arguments into the hookpoint cong_control in tcp_congestion_ops. The new arguments are inherited from the caller tcp_cong_control and can be used by any bpf cc prog that implements its own logic inside this hookpoint. Please review. Thanks a lot! Changelog ===== v2->v3: - Fixed the broken selftest caused by the new arguments. - Renamed the selftest file name and bpf prog name. v1->v2: - Split the patchset into 3 separate patches. - Added highlights in the selftest prog. - Removed the dependency on bpf_tcp_helpers.h. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Add test for the use of new args in cong_controlMiao Xu
This patch adds a selftest to show the usage of the new arguments in cong_control. For simplicity's sake, the testing example reuses cubic's kernel functions. Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-4-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02bpf: tcp: Allow to write tp->snd_cwnd_stamp in bpf_tcp_caMiao Xu
This patch allows the write of tp->snd_cwnd_stamp in a bpf tcp ca program. An use case of writing this field is to keep track of the time whenever tp->snd_cwnd is raised or reduced inside the `cong_control` callback. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-3-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02tcp: Add new args for cong_control in tcp_congestion_opsMiao Xu
This patch adds two new arguments for cong_control of struct tcp_congestion_ops: - ack - flag These two arguments are inherited from the caller tcp_cong_control in tcp_intput.c. One use case of them is to update cwnd and pacing rate inside cong_control based on the info they provide. For example, the flag can be used to decide if it is the right time to raise or reduce a sender's cwnd. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-2-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02Merge branch 'selftests/bpf: Add sockaddr tests for kernel networking'Martin KaFai Lau
Jordan Rife says: ==================== This patch series adds test coverage for BPF sockaddr hooks and their interactions with kernel socket functions (i.e. kernel_bind(), kernel_connect(), kernel_sendmsg(), sock_sendmsg(), kernel_getpeername(), and kernel_getsockname()) while also rounding out IPv4 and IPv6 sockaddr hook coverage in prog_tests/sock_addr.c. As with v1 of this patch series, we add regression coverage for the issues addressed by these patches, - commit 0bdf399342c5("net: Avoid address overwrite in kernel_connect") - commit 86a7e0b69bd5("net: prevent rewrite of msg_name in sock_sendmsg()") - commit c889a99a21bf("net: prevent address rewrite in kernel_bind()") - commit 01b2885d9415("net: Save and restore msg_namelen in sock_sendmsg") but broaden the focus a bit. In order to extend prog_tests/sock_addr.c to test these kernel functions, we add a set of new kfuncs that wrap individual socket operations to bpf_testmod and invoke them through set of corresponding SYSCALL programs (progs/sock_addr_kern.c). Each test case can be configured to use a different set of "sock_ops" depending on whether it is testing kernel calls (kernel_bind(), kernel_connect(), etc.) or system calls (bind(), connect(), etc.). ======= Patches ======= * Patch 1 fixes the sock_addr bind test program to work for big endian architectures such as s390x. * Patch 2 introduces the new kfuncs to bpf_testmod. * Patch 3 introduces the BPF program which allows us to invoke these kfuncs invividually from the test program. * Patch 4 lays the groundwork for IPv4 and IPv6 sockaddr hook coverage by migrating much of the environment setup logic from bpf/test_sock_addr.sh into prog_tests/sock_addr.c and moves test cases to cover bind4/6, connect4/6, sendmsg4/6 and recvmsg4/6 hooks. * Patch 5 makes the set of socket operations for each test case configurable, laying the groundwork for Patch 6. * Patch 6 introduces two sets of sock_ops that invoke the kernel equivalents of connect(), bind(), etc. and uses these to add coverage for the kernel socket functions. ======= Changes ======= v2->v3 ------ * Renamed bind helpers. Dropped "_ntoh" suffix. * Added guards to kfuncs to make sure addrlen and msglen do not exceed the buffer capacity. * Added KF_SLEEPABLE flag to kfuncs. * Added a mutex (sock_lock) to kfuncs to serialize access to sock. * Added NULL check for sock to each kfunc. * Use the "sock_addr" networking namespace for all network interface setup and testing. * Use "nodad" when calling "ip -6 addr add" during interface setup to avoid delays and remove ping loop. * Removed test cases from test_sock_addr.c to make it clear what remains to be migrated. * Removed unused parameter (expect_change) from sock_addr_op(). Link: https://lore.kernel.org/bpf/20240412165230.2009746-1-jrife@google.com/T/#u v1->v2 ------ * Dropped test_progs/sock_addr_kern.c and the sock_addr_kern test module in favor of simply expanding bpf_testmod and test_progs/sock_addr.c. * Migrated environment setup logic from bpf/test_sock_addr.sh into prog_tests/sock_addr.c rather than invoking the script from the test program. * Added kfuncs to bpf_testmod as well as the sock_addr_kern BPF program to enable us to invoke kernel socket functions from test_progs/sock_addr.c. * Added test coverage for kernel socket functions to test_progs/sock_addr.c. Link: https://lore.kernel.org/bpf/20240329191907.1808635-1-jrife@google.com/T/#u ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Add kernel socket operation testsJordan Rife
This patch creates two sets of sock_ops that call out to the SYSCALL hooks in the sock_addr_kern BPF program and uses them to construct test cases for the range of supported operations (kernel_connect(), kernel_bind(), kernel_sendms(), sock_sendmsg(), kernel_getsockname(), kenel_getpeername()). This ensures that these interact with BPF sockaddr hooks as intended. Beyond this it also ensures that these operations do not modify their address parameter, providing regression coverage for the issues addressed by this set of patches: - commit 0bdf399342c5("net: Avoid address overwrite in kernel_connect") - commit 86a7e0b69bd5("net: prevent rewrite of msg_name in sock_sendmsg()") - commit c889a99a21bf("net: prevent address rewrite in kernel_bind()") - commit 01b2885d9415("net: Save and restore msg_namelen in sock_sendmsg") Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-7-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Make sock configurable for each test caseJordan Rife
In order to reuse the same test code for both socket system calls (e.g. connect(), bind(), etc.) and kernel socket functions (e.g. kernel_connect(), kernel_bind(), etc.), this patch introduces the "ops" field to sock_addr_test. This field allows each test cases to configure the set of functions used in the test case to create, manipulate, and tear down a socket. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-6-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Move IPv4 and IPv6 sockaddr test casesJordan Rife
This patch lays the groundwork for testing IPv4 and IPv6 sockaddr hooks and their interaction with both socket syscalls and kernel functions (e.g. kernel_connect, kernel_bind, etc.). It moves some of the test cases from the old-style bpf/test_sock_addr.c self test into the sock_addr prog_test in a step towards fully retiring bpf/test_sock_addr.c. We will expand the test dimensions in the sock_addr prog_test in a later patch series in order to migrate the remaining test cases. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-5-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02Merge tag 'md-6.10-20240502' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.10/block Pull MD fix from Song: "This fixes an issue observed with dm-raid." * tag 'md-6.10-20240502' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: fix resync softlockup when bitmap size is less than array size
2024-05-02md: fix resync softlockup when bitmap size is less than array sizeYu Kuai
Is is reported that for dm-raid10, lvextend + lvchange --syncaction will trigger following softlockup: kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [mdX_resync:6976] CPU: 7 PID: 3588 Comm: mdX_resync Kdump: loaded Not tainted 6.9.0-rc4-next-20240419 #1 RIP: 0010:_raw_spin_unlock_irq+0x13/0x30 Call Trace: <TASK> md_bitmap_start_sync+0x6b/0xf0 raid10_sync_request+0x25c/0x1b40 [raid10] md_do_sync+0x64b/0x1020 md_thread+0xa7/0x170 kthread+0xcf/0x100 ret_from_fork+0x30/0x50 ret_from_fork_asm+0x1a/0x30 And the detailed process is as follows: md_do_sync j = mddev->resync_min while (j < max_sectors) sectors = raid10_sync_request(mddev, j, &skipped) if (!md_bitmap_start_sync(..., &sync_blocks)) // md_bitmap_start_sync set sync_blocks to 0 return sync_blocks + sectors_skippe; // sectors = 0; j += sectors; // j never change Root cause is that commit 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter") return early from md_bitmap_get_counter(), without setting returned blocks. Fix this problem by always set returned blocks from md_bitmap_get_counter"(), as it used to be. Noted that this patch just fix the softlockup problem in kernel, the case that bitmap size doesn't match array size still need to be fixed. Fixes: 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter") Reported-and-tested-by: Nigel Croxon <ncroxon@redhat.com> Closes: https://lore.kernel.org/all/71ba5272-ab07-43ba-8232-d2da642acb4e@redhat.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240422065824.2516-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2024-05-02btrfs: make sure that WRITTEN is set on all metadata blocksJosef Bacik
We previously would call btrfs_check_leaf() if we had the check integrity code enabled, which meant that we could only run the extended leaf checks if we had WRITTEN set on the header flags. This leaves a gap in our checking, because we could end up with corruption on disk where WRITTEN isn't set on the leaf, and then the extended leaf checks don't get run which we rely on to validate all of the item pointers to make sure we don't access memory outside of the extent buffer. However, since 732fab95abe2 ("btrfs: check-integrity: remove CONFIG_BTRFS_FS_CHECK_INTEGRITY option") we no longer call btrfs_check_leaf() from btrfs_mark_buffer_dirty(), which means we only ever call it on blocks that are being written out, and thus have WRITTEN set, or that are being read in, which should have WRITTEN set. Add checks to make sure we have WRITTEN set appropriately, and then make sure __btrfs_check_leaf() always does the item checking. This will protect us from file systems that have been corrupted and no longer have WRITTEN set on some of the blocks. This was hit on a crafted image tweaking the WRITTEN bit and reported by KASAN as out-of-bound access in the eb accessors. The example is a dir item at the end of an eb. [2.042] BTRFS warning (device loop1): bad eb member start: ptr 0x3fff start 30572544 member offset 16410 size 2 [2.040] general protection fault, probably for non-canonical address 0xe0009d1000000003: 0000 [#1] PREEMPT SMP KASAN NOPTI [2.537] KASAN: maybe wild-memory-access in range [0x0005088000000018-0x000508800000001f] [2.729] CPU: 0 PID: 2587 Comm: mount Not tainted 6.8.2 #1 [2.729] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [2.621] RIP: 0010:btrfs_get_16+0x34b/0x6d0 [2.621] RSP: 0018:ffff88810871fab8 EFLAGS: 00000206 [2.621] RAX: 0000a11000000003 RBX: ffff888104ff8720 RCX: ffff88811b2288c0 [2.621] RDX: dffffc0000000000 RSI: ffffffff81dd8aca RDI: ffff88810871f748 [2.621] RBP: 000000000000401a R08: 0000000000000001 R09: ffffed10210e3ee9 [2.621] R10: ffff88810871f74f R11: 205d323430333737 R12: 000000000000001a [2.621] R13: 000508800000001a R14: 1ffff110210e3f5d R15: ffffffff850011e8 [2.621] FS: 00007f56ea275840(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000 [2.621] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2.621] CR2: 00007febd13b75c0 CR3: 000000010bb50000 CR4: 00000000000006f0 [2.621] Call Trace: [2.621] <TASK> [2.621] ? show_regs+0x74/0x80 [2.621] ? die_addr+0x46/0xc0 [2.621] ? exc_general_protection+0x161/0x2a0 [2.621] ? asm_exc_general_protection+0x26/0x30 [2.621] ? btrfs_get_16+0x33a/0x6d0 [2.621] ? btrfs_get_16+0x34b/0x6d0 [2.621] ? btrfs_get_16+0x33a/0x6d0 [2.621] ? __pfx_btrfs_get_16+0x10/0x10 [2.621] ? __pfx_mutex_unlock+0x10/0x10 [2.621] btrfs_match_dir_item_name+0x101/0x1a0 [2.621] btrfs_lookup_dir_item+0x1f3/0x280 [2.621] ? __pfx_btrfs_lookup_dir_item+0x10/0x10 [2.621] btrfs_get_tree+0xd25/0x1910 Reported-by: lei lu <llfamsec@gmail.com> CC: stable@vger.kernel.org # 6.7+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ copy more details from report ] Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-02Merge branch 'shared-zeropage' into featuresAlexander Gordeev
David Hildenbrand says: =================== This series fixes one issue with uffd + shared zeropages on s390x and fixes that "ordinary" KVM guests can make use of shared zeropages again. userfaultfd could currently end up mapping shared zeropages into processes that forbid shared zeropages. This only apples to s390x, relevant for handling PV guests and guests that use storage kets correctly. Fix it by placing a zeroed folio instead of the shared zeropage during UFFDIO_ZEROPAGE instead. I stumbled over this issue while looking into a customer scenario that is using: (1) Memory ballooning for dynamic resizing. Start a VM with, say, 100 GiB and inflate the balloon during boot to 60 GiB. The VM has ~40 GiB available and additional memory can be "fake hotplugged" to the VM later on demand by deflating the balloon. Actual memory overcommit is not desired, so physical memory would only be moved between VMs. (2) Live migration of VMs between sites to evacuate servers in case of emergency. Without the shared zeropage, during (2), the VM would suddenly consume 100 GiB on the migration source and destination. On the migration source, where we don't excpect memory overcommit, we could easilt end up crashing the VM during migration. Independent of that, memory handed back to the hypervisor using "free page reporting" would end up consuming actual memory after the migration on the destination, not getting freed up until reused+freed again. While there might be ways to optimize parts of this in QEMU, we really should just support the shared zeropage again for ordinary VMs. We only expect legcy guests to make use of storage keys, so let's handle zeropages again when enabling storage keys or when enabling PV. To not break userfaultfd like we did in the past, don't zap the shared zeropages, but instead trigger unsharing faults, just like we do for unsharing KSM pages in break_ksm(). Unsharing faults will simply replace the shared zeropage by a zeroed anonymous folio. We can already trigger the same fault path using GUP, when trying to long-term pin a shared zeropage, but also when unmerging a KSM-placed zeropages, so this is nothing new. Patch #1 tested on 86-64 by forcing mm_forbids_zeropage() to be 1, and running the uffd selftests. Patch #2 tested on s390x: the live migration scenario now works as expected, and kvm-unit-tests that trigger usage of skeys work well, whereby I can see detection and unsharing of shared zeropages. Further (as broken in v2), I tested that the shared zeropage is no longer populated after skeys are used -- that mm_forbids_zeropage() works as expected: ./s390x-run s390x/skey.elf \ -no-shutdown \ -chardev socket,id=monitor,path=/var/tmp/mon,server,nowait \ -mon chardev=monitor,mode=readline Then, in another shell: # cat /proc/`pgrep qemu`/smaps_rollup | grep Rss Rss: 31484 kB # echo "dump-guest-memory tmp" | sudo nc -U /var/tmp/mon ... # cat /proc/`pgrep qemu`/smaps_rollup | grep Rss Rss: 160452 kB -> Reading guest memory does not populate the shared zeropage Doing the same with selftest.elf (no skeys) # cat /proc/`pgrep qemu`/smaps_rollup | grep Rss Rss: 30900 kB # echo "dump-guest-memory tmp" | sudo nc -U /var/tmp/mon ... # cat /proc/`pgrep qemu`/smaps_rollup | grep Rsstmp/mon Rss: 30924 kB -> Reading guest memory does populate the shared zeropage =================== Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-02btrfs: qgroup: do not check qgroup inherit if qgroup is disabledQu Wenruo
[BUG] After kernel commit 86211eea8ae1 ("btrfs: qgroup: validate btrfs_qgroup_inherit parameter"), user space tool snapper will fail to create snapshot using its timeline feature. [CAUSE] It turns out that, if using timeline snapper would unconditionally pass btrfs_qgroup_inherit parameter (assigning the new snapshot to qgroup 1/0) for snapshot creation. In that case, since qgroup is disabled there would be no qgroup 1/0, and btrfs_qgroup_check_inherit() would return -ENOENT and fail the whole snapshot creation. [FIX] Just skip the check if qgroup is not enabled. This is to keep the older behavior for user space tools, as if the kernel behavior changed for user space, it is a regression of kernel. Thankfully snapper is also fixing the behavior by detecting if qgroup is running in the first place, so the effect should not be that huge. Link: https://github.com/openSUSE/snapper/issues/894 Fixes: 86211eea8ae1 ("btrfs: qgroup: validate btrfs_qgroup_inherit parameter") CC: stable@vger.kernel.org # 6.8+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-02selftests/bpf: Implement BPF programs for kernel socket operationsJordan Rife
This patch lays out a set of SYSCALL programs that can be used to invoke the socket operation kfuncs in bpf_testmod, allowing a test program to manipulate kernel socket operations from userspace. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-4-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Implement socket kfuncs for bpf_testmodJordan Rife
This patch adds a set of kfuncs to bpf_testmod that can be used to manipulate a socket from kernel space. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-3-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02selftests/bpf: Fix bind program for big endian systemsJordan Rife
Without this fix, the bind4 and bind6 programs will reject bind attempts on big endian systems. This patch ensures that CI tests pass for the s390x architecture. Signed-off-by: Jordan Rife <jrife@google.com> Link: https://lore.kernel.org/r/20240429214529.2644801-2-jrife@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: include/linux/filter.h kernel/bpf/core.c 66e13b615a0c ("bpf: verifier: prevent userspace memory access") d503a04f8bc0 ("bpf: Add support for certain atomics in bpf_arena to x86 JIT") https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>