Age | Commit message (Collapse) | Author |
|
Rather than detecting when half-duplex is not supported, and clearing
the MAC capabilities, reverse the if() condition and use it to set the
capabilities instead.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXn-005pUb-SP@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move priv->phylink_config.mac_managed_pm to be along side the other
phylink initialisations.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXi-005pUV-Nq@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the xgmac specific phylink capabilities to the dwxgmac2 support
core.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXd-005pUP-JL@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the setup of gmac4 speicifc phylink capabilities into gmac4 code.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXY-005pUJ-Ez@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allow MACs to provide their own capabilities via the MAC operations
struct.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXT-005pUD-Aj@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use phylink_limit_mac_speed() to limit the MAC capabilities rather
than coding this for each speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXO-005pU7-61@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have a local variable for priv->plat->mdio_bus_data, which we use
later in the conditional if() block, but we evaluate the above within
the conditional expression. Use mdio_bus_data instead. Since these
will be the only two users of this local variable, move its assignment
just before the if().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXJ-005pU1-1z@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the initialisation of the fwnode variable closer to its use
site, rather than scattered throughout stmmac_phy_setup().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXD-005pTv-TN@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All users of plat->phylink_node first convert it to a fwnode. Rather
than repeatedly convert to a fwnode, store it as a fwnode. To reflect
this change, call it plat->port_node instead - it is used for more
than just phylink.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX8-005pTo-OT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a function which can be used to limit the phylink MAC capabilities
to an upper speed limit.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX3-005pTi-K1@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When an skb fails to be forwarded to the peer(e.g., skb data buffer
length exceeds MTU), it will not be added to the peer's receive queue.
Therefore, we should schedule the peer's NAPI poll function only when
skb forwarding is successful to avoid unnecessary overhead.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Link: https://lore.kernel.org/r/20230824123131.7673-1-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Suman Ghosh says:
====================
Fix PFC related issues
This patchset fixes multiple PFC related issues related to Octeon.
Patch #1: octeontx2-pf: Fix PFC TX scheduler free
Patch #2: octeontx2-af: CN10KB: fix PFC configuration
Patch #3: octeonxt2-pf: Fix backpressure config for multiple PFC
priorities to work simultaneously
====================
Link: https://lore.kernel.org/r/20230824081032.436432-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
simultaneously
MAC (CGX or RPM) asserts backpressure at TL3 or TL2 node of the egress
hierarchical scheduler tree depending on link level config done. If
there are multiple PFC priorities enabled at a time and for all such
flows to backoff, each priority will have to assert backpressure at
different TL3/TL2 scheduler nodes and these flows will need to submit
egress pkts to these nodes.
Current PFC configuration has an issue where in only one backpressure
scheduler node is being allocated which is resulting in only one PFC
priority to work. This patch fixes this issue.
Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230824081032.436432-4-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Suppose user has enabled pfc with prio 0,1 on a PF netdev(eth0)
dcb pfc set dev eth0 prio-pfc o:on 1:on
later user enabled pfc priorities 2 and 3 on the VF interface(eth1)
dcb pfc set dev eth1 prio-pfc 2:on 3:on
Instead of enabling pfc on all priorities (0..3), the driver only
enables on priorities 2,3. This patch corrects the issue by using
the proper CSR address.
Fixes: b9d0fedc6234 ("octeontx2-af: cn10kb: Add RPM_USX MAC support")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230824081032.436432-3-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During PFC TX schedulers free, flag TXSCHQ_FREE_ALL was being set
which caused free up all schedulers other than the PFC schedulers.
This patch fixes that to free only the PFC Tx schedulers.
Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230824081032.436432-2-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-25
We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).
The main changes are:
1) Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds,
from Jiri Olsa.
2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
from Xu Kuohai.
3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
from Pu Lehui.
4) Fix LWT BPF xmit hooks wrt their return values where propagating
the result from skb_do_redirect() would trigger a use-after-free,
from Yan Zhai.
5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
where the map's value kptr type and locally allocated obj type
mismatch, from Yonghong Song.
6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
root/node which bypassed reg->off == 0 enforcement,
from Kumar Kartikeya Dwivedi.
7) Lift BPF verifier restriction in networking BPF programs to treat
comparison of packet pointers not as a pointer leak,
from Yafang Shao.
8) Remove unmaintained XDP BPF samples as they are maintained
in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.
9) Batch of fixes for the tracing programs from BPF samples in order
to make them more libbpf-aware, from Daniel T. Lee.
10) Fix a libbpf signedness determination bug in the CO-RE relocation
handling logic, from Andrii Nakryiko.
11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
fixes for bpf_refcount shared ownership implementation,
both from Dave Marchevsky.
12) Add a new bpf_object__unpin() API function to libbpf,
from Daniel Xu.
13) Fix a memory leak in libbpf to also free btf_vmlinux
when the bpf_object gets closed, from Hao Luo.
14) Small error output improvements to test_bpf module, from Helge Deller.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
bpf: Consider non-owning refs to refcounted nodes RCU protected
bpf: Reenable bpf_refcount_acquire
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
bpf: Consider non-owning refs trusted
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
selftests/bpf: Enable cpu v4 tests for RV64
riscv, bpf: Support unconditional bswap insn
riscv, bpf: Support signed div/mod insns
riscv, bpf: Support 32-bit offset jmp insn
riscv, bpf: Support sign-extension mov insns
riscv, bpf: Support sign-extension load insns
riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
samples/bpf: Cleanup .gitignore
samples/bpf: Remove the xdp_sample_pkts utility
samples/bpf: Remove the xdp1 and xdp2 utilities
samples/bpf: Remove the xdp_rxq_info utility
samples/bpf: Remove the xdp_redirect* utilities
...
====================
Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.6
The second pull request for v6.6, this time with both stack and driver
changes. Unusually we have only one major new feature but lots of
small cleanup all over, I guess this is due to people have been on
vacation the last month.
Major changes:
rtw89
- Introduce Time Averaged SAR (TAS) support
* tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (114 commits)
wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet()
wifi: rtw88: usb: kill and free rx urbs on probe failure
wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar()
wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG
wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips
wifi: rtw89: mac: define register address of rx_filter to generalize code
wifi: rtw89: mac: define internal memory address for WiFi 7 chip
wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory
wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address
wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code
wifi: rtw89: initialize multi-channel handling
wifi: rtw89: provide functions to configure NoA for beacon update
wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif
wifi: rtw89: sar: let caller decide the center frequency to query
wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band()
wifi: rtw89: add function prototype for coex request duration
Fix nomenclature for USB and PCI wireless devices
wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address
wifi: ath12k: Remove unused declarations
wifi: ath12k: add check max message length while scanning with extraie
...
====================
Link: https://lore.kernel.org/r/20230825132230.A0833C433C8@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Introduce HCI_QUIRK_BROKEN_LE_CODED
- Add support for PA/BIG sync
- Add support for NXP IW624 chipset
- Add support for Qualcomm WCN7850
* tag 'for-net-next-2023-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave()
Bluetooth: btusb: Fix quirks table naming
Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED
Bluetooth: btintel: Send new command for PPAG
Bluetooth: ISO: Add support for periodic adv reports processing
Bluetooth: hci_conn: fail SCO/ISO via hci_conn_failed if ACL gone early
Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH
Bluetooth: ISO: Use defer setup to separate PA sync and BIG sync
Bluetooth: qca: add support for WCN7850
Bluetooth: qca: use switch case for soc type behavior
dt-bindings: net: bluetooth: qualcomm: document WCN7850 chipset
Bluetooth: hci_conn: Fix sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL
Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync
Bluetooth: btnxpuart: Improve inband Independent Reset handling
Bluetooth: btnxpuart: Add support for IW624 chipset
Bluetooth: btnxpuart: Remove check for CTS low after FW download
====================
Link: https://lore.kernel.org/r/20230824201458.2577-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"One clk driver fix and two clk framework fixes:
- Fix an OOB access when devm_get_clk_from_child() is used and
devm_clk_release() casts the void pointer to the wrong type
- Move clk_rate_exclusive_{get,put}() within the correct ifdefs in
clk.h so that the stubs are used when CONFIG_COMMON_CLK=n
- Register the proper clk provider function depending on the value of
#clock-cells in the TI keystone driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Fix slab-out-of-bounds error in devm_clk_release()
clk: Fix undefined reference to `clk_rate_exclusive_{get,put}'
clk: keystone: syscon-clk: Fix audio refclk
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct dm_verity_loadpin_trusted_root_digest.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: dm-devel@redhat.com
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Link: https://lore.kernel.org/r/20230817235955.never.762-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
If pinctrl is not available (thus devm_pinctrl_get() returns NULL) then
recovery can't work, because we can't switch the I2C pins between the
I2C controller and GPIO. So, it is quite correct to print
"can't get pinctrl, bus recovery not supported" because the I2C bus
can't be recovered without pinctrl.
The PTR_ERR() is also fine - because if pinctrl is not present and
returns NULL, we'll end up returning zero, which is exactly what we
want.
However, open code that with a more accurate message will be more explicit
for NULL case when CONFIG_PINCTRL is not defined.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
All users of cleanup_symbol_name() do not use the return value.
So let us change the return value of cleanup_symbol_name() to
'void' to reflect its usage pattern.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230825202036.441212-1-yonghong.song@linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
For a device with no Power Management Capability, pci_power_up() previously
returned 0 (success) if the platform was able to put the device in D0,
which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even
though it doesn't exist.
Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually
read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted
dev->current_state. This led to messages like this in some cases:
pci 0000:01:00.0: Refused to change power state from D3hot to D0
To prevent this, make pci_power_up() always return a negative failure code
if the device lacks a Power Management Capability, even if non-PCI platform
power management has been able to put the device in D0. The failure will
prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL.
Fixes: e200904b275c ("PCI/PM: Split pci_power_up()")
Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable@vger.kernel.org # v5.19+
|
|
Testing of virtual Fibre Channel devices under Hyper-V has shown additional
SRB status values being returned for various error cases. Because these
SRB status values are not recognized by storvsc, the I/O operations are not
flagged as an error. Requests are treated as if they completed normally but
with zero data transferred, which can cause a flood of retries.
Add definitions for these SRB status values and handle them like other
error statuses from the Hyper-V host.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1692984084-95105-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently if the SoC needs pinctrl to switch the SCL and SDA from the I2C
function to GPIO function, the recovery won't work.
scl-gpio = <>;
sda-gpio = <>;
Are not enough for some SoCs to have a working recovery.
Some need:
scl-gpio = <>;
sda-gpio = <>;
pinctrl-names = "default", "recovery";
pinctrl-0 = <&i2c_pins_hw>;
pinctrl-1 = <&i2c_pins_gpio>;
The driver was not filling rinfo->pinctrl with the device node
pinctrl data which is needed by generic recovery code.
Signed-off-by: Yann Sionneau <ysionneau@kalray.eu>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
The gcc compiler translates on some architectures the 64-bit
__builtin_clzll() function to a call to the libgcc function __clzdi2(),
which should take a 64-bit parameter on 32- and 64-bit platforms.
But in the current kernel code, the built-in __clzdi2() function is
defined to operate (wrongly) on 32-bit parameters if BITS_PER_LONG ==
32, thus the return values on 32-bit kernels are in the range from
[0..31] instead of the expected [0..63] range.
This patch fixes the in-kernel functions __clzdi2() and __ctzdi2() to
take a 64-bit parameter on 32-bit kernels as well, thus it makes the
functions identical for 32- and 64-bit kernels.
This bug went unnoticed since kernel 3.11 for over 10 years, and here
are some possible reasons for that:
a) Some architectures have assembly instructions to count the bits and
which are used instead of calling __clzdi2(), e.g. on x86 the bsr
instruction and on ppc cntlz is used. On such architectures the
wrong __clzdi2() implementation isn't used and as such the bug has
no effect and won't be noticed.
b) Some architectures link to libgcc.a, and the in-kernel weak
functions get replaced by the correct 64-bit variants from libgcc.a.
c) __builtin_clzll() and __clzdi2() doesn't seem to be used in many
places in the kernel, and most likely only in uncritical functions,
e.g. when printing hex values via seq_put_hex_ll(). The wrong return
value will still print the correct number, but just in a wrong
formatting (e.g. with too many leading zeroes).
d) 32-bit kernels aren't used that much any longer, so they are less
tested.
A trivial testcase to verify if the currently running 32-bit kernel is
affected by the bug is to look at the output of /proc/self/maps:
Here the kernel uses a correct implementation of __clzdi2():
root@debian:~# cat /proc/self/maps
00010000-00019000 r-xp 00000000 08:05 787324 /usr/bin/cat
00019000-0001a000 rwxp 00009000 08:05 787324 /usr/bin/cat
0001a000-0003b000 rwxp 00000000 00:00 0 [heap]
f7551000-f770d000 r-xp 00000000 08:05 794765 /usr/lib/hppa-linux-gnu/libc.so.6
...
and this kernel uses the broken implementation of __clzdi2():
root@debian:~# cat /proc/self/maps
0000000010000-0000000019000 r-xp 00000000 000000008:000000005 787324 /usr/bin/cat
0000000019000-000000001a000 rwxp 000000009000 000000008:000000005 787324 /usr/bin/cat
000000001a000-000000003b000 rwxp 00000000 00:00 0 [heap]
00000000f73d1000-00000000f758d000 r-xp 00000000 000000008:000000005 794765 /usr/lib/hppa-linux-gnu/libc.so.6
...
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 4df87bb7b6a22 ("lib: add weak clz/ctz functions")
Cc: Chanho Min <chanho.min@lge.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add support for extended length of read and write transactions.
New FPGA logic allows to increase size of the read and write
transactions length. This feature is verified through capability
register 'CPBLTY_REG'. Two bits 5 and 6 of the register are used for
length capability detection. Value '10' indicates support of extended
transaction length - 128 bytes for read transactions and 132 for write
transactions.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Extend driver dependency by ARM64 architecture.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Instead of a if condition with a line split, use the usual error
handling pattern with a separate variable to improve readability.
No functional changes intended.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Instead of if conditions with line splits, use the usual error handling
pattern with a separate variable to improve readability.
No functional changes intended.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
devm_clk_bulk_get_all() can return zero when no clocks are obtained.
Passing zero to dev_err_probe() is a success which is incorrect.
Fixes: 605efbf43813 ("i2c: qcom-cci: Use dev_err_probe in probe function")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Merge devfreq changes and power management tools changes for 6.6-rc1:
- Fix memory leak in devfreq_dev_release() (Boris Brezillon).
- Rewrite devfreq_monitor_start() kerneldoc comment (Manivannan
Sadhasivam).
- Explicitly include correct DT includes in devfreq (Rob Herring).
- Add turbo-boost support to cpupower (Wyes Karny).
- Add support for amd_pstate mode change to cpupower (Wyes Karny).
- Fix 'cpupower idle_set' command to accept only numeric values of
arguments (Likhitha Korrapati).
* pm-devfreq:
PM / devfreq: Fix leak in devfreq_dev_release()
PM / devfreq: Reword the kernel-doc comment for devfreq_monitor_start() API
PM / devfreq: Explicitly include correct DT includes
* pm-tools:
cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.
cpupower: Add turbo-boost support in cpupower
cpupower: Add support for amd_pstate mode change
cpupower: Add EPP value change support
cpupower: Add is_valid_path API
cpupower: Recognise amd-pstate active mode driver
cpupower: Bump soname version
|
|
Merge system-wide power management changes and power capping updates
for 6.6-rc1:
- Add device PM helpers to allow a device to remain powered-on during
system-wide transitions (Ulf Hansson).
- Rework hibernation memory snapshotting to avoid storing pages filled
with zeros in hibernation image files (Brian Geffon).
- Add check to make sure that CPU latency QoS constraints do not use
negative values (Clive Lin).
- Optimize rp->domains memory allocation in the Intel RAPL power
capping driver (xiongxin).
- Remove recursion while parsing zones in the arm_scmi power capping
driver (Cristian Marussi).
* pm-sleep:
PM: sleep: Add helpers to allow a device to remain powered-on
PM: hibernate: don't store zero pages in the image file
* pm-qos:
PM: QoS: Add check to make sure CPU latency is non-negative
* powercap:
powercap: intel_rapl: Optimize rp->domains memory allocation
powercap: arm_scmi: Remove recursion while parsing zones
|
|
Merge CPU power management updates for 6.6-rc1:
- Rework the menu and teo cpuidle governors to avoid calling
tick_nohz_get_sleep_length(), which is likely to become quite
expensive going forward, too often and improve making decisions
regarding whether or not to stop the scheduler tick in the teo
governor (Rafael Wysocki).
- Improve the performance of cpufreq_stats_create_table() in some
cases (Liao Chang).
- Fix two issues in the amd-pstate-ut cpufreq driver (Swapnil Sapkal).
- Use clamp() helper macro to improve the code readability in
cpufreq_verify_within_limits() (Liao Chang).
- Set stale CPU frequency to minimum in intel_pstate (Doug Smythies).
* pm-cpuidle:
cpuidle: teo: Avoid unnecessary variable assignments
cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases
cpuidle: teo: Gather statistics regarding whether or not to stop the tick
cpuidle: teo: Skip tick_nohz_get_sleep_length() call in some cases
cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront
cpuidle: teo: Drop utilized from struct teo_cpu
cpuidle: teo: Avoid stopping the tick unnecessarily when bailing out
cpuidle: teo: Update idle duration estimate when choosing shallower state
* pm-cpufreq:
cpufreq: amd-pstate-ut: Fix kernel panic when loading the driver
cpufreq: amd-pstate-ut: Remove module parameter access
cpufreq: Use clamp() helper macro to improve the code readability
cpufreq: intel_pstate: set stale CPU frequency to minimum
cpufreq: stats: Improve the performance of cpufreq_stats_create_table()
|
|
Merge a PNP change for 6.6-rc1:
- Fix string truncation warning in pnpacpi_add_device() (Sunil V L).
* pnp:
PNP: ACPI: Fix string truncation warning
|
|
Merge ACPI power management updates for 6.6-rc1:
- Fix and clean up suspend-to-idle interface for AMD systems (Mario
Limonciello, Andy Shevchenko).
* acpi-pm:
ACPI: x86: s2idle: Add a function to get LPS0 constraint for a device
ACPI: x86: s2idle: Add for_each_lpi_constraint() helper
ACPI: x86: s2idle: Add more debugging for AMD constraints parsing
ACPI: x86: s2idle: Fix a logic error parsing AMD constraints table
ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects
ACPI: x86: s2idle: Post-increment variables when getting constraints
ACPI: Adjust #ifdef for *_lps0_dev use
|
|
Merge ACPI device enumeration changes, ACPI TAD and extlog drivers
updates, and miscellaneous ACPI-related changes for 6.6-rc1:
- Defer enumeration of devices with _DEP pointing to IVSC (Wentong Wu).
- Install SystemCMOS address space handler for ACPI000E (TAD) to meet
platform firmware expectations on some platforms (Zhang Rui).
- Fix finding the generic error data in the ACPi extlog driver for
compatibility with old and new firmware interface versions (Xiaochun
Lee).
- Remove assorted unused declarations of functions (Yue Haibing).
- Move AMBA bus scan handling into arm64 specific directory (Sudeep
Holla).
* acpi-scan:
ACPI: scan: Defer enumeration of devices with a _DEP pointing to IVSC device
* acpi-tad:
ACPI: TAD: Install SystemCMOS address space handler for ACPI000E
* acpi-extlog:
ACPI: extlog: Fix finding the generic error data for v3 structure
* acpi-misc:
ACPI: Remove assorted unused declarations of functions
ACPI: Remove unused extern declaration acpi_paddr_to_node()
ACPI: Move AMBA bus scan handling into arm64 specific directory
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
issues or aren't considered suitable for a -stable backport"
* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
shmem: fix smaps BUG sleeping while atomic
selftests: cachestat: catch failing fsync test on tmpfs
selftests: cachestat: test for cachestat availability
maple_tree: disable mas_wr_append() when other readers are possible
madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
mm: multi-gen LRU: don't spin during memcg release
mm: memory-failure: fix unexpected return value in soft_offline_page()
radix tree: remove unused variable
mm: add a call to flush_cache_vmap() in vmap_pfn()
selftests/mm: FOLL_LONGTERM need to be updated to 0x100
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
selftests: cgroup: fix test_kmem_basic less than error
mm: enable page walking API to lock vmas during the walk
smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
|
|
Merge ACPI thermal driver changes for 6.6-rc1:
- Drop non-functional nocrt parameter from ACPI thermal (Mario
Limonciello).
- Clean up the ACPI thermal driver, rework the handling of firmware
notifications in it and make it provide a table of generic trip point
structures to the core during initialization (Rafael Wysocki).
* acpi-thermal:
ACPI: thermal: Eliminate code duplication from acpi_thermal_notify()
ACPI: thermal: Drop unnecessary thermal zone callbacks
ACPI: thermal: Rework thermal_get_trend()
ACPI: thermal: Use trip point table to register thermal zones
thermal: core: Rework and rename __for_each_thermal_trip()
ACPI: thermal: Introduce struct acpi_thermal_trip
ACPI: thermal: Carry out trip point updates under zone lock
ACPI: thermal: Clean up acpi_thermal_register_thermal_zone()
thermal: core: Add priv pointer to struct thermal_trip
thermal: core: Introduce thermal_zone_device_exec()
thermal: core: Do not handle trip points with invalid temperature
ACPI: thermal: Drop redundant local variable from acpi_thermal_resume()
ACPI: thermal: Do not attach private data to ACPI handles
ACPI: thermal: Drop enabled flag from struct acpi_thermal_active
ACPI: thermal: Drop nocrt parameter
|
|
Merge ACPI processor driver changes for 6.6-rc1:
- Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao).
- Convert ACPI CPU initialization to using _OSC instead of _PDC that
has been depreceted since 2018 and dropped from the specification in
ACPI 6.5 (Michal Wilczynski, Rafael Wysocki).
* acpi-processor:
ACPI: processor: LoongArch: Get physical ID from MADT
ACPI: processor: Refine messages in acpi_early_processor_control_setup()
ACPI: processor: Remove acpi_hwp_native_thermal_lvt_osc()
ACPI: processor: Use _OSC to convey OSPM processor support information
ACPI: processor: Introduce acpi_processor_osc()
ACPI: processor: Set CAP_SMP_T_SWCOORD in arch_acpi_set_proc_cap_bits()
ACPI: processor: Clear C_C2C3_FFH and C_C1_FFH in arch_acpi_set_proc_cap_bits()
ACPI: processor: Rename ACPI_PDC symbols
ACPI: processor: Refactor arch_acpi_set_pdc_bits()
ACPI: processor: Move processor_physically_present() to acpi_processor.c
ACPI: processor: Move MWAIT quirk out of acpi_processor.c
|
|
Add support for sa8775p SoC that uses controller version 5.90
reusing the 1.9.0 config.
Link: https://lore.kernel.org/linux-pci/1689960276-29266-3-git-send-email-quic_msarkar@quicinc.com
Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
|
|
Add sa8775p platform to the binding.
Link: https://lore.kernel.org/linux-pci/1689960276-29266-2-git-send-email-quic_msarkar@quicinc.com
Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
|
|
Merge changes related to the ACPI bus type and ACPI backlight driver
changes for 6.6-rc1:
- Introduce new wrappers for ACPICA notify handler install/remove and
convert multiple drivers to using their own Notify() handlers instead
of the ACPI bus type .notify() slated for removal (Michal Wilczynski).
- Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2 (Hans
de Goede).
- Put ACPI video and its child devices explicitly into D0 on boot to
avoid platform firmware confusion (Kai-Heng Feng).
- Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby).
* acpi-bus:
ACPI: thermal: Install Notify() handler directly
ACPI: NFIT: Remove unnecessary .remove callback
ACPI: NFIT: Install Notify() handler directly
ACPI: HED: Install Notify() handler directly
ACPI: battery: Install Notify() handler directly
ACPI: video: Install Notify() handler directly
ACPI: AC: Install Notify() handler directly
ACPI: bus: Set driver_data to NULL every time .add() fails
ACPI: bus: Introduce wrappers for ACPICA notify handler install/remove
* acpi-video:
ACPI: video: Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2
ACPI: video: Put ACPI video and its child devices into D0 on boot
ACPI: video: Add backlight=native DMI quirk for Lenovo Ideapad Z470
|
|
Merge ACPICA material for 6.6-rc1.
This includes some fixes, cleanups and new material, mostly related to
parsing tables.
Specifics:
- Suppress a GCC 12 dangling-pointer warning (Philip Prindeville).
- Reformat the ACPI_STATE_COMMON macro and its users (George Guo).
- Replace the ternary operator with ACPI_MIN() (Jiangshan Yi).
- Add support for _DSC as per ACPI 6.5 (Saket Dumbre).
- Remove a duplicate macro from zephyr header (Najumon B.A).
- Add data structures for GED and _EVT tracking (Jose Marinho).
- Fix misspelled CDAT DSMAS define (Dave Jiang).
- Simplify an error message in acpi_ds_result_push() (Christophe
Jaillet).
- Add a struct size macro related to SRAT (Dave Jiang).
- Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar).
- Add support for RISC-V external interrupt controllers in MADT (Sunil
V L).
- Add RHCT flags, CMO and MMU nodes (Sunil V L).
- Change ACPICA version to 20230628 (Bob Moore).
* acpica:
ACPICA: Update version to 20230628
ACPICA: RHCT: Add flags, CMO and MMU nodes
ACPICA: MADT: Add RISC-V external interrupt controllers
ACPICA: Add AML_NO_OPERAND_RESOLVE flag to Timer
ACPICA: Add a define for size of struct acpi_srat_generic_affinity device_handle
ACPICA: Slightly simplify an error message in acpi_ds_result_push()
ACPICA: Fix misspelled CDAT DSMAS define
ACPICA: Add interrupt command to acpiexec
ACPICA: Detect GED device and keep track of _EVT
ACPICA: fix for conflict macro definition on zephyr interface
ACPICA: Add support for _DSC as per ACPI 6.5
ACPICA: exserial.c: replace ternary operator with ACPI_MIN()
ACPICA: Modify ACPI_STATE_COMMON
ACPICA: Fix GCC 12 dangling-pointer warning
|
|
Kernel test robot reported a kallsyms_test failure when clang lto is
enabled (thin or full) and CONFIG_KALLSYMS_SELFTEST is also enabled.
I can reproduce in my local environment with the following error message
with thin lto:
[ 1.877897] kallsyms_selftest: Test for 1750th symbol failed: (tsc_cs_mark_unstable) addr=ffffffff81038090
[ 1.877901] kallsyms_selftest: abort
It appears that commit 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes
from promoted global functions") caused the failure. Commit 8cc32a9bbf29
changed cleanup_symbol_name() based on ".llvm." instead of '.' where
".llvm." is appended to a before-lto-optimization local symbol name.
We need to propagate such knowledge in kallsyms_selftest.c as well.
Further more, compare_symbol_name() in kallsyms.c needs change as well.
In scripts/kallsyms.c, kallsyms_names and kallsyms_seqs_of_names are used
to record symbol names themselves and index to symbol names respectively.
For example:
kallsyms_names:
...
__amd_smn_rw._entry <== seq 1000
__amd_smn_rw._entry.5 <== seq 1001
__amd_smn_rw.llvm.<hash> <== seq 1002
...
kallsyms_seqs_of_names are sorted based on cleanup_symbol_name() through, so
the order in kallsyms_seqs_of_names actually has
index 1000: seq 1002 <== __amd_smn_rw.llvm.<hash> (actual symbol comparison using '__amd_smn_rw')
index 1001: seq 1000 <== __amd_smn_rw._entry
index 1002: seq 1001 <== __amd_smn_rw._entry.5
Let us say at a particular point, at index 1000, symbol '__amd_smn_rw.llvm.<hash>'
is comparing to '__amd_smn_rw._entry' where '__amd_smn_rw._entry' is the one to
search e.g., with function kallsyms_on_each_match_symbol(). The current implementation
will find out '__amd_smn_rw._entry' is less than '__amd_smn_rw.llvm.<hash>' and
then continue to search e.g., index 999 and never found a match although the actual
index 1001 is a match.
To fix this issue, let us do cleanup_symbol_name() first and then do comparison.
In the above case, comparing '__amd_smn_rw' vs '__amd_smn_rw._entry' and
'__amd_smn_rw._entry' being greater than '__amd_smn_rw', the next comparison will
be > index 1000 and eventually index 1001 will be hit an a match is found.
For any symbols not having '.llvm.' substr, there is no functionality change
for compare_symbol_name().
Fixes: 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes from promoted global functions")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202308232200.1c932a90-oliver.sang@intel.com
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Song Liu <song@kernel.org>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230825034659.1037627-1-yonghong.song@linux.dev
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Qcom PCIe EP controllers have 4K alignment restriction for the outbound
window address. Hence, pass this info to the EPF core so that the EPF
drivers can make use of this info.
Link: https://lore.kernel.org/linux-pci/20230717065459.14138-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
|
|
Use the finish zone command first when a zone should be closed.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"This is obviously not ideal, particularly for something this late in
the cycle.
Unfortunately we found some uABI issues in the vector support while
reviewing the GDB port, which has triggered a revert -- probably a
good sign we should have reviewed GDB before merging this, I guess I
just dropped the ball because I was so worried about the context
extension and libc suff I forgot. Hence the late revert.
There's some risk here as we're still exposing the vector context for
signal handlers, but changing that would have meant reverting all of
the vector support. The issues we've found so far have been fixed
already and they weren't absolute showstoppers, so we're essentially
just playing it safe by holding ptrace support for another release (or
until we get through a proper userspace code review).
Summary:
- The vector ucontext extension has been extended with vlenb
- The vector registers ELF core dump note type has been changed to
avoid aliasing with the CSR type used in embedded systems
- Support for accessing vector registers via ptrace() has been
reverted
- Another build fix for the ISA spec changes around Zifencei/Zicsr
that manifests on some systems built with binutils-2.37 and
gcc-11.2"
* tag 'riscv-for-linus-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix build errors using binutils2.37 toolchains
RISC-V: vector: export VLENB csr in __sc_riscv_v_state
RISC-V: Remove ptrace support for vectors
|
|
Dave Marchevsky says:
====================
BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes
This series is the third of three (or more) followups to address issues
in the bpf_refcount shared ownership implementation discovered by Kumar.
This series addresses the use-after-free scenario described in [0]. The
first followup series ([1]) also attempted to address the same
use-after-free, but only got rid of the splat without addressing the
underlying issue. After this series the underyling issue is fixed and
bpf_refcount_acquire can be re-enabled.
The main fix here is migration of bpf_obj_drop to use
bpf_mem_free_rcu. To understand why this fixes the issue, let us consider
the example interleaving provided by Kumar in [0]:
CPU 0 CPU 1
n = bpf_obj_new
lock(lock1)
bpf_rbtree_add(rbtree1, n)
m = bpf_rbtree_acquire(n)
unlock(lock1)
kptr_xchg(map, m) // move to map
// at this point, refcount = 2
m = kptr_xchg(map, NULL)
lock(lock2)
lock(lock1) bpf_rbtree_add(rbtree2, m)
p = bpf_rbtree_first(rbtree1) if (!RB_EMPTY_NODE) bpf_obj_drop_impl(m) // A
bpf_rbtree_remove(rbtree1, p)
unlock(lock1)
bpf_obj_drop(p) // B
bpf_refcount_acquire(m) // use-after-free
...
Before this series, bpf_obj_drop returns memory to the allocator using
bpf_mem_free. At this point (B in the example) there might be some
non-owning references to that memory which the verifier believes are valid,
but where the underlying memory was reused for some other allocation.
Commit 7793fc3babe9 ("bpf: Make bpf_refcount_acquire fallible for
non-owning refs") attempted to fix this by doing refcount_inc_non_zero
on refcount_acquire in instead of refcount_inc under the assumption that
preventing erroneous incr-on-0 would be sufficient. This isn't true,
though: refcount_inc_non_zero must *check* if the refcount is zero, and
the memory it's checking could have been reused, so the check may look
at and incr random reused bytes.
If we wait to reuse this memory until all non-owning refs that could
point to it are gone, there is no possibility of this scenario
happening. Migrating bpf_obj_drop to use bpf_mem_free_rcu for refcounted
nodes accomplishes this.
For such nodes, the validity of their underlying memory is now tied to
RCU critical section. This matches MEM_RCU trustedness
expectations, so the series takes the opportunity to more explicitly
mark this trustedness state.
The functional effects of trustedness changes here are rather small.
This is largely due to local kptrs having separate verifier handling -
with implicit trustedness assumptions - than arbitrary kptrs.
Regardless, let's take the opportunity to move towards a world where
trustedness is more explicitly handled.
Changelog:
v1 -> v2: https://lore.kernel.org/bpf/20230801203630.3581291-1-davemarchevsky@fb.com/
Patch 1 ("bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire")
* Spent some time experimenting with a better approach as per convo w/
Yonghong on v1's patch. It started getting too complex, so left unchanged
for now. Yonghong was fine with this approach being shipped.
Patch 2 ("bpf: Consider non-owning refs trusted")
* Add Yonghong ack
Patch 3 ("bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes")
* Add Yonghong ack
Patch 4 ("bpf: Reenable bpf_refcount_acquire")
* Add Yonghong ack
Patch 5 ("bpf: Consider non-owning refs to refcounted nodes RCU protected")
* Undo a nonfunctional whitespace change that shouldn't have been included
(Yonghong)
* Better logging message when complaining about rcu_read_{lock,unlock} in
rbtree cb (Alexei)
* Don't invalidate_non_owning_refs when processing bpf_rcu_read_unlock
(Yonghong, Alexei)
Patch 6 ("[RFC] bpf: Allow bpf_spin_{lock,unlock} in sleepable prog's RCU CS")
* preempt_{disable,enable} in __bpf_spin_{lock,unlock} (Alexei)
* Due to this we can consider spin_lock CS an RCU-sched read-side CS (per
RCU/Design/Requirements/Requirements.rst). Modify in_rcu_cs accordingly.
* no need to check for !in_rcu_cs before allowing bpf_spin_{lock,unlock}
(Alexei)
* RFC tag removed and renamed to "bpf: Allow bpf_spin_{lock,unlock} in
sleepable progs"
Patch 7 ("selftests/bpf: Add tests for rbtree API interaction in sleepable progs")
* Remove "no explicit bpf_rcu_read_lock" failure test, add similar success
test (Alexei)
Summary of patch contents, with sub-bullets being leading questions and
comments I think are worth reviewer attention:
* Patches 1 and 2 are moreso documententation - and
enforcement, in patch 1's case - of existing semantics / expectations
* Patch 3 changes bpf_obj_drop behavior for refcounted nodes such that
their underlying memory is not reused until RCU grace period elapses
* Perhaps it makes sense to move to mem_free_rcu for _all_
non-owning refs in the future, not just refcounted. This might
allow custom non-owning ref lifetime + invalidation logic to be
entirely subsumed by MEM_RCU handling. IMO this needs a bit more
thought and should be tackled outside of a fix series, so it's not
attempted here.
* Patch 4 re-enables bpf_refcount_acquire as changes in patch 3 fix
the remaining use-after-free
* One might expect this patch to be last in the series, or last
before selftest changes. Patches 5 and 6 don't change
verification or runtime behavior for existing BPF progs, though.
* Patch 5 brings the verifier's understanding of refcounted node
trustedness in line with Patch 4's changes
* Patch 6 allows some bpf_spin_{lock, unlock} calls in sleepable
progs. Marked RFC for a few reasons:
* bpf_spin_{lock,unlock} haven't been usable in sleepable progs
since before the introduction of bpf linked list and rbtree. As
such this feels more like a new feature that may not belong in
this fixes series.
* Patch 7 adds tests
[0]: https://lore.kernel.org/bpf/atfviesiidev4hu53hzravmtlau3wdodm2vqs7rd7tnwft34e3@xktodqeqevir/
[1]: https://lore.kernel.org/bpf/20230602022647.1571784-1-davemarchevsky@fb.com/
====================
Link: https://lore.kernel.org/r/20230821193311.3290257-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Confirm that the following sleepable prog states fail verification:
* bpf_rcu_read_unlock before bpf_spin_unlock
* RCU CS will last at least as long as spin_lock CS
Also confirm that correct usage passes verification, specifically:
* Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
* Implied RCU CS due to spin_lock CS
None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|