summaryrefslogtreecommitdiff
path: root/kernel/locking
AgeCommit message (Collapse)Author
2024-08-08Merge tag 'bcachefs-2024-08-08' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "Assorted little stuff: - lockdep fixup for lockdep_set_notrack_class() - we can now remove a device when using erasure coding without deadlocking, though we still hit other issues - the 'allocator stuck' timeout is now configurable, and messages are ratelimited. The default timeout has been increased from 10 seconds to 30" * tag 'bcachefs-2024-08-08' of git://evilpiepirate.org/bcachefs: bcachefs: Use bch2_wait_on_allocator() in btree node alloc path bcachefs: Make allocator stuck timeout configurable, ratelimit messages bcachefs: Add missing path_traverse() to btree_iter_next_node() bcachefs: ec should not allocate from ro devs bcachefs: Improved allocator debugging for ec bcachefs: Add missing bch2_trans_begin() call bcachefs: Add a comment for bucket helper types bcachefs: Don't rely on implicit unsigned -> signed integer conversion lockdep: Fix lockdep_set_notrack_class() for CONFIG_LOCK_STAT bcachefs: Fix double free of ca->buckets_nouse
2024-08-07lockdep: Fix lockdep_set_notrack_class() for CONFIG_LOCK_STATKent Overstreet
We won't find a contended lock if it's not being tracked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-29locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()Uros Bizjak
"enum vcpu_state" is not compatible with "u8" type for all targets, resulting in: error: initialization of 'u8 *' {aka 'unsigned char *'} from incompatible pointer type 'enum vcpu_state *' for LoongArch. Correct the type of "old" variable to "u8". Fixes: fea0e1820b51 ("locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h") Closes: https://lore.kernel.org/lkml/20240719024010.3296488-1-maobibo@loongson.cn/ Reported-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20240721164552.50175-1-ubizjak@gmail.com
2024-07-18Merge tag 'bcachefs-2024-07-18.2' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull bcachefs updates from Kent Overstreet: - Metadata version 1.8: Stripe sectors accounting, BCH_DATA_unstriped This splits out the accounting of dirty sectors and stripe sectors in alloc keys; this lets us see stripe buckets that still have unstriped data in them. This is needed for ensuring that erasure coding is working correctly, as well as completing stripe creation after a crash. - Metadata version 1.9: Disk accounting rewrite The previous disk accounting scheme relied heavily on percpu counters that were also sharded by outstanding journal buffer; it was fast but not extensible or scalable, and meant that all accounting counters were recorded in every journal entry. The new disk accounting scheme stores accounting as normal btree keys; updates are deltas until they are flushed by the btree write buffer. This means we have no practical limit on the number of counters, and a new tagged union format that's easy to extend. We now have counters for compression type/ratio, per-snapshot-id usage, per-btree-id usage, and pending rebalance work. - Self healing on read IO/checksum error Data is now automatically rewritten if we get a read error and then a successful retry - Mount API conversion (thanks to Thomas Bertschinger) - Better lockdep coverage Previously, btree node locks were tracked individually by lockdep, like any other lock. But we may take _many_ btree node locks simultaneously, we easily blow through the limit of 48 locks that lockdep can track, leading to lockdep turning itself off. Tracking each btree node lock individually isn't really necessary since we have our own cycle detector for deadlock avoidance and centralized tracking of btree node locks, so we now have a single lockdep_map in btree_trans for "any btree nodes are locked". - Some more small incremental work towards online check_allocations - Lots more debugging improvements - Fixes, including: - undefined behaviour fixes, originally noted as breaking userspace LTO builds - fix a spurious warning in fsck_err, reported by Marcin - fix an integer overflow on trans->nr_updates, also reported by Marcin; this broke during deletion of highly fragmented indirect extents * tag 'bcachefs-2024-07-18.2' of https://evilpiepirate.org/git/bcachefs: (120 commits) lockdep: Add comments for lockdep_set_no{validate,track}_class() bcachefs: Fix integer overflow on trans->nr_updates bcachefs: silence silly kdoc warning bcachefs: Fix fsck warning about btree_trans not passed to fsck error bcachefs: Add an error message for insufficient rw journal devs bcachefs: varint: Avoid left-shift of a negative value bcachefs: darray: Don't pass NULL to memcpy() bcachefs: Kill bch2_assert_btree_nodes_not_locked() bcachefs: Rename BCH_WRITE_DONE -> BCH_WRITE_SUBMITTED bcachefs: __bch2_read(): call trans_begin() on every loop iter bcachefs: show none if label is not set bcachefs: drop packed, aligned from bkey_inode_buf bcachefs: btree node scan: fall back to comparing by journal seq bcachefs: Add lockdep support for btree node locks lockdep: lockdep_set_notrack_class() bcachefs: Improve copygc_wait_to_text() bcachefs: Convert clock code to u64s bcachefs: Improve startup message bcachefs: Self healing on read IO error bcachefs: Make read_only a mount option again, but hidden ...
2024-07-16Merge tag 'net-next-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Not much excitement - a handful of large patchsets (devmem among them) did not make it in time. Core & protocols: - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT - Use flex array for netdevice priv area, ensure its cache alignment - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful - Support scheduling transmission of packets based on CLOCK_TAI - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync - Improve TCP compliance with RFC 9293 for simultaneous connect() - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled - Introduce IPPROTO_SMC for selecting SMC when socket is created - Allow UDP GSO transmit from devices with no checksum offload - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding - nf_tables: shrink memory consumption for transaction objects Things we sprinkled into general kernel code: - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390) [ Already merged separately - Linus ] - Add IRQ information in sysfs for auxiliary bus - Introduce guard definition for local_lock - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures BPF: - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered - Add new kfuncs for a generic, open-coded bits iterator - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs Driver API: - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits - Track additional RSS contexts in the core, make sure configuration changes don't break them - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths - Support updating firmware on SFP modules Tests and tooling: - mptcp: use net/lib.sh to manage netns - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints - openvswitch: make test self-contained (don't depend on OvS CLI tools) Drivers: - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591" * tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits) eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering" tcp: Replace strncpy() with strscpy() wifi: ath12k: fix build vs old compiler tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child(). eth: fbnic: Write the TCAM tables used for RSS control and Rx to host eth: fbnic: Add L2 address programming eth: fbnic: Add basic Rx handling eth: fbnic: Add basic Tx handling eth: fbnic: Add link detection eth: fbnic: Add initial messaging to notify FW of our presence eth: fbnic: Implement Rx queue alloc/start/stop/free eth: fbnic: Implement Tx queue alloc/start/stop/free eth: fbnic: Allocate a netdevice and napi vectors with queues eth: fbnic: Add FW communication mechanism eth: fbnic: Add message parsing for FW messages eth: fbnic: Add register init to set PCIe/Ethernet device config eth: fbnic: Allocate core device specific structures and devlink interface eth: fbnic: Add scaffolding for Meta's NIC driver PCI: Add Meta Platforms vendor ID net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK ...
2024-07-16Merge tag 'locking-core-2024-07-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Jump label fixes, including a perf events fix that originally manifested as jump label failures, but was a serialization bug at the usage site - Mark down_write*() helpers as __always_inline, to improve WCHAN debuggability - Misc cleanups and fixes * tag 'locking-core-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Add __always_inline annotation to __down_write_common() and inlined callers jump_label: Simplify and clarify static_key_fast_inc_cpus_locked() jump_label: Clarify condition in static_key_fast_inc_not_disabled() jump_label: Fix concurrency issues in static_key_slow_dec() perf/x86: Serialize set_attr_rdpmc() cleanup: Standardize the header guard define's name
2024-07-16Merge tag 'sysctl-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Remove "->procname == NULL" check when iterating through sysctl table arrays Removing sentinels in ctl_table arrays reduces the build time size and runtime memory consumed by ~64 bytes per array. With all ctl_table sentinels gone, the additional check for ->procname == NULL that worked in tandem with the ARRAY_SIZE to calculate the size of the ctl_table arrays is no longer needed and has been removed. The sysctl register functions now returns an error if a sentinel is used. - Preparation patches for sysctl constification Constifying ctl_table structs prevents the modification of proc_handler function pointers as they would reside in .rodata. The ctl_table arguments in sysctl utility functions are const qualified in preparation for a future treewide proc_handler argument constification commit. - Misc fixes Increase robustness of set_ownership by providing sane default ownership values in case the callee doesn't set them. Bound check proc_dou8vec_minmax to avoid loading buggy modules and give sysctl testing module a name to avoid compiler complaints. * tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: Warn on an empty procname element sysctl: Remove ctl_table sentinel code comments sysctl: Remove "child" sysctl code comments sysctl: Remove superfluous empty allocations from sysctl internals sysctl: Replace nr_entries with ctl_table_size in new_links sysctl: Remove check for sentinel element in ctl_table arrays mm profiling: Remove superfluous sentinel element from ctl_table locking: Remove superfluous sentinel element from kern_lockdep_table sysctl: Add module description to sysctl-testing sysctl: constify ctl_table arguments of utility function utsname: constify ctl_table arguments of utility function sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array sysctl: always initialize i_uid/i_gid
2024-07-16Merge tag 'for-linus-6.11-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - some trivial cleanups - a fix for the Xen timer - add boot time selectable debug capability to the Xen multicall handling - two fixes for the recently added Xen irqfd handling * tag 'for-linus-6.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove deprecated xen_nopvspin boot parameter x86/xen: eliminate some private header files x86/xen: make some functions static xen: make multicall debug boot time selectable xen/arm: Convert comma to semicolon xen: privcmd: Fix possible access to a freed kirqfd instance xen: privcmd: Switch from mutex to spinlock for irqfds xen: add missing MODULE_DESCRIPTION() macros x86/xen: Convert comma to semicolon x86/xen/time: Reduce Xen timer tick xen/manage: Constify struct shutdown_handler
2024-07-14lockdep: lockdep_set_notrack_class()Kent Overstreet
Add a new helper to disable lockdep tracking entirely for a given class. This is needed for bcachefs, which takes too many btree node locks for lockdep to track. Instead, we have a single lockdep_map for "btree_trans has any btree nodes locked", which makes more since given that we have centralized lock management and a cycle detector. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11x86/xen: remove deprecated xen_nopvspin boot parameterJuergen Gross
The xen_nopvspin boot parameter is deprecated since 2019. nopvspin can be used instead. Remove the xen_nopvspin boot parameter and replace the xen_pvspin variable use cases with nopvspin. This requires to move the nopvspin variable out of the .initdata section, as it needs to be accessed for cpuhotplug, too. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Message-ID: <20240710110139.22300-1-jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2024-07-09locking/rwsem: Add __always_inline annotation to __down_write_common() and ↵John Stultz
inlined callers Apparently despite it being marked inline, the compiler may not inline __down_write_common() which makes it difficult to identify the cause of lock contention, as the wchan of the blocked function will always be listed as __down_write_common(). So add __always_inline annotation to the common function (as well as the inlined helper callers) to force it to be inlined so a more useful blocking function will be listed (via wchan). This mirrors commit 92cc5d00a431 ("locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers") which did the same for __down_read_common. I sort of worry that I'm playing wack-a-mole here, and talking with compiler people, they tell me inline means nothing, which makes me want to cry a little. So I'm wondering if we need to replace all the inlines with __always_inline, or remove them because either we mean something by it, or not. Fixes: c995e638ccbb ("locking/rwsem: Fold __down_{read,write}*()") Reported-by: Tim Murray <timmurray@google.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20240709060831.495366-1-jstultz@google.com
2024-06-24locking/local_lock: Add local nested BH locking infrastructure.Sebastian Andrzej Siewior
Add local_lock_nested_bh() locking. It is based on local_lock_t and the naming follows the preempt_disable_nested() example. For !PREEMPT_RT + !LOCKDEP it is a per-CPU annotation for locking assumptions based on local_bh_disable(). The macro is optimized away during compilation. For !PREEMPT_RT + LOCKDEP the local_lock_nested_bh() is reduced to the usual lock-acquire plus lockdep_assert_in_softirq() - ensuring that BH is disabled. For PREEMPT_RT local_lock_nested_bh() acquires the specified per-CPU lock. It does not disable CPU migration because it relies on local_bh_disable() disabling CPU migration. With LOCKDEP it performans the usual lockdep checks as with !PREEMPT_RT. Due to include hell the softirq check has been moved spinlock.c. The intention is to use this locking in places where locking of a per-CPU variable relies on BH being disabled. Instead of treating disabled bottom halves as a big per-CPU lock, PREEMPT_RT can use this to reduce the locking scope to what actually needs protecting. A side effect is that it also documents the protection scope of the per-CPU variables. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-3-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13locking: Remove superfluous sentinel element from kern_lockdep_tableJoel Granados
This commit is part of a greater effort to remove all empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-05-30locktorture: Add MODULE_DESCRIPTION()Jeff Johnson
Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in kernel/locking/locktorture.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2024-05-22Merge tag 'leds-next-6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds Pull LED updates from Lee Jones: "Core Frameworks: - Ensure seldom updated triggers have a brightness value before first update New Device Support: - Add support for Simatic IPC Device BX_59A to IPC LEDs Core - Add support for Qualcomm PMI8950 PWM to LPG Core New Functionality: - Add a bunch of new LED function identifiers - Add support for High Resolution Timers in LED Trigger Patten Fix-ups: - Shift out Audio Trigger to the Sound subsystem - Convert suitable calls to devm_* managed resources - Device Tree binding adaptions/conversions/creation - Remove superfluous code/variables/attributes and simplify overall - Use/convert to new/better APIs/helpers/MACROs instead of hand-rolling implementations Bug Fixes: - Repair enabling Torch Mode from V4L2 on the second LED - Ensure PWM is disabled when suspending" * tag 'leds-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (28 commits) leds: mt6370: Remove unused field 'reg_cfgs' from 'struct mt6370_priv' leds: lp50xx: Remove unused field 'num_of_banked_leds' from 'struct lp50xx' leds: lp50xx: Remove unused field 'bank_modules' from 'struct lp50xx_led' leds: aat1290: Remove unused field 'torch_brightness' from 'struct aat1290_led' leds: sun50i-a100: Use match_string() helper to simplify the code leds: pwm: Disable PWM when going to suspend leds: trigger: pattern: Add support for hrtimer leds: mt6360: Fix the second LED can not enable torch mode by V4L2 dt-bindings: leds: leds-qcom-lpg: Add support for PMI8950 PWM leds: qcom-lpg: Add support for PMI8950 PWM leds: apu: Remove duplicate DMI lookup data leds: trigger: netdev: Remove not needed call to led_set_brightness in deactivate dt-bindings: leds: Add LED_FUNCTION_SPEED_* for link speed on LAN/WAN dt-bindings: leds: Add LED_FUNCTION_MOBILE for mobile network leds: simatic-ipc-leds-gpio: Add support for module BX-59A dt-bindings: leds: qcom-lpg: Document PM6150L compatible dt-bindings: leds: pca963x: Convert text bindings to YAML leds: an30259a: Use devm_mutex_init() for mutex initialization leds: mlxreg: Use devm_mutex_init() for mutex initialization leds: nic78bx: Use devm API to cleanup module's resources ...
2024-04-12locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.hUros Bizjak
Use try_cmpxchg(*ptr, &old, new) instead of cmpxchg(*ptr, old, new) == old in qspinlock_paravirt.h x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240411192317.25432-2-ubizjak@gmail.com
2024-04-12locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending()Uros Bizjak
Replace this pattern in trylock_clear_pending(): cmpxchg_acquire(*ptr, old, new) == old ... with the simpler and faster: try_cmpxchg_acquire(*ptr, &old, new) The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after the CMPXCHG. Also change the return type of the function to bool and streamline the control flow in the _Q_PENDING_BITS == 8 variant a bit. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Reviewed-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240325140943.815051-1-ubizjak@gmail.com
2024-04-11locking/mutex: Introduce devm_mutex_init()George Stark
Using of devm API leads to a certain order of releasing resources. So all dependent resources which are not devm-wrapped should be deleted with respect to devm-release order. Mutex is one of such objects that often is bound to other resources and has no own devm wrapping. Since mutex_destroy() actually does nothing in non-debug builds frequently calling mutex_destroy() is just ignored which is safe for now but wrong formally and can lead to a problem if mutex_destroy() will be extended so introduce devm_mutex_init(). Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: George Stark <gnstark@salutedevices.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Marek Behún <kabel@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20240411161032.609544-2-gnstark@salutedevices.com Signed-off-by: Lee Jones <lee@kernel.org>
2024-04-11locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()Uros Bizjak
Use atomic_try_cmpxchg_relaxed(*ptr, &old, new) instead of atomic_cmpxchg_relaxed (*ptr, old, new) == old in xchg_tail(). x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after CMPXCHG. No functional change intended. Since this code requires NR_CPUS >= 16k, I have tested it by unconditionally setting _Q_PENDING_BITS to 1 in <asm-generic/qspinlock_types.h>. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240321195309.484275-1-ubizjak@gmail.com
2024-03-21locking/qspinlock: Always evaluate lockevent* non-event parameter onceWaiman Long
The 'inc' parameter of lockevent_add() and the cond parameter of lockevent_cond_inc() are only evaluated when CONFIG_LOCK_EVENT_COUNTS is on. That can cause problem if those parameters are expressions with side effect like a "++". Fix this by evaluating those non-event parameters once even if CONFIG_LOCK_EVENT_COUNTS is off. This will also eliminate the need of the __maybe_unused attribute to the wait_early local variable in pv_wait_node(). Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20240319005004.1692705-1-longman@redhat.com
2024-03-01locking/rtmutex: Use try_cmpxchg_relaxed() in mark_rt_mutex_waiters()Uros Bizjak
Use try_cmpxchg() instead of cmpxchg(*ptr, old, new) == old. The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after CMPXCHG (and related move instruction in front of CMPXCHG). Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when CMPXCHG fails. There is no need to re-read the value in the loop. Note that the value from *ptr should be read using READ_ONCE() to prevent the compiler from merging, refetching or reordering the read. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240124104953.612063-1-ubizjak@gmail.com
2024-02-28locking/percpu-rwsem: Trigger contention tracepoints only if contendedNamhyung Kim
We mistakenly always fire lock contention tracepoints in the writer path, while it should be conditional on the trylock result. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20231108215322.2845536-1-namhyung@kernel.org
2024-02-28locking/rwsem: Clarify that RWSEM_READER_OWNED is just a hintWaiman Long
Clarify in the comments that the RWSEM_READER_OWNED bit in the owner field is just a hint, not an authoritative state of the rwsem. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20240222150540.79981-4-longman@redhat.com
2024-02-28locking/qspinlock: Fix 'wait_early' set but not used warningWaiman Long
When CONFIG_LOCK_EVENT_COUNTS is off, the wait_early variable will be set but not used. This is expected. Recent compilers will not generate wait_early code in this case. Add the __maybe_unused attribute to wait_early for suppressing this W=1 warning. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240222150540.79981-2-longman@redhat.com Closes: https://lore.kernel.org/oe-kbuild-all/202312260422.f4pK3f9m-lkp@intel.com/
2024-01-12Merge tag 'rcu.release.v6.8' of https://github.com/neeraju/linuxLinus Torvalds
Pull RCU updates from Neeraj Upadhyay: - Documentation and comment updates - RCU torture, locktorture updates that include cleanups; nolibc init build support for mips, ppc and rv64; testing of mid stall duration scenario and fixing fqs task creation conditions - Misc fixes, most notably restricting usage of RCU CPU stall notifiers, to confine their usage primarily to debug kernels - RCU tasks minor fixes - lockdep annotation fix for NMI-safe accesses, callback advancing/acceleration cleanup and documentation improvements * tag 'rcu.release.v6.8' of https://github.com/neeraju/linux: rcu: Force quiescent states only for ongoing grace period doc: Clarify historical disclaimers in memory-barriers.txt doc: Mention address and data dependencies in rcu_dereference.rst doc: Clarify RCU Tasks reader/updater checklist rculist.h: docs: Fix wrong function summary Documentation: RCU: Remove repeated word in comments srcu: Use try-lock lockdep annotation for NMI-safe access. srcu: Explain why callbacks invocations can't run concurrently srcu: No need to advance/accelerate if no callback enqueued srcu: Remove superfluous callbacks advancing from srcu_gp_start() rcu: Remove unused macros from rcupdate.h rcu: Restrict access to RCU CPU stall notifiers rcu-tasks: Mark RCU Tasks accesses to current->rcu_tasks_idle_cpu rcutorture: Add fqs_holdoff check before fqs_task is created rcutorture: Add mid-sized stall to TREE07 rcutorture: add nolibc init support for mips, ppc and rv64 locktorture: Increase Hamming distance between call_rcu_chain and rcu_call_chains
2024-01-10Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull header cleanups from Kent Overstreet: "The goal is to get sched.h down to a type only header, so the main thing happening in this patchset is splitting out various _types.h headers and dependency fixups, as well as moving some things out of sched.h to better locations. This is prep work for the memory allocation profiling patchset which adds new sched.h interdepencencies" * tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits) Kill sched.h dependency on rcupdate.h kill unnecessary thread_info.h include Kill unnecessary kernel.h include preempt.h: Kill dependency on list.h rseq: Split out rseq.h from sched.h LoongArch: signal.c: add header file to fix build error restart_block: Trim includes lockdep: move held_lock to lockdep_types.h sem: Split out sem_types.h uidgid: Split out uidgid_types.h seccomp: Split out seccomp_types.h refcount: Split out refcount_types.h uapi/linux/resource.h: fix include x86/signal: kill dependency on time.h syscall_user_dispatch.h: split out *_types.h mm_types_task.h: Trim dependencies Split out irqflags_types.h ipc: Kill bogus dependency on spinlock.h shm: Slim down dependencies workqueue: Split out workqueue_types.h ...
2024-01-02Merge tag 'v6.7-rc8' into locking/core, to pick up dependent changesIngo Molnar
Pick up these commits from Linus's tree: b106bcf0f99a ("locking/osq_lock: Clarify osq_wait_next()") 563adbfc351b ("locking/osq_lock: Clarify osq_wait_next() calling convention") 7c2230982129 ("locking/osq_lock: Move the definition of optimistic_spin_node into osq_lock.c") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-12-30locking/osq_lock: Clarify osq_wait_next()David Laight
Directly return NULL or 'next' instead of breaking out of the loop. Signed-off-by: David Laight <david.laight@aculab.com> [ Split original patch into two independent parts - Linus ] Link: https://lore.kernel.org/lkml/7c8828aec72e42eeb841ca0ee3397e9a@AcuMS.aculab.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-30locking/osq_lock: Clarify osq_wait_next() calling conventionDavid Laight
osq_wait_next() is passed 'prev' from osq_lock() and NULL from osq_unlock() but only needs the 'cpu' value to write to lock->tail. Just pass prev->cpu or OSQ_UNLOCKED_VAL instead. Should have no effect on the generated code since gcc manages to assume that 'prev != NULL' due to an earlier dereference. Signed-off-by: David Laight <david.laight@aculab.com> [ Changed 'old' to 'old_cpu' by request from Waiman Long - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-30locking/osq_lock: Move the definition of optimistic_spin_node into osq_lock.cDavid Laight
struct optimistic_spin_node is private to the implementation. Move it into the C file to ensure nothing is accessing it. Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-20sched.h: move pid helpers to pid.hKent Overstreet
This is needed for killing the sched.h dependency on rcupdate.h, and pid.h is a better place for this code anyways. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-01locking/mutex: Document that mutex_unlock() is non-atomicJann Horn
I have seen several cases of attempts to use mutex_unlock() to release an object such that the object can then be freed by another task. This is not safe because mutex_unlock(), in the MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex structure after having marked it as unlocked; so mutex_unlock() requires its caller to ensure that the mutex stays alive until mutex_unlock() returns. If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters have to keep the mutex alive, but we could have a spurious MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed between the points where __mutex_unlock_slowpath() did the cmpxchg reading the flags and where it acquired the wait_lock. ( With spinlocks, that kind of code pattern is allowed and, from what I remember, used in several places in the kernel. ) Document this, such a semantic difference between mutexes and spinlocks is fairly unintuitive. [ mingo: Made the changelog a bit more assertive, refined the comments. ] Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231130204817.2031407-1-jannh@google.com
2023-11-24lockdep: Fix block chain corruptionPeter Zijlstra
Kent reported an occasional KASAN splat in lockdep. Mark then noted: > I suspect the dodgy access is to chain_block_buckets[-1], which hits the last 4 > bytes of the redzone and gets (incorrectly/misleadingly) attributed to > nr_large_chain_blocks. That would mean @size == 0, at which point size_to_bucket() returns -1 and the above happens. alloc_chain_hlocks() has 'size - req', for the first with the precondition 'size >= rq', which allows the 0. This code is trying to split a block, del_chain_block() takes what we need, and add_chain_block() puts back the remainder, except in the above case the remainder is 0 sized and things go sideways. Fixes: 810507fe6fd5 ("locking/lockdep: Reuse freed chain_hlocks entries") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://lkml.kernel.org/r/20231121114126.GH8262@noisy.programming.kicks-ass.net
2023-11-23locktorture: Increase Hamming distance between call_rcu_chain and ↵Paul E. McKenney
rcu_call_chains One letter difference is really not enough, so this commit changes call_rcu_chain to call_rcu_chain_list. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
2023-10-30Merge tag 'rcu-next-v6.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks Pull RCU updates from Frederic Weisbecker: - RCU torture, locktorture and generic torture infrastructure updates that include various fixes, cleanups and consolidations. Among the user visible things, ftrace dumps can now be found into their own file, and module parameters get better documented and reported on dumps. - Generic and misc fixes all over the place. Some highlights: * Hotplug handling has seen some light cleanups and comments * An RCU barrier can now be triggered through sysfs to serialize memory stress testing and avoid OOM * Object information is now dumped in case of invalid callback invocation * Also various SRCU issues, too hard to trigger to deserve urgent pull requests, have been fixed - RCU documentation updates - RCU reference scalability test minor fixes and doc improvements. - RCU tasks minor fixes - Stall detection updates. Introduce RCU CPU Stall notifiers that allows a subsystem to provide informations to help debugging. Also cure some false positive stalls. * tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (56 commits) srcu: Only accelerate on enqueue time locktorture: Check the correct variable for allocation failure srcu: Fix callbacks acceleration mishandling rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP rcu: Standardize explicit CPU-hotplug calls rcu: Conditionally build CPU-hotplug teardown callbacks rcu: Remove references to rcu_migrate_callbacks() from diagrams rcu: Assume rcu_report_dead() is always called locally rcu: Assume IRQS disabled from rcu_report_dead() rcu: Use rcu_segcblist_segempty() instead of open coding it rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects srcu: Fix srcu_struct node grpmask overflow on 64-bit systems torture: Convert parse-console.sh to mktemp rcutorture: Traverse possible cpu to set maxcpu in rcu_nocb_toggle() rcutorture: Replace schedule_timeout*() 1-jiffy waits with HZ/20 torture: Add kvm.sh --debug-info argument locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writers doc: Catch-up update for locktorture module parameters locktorture: Add call_rcu_chains module parameter locktorture: Add new module parameters to lock_torture_print_module_parms() ...
2023-10-30Merge tag 'locking-core-2023-10-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Info Molnar: "Futex improvements: - Add the 'futex2' syscall ABI, which is an attempt to get away from the multiplex syscall and adds a little room for extentions, while lifting some limitations. - Fix futex PI recursive rt_mutex waiter state bug - Fix inter-process shared futexes on no-MMU systems - Use folios instead of pages Micro-optimizations of locking primitives: - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock architectures, to improve lockref code generation - Improve the x86-32 lockref_get_not_zero() main loop by adding build-time CMPXCHG8B support detection for the relevant lockref code, and by better interfacing the CMPXCHG8B assembly code with the compiler - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg() code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg(). - Micro-optimize rcuref_put_slowpath() Locking debuggability improvements: - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well - Enforce atomicity of sched_submit_work(), which is de-facto atomic but was un-enforced previously. - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics - Fix ww_mutex self-tests - Clean up const-propagation in <linux/seqlock.h> and simplify the API-instantiation macros a bit RT locking improvements: - Provide the rt_mutex_*_schedule() primitives/helpers and use them in the rtmutex code to avoid recursion vs. rtlock on the PI state. - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock() and rwbase_read_lock() .. plus misc fixes & cleanups" * tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) futex: Don't include process MM in futex key on no-MMU locking/seqlock: Fix grammar in comment alpha: Fix up new futex syscall numbers locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning locking/seqlock: Change __seqprop() to return the function pointer locking/seqlock: Simplify SEQCOUNT_LOCKNAME() locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath() locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg() locking/atomic/x86: Introduce arch_sync_try_cmpxchg() locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback locking/seqlock: Fix typo in comment futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic() locking/local, arch: Rewrite local_add_unless() as a static inline function locking/debug: Fix debugfs API return value checks to use IS_ERR() locking/ww_mutex/test: Make sure we bail out instead of livelock locking/ww_mutex/test: Fix potential workqueue corruption locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup futex: Add sys_futex_requeue() futex: Add flags2 argument to futex_requeue() ...
2023-10-19locking: export contention tracepoints for bcachefs six locksBrian Foster
The bcachefs implementation of six locks is intended to land in generic locking code in the long term, but has been pulled into the bcachefs subsystem for internal use for the time being. This code lift breaks the bcachefs module build as six locks depend a couple of the generic locking tracepoints. Export these tracepoint symbols for bcachefs. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-12locking/lockdep: Fix string sizing bug that triggers a format-truncation ↵Lucy Mielke
compiler-warning On an allyesconfig, with "treat warnings as errors" unset, GCC emits these warnings: kernel/locking/lockdep_proc.c:438:32: Warning: Format specifier '%lld' may be truncated when writing 1 to 17 bytes into a region of size 15 [-Wformat-truncation=] kernel/locking/lockdep_proc.c:438:31: Note: Format directive argument is in the range [-9223372036854775, 9223372036854775] kernel/locking/lockdep_proc.c:438:9: Note: 'snprintf' has output between 5 and 22 bytes into a target of size 15 In seq_time(), the longest s64 is "-9223372036854775808"-ish, which converted to the fixed-point float format is "-9223372036854775.80": 21 bytes, plus termination is another byte: 22. Therefore, a larger buffer size of 22 is needed here - not 15. The code was safe due to the snprintf(). Fix it. Signed-off-by: Lucy Mielke <lucymielke@icloud.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/ZSfOEHRkZAWaQr3U@fedora.fritz.box
2023-10-11locktorture: Check the correct variable for allocation failureDan Carpenter
There is a typo so this checks the wrong variable. "chains" plural vs "chain" singular. We already know that "chains" is non-zero. Fixes: 7f993623e9eb ("locktorture: Add call_rcu_chains module parameter") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-10-03locking/debug: Fix debugfs API return value checks to use IS_ERR()Atul Kumar Pant
Update the checking of return values from debugfs_create_file() and debugfs_create_dir() to use IS_ERR(). Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20230807121834.7438-1-atulpant.linux@gmail.com
2023-09-24locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writersPaul E. McKenney
This commit renames the readers_bind and writers_bind module parameters to bind_readers and bind_writers, respectively. This provides added clarity via the imperative mode and better organizes the documentation. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Add call_rcu_chains module parameterPaul E. McKenney
When running locktorture on large systems, there will normally be enough RCU activity to ensure that there is a grace period in flight at all times. However, on smaller systems, RCU might well be idle the majority of the time. This situation can be inconvenient in cases where the RCU CPU stall warning is part of the debugging process. This commit therefore adds an call_rcu_chains module parameter to locktorture, allowing the user to specify the desired number of self-propagating call_rcu() chains. For good measure, immediately before invoking call_rcu(), the self-propagating RCU callback invokes start_poll_synchronize_rcu() to force the immediate start of a grace period, with the call_rcu() forcing another to start shortly thereafter. Booting with locktorture.call_rcu_chains=2 increases the probability of a stuck locking primitive resulting in an RCU CPU stall warning from about 25% to nearly 100%. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Add new module parameters to lock_torture_print_module_parms()Paul E. McKenney
This commit adds new module parameters to lock_torture_print_module_parms, and alphabetizes things while in the area. This change makes locktorture test results more useful and self-contained. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Add acq_writer_lim to complain about long acquistion timesPaul E. McKenney
This commit adds a locktorture.acq_writer_lim module parameter that specifies the maximum number of jiffies that is expected to be consumed by write-side lock acquisition. If this limit is exceeded, a WARN_ONCE() causes a splat. Note that this limit applies to the main lock acquisition only, not to any nested acquisitions. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Consolidate "if" statements in lock_torture_writer()Paul E. McKenney
There is a pair of adjacent "if" statements with identical conditions in the lock_torture_writer() function. This commit therefore combines them. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Alphabetize torture_param() entriesPaul E. McKenney
There are getting to be too many module parameters for a random list to be comfortable, so this commit alphabetizes the list. Strictly code motion. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24locktorture: Add readers_bind and writers_bind module parametersPaul E. McKenney
This commit adds readers_bind and writers_bind module parameters to locktorture in order to skew tests across socket boundaries. This skewing is intended to provide additional variable-latency stress on the primitive under test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-22locking/ww_mutex/test: Make sure we bail out instead of livelockJohn Stultz
I've seen what appears to be livelocks in the stress_inorder_work() function, and looking at the code it is clear we can have a case where we continually retry acquiring the locks and never check to see if we have passed the specified timeout. This patch reworks that function so we always check the timeout before iterating through the loop again. I believe others may have hit this previously here: https://lore.kernel.org/lkml/895ef450-4fb3-5d29-a6ad-790657106a5a@intel.com/ Reported-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230922043616.19282-4-jstultz@google.com
2023-09-22locking/ww_mutex/test: Fix potential workqueue corruptionJohn Stultz
In some cases running with the test-ww_mutex code, I was seeing odd behavior where sometimes it seemed flush_workqueue was returning before all the work threads were finished. Often this would cause strange crashes as the mutexes would be freed while they were being used. Looking at the code, there is a lifetime problem as the controlling thread that spawns the work allocates the "struct stress" structures that are passed to the workqueue threads. Then when the workqueue threads are finished, they free the stress struct that was passed to them. Unfortunately the workqueue work_struct node is in the stress struct. Which means the work_struct is freed before the work thread returns and while flush_workqueue is waiting. It seems like a better idea to have the controlling thread both allocate and free the stress structures, so that we can be sure we don't corrupt the workqueue by freeing the structure prematurely. So this patch reworks the test to do so, and with this change I no longer see the early flush_workqueue returns. Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230922043616.19282-3-jstultz@google.com
2023-09-22locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootupJohn Stultz
Booting w/ qemu without kvm, and with 64 cpus, I noticed we'd sometimes hung task watchdog splats in get_random_u32_below() when using the test-ww_mutex stress test. While entropy exhaustion is no longer an issue, the RNG may be slower early in boot. The test-ww_mutex code will spawn off 128 threads (2x cpus) and each thread will call get_random_u32_below() a number of times to generate a random order of the 16 locks. This intense use takes time and without kvm, qemu can be slow enough that we trip the hung task watchdogs. For this test, we don't need true randomness, just mixed up orders for testing ww_mutex lock acquisitions, so it changes the logic to use the prng instead, which takes less time and avoids the watchdgos. Feedback would be appreciated! Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230922043616.19282-2-jstultz@google.com