summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-05cifs: sanitize paths in cifs_update_super_prepath.Thiago Rafael Becker
After a server reboot, clients are failing to move files with ENOENT. This is caused by DFS referrals containing multiple separators, which the server move call doesn't recognize. v1: Initial patch. v2: Move prototype to header. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182472 Fixes: a31080899d5f ("cifs: sanitize multiple delimiters in prepath") Actually-Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Thiago Rafael Becker <tbecker@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-05usb: gadgetfs: Fix ep_read_iter to handle ITER_UBUFSandeep Dhavale
iov_iter for ep_read_iter can be ITER_UBUF with io_uring. In that case dup_iter() does not have to allocate iov and it can return NULL. Fix the assumption by checking for iter_is_ubuf() other wise ep_read_iter can treat this as failure and return -ENOMEM. Fixes: 1e23db450cff ("io_uring: use iter_ubuf for single range imports") Signed-off-by: Sandeep Dhavale <dhavale@google.com> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20230401060509.3608259-3-dhavale@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-05usb: gadget: f_fs: Fix ffs_epfile_read_iter to handle ITER_UBUFSandeep Dhavale
iov_iter for ffs_epfile_read_iter can be ITER_UBUF with io_uring. In that case dup_iter() does not have to allocate anything and it can return NULL. ffs_epfile_read_iter treats this as a failure and returns -ENOMEM. Fix it by checking if iter_is_ubuf(). Fixes: 1e23db450cff ("io_uring: use iter_ubuf for single range imports") Signed-off-by: Sandeep Dhavale <dhavale@google.com> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20230401060509.3608259-2-dhavale@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-05usb: typec: altmodes/displayport: Fix configure initial pin assignmentRD Babiera
While determining the initial pin assignment to be sent in the configure message, using the DP_PIN_ASSIGN_DP_ONLY_MASK mask causes the DFP_U to send both Pin Assignment C and E when both are supported by the DFP_U and UFP_U. The spec (Table 5-7 DFP_U Pin Assignment Selection Mandates, VESA DisplayPort Alt Mode Standard v2.0) indicates that the DFP_U never selects Pin Assignment E when Pin Assignment C is offered. Update the DP_PIN_ASSIGN_DP_ONLY_MASK conditional to intially select only Pin Assignment C if it is available. Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20230329215159.2046932-1-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-05usb: dwc3: pci: add support for the Intel Meteor Lake-SHeikki Krogerus
This patch adds the necessary PCI ID for Intel Meteor Lake-S devices. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230330150224.89316-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-05arm64: compat: Work around uninitialized variable warningArd Biesheuvel
Dan reports that smatch complains about a potential uninitialized variable being used in the compat alignment fixup code. The logic is not wrong per se, but we do end up using an uninitialized variable if reading the instruction that triggered the alignment fault from user space faults, even if the fault ensures that the uninitialized value doesn't propagate any further. Given that we just give up and return 1 if any fault occurs when reading the instruction, let's get rid of the 'success handling' pattern that captures the fault in a variable and aborts later, and instead, just return 1 immediately if any of the get_user() calls result in an exception. Fixes: 3fc24ef32d3b ("arm64: compat: Implement misalignment fixups for multiword loads") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202304021214.gekJ8yRc-lkp@intel.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230404103625.2386382-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-04-05Merge tag 'trace-v6.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix timerlat notification, as it was not triggering the notify to users when a new max latency was hit. - Do not trigger max latency if the tracing is off. When tracing is off, the ring buffer is not updated, it does not make sense to notify when there's a new max latency detected by the tracer, as why that latency happened is not available. The tracing logic still runs when the ring buffer is disabled, but it should not be triggering notifications. - Fix race on freeing the synthetic event "last_cmd" variable by adding a mutex around it. - Fix race between reader and writer of the ring buffer by adding memory barriers. When the writer is still on the reader page it must have its content visible on the buffer before it moves the commit index that the reader uses to know how much content is on the page. - Make get_lock_parent_ip() always inlined, as it uses _THIS_IP_ and _RET_IP_, which gets broken if it is not inlined. - Make __field(int, arr[5]) in a TRACE_EVENT() macro fail to build. The field formats of trace events are calculated by using sizeof(type) and other means by what is passed into the structure macros like __field(). The __field() macro is only meant for atom types like int, long, short, pointer, etc. It is not meant for arrays. The code will currently compile with arrays, but then the format produced will be inaccurate, and user space parsing tools will break. Two bugs have already been fixed, now add code that will make the kernel fail to build if another trace event includes this buggy field format. - Fix boot up snapshot code: Boot snapshots were triggering when not even asked for on the kernel command line. This was caused by two bugs: 1) It would trigger a snapshot on any instance if one was created from the kernel command line. 2) The error handling would only affect the top level instance. So the fact that a snapshot was done on a instance that didn't allocate a buffer triggered a warning written into the top level buffer, and worse yet, disabled the top level buffer. - Fix memory leak that was caused when an error was logged in a trace buffer instance, and then the buffer instance was removed. The allocated error log messages still needed to be freed. * tag 'trace-v6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Free error logs of tracing instances tracing: Fix ftrace_boot_snapshot command line logic tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance tracing: Error if a trace event has an array for a __field() tracing/osnoise: Fix notify new tracing_max_latency tracing/timerlat: Notify new max thread latency ftrace: Mark get_lock_parent_ip() __always_inline ring-buffer: Fix race while reader and writer are on the same page tracing/synthetic: Fix races on freeing last_cmd
2023-04-05Smack: Improve mount process memory useCasey Schaufler
The existing mount processing code in Smack makes many unnecessary copies of Smack labels. Because Smack labels never go away once imported it is safe to use pointers to them rather than copies. Replace the use of copies of label names to pointers to the global label list entries. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-04-05nvme: fix discard support without oncsKeith Busch
The device can report discard support without setting the ONCS DSM bit. When not set, the driver clears max_discard_size expecting it to be set later. We don't know the size until we have the namespace format, though, so setting it is deferred until configuring one, but the driver was abandoning the discard settings due to that initial clearing. Move the max_discard_size calculation above the check for a '0' discard size. Fixes: 1a86924e4f46475 ("nvme: fix interpretation of DMRSL") Reported-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-04-05ARM: dts: imx6ull-colibri: Remove unnecessary #address-cells/#size-cellsFabio Estevam
Building with W=1 leads to the following dtc warning: arch/arm/boot/dts/imx6ull-colibri.dtsi:36.9-46.5: Warning (graph_child_address): /connector/ports: graph node has single child node 'port@0', #address-cells/#size-cells are not necessary Since a single port is used, 'ports' can be removed, as well as the unnecessary #address-cells/#size-cells. Fixes: bd5880e10982 ("ARM: dts: colibri-imx6ull: Enable dual-role switching") Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-04-05ARM: dts: imx7d-remarkable2: Remove unnecessary #address-cells/#size-cellsFabio Estevam
Building with W=1 leads to the following dtc warning: arch/arm/boot/dts/imx7d-remarkable2.dts:319.19-335.4: Warning (avoid_unnecessary_addr_size): /soc/bus@30800000/i2c@30a50000/pmic@62: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property Remove unnecessary #address-cells/#size-cells to fix it. Fixes: 9076cbaa7757 ("ARM: dts: imx7d-remarkable2: Enable silergy,sy7636a") Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Alistair Francis <alistair@alistair23.me> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-04-05arm64: dts: imx8mp-verdin: correct off-on-delayPeng Fan
The property should be off-on-delay-us, not off-on-delay Fixes: a39ed23bdf6e ("arm64: dts: freescale: add initial support for verdin imx8m plus") Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-04-05arm64: dts: imx8mm-verdin: correct off-on-delayPeng Fan
The property should be off-on-delay-us, not off-on-delay Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini") Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-04-05tracing: Free error logs of tracing instancesSteven Rostedt (Google)
When a tracing instance is removed, the error messages that hold errors that occurred in the instance needs to be freed. The following reports a memory leak: # cd /sys/kernel/tracing # mkdir instances/foo # echo 'hist:keys=x' > instances/foo/events/sched/sched_switch/trigger # cat instances/foo/error_log [ 117.404795] hist:sched:sched_switch: error: Couldn't find field Command: hist:keys=x ^ # rmdir instances/foo Then check for memory leaks: # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff88810d8ec700 (size 192): comm "bash", pid 869, jiffies 4294950577 (age 215.752s) hex dump (first 32 bytes): 60 dd 68 61 81 88 ff ff 60 dd 68 61 81 88 ff ff `.ha....`.ha.... a0 30 8c 83 ff ff ff ff 26 00 0a 00 00 00 00 00 .0......&....... backtrace: [<00000000dae26536>] kmalloc_trace+0x2a/0xa0 [<00000000b2938940>] tracing_log_err+0x277/0x2e0 [<000000004a0e1b07>] parse_atom+0x966/0xb40 [<0000000023b24337>] parse_expr+0x5f3/0xdb0 [<00000000594ad074>] event_hist_trigger_parse+0x27f8/0x3560 [<00000000293a9645>] trigger_process_regex+0x135/0x1a0 [<000000005c22b4f2>] event_trigger_write+0x87/0xf0 [<000000002cadc509>] vfs_write+0x162/0x670 [<0000000059c3b9be>] ksys_write+0xca/0x170 [<00000000f1cddc00>] do_syscall_64+0x3e/0xc0 [<00000000868ac68c>] entry_SYSCALL_64_after_hwframe+0x72/0xdc unreferenced object 0xffff888170c35a00 (size 32): comm "bash", pid 869, jiffies 4294950577 (age 215.752s) hex dump (first 32 bytes): 0a 20 20 43 6f 6d 6d 61 6e 64 3a 20 68 69 73 74 . Command: hist 3a 6b 65 79 73 3d 78 0a 00 00 00 00 00 00 00 00 :keys=x......... backtrace: [<000000006a747de5>] __kmalloc+0x4d/0x160 [<000000000039df5f>] tracing_log_err+0x29b/0x2e0 [<000000004a0e1b07>] parse_atom+0x966/0xb40 [<0000000023b24337>] parse_expr+0x5f3/0xdb0 [<00000000594ad074>] event_hist_trigger_parse+0x27f8/0x3560 [<00000000293a9645>] trigger_process_regex+0x135/0x1a0 [<000000005c22b4f2>] event_trigger_write+0x87/0xf0 [<000000002cadc509>] vfs_write+0x162/0x670 [<0000000059c3b9be>] ksys_write+0xca/0x170 [<00000000f1cddc00>] do_syscall_64+0x3e/0xc0 [<00000000868ac68c>] entry_SYSCALL_64_after_hwframe+0x72/0xdc The problem is that the error log needs to be freed when the instance is removed. Link: https://lore.kernel.org/lkml/76134d9f-a5ba-6a0d-37b3-28310b4a1e91@alu.unizg.hr/ Link: https://lore.kernel.org/linux-trace-kernel/20230404194504.5790b95f@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Eric Biggers <ebiggers@kernel.org> Fixes: 2f754e771b1a6 ("tracing: Have the error logs show up in the proper instances") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-05Merge branches 'rcu/staging-core', 'rcu/staging-docs' and ↵Joel Fernandes (Google)
'rcu/staging-kfree', remote-tracking branches 'paul/srcu-cf.2023.04.04a', 'fbq/rcu/lockdep.2023.03.27a' and 'fbq/rcu/rcutorture.2023.03.20a' into rcu/staging
2023-04-05checkpatch: Error out if deprecated RCU API usedJoel Fernandes (Google)
Single-argument kvfree_rcu() usage is being deprecated [1] [2]. However, till all users are converted, we would like to introduce checkpatch errors for new patches submitted. This patch adds support for the same. Tested with a trial patch. For now, we are only considering usages that don't have compound nesting, for example ignore: kvfree_rcu( (rcu_head_obj), rcu_head_name). This is sufficient as such usages are unlikely. Once all users are converted and we remove the old API, we can also revert this checkpatch patch then. [1] https://lore.kernel.org/rcu/CAEXW_YRhHaVuq+5f+VgCZM=SF+9xO+QXaxe0yE7oA9iCXK-XPg@mail.gmail.com/ [2] https://lore.kernel.org/rcu/CAEXW_YSY=q2_uaE2qo4XSGjzs4+C102YMVJ7kWwuT5LGmJGGew@mail.gmail.com/ Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()Joel Fernandes (Google)
The k[v]free_rcu() macro's single-argument form is deprecated. Therefore switch to the new k[v]free_rcu_mightsleep() variant. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. The callers are holding a mutex so the context allows blocking. Hence using the API with a single argument will be fine, but use its new name. There is no functionality change with this patch. Fixes: 57588c71177f ("mac802154: Handle passive scanning") Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kfree_rcu() and kvfree_rcu() macros' single-argument forms are deprecated. Therefore switch to the new kfree_rcu_mightsleep() and kvfree_rcu_mightsleep() variants. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kfree_rcu() and kvfree_rcu() macros' single-argument forms are deprecated. Therefore switch to the new kfree_rcu_mightsleep() and kvfree_rcu_mightsleep() variants. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Lukas Czerner <lczerner@redhat.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kfree_rcu() and kvfree_rcu() macros' single-argument forms are deprecated. Therefore switch to the new kfree_rcu_mightsleep() and kvfree_rcu_mightsleep() variants. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Cc: Ariel Levkovich <lariel@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kfree_rcu() and kvfree_rcu() macros' single-argument forms are deprecated. Therefore switch to the new kfree_rcu_mightsleep() and kvfree_rcu_mightsleep() variants. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Acked-by: Jakub Kicinski <kuba@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kvfree_rcu() macro's single-argument form is deprecated. Therefore switch to the new kvfree_rcu_mightsleep() variant. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kvfree_rcu() macro's single-argument form is deprecated. Therefore switch to the new kvfree_rcu_mightsleep() variant. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kvfree_rcu() macro's single-argument form is deprecated. Therefore switch to the new kvfree_rcu_mightsleep() variant. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Cc: Bryan Tan <bryantan@vmware.com> Cc: Vishnu Dasa <vdasa@vmware.com> Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()Uladzislau Rezki (Sony)
The kvfree_rcu() macro's single-argument form is deprecated. Therefore switch to the new kvfree_rcu_mightsleep() variant. The goal is to avoid accidental use of the single-argument forms, which can introduce functionality bugs in atomic contexts and latency bugs in non-atomic contexts. Cc: Jens Axboe <axboe@kernel.dk> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Begrudgingly-acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Protect rcu_print_task_exp_stall() ->exp_tasks accessZqiang
For kernels built with CONFIG_PREEMPT_RCU=y, the following scenario can result in a NULL-pointer dereference: CPU1 CPU2 rcu_preempt_deferred_qs_irqrestore rcu_print_task_exp_stall if (special.b.blocked) READ_ONCE(rnp->exp_tasks) != NULL raw_spin_lock_rcu_node np = rcu_next_node_entry(t, rnp) if (&t->rcu_node_entry == rnp->exp_tasks) WRITE_ONCE(rnp->exp_tasks, np) .... raw_spin_unlock_irqrestore_rcu_node raw_spin_lock_irqsave_rcu_node t = list_entry(rnp->exp_tasks->prev, struct task_struct, rcu_node_entry) (if rnp->exp_tasks is NULL, this will dereference a NULL pointer) The problem is that CPU2 accesses the rcu_node structure's->exp_tasks field without holding the rcu_node structure's ->lock and CPU2 did not observe CPU1's change to rcu_node structure's ->exp_tasks in time. Therefore, if CPU1 sets rcu_node structure's->exp_tasks pointer to NULL, then CPU2 might dereference that NULL pointer. This commit therefore holds the rcu_node structure's ->lock while accessing that structure's->exp_tasks field. [ paulmck: Apply Frederic Weisbecker feedback. ] Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Zqiang <qiang1.zhang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-edZheng Yejian
Registering a kprobe on __rcu_irq_enter_check_tick() can cause kernel stack overflow as shown below. This issue can be reproduced by enabling CONFIG_NO_HZ_FULL and booting the kernel with argument "nohz_full=", and then giving the following commands at the shell prompt: # cd /sys/kernel/tracing/ # echo 'p:mp1 __rcu_irq_enter_check_tick' >> kprobe_events # echo 1 > events/kprobes/enable This commit therefore adds __rcu_irq_enter_check_tick() to the kprobes blacklist using NOKPROBE_SYMBOL(). Insufficient stack space to handle exception! ESR: 0x00000000f2000004 -- BRK (AArch64) FAR: 0x0000ffffccf3e510 Task stack: [0xffff80000ad30000..0xffff80000ad38000] IRQ stack: [0xffff800008050000..0xffff800008058000] Overflow stack: [0xffff089c36f9f310..0xffff089c36fa0310] CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19 Hardware name: linux,dummy-virt (DT) pstate: 400003c5 (nZcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __rcu_irq_enter_check_tick+0x0/0x1b8 lr : ct_nmi_enter+0x11c/0x138 sp : ffff80000ad30080 x29: ffff80000ad30080 x28: ffff089c82e20000 x27: 0000000000000000 x26: 0000000000000000 x25: ffff089c02a8d100 x24: 0000000000000000 x23: 00000000400003c5 x22: 0000ffffccf3e510 x21: ffff089c36fae148 x20: ffff80000ad30120 x19: ffffa8da8fcce148 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffa8da8e44ea6c x14: ffffa8da8e44e968 x13: ffffa8da8e03136c x12: 1fffe113804d6809 x11: ffff6113804d6809 x10: 0000000000000a60 x9 : dfff800000000000 x8 : ffff089c026b404f x7 : 00009eec7fb297f7 x6 : 0000000000000001 x5 : ffff80000ad30120 x4 : dfff800000000000 x3 : ffffa8da8e3016f4 x2 : 0000000000000003 x1 : 0000000000000000 x0 : 0000000000000000 Kernel panic - not syncing: kernel stack overflow CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0xf8/0x108 show_stack+0x20/0x30 dump_stack_lvl+0x68/0x84 dump_stack+0x1c/0x38 panic+0x214/0x404 add_taint+0x0/0xf8 panic_bad_stack+0x144/0x160 handle_bad_stack+0x38/0x58 __bad_stack+0x78/0x7c __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 [...] el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 arm64_enter_el1_dbg.isra.0+0x14/0x20 el1_dbg+0x2c/0x90 el1h_64_sync_handler+0xcc/0xe8 el1h_64_sync+0x64/0x68 __rcu_irq_enter_check_tick+0x0/0x1b8 el1_interrupt+0x28/0x60 el1h_64_irq_handler+0x18/0x28 el1h_64_irq+0x64/0x68 __ftrace_set_clr_event_nolock+0x98/0x198 __ftrace_set_clr_event+0x58/0x80 system_enable_write+0x144/0x178 vfs_write+0x174/0x738 ksys_write+0xd0/0x188 __arm64_sys_write+0x4c/0x60 invoke_syscall+0x64/0x180 el0_svc_common.constprop.0+0x84/0x160 do_el0_svc+0x48/0xe8 el0_svc+0x34/0xd0 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x190/0x194 SMP: stopping secondary CPUs Kernel Offset: 0x28da86000000 from 0xffff800008000000 PHYS_OFFSET: 0xfffff76600000000 CPU features: 0x00000,01a00100,0000421b Memory Limit: none Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Link: https://lore.kernel.org/all/20221119040049.795065-1-zhengyejian1@huawei.com/ Fixes: aaf2bc50df1f ("rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()Neeraj Upadhyay
The call to synchronize_srcu() from rcu_tasks_postscan() can be stalled by a task getting stuck in do_exit() between that function's calls to exit_tasks_rcu_start() and exit_tasks_rcu_finish(). To ease diagnosis of this situation, print a stall warning message every rcu_task_stall_info period when rcu_tasks_postscan() is stalled. [ paulmck: Adjust to handle CONFIG_SMP=n. ] Acked-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reported-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/rcu/20230111212736.GA1062057@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked earlyZqiang
According to the commit log of the patch that added it to the kernel, start_poll_synchronize_rcu_expedited() can be invoked very early, as in long before rcu_init() has been invoked. But before rcu_init(), the rcu_data structure's ->mynode field has not yet been initialized. This means that the start_poll_synchronize_rcu_expedited() function's attempt to set the CPU's leaf rcu_node structure's ->exp_seq_poll_rq field will result in a segmentation fault. This commit therefore causes start_poll_synchronize_rcu_expedited() to set ->exp_seq_poll_rq only after rcu_init() has initialized all CPUs' rcu_data structures' ->mynode fields. It also removes the check from the rcu_init() function so that start_poll_synchronize_rcu_expedited( is unconditionally invoked. Yes, this might result in an unnecessary boot-time grace period, but this is down in the noise. Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()Zqiang
The rcu_accelerate_cbs() function is invoked by rcu_report_qs_rdp() only if there is a grace period in progress that is still blocked by at least one CPU on this rcu_node structure. This means that rcu_accelerate_cbs() should never return the value true, and thus that this function should never set the needwake variable and in turn never invoke rcu_gp_kthread_wake(). This commit therefore removes the needwake variable and the invocation of rcu_gp_kthread_wake() in favor of a WARN_ON_ONCE() on the call to rcu_accelerate_cbs(). The purpose of this new WARN_ON_ONCE() is to detect situations where the system's opinion differs from ours. Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernelsZqiang
The lazy_rcu_shrink_count() shrinker function is registered even in kernels built with CONFIG_RCU_LAZY=n, in which case this function uselessly consumes cycles learning that no CPU has any lazy callbacks queued. This commit therefore registers this shrinker function only in the kernels built with CONFIG_RCU_LAZY=y, where it might actually do something useful. Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency checkZqiang
This commit adds checks for the TICK_DEP_MASK_RCU_EXP bit, thus enabling RCU expedited grace periods to actually force-enable scheduling-clock interrupts on holdout CPUs. Fixes: df1e849ae455 ("rcu: Enable tick for nohz_full CPUs slow to provide expedited QS") Signed-off-by: Zqiang <qiang1.zhang@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Acked-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask raceZqiang
For kernels built with CONFIG_NO_HZ_FULL=y, the following scenario can result in the scheduling-clock interrupt remaining enabled on a holdout CPU after its quiescent state has been reported: CPU1 CPU2 rcu_report_exp_cpu_mult synchronize_rcu_expedited_wait acquires rnp->lock mask = rnp->expmask; for_each_leaf_node_cpu_mask(rnp, cpu, mask) rnp->expmask = rnp->expmask & ~mask; rdp = per_cpu_ptr(&rcu_data, cpu1); for_each_leaf_node_cpu_mask(rnp, cpu, mask) rdp = per_cpu_ptr(&rcu_data, cpu1); if (!rdp->rcu_forced_tick_exp) continue; rdp->rcu_forced_tick_exp = true; tick_dep_set_cpu(cpu1, TICK_DEP_BIT_RCU_EXP); The problem is that CPU2's sampling of rnp->expmask is obsolete by the time it invokes tick_dep_set_cpu(), and CPU1 is not guaranteed to see CPU2's store to ->rcu_forced_tick_exp in time to clear it. And even if CPU1 does see that store, it might invoke tick_dep_clear_cpu() before CPU2 got around to executing its tick_dep_set_cpu(), which would still leave the victim CPU with its scheduler-clock tick running. Either way, an nohz_full real-time application running on the victim CPU would have its latency needlessly degraded. Note that expedited RCU grace periods look at context-tracking information, and so if the CPU is executing in nohz_full usermode throughout, that CPU cannot be victimized in this manner. This commit therefore causes synchronize_rcu_expedited_wait to hold the rcu_node structure's ->lock when checking for holdout CPUs, setting TICK_DEP_BIT_RCU_EXP, and invoking tick_dep_set_cpu(), thus preventing this race. Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu/trace: use strscpy() to instead of strncpy()Xu Panda
This commit saves a line of code by switching from strncpy() to strscpy() by permitting the later NUL assignment to be removed. While in the area, save another line by taking advantage of 100 characters. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Xu Panda <xu.panda@zte.com.cn> Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystemJoel Fernandes (Google)
For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined. However, cpu_is_hotpluggable() still returns true for those CPUs. This causes torture tests that do offlining to end up trying to offline this CPU causing test failures. Such failure happens on all architectures. Fix the repeated error messages thrown by this (even if the hotplug errors are harmless) by asking the opinion of the nohz subsystem on whether the CPU can be hotplugged. [ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ] For drivers/base/ portion: Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Zhouyi Zhou <zhouzhouyi@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: rcu <rcu@vger.kernel.org> Cc: stable@vger.kernel.org Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05MAINTAINERS: Add Zqiang as a RCU reviewerJoel Fernandes (Google)
I have spent about two years studying and contributing to RCU, and sharing RCU-related knowledge within my team, if possible, please consider me as R ;-). Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Zqiang <qiang1.zhang@intel.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05MAINTAINERS: Add Boqun to RCU entryBoqun Feng
Just to be clear, the "M:" tag before my name is short of "Minions" ;-) Acked-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05MAINTAINERS: Change Joel Fernandes from R: to M:Joel Fernandes (Google)
I have spent years learning / contributing to RCU with several features, talks and presentations, with my most recent work being on Lazy-RCU. Please consider me for M, so I can tell my wife why I spend a lot of my weekends and evenings on this complicated and mysterious thing -- which is mostly in the hopes of preventing the world from burning down because everything runs on this one way or another. ;-) Acked-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05arch/x86: Remove "select SRCU"Paul E. McKenney
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <x86@kernel.org> Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05kvm: Remove "select SRCU"Paul E. McKenney
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements from the various KVM Kconfig files. Acked-by: Sean Christopherson <seanjc@google.com> (x86) Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <kvm@vger.kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> (arm64) Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Anup Patel <anup@brainfault.org> (riscv) Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390) Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05mm: Remove "select SRCU"Paul E. McKenney
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Remove CONFIG_SRCUPaul E. McKenney
Now that all references to CONFIG_SRCU have been removed, it is time to remove CONFIG_SRCU itself. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Add comment to rcu_do_batch() identifying rcuoc code pathPaul E. McKenney
This commit adds a comment to help explain why the "else" clause of the in_serving_softirq() "if" statement does not need to enforce a time limit. The reason is that this "else" clause handles rcuoc kthreads that do not block handlers for other softirq vectors. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05doc: Update whatisRCU.rstUladzislau Rezki (Sony)
The kfree_rcu() macro is deprecated. Rename it to its new kfree_rcu_mightsleep() name in this documentation. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05Documentation: RCU: Correct spellingRandy Dunlap
Correct spelling problems for Documentation/RCU/ as reported by codespell. Note: in RTFP.txt, there are other misspellings that are left as is since they were used that way in email Subject: lines or in LWN.net articles. [preemptable, Preemptable, synchonisation] Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: rcu@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05srcu: Clarify comments on memory barrier "E"Joel Fernandes (Google)
There is an smp_mb() named "E" in srcu_flip() immediately before the increment (flip) of the srcu_struct structure's ->srcu_idx. The purpose of E is to order the preceding scan's read of lock counters against the flipping of the ->srcu_idx, in order to prevent new readers from continuing to use the old ->srcu_idx value, which might needlessly extend the grace period. However, this ordering is already enforced because of the control dependency between the preceding scan and the ->srcu_idx flip. This control dependency exists because atomic_long_read() is used to scan the counts, because WRITE_ONCE() is used to flip ->srcu_idx, and because ->srcu_idx is not flipped until the ->srcu_lock_count[] and ->srcu_unlock_count[] counts match. And such a match cannot happen when there is an in-flight reader that started before the flip (observation courtesy Mathieu Desnoyers). The litmus test below (courtesy of Frederic Weisbecker, with changes for ctrldep by Boqun and Joel) shows this: C srcu (* * bad condition: P0's first scan (SCAN1) saw P1's idx=0 LOCK count inc, though P1 saw flip. * * So basically, the ->po ordering on both P0 and P1 is enforced via ->ppo * (control deps) on both sides, and both P0 and P1 are interconnected by ->rf * relations. Combining the ->ppo with ->rf, a cycle is impossible. *) {} // updater P0(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1) { int lock1; int unlock1; int lock0; int unlock0; // SCAN1 unlock1 = READ_ONCE(*UNLOCK1); smp_mb(); // A lock1 = READ_ONCE(*LOCK1); // FLIP if (lock1 == unlock1) { // Control dep smp_mb(); // E // Remove E and still passes. WRITE_ONCE(*IDX, 1); smp_mb(); // D // SCAN2 unlock0 = READ_ONCE(*UNLOCK0); smp_mb(); // A lock0 = READ_ONCE(*LOCK0); } } // reader P1(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1) { int tmp; int idx1; int idx2; // 1st reader idx1 = READ_ONCE(*IDX); if (idx1 == 0) { // Control dep tmp = READ_ONCE(*LOCK0); WRITE_ONCE(*LOCK0, tmp + 1); smp_mb(); /* B and C */ tmp = READ_ONCE(*UNLOCK0); WRITE_ONCE(*UNLOCK0, tmp + 1); } else { tmp = READ_ONCE(*LOCK1); WRITE_ONCE(*LOCK1, tmp + 1); smp_mb(); /* B and C */ tmp = READ_ONCE(*UNLOCK1); WRITE_ONCE(*UNLOCK1, tmp + 1); } } exists (0:lock1=1 /\ 1:idx1=1) More complicated litmus tests with multiple SRCU readers also show that memory barrier E is not needed. This commit therefore clarifies the comment on memory barrier E. Why not also remove that redundant smp_mb()? Because control dependencies are quite fragile due to their not being recognized by most compilers and tools. Control dependencies therefore exact an ongoing maintenance burden, and such a burden cannot be justified in this slowpath. Therefore, that smp_mb() stays until such time as its overhead becomes a measurable problem in a real workload running on a real production system, or until such time as compilers start paying attention to this sort of control dependency. Co-developed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Co-developed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05Documentation/RCU: s/not/note/ in checklist.rstQiuxu Zhuo
"Please not that you *cannot* rely..." has a typo. Fix it. Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05srcu: Add comments for srcu_size_statePingfan Liu
The SRCU_SIZE_* names are not self-explanatory, so this commit therefore adds comments to the definitions. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Zhang, Qiang1" <qiang1.zhang@intel.com> To: rcu@vger.kernel.org Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05rcu: Further comment and explain the state space of GP sequencesFrederic Weisbecker
The state space of the GP sequence number isn't documented and the definitions of its special values are scattered. This commit therefore gathers some common knowledge near the grace-period sequence-number definitions. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05arm64: dts: imx8mm-evk: correct pmic clock sourcePeng Fan
The osc_32k supports #clock-cells as 0, using an id is wrong, drop it. Fixes: a6a355ede574 ("arm64: dts: imx8mm-evk: Add 32.768 kHz clock to PMIC") Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Marco Felsch <m.felsch@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org>