summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-04-27ARM: dts: rockchip: fix phy nodename for rk3228-evbJohan Jonker
A test with the command below gives for example this error: arch/arm/boot/dts/rk3228-evb.dt.yaml: phy@0: '#phy-cells' is a required property The phy nodename is normally used by a phy-handle. This node is however compatible with "ethernet-phy-id1234.d400", "ethernet-phy-ieee802.3-c22" which is just been added to 'ethernet-phy.yaml'. So change nodename to 'ethernet-phy' for which '#phy-cells' is not a required property make ARCH=arm dtbs_check DT_SCHEMA_FILES=~/.local/lib/python3.5/site-packages/dtschema/schemas/ phy/phy-provider.yaml Signed-off-by: Johan Jonker <jbx6244@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20200416170321.4216-1-jbx6244@gmail.com
2020-04-27net/sonic: Fix a resource leak in an error handling path in 'jazz_sonic_probe()'Christophe JAILLET
A call to 'dma_alloc_coherent()' is hidden in 'sonic_alloc_descriptors()', called from 'sonic_probe1()'. This is correctly freed in the remove function, but not in the error handling path of the probe function. Fix it and add the missing 'dma_free_coherent()' call. While at it, rename a label in order to be slightly more informative. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27net: tc35815: Fix phydev supported/advertising maskAnthony Felice
Commit 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") updated ethernet drivers to use a linkmode bitmap. It mistakenly dropped a bitwise negation in the tc35815 ethernet driver on a bitmask to set the supported/advertising flags. Found by Anthony via code inspection, not tested as I do not have the required hardware. Fixes: 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: Anthony Felice <tony.felice@timesys.com> Reviewed-by: Akshay Bhat <akshay.bhat@timesys.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27sch_sfq: validate silly quantum valuesEric Dumazet
syzbot managed to set up sfq so that q->scaled_quantum was zero, triggering an infinite loop in sfq_dequeue() More generally, we must only accept quantum between 1 and 2^18 - 7, meaning scaled_quantum must be in [1, 0x7FFF] range. Otherwise, we also could have a loop in sfq_dequeue() if scaled_quantum happens to be 0x8000, since slot->allot could indefinitely switch between 0 and 0x8000. Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com Cc: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27Merge branch 'bnxt_en-fixes'David S. Miller
Michael Chan says: ==================== bnxt_en: Bug fixes. A collection of 5 miscellaneous bug fixes covering VF anti-spoof setup issues, devlink MSIX max value, AER, context memory allocation error path, and VLAN acceleration logic. Please queue for -stable. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27bnxt_en: Fix VLAN acceleration handling in bnxt_fix_features().Michael Chan
The current logic in bnxt_fix_features() will inadvertently turn on both CTAG and STAG VLAN offload if the user tries to disable both. Fix it by checking that the user is trying to enable CTAG or STAG before enabling both. The logic is supposed to enable or disable both CTAG and STAG together. Fixes: 5a9f6b238e59 ("bnxt_en: Enable and disable RX CTAG and RX STAG VLAN acceleration together.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27bnxt_en: Return error when allocating zero size context memory.Michael Chan
bnxt_alloc_ctx_pg_tbls() should return error when the memory size of the context memory to set up is zero. By returning success (0), the caller may proceed normally and may crash later when it tries to set up the memory. Fixes: 08fe9d181606 ("bnxt_en: Add Level 2 context memory paging support.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27bnxt_en: Improve AER slot reset.Michael Chan
Improve the slot reset sequence by disabling the device to prevent bad DMAs if slot reset fails. Return the proper result instead of always PCI_ERS_RESULT_RECOVERED to the caller. Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27bnxt_en: Reduce BNXT_MSIX_VEC_MAX value to supported CQs per PF.Vasundhara Volam
Broadcom adapters support only maximum of 512 CQs per PF. If user sets MSIx vectors more than supported CQs, firmware is setting incorrect value for msix_vec_per_pf_max parameter. Fix it by reducing the BNXT_MSIX_VEC_MAX value to 512, even though the maximum # of MSIx vectors supported by adapter are 1280. Fixes: f399e8497826 ("bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink params.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27bnxt_en: Fix VF anti-spoof filter setup.Michael Chan
Fix the logic that sets the enable/disable flag for the source MAC filter according to firmware spec 1.7.1. In the original firmware spec. before 1.7.1, the VF spoof check flags were not latched after making the HWRM_FUNC_CFG call, so there was a need to keep the func_flags so that subsequent calls would perserve the VF spoof check setting. A change was made in the 1.7.1 spec so that the flags became latched. So we now set or clear the anti- spoof setting directly without retrieving the old settings in the stored vf->func_flags which are no longer valid. We also remove the unneeded vf->func_flags. Fixes: 8eb992e876a8 ("bnxt_en: Update firmware interface spec to 1.7.6.2.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27net: phy: marvell10g: fix temperature sensor on 2110Baruch Siach
Read the temperature sensor register from the correct location for the 88E2110 PHY. There is no enable/disable bit on 2110, so make mv3310_hwmon_config() run on 88X3310 only. Fixes: 62d01535474b61 ("net: phy: marvell10g: add support for the 88x2110 PHY") Cc: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27sch_choke: avoid potential panic in choke_reset()Eric Dumazet
If choke_init() could not allocate q->tab, we would crash later in choke_reset(). BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline] BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326 Write of size 8 at addr 0000000000000000 by task syz-executor822/7022 CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 __kasan_report.cold+0x5/0x4d mm/kasan/report.c:515 kasan_report+0x33/0x50 mm/kasan/common.c:625 check_memory_region_inline mm/kasan/generic.c:187 [inline] check_memory_region+0x141/0x190 mm/kasan/generic.c:193 memset+0x20/0x40 mm/kasan/common.c:85 memset include/linux/string.h:366 [inline] choke_reset+0x208/0x340 net/sched/sch_choke.c:326 qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910 dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138 netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline] dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195 dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233 qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051 tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27fq_codel: fix TCA_FQ_CODEL_DROP_BATCH_SIZE sanity checksEric Dumazet
My intent was to not let users set a zero drop_batch_size, it seems I once again messed with min()/max(). Fixes: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27net/tls: Fix sk_psock refcnt leak when in tls_data_ready()Xiyu Yang
tls_data_ready() invokes sk_psock_get(), which returns a reference of the specified sk_psock object to "psock" with increased refcnt. When tls_data_ready() returns, local variable "psock" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of tls_data_ready(). When "psock->ingress_msg" is empty but "psock" is not NULL, the function forgets to decrease the refcnt increased by sk_psock_get(), causing a refcnt leak. Fix this issue by calling sk_psock_put() on all paths when "psock" is not NULL. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27net/x25: Fix x25_neigh refcnt leak when x25 disconnectXiyu Yang
x25_connect() invokes x25_get_neigh(), which returns a reference of the specified x25_neigh object to "x25->neighbour" with increased refcnt. When x25 connect success and returns, the reference still be hold by "x25->neighbour", so the refcount should be decreased in x25_disconnect() to keep refcount balanced. The reference counting issue happens in x25_disconnect(), which forgets to decrease the refcnt increased by x25_get_neigh() in x25_connect(), causing a refcnt leak. Fix this issue by calling x25_neigh_put() before x25_disconnect() returns. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27net/tls: Fix sk_psock refcnt leak in bpf_exec_tx_verdict()Xiyu Yang
bpf_exec_tx_verdict() invokes sk_psock_get(), which returns a reference of the specified sk_psock object to "psock" with increased refcnt. When bpf_exec_tx_verdict() returns, local variable "psock" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of bpf_exec_tx_verdict(). When "policy" equals to NULL but "psock" is not NULL, the function forgets to decrease the refcnt increased by sk_psock_get(), causing a refcnt leak. Fix this issue by calling sk_psock_put() on this error path before bpf_exec_tx_verdict() returns. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27aquantia: Fix the media type of AQC100 ethernet controller in the driverRichard Clark
The Aquantia AQC100 controller enables a SFP+ port, so the driver should configure the media type as '_TYPE_FIBRE' instead of '_TYPE_TP'. Signed-off-by: Richard Clark <richard.xnu.clark@gmail.com> Cc: Igor Russkikh <irusskikh@marvell.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27locktorture.c: Fix if-statement empty body warningsRandy Dunlap
When using -Wextra, gcc complains about torture_preempt_schedule() when its definition is empty (i.e., when CONFIG_PREEMPTION is not set/enabled). Fix these warnings by adding an empty do-while block for that macro when CONFIG_PREEMPTION is not set. Fixes these build warnings: ../kernel/locking/locktorture.c:119:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../kernel/locking/locktorture.c:166:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../kernel/locking/locktorture.c:337:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../kernel/locking/locktorture.c:490:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../kernel/locking/locktorture.c:528:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../kernel/locking/locktorture.c:553:29: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] I have verified that there is no object code change (with gcc 7.5.0). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org>
2020-04-27rcutorture: Mark data-race potential for rcu_barrier() test statisticsPaul E. McKenney
The n_barrier_successes, n_barrier_attempts, and n_rcu_torture_barrier_error variables are updated (without access markings) by the main rcu_barrier() test kthread, and accessed (also without access markings) by the rcu_torture_stats() kthread. This of course can result in KCSAN complaints. Because the accesses are in diagnostic prints, this commit uses data_race() to excuse the diagnostic prints from the data race. If this were to ever cause bogus statistics prints (for example, due to store tearing), any misleading information would be disambiguated by the presence or absence of an rcutorture splat. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely and due to the mild consequences of the failure, namely a confusing rcutorture console message. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2020-04-27rcutorture: Make kvm-recheck-rcu.sh handle truncated linesPaul E. McKenney
System hangs or killed rcutorture guest OSes can result in truncated "Reader Pipe:" lines, which can in turn result in false-positive reader-batch near-miss warnings. This commit therefore adjusts the reader-batch checks to account for possible line truncation. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcutorture: Add KCSAN stubsPaul E. McKenney
This commit adds stubs for KCSAN's data_race(), ASSERT_EXCLUSIVE_WRITER(), and ASSERT_EXCLUSIVE_ACCESS() macros to allow code using these macros to move ahead. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu: Remove self-stack-trace when all quiescent states seenPaul E. McKenney
When all quiescent states have been seen, it is normally the grace-period kthread that is in trouble. Although the existing stack trace from the current CPU might possibly provide useful information, experience indicates that there is too much noise for this to be worthwhile. This commit therefore removes this stack trace from the output. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu: When GP kthread is starved, tag idle threads as false positivesPaul E. McKenney
If the grace-period kthread is starved, idle threads' extended quiescent states are not reported. These idle threads thus wrongly appear to be blocking the current grace period. This commit therefore tags such idle threads as probable false positives when the grace-period kthread is being starved. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu: Use data_race() for RCU expedited CPU stall-warning printsPaul E. McKenney
Although the accesses used to determine whether or not an expedited stall should be printed are an integral part of the concurrency algorithm governing use of the corresponding variables, the values that are simply printed are ancillary. As such, it is best to use data_race() for these accesses in order to provide the greatest latitude in the use of KCSAN for the other accesses that are an integral part of the algorithm. This commit therefore changes the relevant uses of READ_ONCE() to data_race(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27ftrace: Use synchronize_rcu_tasks_rude() instead of ftrace_sync()Paul E. McKenney
This commit replaces the schedule_on_each_cpu(ftrace_sync) instances with synchronize_rcu_tasks_rude(). Suggested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> [ paulmck: Make Kconfig adjustments noted by kbuild test robot. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Allow standalone use of TASKS_{TRACE_,}RCUPaul E. McKenney
This commit allows TASKS_TRACE_RCU to be used independently of TASKS_RCU and vice versa. [ paulmck: Fix conditional compilation per kbuild test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add IPI failure count to statisticsPaul E. McKenney
This commit adds a failure-return count for smp_call_function_single(), and adds this to the console messages for rcutorture writer stalls and at the end of rcutorture testing. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcutorture: Add TRACE02 scenario enabling RCU Tasks Trace IPIsPaul E. McKenney
This commit adds a TRACE02 scenario which enables preemption and RCU Tasks Trace IPIs, more specifically, disabling heavyweight readers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add count for idle tasks on offline CPUsPaul E. McKenney
This commit adds a counter for the number of times the quiescent state was an idle task associated with an offline CPU, and prints this count at the end of rcutorture runs and at stall time. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add rcu_dynticks_zero_in_eqs() effectiveness statisticsPaul E. McKenney
This commit adds counts of the number of calls and number of successful calls to rcu_dynticks_zero_in_eqs(), which are printed at the end of rcutorture runs and at stall time. This allows evaluation of the effectiveness of rcu_dynticks_zero_in_eqs(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Make RCU tasks trace also wait for idle tasksPaul E. McKenney
This commit scans the CPUs, adding each CPU's idle task to the list of tasks that need quiescent states. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Handle the running-offline idle-task special casePaul E. McKenney
The idle task corresponding to an offline CPU can appear to be running while that CPU is offline. This commit therefore adds checks for this situation, treating it as a quiescent state. Because the tasklist scan and the holdout-list scan now exclude CPU-hotplug operations, readers on the CPU-hotplug paths are still waited for. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Disable CPU hotplug across RCU tasks trace scansPaul E. McKenney
This commit disables CPU hotplug across RCU tasks trace scans, which is a first step towards correctly recognizing idle tasks "running" on offline CPUs. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Allow rcu_read_unlock_trace() under scheduler locksPaul E. McKenney
The rcu_read_unlock_trace() can invoke rcu_read_unlock_trace_special(), which in turn can call wake_up(). Therefore, if any scheduler lock is held across a call to rcu_read_unlock_trace(), self-deadlock can occur. This commit therefore uses the irq_work facility to defer the wake_up() to a clean environment where no scheduler locks will be held. Reported-by: Steven Rostedt <rostedt@goodmis.org> [ paulmck: Update #includes for m68k per kbuild test robot. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so builtPaul E. McKenney
Systems running CPU-bound real-time task do not want IPIs sent to CPUs executing nohz_full userspace tasks. Battery-powered systems don't want IPIs sent to idle CPUs in low-power mode. Unfortunately, RCU tasks trace can and will send such IPIs in some cases. Both of these situations occur only when the target CPU is in RCU dyntick-idle mode, in other words, when RCU is not watching the target CPU. This suggests that CPUs in dyntick-idle mode should use memory barriers in outermost invocations of rcu_read_lock_trace() and rcu_read_unlock_trace(), which would allow the RCU tasks trace grace period to directly read out the target CPU's read-side state. One challenge is that RCU tasks trace is not targeting a specific CPU, but rather a task. And that task could switch from one CPU to another at any time. This commit therefore uses try_invoke_on_locked_down_task() and checks for task_curr() in trc_inspect_reader_notrunning(). When this condition holds, the target task is running and cannot move. If CONFIG_TASKS_TRACE_RCU_READ_MB=y, the new rcu_dynticks_zero_in_eqs() function can be used to check if the specified integer (in this case, t->trc_reader_nesting) is zero while the target CPU remains in that same dyntick-idle sojourn. If so, the target task is in a quiescent state. If not, trc_read_check_handler() must indicate failure so that the grace-period kthread can take appropriate action or retry after an appropriate delay, as the case may be. With this change, given CONFIG_TASKS_TRACE_RCU_READ_MB=y, if a given CPU remains idle or a given task continues executing in nohz_full mode, the RCU tasks trace grace-period kthread will detect this without the need to send an IPI. Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add Kconfig option to mediate smp_mb() vs. IPIPaul E. McKenney
This commit provides a new TASKS_TRACE_RCU_READ_MB Kconfig option that enables use of read-side memory barriers by both rcu_read_lock_trace() and rcu_read_unlock_trace() when the are executed with the current->trc_reader_special.b.need_mb flag set. This flag is currently never set. Doing that is the subject of a later commit. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add grace-period and IPI counts to statisticsPaul E. McKenney
This commit adds a grace-period count and a count of IPIs sent since boot, which is printed in response to rcutorture writer stalls and at the end of rcutorture testing. These counts will be used to evaluate various schemes to reduce the number of IPIs sent. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Split ->trc_reader_need_endPaul E. McKenney
This commit splits ->trc_reader_need_end by using the rcu_special union. This change permits readers to check to see if a memory barrier is required without any added overhead in the common case where no such barrier is required. This commit also adds the read-side checking. Later commits will add the machinery to properly set the new ->trc_reader_special.b.need_mb field. This commit also makes rcu_read_unlock_trace_special() tolerate nested read-side critical sections within interrupt and NMI handlers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Provide boot parameter to delay IPIs until late in grace periodPaul E. McKenney
This commit provides a rcupdate.rcu_task_ipi_delay kernel boot parameter that specifies how old the RCU tasks trace grace period must be before the grace-period kthread starts sending IPIs. This delay allows more tasks to pass through rcu_tasks_qs() quiescent states, thus reducing (or even eliminating) the number of IPIs that must be sent. On a short rcutorture test setting this kernel boot parameter to HZ/2 resulted in zero IPIs for all 877 RCU-tasks trace grace periods that elapsed during that test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add a grace-period start time for throttling and debugPaul E. McKenney
This commit adds a place to record the grace-period start in jiffies. This will be used by later commits for debugging purposes and to throttle IPIs early in the grace period. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Make RCU Tasks Trace make use of RCU scheduler hooksPaul E. McKenney
This commit makes the calls to rcu_tasks_qs() detect and report quiescent states for RCU tasks trace. If the task is in a quiescent state and if ->trc_reader_checked is not yet set, the task sets its own ->trc_reader_checked. This will cause the grace-period kthread to remove it from the holdout list if it still remains there. [ paulmck: Fix conditional compilation per kbuild test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Make rcutorture writer stall output include GP statePaul E. McKenney
This commit adds grace-period state and time to the rcutorture writer stall output. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add RCU tasks to rcutorture writer stall outputPaul E. McKenney
This commit adds state for each RCU-tasks flavor to the rcutorture writer stall output. The initial state is minimal, but you have to start somewhere. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ paulmck: Fixes based on feedback from kbuild test robot. ]
2020-04-27rcu-tasks: Move #ifdef into tasks.hPaul E. McKenney
This commit pushes the #ifdef CONFIG_TASKS_RCU_GENERIC from kernel/rcu/update.c to kernel/rcu/tasks.h in order to improve readability as more APIs are added. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add stall warnings for RCU Tasks TracePaul E. McKenney
This commit adds RCU CPU stall warnings for RCU Tasks Trace. These dump out any tasks blocking the current grace period, as well as any CPUs that have not responded to an IPI request. This happens in two phases, when initially extracting state from the tasks and later when waiting for any holdout tasks to check in. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcutorture: Add torture tests for RCU Tasks TracePaul E. McKenney
This commit adds the definitions required to torture the tracing flavor of RCU tasks. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add an RCU Tasks Trace to simplify protection of tracing hooksPaul E. McKenney
Because RCU does not watch exception early-entry/late-exit, idle-loop, or CPU-hotplug execution, protection of tracing and BPF operations is needlessly complicated. This commit therefore adds a variant of Tasks RCU that: o Has explicit read-side markers to allow finite grace periods in the face of in-kernel loops for PREEMPT=n builds. These markers are rcu_read_lock_trace() and rcu_read_unlock_trace(). o Protects code in the idle loop, exception entry/exit, and CPU-hotplug code paths. In this respect, RCU-tasks trace is similar to SRCU, but with lighter-weight readers. o Avoids expensive read-side instruction, having overhead similar to that of Preemptible RCU. There are of course downsides: o The grace-period code can send IPIs to CPUs, even when those CPUs are in the idle loop or in nohz_full userspace. This is mitigated by later commits. o It is necessary to scan the full tasklist, much as for Tasks RCU. o There is a single callback queue guarded by a single lock, again, much as for Tasks RCU. However, those early use cases that request multiple grace periods in quick succession are expected to do so from a single task, which makes the single lock almost irrelevant. If needed, multiple callback queues can be provided using any number of schemes. Perhaps most important, this variant of RCU does not affect the vanilla flavors, rcu_preempt and rcu_sched. The fact that RCU Tasks Trace readers can operate from idle, offline, and exception entry/exit in no way enables rcu_preempt and rcu_sched readers to do so. The memory ordering was outlined here: https://lore.kernel.org/lkml/20200319034030.GX3199@paulmck-ThinkPad-P72/ This effort benefited greatly from off-list discussions of BPF requirements with Alexei Starovoitov and Andrii Nakryiko. At least some of the on-list discussions are captured in the Link: tags below. In addition, KCSAN was quite helpful in finding some early bugs. Link: https://lore.kernel.org/lkml/20200219150744.428764577@infradead.org/ Link: https://lore.kernel.org/lkml/87mu8p797b.fsf@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/lkml/20200225221305.605144982@linutronix.de/ Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrii Nakryiko <andriin@fb.com> [ paulmck: Apply feedback from Steve Rostedt and Joel Fernandes. ] [ paulmck: Decrement trc_n_readers_need_end upon IPI failure. ] [ paulmck: Fix locking issue reported by rcutorture. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Code movement to allow more Tasks RCU variantsPaul E. McKenney
This commit does nothing but move rcu_tasks_wait_gp() up to a new section for common code. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Further refactor RCU-tasks to allow adding more variantsPaul E. McKenney
This commit refactors RCU tasks to allow variants to be added. These variants will share the current Tasks-RCU tasklist scan and the holdout list processing. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Use unique names for RCU-Tasks kthreads and messagesPaul E. McKenney
This commit causes the flavors of RCU Tasks to use different names for their kthreads and in their console messages. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>