summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2020-01-27Merge tag 'audit-pr-20200127' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit update from Paul Moore: "One small audit patch for the Linux v5.6 merge window, and unsurprisingly it passes our test suite with flying colors" * tag 'audit-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: Add __rcu annotation to RCU pointer
2020-01-27Merge branch 'for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cgroup2 interface for hugetlb controller. I think this was the last remaining bit which was missing from cgroup2 - fixes for race and a spurious warning in threaded cgroup handling - other minor changes * 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: iocost: Fix iocost_monitor.py due to helper type mismatch cgroup: Prevent double killing of css when enabling threaded cgroup cgroup: fix function name in comment mm: hugetlb controller for cgroups v2
2020-01-27Merge branch 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: "Just a couple tracepoint patches" * 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: remove workqueue_work event class workqueue: add worker function to workqueue_execute_end tracepoint
2020-01-27Merge tag 'pm-5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add ACPI support to the intel_idle driver along with an admin guide document for it, add support for CPR (Core Power Reduction) to the AVS (Adaptive Voltage Scaling) subsystem, add new hardware support in a few places, add some new sysfs attributes, debugfs files and tracepoints, fix bugs and clean up a bunch of things all over. Specifics: - Update the ACPI processor driver in order to export acpi_processor_evaluate_cst() to the code outside of it, add ACPI support to the intel_idle driver based on that and clean up that driver somewhat (Rafael Wysocki). - Add an admin guide document for the intel_idle driver (Rafael Wysocki). - Clean up cpuidle core and drivers, enable compilation testing for some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael Wysocki, Yangtao Li). - Fix reference counting of OPP (operating performance points) table structures (Viresh Kumar). - Add support for CPR (Core Power Reduction) to the AVS (Adaptive Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King, YueHaibing). - Add support for TigerLake Mobile and JasperLake to the Intel RAPL power capping driver (Zhang Rui). - Update cpufreq drivers: - Add i.MX8MP support to imx-cpufreq-dt (Anson Huang). - Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva). - Fix cpufreq policy reference counting issues in s3c and brcmstb-avs (chenqiwu). - Fix ACPI table reference counting issue and HiSilicon quirk handling in the CPPC driver (Hanjun Guo). - Clean up spelling mistake in intel_pstate (Harry Pan). - Convert the kirkwood and tegra186 drivers to using devm_platform_ioremap_resource() (Yangtao Li). - Update devfreq core: - Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi). - Clean up the handing of transition statistics and allow them to be reset by writing 0 to the 'trans_stat' devfreq device attribute in sysfs (Kamil Konieczny). - Add 'devfreq_summary' to debugfs (Chanwoo Choi). - Clean up kerneldoc comments and Kconfig indentation (Krzysztof Kozlowski, Randy Dunlap). - Update devfreq drivers: - Add dynamic scaling for the imx8m DDR controller and clean up imx8m-ddrc (Leonard Crestez, YueHaibing). - Fix DT node reference counting and nitialization error code path in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency for it (Chanwoo Choi, Yangtao Li). - Fix DT node reference counting in rockchip-dfi and make it use devm_platform_ioremap_resource() (Yangtao Li). - Fix excessive stack usage in exynos-ppmu (Arnd Bergmann). - Fix initialization error code paths in exynos-bus (Yangtao Li). - Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof Kozlowski). - Add tracepoints for tracking usage_count updates unrelated to status changes in PM-runtime (Michał Mirosław). - Add sysfs attribute to control the "sync on suspend" behavior during system-wide suspend (Jonas Meurer). - Switch system-wide suspend tests over to 64-bit time (Alexandre Belloni). - Make wakeup sources statistics in debugfs cover deleted ones which used to be the case some time ago (zhuguangqing). - Clean up computations carried out during hibernation, update messages related to hibernation and fix a spelling mistake in one of them (Wen Yang, Luigi Semenzato, Colin Ian King). - Add mailmap entry for maintainer e-mail address that has not been functional for several years (Rafael Wysocki)" * tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits) cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG intel_idle: Clean up irtl_2_usec() intel_idle: Move 3 functions closer to their callers intel_idle: Annotate initialization code and data structures intel_idle: Move and clean up intel_idle_cpuidle_devices_uninit() intel_idle: Rearrange intel_idle_cpuidle_driver_init() intel_idle: Clean up NULL pointer check in intel_idle_init() intel_idle: Fold intel_idle_probe() into intel_idle_init() intel_idle: Eliminate __setup_broadcast_timer() cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings cpuidle: sysfs: fix warnings when compiling with W=1 cpuidle: coupled: fix warnings when compiling with W=1 cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior PM / devfreq: Add debugfs support with devfreq_summary file Documentation: admin-guide: PM: Add intel_idle document cpuidle: arm: Enable compile testing for some of drivers PM-runtime: add tracepoints for usage_count changes cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether" PM: hibernate: fix spelling mistake "shapshot" -> "snapshot" ...
2020-01-27Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The changes are a real mixed bag this time around. The only scary looking one from the diffstat is the uapi change to asm-generic/mman-common.h, but this has been acked by Arnd and is actually just adding a pair of comments in an attempt to prevent allocation of some PROT values which tend to get used for arch-specific purposes. We'll be using them for Branch Target Identification (a CFI-like hardening feature), which is currently under review on the mailing list. New architecture features: - Support for Armv8.5 E0PD, which benefits KASLR in the same way as KPTI but without the overhead. This allows KPTI to be disabled on CPUs that are not affected by Meltdown, even is KASLR is enabled. - Initial support for the Armv8.5 RNG instructions, which claim to provide access to a high bandwidth, cryptographically secure hardware random number generator. As well as exposing these to userspace, we also use them as part of the KASLR seed and to seed the crng once all CPUs have come online. - Advertise a bunch of new instructions to userspace, including support for Data Gathering Hint, Matrix Multiply and 16-bit floating point. Kexec: - Cleanups in preparation for relocating with the MMU enabled - Support for loading crash dump kernels with kexec_file_load() Perf and PMU drivers: - Cleanups and non-critical fixes for a couple of system PMU drivers FPU-less (aka broken) CPU support: - Considerable fixes to support CPUs without the FP/SIMD extensions, including their presence in heterogeneous systems. Good luck finding a 64-bit userspace that handles this. Modern assembly function annotations: - Start migrating our use of ENTRY() and ENDPROC() over to the new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to aid debuggers Kbuild: - Cleanup detection of LSE support in the assembler by introducing 'as-instr' - Remove compressed Image files when building clean targets IP checksumming: - Implement optimised IPv4 checksumming routine when hardware offload is not in use. An IPv6 version is in the works, pending testing. Hardware errata: - Work around Cortex-A55 erratum #1530923 Shadow call stack: - Work around some issues with Clang's integrated assembler not liking our perfectly reasonable assembly code - Avoid allocating the X18 register, so that it can be used to hold the shadow call stack pointer in future ACPI: - Fix ID count checking in IORT code. This may regress broken firmware that happened to work with the old implementation, in which case we'll have to revert it and try something else - Fix DAIF corruption on return from GHES handler with pseudo-NMIs Miscellaneous: - Whitelist some CPUs that are unaffected by Spectre-v2 - Reduce frequency of ASID rollover when KPTI is compiled in but inactive - Reserve a couple of arch-specific PROT flags that are already used by Sparc and PowerPC and are planned for later use with BTI on arm64 - Preparatory cleanup of our entry assembly code in preparation for moving more of it into C later on - Refactoring and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits) arm64: acpi: fix DAIF manipulation with pNMI arm64: kconfig: Fix alignment of E0PD help text arm64: Use v8.5-RNG entropy for KASLR seed arm64: Implement archrandom.h for ARMv8.5-RNG arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' arm64: entry: Avoid empty alternatives entries arm64: Kconfig: select HAVE_FUTEX_CMPXCHG arm64: csum: Fix pathological zero-length calls arm64: entry: cleanup sp_el0 manipulation arm64: entry: cleanup el0 svc handler naming arm64: entry: mark all entry code as notrace arm64: assembler: remove smp_dmb macro arm64: assembler: remove inherit_daif macro ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map() mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use arm64: Use macros instead of hard-coded constants for MAIR_EL1 arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list arm64: kernel: avoid x18 in __cpu_soft_restart arm64: kvm: stop treating register x18 as caller save arm64/lib: copy_page: avoid x18 register in assembler code ...
2020-01-27tracing/kprobes: Have uname use __get_str() in print_fmtSteven Rostedt (VMware)
Thomas Richter reported: > Test case 66 'Use vfs_getname probe to get syscall args filenames' > is broken on s390, but works on x86. The test case fails with: > > [root@m35lp76 perf]# perf test -F 66 > 66: Use vfs_getname probe to get syscall args filenames > :Recording open file: > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.004 MB /tmp/__perf_test.perf.data.TCdYj\ > (20 samples) ] > Looking at perf.data file for vfs_getname records for the file we touched: > FAILED! > [root@m35lp76 perf]# The root cause was the print_fmt of the kprobe event that referenced the "ustring" > Setting up the kprobe event using perf command: > > # ./perf probe "vfs_getname=getname_flags:72 pathname=filename:ustring" > > generates this format file: > [root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/probe/\ > vfs_getname/format > name: vfs_getname > ID: 1172 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:unsigned long __probe_ip; offset:8; size:8; signed:0; > field:__data_loc char[] pathname; offset:16; size:4; signed:1; > > print fmt: "(%lx) pathname=\"%s\"", REC->__probe_ip, REC->pathname Instead of using "__get_str(pathname)" it referenced it directly. Link: http://lkml.kernel.org/r/20200124100742.4050c15e@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 88903c464321 ("tracing/probe: Add ustring type for user-space string") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-01-27 The following pull-request contains BPF updates for your *net-next* tree. We've added 20 non-merge commits during the last 5 day(s) which contain a total of 24 files changed, 433 insertions(+), 104 deletions(-). The main changes are: 1) Make BPF trampolines and dispatcher aware for the stack unwinder, from Jiri Olsa. 2) Improve handling of failed CO-RE relocations in libbpf, from Andrii Nakryiko. 3) Several fixes to BPF sockmap and reuseport selftests, from Lorenz Bauer. 4) Various cleanups in BPF devmap's XDP flush code, from John Fastabend. 5) Fix BPF flow dissector when used with port ranges, from Yoshiki Komachi. 6) Fix bpffs' map_seq_next callback to always inc position index, from Vasily Averin. 7) Allow overriding LLVM tooling for runqslower utility, from Andrey Ignatov. 8) Silence false-positive lockdep splats in devmap hash lookup, from Amol Grover. 9) Fix fentry/fexit selftests to initialize a variable before use, from John Sperbeck. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27Merge branches 'pm-cpufreq' and 'pm-sleep'Rafael J. Wysocki
* pm-cpufreq: cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether" cpufreq: s3c: fix unbalances of cpufreq policy refcount cpufreq: imx-cpufreq-dt: Add i.MX8MP support cpufreq: Use imx-cpufreq-dt for i.MX8MP's speed grading cpufreq: tegra186: convert to devm_platform_ioremap_resource cpufreq: kirkwood: convert to devm_platform_ioremap_resource cpufreq: CPPC: put ACPI table after using it cpufreq : CPPC: Break out if HiSilicon CPPC workaround is matched * pm-sleep: PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior PM: hibernate: fix spelling mistake "shapshot" -> "snapshot" PM: hibernate: Add more logging on hibernation failure PM: hibernate: improve arithmetic division in preallocate_highmem_fraction() PM: wakeup: Show statistics for deleted wakeup sources again PM: sleep: Switch to rtc_time64_to_tm()/rtc_tm_to_time64()
2020-01-27bpf, xdp: Remove no longer required rcu_read_{un}lock()John Fastabend
Now that we depend on rcu_call() and synchronize_rcu() to also wait for preempt_disabled region to complete the rcu read critical section in __dev_map_flush() is no longer required. Except in a few special cases in drivers that need it for other reasons. These originally ensured the map reference was safe while a map was also being free'd. And additionally that bpf program updates via ndo_bpf did not happen while flush updates were in flight. But flush by new rules can only be called from preempt-disabled NAPI context. The synchronize_rcu from the map free path and the rcu_call from the delete path will ensure the reference there is safe. So lets remove the rcu_read_lock and rcu_read_unlock pair to avoid any confusion around how this is being protected. If the rcu_read_lock was required it would mean errors in the above logic and the original patch would also be wrong. Now that we have done above we put the rcu_read_lock in the driver code where it is needed in a driver dependent way. I think this helps readability of the code so we know where and why we are taking read locks. Most drivers will not need rcu_read_locks here and further XDP drivers already have rcu_read_locks in their code paths for reading xdp programs on RX side so this makes it symmetric where we don't have half of rcu critical sections define in driver and the other half in devmap. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/1580084042-11598-4-git-send-email-john.fastabend@gmail.com
2020-01-27bpf, xdp: Update devmap comments to reflect napi/rcu usageJohn Fastabend
Now that we rely on synchronize_rcu and call_rcu waiting to exit perempt-disable regions (NAPI) lets update the comments to reflect this. Fixes: 0536b85239b84 ("xdp: Simplify devmap cleanup") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/1580084042-11598-2-git-send-email-john.fastabend@gmail.com
2020-01-27bpf: map_seq_next should always increase position indexVasily Averin
If seq_file .next fuction does not change position index, read after some lseek can generate an unexpected output. See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283 v1 -> v2: removed missed increment in end of function Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/eca84fdd-c374-a154-d874-6c7b55fc3bc4@virtuozzo.com
2020-01-26sched.h: Annotate sighand_struct with __rcuMadhuparna Bhowmik
This patch fixes the following sparse errors by annotating the sighand_struct with __rcu kernel/fork.c:1511:9: error: incompatible types in comparison expression kernel/exit.c:100:19: error: incompatible types in comparison expression kernel/signal.c:1370:27: error: incompatible types in comparison expression This fix introduces the following sparse error in signal.c due to checking the sighand pointer without rcu primitives: kernel/signal.c:1386:21: error: incompatible types in comparison expression This new sparse error is also fixed in this patch. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20200124045908.26389-1-madhuparnabhowmik10@gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25rcu: Forgive slow expedited grace periods at boot timePaul E. McKenney
Boot-time processing often loops in the kernel longer than one might prefer, which can prevent expedited grace periods from completing in a timely manner. This in turn triggers a splat In nohz_full CPUs One could argue that long-looping code should be fixed, but on the other hand, boot time is a bit special. This commit therefore removes the splat. Later commits will add the splat back in, but in a way that removes false positives. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-25tracing: Use pr_err() instead of WARN() for memory failuresSteven Rostedt (VMware)
As warnings can trigger panics, especially when "panic_on_warn" is set, memory failure warnings can cause panics and fail fuzz testers that are stressing memory. Create a MEM_FAIL() macro to use instead of WARN() in the tracing code (perhaps this should be a kernel wide macro?), and use that for memory failure issues. This should stop failing fuzz tests due to warnings. Link: https://lore.kernel.org/r/CACT4Y+ZP-7np20GVRu3p+eZys9GPtbu+JpfV+HtsufAzvTgJrg@mail.gmail.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-25bpf: Allow to resolve bpf trampoline and dispatcher in unwindJiri Olsa
When unwinding the stack we need to identify each address to successfully continue. Adding latch tree to keep trampolines for quick lookup during the unwind. The patch uses first 48 bytes for latch tree node, leaving 4048 bytes from the rest of the page for trampoline or dispatcher generated code. It's still enough not to affect trampoline and dispatcher progs maximum counts. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-3-jolsa@kernel.org
2020-01-25bpf: Allow BTF ctx access for string pointersJiri Olsa
When accessing the context we allow access to arguments with scalar type and pointer to struct. But we deny access for pointer to scalar type, which is the case for many functions. Alexei suggested to take conservative approach and allow currently only string pointer access, which is the case for most functions now: Adding check if the pointer is to string type and allow access to it. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-2-jolsa@kernel.org
2020-01-25Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Expedited grace-period updates - kfree_rcu() updates - RCU list updates - Preemptible RCU updates - Torture-test updates - Miscellaneous fixes - Documentation updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-24tracing: Decrement trace_array when bootconfig creates an instanceSteven Rostedt (VMware)
The trace_array_get_by_name() creates a ftrace instance and trace_array_put() is used to remove the reference. Even though the trace_array_get_by_name() creates the instance, it also adds a reference count to it, that prevents user space from removing it. As the bootconfig just creates the instance on boot up, it should still be used where it can be deleted by user space after boot. A trace_array_put() is required to let that happen. Also, change the documentation on trace_array_get_by_name() to make this not be so confusing. Link: https://lore.kernel.org/r/20200124205927.76128804@rorschach.local.home Fixes: 4f712a4d04a4e ("tracing/boot: Add instance node support") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Remove unneeded NULL checkDan Carpenter
We checked "iter->trace" earlier so there is no need to check here. Link: http://lkml.kernel.org/r/20141122183012.GB6994@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Set kernel_stack's caller size properlyJosef Bacik
I noticed when trying to use the trace-cmd python interface that reading the raw buffer wasn't working for kernel_stack events. This is because it uses a stubbed version of __dynamic_array that doesn't do the __data_loc trick and encode the length of the array into the field. Instead it just shows up as a size of 0. So change this to __array and set the len to FTRACE_STACK_ENTRIES since this is what we actually do in practice and matches how user_stack_trace works. Link: http://lkml.kernel.org/r/1411589652-1318-1-git-send-email-jbacik@fb.com Signed-off-by: Josef Bacik <jbacik@fb.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Fix tracing_stat return values in error handling pathsLuis Henriques
tracing_stat_init() was always returning '0', even on the error paths. It now returns -ENODEV if tracing_init_dentry() fails or -ENOMEM if it fails to created the 'trace_stat' debugfs directory. Link: http://lkml.kernel.org/r/1410299381-20108-1-git-send-email-luis.henriques@canonical.com Fixes: ed6f1c996bfe4 ("tracing: Check return value of tracing_init_dentry()") Signed-off-by: Luis Henriques <luis.henriques@canonical.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24tracing: Fix very unlikely race of registering two stat tracersSteven Rostedt (VMware)
Looking through old emails in my INBOX, I came across a patch from Luis Henriques that attempted to fix a race of two stat tracers registering the same stat trace (extremely unlikely, as this is done in the kernel, and probably doesn't even exist). The submitted patch wasn't quite right as it needed to deal with clean up a bit better (if two stat tracers were the same, it would have the same files). But to make the code cleaner, all we needed to do is to keep the all_stat_sessions_mutex held for most of the registering function. Link: http://lkml.kernel.org/r/1410299375-20068-1-git-send-email-luis.henriques@canonical.com Fixes: 002bb86d8d42f ("tracing/ftrace: separate events tracing and stats tracing engine") Reported-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-24alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=nStephen Boyd
The stubbed version of alarmtimer_get_rtcdev() is not exported. so this won't work if this function is used in a module when CONFIG_RTC_CLASS=n. Move the stub function to the header file and make it inline so that callers don't have to worry about linking against this symbol. rtcdev isn't used outside of this ifdef so it's not required to be redefined to NULL. Drop that while touching this area. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200124055849.154411-4-swboyd@chromium.org
2020-01-24alarmtimer: Use wakeup source from alarmtimer platform deviceStephen Boyd
Use the wakeup source that can be associated with the 'alarmtimer' platform device instead of registering another one by hand. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200124055849.154411-3-swboyd@chromium.org
2020-01-24alarmtimer: Make alarmtimer platform device child of RTC deviceStephen Boyd
The alarmtimer_suspend() function will fail if an RTC device is on a bus such as SPI or i2c and that RTC device registers and probes after alarmtimer_init() registers and probes the 'alarmtimer' platform device. This is because system wide suspend suspends devices in the reverse order of their probe. When alarmtimer_suspend() attempts to program the RTC for a wakeup it will try to program an RTC device on a bus that has already been suspended. Move the alarmtimer device registration to happen when the RTC which is used for wakeup is registered. Register the 'alarmtimer' platform device as a child of the RTC device too, so that it can be guaranteed that the RTC device won't be suspended when alarmtimer_suspend() is called. Reported-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200124055849.154411-2-swboyd@chromium.org
2020-01-24alarmtimer: Update alarmtimer_get_rtcdev() docs to reflect realityStephen Boyd
This function doesn't do anything like this comment says when an RTC device hasn't been chosen. It looks like we used to do something like that before commit 8bc0dafb5cf3 ("alarmtimers: Rework RTC device selection using class interface") but that's long gone now. Remove this sentence to avoid confusing the reader. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200124055849.154411-5-swboyd@chromium.org
2020-01-24smp: Remove allocation mask from on_each_cpu_cond.*()Sebastian Andrzej Siewior
The allocation mask is no longer used by on_each_cpu_cond() and on_each_cpu_cond_mask() and can be removed. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20200117090137.1205765-4-bigeasy@linutronix.de
2020-01-24smp: Add a smp_cond_func_t argument to smp_call_function_many()Sebastian Andrzej Siewior
on_each_cpu_cond_mask() allocates a new CPU mask. The newly allocated mask is a subset of the provided mask based on the conditional function. This memory allocation can be avoided by extending smp_call_function_many() with the conditional function and performing the remote function call based on the mask and the conditional function. Rename smp_call_function_many() to smp_call_function_many_cond() and add the smp_cond_func_t argument. If smp_cond_func_t is provided then it is used before invoking the function. Provide smp_call_function_many() with cond_func set to NULL. Let on_each_cpu_cond_mask() use smp_call_function_many_cond(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20200117090137.1205765-3-bigeasy@linutronix.de
2020-01-24smp: Use smp_cond_func_t as type for the conditional functionSebastian Andrzej Siewior
Use a typdef for the conditional function instead defining it each time in the function prototype. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20200117090137.1205765-2-bigeasy@linutronix.de
2020-01-24Merge tag 'irqchip-5.6' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Conversion of the SiFive PLIC to hierarchical domains - New SiFive GPIO irqchip driver - New Aspeed SCI irqchip driver - New NXP INTMUX irqchip driver - Additional support for the Meson A1 GPIO irqchip - First part of the GICv4.1 support - Assorted fixes
2020-01-24Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', ↵Paul E. McKenney
'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD doc.2019.12.10a: Documentations updates exp.2019.12.09a: Expedited grace-period updates fixes.2020.01.24a: Miscellaneous fixes kfree_rcu.2020.01.24a: Batch kfree_rcu() work list.2020.01.10a: RCU-protected-list updates preempt.2020.01.24a: Preemptible RCU updates torture.2019.12.09a: Torture-test updates
2020-01-24rcu: Remove unused stop-machine #includePaul E. McKenney
Long ago, RCU used the stop-machine mechanism to implement expedited grace periods, but no longer does so. This commit therefore removes the no-longer-needed #includes of linux/stop_machine.h. Link: https://lwn.net/Articles/805317/ Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24srcu: Apply *_ONCE() to ->srcu_last_gp_endPaul E. McKenney
The ->srcu_last_gp_end field is accessed from any CPU at any time by synchronize_srcu(), so non-initialization references need to use READ_ONCE() and WRITE_ONCE(). This commit therefore makes that change. Reported-by: syzbot+08f3e9d26e5541e1ecf2@syzkaller.appspotmail.com Acked-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Switch force_qs_rnp() to for_each_leaf_node_cpu_mask()Paul E. McKenney
Currently, force_qs_rnp() uses a for_each_leaf_node_possible_cpu() loop containing a check of the current CPU's bit in ->qsmask. This works, but this commit saves three lines by instead using for_each_leaf_node_cpu_mask(), which combines the functionality of for_each_leaf_node_possible_cpu() and leaf_node_cpu_bit(). This commit also replaces the use of the local variable "bit" with rdp->grpmask. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Move rcu_{expedited,normal} definitions into rcupdate.hBen Dooks
This commit moves the rcu_{expedited,normal} definitions from kernel/rcu/update.c to include/linux/rcupdate.h to make sure they are in sync, and also to avoid the following warning from sparse: kernel/ksysfs.c:150:5: warning: symbol 'rcu_expedited' was not declared. Should it be static? kernel/ksysfs.c:167:5: warning: symbol 'rcu_normal' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.hLai Jiangshan
Only tree_stall.h needs to get name from GP state, so this commit moves the gp_state_names[] array and the gp_state_getname() from kernel/rcu/tree.h and kernel/rcu/tree.c, respectively, to kernel/rcu/tree_stall.h. While moving gp_state_names[], this commit uses the GCC syntax to ensure that the right string is associated with the right CPP macro. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove the declaration of call_rcu() in tree.hLai Jiangshan
The call_rcu() function is an external RCU API that is declared in include/linux/rcupdate.h. There is thus no point in redeclaring it in kernel/rcu/tree.h, so this commit removes that redundant declaration. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Fix tracepoint tracking RCU CPU kthread utilizationLai Jiangshan
In the call to trace_rcu_utilization() at the start of the loop in rcu_cpu_kthread(), "rcu_wait" is incorrect, plus this trace event needs to be hoisted above the loop to balance with either the "rcu_wait" or "rcu_yield", depending on how the loop exits. This commit therefore makes these changes. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Fix harmless omission of "CONFIG_" from #if conditionLai Jiangshan
The C preprocessor macros SRCU and TINY_RCU should instead be CONFIG_SRCU and CONFIG_TINY_RCU, respectively in the #f in kernel/rcu/rcu.h. But there is no harm when "TINY_RCU" is wrongly used, which are always non-defined, which makes "!defined(TINY_RCU)" always true, which means the code block is always included, and the included code block doesn't cause any compilation error so far in CONFIG_TINY_RCU builds. It is also the reason this change should not be taken in -stable. This commit adds the needed "CONFIG_" prefix to both macros. Not for -stable. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Avoid tick_dep_set_cpu() misorderingPaul E. McKenney
In the current code, rcu_nmi_enter_common() might decide to turn on the tick using tick_dep_set_cpu(), but be delayed just before doing so. Then the grace-period kthread might notice that the CPU in question had in fact gone through a quiescent state, thus turning off the tick using tick_dep_clear_cpu(). The later invocation of tick_dep_set_cpu() would then incorrectly leave the tick on. This commit therefore enlists the aid of the leaf rcu_node structure's ->lock to ensure that decisions to enable or disable the tick are carried out before they can be reversed. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Provide wrappers for uses of ->rcu_read_lock_nestingLai Jiangshan
This commit provides wrapper functions for uses of ->rcu_read_lock_nesting to improve readability and to ease future changes to support inlining of __rcu_read_lock() and __rcu_read_unlock(). Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()Paul E. McKenney
The rcu_node structure's ->expmask field is updated only when holding the ->lock, but is also accessed locklessly. This means that all ->expmask updates must use WRITE_ONCE() and all reads carried out without holding ->lock must use READ_ONCE(). This commit therefore changes the lockless ->expmask read in rcu_read_unlock_special() to use READ_ONCE(). Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Marco Elver <elver@google.com>
2020-01-24rcu: Clear ->rcu_read_unlock_special only onceLai Jiangshan
In rcu_preempt_deferred_qs_irqrestore(), ->rcu_read_unlock_special is cleared one piece at a time. Given that the "if" statements in this function use the copy in "special", this commit removes the clearing of the individual pieces in favor of clearing ->rcu_read_unlock_special in one go just after it has been determined to be non-zero. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Clear .exp_hint only when deferred quiescent state has been reportedLai Jiangshan
Currently, the .exp_hint flag is cleared in rcu_read_unlock_special(), which works, but which can also prevent subsequent rcu_read_unlock() calls from helping expedite the quiescent state needed by an ongoing expedited RCU grace period. This commit therefore defers clearing of .exp_hint from rcu_read_unlock_special() to rcu_preempt_deferred_qs_irqrestore(), thus ensuring that intervening calls to rcu_read_unlock() have a chance to help end the expedited grace period. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Rename some instance of CONFIG_PREEMPTION to CONFIG_PREEMPT_RCULai Jiangshan
CONFIG_PREEMPTION and CONFIG_PREEMPT_RCU are always identical, but some code depends on CONFIG_PREEMPTION to access to rcu_preempt functionality. This patch changes CONFIG_PREEMPTION to CONFIG_PREEMPT_RCU in these cases. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove kfree_call_rcu_nobatch()Joel Fernandes (Google)
Now that the kfree_rcu() special-casing has been removed from tree RCU, this commit removes kfree_call_rcu_nobatch() since it is no longer needed. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove kfree_rcu() special casing and lazy-callback handlingJoel Fernandes (Google)
This commit removes kfree_rcu() special-casing and the lazy-callback handling from Tree RCU. It moves some of this special casing to Tiny RCU, the removal of which will be the subject of later commits. This results in a nice negative delta. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Add support for debug_objects debugging for kfree_rcu()Joel Fernandes (Google)
This commit applies RCU's debug_objects debugging to the new batched kfree_rcu() implementations. The object is queued at the kfree_rcu() call and dequeued during reclaim. Tested that enabling CONFIG_DEBUG_OBJECTS_RCU_HEAD successfully detects double kfree_rcu() calls. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Fix IRQ per kbuild test robot <lkp@intel.com> feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Add multiple in-flight batches of kfree_rcu() workJoel Fernandes (Google)
During testing, it was observed that amount of memory consumed due kfree_rcu() batching is 300-400MB. Previously we had only a single head_free pointer pointing to the list of rcu_head(s) that are to be freed after a grace period. Until this list is drained, we cannot queue any more objects on it since such objects may not be ready to be reclaimed when the worker thread eventually gets to drainin g the head_free list. We can do better by maintaining multiple lists as done by this patch. Testing shows that memory consumption came down by around 100-150MB with just adding another list. Adding more than 1 additional list did not show any improvement. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Code style and initialization handling. ] [ paulmck: Fix field name, reported by kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>