summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2020-04-07Merge tag 'thermal-v5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Convert tsens configuration DT binding to yaml (Rajeshwari) - Add interrupt support on the rcar sensor (Niklas Söderlund) - Add a new Spreadtrum thermal driver (Baolin Wang) - Add thermal binding for the fsl scu board, a new API to retrieve the sensor id bound to the thermal zone and i.MX system controller sensor (Anson Huang)) - Remove warning log when a deferred probe is requested on Exynos (Marek Szyprowski) - Add the thermal monitoring unit support for imx8mm with its DT bindings (Anson Huang) - Rephrase the Kconfig text for clarity (Linus Walleij) - Use the gpio descriptor for the ti-soc-thermal (Linus Walleij) - Align msg structure to 4 bytes for i.MX SC, fix the Kconfig dependency, add the __may_be unused annotation for PM functions and the COMPILE_TEST option for imx8mm (Anson Huang) - Fix a dependency on regmap in Kconfig for qoriq (Yuantian Tang) - Add DT binding and support for the rcar gen3 r8a77961 and improve the error path on the rcar init function (Niklas Söderlund) - Cleanup and improvements for the tsens Qcom sensor (Amit Kucheria) - Improve code by removing lock and caching values in the rcar thermal sensor (Niklas Söderlund) - Cleanup in the qoriq drivers and add a call to imx_thermal_unregister_legacy_cooling in the removal function (Anson Huang) - Remove redundant 'maxItems' in tsens and sprd DT bindings (Rob Herring) - Change the thermal DT bindings by making the cooling-maps optional (Yuantian Tang) - Add Tiger Lake support (Sumeet Pawnikar) - Use scnprintf() for avoiding potential buffer overflow (Takashi Iwai) - Make pkg_temp_lock a raw_spinlock_t(Clark Williams) - Fix incorrect data types by changing them to signed on i.MX SC (Anson Huang) - Replace zero-length array with flexible-array member (Gustavo A. R. Silva) - Add support for i.MX8MP in the driver and in the DT bindings (Anson Huang) - Fix return value of the cpufreq_set_cur_state() function (Willy Wolff) - Remove abusing and scary WARN_ON in the cpufreq cooling device (Daniel Lezcano) - Fix build warning of incorrect argument type reported by sparse on imx8mm (Anson Huang) - Fix stub for the devfreq cooling device (Martin Blumenstingl) - Fix cpu idle cooling documentation (Sergey Vidishev) * tag 'thermal-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (52 commits) Documentation: cpu-idle-cooling: Fix diagram for 33% duty cycle thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n thermal: imx8mm: Fix build warning of incorrect argument type thermal/drivers/cpufreq_cooling: Remove abusing WARN_ON thermal/drivers/cpufreq_cooling: Fix return of cpufreq_set_cur_state thermal: imx8mm: Add i.MX8MP support dt-bindings: thermal: imx8mm-thermal: Add support for i.MX8MP thermal: qcom: tsens.h: Replace zero-length array with flexible-array member thermal: imx_sc_thermal: Fix incorrect data type thermal: int340x_thermal: Use scnprintf() for avoiding potential buffer overflow thermal: int340x: processor_thermal: Add Tiger Lake support thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t dt-bindings: thermal: make cooling-maps property optional dt-bindings: thermal: qcom-tsens: Remove redundant 'maxItems' dt-bindings: thermal: sprd: Remove redundant 'maxItems' thermal: imx: Calling imx_thermal_unregister_legacy_cooling() in .remove thermal: qoriq: Sort includes alphabetically thermal: qoriq: Use devm_add_action_or_reset() to handle all cleanups thermal: rcar_thermal: Remove lock in rcar_thermal_get_current_temp() thermal: rcar_thermal: Do not store ctemp in rcar_thermal_priv ...
2020-04-07kselftest: introduce new epoll test caseRoman Penyaev
This testcase repeats epollbug.c from the bug: https://bugzilla.kernel.org/show_bug.cgi?id=205933 What it tests? It tests the race between epoll_ctl() and epoll_wait(). New event mask passed to epoll_ctl() triggers wake up, which can be missed because of the bug described in the link. Reproduction is 100%, so easy to fix. Kudos, Max, for wonderful test case. Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Max Neunhoeffer <max@arangodb.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Jes Sorensen <jes.sorensen@gmail.com> Link: http://lkml.kernel.org/r/20200214170211.561524-2-rpenyaev@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07lib/rbtree: fix coding style of assignmentschenqiwu
Leave blank space between the right-hand and left-hand side of the assignment to meet the kernel coding style better. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Michel Lespinasse <walken@google.com> Link: http://lkml.kernel.org/r/1582621140-25850-1-git-send-email-qiwuchen55@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07userfaultfd: selftests: add write-protect testPeter Xu
Add uffd tests for write protection. Instead of introducing new tests for it, let's simply squashing uffd-wp tests into existing uffd-missing test cases. Changes are: (1) Bouncing tests We do the write-protection in two ways during the bouncing test: - By using UFFDIO_COPY_MODE_WP when resolving MISSING pages: then we'll make sure for each bounce process every single page will be at least fault twice: once for MISSING, once for WP. - By direct call UFFDIO_WRITEPROTECT on existing faulted memories: To further torture the explicit page protection procedures of uffd-wp, we split each bounce procedure into two halves (in the background thread): the first half will be MISSING+WP for each page as explained above. After the first half, we write protect the faulted region in the background thread to make sure at least half of the pages will be write protected again which is the first half to test the new UFFDIO_WRITEPROTECT call. Then we continue with the 2nd half, which will contain both MISSING and WP faulting tests for the 2nd half and WP-only faults from the 1st half. (2) Event/Signal test Mostly previous tests but will do MISSING+WP for each page. For sigbus-mode test we'll need to provide standalone path to handle the write protection faults. For all tests, do statistics as well for uffd-wp pages. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Martin Cracauer <cracauer@cons.org> Cc: Marty McFadden <mcfadden8@llnl.gov> Cc: Maya Gokhale <gokhale2@llnl.gov> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shli@fb.com> Link: http://lkml.kernel.org/r/20200220163112.11409-20-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07userfaultfd: selftests: refactor statisticsPeter Xu
Introduce uffd_stats structure for statistics of the self test, at the same time refactor the code to always pass in the uffd_stats for either read() or poll() typed fault handling threads instead of using two different ways to return the statistic results. No functional change. With the new structure, it's very easy to introduce new statistics. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Martin Cracauer <cracauer@cons.org> Cc: Marty McFadden <mcfadden8@llnl.gov> Cc: Maya Gokhale <gokhale2@llnl.gov> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shli@fb.com> Link: http://lkml.kernel.org/r/20200220163112.11409-19-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-05Merge tag 'perf-urgent-2020-04-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more perf updates from Thomas Gleixner: "Perf updates all over the place: core: - Support for cgroup tracking in samples to allow cgroup based analysis tools: - Support for cgroup analysis - Commandline option and hotkey for perf top to change the sort order - A set of fixes all over the place - Various build system related improvements - Updates of the X86 pmu event JSON data - Documentation updates" * tag 'perf-urgent-2020-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) perf python: Fix clang detection to strip out options passed in $CC perf tools: Support Python 3.8+ in Makefile perf script: Fix invalid read of directory entry after closedir() perf script report: Fix SEGFAULT when using DWARF mode perf script: add -S/--symbols documentation perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric perf events parser: Add missing Intel CPU events to parser perf script: Allow --symbol to accept hexadecimal addresses perf report/top TUI: Fix title line formatting perf top: Support hotkey to change sort order perf top: Support --group-sort-idx to change the sort order perf symbols: Fix arm64 gap between kernel start and module end perf build-test: Honour JOBS to override detection of number of cores perf script: Add --show-cgroup-events option perf top: Add --all-cgroups option perf record: Add --all-cgroups option perf record: Support synthesizing cgroup events perf report: Add 'cgroup' sort key perf cgroup: Maintain cgroup hierarchy perf tools: Basic support for CGROUP event ...
2020-04-05Merge tag 'powerpc-5.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Slightly late as I had to rebase mid-week to insert a bug fix: - A large series from Nick for 64-bit to further rework our exception vectors, and rewrite portions of the syscall entry/exit and interrupt return in C. The result is much easier to follow code that is also faster in general. - Cleanup of our ptrace code to split various parts out that had become badly intertwined with #ifdefs over the years. - Changes to our NUMA setup under the PowerVM hypervisor which should hopefully avoid non-sensical topologies which can lead to warnings from the workqueue code and other problems. - MAINTAINERS updates to remove some of our old orphan entries and update the status of others. - Quite a few other small changes and fixes all over the map. Thanks to: Abdul Haleem, afzal mohammed, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Balamuruhan S, Cédric Le Goater, Chen Zhou, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Clement Courbet, Daniel Axtens, David Gibson, Douglas Miller, Fabiano Rosas, Fangrui Song, Ganesh Goudar, Gautham R. Shenoy, Greg Kroah-Hartman, Greg Kurz, Gustavo Luiz Duarte, Hari Bathini, Ilie Halip, Jan Kara, Joe Lawrence, Joe Perches, Kajol Jain, Larry Finger, Laurentiu Tudor, Leonardo Bras, Libor Pechacek, Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Masami Hiramatsu, Mauricio Faria de Oliveira, Michael Neuling, Michal Suchanek, Mike Rapoport, Nageswara R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Po-Hsu Lin, Pratik Rajesh Sampat, Rasmus Villemoes, Ravi Bangoria, Roman Bolshakov, Sam Bobroff, Sandipan Das, Santosh S, Sedat Dilek, Segher Boessenkool, Shilpasri G Bhat, Sourabh Jain, Srikar Dronamraju, Stephen Rothwell, Tyrel Datwyler, Vaibhav Jain, YueHaibing" * tag 'powerpc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits) powerpc: Make setjmp/longjmp signature standard powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type powerpc: Suppress .eh_frame generation powerpc: Drop -fno-dwarf2-cfi-asm powerpc/32: drop unused ISA_DMA_THRESHOLD powerpc/powernv: Add documentation for the opal sensor_groups sysfs interfaces selftests/powerpc: Fix try-run when source tree is not writable powerpc/vmlinux.lds: Explicitly retain .gnu.hash powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c powerpc/ptrace: create ppc_gethwdinfo() powerpc/ptrace: create ptrace_get_debugreg() powerpc/ptrace: split out ADV_DEBUG_REGS related functions. powerpc/ptrace: move register viewing functions out of ptrace.c powerpc/ptrace: split out TRANSACTIONAL_MEM related functions. powerpc/ptrace: split out SPE related functions. powerpc/ptrace: split out ALTIVEC related functions. powerpc/ptrace: split out VSX related functions. powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64 powerpc/ptrace: remove unused header includes ...
2020-04-05Merge tag 'trace-v5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - The ring buffer is no longer disabled when reading the trace file. The trace_pipe file was made to be used for live tracing and reading as it acted like the normal producer/consumer. As the trace file would not consume the data, the easy way of handling it was to just disable writes to the ring buffer. This came to a surprise to the BPF folks who complained about lost events due to reading. This is no longer an issue. If someone wants to keep the old disabling there's a new option "pause-on-trace" that can be set. - New set_ftrace_notrace_pid file. PIDs in this file will not be traced by the function tracer. Similar to set_ftrace_pid, which makes the function tracer only trace those tasks with PIDs in the file, the set_ftrace_notrace_pid does the reverse. - New set_event_notrace_pid file. PIDs in this file will cause events not to be traced if triggered by a task with a matching PID. Similar to the set_event_pid file but will not be traced. Note, sched_waking and sched_switch events may still be traced if one of the tasks referenced by those events contains a PID that is allowed to be traced. Tracing related features: - New bootconfig option, that is attached to the initrd file. If bootconfig is on the command line, then the initrd file is searched looking for a bootconfig appended at the end. - New GPU tracepoint infrastructure to help the gfx drivers to get off debugfs (acked by Greg Kroah-Hartman) And other minor updates and fixes" * tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits) tracing: Do not allocate buffer in trace_find_next_entry() in atomic tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid selftests/ftrace: Add test to test new set_event_notrace_pid file selftests/ftrace: Add test to test new set_ftrace_notrace_pid file tracing: Create set_event_notrace_pid to not trace tasks ftrace: Create set_ftrace_notrace_pid to not trace tasks ftrace: Make function trace pid filtering a bit more exact ftrace/kprobe: Show the maxactive number on kprobe_events tracing: Have the document reflect that the trace file keeps tracing enabled ring-buffer/tracing: Have iterator acknowledge dropped events tracing: Do not disable tracing when reading the trace file ring-buffer: Do not disable recording when there is an iterator ring-buffer: Make resize disable per cpu buffer instead of total buffer ring-buffer: Optimize rb_iter_head_event() ring-buffer: Do not die if rb_iter_peek() fails more than thrice ring-buffer: Have rb_iter_head_event() handle concurrent writer ring-buffer: Add page_stamp to iterator for synchronization ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance() ring-buffer: Have ring_buffer_empty() not depend on tracing stopped tracing: Save off entry when peeking at next entry ...
2020-04-04Merge tag 'gpio-v5.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO development for the v5.7 kernel cycle. Core and userspace API: - The userspace API KFIFOs have been imoproved with locks that do not block interrupts. This makes us better at getting events to userspace without blocking or disturbing new events arriving in the same time. This was reviewed by the KFIFO maintainer Stefani. This is a generic improvement which paves the road for similar improvements in other subsystems. - We provide a new ioctl() for monitoring changes in the line information, such as when multiple clients are taking lines and giving them back, possibly reconfiguring them in the process: we can now monitor that and not get stuck with stale static information. - An example tool 'gpio-watch' is provided to showcase this functionality. - Timestamps for events are switched to ktime_get_ns() which is monotonic. We previously had a 'realtime' stamp which could move forward and *backward* in time, which probably would just cause silent bugs and weird behaviour. In the long run we see two relevant timestamps: ktime_get_ns() or the timestamp sometimes provided by the GPIO hardware itself, if that exists. - Device Tree overlay support for GPIO hogs. On systems that load overlays, these overlays can now contain hogs, and will then be respected. - Handle pin control interaction with nonexisting pin ranges in the GPIO library core instead of in the individual drivers. New drivers: - New driver for the Mellanox BlueField 2 GPIO controller. Driver improvements: - Introduce the BGPIOF_NO_SET_ON_INPUT flag to the generic MMIO GPIO library and use this flag in the MT7621 driver. - Texas Instruments OMAP CPU power management improvements, such as blocking of idle on pending GPIO interrupts" * tag 'gpio-v5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (59 commits) Revert "gpio: eic-sprd: Use devm_platform_ioremap_resource()" pinctrl: Unconditionally assign .request()/.free() gpio: Unconditionally assign .request()/.free() gpio: export of_pinctrl_get to modules pinctrl: Define of_pinctrl_get() dummy for !PINCTRL gpio: Rename variable in core APIs gpio: Avoid using pin ranges with !PINCTRL gpiolib: Remove unused gpio_chip parameter from gpio_set_bias() gpiolib: Pass gpio_desc to gpio_set_config() gpiolib: Introduce gpiod_set_config() tools: gpio: Fix out-of-tree build regression gpio: gpiolib: fix a doc warning gpio: tegra186: Add Tegra194 pin ranges for GG.0 and GG.1 gpio: tegra186: Add support for pin ranges gpio: Support GPIO controllers without pin-ranges ARM: integrator: impd1: Use GPIO_LOOKUP() helper macro gpio: brcmstb: support gpio-line-names property tools: gpio: Fix typo in gpio-utils tools: gpio-hammer: Apply scripts/Lindent and retain good changes gpiolib: gpio_name_to_desc: factor out !name check ...
2020-04-04Merge tag 'threads-v5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread updates from Christian Brauner: "The main change for this cycle was the extension for clone3() to support spawning processes directly into cgroups via CLONE_INTO_CGROUP (commit ef2c41cf38a7: "clone3: allow spawning processes into cgroups"). But since I had to touch kernel/cgroup/ quite a bit I had Tejun route that through his tree this time around to make it easier for him to handle other changes. So here is just the unexciting leftovers: a regression test for the ENOMEM regression we fixed in commit b26ebfe12f34 ("pid: Fix error return value in some cases") verifying that we report ENOMEM when trying to create a new process in a pid namespace whose init process/subreaper has already exited" * tag 'threads-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add pid namespace ENOMEM regression test
2020-04-04Merge tag 'perf-urgent-for-mingo-5.7-20200403' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo: perf python: Arnaldo Carvalho de Melo: - Fix clang detection to strip out options passed in $CC. build: He Zhe: - Normalize gcc parameter when generating arch errno table, fixing the build by removing options from $(CC). Sam Lunt: - Support Python 3.8+ in Makefile. perf report/top: Arnaldo Carvalho de Melo: - Fix title line formatting. perf script: Andreas Gerstmayr: - Fix SEGFAULT when using DWARF mode. - Fix invalid read of directory entry after closedir(), found with valgrind. Hagen Paul Pfeifer: - Introduce --deltatime option. Stephane Eranian: - Allow --symbol to accept hexadecimal addresses. Ian Rogers: - Add -S/--symbols documentation Namhyung Kim: - Add --show-cgroup-events option. perf python: Arnaldo Carvalho de Melo: - Include rwsem.c in the python binding, needed by the cgroups improvements. build-test: Arnaldo Carvalho de Melo: - Honour JOBS to override detection of number of cores perf top: Jin Yao: - Support --group-sort-idx to change the sort order - perf top: Support hotkey to change sort order perf pmu-events x86: Jin Yao: - Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric perf symbols arm64: Kemeng Shi: - Fix arm64 gap between kernel start and module end kernel perf subsystem: Namhyung Kim: - Add PERF_RECORD_CGROUP event and Add PERF_SAMPLE_CGROUP feature, to allow cgroup tracking, saving a link between cgroup path and its id number. perf cgroup: Namhyung Kim: - Maintain cgroup hierarchy. perf report: Namhyung Kim: - Add 'cgroup' sort key. perf record: Namhyung Kim: - Support synthesizing cgroup events for pre-existing cgroups. - Add --all-cgroups option Documentation: Tony Jones: - Update docs regarding kernel/user space unwinding. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-04-03Merge tag 'pci-v5.7-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg) - Add more 32 GT/s link speed decoding and improve the implementation (Yicong Yang) Resource management: - Add support for sizing programmable host bridge apertures and fix a related alpha Nautilus regression (Ivan Kokshaysky) Interrupts: - Add boot interrupt quirk mechanism for Xeon chipsets and document boot interrupts (Sean V Kelley) PCIe native device hotplug: - When possible, disable in-band presence detect and use PDS (Alexandru Gagniuc) - Add DMI table for devices that don't use in-band presence detection but don't advertise that correctly (Stuart Hayes) - Fix hang when powering slots up/down via sysfs (Lukas Wunner) - Fix an MSI interrupt race (Stuart Hayes) Virtualization: - Add ACS quirks for Zhaoxin devices (Raymond Pang) Error handling: - Add Error Disconnect Recover (EDR) support so firmware can report devices disconnected via DPC and we can try to recover (Kuppuswamy Sathyanarayanan) Peer-to-peer DMA: - Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew Maier) ASPM: - Reduce severity of common clock config message (Chris Packham) - Clear the correct bits when enabling L1 substates, so we don't go to the wrong state (Yicong Yang) Endpoint framework: - Replace EPF linkup ops with notifier call chain and improve locking (Kishon Vijay Abraham I) - Fix concurrent memory allocation in OB address region (Kishon Vijay Abraham I) - Move PF function number assignment to EPC core to support multiple function creation methods (Kishon Vijay Abraham I) - Fix issue with clearing configfs "start" entry (Kunihiko Hayashi) - Fix issue with endpoint MSI-X ignoring BAR Indicator and Table Offset (Kishon Vijay Abraham I) - Add support for testing DMA transfers (Kishon Vijay Abraham I) - Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I) - Add support for tests to clear IRQ (Kishon Vijay Abraham I) - Add common DT schema for endpoint controllers (Kishon Vijay Abraham I) Amlogic Meson PCIe controller driver: - Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi Pommarel) - Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi Pommarel) Cadence PCIe controller driver: - Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay Abraham I) Intel VMD host bridge driver: - Add two VMD Device IDs that require bus restriction mode (Sushma Kalakota) Mobiveil PCIe controller driver: - Refactor and modularize mobiveil driver (Hou Zhiqiang) - Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang) Microsoft Hyper-V host bridge driver: - Add support for Hyper-V PCI protocol version 1.3 and PCI_BUS_RELATIONS2 (Long Li) - Refactor to prepare for virtual PCI on non-x86 architectures (Boqun Feng) - Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui) NVIDIA Tegra PCIe controller driver: - Use pci_parse_request_of_pci_ranges() (Rob Herring) - Add support for endpoint mode and related DT updates (Vidya Sagar) - Reduce -EPROBE_DEFER error message log level (Thierry Reding) Qualcomm PCIe controller driver: - Restrict class fixup to specific Qualcomm devices (Bjorn Andersson) Synopsys DesignWare PCIe controller driver: - Refactor core initialization code for endpoint mode (Vidya Sagar) - Fix endpoint MSI-X to use correct table address (Kishon Vijay Abraham I) TI DRA7xx PCIe controller driver: - Fix MSI IRQ handling (Vignesh Raghavendra) TI Keystone PCIe controller driver: - Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I) Miscellaneous: - Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng Feng) - Use ioremap(), not phys_to_virt(), for platform ROM to fix video ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)" * tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits) misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS PCI: tegra: Print -EPROBE_DEFER error message at debug level misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq() misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices tools: PCI: Add 'e' to clear IRQ misc: pci_endpoint_test: Add ioctl to clear IRQ misc: pci_endpoint_test: Avoid using module parameter to determine irqtype PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments misc: pci_endpoint_test: Add support to get DMA option from userspace tools: PCI: Add 'd' command line option to support DMA misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation PCI: endpoint: functions/pci-epf-test: Print throughput information PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data PCI: pciehp: Fix MSI interrupt race PCI: pciehp: Fix indefinite wait on sysfs requests PCI: endpoint: Fix clearing start entry in configfs PCI: tegra: Add support for PCIe endpoint mode in Tegra194 PCI: sysfs: Revert "rescan" file renames ...
2020-04-03Merge tag 'char-misc-5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of char/misc/other driver patches for 5.7-rc1. Lots of things in here, and it's later than expected due to some reverts to resolve some reported issues. All is now clean with no reported problems in linux-next. Included in here is: - interconnect updates - mei driver updates - uio updates - nvmem driver updates - soundwire updates - binderfs updates - coresight updates - habanalabs updates - mhi new bus type and core - extcon driver updates - some Kconfig cleanups - other small misc driver cleanups and updates As mentioned, all have been in linux-next for a while, and with the last two reverts, all is calm and good" * tag 'char-misc-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (174 commits) Revert "driver core: platform: Initialize dma_parms for platform devices" Revert "amba: Initialize dma_parms for amba devices" amba: Initialize dma_parms for amba devices driver core: platform: Initialize dma_parms for platform devices bus: mhi: core: Drop the references to mhi_dev in mhi_destroy_device() bus: mhi: core: Initialize bhie field in mhi_cntrl for RDDM capture bus: mhi: core: Add support for reading MHI info from device misc: rtsx: set correct pcr_ops for rts522A speakup: misc: Use dynamic minor numbers for speakup devices mei: me: add cedar fork device ids coresight: do not use the BIT() macro in the UAPI header Documentation: provide IBM contacts for embargoed hardware nvmem: core: remove nvmem_sysfs_get_groups() nvmem: core: use is_bin_visible for permissions nvmem: core: use device_register and device_unregister nvmem: core: add root_only member to nvmem device struct extcon: axp288: Add wakeup support extcon: Mark extcon_get_edev_name() function as exported symbol extcon: palmas: Hide error messages if gpio returns -EPROBE_DEFER dt-bindings: extcon: usbc-cros-ec: convert extcon-usbc-cros-ec.txt to yaml format ...
2020-04-03Merge tag 'spdx-5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are three SPDX patches for 5.7-rc1. One fixes up the SPDX tag for a single driver, while the other two go through the tree and add SPDX tags for all of the .gitignore files as needed. Nothing too complex, but you will get a merge conflict with your current tree, that should be trivial to handle (one file modified by two things, one file deleted.) All three of these have been in linux-next for a while, with no reported issues other than the merge conflict" * tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: ASoC: MT6660: make spdxcheck.py happy .gitignore: add SPDX License Identifier .gitignore: remove too obvious comments
2020-04-03Merge branch 'for-5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Christian extended clone3 so that processes can be spawned into cgroups directly. This is not only neat in terms of semantics but also avoids grabbing the global cgroup_threadgroup_rwsem for migration. - Daniel added !root xattr support to cgroupfs. Userland already uses xattrs on cgroupfs for bookkeeping. This will allow delegated cgroups to support such usages. - Prateek tried to make cpuset hotplug handling synchronous but that led to possible deadlock scenarios. Reverted. - Other minor changes including release_agent_path handling cleanup. * 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1: Document the cpuset_v2_mode mount option Revert "cpuset: Make cpuset hotplug synchronous" cgroupfs: Support user xattrs kernfs: Add option to enable user xattrs kernfs: Add removed_size out param for simple_xattr_set kernfs: kvmalloc xattr value instead of kmalloc cgroup: Restructure release_agent_path handling selftests/cgroup: add tests for cloning into cgroups clone3: allow spawning processes into cgroups cgroup: add cgroup_may_write() helper cgroup: refactor fork helpers cgroup: add cgroup_get_from_file() helper cgroup: unify attach permission checking cpuset: Make cpuset hotplug synchronous cgroup.c: Use built-in RCU list checking kselftest/cgroup: add cgroup destruction test cgroup: Clean up css_set task traversal
2020-04-03perf python: Fix clang detection to strip out options passed in $CCArnaldo Carvalho de Melo
The clang check in the python setup.py file expected $CC to be just the name of the compiler, not the compiler + options, i.e. all options were expected to be passed in $CFLAGS, this ends up making it fail in systems where CC is set to, e.g.: "aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot" Like this: $ python3 >>> from subprocess import Popen >>> a = Popen(["aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot", "-v"]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python3.6/subprocess.py", line 729, in __init__ restore_signals, start_new_session) File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename) FileNotFoundError: [Errno 2] No such file or directory: 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot': 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot' >>> Make it more robust, covering this case, by passing cc.split()[0] as the first arg to popen(). Fixes: a7ffd416d804 ("perf python: Fix clang detection when using CC=clang-version") Reported-by: Daniel Díaz <daniel.diaz@linaro.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Daniel Díaz <daniel.diaz@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ilie Halip <ilie.halip@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200401124037.GA12534@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Support Python 3.8+ in MakefileSam Lunt
Python 3.8 changed the output of 'python-config --ldflags' to no longer include the '-lpythonX.Y' flag (this apparently fixed an issue loading modules with a statically linked Python executable). The libpython feature check in linux/build/feature fails if the Python library is not included in FEATURE_CHECK_LDFLAGS-libpython variable. This adds a check in the Makefile to determine if PYTHON_CONFIG accepts the '--embed' flag and passes that flag alongside '--ldflags' if so. tools/perf is the only place the libpython feature check is used. Signed-off-by: Sam Lunt <samuel.j.lunt@gmail.com> Tested-by: He Zhe <zhe.he@windriver.com> Link: http://lore.kernel.org/lkml/c56be2e1-8111-9dfe-8298-f7d0f9ab7431@windriver.com Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: trivial@kernel.org Cc: stable@kernel.org Link: http://lore.kernel.org/lkml/20200131181123.tmamivhq4b7uqasr@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Fix invalid read of directory entry after closedir()Andreas Gerstmayr
closedir(lang_dir) frees the memory of script_dirent->d_name, which gets accessed in the next line in a call to scnprintf(). Valgrind report: Invalid read of size 1 ==413557== at 0x483CBE6: strlen (vg_replace_strmem.c:461) ==413557== by 0x4DD45FD: __vfprintf_internal (vfprintf-internal.c:1688) ==413557== by 0x4DE6679: __vsnprintf_internal (vsnprintf.c:114) ==413557== by 0x53A037: vsnprintf (stdio2.h:80) ==413557== by 0x53A037: scnprintf (vsprintf.c:21) ==413557== by 0x435202: get_script_path (builtin-script.c:3223) ==413557== Address 0x52e7313 is 1,139 bytes inside a block of size 32,816 free'd ==413557== at 0x483AA0C: free (vg_replace_malloc.c:540) ==413557== by 0x4E303C0: closedir (closedir.c:50) ==413557== by 0x4351DC: get_script_path (builtin-script.c:3222) Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200402124337.419456-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script report: Fix SEGFAULT when using DWARF modeAndreas Gerstmayr
When running perf script report with a Python script and a callgraph in DWARF mode, intr_regs->regs can be 0 and therefore crashing the regs_map function. Added a check for this condition (same check as in builtin-script.c:595). Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lore.kernel.org/lkml/20200402125417.422232-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: add -S/--symbols documentationIan Rogers
Capture both that this option exists and that symbols can be hexadecimal addresses. Signed-off-by: Ian Rogers <irogers@google.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200402174130.140319-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metricJin Yao
The kernel utilization metric does multiplexing currently and is somewhat unreliable. The problem is that it uses two instances of the fixed counter, and the kernel has to multipleplex which causes errors. So should use CPU_CLK_UNHALTED.THREAD instead. Before: # perf stat -M Kernel_Utilization -- sleep 1 Performance counter stats for 'sleep 1': 1,419,425 cpu_clk_unhalted.ref_tsc:k <not counted> cpu_clk_unhalted.ref_tsc (0.00%) After: # perf stat -M Kernel_Utilization -- sleep 1 Performance counter stats for 'sleep 1': 746,688 cpu_clk_unhalted.thread:k # 0.7 Kernel_Utilization 1,088,348 cpu_clk_unhalted.thread Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200309013125.7559-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf events parser: Add missing Intel CPU events to parserAdrian Hunter
perf list expects CPU events to be parseable by name, e.g. # perf list | grep el-capacity-read el-capacity-read OR cpu/el-capacity-read/ [Kernel PMU event] But the event parser does not recognize them that way, e.g. # perf test -v "Parse event" <SNIP> running test 54 'cycles//u' running test 55 'cycles:k' running test 0 'cpu/config=10,config1,config2=3,period=1000/u' running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u' running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/' running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp' -> cpu/event=0,umask=0x11/ -> cpu/event=0,umask=0x13/ -> cpu/event=0x54,umask=0x1/ failed to parse event 'el-capacity-read:u,cpu/event=el-capacity-read/u', err 1, str 'parser error' event syntax error: 'el-capacity-read:u,cpu/event=el-capacity-read/u' \___ parser error test child finished with 1 ---- end ---- Parse event definition strings: FAILED! This happens because the parser splits names by '-' in order to deal with cache events. For example 'L1-dcache' is a token in parse-events.l which is matched to 'L1-dcache-load-miss' by the following rule: PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config And so there is special handling for 2-part PMU names i.e. PE_PMU_EVENT_PRE '-' PE_PMU_EVENT_SUF sep_dc but no handling for 3-part names, which are instead added as tokens e.g. topdown-[a-z-]+ While it would be possible to add a rule for 3-part names, that would not work if the first parts were also a valid PMU name e.g. 'el-capacity-read' would be matched to 'el-capacity' before the parser reached the 3rd part. The parser would need significant change to rationalize all this, so instead fix for now by adding missing Intel CPU events with 3-part names to the event parser as tokens. Missing events were found by using: grep -r EVENT_ATTR_STR arch/x86/events/intel/core.c Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/90c7ae07-c568-b6d3-f9c4-d0c1528a0610@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Allow --symbol to accept hexadecimal addressesStephane Eranian
This patch extends the perf script --symbols option to filter on hexadecimal addresses in addition to symbol names. This makes it easier to handle cases where symbols are aliased. With this patch, it is possible to mix and match symbols and hexadecimal addresses using the --symbols option. $ perf script --symbols=noploop,0x4007a0 Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325220802.15039-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf report/top TUI: Fix title line formattingArnaldo Carvalho de Melo
In d10ec006dcd7 ("perf hists browser: Allow passing an initial hotkey") the hist_entry__title() call was cut'n'pasted to a function where the 'title' variable is a pointer, not an array, so the sizeof(title) continues syntactically valid but ends up reducing the real size of the buffer where to format the first line in the screen to 8 bytes, which makes the formatting at the title at each refresh to produce just the string "Samples ", duh, fix it by passing the size of the buffer. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: d10ec006dcd7 ("perf hists browser: Allow passing an initial hotkey") Link: http://lore.kernel.org/lkml/20200330154314.GB4576@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Support hotkey to change sort orderJin Yao
It would be nice if we can use a hotkey in perf top browser to select a event for sorting. For example: perf top --group -e cycles,instructions,cache-misses Samples Overhead Shared Object Symbol 40.03% 45.71% 0.03% div [.] main 20.46% 14.67% 0.21% libc-2.27.so [.] __random_r 20.01% 19.54% 0.02% libc-2.27.so [.] __random 9.68% 10.68% 0.00% div [.] compute_flag 4.32% 4.70% 0.00% libc-2.27.so [.] rand 3.84% 3.43% 0.00% div [.] rand@plt 0.05% 0.05% 2.33% libc-2.27.so [.] __strcmp_sse2_unaligned 0.04% 0.08% 2.43% perf [.] perf_hpp__is_dynamic_en 0.04% 0.02% 6.64% perf [.] rb_next 0.04% 0.01% 3.87% perf [.] dso__find_symbol 0.04% 0.04% 1.77% perf [.] sort__dso_cmp When user press hotkey '2' (event index, starting from 0), it indicates to sort output by the third event in group (cache-misses). Samples Overhead Shared Object Symbol 4.07% 1.28% 6.68% perf [.] rb_next 3.57% 3.98% 4.11% perf [.] __hists__insert_output 3.67% 11.24% 3.60% perf [.] perf_hpp__is_dynamic_e 3.67% 3.20% 3.20% perf [.] hpp__sort_overhead 0.81% 0.06% 3.01% perf [.] dso__find_symbol 1.62% 5.47% 2.51% perf [.] hists__match 2.70% 1.86% 2.47% libc-2.27.so [.] _int_malloc 0.19% 0.00% 2.29% [kernel] [k] copy_page 0.41% 0.32% 1.98% perf [.] hists__decay_entries 1.84% 3.67% 1.68% perf [.] sort__dso_cmp 0.16% 0.00% 1.63% [kernel] [k] clear_page_erms Now the output is sorted by cache-misses. v2: --- Zero the history if hotkey is pressed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200324220711.6025-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Support --group-sort-idx to change the sort orderJin Yao
'perf report' supports the option --group-sort-idx, which sorts the output by the event at the index n in event group. For example: perf record -e cycles,instructions,cache-misses perf report --group --group-sort-idx 2 --stdio The perf-report output is sorted by cache-misses. This patch supports --group-sort-idx in perf-top. For example: perf top --group -e cycles,instructions,cache-misses --group-sort-idx 2 The perf-top output is sorted by cache-misses. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200324220711.6025-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf symbols: Fix arm64 gap between kernel start and module endKemeng Shi
During execution of command 'perf report' in my arm64 virtual machine, this error message is showed: failed to process sample __symbol__inc_addr_samples(860): ENOMEM! sym->name=__this_module, start=0x1477100, addr=0x147dbd8, end=0x80002000, func: 0 The error is caused with path: cmd_report __cmd_report perf_session__process_events __perf_session__process_events ordered_events__flush __ordered_events__flush oe->deliver (ordered_events__deliver_event) perf_session__deliver_event machines__deliver_event perf_evlist__deliver_sample tool->sample (process_sample_event) hist_entry_iter__add iter->add_entry_cb(hist_iter__report_callback) hist_entry__inc_addr_samples symbol__inc_addr_samples __symbol__inc_addr_samples h = annotated_source__histogram(src, evidx) (NULL) annotated_source__histogram failed is caused with path: ... hist_entry__inc_addr_samples symbol__inc_addr_samples symbol__hists annotated_source__alloc_histograms src->histograms = calloc(nr_hists, sizeof_sym_hist) (failed) Calloc failed as the symbol__size(sym) is too huge. As show in error message: start=0x1477100, end=0x80002000, size of symbol is about 2G. This is the same problem as 'perf annotate: Fix s390 gap between kernel end and module start (b9c0a64901d5bd)'. Perf gets symbol information from /proc/kallsyms in __dso__load_kallsyms. A part of symbol in /proc/kallsyms from my virtual machine is as follows: #cat /proc/kallsyms | sort ... ffff000001475080 d rpfilter_mt_reg [ip6t_rpfilter] ffff000001475100 d $d [ip6t_rpfilter] ffff000001475100 d __this_module [ip6t_rpfilter] ffff000080080000 t _head ffff000080080000 T _text ffff000080080040 t pe_header ... Take line 'ffff000001475100 d __this_module [ip6t_rpfilter]' as example. The start and end of symbol are both set to ffff000001475100 in dso__load_all_kallsyms. Then symbols__fixup_end will set the end of symbol to next big address to ffff000001475100 in /proc/kallsyms, ffff000080080000 in this example. Then sizeof of symbol will be about 2G and cause the problem. The start of module in my machine is ffff000000a62000 t $x [dm_mod] The start of kernel in my machine is ffff000080080000 t _head There is a big gap between end of module and begin of kernel if a samll amount of memory is used by module. And the last symbol in module will have a large address range as caotaining the big gap. Give that the module and kernel text segment sequence may change in the future, fix this by limiting range of last symbol in module and kernel to 4K in arch arm64. Signed-off-by: Kemeng Shi <shikemeng@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/33fd24c4-0d5a-9d93-9b62-dffa97c992ca@huawei.com [ refreshed the patch on current codebase, added string.h include as strchr() is used ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf build-test: Honour JOBS to override detection of number of coresArnaldo Carvalho de Melo
When one does: $ make -C tools/perf build-test The makefile in tools/perf/tests/ will, just like the main one, detect how many cores are in the system and use it with -j. Sometimes we may need to override that, for instance, when using icecream or distcc to use multiple machines in the build process, then we need to, as with the main makefile, use: $ make JOBS=N -C tools/perf build-test Fix the tests makefile to honour that. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200330130301.GA31702@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Add --show-cgroup-events optionNamhyung Kim
The --show-cgroup-events option is to print CGROUP events in the output like others. Committer testing: [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ] [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2 swapper 0 0.000000: PERF_RECORD_CGROUP cgroup: 1 / perf 12145 11200.440730: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) perf 12145 11200.440733: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) -- cgtest 12145 11200.440739: 193472 cycles: ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12145 11200.440790: 2691608 cycles: 7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so) cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub cgtest 12147 11200.441054: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12147 11200.441057: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) -- cgtest 12148 11200.441103: 10227 cycles: ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12148 11200.441106: 273295 cycles: ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1 cgtest 12147 11200.441143: 2788845 cycles: ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2 cgtest 12148 11200.441182: 2669546 cycles: 401020 _init+0x20 (/wb/cgtest) cgtest 12149 11200.441247: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Add --all-cgroups optionNamhyung Kim
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that 'perf top' can identify task/cgroup association later. Committer testing: Use: # perf top --all-cgroups -s cgroup_id,cgroup,pid Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-9-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf record: Add --all-cgroups optionNamhyung Kim
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that perf report can identify task/cgroup association later. [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ] [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 558 of event 'cycles' # Event count (approx.): 458017341 # # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 33.15% 4/0xeffffffb /sub 9615:looper0 32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2 32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1 0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest 0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest 0.32% 4/0xeffffffb / 9615:looper0 0.11% 4/0xeffffffb /sub 9617:cgtest 0.10% 4/0xeffffffb /sub 9618:cgtest # # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S') # [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf record: Support synthesizing cgroup eventsNamhyung Kim
Synthesize cgroup events by iterating cgroup filesystem directories. The cgroup event only saves the portion of cgroup path after the mount point and the cgroup id (which actually is a file handle). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-7-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf report: Add 'cgroup' sort keyNamhyung Kim
The cgroup sort key is to show cgroup membership of each task. Currently it shows full path in the cgroupfs (not relative to the root of cgroup namespace) since it'd be more intuitive IMHO. Otherwise root cgroup in different namespaces will all show same name - "/". The cgroup sort key should come before cgroup_id otherwise sort_dimension__add() will match it to cgroup_id as it only matches with the given substring. For example it will look like following. Note that record patch adding --all-cgroups patch will come later. $ perf record -a --namespace --all-cgroups cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.208 MB perf.data (4090 samples) ] $ perf report -s cgroup_id,cgroup,pid ... # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 93.96% 0/0x0 / 0:swapper 1.25% 3/0xeffffffb / 278:looper0 0.86% 3/0xf000015f /sub/cgrp1 280:cgtest 0.37% 3/0xf0000160 /sub/cgrp2 281:cgtest 0.34% 3/0xf0000163 /sub/cgrp3 282:cgtest 0.22% 3/0xeffffffb /sub 278:looper0 0.20% 3/0xeffffffb / 280:cgtest 0.15% 3/0xf0000163 /sub/cgrp3 285:looper3 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf cgroup: Maintain cgroup hierarchyNamhyung Kim
Each cgroup is kept in the perf_env's cgroup_tree sorted by the cgroup id. Hist entries have cgroup id can compare it directly and later it can be used to find a group name using this tree. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Basic support for CGROUP eventNamhyung Kim
Implement basic functionality to support cgroup tracking. Each cgroup can be identified by inode number which can be read from userspace too. The actual cgroup processing will come in the later patch. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> [ fix perf test failure on sampling parsing ] Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Add file-handle feature testNamhyung Kim
The file handle (FHANDLE) support is configurable so some systems might not have it. So add a config feature item to check it on build time so that we don't add the cgroup tracking feature based on that. Committer notes: Had to make the test use the same construct as its later use in synthetic-events.c, in the next patch in this series. i.e. make it be: struct { struct file_handle fh; uint64_t cgroup_id; } handle; To cope with: CC /tmp/build/perf/util/cloexec.o util/synthetic-events.c:428:22: error: field 'fh' with CC /tmp/build/perf/util/call-path.o variable sized type 'struct file_handle' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end] struct file_handle fh; ^ 1 error generated. Deal with this at some point, i.e. investigate if the right thing is to remove that -Wgnu-variable-sized-type-not-at-end from our CFLAGS, for now do the test the same way as it is used looks more sensible. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ split from a larger patch, removed blank line at EOF ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf python: Include rwsem.c in the pythong bidingArnaldo Carvalho de Melo
We'll need it for the cgroup patches, and its better to have it in a separate patch in case we need to later revert the cgroup patches. I.e. without this we have: [root@five ~]# perf test -v python 19: 'import perf' in python : --- start --- test child forked, pid 148447 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.cpython-37m-x86_64-linux-gnu.so: undefined symbol: down_write test child finished with -1 ---- end ---- 'import perf' in python: FAILED! [root@five ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200403123606.GC23243@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - GICv4.1 support - 32bit host removal PPC: - secure (encrypted) using under the Protected Execution Framework ultravisor s390: - allow disabling GISA (hardware interrupt injection) and protected VMs/ultravisor support. x86: - New dirty bitmap flag that sets all bits in the bitmap when dirty page logging is enabled; this is faster because it doesn't require bulk modification of the page tables. - Initial work on making nested SVM event injection more similar to VMX, and less buggy. - Various cleanups to MMU code (though the big ones and related optimizations were delayed to 5.8). Instead of using cr3 in function names which occasionally means eptp, KVM too has standardized on "pgd". - A large refactoring of CPUID features, which now use an array that parallels the core x86_features. - Some removal of pointer chasing from kvm_x86_ops, which will also be switched to static calls as soon as they are available. - New Tigerlake CPUID features. - More bugfixes, optimizations and cleanups. Generic: - selftests: cleanups, new MMU notifier stress test, steal-time test - CSV output for kvm_stat" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits) x86/kvm: fix a missing-prototypes "vmread_error" KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y KVM: VMX: Add a trampoline to fix VMREAD error handling KVM: SVM: Annotate svm_x86_ops as __initdata KVM: VMX: Annotate vmx_x86_ops as __initdata KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup() KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes KVM: VMX: Configure runtime hooks using vmx_x86_ops KVM: VMX: Move hardware_setup() definition below vmx_x86_ops KVM: x86: Move init-only kvm_x86_ops to separate struct KVM: Pass kvm_init()'s opaque param to additional arch funcs s390/gmap: return proper error code on ksm unsharing KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move() KVM: Fix out of range accesses to memslots KVM: X86: Micro-optimize IPI fastpath delay KVM: X86: Delay read msr data iff writes ICR MSR KVM: PPC: Book3S HV: Add a capability for enabling secure guests KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs ...
2020-04-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: "A large amount of MM, plenty more to come. Subsystems affected by this patch series: - tools - kthread - kbuild - scripts - ocfs2 - vfs - mm: slub, kmemleak, pagecache, gup, swap, memcg, pagemap, mremap, sparsemem, kasan, pagealloc, vmscan, compaction, mempolicy, hugetlbfs, hugetlb" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits) include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS selftests/vm: fix map_hugetlb length used for testing read and write mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge() mm/hugetlb.c: clean code by removing unnecessary initialization hugetlb_cgroup: add hugetlb_cgroup reservation docs hugetlb_cgroup: add hugetlb_cgroup reservation tests hugetlb: support file_region coalescing again hugetlb_cgroup: support noreserve mappings hugetlb_cgroup: add accounting for shared mappings hugetlb: disable region_add file_region coalescing hugetlb_cgroup: add reservation accounting for private mappings mm/hugetlb_cgroup: fix hugetlb_cgroup migration hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations hugetlb_cgroup: add hugetlb_cgroup reservation counter hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization mm/memblock.c: remove redundant assignment to variable max_addr mm: mempolicy: require at least one nodeid for MPOL_PREFERRED mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk() ...
2020-04-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exec/proc updates from Eric Biederman: "This contains two significant pieces of work: the work to sort out proc_flush_task, and the work to solve a deadlock between strace and exec. Fixing proc_flush_task so that it no longer requires a persistent mount makes improvements to proc possible. The removal of the persistent mount solves an old regression that that caused the hidepid mount option to only work on remount not on mount. The regression was found and reported by the Android folks. This further allows Alexey Gladkov's work making proc mount options specific to an individual mount of proc to move forward. The work on exec starts solving a long standing issue with exec that it takes mutexes of blocking userspace applications, which makes exec extremely deadlock prone. For the moment this adds a second mutex with a narrower scope that handles all of the easy cases. Which makes the tricky cases easy to spot. With a little luck the code to solve those deadlocks will be ready by next merge window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (25 commits) signal: Extend exec_id to 64bits pidfd: Use new infrastructure to fix deadlocks in execve perf: Use new infrastructure to fix deadlocks in execve proc: io_accounting: Use new infrastructure to fix deadlocks in execve proc: Use new infrastructure to fix deadlocks in execve kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve kernel: doc: remove outdated comment cred.c mm: docs: Fix a comment in process_vm_rw_core selftests/ptrace: add test cases for dead-locks exec: Fix a deadlock in strace exec: Add exec_update_mutex to replace cred_guard_mutex exec: Move exec_mmap right after de_thread in flush_old_exec exec: Move cleanup of posix timers on exec out of de_thread exec: Factor unshare_sighand out of de_thread and call it separately exec: Only compute current once in flush_old_exec pid: Improve the comment about waiting in zap_pid_ns_processes proc: Remove the now unnecessary internal mount of proc uml: Create a private mount of proc for mconsole uml: Don't consult current to find the proc_mnt in mconsole_proc proc: Use a list of inodes to flush from proc ...
2020-04-02tools: PCI: Add 'e' to clear IRQKishon Vijay Abraham I
Add a new command line option 'e' to invoke "PCITEST_CLEAR_IRQ" ioctl. This can be used to clear the irqs set using the 'i' option. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2020-04-02tools: PCI: Add 'd' command line option to support DMAKishon Vijay Abraham I
Add a new command line option 'd' to use DMA for data transfers. It should be used with read, write or copy commands. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Tested-by: Alan Mikhak <alan.mikhak@sifive.com>
2020-04-02selftests/vm: fix map_hugetlb length used for testing read and writeChristophe Leroy
Commit fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb") added the possibility to change the size of memory mapped for the test, but left the read and write test using the default value. This is unnoticed when mapping a length greater than the default one, but segfaults otherwise. Fix read_bytes() and write_bytes() by giving them the real length. Also fix the call to munmap(). Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/9a404a13c871c4bd0ba9ede68f69a1225180dd7e.1580978385.git.christophe.leroy@c-s.fr Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02hugetlb_cgroup: add hugetlb_cgroup reservation testsMina Almasry
The tests use both shared and private mapped hugetlb memory, and monitors the hugetlb usage counter as well as the hugetlb reservation counter. They test different configurations such as hugetlb memory usage via hugetlbfs, or MAP_HUGETLB, or shmget/shmat, and with and without MAP_POPULATE. Also add test for hugetlb reservation reparenting, since this is a subtle issue. Signed-off-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sandipan Das <sandipan@linux.ibm.com> [powerpc64] Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Link: http://lkml.kernel.org/r/20200211213128.73302-8-almasrymina@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02selftests: vm: drop dependencies on page flags from mlock2 testsMichal Hocko
It was noticed that mlock2 tests are failing after 9c4e6b1a7027f ("mm, mlock, vmscan: no more skipping pagevecs") because the patch has changed the timing on when the page is added to the unevictable LRU list and thus gains the unevictable page flag. The test was just too dependent on the implementation details which were true at the time when it was introduced. Page flags and the timing when they are set is something no userspace should ever depend on. The test should be testing only for the user observable contract of the tested syscalls. Those are defined pretty well for the mlock and there are other means for testing them. In fact this is already done and testing for page flags can be safely dropped to achieve the aimed purpose. Present bits can be checked by /proc/<pid>/smaps RSS field and the locking state by VmFlags although I would argue that Locked: field would be more appropriate. Drop all the page flag machinery and considerably simplify the test. This should be more robust for future kernel changes while checking the promised contract is still valid. Fixes: 9c4e6b1a7027f ("mm, mlock, vmscan: no more skipping pagevecs") Reported-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200324154218.GS19542@dhcp22.suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02selftests: add MREMAP_DONTUNMAP selftestBrian Geffon
Add a few simple self tests for the new flag MREMAP_DONTUNMAP, they are simple smoke tests which also demonstrate the behavior. [akpm@linux-foundation.org: convert eight-spaces to hard tabs] [bgeffon@google.com: v7] Link: http://lkml.kernel.org/r/20200221174248.244748-2-bgeffon@google.com [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Brian Geffon <bgeffon@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Deacon <will@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Yu Zhao <yuzhao@google.com> Cc: Jesse Barnes <jsbarnes@google.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Link: http://lkml.kernel.org/r/20200218173221.237674-2-bgeffon@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverageJohn Hubbard
It's good to have basic unit test coverage of the new FOLL_PIN behavior. Fortunately, the gup_benchmark unit test is extremely fast (a few milliseconds), so adding it the the run_vmtests suite is going to cause no noticeable change in running time. So, add two new invocations to run_vmtests: 1) Run gup_benchmark with normal get_user_pages(). 2) Run gup_benchmark with pin_user_pages(). This is much like the first call, except that it sets FOLL_PIN. Running these two in quick succession also provide a visual comparison of the running times, which is convenient. The new invocations are fairly early in the run_vmtests script, because with test suites, it's usually preferable to put the shorter, faster tests first, all other things being equal. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200211001536.1027652-11-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/gup_benchmark: support pin_user_pages() and related callsJohn Hubbard
Up until now, gup_benchmark supported testing of the following kernel functions: * get_user_pages(): via the '-U' command line option * get_user_pages_longterm(): via the '-L' command line option * get_user_pages_fast(): as the default (no options required) Add test coverage for the new corresponding pin_*() functions: * pin_user_pages_fast(): via the '-a' command line option * pin_user_pages(): via the '-b' command line option Also, add an option for clarity: '-u' for what is now (still) the default choice: get_user_pages_fast(). Also, for the commands that set FOLL_PIN, verify that the pages really are dma-pinned, via the new is_dma_pinned() routine. Those commands are: PIN_FAST_BENCHMARK : calls pin_user_pages_fast() PIN_BENCHMARK : calls pin_user_pages() In between the calls to pin_*() and unpin_user_pages(), check each page: if page_maybe_dma_pinned() returns false, then WARN and return. Do this outside of the benchmark timestamps, so that it doesn't affect reported times. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200211001536.1027652-10-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02tools/accounting/getdelays.c: fix netlink attribute lengthDavid Ahern
A recent change to the netlink code: 6e237d099fac ("netlink: Relax attr validation for fixed length types") logs a warning when programs send messages with invalid attributes (e.g., wrong length for a u32). Yafang reported this error message for tools/accounting/getdelays.c. send_cmd() is wrongly adding 1 to the attribute length. As noted in include/uapi/linux/netlink.h nla_len should be NLA_HDRLEN + payload length, so drop the +1. Fixes: 9e06d3f9f6b1 ("per task delay accounting taskstats interface: documentation fix") Reported-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Yafang Shao <laoar.shao@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200327173111.63922-1-dsahern@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02tools headers UAPI: Update tools's copy of linux/perf_event.hNamhyung Kim
To get the changes in: 6546b19f95ac ("perf/core: Add PERF_SAMPLE_CGROUP feature") 96aaab686505 ("perf/core: Add PERF_RECORD_CGROUP event") This silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h This update is a prerequisite to adding support for the HW index of raw branch records. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.org [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>