summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-10-13perf vendor events: Fix typos in power8 PMU eventsSandipan Das
This replaces the incorrectly spelled word "localtion" with "location" in some power8 PMU event descriptions. Fixes: 2a81fa3bb5ed ("perf vendor events: Add power8 PMU events") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lore.kernel.org/lkml/20201012050205.328523-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf bench: Run inject-build-id with --buildid-all option tooNamhyung Kim
For comparison, it now runs the benchmark twice - one if regular -b and another for --buildid-all. $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 21.002 msec (+- 0.172 msec) Average time per event: 2.059 usec (+- 0.017 usec) Average memory usage: 8169 KB (+- 0 KB) Average build-id-all injection took: 19.543 msec (+- 0.124 msec) Average time per event: 1.916 usec (+- 0.012 usec) Average memory usage: 7348 KB (+- 0 KB) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Add --buildid-all optionNamhyung Kim
Like 'perf record', we can even more speedup build-id processing by just using all DSOs. Then we don't need to look at all the sample events anymore. The following patch will update 'perf bench' to show the result of the --buildid-all option too. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Original-patch-by: Stephane Eranian <eranian@google.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Do not load map/dso when injecting build-idNamhyung Kim
No need to load symbols in a DSO when injecting build-id. I guess the reason was to check the DSO is a special file like anon files. Use some helper functions in map.c to check them before reading build-id. Also pass sample event's cpumode to a new build-id event. It brought a speedup in the benchmark of 25 -> 21 msec on my laptop. Also the memory usage (Max RSS) went down by ~200 KB. # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 21.389 msec (+- 0.138 msec) Average time per event: 2.097 usec (+- 0.014 usec) Average memory usage: 8225 KB (+- 0 KB) Committer notes: Before: $ perf stat -r5 perf bench internals inject-build-id > /dev/null Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 4,020.56 msec task-clock:u # 1.271 CPUs utilized ( +- 0.74% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 123,354 page-faults:u # 0.031 M/sec ( +- 0.81% ) 7,119,951,568 cycles:u # 1.771 GHz ( +- 1.74% ) (83.27%) 230,086,969 stalled-cycles-frontend:u # 3.23% frontend cycles idle ( +- 1.97% ) (83.41%) 1,168,298,765 stalled-cycles-backend:u # 16.41% backend cycles idle ( +- 1.13% ) (83.44%) 11,173,083,669 instructions:u # 1.57 insn per cycle # 0.10 stalled cycles per insn ( +- 1.58% ) (83.31%) 2,413,908,936 branches:u # 600.392 M/sec ( +- 1.69% ) (83.26%) 46,576,289 branch-misses:u # 1.93% of all branches ( +- 2.20% ) (83.31%) 3.1638 +- 0.0309 seconds time elapsed ( +- 0.98% ) $ After: $ perf stat -r5 perf bench internals inject-build-id > /dev/null Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 2,379.94 msec task-clock:u # 1.473 CPUs utilized ( +- 0.18% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 62,584 page-faults:u # 0.026 M/sec ( +- 0.07% ) 2,372,389,668 cycles:u # 0.997 GHz ( +- 0.29% ) (83.14%) 106,937,862 stalled-cycles-frontend:u # 4.51% frontend cycles idle ( +- 4.89% ) (83.20%) 581,697,915 stalled-cycles-backend:u # 24.52% backend cycles idle ( +- 0.71% ) (83.47%) 3,659,692,199 instructions:u # 1.54 insn per cycle # 0.16 stalled cycles per insn ( +- 0.10% ) (83.63%) 791,372,961 branches:u # 332.518 M/sec ( +- 0.27% ) (83.39%) 10,648,083 branch-misses:u # 1.35% of all branches ( +- 0.22% ) (83.16%) 1.61570 +- 0.00172 seconds time elapsed ( +- 0.11% ) $ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Original-patch-by: Stephane Eranian <eranian@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Enter namespace when reading build-idNamhyung Kim
It should be in a proper mnt namespace when accessing the file. I think this had no problem since the build-id was actually read from map__load() -> dso__load() already. But I'd like to change it in the following commit. Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Add missing callbacks in perf_toolNamhyung Kim
I found some events (like PERF_RECORD_CGROUP) are not copied by perf inject due to the missing callbacks. Let's add them. While at it, I've changed the order of the callbacks to match with struct perf_tool so that we can compare them easily. Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf bench: Add build-id injection benchmarkNamhyung Kim
Sometimes I can see that 'perf record' piped with 'perf inject' take a long time processing build-ids. So introduce a inject-build-id benchmark to the internals benchmark suite to measure its overhead regularly. It runs the 'perf inject' command internally and feeds the given number of synthesized events (MMAP2 + SAMPLE basically). Usage: perf bench internals inject-build-id <options> -i, --iterations <n> Number of iterations used to compute average (default: 100) -m, --nr-mmaps <n> Number of mmap events for each iteration (default: 100) -n, --nr-samples <n> Number of sample events per mmap event (default: 100) -v, --verbose be more verbose (show iteration count, DSO name, etc) By default, it measures average processing time of 100 MMAP2 events and 10000 SAMPLE events. Below is a result on my laptop. $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 25.789 msec (+- 0.202 msec) Average time per event: 2.528 usec (+- 0.020 usec) Average memory usage: 8411 KB (+- 7 KB) Committer testing: $ perf bench Usage: perf bench [<common options>] <collection> <benchmark> [<options>] # List of all available benchmark collections: sched: Scheduler and IPC benchmarks syscall: System call benchmarks mem: Memory access benchmarks numa: NUMA scheduling and MM benchmarks futex: Futex stressing benchmarks epoll: Epoll stressing benchmarks internals: Perf-internals benchmarks all: All benchmarks $ perf bench internals # List of available benchmarks for collection 'internals': synthesize: Benchmark perf event synthesis kallsyms-parse: Benchmark kallsyms parsing inject-build-id: Benchmark build-id injection $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.202 msec (+- 0.059 msec) Average time per event: 1.392 usec (+- 0.006 usec) Average memory usage: 12650 KB (+- 10 KB) Average build-id-all injection took: 12.831 msec (+- 0.071 msec) Average time per event: 1.258 usec (+- 0.007 usec) Average memory usage: 11895 KB (+- 10 KB) $ $ perf stat -r5 perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.380 msec (+- 0.056 msec) Average time per event: 1.410 usec (+- 0.006 usec) Average memory usage: 12608 KB (+- 11 KB) Average build-id-all injection took: 11.889 msec (+- 0.064 msec) Average time per event: 1.166 usec (+- 0.006 usec) Average memory usage: 11838 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.246 msec (+- 0.065 msec) Average time per event: 1.397 usec (+- 0.006 usec) Average memory usage: 12744 KB (+- 10 KB) Average build-id-all injection took: 12.019 msec (+- 0.066 msec) Average time per event: 1.178 usec (+- 0.006 usec) Average memory usage: 11963 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.321 msec (+- 0.067 msec) Average time per event: 1.404 usec (+- 0.007 usec) Average memory usage: 12690 KB (+- 10 KB) Average build-id-all injection took: 11.909 msec (+- 0.041 msec) Average time per event: 1.168 usec (+- 0.004 usec) Average memory usage: 11938 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.287 msec (+- 0.059 msec) Average time per event: 1.401 usec (+- 0.006 usec) Average memory usage: 12864 KB (+- 10 KB) Average build-id-all injection took: 11.862 msec (+- 0.058 msec) Average time per event: 1.163 usec (+- 0.006 usec) Average memory usage: 12103 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.402 msec (+- 0.053 msec) Average time per event: 1.412 usec (+- 0.005 usec) Average memory usage: 12876 KB (+- 10 KB) Average build-id-all injection took: 11.826 msec (+- 0.061 msec) Average time per event: 1.159 usec (+- 0.006 usec) Average memory usage: 12111 KB (+- 10 KB) Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 4,267.48 msec task-clock:u # 1.502 CPUs utilized ( +- 0.14% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 102,092 page-faults:u # 0.024 M/sec ( +- 0.08% ) 3,894,589,578 cycles:u # 0.913 GHz ( +- 0.19% ) (83.49%) 140,078,421 stalled-cycles-frontend:u # 3.60% frontend cycles idle ( +- 0.77% ) (83.34%) 948,581,189 stalled-cycles-backend:u # 24.36% backend cycles idle ( +- 0.46% ) (83.25%) 5,835,587,719 instructions:u # 1.50 insn per cycle # 0.16 stalled cycles per insn ( +- 0.21% ) (83.24%) 1,267,423,636 branches:u # 296.996 M/sec ( +- 0.22% ) (83.12%) 17,484,290 branch-misses:u # 1.38% of all branches ( +- 0.12% ) (83.55%) 2.84176 +- 0.00222 seconds time elapsed ( +- 0.08% ) $ Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13dt-bindings: update usb-c-connector exampleBiju Das
Some hardware designs have USB typec connector attached to both SoC and super speed mux. We need to use separate connector node for such design. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/20200920134905.4370-2-biju.das.jz@bp.renesas.com Signed-off-by: Rob Herring <robh@kernel.org>
2020-10-13XArray: Fix xas_create_range for ranges above 4 billionMatthew Wilcox (Oracle)
The 'sibs' variable would be shifted as a 32-bit integer, so if 'shift' is more than 32, this is undefined behaviour. In practice, this doesn't happen because the page cache is the only user and nobody uses 16TB pages. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-13Merge branches 'pm-avs' and 'powercap'Rafael J. Wysocki
* pm-avs: MAINTAINERS: drop myself from PM AVS drivers PM: AVS: qcom-cpr: simplify the return expression of cpr_disable() * powercap: powercap: include header to fix -Wmissing-prototypes
2020-10-13Merge branches 'pm-core', 'pm-sleep', 'pm-pci' and 'pm-domains'Rafael J. Wysocki
* pm-core: PM: runtime: Fix timer_expires data type on 32-bit arches PM: runtime: Remove link state checks in rpm_get/put_supplier() * pm-sleep: ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe() ACPI: EC: PM: Flush EC work unconditionally after wakeup PM: hibernate: remove the bogus call to get_gendisk() in software_resume() PM: hibernate: Batch hibernate and resume IO requests * pm-pci: PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI * pm-domains: PM: domains: Allow to abort power off when no ->power_off() callback PM: domains: Rename power state enums for genpd
2020-10-13Merge branches 'acpi-extlog', 'acpi-memhotplug', 'acpi-button', 'acpi-tools' ↵Rafael J. Wysocki
and 'acpi-pci' * acpi-extlog: ACPI / extlog: Check for RDMSR failure * acpi-memhotplug: ACPI: memhotplug: Remove 'state' from struct acpi_memory_device * acpi-button: ACPI: button: fix handling lid state changes when input device closed * acpi-tools: tools/power/acpi: Serialize Makefile * acpi-pci: ACPI: PCI: update kernel-doc line comments
2020-10-13Merge branch 'acpi-misc'Rafael J. Wysocki
* acpi-misc: ACPI: Make acpi_evaluate_dsm() prototype consistent ACPI: wakeup: Remove dead ACPICA debug code ACPI: video: Remove leftover ACPICA debug code ACPI: tiny-power-button: Remove dead ACPICA debug code ACPI: processor: Remove dead ACPICA debug code ACPI: proc: Remove dead ACPICA debug code ACPI: PCI: Remove unused ACPICA debug code ACPI: event: Remove leftover ACPICA debug code ACPI: dock: Remove dead ACPICA debug code ACPI: debugfs: Remove dead ACPICA debug code ACPI: custom_method: Remove dead ACPICA debug code ACPI: container: Remove leftover ACPICA debug functionality ACPI: platform: Remove ACPI_MODULE_NAME() ACPI: memhotplug: Remove leftover ACPICA debug functionality ACPI: LPSS: Remove ACPI_MODULE_NAME() ACPI: cmos_rtc: Remove leftover ACPI_MODULE_NAME() ACPI: Remove three unused inline functions
2020-10-13Merge branch 'acpi-numa'Rafael J. Wysocki
* acpi-numa: docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ACPI: Support Generic Initiator only domains ACPI / NUMA: Add stub function for pxm_to_node() irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory ACPI: Remove side effect of partly creating a node in acpi_get_node() ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node() ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node() ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT ACPI: Add out of bounds and numa_off protections to pxm_to_node()
2020-10-13Merge branches 'acpi-video', 'acpi-battery', 'acpi-config' and 'acpi-scan'Rafael J. Wysocki
* acpi-video: ACPI: video: use ACPI backlight for HP 635 Notebook * acpi-battery: ACPI: battery: include linux/power_supply.h * acpi-config: ACPI: configfs: Add missing config_item_put() to fix refcount leak * acpi-scan: ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug()
2020-10-13Merge branches 'acpi-tables', 'acpi-pmic', 'acpi-dptf' and 'acpi-soc'Rafael J. Wysocki
* acpi-tables: ACPI: NFIT: Use kobj_to_dev() instead * acpi-pmic: MAINTAINERS: Use my kernel.org address for Intel PMIC work ACPI / PMIC: Move TPS68470 OpRegion driver to drivers/acpi/pmic/ ACPI / PMIC: Split out Kconfig and Makefile specific for ACPI PMIC * acpi-dptf: ACPI: DPTF: Add PCH FIVR participant driver * acpi-soc: ACPI: APD: Clean up header file include statements ACPI: APD: Remove unnecessary APD_ADDR() macro stub ACPI: APD: Remove ACPI_MODULE_NAME() ACPI: APD: Remove flags from struct apd_device_desc ACPI: APD: Add kerneldoc for properties in struct apd_device_desc
2020-10-13radix-tree: fix the comment of radix_tree_next_slot()Hui Su
fix the comment of radix_tree_next_slot(): interator --> iterator. Signed-off-by: Hui Su <sh_def@163.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-13XArray: Fix xas_reload for multi-index entriesMatthew Wilcox (Oracle)
xas_reload() was only checking that the head entry was still at the head index. If the entry has been split, that's not enough as there may be a different entry at the specified index now. Solve this by checking the slot for the requested index instead of the head index. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-13XArray: Add private interface for workingset node deletionMatthew Wilcox (Oracle)
Move the tricky bits of dealing with the XArray from the workingset code to the XArray. Make it clear in the documentation that this is a private interface, and only export it for the benefit of the test suite. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-13Merge branches 'pm-cpuidle' and 'pm-devfreq'Rafael J. Wysocki
* pm-cpuidle: cpuidle: record state entry rejection statistics cpuidle: psci: Allow PM domain to be initialized even if no OSI mode firmware: psci: Extend psci_set_osi_mode() to allow reset to PC mode ACPI: processor: Print more information when acpi_processor_evaluate_cst() fails cpuidle: tegra: Correctly handle result of arm_cpuidle_simple_enter() * pm-devfreq: PM / devfreq: tegra30: Improve initial hardware resetting PM / devfreq: event: Change prototype of devfreq_event_get_edev_by_phandle function PM / devfreq: Change prototype of devfreq_get_devfreq_by_phandle function PM / devfreq: Add devfreq_get_devfreq_by_node function
2020-10-13Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: (30 commits) cpufreq: stats: Fix string format specifier mismatch arm: disable frequency invariance for CONFIG_BL_SWITCHER cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale() cpufreq: stats: Add memory barrier to store_reset() cpufreq: schedutil: Simplify sugov_fast_switch() cpufreq: Move traces and update to policy->cur to cpufreq core cpufreq: stats: Enable stats for fast-switch as well cpufreq: stats: Mark few conditionals with unlikely() cpufreq: stats: Remove locking cpufreq: stats: Defer stats update to cpufreq_stats_record_transition() arch_topology, arm, arm64: define arch_scale_freq_invariant() arch_topology, cpufreq: constify arch_* cpumasks cpufreq: report whether cpufreq supports Frequency Invariance (FI) cpufreq: move invariance setter calls in cpufreq core arch_topology: validate input frequencies to arch_set_freq_scale() cpufreq: qcom: Don't add frequencies without an OPP cpufreq: qcom-hw: Add cpufreq support for SM8250 SoC cpufreq: qcom-hw: Use of_device_get_match_data for offsets and row size cpufreq: qcom-hw: Use devm_platform_ioremap_resource() to simplify code dt-bindings: cpufreq: cpufreq-qcom-hw: Document Qcom EPSS compatible ...
2020-10-13ARM/ixp4xx: add a missing include of dma-map-ops.hChristoph Hellwig
Compilation of ixp4xx_defconfig fails with: arch/arm/mach-ixp4xx/common.c: In function 'ixp4xx_platform_notify_remove': arch/arm/mach-ixp4xx/common.c:291:3: error: implicit declaration of function 'dmabounce_unregister_dev' [-Werror=implicit-function-declaration] 291 | dmabounce_unregister_dev(dev); | ^~~~~~~~~~~~~~~~~~~~~~~~ arch/arm/mach-ixp4xx/common.c: In function 'ixp4xx_platform_notify': arch/arm/mach-ixp4xx/common.c:307:3: error: implicit declaration of function 'dmabounce_register_dev' [-Werror=implicit-function-declaration] 307 | dmabounce_register_dev(dev, 2048, 4096, ixp4xx_needs_bounce); Add a missing include that is needed because of the header reshuffle. Fixes: 0a0f0d8be76d ("dma-mapping: split <linux/dma-mapping.h>") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-13perf build: Allow nested externs to enable BUILD_BUG() usageVasily Gorbik
Currently the BUILD_BUG() macro is expanded to the following: do { extern void __compiletime_assert_0(void) __attribute__((error("BUILD_BUG failed"))); if (!(!(1))) __compiletime_assert_0(); } while (0); If used in a function body this would obviously produce build errors with -Wnested-externs and -Werror. To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf includes in intel-pt-decoder, build perf without -Wnested-externs. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # build tested Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/patch-1.thread-251403.git-2514037e9477.your-ad-here.call-01602244460-ext-7088@work.hours
2020-10-13bcm963xx_tag.h: fix duplicated wordRandy Dunlap
Change doubled word "is" to "it is". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: linux-mips@vger.kernel.org Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-10-13mips: ralink: enable zboot supportChuanhong Guo
Some of these ralink devices come with an ancient u-boot which can't extract LZMA properly when image gets too big. Enable zboot support to get a self-extracting kernel instead of relying on broken u-boot support. Signed-off-by: Chuanhong Guo <gch981213@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-10-13MIPS: ingenic: Remove CPU_SUPPORTS_HUGEPAGESPaul Cercueil
While it is true that Ingenic SoCs support huge pages, we cannot use them yet as PTEs don't have any single bit that is free. Right now, having that symbol only causes build errors, so remove it until the situation with PTEs is resolved. Fixes: f0f4a753079c ("MIPS: generic: Add support for Ingenic SoCs") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-10-13PCI: dwc: Fix MSI page leakage in suspend/resumeJisheng Zhang
Currently, dw_pcie_msi_init() allocates and maps page for msi, then program the PCIE_MSI_ADDR_LO and PCIE_MSI_ADDR_HI. The Root Complex may lose power during suspend-to-RAM, so when we resume, we want to redo the latter but not the former. If designware based driver (for example, pcie-tegra194.c) calls dw_pcie_msi_init() in resume path, the msi page will be leaked. As pointed out by Rob and Ard, there's no need to allocate a page for the MSI address, we could use an address in the driver data. To avoid map the MSI msg again during resume, we move the map MSI msg from dw_pcie_msi_init() to dw_pcie_host_init(). Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20201009155505.5a580ef5@xhacker.debian Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-13PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabledJisheng Zhang
If MSI is disabled, there's no need to program PCIE_MSI_INTR0_MASK and PCIE_MSI_INTR0_ENABLE registers. Link: https://lore.kernel.org/r/20201009155436.27e67238@xhacker.debian Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-13PCI: keystone: Remove iATU register mappingKunihiko Hayashi
After applying "PCI: dwc: Add common iATU register support", there is no need to set own iATU in the Keystone driver itself. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1601444167-11316-5-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-13PCI: dwc: Add common iATU register supportKunihiko Hayashi
This gets iATU register area from reg property that has reg-names "atu". In Synopsys DWC version 4.80 or later, since iATU register area is separated from core register area, this area is necessary to get from DT independently. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1601444167-11316-4-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-13dt-bindings: PCI: uniphier-ep: Add iATU register descriptionKunihiko Hayashi
In the dt-bindings, "atu" reg-names is required to get the register space for iATU in Synopsis DWC version 4.80 or later. Link: https://lore.kernel.org/r/1601444167-11316-3-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-13dt-bindings: PCI: uniphier: Add iATU register descriptionKunihiko Hayashi
In the dt-bindings, "atu" reg-names is required to get the register space for iATU in Synopsys DWC version 4.80 or later. Link: https://lore.kernel.org/r/1601444167-11316-2-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rob Herring <robh@kernel.org>
2020-10-12maiblox: mediatek: Fix handling of platform_get_irq() errorKrzysztof Kozlowski
platform_get_irq() returns -ERRNO on error. In such case casting to u32 and comparing to 0 would pass the check. Fixes: 623a6143a845 ("mailbox: mediatek: Add Mediatek CMDQ driver") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12mailbox: arm_mhu: Add ARM MHU doorbell driverSudeep Holla
The MHU drives the signal using a 32-bit register, with all 32 bits logically ORed together. The MHU provides a set of registers to enable software to set, clear, and check the status of each of the bits of this register independently. The use of 32 bits for each interrupt line enables software to provide more information about the source of the interrupt. For example, each bit of the register can be associated with a type of event that can contribute to raising the interrupt. This patch adds a separate the MHU controller driver for doorbel mode of operation using the extended DT binding to add support the same. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12mailbox: arm_mhu: Match only if compatible is "arm,mhu"Sudeep Holla
Since we will be soon adding a separate driver based on this ARM MHU driver to support doorbell mode, let us add explicit check to match the default compatible for this driver. This is needed as the probe and match reuses the AMBA device ids currently and don't have any explicit compatible check. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12dt-bindings: mailbox: add doorbell support to ARM MHUSudeep Holla
The ARM MHU's reference manual states following: "The MHU drives the signal using a 32-bit register, with all 32 bits logically ORed together. The MHU provides a set of registers to enable software to set, clear, and check the status of each of the bits of this register independently. The use of 32 bits for each interrupt line enables software to provide more information about the source of the interrupt. For example, each bit of the register can be associated with a type of event that can contribute to raising the interrupt." This patch thus extends the MHU controller's DT binding to add support for doorbell mode. Though the same MHU hardware controller is used in the two modes, A new compatible string is added here to represent the combination of the MHU hardware and the firmware sitting on the other side (which expects each bit to represent a different signal now). Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Co-developed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12dt-bindings: mailbox : arm,mhu: Convert to Json-schemaViresh Kumar
Convert the DT binding over to Json-schema. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12mailbox: bcm: convert tasklets to use new tasklet_setup() APIAllen Pais
In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2020-10-12x86/uaccess: utilize CONFIG_CC_HAS_ASM_GOTO_OUTPUTNick Desaulniers
Clang-11 shipped support for outputs to asm goto statments along the fallthrough path. Double up some of the get_user() and related macros to be able to take advantage of this extended GNU C extension. This should help improve the generated code's performance for these accesses. Cc: Bill Wendling <morbo@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-12x86: Make __put_user() generate an out-of-line callLinus Torvalds
Instead of inlining the stac/mov/clac sequence (which also requires individual exception table entries and several asm instruction alternatives entries), just generate "call __put_user_nocheck_X" for the __put_user() cases, the same way we changed __get_user earlier. Unlike the get_user() case, we didn't have the same nice infrastructure to just generate the call with a single case, so this actually has to change some of the infrastructure in order to do this. But that only cleans up the code further. So now, instead of using a case statement for the sizes, we just do the same thing we've done on the get_user() side for a long time: use the size as an immediate constant to the asm, and generate the asm that way directly. In order to handle the special case of 64-bit data on a 32-bit kernel, I needed to change the calling convention slightly: the data is passed in %eax[:%edx], the pointer in %ecx, and the return value is also returned in %ecx. It used to be returned in %eax, but because of how %eax can now be a double register input, we don't want mix that with a single-register output. The actual low-level asm is easier to handle: we'll just share the code between the checking and non-checking case, with the non-checking case jumping into the middle of the function. That may sound a bit too special, but this code is all very very special anyway, so... Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-12x86: Make __get_user() generate an out-of-line callLinus Torvalds
Instead of inlining the whole stac/lfence/mov/clac sequence (which also requires individual exception table entries and several asm instruction alternatives entries), just generate "call __get_user_nocheck_X" for the __get_user() cases. We can use all the same infrastructure that we already do for the regular "get_user()", and the end result is simpler source code, and much simpler code generation. It also means that when I introduce asm goto with input for "unsafe_get_user()", there are no nasty interactions with the __get_user() code. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-12Merge tag 'locks-v5.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking fix from Jeff Layton: "Just a single patch to fix up some tracepoint output" * tag 'locks-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: Remove extra "0x" in tracepoint format specifier
2020-10-12Merge branch 'compat.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat mount cleanups from Al Viro: "The last remnants of mount(2) compat buried by Christoph. Buried into NFS, that is. Generally I'm less enthusiastic about "let's use in_compat_syscall() deep in call chain" kind of approach than Christoph seems to be, but in this case it's warranted - that had been an NFS-specific wart, hopefully not to be repeated in any other filesystems (read: any new filesystem introducing non-text mount options will get NAKed even if it doesn't mess the layout up). IOW, not worth trying to grow an infrastructure that would avoid that use of in_compat_syscall()..." * 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove compat_sys_mount fs,nfs: lift compat nfs4 mount data handling into the nfs code nfs: simplify nfs4_parse_monolithic
2020-10-12Merge branch 'work.quota-compat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat quotactl cleanups from Al Viro: "More Christoph's compat cleanups: quotactl(2)" * 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: quota: simplify the quotactl compat handling compat: add a compat_need_64bit_alignment_fixup() helper compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
2020-10-12Merge branch 'work.iov_iter' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat iovec cleanups from Al Viro: "Christoph's series around import_iovec() and compat variant thereof" * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: security/keys: remove compat_keyctl_instantiate_key_iov mm: remove compat_process_vm_{readv,writev} fs: remove compat_sys_vmsplice fs: remove the compat readv/writev syscalls fs: remove various compat readv/writev helpers iov_iter: transparently handle compat iovecs in import_iovec iov_iter: refactor rw_copy_check_uvector and import_iovec iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c compat.h: fix a spelling error in <linux/compat.h>
2020-10-12Merge branch 'work.csum_and_copy' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull copy_and_csum cleanups from Al Viro: "Saner calling conventions for csum_and_copy_..._user() and friends" [ Removing 800+ lines of code and cleaning stuff up is good - Linus ] * 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ppc: propagate the calling conventions change down to csum_partial_copy_generic() amd64: switch csum_partial_copy_generic() to new calling conventions sparc64: propagate the calling convention changes down to __csum_partial_copy_...() xtensa: propagate the calling conventions change down into csum_partial_copy_generic() mips: propagate the calling convention change down into __csum_partial_copy_..._user() mips: __csum_partial_copy_kernel() has no users left mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() i386: propagate the calling conventions change down to csum_partial_copy_generic() sh: propage the calling conventions change down to csum_partial_copy_generic() m68k: get rid of zeroing destination on error in csum_and_copy_from_user() arm: propagate the calling convention changes down to csum_partial_copy_from_user() alpha: propagate the calling convention changes down to csum_partial_copy.c helpers saner calling conventions for csum_and_copy_..._user() csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum csum_partial_copy_nocheck(): drop the last argument unify generic instances of csum_partial_copy_nocheck() icmp_push_reply(): reorder adding the checksum up skb_copy_and_csum_bits(): don't bother with the last argument
2020-10-12Merge tag 'docs-5.10' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "As hoped, things calmed down for docs this cycle; fewer changes and almost no conflicts at all. This includes: - A reworked and expanded user-mode Linux document - Some simplifications and improvements for submitting-patches.rst - An emergency fix for (some) problems with Sphinx 3.x - Some welcome automarkup improvements to automatically generate cross-references to struct definitions and other documents - The usual collection of translation updates, typo fixes, etc" * tag 'docs-5.10' of git://git.lwn.net/linux: (81 commits) gpiolib: Update indentation in driver.rst for code excerpts Documentation/admin-guide: tainted-kernels: Fix typo occured Documentation: better locations for sysfs-pci, sysfs-tagging docs: programming-languages: refresh blurb on clang support Documentation: kvm: fix a typo Documentation: Chinese translation of Documentation/arm64/amu.rst doc: zh_CN: index files in arm64 subdirectory mailmap: add entry for <mstarovoitov@marvell.com> doc: seq_file: clarify role of *pos in ->next() docs: trace: ring-buffer-design.rst: use the new SPDX tag Documentation: kernel-parameters: clarify "module." parameters Fix references to nommu-mmap.rst docs: rewrite admin-guide/sysctl/abi.rst docs: fb: Remove vesafb scrollback boot option docs: fb: Remove sstfb scrollback boot option docs: fb: Remove matroxfb scrollback boot option docs: fb: Remove framebuffer scrollback boot option docs: replace the old User Mode Linux HowTo with a new one Documentation/admin-guide: blockdev/ramdisk: remove use of "rdev" Documentation/admin-guide: README & svga: remove use of "rdev" ...
2020-10-12Merge tag 'ia64_for_5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 updates from Tony Luck: "Cleanups by Christoph: - Switch to libata instead of legacy ide driver - Drop ia64 perfmon (it's been broken for a while)" * tag 'ia64_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: ia64: Use libata instead of the legacy ide driver in defconfigs ia64: Remove perfmon
2020-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-10-12 The main changes are: 1) The BPF verifier improvements to track register allocation pattern, from Alexei and Yonghong. 2) libbpf relocation support for different size load/store, from Andrii. 3) bpf_redirect_peer() helper and support for inner map array with different max_entries, from Daniel. 4) BPF support for per-cpu variables, form Hao. 5) sockmap improvements, from John. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offloadRaed Salem
In the TX data path, spot packets with xfrm stack IPsec offload indication. Fill Software-Parser segment in TX descriptor so that the hardware may parse the ESP protocol, and perform TX checksum offload on the inner payload. Support GSO, by providing the trailer data and ICV placeholder so HW can fill it post encryption operation. Padding alignment cannot be performed in HW (ConnectX-6Dx) due to a bug. Software can overcome this limitation by adding NETIF_F_HW_ESP to the gso_partial_features field in netdev so the packets being aligned by the stack. l4_inner_checksum cannot be offloaded by HW for IPsec tunnel type packet. Note that for GSO SKBs, the stack does not include an ESP trailer, unlike the non-GSO case. Below is the iperf3 performance report on two server of 24 cores Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz with ConnectX6-DX. All the bandwidth test uses iperf3 TCP traffic with packet size 128KB. Each tunnel uses one iperf3 stream with one thread (option -P1). TX crypto offload shows improvements on both bandwidth and CPU utilization. ---------------------------------------------------------------------- Mode | Num tunnel | BW | Send CPU util | Recv CPU util | | (Gbps) | (Average %) | (Average %) ---------------------------------------------------------------------- Cryto offload | | | | (RX only) | 1 | 4.7 | 4.2 | 3.5 ---------------------------------------------------------------------- Cryto offload | | | | (RX only) | 24 | 15.6 | 20 | 10 ---------------------------------------------------------------------- Non-offload | 1 | 4.6 | 4 | 5 ---------------------------------------------------------------------- Non-offload | 24 | 11.9 | 16 | 12 ---------------------------------------------------------------------- Cryto offload | | | | (TX & RX) | 1 | 11.9 | 2.1 | 5.9 ---------------------------------------------------------------------- Cryto offload | | | | (TX & RX) | 24 | 38 | 9.5 | 27.5 ---------------------------------------------------------------------- Cryto offload | | | | (TX only) | 1 | 4.7 | 0.7 | 5 ---------------------------------------------------------------------- Cryto offload | | | | (TX only) | 24 | 14.5 | 6 | 20 Regression tests show no degradation on non-ipsec and non-offload-ipsec traffics. The packet rate test uses pktgen UDP to transmit on single CPU, the instructions and cycles are measured on the transmit CPU. before: ---------------------------------------------------------------------- Non-offload | 1 | 4.7 | 4.2 | 5.1 ---------------------------------------------------------------------- Non-offload | 24 | 11.2 | 14 | 15 ---------------------------------------------------------------------- Non-ipsec | 1 | 28 | 4 | 5.7 ---------------------------------------------------------------------- Non-ipsec | 24 | 68.3 | 17.8 | 39.7 ---------------------------------------------------------------------- Non-ipsec packet rate(BURST=1000 BC=5 NCPUS=1 SIZE=60) 13.56Mpps, 456 instructions/pkt, 191 cycles/pkt after: ---------------------------------------------------------------------- Non-offload | 1 | 4.69 | 4.2 | 5 ---------------------------------------------------------------------- Non-offload | 24 | 11.9 | 13.5 | 15.1 ---------------------------------------------------------------------- Non-ipsec | 1 | 29 | 3.2 | 5.5 ---------------------------------------------------------------------- Non-ipsec | 24 | 68.2 | 18.5 | 39.8 ---------------------------------------------------------------------- Non-ipsec packet rate: 13.56Mpps, 472 instructions/pkt, 191 cycles/pkt Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>