summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-04-03f2fs: compress: support zstd compress algorithmChao Yu
Add zstd compress algorithm support, use "compress_algorithm=zstd" mountoption to enable it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-03dm integrity: fix logic bug in integrity tag testingMikulas Patocka
If all the bytes are equal to DISCARD_FILLER, we want to accept the buffer. If any of the bytes are different, we must do thorough tag-by-tag checking. The condition was inverted. Fixes: 84597a44a9d8 ("dm integrity: add optional discard support") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03docs: cgroup-v1: Document the cpuset_v2_mode mount optionWaiman Long
The cpuset in cgroup v1 accepts a special "cpuset_v2_mode" mount option that make cpuset.cpus and cpuset.mems behave more like those in cgroup v2. Document it to make other people more aware of this feature that can be useful in some circumstances. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-04-03Revert "dm: always call blk_queue_split() in dm_process_bio()"Mike Snitzer
This reverts commit effd58c95f277744f75d6e08819ac859dbcbd351. blk_queue_split() is causing excessive IO splitting -- because blk_max_size_offset() depends on 'chunk_sectors' limit being set and if it isn't (as is the case for DM targets!) it falls back to splitting on a 'max_sectors' boundary regardless of offset. "Fix" this by reverting back to _not_ using blk_queue_split() in dm_process_bio() for normal IO (reads and writes). Long-term fix is still TBD but it should focus on training blk_max_size_offset() to call into a DM provided hook (to call DM's max_io_len()). Test results from simple misaligned IO test on 4-way dm-striped device with chunksize of 128K and stripesize of 512K: xfs_io -d -c 'pread -b 2m 224s 4072s' /dev/mapper/stripe_dev before this revert: 253,0 21 1 0.000000000 2206 Q R 224 + 4072 [xfs_io] 253,0 21 2 0.000008267 2206 X R 224 / 480 [xfs_io] 253,0 21 3 0.000010530 2206 X R 224 / 256 [xfs_io] 253,0 21 4 0.000027022 2206 X R 480 / 736 [xfs_io] 253,0 21 5 0.000028751 2206 X R 480 / 512 [xfs_io] 253,0 21 6 0.000033323 2206 X R 736 / 992 [xfs_io] 253,0 21 7 0.000035130 2206 X R 736 / 768 [xfs_io] 253,0 21 8 0.000039146 2206 X R 992 / 1248 [xfs_io] 253,0 21 9 0.000040734 2206 X R 992 / 1024 [xfs_io] 253,0 21 10 0.000044694 2206 X R 1248 / 1504 [xfs_io] 253,0 21 11 0.000046422 2206 X R 1248 / 1280 [xfs_io] 253,0 21 12 0.000050376 2206 X R 1504 / 1760 [xfs_io] 253,0 21 13 0.000051974 2206 X R 1504 / 1536 [xfs_io] 253,0 21 14 0.000055881 2206 X R 1760 / 2016 [xfs_io] 253,0 21 15 0.000057462 2206 X R 1760 / 1792 [xfs_io] 253,0 21 16 0.000060999 2206 X R 2016 / 2272 [xfs_io] 253,0 21 17 0.000062489 2206 X R 2016 / 2048 [xfs_io] 253,0 21 18 0.000066133 2206 X R 2272 / 2528 [xfs_io] 253,0 21 19 0.000067507 2206 X R 2272 / 2304 [xfs_io] 253,0 21 20 0.000071136 2206 X R 2528 / 2784 [xfs_io] 253,0 21 21 0.000072764 2206 X R 2528 / 2560 [xfs_io] 253,0 21 22 0.000076185 2206 X R 2784 / 3040 [xfs_io] 253,0 21 23 0.000077486 2206 X R 2784 / 2816 [xfs_io] 253,0 21 24 0.000080885 2206 X R 3040 / 3296 [xfs_io] 253,0 21 25 0.000082316 2206 X R 3040 / 3072 [xfs_io] 253,0 21 26 0.000085788 2206 X R 3296 / 3552 [xfs_io] 253,0 21 27 0.000087096 2206 X R 3296 / 3328 [xfs_io] 253,0 21 28 0.000093469 2206 X R 3552 / 3808 [xfs_io] 253,0 21 29 0.000095186 2206 X R 3552 / 3584 [xfs_io] 253,0 21 30 0.000099228 2206 X R 3808 / 4064 [xfs_io] 253,0 21 31 0.000101062 2206 X R 3808 / 3840 [xfs_io] 253,0 21 32 0.000104956 2206 X R 4064 / 4096 [xfs_io] 253,0 21 33 0.001138823 0 C R 4096 + 200 [0] after this revert: 253,0 18 1 0.000000000 4430 Q R 224 + 3896 [xfs_io] 253,0 18 2 0.000018359 4430 X R 224 / 256 [xfs_io] 253,0 18 3 0.000028898 4430 X R 256 / 512 [xfs_io] 253,0 18 4 0.000033535 4430 X R 512 / 768 [xfs_io] 253,0 18 5 0.000065684 4430 X R 768 / 1024 [xfs_io] 253,0 18 6 0.000091695 4430 X R 1024 / 1280 [xfs_io] 253,0 18 7 0.000098494 4430 X R 1280 / 1536 [xfs_io] 253,0 18 8 0.000114069 4430 X R 1536 / 1792 [xfs_io] 253,0 18 9 0.000129483 4430 X R 1792 / 2048 [xfs_io] 253,0 18 10 0.000136759 4430 X R 2048 / 2304 [xfs_io] 253,0 18 11 0.000152412 4430 X R 2304 / 2560 [xfs_io] 253,0 18 12 0.000160758 4430 X R 2560 / 2816 [xfs_io] 253,0 18 13 0.000183385 4430 X R 2816 / 3072 [xfs_io] 253,0 18 14 0.000190797 4430 X R 3072 / 3328 [xfs_io] 253,0 18 15 0.000197667 4430 X R 3328 / 3584 [xfs_io] 253,0 18 16 0.000218751 4430 X R 3584 / 3840 [xfs_io] 253,0 18 17 0.000226005 4430 X R 3840 / 4096 [xfs_io] 253,0 18 18 0.000250404 4430 Q R 4120 + 176 [xfs_io] 253,0 18 19 0.000847708 0 C R 4096 + 24 [0] 253,0 18 20 0.000855783 0 C R 4120 + 176 [0] Fixes: effd58c95f27774 ("dm: always call blk_queue_split() in dm_process_bio()") Cc: stable@vger.kernel.org Reported-by: Andreas Gruenbacher <agruenba@redhat.com> Tested-by: Barry Marson <bmarson@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03Revert "cpuset: Make cpuset hotplug synchronous"Tejun Heo
This reverts commit a49e4629b5ed ("cpuset: Make cpuset hotplug synchronous") as it may deadlock with cpu hotplug path. Link: http://lkml.kernel.org/r/F0388D99-84D7-453B-9B6B-EEFF0E7BE4CC@lca.pw Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Qian Cai <cai@lca.pw> Cc: Prateek Sood <prsood@codeaurora.org>
2020-04-03tracing: Do not allocate buffer in trace_find_next_entry() in atomicSteven Rostedt (VMware)
When dumping out the trace data in latency format, a check is made to peek at the next event to compare its timestamp to the current one, and if the delta is of a greater size, it will add a marker showing so. But to do this, it needs to save the current event otherwise peeking at the next event will remove the current event. To save the event, a temp buffer is used, and if the event is bigger than the temp buffer, the temp buffer is freed and a bigger buffer is allocated. This allocation is a problem when called in atomic context. The only way this gets called via atomic context is via ftrace_dump(). Thus, use a static buffer of 128 bytes (which covers most events), and if the event is bigger than that, simply return NULL. The callers of trace_find_next_entry() need to handle a NULL case, as that's what would happen if the allocation failed. Link: https://lore.kernel.org/r/20200326091256.GR11705@shao2-debian Fixes: ff895103a84ab ("tracing: Save off entry when peeking at next entry") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-04-03KVM: SVM: Split svm_vcpu_run inline assembly to separate fileUros Bizjak
The compiler (GCC) does not like the situation, where there is inline assembly block that clobbers all available machine registers in the middle of the function. This situation can be found in function svm_vcpu_run in file kvm/svm.c and results in many register spills and fills to/from stack frame. This patch fixes the issue with the same approach as was done for VMX some time ago. The big inline assembly is moved to a separate assembly .S file, taking into account all ABI requirements. There are two main benefits of the above approach: * elimination of several register spills and fills to/from stack frame, and consequently smaller function .text size. The binary size of svm_vcpu_run is lowered from 2019 to 1626 bytes. * more efficient access to a register save array. Currently, register save array is accessed as: 7b00: 48 8b 98 28 02 00 00 mov 0x228(%rax),%rbx 7b07: 48 8b 88 18 02 00 00 mov 0x218(%rax),%rcx 7b0e: 48 8b 90 20 02 00 00 mov 0x220(%rax),%rdx and passing ia pointer to a register array as an argument to a function one gets: 12: 48 8b 48 08 mov 0x8(%rax),%rcx 16: 48 8b 50 10 mov 0x10(%rax),%rdx 1a: 48 8b 58 18 mov 0x18(%rax),%rbx As a result, the total size, considering that the new function size is 229 bytes, gets lowered by 164 bytes. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-03KVM: SVM: Move SEV code to separate fileJoerg Roedel
Move the SEV specific parts of svm.c into the new sev.c file. Signed-off-by: Joerg Roedel <jroedel@suse.de> Message-Id: <20200324094154.32352-5-joro@8bytes.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-03KVM: SVM: Move AVIC code to separate fileJoerg Roedel
Move the AVIC related functions from svm.c to the new avic.c file. Signed-off-by: Joerg Roedel <jroedel@suse.de> Message-Id: <20200324094154.32352-4-joro@8bytes.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-03KVM: SVM: Move Nested SVM Implementation to nested.cJoerg Roedel
Split out the code for the nested SVM implementation and move it to a separate file. Signed-off-by: Joerg Roedel <jroedel@suse.de> Message-Id: <20200324094154.32352-3-joro@8bytes.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-03kVM SVM: Move SVM related files to own sub-directoryJoerg Roedel
Move svm.c and pmu_amd.c into their own arch/x86/kvm/svm/ subdirectory. Signed-off-by: Joerg Roedel <jroedel@suse.de> Message-Id: <20200324094154.32352-2-joro@8bytes.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-03dm integrity: fix ppc64le warningMike Snitzer
Otherwise: In file included from drivers/md/dm-integrity.c:13: drivers/md/dm-integrity.c: In function 'dm_integrity_status': drivers/md/dm-integrity.c:3061:10: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=] DMEMIT("%llu %llu", ^~~~~~~~~~~ atomic64_read(&ic->number_of_mismatches), ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/device-mapper.h:550:46: note: in definition of macro 'DMEMIT' 0 : scnprintf(result + sz, maxlen - sz, x)) ^ cc1: all warnings being treated as errors Fixes: 7649194a1636ab5 ("dm integrity: remove sector type casts") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03perf python: Fix clang detection to strip out options passed in $CCArnaldo Carvalho de Melo
The clang check in the python setup.py file expected $CC to be just the name of the compiler, not the compiler + options, i.e. all options were expected to be passed in $CFLAGS, this ends up making it fail in systems where CC is set to, e.g.: "aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot" Like this: $ python3 >>> from subprocess import Popen >>> a = Popen(["aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot", "-v"]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python3.6/subprocess.py", line 729, in __init__ restore_signals, start_new_session) File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename) FileNotFoundError: [Errno 2] No such file or directory: 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot': 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot' >>> Make it more robust, covering this case, by passing cc.split()[0] as the first arg to popen(). Fixes: a7ffd416d804 ("perf python: Fix clang detection when using CC=clang-version") Reported-by: Daniel Díaz <daniel.diaz@linaro.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Daniel Díaz <daniel.diaz@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ilie Halip <ilie.halip@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200401124037.GA12534@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Support Python 3.8+ in MakefileSam Lunt
Python 3.8 changed the output of 'python-config --ldflags' to no longer include the '-lpythonX.Y' flag (this apparently fixed an issue loading modules with a statically linked Python executable). The libpython feature check in linux/build/feature fails if the Python library is not included in FEATURE_CHECK_LDFLAGS-libpython variable. This adds a check in the Makefile to determine if PYTHON_CONFIG accepts the '--embed' flag and passes that flag alongside '--ldflags' if so. tools/perf is the only place the libpython feature check is used. Signed-off-by: Sam Lunt <samuel.j.lunt@gmail.com> Tested-by: He Zhe <zhe.he@windriver.com> Link: http://lore.kernel.org/lkml/c56be2e1-8111-9dfe-8298-f7d0f9ab7431@windriver.com Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: trivial@kernel.org Cc: stable@kernel.org Link: http://lore.kernel.org/lkml/20200131181123.tmamivhq4b7uqasr@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Fix invalid read of directory entry after closedir()Andreas Gerstmayr
closedir(lang_dir) frees the memory of script_dirent->d_name, which gets accessed in the next line in a call to scnprintf(). Valgrind report: Invalid read of size 1 ==413557== at 0x483CBE6: strlen (vg_replace_strmem.c:461) ==413557== by 0x4DD45FD: __vfprintf_internal (vfprintf-internal.c:1688) ==413557== by 0x4DE6679: __vsnprintf_internal (vsnprintf.c:114) ==413557== by 0x53A037: vsnprintf (stdio2.h:80) ==413557== by 0x53A037: scnprintf (vsprintf.c:21) ==413557== by 0x435202: get_script_path (builtin-script.c:3223) ==413557== Address 0x52e7313 is 1,139 bytes inside a block of size 32,816 free'd ==413557== at 0x483AA0C: free (vg_replace_malloc.c:540) ==413557== by 0x4E303C0: closedir (closedir.c:50) ==413557== by 0x4351DC: get_script_path (builtin-script.c:3222) Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200402124337.419456-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script report: Fix SEGFAULT when using DWARF modeAndreas Gerstmayr
When running perf script report with a Python script and a callgraph in DWARF mode, intr_regs->regs can be 0 and therefore crashing the regs_map function. Added a check for this condition (same check as in builtin-script.c:595). Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lore.kernel.org/lkml/20200402125417.422232-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: add -S/--symbols documentationIan Rogers
Capture both that this option exists and that symbols can be hexadecimal addresses. Signed-off-by: Ian Rogers <irogers@google.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200402174130.140319-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metricJin Yao
The kernel utilization metric does multiplexing currently and is somewhat unreliable. The problem is that it uses two instances of the fixed counter, and the kernel has to multipleplex which causes errors. So should use CPU_CLK_UNHALTED.THREAD instead. Before: # perf stat -M Kernel_Utilization -- sleep 1 Performance counter stats for 'sleep 1': 1,419,425 cpu_clk_unhalted.ref_tsc:k <not counted> cpu_clk_unhalted.ref_tsc (0.00%) After: # perf stat -M Kernel_Utilization -- sleep 1 Performance counter stats for 'sleep 1': 746,688 cpu_clk_unhalted.thread:k # 0.7 Kernel_Utilization 1,088,348 cpu_clk_unhalted.thread Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200309013125.7559-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf events parser: Add missing Intel CPU events to parserAdrian Hunter
perf list expects CPU events to be parseable by name, e.g. # perf list | grep el-capacity-read el-capacity-read OR cpu/el-capacity-read/ [Kernel PMU event] But the event parser does not recognize them that way, e.g. # perf test -v "Parse event" <SNIP> running test 54 'cycles//u' running test 55 'cycles:k' running test 0 'cpu/config=10,config1,config2=3,period=1000/u' running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u' running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/' running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp' -> cpu/event=0,umask=0x11/ -> cpu/event=0,umask=0x13/ -> cpu/event=0x54,umask=0x1/ failed to parse event 'el-capacity-read:u,cpu/event=el-capacity-read/u', err 1, str 'parser error' event syntax error: 'el-capacity-read:u,cpu/event=el-capacity-read/u' \___ parser error test child finished with 1 ---- end ---- Parse event definition strings: FAILED! This happens because the parser splits names by '-' in order to deal with cache events. For example 'L1-dcache' is a token in parse-events.l which is matched to 'L1-dcache-load-miss' by the following rule: PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config And so there is special handling for 2-part PMU names i.e. PE_PMU_EVENT_PRE '-' PE_PMU_EVENT_SUF sep_dc but no handling for 3-part names, which are instead added as tokens e.g. topdown-[a-z-]+ While it would be possible to add a rule for 3-part names, that would not work if the first parts were also a valid PMU name e.g. 'el-capacity-read' would be matched to 'el-capacity' before the parser reached the 3rd part. The parser would need significant change to rationalize all this, so instead fix for now by adding missing Intel CPU events with 3-part names to the event parser as tokens. Missing events were found by using: grep -r EVENT_ATTR_STR arch/x86/events/intel/core.c Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/90c7ae07-c568-b6d3-f9c4-d0c1528a0610@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Allow --symbol to accept hexadecimal addressesStephane Eranian
This patch extends the perf script --symbols option to filter on hexadecimal addresses in addition to symbol names. This makes it easier to handle cases where symbols are aliased. With this patch, it is possible to mix and match symbols and hexadecimal addresses using the --symbols option. $ perf script --symbols=noploop,0x4007a0 Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325220802.15039-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf report/top TUI: Fix title line formattingArnaldo Carvalho de Melo
In d10ec006dcd7 ("perf hists browser: Allow passing an initial hotkey") the hist_entry__title() call was cut'n'pasted to a function where the 'title' variable is a pointer, not an array, so the sizeof(title) continues syntactically valid but ends up reducing the real size of the buffer where to format the first line in the screen to 8 bytes, which makes the formatting at the title at each refresh to produce just the string "Samples ", duh, fix it by passing the size of the buffer. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: d10ec006dcd7 ("perf hists browser: Allow passing an initial hotkey") Link: http://lore.kernel.org/lkml/20200330154314.GB4576@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Support hotkey to change sort orderJin Yao
It would be nice if we can use a hotkey in perf top browser to select a event for sorting. For example: perf top --group -e cycles,instructions,cache-misses Samples Overhead Shared Object Symbol 40.03% 45.71% 0.03% div [.] main 20.46% 14.67% 0.21% libc-2.27.so [.] __random_r 20.01% 19.54% 0.02% libc-2.27.so [.] __random 9.68% 10.68% 0.00% div [.] compute_flag 4.32% 4.70% 0.00% libc-2.27.so [.] rand 3.84% 3.43% 0.00% div [.] rand@plt 0.05% 0.05% 2.33% libc-2.27.so [.] __strcmp_sse2_unaligned 0.04% 0.08% 2.43% perf [.] perf_hpp__is_dynamic_en 0.04% 0.02% 6.64% perf [.] rb_next 0.04% 0.01% 3.87% perf [.] dso__find_symbol 0.04% 0.04% 1.77% perf [.] sort__dso_cmp When user press hotkey '2' (event index, starting from 0), it indicates to sort output by the third event in group (cache-misses). Samples Overhead Shared Object Symbol 4.07% 1.28% 6.68% perf [.] rb_next 3.57% 3.98% 4.11% perf [.] __hists__insert_output 3.67% 11.24% 3.60% perf [.] perf_hpp__is_dynamic_e 3.67% 3.20% 3.20% perf [.] hpp__sort_overhead 0.81% 0.06% 3.01% perf [.] dso__find_symbol 1.62% 5.47% 2.51% perf [.] hists__match 2.70% 1.86% 2.47% libc-2.27.so [.] _int_malloc 0.19% 0.00% 2.29% [kernel] [k] copy_page 0.41% 0.32% 1.98% perf [.] hists__decay_entries 1.84% 3.67% 1.68% perf [.] sort__dso_cmp 0.16% 0.00% 1.63% [kernel] [k] clear_page_erms Now the output is sorted by cache-misses. v2: --- Zero the history if hotkey is pressed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200324220711.6025-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Support --group-sort-idx to change the sort orderJin Yao
'perf report' supports the option --group-sort-idx, which sorts the output by the event at the index n in event group. For example: perf record -e cycles,instructions,cache-misses perf report --group --group-sort-idx 2 --stdio The perf-report output is sorted by cache-misses. This patch supports --group-sort-idx in perf-top. For example: perf top --group -e cycles,instructions,cache-misses --group-sort-idx 2 The perf-top output is sorted by cache-misses. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200324220711.6025-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf symbols: Fix arm64 gap between kernel start and module endKemeng Shi
During execution of command 'perf report' in my arm64 virtual machine, this error message is showed: failed to process sample __symbol__inc_addr_samples(860): ENOMEM! sym->name=__this_module, start=0x1477100, addr=0x147dbd8, end=0x80002000, func: 0 The error is caused with path: cmd_report __cmd_report perf_session__process_events __perf_session__process_events ordered_events__flush __ordered_events__flush oe->deliver (ordered_events__deliver_event) perf_session__deliver_event machines__deliver_event perf_evlist__deliver_sample tool->sample (process_sample_event) hist_entry_iter__add iter->add_entry_cb(hist_iter__report_callback) hist_entry__inc_addr_samples symbol__inc_addr_samples __symbol__inc_addr_samples h = annotated_source__histogram(src, evidx) (NULL) annotated_source__histogram failed is caused with path: ... hist_entry__inc_addr_samples symbol__inc_addr_samples symbol__hists annotated_source__alloc_histograms src->histograms = calloc(nr_hists, sizeof_sym_hist) (failed) Calloc failed as the symbol__size(sym) is too huge. As show in error message: start=0x1477100, end=0x80002000, size of symbol is about 2G. This is the same problem as 'perf annotate: Fix s390 gap between kernel end and module start (b9c0a64901d5bd)'. Perf gets symbol information from /proc/kallsyms in __dso__load_kallsyms. A part of symbol in /proc/kallsyms from my virtual machine is as follows: #cat /proc/kallsyms | sort ... ffff000001475080 d rpfilter_mt_reg [ip6t_rpfilter] ffff000001475100 d $d [ip6t_rpfilter] ffff000001475100 d __this_module [ip6t_rpfilter] ffff000080080000 t _head ffff000080080000 T _text ffff000080080040 t pe_header ... Take line 'ffff000001475100 d __this_module [ip6t_rpfilter]' as example. The start and end of symbol are both set to ffff000001475100 in dso__load_all_kallsyms. Then symbols__fixup_end will set the end of symbol to next big address to ffff000001475100 in /proc/kallsyms, ffff000080080000 in this example. Then sizeof of symbol will be about 2G and cause the problem. The start of module in my machine is ffff000000a62000 t $x [dm_mod] The start of kernel in my machine is ffff000080080000 t _head There is a big gap between end of module and begin of kernel if a samll amount of memory is used by module. And the last symbol in module will have a large address range as caotaining the big gap. Give that the module and kernel text segment sequence may change in the future, fix this by limiting range of last symbol in module and kernel to 4K in arch arm64. Signed-off-by: Kemeng Shi <shikemeng@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/33fd24c4-0d5a-9d93-9b62-dffa97c992ca@huawei.com [ refreshed the patch on current codebase, added string.h include as strchr() is used ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf build-test: Honour JOBS to override detection of number of coresArnaldo Carvalho de Melo
When one does: $ make -C tools/perf build-test The makefile in tools/perf/tests/ will, just like the main one, detect how many cores are in the system and use it with -j. Sometimes we may need to override that, for instance, when using icecream or distcc to use multiple machines in the build process, then we need to, as with the main makefile, use: $ make JOBS=N -C tools/perf build-test Fix the tests makefile to honour that. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200330130301.GA31702@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf script: Add --show-cgroup-events optionNamhyung Kim
The --show-cgroup-events option is to print CGROUP events in the output like others. Committer testing: [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ] [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2 swapper 0 0.000000: PERF_RECORD_CGROUP cgroup: 1 / perf 12145 11200.440730: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) perf 12145 11200.440733: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) -- cgtest 12145 11200.440739: 193472 cycles: ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12145 11200.440790: 2691608 cycles: 7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so) cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub cgtest 12147 11200.441054: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12147 11200.441057: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) -- cgtest 12148 11200.441103: 10227 cycles: ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12148 11200.441106: 273295 cycles: ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1 cgtest 12147 11200.441143: 2788845 cycles: ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2 cgtest 12148 11200.441182: 2669546 cycles: 401020 _init+0x20 (/wb/cgtest) cgtest 12149 11200.441247: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux) [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf top: Add --all-cgroups optionNamhyung Kim
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that 'perf top' can identify task/cgroup association later. Committer testing: Use: # perf top --all-cgroups -s cgroup_id,cgroup,pid Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-9-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf record: Add --all-cgroups optionNamhyung Kim
The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that perf report can identify task/cgroup association later. [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ] [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 558 of event 'cycles' # Event count (approx.): 458017341 # # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 33.15% 4/0xeffffffb /sub 9615:looper0 32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2 32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1 0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest 0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest 0.32% 4/0xeffffffb / 9615:looper0 0.11% 4/0xeffffffb /sub 9617:cgtest 0.10% 4/0xeffffffb /sub 9618:cgtest # # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S') # [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf record: Support synthesizing cgroup eventsNamhyung Kim
Synthesize cgroup events by iterating cgroup filesystem directories. The cgroup event only saves the portion of cgroup path after the mount point and the cgroup id (which actually is a file handle). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-7-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf report: Add 'cgroup' sort keyNamhyung Kim
The cgroup sort key is to show cgroup membership of each task. Currently it shows full path in the cgroupfs (not relative to the root of cgroup namespace) since it'd be more intuitive IMHO. Otherwise root cgroup in different namespaces will all show same name - "/". The cgroup sort key should come before cgroup_id otherwise sort_dimension__add() will match it to cgroup_id as it only matches with the given substring. For example it will look like following. Note that record patch adding --all-cgroups patch will come later. $ perf record -a --namespace --all-cgroups cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.208 MB perf.data (4090 samples) ] $ perf report -s cgroup_id,cgroup,pid ... # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 93.96% 0/0x0 / 0:swapper 1.25% 3/0xeffffffb / 278:looper0 0.86% 3/0xf000015f /sub/cgrp1 280:cgtest 0.37% 3/0xf0000160 /sub/cgrp2 281:cgtest 0.34% 3/0xf0000163 /sub/cgrp3 282:cgtest 0.22% 3/0xeffffffb /sub 278:looper0 0.20% 3/0xeffffffb / 280:cgtest 0.15% 3/0xf0000163 /sub/cgrp3 285:looper3 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf cgroup: Maintain cgroup hierarchyNamhyung Kim
Each cgroup is kept in the perf_env's cgroup_tree sorted by the cgroup id. Hist entries have cgroup id can compare it directly and later it can be used to find a group name using this tree. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Basic support for CGROUP eventNamhyung Kim
Implement basic functionality to support cgroup tracking. Each cgroup can be identified by inode number which can be read from userspace too. The actual cgroup processing will come in the later patch. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> [ fix perf test failure on sampling parsing ] Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf tools: Add file-handle feature testNamhyung Kim
The file handle (FHANDLE) support is configurable so some systems might not have it. So add a config feature item to check it on build time so that we don't add the cgroup tracking feature based on that. Committer notes: Had to make the test use the same construct as its later use in synthetic-events.c, in the next patch in this series. i.e. make it be: struct { struct file_handle fh; uint64_t cgroup_id; } handle; To cope with: CC /tmp/build/perf/util/cloexec.o util/synthetic-events.c:428:22: error: field 'fh' with CC /tmp/build/perf/util/call-path.o variable sized type 'struct file_handle' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end] struct file_handle fh; ^ 1 error generated. Deal with this at some point, i.e. investigate if the right thing is to remove that -Wgnu-variable-sized-type-not-at-end from our CFLAGS, for now do the test the same way as it is used looks more sensible. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ split from a larger patch, removed blank line at EOF ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03perf python: Include rwsem.c in the pythong bidingArnaldo Carvalho de Melo
We'll need it for the cgroup patches, and its better to have it in a separate patch in case we need to later revert the cgroup patches. I.e. without this we have: [root@five ~]# perf test -v python 19: 'import perf' in python : --- start --- test child forked, pid 148447 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.cpython-37m-x86_64-linux-gnu.so: undefined symbol: down_write test child finished with -1 ---- end ---- 'import perf' in python: FAILED! [root@five ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200403123606.GC23243@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03rtc: ds1307: check for failed memory allocation on wdtColin Ian King
Currently a failed memory allocation will lead to a null pointer dereference on point wdt. Fix this by checking for a failed allocation and just returning. Addresses-Coverity: ("Dereference null return") Fixes: fd90d48db037 ("rtc: ds1307: add support for watchdog timer on ds1388") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200403110437.57420-1-colin.king@canonical.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2020-04-03Documentation: PM: sleep: Document system-wide suspend code flowsRafael J. Wysocki
Add a document describing high-level system-wide suspend code flows in Linux. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-03cpufreq: Select schedutil when using big.LITTLELinus Walleij
When we are using a system with big.LITTLE HMP configuration, we need to use EAS to schedule the system. As can be seen from kernel/sched/topology.c: "EAS can be used on a root domain if it meets all the following conditions: 1. an Energy Model (EM) is available; 2. the SD_ASYM_CPUCAPACITY flag is set in the sched_domain hierarchy. 3. no SMT is detected. 4. the EM complexity is low enough to keep scheduling overheads low; 5. schedutil is driving the frequency of all CPUs of the rd;" This means that at the very least, schedutil needs to be available as a scheduling policy for EAS to work on these systems. Make this explicit by defaulting to the schedutil governor if BIG_LITTLE is selected. Currently users of the TC2 board (like me) has to figure these dependencies out themselves and it is not helpful. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-03Revert "powerpc/64: irq_work avoid interrupt when called with hardware irqs ↵Nicholas Piggin
enabled" This reverts commit ebb37cf3ffd39fdb6ec5b07111f8bb2f11d92c5f. That commit does not play well with soft-masked irq state manipulations in idle, interrupt replay, and possibly others due to tracing code sometimes using irq_work_queue (e.g., in trace_hardirqs_on()). That can cause PACA_IRQ_DEC to become set when it is not expected, and be ignored or cleared or cause warnings. The net result seems to be missing an irq_work until the next timer interrupt in the worst case which is usually not going to be noticed, however it could be a long time if the tick is disabled, which is against the spirit of irq_work and might cause real problems. The idea is still solid, but it would need more work. It's not really clear if it would be worth added complexity, so revert this for now (not a straight revert, but replace with a comment explaining why we might see interrupts happening, and gives git blame something to find). Fixes: ebb37cf3ffd3 ("powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200402120401.1115883-1-npiggin@gmail.com
2020-04-03powerpc/time: Replace <linux/clk-provider.h> by <linux/of_clk.h>Geert Uytterhoeven
The PowerPC time code is not a clock provider, and just needs to call of_clk_init(). Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>. Remove the #ifdef protecting the of_clk_init() call, as a stub is available for the !CONFIG_COMMON_CLK case. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200213083804.24315-1-geert+renesas@glider.be
2020-04-03csky: Fixup cpu speculative execution to IO areaGuo Ren
For the memory size ( > 512MB, < 1GB), the MSA setting is: - SSEG0: PHY_START , PHY_START + 512MB - SSEG1: PHY_START + 512MB, PHY_START + 1GB But the real memory is no more than 1GB, there is a gap between the end size of memory and border of 1GB. CPU could speculatively execute to that gap and if the gap of the bus couldn't respond to the CPU request, then the crash will happen. Now make the setting with: - SSEG0: PHY_START , PHY_START + 512MB (no change) - SSEG1: Disabled (We use highmem to use the memory of 512MB~1GB) We also deprecated zhole_szie[] settings, it's only used by arm style CPUs. All memory gap should use Reserved setting of dts in csky system. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-04-03crypto: marvell/octeontx - fix double free of ptrColin Ian King
Currently in the case where eq->src != req->ds, the allocation of ptr is kfree'd at the end of the code block. However later on in the case where enc is not null any of the error return paths that return via the error handling return path end up performing an erroneous second kfree of ptr. Fix this by adding an error exit label error_free and only jump to this when ptr needs kfree'ing thus avoiding the double free issue. Addresses-Coverity: ("Double free") Fixes: 10b4f09491bf ("crypto: marvell - add the Virtual Function driver for CPT") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-04-03crypto: hisilicon - Fix build errorYueHaibing
When UACCE is m, CRYPTO_DEV_HISI_QM cannot be built-in. But CRYPTO_DEV_HISI_QM is selected by CRYPTO_DEV_HISI_SEC2 and CRYPTO_DEV_HISI_HPRE unconditionally, which may leads this: drivers/crypto/hisilicon/qm.o: In function 'qm_alloc_uacce': drivers/crypto/hisilicon/qm.c:1579: undefined reference to 'uacce_alloc' Add Kconfig dependency to enforce usable configurations. Fixes: 47c16b449921 ("crypto: hisilicon - qm depends on UACCE") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-04-03csky: Add uprobes supportGuo Ren
This patch adds support for uprobes on csky architecture. Just like kprobe, it support single-step and simulate instructions. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-04-03csky: Add kprobes supportedGuo Ren
This patch enable kprobes, kretprobes, ftrace interface. It utilized software breakpoint and single step debug exceptions, instructions simulation on csky. We use USR_BKPT replace origin instruction, and the kprobe handler prepares an excutable memory slot for out-of-line execution with a copy of the original instruction being probed. Most of instructions could be executed by single-step, but some instructions need origin pc value to execute and we need software simulate these instructions. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-04-02Merge branch 'for-5.7/libnvdimm' into libnvdimm-for-nextDan Williams
- Introduce 'zero_page_range' as a dax operation. This facilitates filesystem-dax operation without a block-device. - Advertise a persistence-domain for of_pmem and papr_scm. The persistence domain indicates where cpu-store cycles need to reach in the platform-memory subsystem before the platform will consider them power-fail protected. - Fixup some flexible-array declarations.
2020-04-02Merge branch 'for-5.7/numa' into libnvdimm-for-nextDan Williams
- Promote numa_map_to_online_node() to a cross-kernel generic facility. - Save x86 numa information to allow for node-id lookups for reserved memory ranges, deploy that capability for the e820-pmem driver. - Introduce phys_to_target_node() to facilitate drivers that want to know resulting numa node if a given reserved address range was onlined.
2020-04-02Merge branch 'for-5.6/libnvdimm-fixes' into libnvdimm-for-nextDan Williams
Pick up some miscellaneous minor fixes, that missed v5.6-final, including a some smatch reports in the ioctl path and some unit test compilation fixups.
2020-04-02dax: Move mandatory ->zero_page_range() check in alloc_dax()Vivek Goyal
zero_page_range() dax operation is mandatory for dax devices. Right now that check happens in dax_zero_page_range() function. Dan thinks that's too late and its better to do the check earlier in alloc_dax(). I also modified alloc_dax() to return pointer with error code in it in case of failure. Right now it returns NULL and caller assumes failure happened due to -ENOMEM. But with this ->zero_page_range() check, I need to return -EINVAL instead. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-02dax,iomap: Add helper dax_iomap_zero() to zero a rangeVivek Goyal
Add a helper dax_ioamp_zero() to zero a range. This patch basically merges __dax_zero_page_range() and iomap_dax_zero(). Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200228163456.1587-7-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-02dax: Use new dax zero page method for zeroing a pageVivek Goyal
Use new dax native zero page method for zeroing page if I/O is page aligned. Otherwise fall back to direct_access() + memcpy(). This gets rid of one of the depenendency on block device in dax path. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200228163456.1587-6-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>