summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2018-10-31perf unwind: Take pgoff into account when reporting elf to libdwflMilian Wolff
libdwfl parses an ELF file itself and creates mappings for the individual sections. perf on the other hand sees raw mmap events which represent individual sections. When we encounter an address pointing into a mapping with pgoff != 0, we must take that into account and report the file at the non-offset base address. This fixes unwinding with libdwfl in some cases. E.g. for a file like: ``` using namespace std; mutex g_mutex; double worker() { lock_guard<mutex> guard(g_mutex); uniform_real_distribution<double> uniform(-1E5, 1E5); default_random_engine engine; double s = 0; for (int i = 0; i < 1000; ++i) { s += norm(complex<double>(uniform(engine), uniform(engine))); } cout << s << endl; return s; } int main() { vector<std::future<double>> results; for (int i = 0; i < 10000; ++i) { results.push_back(async(launch::async, worker)); } return 0; } ``` Compile it with `g++ -g -O2 -lpthread cpp-locking.cpp -o cpp-locking`, then record it with `perf record --call-graph dwarf -e sched:sched_switch`. When you analyze it with `perf script` and libunwind, you should see: ``` cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120 ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux) 7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so) 7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so) 7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so) 7f38e42569e5 __GI___libc_malloc+0x115 (inlined) 7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined) 7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined) 7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined) 7f38e424df36 _IO_new_file_xsputn+0x116 (inlined) 7f38e4242bfb __GI__IO_fwrite+0xdb (inlined) 7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c> 7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl> 7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25) 563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined) 563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking) 563b9cb506fb double std::__invoke_impl<double, double (*)()>(std::__invoke_other, double (*&&)())+0x2b (inlined) 563b9cb506fb std::__invoke_result<double (*)()>::type std::__invoke<double (*)()>(double (*&&)())+0x2b (inlined) 563b9cb506fb decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<double (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2b (inlined) 563b9cb506fb std::thread::_Invoker<std::tuple<double (*)()> >::operator()()+0x2b (inlined) 563b9cb506fb std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<double>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<double (*)()> >, dou> 563b9cb506fb std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_> 563b9cb507e8 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const+0x28 (inlined) 563b9cb507e8 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)+0x28 (/ssd/milian/> 7f38e46d24fe __pthread_once_slow+0xbe (/usr/lib/libpthread-2.28.so) 563b9cb51149 __gthread_once+0xe9 (inlined) 563b9cb51149 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)> 563b9cb51149 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool)+0xe9 (inlined) 563b9cb51149 std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >&&)::{lambda()#1}::op> 563b9cb51149 void std::__invoke_impl<void, std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double> 563b9cb51149 std::__invoke_result<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >> 563b9cb51149 decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_> 563b9cb51149 std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<dou> 563b9cb51149 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread> 7f38e45f0062 execute_native_thread_routine+0x12 (/usr/lib/libstdc++.so.6.0.25) 7f38e46caa9c start_thread+0xfc (/usr/lib/libpthread-2.28.so) 7f38e42ccb22 __GI___clone+0x42 (inlined) ``` Before this patch, using libdwfl, you would see: ``` cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120 ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux) 7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so) a041161e77950c5c [unknown] ([unknown]) ``` With this patch applied, we get a bit further in unwinding: ``` cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120 ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux) ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux) 7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so) 7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so) 7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so) 7f38e42569e5 __GI___libc_malloc+0x115 (inlined) 7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined) 7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined) 7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined) 7f38e424df36 _IO_new_file_xsputn+0x116 (inlined) 7f38e4242bfb __GI__IO_fwrite+0xdb (inlined) 7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined) 7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c> 7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl> 7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25) 563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined) 563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking) 6eab825c1ee3e4ff [unknown] ([unknown]) ``` Note that the backtrace is still stopping too early, when compared to the nice results obtained via libunwind. It's unclear so far what the reason for that is. Committer note: Further comment by Milian on the thread started on the Link: tag below: --- The remaining issue is due to a bug in elfutils: https://sourceware.org/ml/elfutils-devel/2018-q4/msg00089.html With both patches applied, libunwind and elfutils produce the same output for the above scenario. --- Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181029141644.3907-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31perf top: Do not use overwrite mode by defaultArnaldo Carvalho de Melo
Enabling --overwrite mode allows us to to use just the most recent records, which helps in high core count machines such as Knights Landing/Mill, but right now is being disabled by default as the pausing used in this technique is leading to loss of metadata events such as PERF_RECORD_MMAP which makes 'perf top' unable to resolve samples, leading to lots of unknown samples appearing on the UI. Enabling this may be useful if you are in such machines and profiling a workload that doesn't creates short lived threads and/or doesn't uses many executable mmap operations. Work is being planed to solve this situation, till then, this will remain disabled by default. Reported-by: David Miller <davem@davemloft.net> Acked-by: Kan Liang <kan.liang@intel.com> Link: https://lkml.kernel.org/r/4f84468f-37d9-cf1b-12c1-514ef74b6a48@linux.intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: https://lkml.kernel.org/n/tip-ehvf77vi1si9409r7p4wx788@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31selftests/powerpc/cache_shape: Fix out-of-tree buildMichael Ellerman
Use TEST_GEN_PROGS and don't redefine all, this makes the out-of-tree build work. We need to move the extra dependencies below the include of lib.mk, because it adds the $(OUTPUT) prefix if it's defined. We can also drop the clean rule, lib.mk does it for us. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-31selftests/powerpc/switch_endian: Fix out-of-tree buildMichael Ellerman
For the out-of-tree build to work we need to tell switch_endian_test to look for check-reversed.S in $(OUTPUT). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-31selftests/powerpc/pmu: Link ebb tests with -no-pieJoel Stanley
When running the ebb tests after building on a ppc64le Ubuntu machine: $ pmu/ebb/reg_access_test: error while loading shared libraries: R_PPC64_ADDR16_HI reloc at 0x000000013a965130 for symbol `' out of range This is because the Ubuntu toolchain builds has PIE enabled by default. Change it to be always off instead. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-31selftests/powerpc/signal: Fix out-of-tree buildJoel Stanley
We should use TEST_GEN_PROGS, not TEST_PROGS. That tells the selftests makefile (lib.mk) that those tests are generated (built), and so it adds the $(OUTPUT) prefix for us, making the out-of-tree build work correctly. It also means we don't need our own clean rule, lib.mk does it. We also have to update the signal_tm rule to use $(OUTPUT). Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-31selftests/powerpc/ptrace: Fix out-of-tree buildJoel Stanley
We should use TEST_GEN_PROGS, not TEST_PROGS. That tells the selftests makefile (lib.mk) that those tests are generated (built), and so it adds the $(OUTPUT) prefix for us, making the out-of-tree build work correctly. It also means we don't need our own clean rule, lib.mk does it. We also have to update the ptrace-pkey and core-pkey rules to use $(OUTPUT). Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-31selftests: powerpc: Fix warning for security subdirJoel Stanley
typing 'make' inside tools/testing/selftests/powerpc gave a build warning: BUILD_TARGET=tools/testing/selftests/powerpc/security; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C security all make[1]: Entering directory 'tools/testing/selftests/powerpc/security' ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory make[1]: *** No rule to make target '../../../../scripts/subarch.include'. make[1]: Failed to remake makefile '../../../../scripts/subarch.include'. The build is one level deeper than lib.mk thinks it is. Set top_srcdir to set things straight. Note that the test program is still built. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-30tools/bpf: add unlimited rlimit for flow_dissector_loadYonghong Song
On our test machine, bpf selftest test_flow_dissector.sh failed with the following error: # ./test_flow_dissector.sh bpffs not mounted. Mounting... libbpf: failed to create map (name: 'jmp_table'): Operation not permitted libbpf: failed to load object 'bpf_flow.o' ./flow_dissector_load: bpf_prog_load bpf_flow.o selftests: test_flow_dissector [FAILED] Let us increase the rlimit to remove the above map creation failure. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-30Merge tag 'trace-v4.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The biggest change here is the updates to kprobes Back in January I posted patches to create function based events. These were the events that you suggested I make to allow developers to easily create events in code where no trace event exists. After posting those changes for review, it was suggested that we implement this instead with kprobes. The problem with kprobes is that the interface is too complex and needs to be simplified. Masami Hiramatsu posted patches in March and I've been playing with them a bit. There's been a bit of clean up in the kprobe code that was inspired by the function based event patches, and a couple of enhancements to the kprobe event interface. - If the arch supports it (we added support for x86), you can place a kprobe event at the start of a function and use $arg1, $arg2, etc to reference the arguments of a function. (Before you needed to know what register or where on the stack the argument was). - The second is a way to see array of events. For example, if you reference a mac address, you can add: echo 'p:mac ip_rcv perm_addr=+574($arg2):x8[6]' > kprobe_events And this will produce: mac: (ip_rcv+0x0/0x140) perm_addr={0x52,0x54,0x0,0xc0,0x76,0xec} Other changes include - Exporting trace_dump_stack to modules - Have the stack tracer trace the entire stack (stop trying to remove tracing itself, as we keep removing too much). - Added support for SDT in uprobes" [ SDT - "Statically Defined Tracing" are userspace markers for tracing. Let's not use random TLA's in explanations unless they are fairly well-established as generic (at least for kernel people) - Linus ] * tag 'trace-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits) tracing: Have stack tracer trace full stack tracing: Export trace_dump_stack to modules tracing: probeevent: Fix uninitialized used of offset in parse args tracing/kprobes: Allow kprobe-events to record module symbol tracing/kprobes: Check the probe on unloaded module correctly tracing/uprobes: Fix to return -EFAULT if copy_from_user failed tracing: probeevent: Add $argN for accessing function args x86: ptrace: Add function argument access API tracing: probeevent: Add array type support tracing: probeevent: Add symbol type tracing: probeevent: Unify fetch_insn processing common part tracing: probeevent: Append traceprobe_ for exported function tracing: probeevent: Return consumed bytes of dynamic area tracing: probeevent: Unify fetch type tables tracing: probeevent: Introduce new argument fetching code tracing: probeevent: Remove NOKPROBE_SYMBOL from print functions tracing: probeevent: Cleanup argument field definition tracing: probeevent: Cleanup print argument functions trace_uprobe: support reference counter in fd-based uprobe perf probe: Support SDT markers having reference counter (semaphore) ...
2018-10-30Merge tag 'trace-v4.19-rc8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Masami had a couple more fixes to the synthetic events. One was a proper error return value, and the other is for the self tests" * tag 'trace-v4.19-rc8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: selftests/ftrace: Fix synthetic event test to delete event correctly tracing: Return -ENOENT if there is no target synthetic event
2018-10-30perf top: Allow disabling the overwrite modeArnaldo Carvalho de Melo
In ebebbf082357 ("perf top: Switch default mode to overwrite mode") we forgot to leave a way to disable that new default, add a --overwrite option that can be disabled using --no-overwrite, since the code already in such a way that we can readily disable this mode. This is useful when investigating bugs with this mode like the recent report from David Miller where lots of unknown symbols appear due to disabling the events while processing them which disables all record types, not just PERF_RECORD_SAMPLE, which makes it impossible to resolve maps when we lose PERF_RECORD_MMAP records. This can be easily seen while building a kernel, when there are lots of short lived processes. Reported-by: David Miller <davem@davemloft.net> Acked-by: Kan Liang <kan.liang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: https://lkml.kernel.org/n/tip-oqgsz2bq4kgrnnajrafcdhie@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf trace: Beautify mount's first pathname argArnaldo Carvalho de Melo
The pathname beautifiers so far support just one augmented pathname per syscall, so do it just for mount's first arg, later this will get fixed. With: # perf probe -l probe:vfs_getname (on getname_flags:73@acme/git/linux/fs/namei.c with pathname) # Later this will get added to augmented_syscalls.c (eBPF): In one xterm: # perf trace -e mount,umount 2687.331 ( 3.544 ms): mount/8892 mount(dev_name: /mnt, dir_name: 0x561f9ac184a0, type: 0x561f9ac1b170, flags: BIND) = 0 3912.126 ( 8.807 ms): umount/8895 umount2(name: /mnt) = 0 ^C# In the other: $ sudo mount --bind /proc /mnt $ sudo umount /mnt Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qsvhrm2es635cl4zicqjeth2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf trace: Beautify the umount's 'name' argumentArnaldo Carvalho de Melo
By using the SCA_FILENAME beautifier, that works when either the probe:vfs_getname probe is in place or with the eBPF program tools/perf/examples/bpf/augmented_syscalls.c: # perf probe -l probe:vfs_getname (on getname_flags:73@acme/git/linux/fs/namei.c with pathname) # perf trace -e umount 9630.332 ( 9.521 ms): umount/8082 umount2(name: /mnt) = 0 # The augmented syscalls one will be done in the next patch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hegbzlpd2nrn584l5jxn7sy2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf trace: Consider syscall aliases tooArnaldo Carvalho de Melo
When trying to trace the 'umount' syscall on x86_64 I noticed that it was failing: # trace -e umount umount /mnt event syntax error: 'umount' \___ parser error Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # This is because in the x86-64 we have it just as 'umount2': $ grep umount arch/x86/entry/syscalls/syscall_64.tbl 166 common umount2 __x64_sys_umount $ So if the syscall name fails, try fallbacking to looking at the aliases we have in the syscall_fmts table to then re-lookup, now: # trace -e umount umount -f /mnt umount: /mnt: not mounted. 1.759 ( 0.004 ms): umount/18365 umount2(name: 0x55fbfcbc4480, flags: 1) = -1 EINVAL Invalid argument # Time to beautify the flags arg :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ukweodgzbmjd25lfkgryeft1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf trace beauty: Beautify mount/umount's 'flags' argumentArnaldo Carvalho de Melo
# trace -e mount mount -o ro -t debugfs nodev /mnt 0.000 ( 1.040 ms): mount/27235 mount(dev_name: 0x5601cc8c64e0, dir_name: 0x5601cc8c6500, type: 0x5601cc8c6480, flags: RDONLY) = 0 # trace -e mount mount -o remount,relatime -t debugfs nodev /mnt 0.000 ( 2.946 ms): mount/27262 mount(dev_name: 0x55f4a73d64e0, dir_name: 0x55f4a73d6500, type: 0x55f4a73d6480, flags: REMOUNT|RELATIME) = 0 # trace -e mount mount -o remount,strictatime -t debugfs nodev /mnt 0.000 ( 2.934 ms): mount/27265 mount(dev_name: 0x5617f71d94e0, dir_name: 0x5617f71d9500, type: 0x5617f71d9480, flags: REMOUNT|STRICTATIME) = 0 # trace -e mount mount -o remount,suid,silent -t debugfs nodev /mnt 0.000 ( 0.049 ms): mount/27273 mount(dev_name: 0x55ad65df24e0, dir_name: 0x55ad65df2500, type: 0x55ad65df2480, flags: REMOUNT|SILENT) = 0 # trace -e mount mount -o remount,rw,sync,lazytime -t debugfs nodev /mnt 0.000 ( 2.684 ms): mount/27281 mount(dev_name: 0x561216055530, dir_name: 0x561216055550, type: 0x561216055510, flags: SYNCHRONOUS|REMOUNT|LAZYTIME) = 0 # trace -e mount mount -o remount,dirsync -t debugfs nodev /mnt 0.000 ( 3.512 ms): mount/27314 mount(dev_name: 0x55c4e7188480, dir_name: 0x55c4e7188530, type: 0x55c4e71884a0, flags: REMOUNT|DIRSYNC, data: 0x55c4e71884e0) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-i5ncao73c0bd02qprgrq6wb9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf trace beauty: Allow syscalls to mask an argument before considering itArnaldo Carvalho de Melo
Take mount's 'flags' arg, to cope with this semantic, as defined in do_mount in fs/namespace.c: /* * Pre-0.97 versions of mount() didn't have a flags word. When the * flags word was introduced its top half was required to have the * magic value 0xC0ED, and this remained so until 2.4.0-test9. * Therefore, if this magic number is present, it carries no * information and must be discarded. */ We need to mask this arg, and then see if it is zero, when we simply don't print the arg name and value. The next patch will use this for mount's 'flag' arg. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-btue14k5jemayuykfrwsnh85@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf beauty: Introduce strarray__scnprintf_flags()Arnaldo Carvalho de Melo
Generalizing pkey_alloc__scnprintf_access_rights(), so that we can use it with other flags-like arguments, such as mount's mountflags argument. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-o3ymi3104m8moaz9865g09w9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf beauty: Switch from GPL v2.0 to LGPL v2.1Arnaldo Carvalho de Melo
The intention is to have this as a library, since it is not perf specific at all. I did the switch for the files where I'm the only contributor, with the exception of a few lines changed by Jiri Olsa. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-a04q6chdyjknm1hr305ulx8h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30perf beauty: Add a generator for MS_ mount/umount's flag constantsArnaldo Carvalho de Melo
It'll use tools/include copy of linux/fs.h to generate a table to be used by tools, initially by the 'mount' and 'umount' beautifiers in 'perf trace', but that could also be used to translate from a string constant to the integer value to be used in a eBPF or tracefs tracepoint filter. When used without any args it produces: $ tools/perf/trace/beauty/mount_flags.sh static const char *mount_flags[] = { [1 ? (ilog2(1) + 1) : 0] = "RDONLY", [2 ? (ilog2(2) + 1) : 0] = "NOSUID", [4 ? (ilog2(4) + 1) : 0] = "NODEV", [8 ? (ilog2(8) + 1) : 0] = "NOEXEC", [16 ? (ilog2(16) + 1) : 0] = "SYNCHRONOUS", [32 ? (ilog2(32) + 1) : 0] = "REMOUNT", [64 ? (ilog2(64) + 1) : 0] = "MANDLOCK", [128 ? (ilog2(128) + 1) : 0] = "DIRSYNC", [1024 ? (ilog2(1024) + 1) : 0] = "NOATIME", [2048 ? (ilog2(2048) + 1) : 0] = "NODIRATIME", [4096 ? (ilog2(4096) + 1) : 0] = "BIND", [8192 ? (ilog2(8192) + 1) : 0] = "MOVE", [16384 ? (ilog2(16384) + 1) : 0] = "REC", [32768 ? (ilog2(32768) + 1) : 0] = "SILENT", [16 + 1] = "POSIXACL", [17 + 1] = "UNBINDABLE", [18 + 1] = "PRIVATE", [19 + 1] = "SLAVE", [20 + 1] = "SHARED", [21 + 1] = "RELATIME", [22 + 1] = "KERNMOUNT", [23 + 1] = "I_VERSION", [24 + 1] = "STRICTATIME", [25 + 1] = "LAZYTIME", [26 + 1] = "SUBMOUNT", [27 + 1] = "NOREMOTELOCK", [28 + 1] = "NOSEC", [29 + 1] = "BORN", [30 + 1] = "ACTIVE", [31 + 1] = "NOUSER", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mgutbbkmip9gfnmd28ikg7xt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30tools include uapi: Grab a copy of linux/fs.hArnaldo Carvalho de Melo
We'll use it to create tables for the 'flags' argument to the 'mount' and 'umount' syscalls. Add it to check_headers.sh so that when a new protocol gets added we get a notification during the build process. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Peterson <benjamin@python.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-yacf9jvkwfwg2g95r2us3xb3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-30selftests/powerpc: Relax L1d miss targets for rfi_flush testNaveen N. Rao
When running the rfi_flush test, if the system is loaded, we see two issues: 1. The L1d misses when rfi_flush is disabled increase significantly due to other workloads interfering with the cache. 2. The L1d misses when rfi_flush is enabled sometimes goes slightly below the expected number of misses. To address these, let's relax the expected number of L1d misses: 1. When rfi_flush is disabled, we allow upto half the expected number of the misses for when rfi_flush is enabled. 2. When rfi_flush is enabled, we allow ~1% lower number of cache misses. Reported-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-29Merge branch 'linus' into perf/urgent, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-29Merge branches 'x86/early-printk', 'x86/microcode' and 'core/objtool' into ↵Ingo Molnar
x86/urgent, to pick up simple topic branches Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) GRO overflow entries are not unlinked properly, resulting in list poison pointers being dereferenced. 2) Fix bridge build with ipv6 disabled, from Nikolay Aleksandrov. 3) Direct packet access and other fixes in BPF from Daniel Borkmann. 4) gred_change_table_def() gets passed the wrong pointer, a pointer to a set of unparsed attributes instead of the attribute itself. From Jakub Kicinski. 5) Allow macsec device to be brought up even if it's lowerdev is down, from Sabrina Dubroca. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: diag: document swapped src/dst in udp_dump_one. macsec: let the administrator set UP state even if lowerdev is down macsec: update operstate when lower device changes net: sched: gred: pass the right attribute to gred_change_table_def() ptp: drop redundant kasprintf() to create worker name net: bridge: remove ipv6 zero address check in mcast queries net: Properly unlink GRO packets on overflow. bpf: fix wrong helper enablement in cgroup local storage bpf: add bpf_jit_limit knob to restrict unpriv allocations bpf: make direct packet write unclone more robust bpf: fix leaking uninitialized memory on pop/peek helpers bpf: fix direct packet write into pop/peek helpers bpf: fix cg_skb types to hint access type in may_access_direct_pkt_data bpf: fix direct packet access for flow dissector progs bpf: disallow direct packet access for unpriv in cg_skb bpf: fix test suite to enable all unpriv program types bpf, btf: fix a missing check bug in btf_parse selftests/bpf: add config fragments BPF_STREAM_PARSER and XDP_SOCKETS bpf: devmap: fix wrong interface selection in notifier_call
2018-10-28Merge tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "This is going to rebuild more than drm as it adds a new helper to list.h for doing bulk updates. Seemed like a reasonable addition to me. Otherwise the usual merge window stuff lots of i915 and amdgpu, not so much nouveau, and piles of everything else. Core: - Adds a new list.h helper for doing bulk list updates for TTM. - Don't leak fb address in smem_start to userspace (comes with EXPORT workaround for people using mali out of tree hacks) - udmabuf device to turn memfd regions into dma-buf - Per-plane blend mode property - ref/unref replacements with get/put - fbdev conflicting framebuffers code cleaned up - host-endian format variants - panel orientation quirk for Acer One 10 bridge: - TI SN65DSI86 chip support vkms: - GEM support. - Cursor support amdgpu: - Merge amdkfd and amdgpu into one module - CEC over DP AUX support - Picasso APU support + VCN dynamic powergating - Raven2 APU support - Vega20 enablement + kfd support - ACP powergating improvements - ABGR/XBGR display support - VCN jpeg support - xGMI support - DC i2c/aux cleanup - Ycbcr 4:2:0 support - GPUVM improvements - Powerplay and powerplay endian fixes - Display underflow fixes vmwgfx: - Move vmwgfx specific TTM code to vmwgfx - Split out vmwgfx buffer/resource validation code - Atomic operation rework bochs: - use more helpers - format/byteorder improvements qxl: - use more helpers i915: - GGTT coherency getparam - Turn off resource streamer API - More Icelake enablement + DMC firmware - Full PPGTT for Ivybridge, Haswell and Valleyview - DDB distribution based on resolution - Limited range DP display support nouveau: - CEC over DP AUX support - Initial HDMI 2.0 support virtio-gpu: - vmap support for PRIME objects tegra: - Initial Tegra194 support - DMA/IOMMU integration fixes msm: - a6xx perf improvements + clock prefix - GPU preemption optimisations - a6xx devfreq support - cursor support rockchip: - PX30 support - rgb output interface support mediatek: - HDMI output support on mt2701 and mt7623 rcar-du: - Interlaced modes on Gen3 - LVDS on R8A77980 - D3 and E3 SoC support hisilicon: - misc fixes mxsfb: - runtime pm support sun4i: - R40 TCON support - Allwinner A64 support - R40 HDMI support omapdrm: - Driver rework changing display pipeline ordering to use common code - DMM memory barrier and irq fixes - Errata workarounds exynos: - out-bridge support for LVDS bridge driver - Samsung 16x16 tiled format support - Plane alpha and pixel blend mode support tilcdc: - suspend/resume update mali-dp: - misc updates" * tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drm: (1382 commits) firmware/dmc/icl: Add missing MODULE_FIRMWARE() for Icelake. drm/i915/icl: Fix signal_levels drm/i915/icl: Fix DDI/TC port clk_off bits drm/i915/icl: create function to identify combophy port drm/i915/gen9+: Fix initial readout for Y tiled framebuffers drm/i915: Large page offsets for pread/pwrite drm/i915/selftests: Disable shrinker across mmap-exhaustion drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode drm/i915: Fix intel_dp_mst_best_encoder() drm/i915: Skip vcpi allocation for MSTB ports that are gone drm/i915: Don't unset intel_connector->mst_port drm/i915: Only reset seqno if actually idle drm/i915: Use the correct crtc when sanitizing plane mapping drm/i915: Restore vblank interrupts earlier drm/i915: Check fb stride against plane max stride drm/amdgpu/vcn:Fix uninitialized symbol error drm: panel-orientation-quirks: Add quirk for Acer One 10 (S1003) drm/amd/amdgpu: Fix debugfs error handling drm/amdgpu: Update gc_9_0 golden settings. drm/amd/powerplay: update PPtable with DC BTC and Tvr SocLimit fields ...
2018-10-28Merge tag 'linux-kselftest-4.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: "This Kselftest update for Linux 4.20-rc1 consists of: - Improvements to ftrace test suite from Masami Hiramatsu. - Color coded ftrace PASS / FAIL results from Steven Rostedt (VMware) to improve readability of reports. - watchdog Fixes and enhancement to add gettimeout and get|set pretimeout options from Jerry Hoemann. - Several fixes to warnings and spelling etc" * tag 'linux-kselftest-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (40 commits) selftests/ftrace: Strip escape sequences for log file selftests/ftrace: Use colored output when available selftests: fix warning: "_GNU_SOURCE" redefined selftests: kvm: Fix -Wformat warnings selftests/ftrace: Add color to the PASS / FAIL results kvm: selftests: fix spelling mistake "Insufficent" -> "Insufficient" selftests: gpio: Fix OUTPUT directory in Makefile selftests: gpio: restructure Makefile selftests: watchdog: Fix ioctl SET* error paths to take oneshot exit path selftests: watchdog: Add gettimeout and get|set pretimeout selftests: watchdog: Fix error message. selftests: watchdog: fix message when /dev/watchdog open fails selftests/ftrace: Add ftrace cpumask testcase selftests/ftrace: Add wakeup_rt tracer testcase selftests/ftrace: Add wakeup tracer testcase selftests/ftrace: Add stacktrace ftrace filter command testcase selftests/ftrace: Add trace_pipe testcase selftests/ftrace: Add function filter on module testcase selftests/ftrace: Add max stack tracer testcase selftests/ftrace: Add function profiling stat testcase ...
2018-10-28selftests/ftrace: Fix synthetic event test to delete event correctlyMasami Hiramatsu
Fix the synthetic event test case to remove event correctly. If redirecting command to synthetic_event file without append mode, it cleans up all existing events and execute (parse) the command. This means "delete event" always fails to find the target event. Since previous synthetic event has a bug which doesn't return -ENOENT even if it fails to find the deleting event, this test passed. But fixing that bug, this test fails because this test itself has a bug. This fixes that bug by trying to delete event right after adding an event, and use append mode redirection ('>>') instead of normal redirection ('>'). Link: http://lkml.kernel.org/r/154013452832.25576.2305459545429386517.stgit@devbox Acked-by: Shuah Khan <shuah@kernel.org> Acked-by: Tom Zanussi <zanussi@linux.intel.com> Tested-by: Tom Zanussi <zanussi@linux.intel.com> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Rajvi Jingar <rajvi.jingar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: stable@vger.kernel.org Fixes: f06eec4d0f2c ('selftests: ftrace: Add inter-event hist triggers testcases') Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-28Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray conversion from Matthew Wilcox: "The XArray provides an improved interface to the radix tree data structure, providing locking as part of the API, specifying GFP flags at allocation time, eliminating preloading, less re-walking the tree, more efficient iterations and not exposing RCU-protected pointers to its users. This patch set 1. Introduces the XArray implementation 2. Converts the pagecache to use it 3. Converts memremap to use it The page cache is the most complex and important user of the radix tree, so converting it was most important. Converting the memremap code removes the only other user of the multiorder code, which allows us to remove the radix tree code that supported it. I have 40+ followup patches to convert many other users of the radix tree over to the XArray, but I'd like to get this part in first. The other conversions haven't been in linux-next and aren't suitable for applying yet, but you can see them in the xarray-conv branch if you're interested" * 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits) radix tree: Remove multiorder support radix tree test: Convert multiorder tests to XArray radix tree tests: Convert item_delete_rcu to XArray radix tree tests: Convert item_kill_tree to XArray radix tree tests: Move item_insert_order radix tree test suite: Remove multiorder benchmarking radix tree test suite: Remove __item_insert memremap: Convert to XArray xarray: Add range store functionality xarray: Move multiorder_check to in-kernel tests xarray: Move multiorder_shrink to kernel tests xarray: Move multiorder account test in-kernel radix tree test suite: Convert iteration test to XArray radix tree test suite: Convert tag_tagged_items to XArray radix tree: Remove radix_tree_clear_tags radix tree: Remove radix_tree_maybe_preload_order radix tree: Remove split/join code radix tree: Remove radix_tree_update_node_t page cache: Finish XArray conversion dax: Convert page fault handlers to XArray ...
2018-10-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-10-27 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix toctou race in BTF header validation, from Martin and Wenwen. 2) Fix devmap interface comparison in notifier call which was neglecting netns, from Taehee. 3) Several fixes in various places, for example, correcting direct packet access and helper function availability, from Daniel. 4) Fix BPF kselftest config fragment to include af_xdp and sockmap, from Naresh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: - a few misc things - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits) hugetlbfs: dirty pages as they are added to pagecache mm: export add_swap_extent() mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS tools/testing/selftests/vm/map_fixed_noreplace.c: add test for MAP_FIXED_NOREPLACE mm: thp: relocate flush_cache_range() in migrate_misplaced_transhuge_page() mm: thp: fix mmu_notifier in migrate_misplaced_transhuge_page() mm: thp: fix MADV_DONTNEED vs migrate_misplaced_transhuge_page race condition mm/kasan/quarantine.c: make quarantine_lock a raw_spinlock_t mm/gup: cache dev_pagemap while pinning pages Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved" mm: return zero_resv_unavail optimization mm: zero remaining unavailable struct pages tools/testing/selftests/vm/gup_benchmark.c: add MAP_HUGETLB option tools/testing/selftests/vm/gup_benchmark.c: add MAP_SHARED option tools/testing/selftests/vm/gup_benchmark.c: allow user specified file tools/testing/selftests/vm/gup_benchmark.c: fix 'write' flag usage mm/gup_benchmark.c: add additional pinning methods mm/gup_benchmark.c: time put_page() mm: don't raise MEMCG_OOM event due to failed high-order allocation mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock ...
2018-10-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "What better way to start off a weekend than with some networking bug fixes: 1) net namespace leak in dump filtering code of ipv4 and ipv6, fixed by David Ahern and Bjørn Mork. 2) Handle bad checksums from hardware when using CHECKSUM_COMPLETE properly in UDP, from Sean Tranchetti. 3) Remove TCA_OPTIONS from policy validation, it turns out we don't consistently use nested attributes for this across all packet schedulers. From David Ahern. 4) Fix SKB corruption in cadence driver, from Tristram Ha. 5) Fix broken WoL handling in r8169 driver, from Heiner Kallweit. 6) Fix OOPS in pneigh_dump_table(), from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits) net/neigh: fix NULL deref in pneigh_dump_table() net: allow traceroute with a specified interface in a vrf bridge: do not add port to router list when receives query with source 0.0.0.0 net/smc: fix smc_buf_unuse to use the lgr pointer ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called net/{ipv4,ipv6}: Do not put target net if input nsid is invalid lan743x: Remove SPI dependency from Microchip group. drivers: net: remove <net/busy_poll.h> inclusion when not needed net: phy: genphy_10g_driver: Avoid NULL pointer dereference r8169: fix broken Wake-on-LAN from S5 (poweroff) octeontx2-af: Use GFP_ATOMIC under spin lock net: ethernet: cadence: fix socket buffer corruption problem net/ipv6: Allow onlink routes to have a device mismatch if it is the default route net: sched: Remove TCA_OPTIONS from policy ice: Poll for link status change ice: Allocate VF interrupts and set queue map ice: Introduce ice_dev_onetime_setup net: hns3: Fix for warning uninitialized symbol hw_err_lst3 octeontx2-af: Copy the right amount of memory net: udp: fix handling of CHECKSUM_COMPLETE packets ...
2018-10-26tools/testing/selftests/vm/map_fixed_noreplace.c: add test for ↵Michael Ellerman
MAP_FIXED_NOREPLACE Add a test for MAP_FIXED_NOREPLACE, based on some code originally by Jann Horn. This would have caught the overlap bug reported by Daniel Micay. I originally suggested to Michal that we create MAP_FIXED_NOREPLACE, but instead of writing a selftest I spent my time bike-shedding whether it should be called MAP_FIXED_SAFE/NOCLOBBER/WEAK/NEW .. mea culpa. Link: http://lkml.kernel.org/r/20181013133929.28653-1-mpe@ellerman.id.au Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jann Horn <jannh@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Joel Stanley <joel@jms.id.au> Cc: Jason Evans <jasone@google.com> Cc: David Goldblatt <davidtgoldblatt@gmail.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26tools/testing/selftests/vm/gup_benchmark.c: add MAP_HUGETLB optionKeith Busch
Add a new option '-H' to the gup benchmark to help understand how hugetlb mapping pages compare with the default. Link: http://lkml.kernel.org/r/20181010195605.10689-6-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26tools/testing/selftests/vm/gup_benchmark.c: add MAP_SHARED optionKeith Busch
Add a new benchmark option, -S, to request MAP_SHARED. This can be used to compare with MAP_PRIVATE, or for files that require this option, like dax. Link: http://lkml.kernel.org/r/20181010195605.10689-5-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26tools/testing/selftests/vm/gup_benchmark.c: allow user specified fileKeith Busch
Allow a user to specify a file to map by adding a new option, '-f', providing a means to test various file backings. If not specified, the benchmark will use a private mapping of /dev/zero, which produces an anonymous mapping as before. [akpm@linux-foundation.org: avoid using comma operator] Link: http://lkml.kernel.org/r/20181010195605.10689-4-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26tools/testing/selftests/vm/gup_benchmark.c: fix 'write' flag usageKeith Busch
If the '-w' parameter was provided, the benchmark would exit due to a mssing 'break'. Link: http://lkml.kernel.org/r/20181010195605.10689-3-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26mm/gup_benchmark.c: add additional pinning methodsKeith Busch
Provide new gup benchmark ioctl commands to run different user page pinning methods, get_user_pages_longterm() and get_user_pages(), in addition to the existing get_user_pages_fast(). Link: http://lkml.kernel.org/r/20181010195605.10689-2-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26mm/gup_benchmark.c: time put_page()Keith Busch
We'd like to measure time to unpin user pages, so this adds a second benchmark timer on put_page, separate from get_page. Adding the field breaks this ioctl ABI, but should be okay since this an in-tree kernel selftest. [akpm@linux-foundation.org: add expansion to struct gup_benchmark for future use] Link: http://lkml.kernel.org/r/20181010195605.10689-1-keith.busch@intel.com Signed-off-by: Keith Busch <keith.busch@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26userfaultfd: selftest: recycle lock threads firstPeter Xu
Now we recycle the uffd servicing threads earlier than the lock threads. It might happen that when the lock thread is still blocked at a pthread mutex lock while the servicing thread has already quitted for the cpu so the lock thread will be blocked forever and hang the test program. To fix the possible race, recycle the lock threads first. This never happens with current missing-only tests, but when I start to run the write-protection tests (the feature is not yet posted upstream) it happens every time of the run possibly because in that new test we'll need to service two page faults for each lock operation. Link: http://lkml.kernel.org/r/20180930074259.18229-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@fb.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26userfaultfd: selftest: generalize read and pollPeter Xu
We do very similar things in read and poll modes, but we're copying the codes around. Share the codes properly on reading the message and handling the page fault to make the code cleaner. Meanwhile this solves previous mismatch of behaviors between the two modes on that the old code: - did not check EAGAIN case in read() mode - ignored BOUNCE_VERIFY check in read() mode Link: http://lkml.kernel.org/r/20180930074259.18229-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@fb.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26userfaultfd: selftest: cleanup help messagesPeter Xu
Firstly, the help in the comment region is obsolete, now we support three parameters. Since at it, change it and move it into the help message of the program. Also, the help messages dumped here and there is obsolete too. Use a single usage() helper. Link: http://lkml.kernel.org/r/20180930074259.18229-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@fb.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26delayacct: track delays from thrashing cache pagesJohannes Weiner
Delay accounting already measures the time a task spends in direct reclaim and waiting for swapin, but in low memory situations tasks spend can spend a significant amount of their time waiting on thrashing page cache. This isn't tracked right now. To know the full impact of memory contention on an individual task, measure the delay when waiting for a recently evicted active cache page to read back into memory. Also update tools/accounting/getdelays.c: [hannes@computer accounting]$ sudo ./getdelays -d -p 1 print delayacct stats ON PID 1 CPU count real total virtual total delay total delay average 50318 745000000 847346785 400533713 0.008ms IO count delay total delay average 435 122601218 0ms SWAP count delay total delay average 0 0 0ms RECLAIM count delay total delay average 0 0 0ms THRASHING count delay total delay average 19 12621439 0ms Link: http://lkml.kernel.org/r/20180828172258.3185-4-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26Merge tag 'powerpc-4.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - A large series to rewrite our SLB miss handling, replacing a lot of fairly complicated asm with much fewer lines of C. - Following on from that, we now maintain a cache of SLB entries for each process and preload them on context switch. Leading to a 27% speedup for our context switch benchmark on Power9. - Improvements to our handling of SLB multi-hit errors. We now print more debug information when they occur, and try to continue running by flushing the SLB and reloading, rather than treating them as fatal. - Enable THP migration on 64-bit Book3S machines (eg. Power7/8/9). - Add support for physical memory up to 2PB in the linear mapping on 64-bit Book3S. We only support up to 512TB as regular system memory, otherwise the percpu allocator runs out of vmalloc space. - Add stack protector support for 32 and 64-bit, with a per-task canary. - Add support for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP. - Support recognising "big cores" on Power9, where two SMT4 cores are presented to us as a single SMT8 core. - A large series to cleanup some of our ioremap handling and PTE flags. - Add a driver for the PAPR SCM (storage class memory) interface, allowing guests to operate on SCM devices (acked by Dan). - Changes to our ftrace code to handle very large kernels, where we need to use a trampoline to get to ftrace_caller(). And many other smaller enhancements and cleanups. Thanks to: Alan Modra, Alistair Popple, Aneesh Kumar K.V, Anton Blanchard, Aravinda Prasad, Bartlomiej Zolnierkiewicz, Benjamin Herrenschmidt, Breno Leitao, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Dan Carpenter, Daniel Axtens, Finn Thain, Gautham R. Shenoy, Gustavo Romero, Haren Myneni, Hari Bathini, Jia Hongtao, Joel Stanley, John Allen, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Hairgrove, Masahiro Yamada, Michael Bringmann, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Petr Vorel, Rashmica Gupta, Reza Arbab, Rob Herring, Sam Bobroff, Samuel Mendoza-Jonas, Scott Wood, Stan Johnson, Stephen Rothwell, Stewart Smith, Suraj Jitindar Singh, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, YueHaibing, zhong jiang" * tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (221 commits) Revert "selftests/powerpc: Fix out-of-tree build errors" powerpc/msi: Fix compile error on mpc83xx powerpc: Fix stack protector crashes on CPU hotplug powerpc/traps: restore recoverability of machine_check interrupts powerpc/64/module: REL32 relocation range check powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd selftests/powerpc: Add a test of wild bctr powerpc/mm: Fix page table dump to work on Radix powerpc/mm/radix: Display if mappings are exec or not powerpc/mm/radix: Simplify split mapping logic powerpc/mm/radix: Remove the retry in the split mapping logic powerpc/mm/radix: Fix small page at boundary when splitting powerpc/mm/radix: Fix overuse of small pages in splitting logic powerpc/mm/radix: Fix off-by-one in split mapping logic powerpc/ftrace: Handle large kernel configs powerpc/mm: Fix WARN_ON with THP NUMA migration selftests/powerpc: Fix out-of-tree build errors powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selected powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64 powerpc/time: isolate scaled cputime accounting in dedicated functions. ...
2018-10-26Merge tag 'usb-4.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY updates from Greg KH: "Here is the big USB/PHY driver patches for 4.20-rc1 Lots of USB changes in here, primarily in these areas: - typec updates and new drivers - new PHY drivers - dwc2 driver updates and additions (this old core keeps getting added to new devices.) - usbtmc major update based on the industry group coming together and working to add new features and performance to the driver. - USB gadget additions for new features - USB gadget configfs updates - chipidea driver updates - other USB gadget updates - USB serial driver updates - renesas driver updates - xhci driver updates - other tiny USB driver updates All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (229 commits) usb: phy: ab8500: silence some uninitialized variable warnings usb: xhci: tegra: Add genpd support usb: xhci: tegra: Power-off power-domains on removal usbip:vudc: BUG kmalloc-2048 (Not tainted): Poison overwritten usbip: tools: fix atoi() on non-null terminated string USB: misc: appledisplay: fix backlight update_status return code phy: phy-pxa-usb: add a new driver usb: host: add DT bindings for faraday fotg2 usb: host: ohci-at91: fix request of irq for optional gpio usb/early: remove set but not used variable 'remain_length' usb: typec: Fix copy/paste on typec_set_vconn_role() kerneldoc usb: typec: tcpm: Report back negotiated PPS voltage and current USB: core: remove set but not used variable 'udev' usb: core: fix memory leak on port_dev_path allocation USB: net2280: Remove ->disconnect() callback from net2280_pullup() usb: dwc2: disable power_down on rockchip devices usb: gadget: udc: renesas_usb3: add support for r8a77990 dt-bindings: usb: renesas_usb3: add bindings for r8a77990 usb: gadget: udc: renesas_usb3: Add r8a774a1 support USB: serial: cypress_m8: remove set but not used variable 'iflag' ...
2018-10-26Revert "selftests/powerpc: Fix out-of-tree build errors"Michael Ellerman
This reverts commit d8a2fe29d3c97038c8efcc328d5e7940c5310565. That commit, by me, fixed the out of tree build errors by causing some of the tests not to build at all.
2018-10-26selftests/powerpc: Fix ptrace tm failureBreno Leitao
Test ptrace-tm-spd-gpr fails on current kernel (4.19) due to a segmentation fault that happens on the child process prior to setting cptr[2] = 1. This causes the parent process to wait forever at 'while (!pptr[2])' and the test to be killed by the test harness framework by timeout, thus, failing. The segmentation fault happens because of a inline assembly being generated as: 0x10000355c <tm_spd_gpr+492> lfs f0, 0(0) This is reading memory position 0x0 and causing the segmentation fault. This code is being generated by ASM_LOAD_FPR_SINGLE_PRECISION(flt_4), where flt_4 is passed to the inline assembly block as: [flt_4] "r" (&d) Since the inline assembly 'r' constraint means any GPR, gpr0 is being chosen, thus causing this issue when issuing a Load Floating-Point Single instruction. This patch simply changes the constraint to 'b', which specify that this register will be used as base, and r0 is not allowed to be used, avoiding this issue. Other than that, removing flt_2 register from the input operands, since it is not used by the inline assembly code at all. Cc: stable@vger.kernel.org Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26Merge tag 'perf-core-for-mingo-4.20-20181025' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Introduce 'perf trace --max-events' for stopping 'perf trace' when that many syscalls (enter+exit), tracepoints or other events such as page faults take place. Support that as well on a per-event basis, e.g.: perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/ Will stop when 2 context switches, 4 block plugs, 1 block unplug and 3 net_dev_queue tracepoints take place. (Arnaldo Carvalho de Melo) - Poll for monitored tasks being alive in 'perf stat -p/-t', exiting when those tasks all terminate (Jiri Olsa) - Encode -k clockid frequency into perf.data to enable timestamps derived metrics conversion into wall clock time on reporting stage. (Alexey Budankov) - Improve Intel PT call graph from SQL database and GUI python scripts, including adopting the Qt MDI interface to allow for multiple subwindows for all the tables, helping in better visualizing the data in the SQL tables, also uses, when available, the Intel XED disassembler libraries to present the Intel PT data as x86 asm mnemonics. This last feature is not currently working in some cases, fix is being discussed (Adrian Hunter) - Implement a ftrace function_graph view in 'perf script' when processing hardware trace data such as Intel PT (Andi Kleen) - Better integration with the Intel XED disassembler, when available, in 'perf script' (Andi Kleen) - Some 'perf trace' drop refcount fixes (Arnaldo Carvalho de Melo) - Add Sparc support to 'perf annotate', jitdump (David Miller) - Fix PLT symbols entry/header sizes properly on Sparc (David Miller) - Fix generation of system call table failure with /tmp mounted with 'noexec' in arm64 (Hongxu Jia) - Allow extended console debug output in 'perf script' (Milian Wolff) - Flush output stream after events in 'perf script' verbose mode (Milian Wolff) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-25Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups" * tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned Revert "kvm: x86: optimize dr6 restore" KVM: PPC: Optimize clearing TCEs for sparse tables x86/kvm/nVMX: tweak shadow fields selftests/kvm: add missing executables to .gitignore KVM: arm64: Safety check PSTATE when entering guest and handle IL KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips arm/arm64: KVM: Enable 32 bits kvm vcpu events support arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() KVM: arm64: Fix caching of host MDCR_EL2 value KVM: VMX: enable nested virtualization by default KVM/x86: Use 32bit xor to clear registers in svm.c kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD kvm: vmx: Defer setting of DR6 until #DB delivery kvm: x86: Defer setting of CR2 until #PF delivery kvm: x86: Add payload operands to kvm_multiple_exception kvm: x86: Add exception payload fields to kvm_vcpu_events kvm: x86: Add has_payload and payload to kvm_queued_exception KVM: Documentation: Fix omission in struct kvm_vcpu_events KVM: selftests: add Enlightened VMCS test ...
2018-10-25bpf: disallow direct packet access for unpriv in cg_skbDaniel Borkmann
Commit b39b5f411dcf ("bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB") added support for returning pkt pointers for direct packet access. Given this program type is allowed for both unprivileged and privileged users, we shouldn't allow unprivileged ones to use it, e.g. besides others one reason would be to avoid any potential speculation on the packet test itself, thus guard this for root only. Fixes: b39b5f411dcf ("bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>