git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-09-03	tools/bpftool: Replace bpf_program__title() with bpf_program__section_name()	Andrii Nakryiko
	bpf_program__title() is deprecated, switch to bpf_program__section_name() and avoid compilation warnings. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-9-andriin@fb.com
2020-09-03	selftests/bpf: Add selftest for multi-prog sections and bpf-to-bpf calls	Andrii Nakryiko
	Add a selftest excercising bpf-to-bpf subprogram calls, as well as multiple entry-point BPF programs per section. Also make sure that BPF CO-RE works for such set ups both for sub-programs and for multi-entry sections. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-8-andriin@fb.com
2020-09-03	libbpf: Add multi-prog section support for struct_ops	Andrii Nakryiko
	Adjust struct_ops handling code to work with multi-program ELF sections properly. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-7-andriin@fb.com
2020-09-03	libbpf: Implement generalized .BTF.ext func/line info adjustment	Andrii Nakryiko
	Complete multi-prog sections and multi sub-prog support in libbpf by properly adjusting .BTF.ext's line and function information. Mark exposed btf_ext__reloc_func_info() and btf_ext__reloc_func_info() APIs as deprecated. These APIs have simplistic assumption that all sub-programs are going to be appended to all main BPF programs, which doesn't hold in real life. It's unlikely there are any users of this API, as it's very libbpf internals-specific. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-6-andriin@fb.com
2020-09-03	libbpf: Make RELO_CALL work for multi-prog sections and sub-program calls	Andrii Nakryiko
	This patch implements general and correct logic for bpf-to-bpf sub-program calls. Only sub-programs used (called into) from entry-point (main) BPF program are going to be appended at the end of main BPF program. This ensures that BPF verifier won't encounter any dead code due to copying unreferenced sub-program. This change means that each entry-point (main) BPF program might have a different set of sub-programs appended to it and potentially in different order. This has implications on how sub-program call relocations need to be handled, described below. All relocations are now split into two categores: data references (maps and global variables) and code references (sub-program calls). This distinction is important because data references need to be relocated just once per each BPF program and sub-program. These relocation are agnostic to instruction locations, because they are not code-relative and they are relocating against static targets (maps, variables with fixes offsets, etc). Sub-program RELO_CALL relocations, on the other hand, are highly-dependent on code position, because they are recorded as instruction-relative offset. So BPF sub-programs (those that do calls into other sub-programs) can't be relocated once, they need to be relocated each time such a sub-program is appended at the end of the main entry-point BPF program. As mentioned above, each main BPF program might have different subset and differen order of sub-programs, so call relocations can't be done just once. Splitting data reference and calls relocations as described above allows to do this efficiently and cleanly. bpf_object__find_program_by_name() will now ignore non-entry BPF programs. Previously one could have looked up '.text' fake BPF program, but the existence of such BPF program was always an implementation detail and you can't do much useful with it. Now, though, all non-entry sub-programs get their own BPF program with name corresponding to a function name, so there is no more '.text' name for BPF program. This means there is no regression, effectively, w.r.t. API behavior. But this is important aspect to highlight, because it's going to be critical once libbpf implements static linking of BPF programs. Non-entry static BPF programs will be allowed to have conflicting names, but global and main-entry BPF program names should be unique. Just like with normal user-space linking process. So it's important to restrict this aspect right now, keep static and non-entry functions as internal implementation details, and not have to deal with regressions in behavior later. This patch leaves .BTF.ext adjustment as is until next patch. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-5-andriin@fb.com
2020-09-03	libbpf: Support CO-RE relocations for multi-prog sections	Andrii Nakryiko
	Fix up CO-RE relocation code to handle relocations against ELF sections containing multiple BPF programs. This requires lookup of a BPF program by its section name and instruction index it contains. While it could have been done as a simple loop, it could run into performance issues pretty quickly, as number of CO-RE relocations can be quite large in real-world applications, and each CO-RE relocation incurs BPF program look up now. So instead of simple loop, implement a binary search by section name + insn offset. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-4-andriin@fb.com
2020-09-03	libbpf: Parse multi-function sections into multiple BPF programs	Andrii Nakryiko
	Teach libbpf how to parse code sections into potentially multiple bpf_program instances, based on ELF FUNC symbols. Each BPF program will keep track of its position within containing ELF section for translating section instruction offsets into program instruction offsets: regardless of BPF program's location in ELF section, it's first instruction is always at local instruction offset 0, so when libbpf is working with relocations (which use section-based instruction offsets) this is critical to make proper translations. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-3-andriin@fb.com
2020-09-03	libbpf: Ensure ELF symbols table is found before further ELF processing	Andrii Nakryiko
	libbpf ELF parsing logic might need symbols available before ELF parsing is completed, so we need to make sure that symbols table section is found in a separate pass before all the subsequent sections are processed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-2-andriin@fb.com
2020-09-03	selftests/net: improve descriptions for XFAIL cases in psock_snd.sh	Po-Hsu Lin
	Before changing this it's a bit confusing to read test output: raw csum_off with bad offset (fails) ./psock_snd: write: Invalid argument Change "fails" in the test case description to "expected to fail", so that the test output can be more understandable. Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03	perf tools: Add bpf image check to __map__is_kmodule	Jiri Olsa
	When validating kcore modules the do_validate_kcore_modules function checks on every kernel module dso against modules record. The __map__is_kmodule check is used to get only kernel module dso objects through. Currently the bpf images are slipping through the check and making the validation to fail, so report falls back from kcore usage to kallsyms. Adding __map__is_bpf_image check for bpf image and adding it to __map__is_kmodule check. Fixes: 3c29d4483e85 ("perf annotate: Add basic support for bpf_image") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200826213017.818788-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	perf record/stat: Explicitly call out event modifiers in the documentation	Kim Phillips
	Event modifiers are not mentioned in the perf record or perf stat manpages. Add them to orient new users more effectively by pointing them to the perf list manpage for details. Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tony Jones <tonyj@suse.de> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	perf bench: The do_run_multi_threaded() function must use ↵	YueHaibing
	IS_ERR(perf_session__new()) In case of error, the function perf_session__new() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR() Committer notes: This wasn't compiling due to an extraneous '{' not matched by a '}', fix it. Fixes: 13edc237200c ("perf bench: Add a multi-threaded synthesize benchmark") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200902140526.26916-1-yuehaibing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	perf stat: Turn off summary for interval mode by default	Jin Yao
	There's a risk that outputting interval mode summaries by default breaks CSV consumers. It already broke pmu-tools/toplev. So now we turn off the summary by default but we create a new option '--summary' to enable the summary. This is active even when not using CSV mode. Before: root@kbl-ppc:~# perf stat -I1000 --interval-count 2 # time counts unit events 1.000265904 8,005.73 msec cpu-clock # 8.006 CPUs utilized 1.000265904 601 context-switches # 0.075 K/sec 1.000265904 10 cpu-migrations # 0.001 K/sec 1.000265904 0 page-faults # 0.000 K/sec 1.000265904 66,746,521 cycles # 0.008 GHz 1.000265904 71,874,398 instructions # 1.08 insn per cycle 1.000265904 13,356,781 branches # 1.668 M/sec 1.000265904 298,756 branch-misses # 2.24% of all branches 2.001857667 8,012.52 msec cpu-clock # 8.013 CPUs utilized 2.001857667 164 context-switches # 0.020 K/sec 2.001857667 10 cpu-migrations # 0.001 K/sec 2.001857667 2 page-faults # 0.000 K/sec 2.001857667 5,822,188 cycles # 0.001 GHz 2.001857667 2,186,170 instructions # 0.38 insn per cycle 2.001857667 442,378 branches # 0.055 M/sec 2.001857667 44,750 branch-misses # 10.12% of all branches Performance counter stats for 'system wide': 16,018.25 msec cpu-clock # 7.993 CPUs utilized 765 context-switches # 0.048 K/sec 20 cpu-migrations # 0.001 K/sec 2 page-faults # 0.000 K/sec 72,568,709 cycles # 0.005 GHz 74,060,568 instructions # 1.02 insn per cycle 13,799,159 branches # 0.861 M/sec 343,506 branch-misses # 2.49% of all branches 2.004118489 seconds time elapsed After: root@kbl-ppc:~# perf stat -I1000 --interval-count 2 # time counts unit events 1.001336393 8,013.28 msec cpu-clock # 8.013 CPUs utilized 1.001336393 82 context-switches # 0.010 K/sec 1.001336393 8 cpu-migrations # 0.001 K/sec 1.001336393 0 page-faults # 0.000 K/sec 1.001336393 4,199,121 cycles # 0.001 GHz 1.001336393 1,373,991 instructions # 0.33 insn per cycle 1.001336393 270,681 branches # 0.034 M/sec 1.001336393 31,659 branch-misses # 11.70% of all branches 2.003905006 8,020.52 msec cpu-clock # 8.021 CPUs utilized 2.003905006 184 context-switches # 0.023 K/sec 2.003905006 8 cpu-migrations # 0.001 K/sec 2.003905006 2 page-faults # 0.000 K/sec 2.003905006 5,446,190 cycles # 0.001 GHz 2.003905006 2,312,547 instructions # 0.42 insn per cycle 2.003905006 451,691 branches # 0.056 M/sec 2.003905006 37,925 branch-misses # 8.40% of all branches root@kbl-ppc:~# perf stat -I1000 --interval-count 2 --summary # time counts unit events 1.001313128 8,013.20 msec cpu-clock # 8.013 CPUs utilized 1.001313128 83 context-switches # 0.010 K/sec 1.001313128 8 cpu-migrations # 0.001 K/sec 1.001313128 0 page-faults # 0.000 K/sec 1.001313128 4,470,950 cycles # 0.001 GHz 1.001313128 1,440,045 instructions # 0.32 insn per cycle 1.001313128 283,222 branches # 0.035 M/sec 1.001313128 33,576 branch-misses # 11.86% of all branches 2.003857385 8,020.34 msec cpu-clock # 8.020 CPUs utilized 2.003857385 154 context-switches # 0.019 K/sec 2.003857385 8 cpu-migrations # 0.001 K/sec 2.003857385 2 page-faults # 0.000 K/sec 2.003857385 4,515,676 cycles # 0.001 GHz 2.003857385 2,180,449 instructions # 0.48 insn per cycle 2.003857385 435,254 branches # 0.054 M/sec 2.003857385 31,179 branch-misses # 7.16% of all branches Performance counter stats for 'system wide': 16,033.53 msec cpu-clock # 7.992 CPUs utilized 237 context-switches # 0.015 K/sec 16 cpu-migrations # 0.001 K/sec 2 page-faults # 0.000 K/sec 8,986,626 cycles # 0.001 GHz 3,620,494 instructions # 0.40 insn per cycle 718,476 branches # 0.045 M/sec 64,755 branch-misses # 9.01% of all branches 2.006124542 seconds time elapsed Fixes: c7e5b328a8d4 ("perf stat: Report summary for interval mode") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200903010113.32232-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	libtraceevent: Fix build warning on 32-bit arches	Tzvetomir Stoyanov (VMware)
	Fixed a compilation warning for casting to pointer from integer of different size on 32-bit platforms. Reported-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	perf jevents: Fix suspicious code in fixregex()	Namhyung Kim
	The new string should have enough space for the original string and the back slashes IMHO. Fixes: fbc2844e84038ce3 ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: William Cohen <wcohen@redhat.com> Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	perf parse-events: Use uintptr_t when casting numbers to pointers	Arnaldo Carvalho de Melo
	To address these errors found when cross building from x86_64 to MIPS little endian 32-bit: CC /tmp/build/perf/util/parse-events-bison.o util/parse-events.y: In function 'parse_events_parse': util/parse-events.y:514:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 514 \| (void ) $2, $6, $4); \| ^ util/parse-events.y:531:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 531 \| (void ) $2, NULL, $4)) { \| ^ util/parse-events.y:547:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 547 \| (void ) $2, $4, 0); \| ^ util/parse-events.y:564:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 564 \| (void ) $2, NULL, 0)) { \| ^ Fixes: cabbf26821aa210f ("perf parse: Before yyabort-ing free components") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03	tools/power turbostat: Build with _FILE_OFFSET_BITS=64	Alexander Monakov
	For compatibility reasons, Glibc off_t is a 32-bit type on 32-bit x86 unless _FILE_OFFSET_BITS=64 is defined. Add this define, as otherwise reading MSRs with index 0x80000000 and above attempts a pread with a negative offset, which fails. Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Tested-by: Liwei Song <liwei.song@windriver.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Support AMD Family 19h	Kim Phillips
	Family 19h processors have the same RAPL (Running average power limit) hardware register interface as Family 17h processors. Change the family checks to succeed for Family 17h and above to enable core and package energy measurement on Family 19h machines. Also update the TDP to the largest found at the bottom of the page at amd.com->processors->servers->epyc->2nd-gen-epyc, i.e., the EPYC 7H12. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Len Brown <len.brown@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Remove empty columns for Jacobsville	Antti Laakso
	Jacobsville doesn't have Package C2 and C6. Also Core and DRAM RAPL are not available. Adjust output accordingly. Signed-off-by: Antti Laakso <antti.laakso@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Add a new GFXAMHz column that exposes gt_act_freq_mhz.	Rafael Antognolli
	The column already present called GFXMHz reads from gt_cur_freq_mhz, which represents the GT frequency that was requested, but power management might not be able to do that. So the new column will display what the actual frequency GT is running at. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power x86_energy_perf_policy: Input/output error in a VM	Ondřej Lysoněk
	I've encountered an issue with x86_energy_perf_policy. If I run it on a machine that I'm told is a qemu-kvm virtual machine running inside a privileged container, I get the following error: x86_energy_perf_policy: /dev/cpu/0/msr offset 0x1ad read failed: Input/output error I get the same error in a Digital Ocean droplet, so that might be a similar environment. I created the following patch which is intended to give a more user-friendly message. It's based on a patch for turbostat from Prarit Bhargava that was posted some time ago. The patch is "[v2] turbostat: Running on virtual machine is not supported" [1]. Given my limited knowledge of the topic, I can't say with confidence that this is the right solution, though (that's why this is not an official patch submission). Also, I'm not sure what the convention with exit codes is in this tool. Also, instead of the error message, perhaps the tool should just not print anything in this case, which is how it behaves in a "regular" VM? [1] https://patchwork.kernel.org/patch/9868587/ Signed-off-by: Ondřej Lysoněk <olysonek@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Skip pc8, pc9, pc10 columns, if they are disabled	Len Brown
	Like we skip PC3 and PC6 columns when the package C-state limit disables them, skip PC8/PC9/CP10 under analogous conditions. Reported-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Support additional CPU model numbers	Len Brown
	Initial support for models recently added to intel-family.h. Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Fix output formatting for ACPI CST enumeration	David Arcari
	turbostat formatting is broken with ACPI CST for enumeration. The problem is that the CX_ACPI% is eight characters long which does not work with tab formatting. One simple solution is to remove the underbar from the state name such that C1_ACPI will be displayed as C1ACPI. Signed-off-by: David Arcari <darcari@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Replace HTTP links with HTTPS ones: TURBOSTAT UTILITY	Alexander A. Klimov
	Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w\|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Use sched_getcpu() instead of hardcoded cpu 0	Prarit Bhargava
	Disabling cpu 0 results in an error turbostat: /sys/devices/system/cpu/cpu0/topology/thread_siblings: open failed: No such file or directory Use sched_getcpu() instead of a hardcoded cpu 0 to get the max cpu number. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Enable accumulate RAPL display	Chen Yu
	Enable the accumulated RAPL display by default. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Introduce functions to accumulate RAPL consumption	Chen Yu
	Since the RAPL Joule Counter is 32 bit, turbostat would only print a star instead of printing the actual energy consumed to indicate the overflow due to long duration. This does not meet the requirement from servers as the sampling time of turbostat is usually very long on servers. So maintain a set of MSR buffer, and update them periodically before the 32bit MSR register is wrapped round, so as to avoid the overflow. The idea is similar to the implementation of ktime_get(): Periodical MSR timer: total_rapl_sum += (current_rapl_msr - last_rapl_msr); Using get_msr_sum() to get the accumulated RAPL: return (current_rapl_msr - last_rapl_msr) + total_rapl_sum; The accumulated RAPL mechanism will be turned on in next patch. Originally-by: Aaron Lu <aaron.lwe@gmail.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Make the energy variable to be 64 bit	Chen Yu
	Change the energy variable from 32bit to 64bit, so that it can record long time duration. After this conversion, adjust the DELTA_WRAP32() accordingly. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Always print idle in the system configuration header	Doug Smythies
	If the --quiet option is not used, turbostat prints a useful system configuration header during startup. But inclusion of idle system configuration information in this header is currently a function of inclusion in the columns chosen to be displayed. Always list this idle system configuration. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/power turbostat: Print /dev/cpu_dma_latency	Len Brown
	Users are puzzled when they use tuned performance and all their C-states vanish. Dump /dev/cpu_dma_latency and state whether the value is default, or constraining, to explain this situation. Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03	tools/memory-model: Add a simple entry point document	Paul E. McKenney
	Current LKMM documentation assumes that the reader already understands concurrency in the Linux kernel, which won't necessarily always be the case. This commit supplies a simple.txt file that provides a starting point for someone who is new to concurrency in the Linux kernel. That said, this file might also useful as a reminder to experienced developers of simpler approaches to dealing with concurrency. Link: Link: https://lwn.net/Articles/827180/ [ paulmck: Apply feedback from Joel Fernandes. ] Co-developed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Co-developed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-03	tools/memory-model: Improve litmus-test documentation	Paul E. McKenney
	The current LKMM documentation says very little about litmus tests, and worse yet directs people to the herd7 documentation for more information. Now, the herd7 documentation is quite voluminous and educational, but it is intended for people creating and modifying memory models, not those attempting to use them. This commit therefore updates README and creates a litmus-tests.txt file that gives an overview of litmus-test format and describes ways of modeling various special cases, illustrated with numerous examples. [ paulmck: Add Alan Stern feedback. ] [ paulmck: Apply Dave Chinner feedback. ] [ paulmck: Apply Andrii Nakryiko feedback. ] [ paulmck: Apply Johannes Weiner feedback. ] Link: https://lwn.net/Articles/827180/ Reported-by: Dave Chinner <david@fromorbit.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-03	tools/memory-model: Update recipes.txt prime_numbers.c path	Paul E. McKenney
	The expand_to_next_prime() and next_prime_number() functions have moved from lib/prime_numbers.c to lib/math/prime_numbers.c, so this commit updates recipes.txt to reflect this change. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-03	Replace HTTP links with HTTPS ones: LKMM	Alexander A. Klimov
	Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w\|/)`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-03	Merge branch 'scftorture.2020.08.24a' into HEAD	Paul E. McKenney
	scftorture.2020.08.24a: Torture tests for smp_call_function() and friends.
2020-09-03	libbpf: Remove arch-specific include path in Makefile	Naveen N. Rao
	Ubuntu mainline builds for ppc64le are failing with the below error (): CALL /home/kernel/COD/linux/scripts/atomic/check-atomics.sh DESCEND bpf/resolve_btfids Auto-detecting system features: ... libelf: [ [32mon[m ] ... zlib: [ [32mon[m ] ... bpf: [ [31mOFF[m ] BPF API too old make[6]: [Makefile:295: bpfdep] Error 1 make[5]: * [Makefile:54: /home/kernel/COD/linux/debian/build/build-generic/tools/bpf/resolve_btfids//libbpf.a] Error 2 make[4]: * [Makefile:71: bpf/resolve_btfids] Error 2 make[3]: * [/home/kernel/COD/linux/Makefile:1890: tools/bpf/resolve_btfids] Error 2 make[2]: * [/home/kernel/COD/linux/Makefile:335: __build_one_by_one] Error 2 make[2]: Leaving directory '/home/kernel/COD/linux/debian/build/build-generic' make[1]: * [Makefile:185: __sub-make] Error 2 make[1]: Leaving directory '/home/kernel/COD/linux' resolve_btfids needs to be build as a host binary and it needs libbpf. However, libbpf Makefile hardcodes an include path utilizing $(ARCH). This results in mixing of cross-architecture headers resulting in a build failure. The specific header include path doesn't seem necessary for a libbpf build. Hence, remove the same. (*) https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.9-rc3/ppc64el/log Reported-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200902084246.1513055-1-naveen.n.rao@linux.vnet.ibm.com
2020-09-02	selftests: mptcp: fix typo in mptcp_connect usage	Davide Caratti
	in mptcp_connect, 's' selects IPPROTO_MPTCP / IPPROTO_TCP as the value of 'protocol' in socket(), and 'm' switches between different send / receive modes. Fix die_usage(): swap 'm' and 's' and add missing 'sendfile' mode. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-02	selftests/bpf: Test task_file iterator without visiting pthreads	Yonghong Song
	Modified existing bpf_iter_test_file.c program to check whether all accessed files from the main thread or not. Modified existing bpf_iter_test_file program to check whether all accessed files from the main thread or not. $ ./test_progs -n 4 ... #4/7 task_file:OK ... #4 bpf_iter:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200902023113.1672863-1-yhs@fb.com
2020-09-01	Merge tag 'perf-tools-fixes-for-v5.9-2020-09-01' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix infinite loop in the TUI for grouped events in 'perf top/record', eg when using "perf top -e '{cycles,instructions,cache-misses}'". - Fix segfault by skipping side-band event setup if HAVE_LIBBPF_SUPPORT is not set. - Fix synthesized branch stacks generated from CoreSight ETM trace and Intel PT hardware traces. - Fix error when synthesizing events from ARM SPE hardware trace. - The SNOOPX and REMOTE offsets in the data_src bitmask in perf records were were both 37, SNOOPX is 38, fix it. - Fix use of CPU list with summary option in 'perf sched timehist'. - Avoid an uninitialized read when using fake PMUs. - Set perf_event_attr.exclude_guest=1 for user-space counting. - Don't order events when doing a 'perf report -D' raw dump of perf.data records. - Set NULL sentinel in pmu_events table in "Parse and process metrics" 'perf test' - Fix basic bpf filtering 'perf test' on s390x. - Fix out of bounds array access in the 'perf stat' print_counters() evlist method. - Add mwait_idle_with_hints.constprop.0 to the list of idle symbols. - Use %zd for size_t printf formats on 32-bit. - Correct the help info of "perf record --no-bpf-event" option. - Add entries for CoreSight and Arm SPE tooling to MAINTAINERS. * tag 'perf-tools-fixes-for-v5.9-2020-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf report: Disable ordered_events for raw dump perf tools: Correct SNOOPX field offset perf intel-pt: Fix corrupt data after perf inject from perf cs-etm: Fix corrupt data after perf inject from perf top/report: Fix infinite loop in the TUI for grouped events perf parse-events: Avoid an uninitialized read when using fake PMUs perf stat: Fix out of bounds array access in the print_counters() evlist method perf test: Set NULL sentinel in pmu_events table in "Parse and process metrics" test perf parse-events: Set exclude_guest=1 for user-space counting perf record: Correct the help info of option "--no-bpf-event" perf tools: Use %zd for size_t printf formats on 32-bit MAINTAINERS: Add entries for CoreSight and Arm SPE tooling perf: arm-spe: Fix check error when synthesizing events perf symbols: Add mwait_idle_with_hints.constprop.0 to the list of idle symbols perf top: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set perf sched timehist: Fix use of CPU list with summary option perf test: Fix basic bpf filtering test
2020-09-01	blk-iocost: update iocost_monitor.py	Tejun Heo
	iocost went through significant internal changes. Update iocost_monitor.py accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01	objtool: Define 'struct orc_entry' only when needed	Julien Thierry
	Implementation of ORC requires some definitions that are currently provided by the target architecture headers. Do not depend on these definitions when the orc subcommand is not implemented. This avoid requiring arches with no orc implementation to provide dummy orc definitions. Signed-off-by: Julien Thierry <jthierry@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-01	objtool: Skip ORC entry creation for non-text sections	Julien Thierry
	Orc generation is only done for text sections, but some instructions can be found in non-text sections (e.g. .discard.text sections). Skip setting their orc sections since their whole sections will be skipped for orc generation. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Julien Thierry <jthierry@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-01	objtool: Move ORC logic out of check()	Julien Thierry
	Now that the objtool_file can be obtained outside of the check function, orc generation builtin no longer requires check to explicitly call its orc related functions. Signed-off-by: Julien Thierry <jthierry@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-01	objtool: Move object file loading out of check()	Julien Thierry
	Structure objtool_file can be used by different subcommands. In fact it already is, by check and orc. Provide a function that allows to initialize objtool_file, that builtin can call, without relying on check to do the correct setup for them and explicitly hand the objtool_file to them. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Julien Thierry <jthierry@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2020-09-01 The following pull-request contains BPF updates for your net-next tree. There are two small conflicts when pulling, resolve as follows: 1) Merge conflict in tools/lib/bpf/libbpf.c between 88a82120282b ("libbpf: Factor out common ELF operations and improve logging") in bpf-next and 1e891e513e16 ("libbpf: Fix map index used in error message") in net-next. Resolve by taking the hunk in bpf-next: [...] scn = elf_sec_by_idx(obj, obj->efile.btf_maps_shndx); data = elf_sec_data(obj, scn); if (!scn \|\| !data) { pr_warn("elf: failed to get %s map definitions for %s\n", MAPS_ELF_SEC, obj->path); return -EINVAL; } [...] 2) Merge conflict in drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c between 9647c57b11e5 ("xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better performance") in bpf-next and e20f0dbf204f ("net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES") in net-next. Resolve the two locations by retaining net_prefetch() and taking xsk_buff_dma_sync_for_cpu() from bpf-next. Should look like: [...] xdp_set_data_meta_invalid(xdp); xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool); net_prefetch(xdp->data); [...] We've added 133 non-merge commits during the last 14 day(s) which contain a total of 246 files changed, 13832 insertions(+), 3105 deletions(-). The main changes are: 1) Initial support for sleepable BPF programs along with bpf_copy_from_user() helper for tracing to reliably access user memory, from Alexei Starovoitov. 2) Add BPF infra for writing and parsing TCP header options, from Martin KaFai Lau. 3) bpf_d_path() helper for returning full path for given 'struct path', from Jiri Olsa. 4) AF_XDP support for shared umems between devices and queues, from Magnus Karlsson. 5) Initial prep work for full BPF-to-BPF call support in libbpf, from Andrii Nakryiko. 6) Generalize bpf_sk_storage map & add local storage for inodes, from KP Singh. 7) Implement sockmap/hash updates from BPF context, from Lorenz Bauer. 8) BPF xor verification for scalar types & add BPF link iterator, from Yonghong Song. 9) Use target's prog type for BPF_PROG_TYPE_EXT prog verification, from Udip Pant. 10) Rework BPF tracing samples to use libbpf loader, from Daniel T. Lee. 11) Fix xdpsock sample to really cycle through all buffers, from Weqaar Janjua. 12) Improve type safety for tun/veth XDP frame handling, from Maciej Żenczykowski. 13) Various smaller cleanups and improvements all over the place. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-01	tools/bpf: build: Make sure resolve_btfids cleans up after itself	Toke Høiland-Jørgensen
	The new resolve_btfids tool did not clean up the feature detection folder on 'make clean', and also was not called properly from the clean rule in tools/make/ folder on its 'make clean'. This lead to stale objects being left around, which could cause feature detection to fail on subsequent builds. Fixes: fbbb68de80a4 ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20200901144343.179552-1-toke@redhat.com
2020-09-01	perf report: Disable ordered_events for raw dump	Jiri Olsa
	Disable ordered_events for report raw dump, because for raw dump we want to see events as they are stored in the perf.data file, not sorted by time. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200827134830.126721-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-01	perf tools: Correct SNOOPX field offset	Al Grant
	perf_event.h has macros that define the field offsets in the data_src bitmask in perf records. The SNOOPX and REMOTE offsets were both 37. These are distinct fields, and the bitfield layout in perf_mem_data_src confirms that SNOOPX should be at offset 38. Committer notes: This was extracted from a larger patch that also contained kernel changes. Fixes: 52839e653b5629bd ("perf tools: Add support for printing new mem_info encodings") Signed-off-by: Al Grant <al.grant@arm.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/9974f2d0-bf7f-518e-d9f7-4520e5ff1bb0@foss.arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-01	perf intel-pt: Fix corrupt data after perf inject from	Al Grant
	Commit 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack") changed the format of branch stacks in perf samples. When samples use this new format, a flag must be set in the corresponding event. Synthesized branch stacks generated from Intel PT were using the new format, but not setting the event attribute, leading to consumers seeing corrupt data. This patch fixes the issue by setting the event attribute to indicate use of the new format. Fixes: 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack") Signed-off-by: Al Grant <al.grant@arm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20200819084751.17686-2-leo.yan@linaro.org Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>