linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-02-11	perf maps: Move kmap::kmaps setup to maps__insert()	Jiri Olsa
	So the kmaps pointer setup is centralized and we do not need to update it in all those places (2 current places and few more missing) after calling maps__insert(). Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200210143218.24948-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11	perf maps: Fix map__clone() for struct kmap	Jiri Olsa
	The map__clone() function can be called on kernel maps as well, so it needs to duplicate the whole kmap data. Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200210143218.24948-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11	perf maps: Mark ksymbol DSOs with kernel type	Jiri Olsa
	We add ksymbol map into machine->kmaps, so it needs to be created as 'struct kmap', which is dependent on its dso having kernel type. Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200210200847.GA36715@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11	perf maps: Mark module DSOs with kernel type	Jiri Olsa
	We add kernel module map into machine->kmaps, so it needs to be created as 'struct kmap', which is dependent on its dso having kernel type. Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kim Phillips <kim.phillips@amd.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200210143218.24948-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11	tools include UAPI: Sync x86's syscalls_64.tbl, generic unistd.h and fcntl.h ↵	Arnaldo Carvalho de Melo
	to pick up openat2 and pidfd_getfd fddb5d430ad9 ("open: introduce openat2(2) syscall") 9a2cef09c801 ("arch: wire up pidfd_getfd syscall") We also need to grab a copy of uapi/linux/openat2.h since it is now needed by fcntl.h, add it to tools/perf/check_headers.h. $ diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl --- tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 2019-12-20 16:43:57.662429958 -0300 +++ arch/x86/entry/syscalls/syscall_64.tbl 2020-02-10 16:36:22.070012468 -0300 @@ -357,6 +357,8 @@ 433 common fspick __x64_sys_fspick 434 common pidfd_open __x64_sys_pidfd_open 435 common clone3 __x64_sys_clone3/ptregs +437 common openat2 __x64_sys_openat2 +438 common pidfd_getfd __x64_sys_pidfd_getfd # # x32-specific system call numbers start at 512 to avoid cache impact $ Update tools/'s copy of that file: $ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl See the result: $ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2020-02-10 16:42:59.010636041 -0300 +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2020-02-10 16:43:24.149958337 -0300 @@ -346,5 +346,7 @@ [433] = "fspick", [434] = "pidfd_open", [435] = "clone3", + [437] = "openat2", + [438] = "pidfd_getfd", }; -#define SYSCALLTBL_x86_64_MAX_ID 435 +#define SYSCALLTBL_x86_64_MAX_ID 438 $ Now one can use: perf trace -e openat2,pidfd_getfd To get just those syscalls or use in things like: perf trace -e open* To get all the open variant (open, openat, openat2, etc) or: perf trace pidfd* To get the pidfd syscalls. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11	objtool: Add relocation check for alternative sections	Josh Poimboeuf
	Relocations in alternative code can be dangerous, because the code is copy/pasted to the text section after relocations have been resolved, which can corrupt PC-relative addresses. However, relocations might be acceptable in some cases, depending on the architecture. For example, the x86 alternatives code manually fixes up the target addresses for PC-relative jumps and calls. So disallow relocations in alternative code, except where the x86 arch code allows it. This code may need to be tweaked for other arches when objtool gets support for them. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Julien Thierry <jthierry@redhat.com> Link: https://lkml.kernel.org/r/7b90b68d093311e4e8f6b504a9e1c758fd7e0002.1581359535.git.jpoimboe@redhat.com
2020-02-11	objtool: Add is_static_jump() helper	Josh Poimboeuf
	There are several places where objtool tests for a non-dynamic (aka direct) jump. Move the check to a helper function. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Julien Thierry <jthierry@redhat.com> Link: https://lkml.kernel.org/r/9b8b438df918276315e4765c60d2587f3c7ad698.1581359535.git.jpoimboe@redhat.com
2020-02-11	objtool: Fail the kernel build on fatal errors	Josh Poimboeuf
	When objtool encounters a fatal error, it usually means the binary is corrupt or otherwise broken in some way. Up until now, such errors were just treated as warnings which didn't fail the kernel build. However, objtool is now stable enough that if a fatal error is discovered, it most likely means something is seriously wrong and it should fail the kernel build. Note that this doesn't apply to "normal" objtool warnings; only fatal ones. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Julien Thierry <jthierry@redhat.com> Link: https://lkml.kernel.org/r/f18c3743de0fef673d49dd35760f26bdef7f6fc3.1581359535.git.jpoimboe@redhat.com
2020-02-10	selftests/resctrl: Disable MBA and MBM tests for AMD	Babu Moger
	For now, disable MBA and MBM tests for AMD. Deciding test pass/fail is not clear right now. We can enable when we have some clarity. Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Use cache index3 id for AMD schemata masks	Babu Moger
	AMD uses the cache l3 boundary for schemata masks. Update it accordigly. Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add vendor detection mechanism	Babu Moger
	RESCTRL feature is supported both on Intel and AMD now. Some features are implemented differently. Add vendor detection mechanism. Use the vendor check where there are differences. Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add Cache Allocation Technology (CAT) selftest	Fenghua Yu
	Cache Allocation Technology (CAT) selftest allocates a portion of last level cache and starts a benchmark to read each cache line in this portion of cache. Measure the cache misses in perf and the misses should be equal to the number of cache lines in this portion of cache. We don't use CQM to calculate cache usage because some CAT enabled platforms don't have CQM. Co-developed-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add Cache QoS Monitoring (CQM) selftest	Fenghua Yu
	Cache QoS Monitoring (CQM) selftest starts stressful cache benchmark with specified size of memory to access the cache. Last Level cache occupancy reported by CQM should be close to the size of the memory. Co-developed-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add MBA test	Fenghua Yu
	MBA (Memory Bandwidth Allocation) test starts a stressful memory bandwidth benchmark and allocates memory bandwidth from 100% down to 10% for the benchmark process. For each allocation, compare perf IMC counter and mbm total bytes from resctrl. The difference between the two values should be within a threshold to pass the test. Default benchmark is built-in fill_buf. But users can specify their own benchmark by option "-b". We can add memory bandwidth allocation for multiple processes in the future. Co-developed-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add MBM test	Fenghua Yu
	MBM (Memory Bandwidth Monitoring) test is the first implemented selftest. It starts a stressful memory bandwidth benchmark and assigns the bandwidth pid in a resctrl monitoring group. Read and compare perf IMC counter and MBM total bytes for the benchmark. The numbers should be close enough to pass the test. Default benchmark is built-in fill_buf. But users can specify their own benchmark by option "-b". We can add memory bandwidth monitoring for multiple processes in the future. Co-developed-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add built in benchmark	Sai Praneeth Prakhya
	Built-in benchmark fill_buf generates stressful memory bandwidth and cache traffic. Later it will be used as a default benchmark by various resctrl tests such as MBA (Memory Bandwidth Allocation) and MBM (Memory Bandwidth Monitoring) tests. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add callback to start a benchmark	Sai Praneeth Prakhya
	The callback starts a child process and puts the child pid in created resctrl group with specified memory bandwidth in schemata. The child starts running benchmark. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Read memory bandwidth from perf IMC counter and from ↵	Sai Praneeth Prakhya
	resctrl file system Total memory bandwidth can be monitored from perf IMC counter and from resctrl file system. Later the two will be compared to verify the total memory bandwidth read from resctrl is correct. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add basic resctrl file system operations and data	Sai Praneeth Prakhya
	The basic resctrl file system operations and data are added for future usage by resctrl selftest tool. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/resctrl: Add README for resctrl tests	Fenghua Yu
	resctrl tests will be implemented. README is added for the tool first. Co-developed-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests: fix too long argument	Jiri Benc
	With some shells, the command construed for install of bpf selftests becomes too large due to long list of files: make[1]: execvp: /bin/sh: Argument list too long make[1]: *** [../lib.mk:73: install] Error 127 Currently, each of the file lists is replicated three times in the command: in the shell 'if' condition, in the 'echo' and in the 'rsync'. Reduce that by one instance by using make conditionals and separate the echo and rsync into two shell commands. (One would be inclined to just remove the '@' at the beginning of the rsync command and let 'make' echo it by itself; unfortunately, it appears that the '@' in the front of mkdir silences output also for the following commands.) Also, separate handling of each of the lists to its own shell command. The semantics of the makefile is unchanged before and after the patch. The ability of individual test directories to override INSTALL_RULE is retained. Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Tested-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests: allow detection of build failures	Jiri Benc
	Commit 5f70bde26a48 ("selftests: fix build behaviour on targets' failures") added a logic to track failure of builds of individual targets. However, it does exactly the opposite of what a distro kernel needs: we create a RPM package with a selected set of selftests and we need the build to fail if build of any of the targets fail. Both use cases are valid. A distribution kernel is in control of what is included in the kernel and what is being built; any error needs to be flagged and acted upon. A CI system that tries to build as many tests as possible on the best effort basis is not really interested in a failure here and there. Support both use cases by introducing a FORCE_TARGETS variable. It is switched off by default to make life for CI systems easier, distributions can easily switch it on while building their packages. Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	Kernel selftests: tpm2: check for tpm support	Nikita Sobolev
	tpm2 tests set fails if there is no /dev/tpm0 and /dev/tpmrm0 supported. Check if these files exist before run and mark test as skipped in case of absence. Signed-off-by: Nikita Sobolev <Nikita.Sobolev@synopsys.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests/ftrace: Have pid filter test use instance flag	Steven Rostedt (VMware)
	While running the ftracetests, the pid filter test failed because the instance "foo" existed, and it was using it to rerun the test under a instance named foo. The collision caused the test to fail as the mkdir failed as the name already existed. As of commit b5b77be812de7 ("selftests: ftrace: Allow some tests to be run in a tracing instance") all a selftest needs to do to be tested in an instance is to set the "instance" flag. There's no reason a selftest needs to create an instance to run its test in an instance directly. Remove the open coded testing in an instance for the pid filter test and have it set the "instance" flag instead. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	selftests: fix spelling mistaked "chaigned" -> "chained"	Colin Ian King
	There is a spelling mistake in a literal string, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-02-10	firmware_loader: load files from the mount namespace of init	Topi Miettinen
	I have an experimental setup where almost every possible system service (even early startup ones) runs in separate namespace, using a dedicated, minimal file system. In process of minimizing the contents of the file systems with regards to modules and firmware files, I noticed that in my system, the firmware files are loaded from three different mount namespaces, those of systemd-udevd, init and systemd-networkd. The logic of the source namespace is not very clear, it seems to depend on the driver, but the namespace of the current process is used. So, this patch tries to make things a bit clearer and changes the loading of firmware files only from the mount namespace of init. This may also improve security, though I think that using firmware files as attack vector could be too impractical anyway. Later, it might make sense to make the mount namespace configurable, for example with a new file in /proc/sys/kernel/firmware_config/. That would allow a dedicated file system only for firmware files and those need not be present anywhere else. This configurability would make more sense if made also for kernel modules and /sbin/modprobe. Modules are already loaded from init namespace (usermodehelper uses kthreadd namespace) except when directly loaded by systemd-udevd. Instead of using the mount namespace of the current process to load firmware files, use the mount namespace of init process. Link: https://lore.kernel.org/lkml/bb46ebae-4746-90d9-ec5b-fce4c9328c86@gmail.com/ Link: https://lore.kernel.org/lkml/0e3f7653-c59d-9341-9db2-c88f5b988c68@gmail.com/ Signed-off-by: Topi Miettinen <toiwoton@gmail.com> Link: https://lore.kernel.org/r/20200123125839.37168-1-toiwoton@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-10	bpf: Selftests build error in sockmap_basic.c	John Fastabend
	Fix following build error. We could push a tcp.h header into one of the include paths, but I think its easy enough to simply pull in the three defines we need here. If we end up using more of tcp.h at some point we can pull it in later. /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: In function ‘connected_socket_v4’: /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: error: ‘TCP_REPAIR_ON’ undeclared (first use in this function) repair = TCP_REPAIR_ON; ^ /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: note: each undeclared identifier is reported only once for each function it appears in /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:29:11: error: ‘TCP_REPAIR_OFF_NO_WP’ undeclared (first use in this function) repair = TCP_REPAIR_OFF_NO_WP; Then with fix, $ ./test_progs -n 44 #44/1 sockmap create_update_free:OK #44/2 sockhash create_update_free:OK #44 sockmap_basic:OK Fixes: 5d3919a953c3c ("selftests/bpf: Test freeing sockmap/sockhash with a socket in it") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/158131347731.21414.12120493483848386652.stgit@john-Precision-5820-Tower
2020-02-10	tools/bootconfig: Suppress non-error messages	Masami Hiramatsu
	Suppress non-error messages when applying new bootconfig to initrd image. To enable it, replace printf for error message with pr_err() macro. This also adds a testcase for this fix. Link: http://lkml.kernel.org/r/158125351377.16911.13283712972275131160.stgit@devnote2 Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10	bootconfig: Allocate xbc_nodes array dynamically	Masami Hiramatsu
	To reduce the large static array from kernel data, allocate xbc_nodes array dynamically only if the kernel loads a bootconfig. Note that this also add dummy memblock.h for user-spacae bootconfig tool. Link: http://lkml.kernel.org/r/158108569699.3187.6512834527603883707.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10	perf symbols: Convert symbol__is_idle() to use strlist	Kim Phillips
	Use the more optimized strlist implementation to do the idle function lookup. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200210163147.25358-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-10	perf symbols: Update the list of kernel idle symbols	Kim Phillips
	The "acpi_idle_do_entry", "acpi_processor_ffh_cstate_enter", and "idle_cpu" symbols appear in 'perf top' output, at least on AMD systems. Add them to perf's idle_symbols list, so they don't dominate 'perf top' output. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200207230613.26709-2-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-10	perf stat: Don't report a null stalled cycles per insn metric	Kim Phillips
	For data collected on machines with front end stalled cycles supported, such as found on modern AMD CPU families, commit 146540fb545b ("perf stat: Always separate stalled cycles per insn") introduces a new line in CSV output with a leading comma that upsets some automated scripts. Scripts have to use "-e ex_ret_instr" to work around this issue, after upgrading to a version of perf with that commit. We could add "if (have_frontend_stalled && !config->csv_sep)" to the not (total && avg) else clause, to emphasize that CSV users are usually scripts, and are written to do only what is needed, i.e., they wouldn't typically invoke "perf stat" without specifying an explicit event list. But - let alone CSV output - why should users now tolerate a constant 0-reporting extra line in regular terminal output?: BEFORE: $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1 Performance counter stats for 'system wide': 181,110,981 instructions # 0.58 insn per cycle # 0.00 stalled cycles per insn 309,876,469 cycles 1.002202582 seconds time elapsed The user would not like to see the now permanent: "0.00 stalled cycles per insn" line fixture, as it gives no useful information. So this patch removes the printing of the zeroed stalled cycles line altogether, almost reverting the very original commit fb4605ba47e7 ("perf stat: Check for frontend stalled for metrics"), which seems like it was written to normalize --metric-only column output of common Intel machines at the time: modern Intel machines have ceased to support the genericised frontend stalled metrics AFAICT. AFTER: $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1 Performance counter stats for 'system wide': 244,071,432 instructions # 0.69 insn per cycle 355,353,490 cycles 1.001862516 seconds time elapsed Output behaviour when stalled cycles is indeed measured is not affected (BEFORE == AFTER): $ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1 Performance counter stats for 'system wide': 247,227,799 instructions # 0.63 insn per cycle # 0.26 stalled cycles per insn 394,745,636 cycles 63,194,485 stalled-cycles-frontend # 16.01% frontend cycles idle 1.002079770 seconds time elapsed Fixes: 146540fb545b ("perf stat: Always separate stalled cycles per insn") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-10	tools/bootconfig: Fix wrong __VA_ARGS__ usage	Masami Hiramatsu
	Since printk() wrapper macro uses __VA_ARGS__ without "##" prefix, it causes a build error if there is no variable arguments (e.g. only fmt is specified.) To fix this error, use ##__VA_ARGS__ instead of __VAR_ARGS__. Link: http://lkml.kernel.org/r/158108370130.2758.10893830923800978011.stgit@devnote2 Fixes: 950313ebf79c ("tools: bootconfig: Add bootconfig command") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10	tools/power/x86/intel-speed-select: Avoid duplicate names for json parsing	Srinivas Pandruvada
	For the command "intel-speed-select perf-profile info": There are two instances of “speed-select-turbo-freq” underneath “perf-profile-level-0” for each package. When we load the output into python with json.load(), the second instance overwrites the first. Result is that we can only access: "speed-select-turbo-freq": { "bucket-0": { "high-priority-cores-count": "2", "high-priority-max-frequency(MHz)": "3000", "high-priority-max-avx2-frequency(MHz)": "2800", "high-priority-max-avx512-frequency(MHz)": "2600" }, Because it is a duplicate of "speed-select-turbo-freq": "disabled" Same is true for "speed-select-base-freq". To avoid this add "-properties" suffix for the second instance to differentiate. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-02-10	tools/power/x86/intel-speed-select: Fix display for turbo-freq auto mode	Srinivas Pandruvada
	When mailbox command for the turbo-freq enable fails, then don't display result for auto-mode. When turbo-freq enable fails, there is no point to set CPU priorities. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-02-09	Merge tag 'perf-urgent-2020-02-09' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of fixes and improvements for the perf subsystem: Kernel fixes: - Install cgroup events to the correct CPU context to prevent a potential list double add - Prevent an integer underflow in the perf mlock accounting - Add a missing prototype for arch_perf_update_userpage() Tooling: - Add a missing unlock in the error path of maps__insert() in perf maps. - Fix the build with the latest libbfd - Fix the perf parser so it does not delete parse event terms, which caused a regression for using perf with the ARM CoreSight as the sink configuration was missing due to the deletion. - Fix the double free in the perf CPU map merging test case - Add the missing ustring support for the perf probe command" * tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf maps: Add missing unlock to maps__insert() error case perf probe: Add ustring support for perf probe command perf: Make perf able to build with latest libbfd perf test: Fix test case Merge cpu map perf parse: Copy string to perf_evsel_config_term perf parse: Refactor 'struct perf_evsel_config_term' kernel/events: Add a missing prototype for arch_perf_update_userpage() perf/cgroups: Install cgroup events to correct cpuctx perf/core: Fix mlock accounting in perf_mmap()
2020-02-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Unbalanced locking in mwifiex_process_country_ie, from Brian Norris. 2) Fix thermal zone registration in iwlwifi, from Andrei Otcheretianski. 3) Fix double free_irq in sgi ioc3 eth, from Thomas Bogendoerfer. 4) Use after free in mptcp, from Florian Westphal. 5) Use after free in wireguard's root_remove_peer_lists, from Eric Dumazet. 6) Properly access packets heads in bonding alb code, from Eric Dumazet. 7) Fix data race in skb_queue_len(), from Qian Cai. 8) Fix regression in r8169 on some chips, from Heiner Kallweit. 9) Fix XDP program ref counting in hv_netvsc, from Haiyang Zhang. 10) Certain kinds of set link netlink operations can cause a NULL deref in the ipv6 addrconf code. Fix from Eric Dumazet. 11) Don't cancel uninitialized work queue in drop monitor, from Ido Schimmel. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) net: thunderx: use proper interface type for RGMII mt76: mt7615: fix max_nss in mt7615_eeprom_parse_hw_cap bpf: Improve bucket_log calculation logic selftests/bpf: Test freeing sockmap/sockhash with a socket in it bpf, sockhash: Synchronize_rcu before free'ing map bpf, sockmap: Don't sleep while holding RCU lock on tear-down bpftool: Don't crash on missing xlated program instructions bpf, sockmap: Check update requirements after locking drop_monitor: Do not cancel uninitialized work item mlxsw: spectrum_dpipe: Add missing error path mlxsw: core: Add validation of hardware device types for MGPIR register mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abort selftests: mlxsw: Add test cases for local table route replacement mlxsw: spectrum_router: Prevent incorrect replacement of local table routes net: dsa: microchip: enable module autoprobe ipv6/addrconf: fix potential NULL deref in inet6_set_link_af() dpaa_eth: support all modes with rate adapting PHYs net: stmmac: update pci platform data to use phy_interface net: stmmac: xgmac: fix missing IFF_MULTICAST checki in dwxgmac2_set_filter net: stmmac: fix missing IFF_MULTICAST check in dwmac4_set_filter ...
2020-02-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2020-02-07 The following pull-request contains BPF updates for your net tree. We've added 15 non-merge commits during the last 10 day(s) which contain a total of 12 files changed, 114 insertions(+), 31 deletions(-). The main changes are: 1) Various BPF sockmap fixes related to RCU handling in the map's tear- down code, from Jakub Sitnicki. 2) Fix macro state explosion in BPF sk_storage map when calculating its bucket_log on allocation, from Martin KaFai Lau. 3) Fix potential BPF sockmap update race by rechecking socket's established state under lock, from Lorenz Bauer. 4) Fix crash in bpftool on missing xlated instructions when kptr_restrict sysctl is set, from Toke Høiland-Jørgensen. 5) Fix i40e's XSK wakeup code to return proper error in busy state and various misc fixes in xdpsock BPF sample code, from Maciej Fijalkowski. 6) Fix the way modifiers are skipped in BTF in the verifier while walking pointers to avoid program rejection, from Alexei Starovoitov. 7) Fix Makefile for runqslower BPF tool to i) rebuild on libbpf changes and ii) to fix undefined reference linker errors for older gcc version due to order of passed gcc parameters, from Yulia Kartseva and Song Liu. 8) Fix a trampoline_count BPF kselftest warning about missing braces around initializer, from Andrii Nakryiko. 9) Fix up redundant "HAVE" prefix from large INSN limit kernel probe in bpftool, from Michal Rostecki. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07	selftests/bpf: Test freeing sockmap/sockhash with a socket in it	Jakub Sitnicki
	Commit 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down") introduced sleeping issues inside RCU critical sections and while holding a spinlock on sockmap/sockhash tear-down. There has to be at least one socket in the map for the problem to surface. This adds a test that triggers the warnings for broken locking rules. Not a fix per se, but rather tooling to verify the accompanying fixes. Run on a VM with 1 vCPU to reproduce the warnings. Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down") Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200206111652.694507-4-jakub@cloudflare.com
2020-02-07	bpftool: Don't crash on missing xlated program instructions	Toke Høiland-Jørgensen
	Turns out the xlated program instructions can also be missing if kptr_restrict sysctl is set. This means that the previous fix to check the jited_prog_insns pointer was insufficient; add another check of the xlated_prog_insns pointer as well. Fixes: 5b79bcdf0362 ("bpftool: Don't crash on missing jited insns or ksyms") Fixes: cae73f233923 ("bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200206102906.112551-1-toke@redhat.com
2020-02-07	selftests: mlxsw: Add test cases for local table route replacement	Ido Schimmel
	Test that routes in the main table do not replace identical routes in the local table and that routes in the local table do replace identical routes in the main table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-06	Merge tag 'kvm-5.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull more KVM updates from Paolo Bonzini: "s390: - fix register corruption - ENOTSUPP/EOPNOTSUPP mixed - reset cleanups/fixes - selftests x86: - Bug fixes and cleanups - AMD support for APIC virtualization even in combination with in-kernel PIT or IOAPIC. MIPS: - Compilation fix. Generic: - Fix refcount overflow for zero page" * tag 'kvm-5.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: vmx: delete meaningless vmx_decache_cr0_guest_bits() declaration KVM: x86: Mark CR4.UMIP as reserved based on associated CPUID bit x86: vmxfeatures: rename features for consistency with KVM and manual KVM: SVM: relax conditions for allowing MSR_IA32_SPEC_CTRL accesses KVM: x86: Fix perfctr WRMSR for running counters x86/kvm/hyper-v: don't allow to turn on unsupported VMX controls for nested guests x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs() kvm: mmu: Separate generating and setting mmio ptes kvm: mmu: Replace unsigned with unsigned int for PTE access KVM: nVMX: Remove stale comment from nested_vmx_load_cr3() KVM: MIPS: Fold comparecount_func() into comparecount_wakeup() KVM: MIPS: Fix a build error due to referencing not-yet-defined function x86/kvm: do not setup pv tlb flush when not paravirtualized KVM: fix overflow of zero page refcount with ksm running KVM: x86: Take a u64 when checking for a valid dr7 value KVM: x86: use raw clock values consistently KVM: x86: reorganize pvclock_gtod_data members KVM: nVMX: delete meaningless nested_vmx_run() declaration KVM: SVM: allow AVIC without split irqchip kvm: ioapic: Lazy update IOAPIC EOI ...
2020-02-06	mptcp: fix use-after-free for ipv6	Florian Westphal
	Turns out that when we accept a new subflow, the newly created inet_sk(tcp_sk)->pinet6 points at the ipv6_pinfo structure of the listener socket. This wasn't caught by the selftest because it closes the accepted fd before the listening one. adding a close(listenfd) after accept returns is enough: BUG: KASAN: use-after-free in inet6_getname+0x6ba/0x790 Read of size 1 at addr ffff88810e310866 by task mptcp_connect/2518 Call Trace: inet6_getname+0x6ba/0x790 __sys_getpeername+0x10b/0x250 __x64_sys_getpeername+0x6f/0xb0 also alter test program to exercise this. Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-06	Merge tag 'trace-v5.6-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added new "bootconfig". This looks for a file appended to initrd to add boot config options, and has been discussed thoroughly at Linux Plumbers. Very useful for adding kprobes at bootup. Only enabled if "bootconfig" is on the real kernel command line. - Created dynamic event creation. Merges common code between creating synthetic events and kprobe events. - Rename perf "ring_buffer" structure to "perf_buffer" - Rename ftrace "ring_buffer" structure to "trace_buffer" Had to rename existing "trace_buffer" to "array_buffer" - Allow trace_printk() to work withing (some) tracing code. - Sort of tracing configs to be a little better organized - Fixed bug where ftrace_graph hash was not being protected properly - Various other small fixes and clean ups * tag 'trace-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (88 commits) bootconfig: Show the number of nodes on boot message tools/bootconfig: Show the number of bootconfig nodes bootconfig: Add more parse error messages bootconfig: Use bootconfig instead of boot config ftrace: Protect ftrace_graph_hash with ftrace_sync ftrace: Add comment to why rcu_dereference_sched() is open coded tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu tracing: Annotate ftrace_graph_hash pointer with __rcu bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline tracing: Use seq_buf for building dynevent_cmd string tracing: Remove useless code in dynevent_arg_pair_add() tracing: Remove check_arg() callbacks from dynevent args tracing: Consolidate some synth_event_trace code tracing: Fix now invalid var_ref_vals assumption in trace action tracing: Change trace_boot to use synth_event interface tracing: Move tracing selftests to bottom of menu tracing: Move mmio tracer config up with the other tracers tracing: Move tracing test module configs together tracing: Move all function tracing configs together tracing: Documentation for in-kernel synthetic event API ...
2020-02-05	tools/bootconfig: Show the number of bootconfig nodes	Masami Hiramatsu
	Show the number of bootconfig nodes when applying new bootconfig to initrd. Since there are limitations of bootconfig not only in its filesize, but also the number of nodes, the number should be shown when applying so that user can get the feeling of scale of current bootconfig. Link: http://lkml.kernel.org/r/158091061337.27924.10886706631693823982.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-05	tools/bpf/runqslower: Rebuild libbpf.a on libbpf source change	Song Liu
	Add missing dependency of $(BPFOBJ) to $(LIBBPF_SRC), so that running make in runqslower/ will rebuild libbpf.a when there is change in libbpf/. Fixes: 9c01546d26d2 ("tools/bpf: Add runqslower tool to tools/bpf") Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200204215037.2258698-1-songliubraving@fb.com
2020-02-05	Merge tag 'kvm-s390-next-5.6-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and cleanups for 5.6 - fix register corruption - ENOTSUPP/EOPNOTSUPP mixed - reset cleanups/fixes - selftests
2020-02-05	wireguard: selftests: tie socket waiting to target pid	Jason A. Donenfeld
	Without this, we wind up proceeding too early sometimes when the previous process has just used the same listening port. So, we tie the listening socket query to the specific pid we're interested in. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-05	wireguard: selftests: cleanup CONFIG_ENABLE_WARN_DEPRECATED	Krzysztof Kozlowski
	CONFIG_ENABLE_WARN_DEPRECATED is gone since commit 771c035372a0 ("deprecate the '__deprecated' attribute warnings entirely and for good"). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-05	wireguard: selftests: ensure non-addition of peers with failed precomputation	Jason A. Donenfeld
	Ensure that peers with low order points are ignored, both in the case where we already have a device private key and in the case where we do not. This adds points that naturally give a zero output. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>