summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-10mm/damon/vaddr: remove an unnecessary warning messageSeongJae Park
The DAMON virtual address space monitoring primitive prints a warning message for wrong DAMOS action. However, it is not essential as the code returns appropriate failure in the case. This commit removes the message to make the log clean. Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org Fixes: 6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: remove unnecessary error messagesSeongJae Park
DAMON core prints error messages when damon_target object creation is failed or wrong monitoring attributes are given. Because appropriate error code is returned for each case, the messages are not essential. Also, because the code path can be triggered with user-specified input, this could result in kernel log mistakenly being messy. To avoid the case, this commit removes the messages. Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Fixes: b9a6ac4e4ede ("mm/damon: adaptively adjust regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: kernel test robot <lkp@intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/dbgfs: remove an unnecessary error messageSeongJae Park
When wrong scheme action is requested via the debugfs interface, DAMON prints an error message. Because the function returns error code, this is not really needed. Because the code path is triggered by the user specified input, this can result in kernel log mistakenly being messy. To avoid the case, this commit removes the message. Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org Fixes: af122dd8f3c0 ("mm/damon/dbgfs: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: use better timer mechanisms selection thresholdSeongJae Park
Patch series "mm/damon: Trivial fixups and improvements". This patchset contains trivial fixups and improvements for DAMON and its kunit/kselftest tests. This patch (of 11): DAMON is using hrtimer if requested sleep time is <=100ms, while the suggested threshold[1] is <=20ms. This commit applies the threshold. [1] Documentation/timers/timers-howto.rst Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org Fixes: ee801b7dd7822 ("mm/damon/schemes: activate schemes based on a watermarks mechanism") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mm/damon/core: fix fake load reports due to uninterruptible sleepsSeongJae Park
Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake load while DAMON is turned on, though it is doing nothing. This can confuse users[1]. To avoid the case, this commit makes DAMON sleeps in idle mode. [1] https://lore.kernel.org/all/11868371.O9o76ZdvQC@natalenko.name/ Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org Fixes: 2224d8485492 ("mm: introduce Data Access MONitor (DAMON)") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10timers: implement usleep_idle_range()SeongJae Park
Patch series "mm/damon: Fix fake /proc/loadavg reports", v3. This patchset fixes DAMON's fake load report issue. The first patch makes yet another variant of usleep_range() for this fix, and the second patch fixes the issue of DAMON by making it using the newly introduced function. This patch (of 2): Some kernel threads such as DAMON could need to repeatedly sleep in micro seconds level. Because usleep_range() sleeps in uninterruptible state, however, such threads would make /proc/loadavg reports fake load. To help such cases, this commit implements a variant of usleep_range() called usleep_idle_range(). It is same to usleep_range() but sets the state of the current task as TASK_IDLE while sleeping. Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10filemap: remove PageHWPoison check from next_uptodate_page()Matthew Wilcox (Oracle)
Pages are individually marked as suffering from hardware poisoning. Checking that the head page is not hardware poisoned doesn't make sense; we might be after a subpage. We check each page individually before we use it, so this was an optimisation gone wrong. It will cause us to fall back to the slow path when there was no need to do that Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Yang Shi <shy828301@gmail.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10mailmap: update email address for Guo RenGuo Ren
The ren_guo@c-sky.com would be deprecated and use guoren@kernel.org as the main email address. Link: https://lkml.kernel.org/r/20211123022741.545541-1-guoren@kernel.org Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10MAINTAINERS: update kdump maintainersDave Young
Remove myself from kdump maintainers as I have no enough time to maintain it now. But I can review patches on demand though. Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10Increase default MLOCK_LIMIT to 8 MiBDrew DeVault
This limit has not been updated since 2008, when it was increased to 64 KiB at the request of GnuPG. Until recently, the main use-cases for this feature were (1) preventing sensitive memory from being swapped, as in GnuPG's use-case; and (2) real-time use-cases. In the first case, little memory is called for, and in the second case, the user is generally in a position to increase it if they need more. The introduction of IOURING_REGISTER_BUFFERS adds a third use-case: preparing fixed buffers for high-performance I/O. This use-case will take as much of this memory as it can get, but is still limited to 64 KiB by default, which is very little. This increases the limit to 8 MB, which was chosen fairly arbitrarily as a more generous, but still conservative, default value. It is also possible to raise this limit in userspace. This is easily done, for example, in the use-case of a network daemon: systemd, for instance, provides for this via LimitMEMLOCK in the service file; OpenRC via the rc_ulimit variables. However, there is no established userspace facility for configuring this outside of daemons: end-user applications do not presently have access to a convenient means of raising their limits. The buck, as it were, stops with the kernel. It's much easier to address it here than it is to bring it to hundreds of distributions, and it can only realistically be relied upon to be high-enough by end-user software if it is more-or-less ubiquitous. Most distros don't change this particular rlimit from the kernel-supplied default value, so a change here will easily provide that ubiquity. Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com Signed-off-by: Drew DeVault <sir@cmpwn.com> Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Cyril Hrubis <chrubis@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Andrew Dona-Couch <andrew@donacou.ch> Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10Merge tag 'thermal-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix the definition of one of the Tiger Lake MMIO registers in the int340x thermal driver (Sumeet Pawnikar)" * tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL
2021-12-10Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Andrii Nakryiko says: ==================== bpf-next 2021-12-10 v2 We've added 115 non-merge commits during the last 26 day(s) which contain a total of 182 files changed, 5747 insertions(+), 2564 deletions(-). The main changes are: 1) Various samples fixes, from Alexander Lobakin. 2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov. 3) A batch of new unified APIs for libbpf, logging improvements, version querying, etc. Also a batch of old deprecations for old APIs and various bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko. 4) BPF documentation reorganization and improvements, from Christoph Hellwig and Dave Tucker. 5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in libbpf, from Hengqi Chen. 6) Verifier log fixes, from Hou Tao. 7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong. 8) Extend branch record capturing to all platforms that support it, from Kajol Jain. 9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi. 10) bpftool doc-generating script improvements, from Quentin Monnet. 11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet. 12) Deprecation warning fix for perf/bpf_counter, from Song Liu. 13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf, from Tiezhu Yang. 14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song. 15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun, and others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits) libbpf: Add "bool skipped" to struct bpf_map libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition bpftool: Switch bpf_object__load_xattr() to bpf_object__load() selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr() selftests/bpf: Add test for libbpf's custom log_buf behavior selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load() libbpf: Deprecate bpf_object__load_xattr() libbpf: Add per-program log buffer setter and getter libbpf: Preserve kernel error code and remove kprobe prog type guessing libbpf: Improve logging around BPF program loading libbpf: Allow passing user log setting through bpf_object_open_opts libbpf: Allow passing preallocated log_buf when loading BTF into kernel libbpf: Add OPTS-based bpf_btf_load() API libbpf: Fix bpf_prog_load() log_buf logic for log_level 0 samples/bpf: Remove unneeded variable bpf: Remove redundant assignment to pointer t selftests/bpf: Fix a compilation warning perf/bpf_counter: Use bpf_map_create instead of bpf_create_map samples: bpf: Fix 'unknown warning group' build warning on Clang samples: bpf: Fix xdp_sample_user.o linking with Clang ... ==================== Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10libbpf: Add "bool skipped" to struct bpf_mapShuyi Cheng
Fix error: "failed to pin map: Bad file descriptor, path: /sys/fs/bpf/_rodata_str1_1." In the old kernel, the global data map will not be created, see [0]. So we should skip the pinning of the global data map to avoid bpf_object__pin_maps returning error. Therefore, when the map is not created, we mark “map->skipped" as true and then check during relocation and during pinning. Fixes: 16e0c35c6f7a ("libbpf: Load global data maps lazily on legacy kernels") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-10libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definitionVincent Minet
The btf__dedup_deprecated name was misspelled in the definition of the compat symbol for btf__dedup. This leads it to be missing from the shared library. This fixes it. Fixes: 957d350a8b94 ("libbpf: Turn btf_dedup_opts into OPTS-based struct") Signed-off-by: Vincent Minet <vincent@vincent-minet.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211210063112.80047-1-vincent@vincent-minet.net
2021-12-10Merge branch 'Enhance and rework logging controls in libbpf'Alexei Starovoitov
Andrii Nakryiko says: ==================== Add new open options and per-program setters to control BTF and program loading log verboseness and allow providing custom log buffers to capture logs of interest. Note how custom log_buf and log_level are orthogonal, which matches previous (alas less customizable) behavior of libbpf, even though it sort of worked by accident: if someone specified log_level = 1 in bpf_object__load_xattr(), first attempt to load any BPF program resulted in wasted bpf() syscall with -EINVAL due to !!log_buf != !!log_level. Then on retry libbpf would allocated log_buffer and try again, after which prog loading would succeed and libbpf would print verbose program loading log through its print callback. This behavior is now documented and made more efficient, not wasting unnecessary syscall. But additionally, log_level can be controlled globally on a per-bpf_object level through bpf_object_open_opts, as well as on a per-program basis with bpf_program__set_log_buf() and bpf_program__set_log_level() APIs. Now that we have a more future-proof way to set log_level, deprecate bpf_object__load_xattr(). v2->v3: - added log_buf selftests for bpf_prog_load() and bpf_btf_load(); - fix !log_buf in bpf_prog_load (John); - fix log_level==0 in bpf_btf_load (thanks selftest!); v1->v2: - fix log_level == 0 handling of bpf_prog_load, add as patch #1 (Alexei); - add comments explaining log_buf_size overflow prevention (Alexei). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-10bpftool: Switch bpf_object__load_xattr() to bpf_object__load()Andrii Nakryiko
Switch all the uses of to-be-deprecated bpf_object__load_xattr() into a simple bpf_object__load() calls with optional log_level passed through open_opts.kernel_log_level, if -d option is specified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-13-andrii@kernel.org
2021-12-10selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()Andrii Nakryiko
Switch from bpf_object__load_xattr() to bpf_object__load() and kernel_log_level in bpf_object_open_opts. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-12-andrii@kernel.org
2021-12-10selftests/bpf: Add test for libbpf's custom log_buf behaviorAndrii Nakryiko
Add a selftest that validates that per-program and per-object log_buf overrides work as expected. Also test same logic for low-level bpf_prog_load() and bpf_btf_load() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-11-andrii@kernel.org
2021-12-10selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()Andrii Nakryiko
Switch all selftests uses of to-be-deprecated bpf_load_btf() with equivalent bpf_btf_load() calls. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-10-andrii@kernel.org
2021-12-10libbpf: Deprecate bpf_object__load_xattr()Andrii Nakryiko
Deprecate non-extensible bpf_object__load_xattr() in v0.8 ([0]). With log_level control through bpf_object_open_opts or bpf_program__set_log_level(), we are finally at the point where bpf_object__load_xattr() doesn't provide any functionality that can't be accessed through other (better) ways. The other feature, target_btf_path, is also controllable through bpf_object_open_opts. [0] Closes: https://github.com/libbpf/libbpf/issues/289 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-9-andrii@kernel.org
2021-12-10libbpf: Add per-program log buffer setter and getterAndrii Nakryiko
Allow to set user-provided log buffer on a per-program basis ([0]). This gives great deal of flexibility in terms of which programs are loaded with logging enabled and where corresponding logs go. Log buffer set with bpf_program__set_log_buf() overrides kernel_log_buf and kernel_log_size settings set at bpf_object open time through bpf_object_open_opts, if any. Adjust bpf_object_load_prog_instance() logic to not perform own log buf allocation and load retry if custom log buffer is provided by the user. [0] Closes: https://github.com/libbpf/libbpf/issues/418 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-8-andrii@kernel.org
2021-12-10libbpf: Preserve kernel error code and remove kprobe prog type guessingAndrii Nakryiko
Instead of rewriting error code returned by the kernel of prog load with libbpf-sepcific variants pass through the original error. There is now also no need to have a backup generic -LIBBPF_ERRNO__LOAD fallback error as bpf_prog_load() guarantees that errno will be properly set no matter what. Also drop a completely outdated and pretty useless BPF_PROG_TYPE_KPROBE guess logic. It's not necessary and neither it's helpful in modern BPF applications. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-7-andrii@kernel.org
2021-12-10libbpf: Improve logging around BPF program loadingAndrii Nakryiko
Add missing "prog '%s': " prefixes in few places and use consistently markers for beginning and end of program load logs. Here's an example of log output: libbpf: prog 'handler': BPF program load failed: Permission denied libbpf: -- BEGIN PROG LOAD LOG --- arg#0 reference type('UNKNOWN ') size cannot be determined: -22 ; out1 = in1; 0: (18) r1 = 0xffffc9000cdcc000 2: (61) r1 = *(u32 *)(r1 +0) ... 81: (63) *(u32 *)(r4 +0) = r5 R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0) invalid access to map value, value_size=16 off=400 size=4 R4 min value is outside of the allowed memory range processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG -- libbpf: failed to load program 'handler' libbpf: failed to load object 'test_skeleton' The entire verifier log, including BEGIN and END markers are now always youtput during a single print callback call. This should make it much easier to post-process or parse it, if necessary. It's not an explicit API guarantee, but it can be reasonably expected to stay like that. Also __bpf_object__open is renamed to bpf_object_open() as it's always an adventure to find the exact function that implements bpf_object's open phase, so drop the double underscored and use internal libbpf naming convention. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org
2021-12-10libbpf: Allow passing user log setting through bpf_object_open_optsAndrii Nakryiko
Allow users to provide their own custom log_buf, log_size, and log_level at bpf_object level through bpf_object_open_opts. This log_buf will be used during BTF loading. Subsequent patch will use same log_buf during BPF program loading, unless overriden at per-bpf_program level. When such custom log_buf is provided, libbpf won't be attempting retrying loading of BTF to try to provide its own log buffer to capture kernel's error log output. User is responsible to provide big enough buffer, otherwise they run a risk of getting -ENOSPC error from the bpf() syscall. See also comments in bpf_object_open_opts regarding log_level and log_buf interactions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-5-andrii@kernel.org
2021-12-10libbpf: Allow passing preallocated log_buf when loading BTF into kernelAndrii Nakryiko
Add libbpf-internal btf_load_into_kernel() that allows to pass preallocated log_buf and custom log_level to be passed into kernel during BPF_BTF_LOAD call. When custom log_buf is provided, btf_load_into_kernel() won't attempt an retry with automatically allocated internal temporary buffer to capture BTF validation log. It's important to note the relation between log_buf and log_level, which slightly deviates from stricter kernel logic. From kernel's POV, if log_buf is specified, log_level has to be > 0, and vice versa. While kernel has good reasons to request such "sanity, this, in practice, is a bit unconvenient and restrictive for libbpf's high-level bpf_object APIs. So libbpf will allow to set non-NULL log_buf and log_level == 0. This is fine and means to attempt to load BTF without logging requested, but if it failes, retry the load with custom log_buf and log_level 1. Similar logic will be implemented for program loading. In practice this means that users can provide custom log buffer just in case error happens, but not really request slower verbose logging all the time. This is also consistent with libbpf behavior when custom log_buf is not set: libbpf first tries to load everything with log_level=0, and only if error happens allocates internal log buffer and retries with log_level=1. Also, while at it, make BTF validation log more obvious and follow the log pattern libbpf is using for dumping BPF verifier log during BPF_PROG_LOAD. BTF loading resulting in an error will look like this: libbpf: BTF loading error: -22 libbpf: -- BEGIN BTF LOAD LOG --- magic: 0xeb9f version: 1 flags: 0x0 hdr_len: 24 type_off: 0 type_len: 1040 str_off: 1040 str_len: 2063598257 btf_total_size: 1753 Total section length too long -- END BTF LOAD LOG -- libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring. This makes it much easier to find relevant parts in libbpf log output. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-4-andrii@kernel.org
2021-12-10libbpf: Add OPTS-based bpf_btf_load() APIAndrii Nakryiko
Similar to previous bpf_prog_load() and bpf_map_create() APIs, add bpf_btf_load() API which is taking optional OPTS struct. Schedule bpf_load_btf() for deprecation in v0.8 ([0]). This makes naming consistent with BPF_BTF_LOAD command, sets up an API for extensibility in the future, moves options parameters (log-related fields) into optional options, and also allows to pass log_level directly. It also removes log buffer auto-allocation logic from low-level API (consistent with bpf_prog_load() behavior), but preserves a special treatment of log_level == 0 with non-NULL log_buf, which matches low-level bpf_prog_load() and high-level libbpf APIs for BTF and program loading behaviors. [0] Closes: https://github.com/libbpf/libbpf/issues/419 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-3-andrii@kernel.org
2021-12-10libbpf: Fix bpf_prog_load() log_buf logic for log_level 0Andrii Nakryiko
To unify libbpf APIs behavior w.r.t. log_buf and log_level, fix bpf_prog_load() to follow the same logic as bpf_btf_load() and high-level bpf_object__load() API will follow in the subsequent patches: - if log_level is 0 and non-NULL log_buf is provided by a user, attempt load operation initially with no log_buf and log_level set; - if successful, we are done, return new FD; - on error, retry the load operation with log_level bumped to 1 and log_buf set; this way verbose logging will be requested only when we are sure that there is a failure, but will be fast in the common/expected success case. Of course, user can still specify log_level > 0 from the very beginning to force log collection. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-2-andrii@kernel.org
2021-12-10Merge tag 'acpi-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Create the output directory for the ACPI tools during build if it has not been present before and prevent the compilation from failing in that case (Chen Yu)" * tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: tools: Fix compilation when output directory is not present
2021-12-10Merge tag 'pm-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a kernedoc comment that doesn't match the behavior of the function documented by it" * tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Fix pm_runtime_active() kerneldoc comment
2021-12-10Merge tag 'hwmon-for-v5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - In the pwm-fan driver, ensure that the internal pwm state matches the state assumed by the pwm code. - Avoid EREMOTEIO errors in sht4 driver - In the nct6775 driver, make it explicit that the register value passed to nct6775_asuswmi_read() is an 8-bit value - Avoid WARNing in dell-smm driver removal after failing to create /proc/i8k - Stop using a plain integer as NULL pointer in corsair-psu driver * tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pwm-fan) Ensure the fan going on in .probe() hwmon: (sht4x) Fix EREMOTEIO errors hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value() hwmon: (dell-smm) Fix warning on /proc/i8k creation error hwmon: (corsair-psu) fix plain integer used as NULL pointer
2021-12-10Merge tag 'trace-v5.16-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Tracing, ftrace and tracefs fixes: - Have tracefs honor the gid mount option - Have new files in tracefs inherit the parent ownership - Have direct_ops unregister when it has no more functions - Properly clean up the ops when unregistering multi direct ops - Add a sample module to test the multiple direct ops - Fix memory leak in error path of __create_synth_event()" * tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible memory leak in __create_synth_event() error path ftrace/samples: Add module to test multi direct modify interface ftrace: Add cleanup to unregister_ftrace_direct_multi ftrace: Use direct_ops hash in unregister_ftrace_direct tracefs: Set all files to the same group ownership as the mount option tracefs: Have new files inherit the ownership of their parent
2021-12-10Merge tag 'aio-poll-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull aio poll fixes from Eric Biggers: "Fix three bugs in aio poll, and one issue with POLLFREE more broadly: - aio poll didn't handle POLLFREE, causing a use-after-free. - aio poll could block while the file is ready. - aio poll called eventfd_signal() when it isn't allowed. - POLLFREE didn't handle multiple exclusive waiters correctly. This has been tested with the libaio test suite, as well as with test programs I wrote that reproduce the first two bugs. I am sending this pull request myself as no one seems to be maintaining this code" * tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: aio: Fix incorrect usage of eventfd_signal_allowed() aio: fix use-after-free due to missing POLLFREE handling aio: keep poll requests on waitqueue until completed signalfd: use wake_up_pollfree() binder: use wake_up_pollfree() wait: add wake_up_pollfree()
2021-12-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "More x86 fixes: - Logic bugs in CR0 writes and Hyper-V hypercalls - Don't use Enlightened MSR Bitmap for L3 - Remove user-triggerable WARN Plus a few selftest fixes and a regression test for the user-triggerable WARN" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode selftests: KVM: avoid failures due to reserved HyperTransport region KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
2021-12-10i2c: mpc: Use atomic read and fix break conditionChris Packham
Maxime points out that the polling code in mpc_i2c_isr should use the _atomic API because it is called in an irq context and that the behaviour of the MCF bit is that it is 1 when the byte transfer is complete. All of this means the original code was effectively a udelay(100). Fix this by using readb_poll_timeout_atomic() and removing the negation of the break condition. Fixes: 4a8ac5e45cda ("i2c: mpc: Poll for MCF") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-10io-wq: check for wq exit after adding new worker task_workJens Axboe
We check IO_WQ_BIT_EXIT before attempting to create a new worker, and wq exit cancels pending work if we have any. But it's possible to have a race between the two, where creation checks exit finding it not set, but we're in the process of exiting. The exit side will cancel pending creation task_work, but there's a gap where we add task_work after we've canceled existing creations at exit time. Fix this by checking the EXIT bit post adding the creation task_work. If it's set, run the same cancelation that exit does. Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com Reviewed-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10io_uring: ensure task_work gets run as part of cancelationsJens Axboe
If we successfully cancel a work item but that work item needs to be processed through task_work, then we can be sleeping uninterruptibly in io_uring_cancel_generic() and never process it. Hence we don't make forward progress and we end up with an uninterruptible sleep warning. While in there, correct a comment that should be IFF, not IIF. Reported-and-tested-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10Merge tag 'pci-v5.16-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Revert emulation of Marvell Armada A3720 expansion ROM because it doesn't work as expected (Marek Behún) - Assert PERST# in Apple M1 driver to fix initialization when booting from bootloaders using PCIe, such as U-Boot (Marc Zyngier) - Describe PERST# as active low in Apple T8103 DT and update driver to match (Marc Zyngier) * tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: apple: Fix PERST# polarity arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DT PCI: apple: Follow the PCIe specifications when resetting the port Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"
2021-12-10Merge tag 'mmc-v5.16-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - mtk-sd: Fix memory leak during tuning - renesas_sdhi: Initialize variable properly when tuning * tag 'mmc-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mediatek: free the ext_csd when mmc_get_ext_csd success mmc: renesas_sdhi: initialize variable properly when tuning
2021-12-10Merge tag 'libata-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull libata fixes from Damien Le Moal: - Fix a sparse warning in the ahci_ceva driver (me) - Disable the ASMedia 1092 non-functional device (Hannes) * tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: libata: add horkage for ASMedia 1092 ata: ahci_ceva: Fix id array access in ceva_ahci_read_id()
2021-12-10Merge tag 'sound-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Another collection of small fixes. It's still not quite calm yet, but nothing looks scary. ALSA core got a few fixes for covering the issues detected by fuzzer and the 32bit compat problem of control API, while the rest are all device-specific small fixes, including the continued fixes for Tegra" * tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits) ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform ALSA: usb-audio: Reorder snd_djm_devices[] entries ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1 ALSA: ctl: Fix copy of updated id with element read/write ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*() ALSA: pcm: oss: Limit the period size to 16MB ALSA: pcm: oss: Fix negative period/buffer sizes ASoC: codecs: wsa881x: fix return values from kcontrol put ASoC: codecs: wcd934x: return correct value from mixer put ASoC: codecs: wcd934x: handle channel mappping list correctly ASoC: qdsp6: q6routing: Fix return value from msm_routing_put_audio_mixer ASoC: SOF: Intel: Retry codec probing if it fails ASoC: amd: fix uninitialized variable in snd_acp6x_probe() ASoC: rockchip: i2s_tdm: Dup static DAI template ASoC: rt5682s: Fix crash due to out of scope stack vars ASoC: rt5682: Fix crash due to out of scope stack vars ASoC: tegra: Use normal system sleep for ADX ASoC: tegra: Use normal system sleep for AMX ASoC: tegra: Use normal system sleep for Mixer ASoC: tegra: Use normal system sleep for MVC ...
2021-12-10Merge tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular fixes, pretty small overall, couple of core fixes, two i915 and two amdgpu, hopefully it stays this quiet. ttm: - fix ttm_bo_swapout syncobj: - fix fence find bug with signalled fences i915: - fix error pointer deref in gem execbuffer - fix for GT init with GuC/HuC on ICL amdgpu: - DPIA fix - eDP fix" * tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm: drm/i915/gen11: Moving WAs to icl_gt_workarounds_init() drm/amd/display: prevent reading unitialized links drm/amd/display: Fix DPIA outbox timeout after S3/S4/reset drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer() drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence. drm/ttm: fix ttm_bo_swapout
2021-12-10selftests: mptcp: remove duplicate include in mptcp_inq.cYe Guojin
'sys/ioctl.h' included in 'mptcp_inq.c' is duplicated. Reported-by: ZealRobot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20211210071424.425773-1-ye.guojin@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"Jens Axboe
This reverts commit 776b54e97a7d993ba23696e032426d5dea5bbe70. Looks like a last minute edit snuck into this patch, and as a result, it doesn't even compile. Revert the change for now. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)Davidlohr Bueso
do_each_pid_thread(PIDTYPE_PGID) can race with a concurrent change_pid(PIDTYPE_PGID) that can move the task from one hlist to another while iterating. Serialize ioprio_get to take the tasklist_lock in this case, just like it's set counterpart. Fixes: d69b78ba1de (ioprio: grab rcu_read_lock in sys_ioprio_{set,get}()) Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lore.kernel.org/r/20211210182058.43417-1-dave@stgolabs.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-10Merge branch 'md-fixes' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.16 Pull MD fixes from Song. * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: fix double free of mddev->private in autorun_array() md: fix update super 1.0 on rdev size change
2021-12-10selftests/bpf: Tests for state pruning with u32 spill/fillPaul Chaignon
This patch adds tests for the verifier's tracking for spilled, <8B registers. The first two test cases ensure the verifier doesn't incorrectly prune states in case of <8B spill/fills. The last one simply checks that a filled u64 register is marked unknown if the register spilled in the same slack slot was less than 8B. The map value access at the end of the first program is only incorrect for the path R6=32. If the precision bit for register R8 isn't backtracked through the u32 spill/fill, the R6=32 path is pruned at instruction 9 and the program is incorrectly accepted. The second program is a variation of the same with u32 spills and a u64 fill. The additional instructions to introduce the first pruning point may be a bit fragile as they depend on the heuristics for pruning points in the verifier (currently at least 8 instructions and 2 jumps). If the heuristics are changed, the pruning point may move (e.g., to the subsequent jump) or disappear, which would cause the test to always pass. Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-10bpf: Fix incorrect state pruning for <8B spill/fillPaul Chaignon
Commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") introduced support in the verifier to track <8B spill/fills of scalars. The backtracking logic for the precision bit was however skipping spill/fills of less than 8B. That could cause state pruning to consider two states equivalent when they shouldn't be. As an example, consider the following bytecode snippet: 0: r7 = r1 1: call bpf_get_prandom_u32 2: r6 = 2 3: if r0 == 0 goto pc+1 4: r6 = 3 ... 8: [state pruning point] ... /* u32 spill/fill */ 10: *(u32 *)(r10 - 8) = r6 11: r8 = *(u32 *)(r10 - 8) 12: r0 = 0 13: if r8 == 3 goto pc+1 14: r0 = 1 15: exit The verifier first walks the path with R6=3. Given the support for <8B spill/fills, at instruction 13, it knows the condition is true and skips instruction 14. At that point, the backtracking logic kicks in but stops at the fill instruction since it only propagates the precision bit for 8B spill/fill. When the verifier then walks the path with R6=2, it will consider it safe at instruction 8 because R6 is not marked as needing precision. Instruction 14 is thus never walked and is then incorrectly removed as 'dead code'. It's also possible to lead the verifier to accept e.g. an out-of-bound memory access instead of causing an incorrect dead code elimination. This regression was found via Cilium's bpf-next CI where it was causing a conntrack map update to be silently skipped because the code had been removed by the verifier. This commit fixes it by enabling support for <8B spill/fills in the bactracking logic. In case of a <8B spill/fill, the full 8B stack slot will be marked as needing precision. Then, in __mark_chain_precision, any tracked register spilled in a marked slot will itself be marked as needing precision, regardless of the spill size. This logic makes two assumptions: (1) only 8B-aligned spill/fill are tracked and (2) spilled registers are only tracked if the spill and fill sizes are equal. Commit ef979017b837 ("bpf: selftest: Add verifier tests for <8-byte scalar spill and refill") covers the first assumption and the next commit in this patchset covers the second. Fixes: 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-10md: fix double free of mddev->private in autorun_array()zhangyue
In driver/md/md.c, if the function autorun_array() is called, the problem of double free may occur. In function autorun_array(), when the function do_md_run() returns an error, the function do_md_stop() will be called. The function do_md_run() called function md_run(), but in function md_run(), the pointer mddev->private may be freed. The function do_md_stop() called the function __md_stop(), but in function __md_stop(), the pointer mddev->private also will be freed without judging null. At this time, the pointer mddev->private will be double free, so it needs to be judged null or not. Signed-off-by: zhangyue <zhangyue1@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2021-12-10md: fix update super 1.0 on rdev size changeMarkus Hochholdinger
The superblock of version 1.0 doesn't get moved to the new position on a device size change. This leads to a rdev without a superblock on a known position, the raid can't be re-assembled. The line was removed by mistake and is re-added by this patch. Fixes: d9c0fa509eaf ("md: fix max sectors calculation for super 1.0") Cc: stable@vger.kernel.org Signed-off-by: Markus Hochholdinger <markus@hochholdinger.net> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2021-12-10nfsd: fix use-after-free due to delegation raceJ. Bruce Fields
A delegation break could arrive as soon as we've called vfs_setlease. A delegation break runs a callback which immediately (in nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we then exit nfs4_set_delegation without hashing the delegation, it will be freed as soon as the callback is done with it, without ever being removed from del_recall_lru. Symptoms show up later as use-after-free or list corruption warnings, usually in the laundromat thread. I suspect aba2072f4523 "nfsd: grant read delegations to clients holding writes" made this bug easier to hit, but I looked as far back as v3.0 and it looks to me it already had the same problem. So I'm not sure where the bug was introduced; it may have been there from the beginning. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>