Age | Commit message | Author
2021-12-10 | sock: Use sock_owned_by_user_nocheck() instead of sk_lock.owned. | Kuniyuki Iwashima
This patch moves sock_release_ownership() down in include/net/sock.h and replaces some sk_lock.owned tests with sock_owned_by_user_nocheck(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/r/20211208062158.54132-1-kuniyu@amazon.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 | Merge tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux | Linus Torvalds
Pull btrfs fixes from David Sterba: "A few more regression fixes and stable patches, mostly one-liners.
Regression fixes:
- fix pointer/ERR_PTR mismatch returned from memdup_user
- reset dedicated zoned mode relocation block group to avoid using it and filling it without any recourse
Fixes:
- handle a case to FITRIM range (also to make fstests/generic/260 work)
- fix warning when extent buffer state and pages get out of sync after an IO error
- fix transaction abort when syncing due to missing mapping error set on metadata inode after inlining a compressed file
- fix transaction abort due to tree-log and zoned mode interacting in an unexpected way
- fix memory leak of additional extent data when qgroup reservation fails
- do proper handling of slot search call when deleting root refs"
* tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling btrfs: zoned: clear data relocation bg on zone finish btrfs: free exchange changeset on failures btrfs: fix re-dirty process of tree-log nodes btrfs: call mapping_set_error() on btree inode with a write error btrfs: clear extent buffer uptodate when we fail to write it btrfs: fail if fstrim_range->start == U64_MAX btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()
2021-12-10 | Merge tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull cifs fixes from Steve French: "Two cifs/smb3 fixes - one for stable, the other fixes a recently reported NTLMSSP auth problem" * tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix ntlmssp auth when there is no key exchange cifs: Fix crash on unload of cifs_arc4.ko
2021-12-10 | Merge tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
Pull nfsd fixes from Bruce Fields: "Fix a race on startup and another in the delegation code. The latter has been around for years, but I suspect recent changes may have widened the race window a little, so I'd like to go ahead and get it in" * tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux: nfsd: fix use-after-free due to delegation race nfsd: Fix nsfd startup race (again)
2021-12-10 | mm: bdi: initialize bdi_min_ratio when bdi is unregistered | Manjong Lee
Initialize min_ratio if it is set during bdi unregistration. This can prevent problems that may occur when a bdi is removed without resetting min_ratio. For example:
1) insert external sdcard
2) set external sdcard's min_ratio 70
3) remove external sdcard without setting min_ratio 0
4) insert external sdcard
5) set external sdcard's min_ratio 70 << error occurs (can't set)
This happens because when an sdcard is removed, the present bdi_min_ratio value will remain. Currently, the only way to reset bdi_min_ratio is to reboot. [akpm@linux-foundation.org: tweak comment and coding style]
Link: https://lkml.kernel.org/r/20211021161942.5983-1-mj0123.lee@samsung.com Signed-off-by: Manjong Lee <mj0123.lee@samsung.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Changheun Lee <nanich.lee@samsung.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <seunghwan.hyun@samsung.com> Cc: <sookwan7.kim@samsung.com> Cc: <yt0928.kim@samsung.com> Cc: <junho89.kim@samsung.com> Cc: <jisoo2146.oh@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
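The failure sequence in this commit message can be reproduced with a small userspace model. This is only a sketch, not the kernel code: `struct bdi_model`, `total_min_ratio`, `model_set_min_ratio()` and `model_unregister()` are hypothetical names, and the model assumes that the sum of all per-device min_ratio values is tracked in one global that must stay below 100.

```c
#include <stdbool.h>

/* Hypothetical model: global sum of every device's min_ratio (percent). */
unsigned int total_min_ratio;

struct bdi_model {
	unsigned int min_ratio;	/* this device's share of the global sum */
};

/* Returns 0 on success, -1 if the global sum would reach 100. */
int model_set_min_ratio(struct bdi_model *bdi, unsigned int ratio)
{
	unsigned int others = total_min_ratio - bdi->min_ratio;

	if (others + ratio >= 100)
		return -1;
	total_min_ratio = others + ratio;
	bdi->min_ratio = ratio;
	return 0;
}

/* reset_on_unregister == true models the fix: give the share back. */
void model_unregister(struct bdi_model *bdi, bool reset_on_unregister)
{
	if (reset_on_unregister) {
		total_min_ratio -= bdi->min_ratio;
		bdi->min_ratio = 0;
	}
}
```

Without the reset, the removed device's 70 stays in the global sum, so setting 70 on the re-inserted device is rejected; with the reset, the second request succeeds.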
2021-12-10 | hugetlbfs: fix issue of preallocation of gigantic pages can't work | Zhenguo Yao
Preallocation of gigantic pages can't work because of commit b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation"). When nid is NUMA_NO_NODE(-1), alloc_bootmem_huge_page will always return without doing allocation. Fix this by adding more checks. Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com Fixes: b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation") Signed-off-by: Zhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock() | Waiman Long
All the calls to mod_objcg_mlstate(), get_obj_stock() and put_obj_stock() are done by functions defined within the same "#ifdef CONFIG_MEMCG_KMEM" compilation block. When CONFIG_MEMCG_KMEM isn't defined, the following compilation warnings will be issued [1] and [2]. mm/memcontrol.c:785:20: warning: unused function 'mod_objcg_mlstate' mm/memcontrol.c:2113:33: warning: unused function 'get_obj_stock' Fix these warnings by moving those functions under the same CONFIG_MEMCG_KMEM compilation block. There is no functional change. [1] https://lore.kernel.org/lkml/202111272014.WOYNLUV6-lkp@intel.com/ [2] https://lore.kernel.org/lkml/202111280551.LXsWYt1T-lkp@intel.com/ Link: https://lkml.kernel.org/r/20211129161140.306488-1-longman@redhat.com Fixes: 559271146efc ("mm/memcg: optimize user context object stock access") Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp") Signed-off-by: Waiman Long <longman@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/slub: fix endianness bug for alloc/free_traces attributes | Gerald Schaefer
On big-endian s390, the alloc/free_traces attributes produce endless output, because of always 0 idx in slab_debugfs_show(). idx is de-referenced from *v, which points to a loff_t value, with unsigned int idx = *(unsigned int *)v; This will only give the upper 32 bits on big-endian, which remain 0. Instead of only fixing this de-reference, during discussion it seemed more appropriate to change the seq_ops so that they use an explicit iterator in private loc_track struct. This patch adds idx to loc_track, which will also fix the endianness bug. Link: https://lore.kernel.org/r/20211117193932.4049412-1-gerald.schaefer@linux.ibm.com Link: https://lkml.kernel.org/r/20211126171848.17534-1-gerald.schaefer@linux.ibm.com Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs") Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reported-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Faiyaz Mohammed <faiyazm@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
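Why the cast always yields 0 on big-endian can be shown without a big-endian machine. The sketch below is illustrative only: `store_be64()` and `first_word_be()` are hypothetical helpers that model the byte layout of a 64-bit value on a big-endian host and what reading its first four bytes as a 32-bit integer returns there.

```c
#include <stdint.h>

/* Store a 64-bit value into buf using big-endian byte order,
 * i.e. the in-memory layout of a loff_t on a big-endian host. */
void store_be64(unsigned char buf[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		buf[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Interpret the first four bytes as a big-endian 32-bit value. This
 * models what *(unsigned int *)v returns on a big-endian host when
 * v actually points to a 64-bit value. */
uint32_t first_word_be(const unsigned char buf[8])
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

For any position below 2^32 the first word is 0, so an index read this way never advances, which matches the endless output described in the commit message.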
2021-12-10 | selftests/damon: split test cases | SeongJae Park
Currently, the single test program, debugfs.sh, contains all test cases for DAMON. When one of the cases fails, finding which case failed from the test log is not so easy, and all remaining tests will be skipped. To improve the situation, this commit splits the single program into small test programs having their own names. Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | selftests/damon: test debugfs file reads/writes with huge count | SeongJae Park
DAMON debugfs interface users were able to trigger warning by writing some files with arbitrarily large 'count' parameter. The issue is fixed with commit db7a347b26fe ("mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation"). This commit adds a test case for the issue in DAMON selftests to avoid future regressions. Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | selftests/damon: test wrong DAMOS condition ranges input | SeongJae Park
A patch titled "mm/damon/schemes: add the validity judgment of thresholds"[1] makes the DAMON debugfs interface validate DAMON scheme inputs. This commit adds a test case for the validation logic in DAMON selftests. [1] https://lore.kernel.org/linux-mm/d78360e52158d786fcbf20bc62c96785742e76d3.1637239568.git.xhao@linux.alibaba.com/ Link: https://lkml.kernel.org/r/20211201150440.1088-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | selftests/damon: test DAMON enabling with empty target_ids case | SeongJae Park
DAMON debugfs didn't check empty targets when starting monitoring, and the issue is fixed with commit b5ca3e83ddb0 ("mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on"). To avoid future regression, this commit adds a test case for that in DAMON selftests. Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | selftests/damon: skip test if DAMON is running | SeongJae Park
Testing the DAMON debugfs files while DAMON is running makes no sense, as any write to the debugfs files will fail. This commit makes the test be skipped in this case. Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/vaddr-test: remove unnecessary variables | SeongJae Park
A couple of test functions in the DAMON virtual address space monitoring primitives implementation have unnecessary damon_ctx variables. This commit removes those. Link: https://lkml.kernel.org/r/20211201150440.1088-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/vaddr-test: split a test function having >1024 bytes frame size | SeongJae Park
On some configuration[1], 'damon_test_split_evenly()' kunit test function has >1024 bytes frame size, so below build warning is triggered: CC mm/damon/vaddr.o In file included from mm/damon/vaddr.c:672: mm/damon/vaddr-test.h: In function 'damon_test_split_evenly': mm/damon/vaddr-test.h:309:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=] 309 | } | ^ This commit fixes the warning by separating the common logic in the function. [1] https://lore.kernel.org/linux-mm/202111182146.OV3C4uGr-lkp@intel.com/ Link: https://lkml.kernel.org/r/20211201150440.1088-6-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/vaddr: remove an unnecessary warning message | SeongJae Park
The DAMON virtual address space monitoring primitive prints a warning message for wrong DAMOS action. However, it is not essential as the code returns appropriate failure in the case. This commit removes the message to make the log clean. Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org Fixes: 6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/core: remove unnecessary error messages | SeongJae Park
DAMON core prints error messages when damon_target object creation fails or wrong monitoring attributes are given. Because an appropriate error code is returned for each case, the messages are not essential. Also, because the code path can be triggered with user-specified input, the messages could needlessly clutter the kernel log. To avoid that, this commit removes the messages. Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Fixes: b9a6ac4e4ede ("mm/damon: adaptively adjust regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: kernel test robot <lkp@intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/dbgfs: remove an unnecessary error message | SeongJae Park
When a wrong scheme action is requested via the debugfs interface, DAMON prints an error message. Because the function returns an error code, this is not really needed. Because the code path is triggered by user-specified input, the message can needlessly clutter the kernel log. To avoid that, this commit removes the message. Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org Fixes: af122dd8f3c0 ("mm/damon/dbgfs: support DAMON-based Operation Schemes") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mm/damon/core: use better timer mechanisms selection threshold | SeongJae Park
Patch series "mm/damon: Trivial fixups and improvements". This patchset contains trivial fixups and improvements for DAMON and its kunit/kselftest tests. This patch (of 11): DAMON is using hrtimer if requested sleep time is <=100ms, while the suggested threshold[1] is <=20ms. This commit applies the threshold. [1] Documentation/timers/timers-howto.rst Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org Fixes: ee801b7dd7822 ("mm/damon/schemes: activate schemes based on a watermarks mechanism") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
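The threshold change can be pictured as a single comparison. The following is a userspace sketch under stated assumptions, not the DAMON source: `pick_sleep_mechanism()` and the enum are hypothetical names, and the only fact taken from the commit message is that sleeps of 20 ms or less use the hrtimer-based mechanism while longer sleeps do not.

```c
/* Hypothetical labels for the two sleep mechanisms. */
enum sleep_mechanism {
	SLEEP_HRTIMER,	/* high-resolution timer based sleep */
	SLEEP_JIFFIES	/* coarser, jiffies based timeout */
};

#define USEC_PER_MSEC 1000UL

/* Choose the mechanism for a requested sleep time in microseconds,
 * using the <=20ms threshold mentioned in the commit message. */
enum sleep_mechanism pick_sleep_mechanism(unsigned long usecs)
{
	return usecs > 20 * USEC_PER_MSEC ? SLEEP_JIFFIES : SLEEP_HRTIMER;
}
```

Under the old 100 ms threshold a 50 ms sleep would have taken the hrtimer path; with the 20 ms threshold it takes the coarser path.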
2021-12-10 | mm/damon/core: fix fake load reports due to uninterruptible sleeps | SeongJae Park
Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake load while DAMON is turned on, though it is doing nothing. This can confuse users[1]. To avoid the case, this commit makes DAMON sleep in idle mode. [1] https://lore.kernel.org/all/11868371.O9o76ZdvQC@natalenko.name/ Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org Fixes: 2224d8485492 ("mm: introduce Data Access MONitor (DAMON)") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | timers: implement usleep_idle_range() | SeongJae Park
Patch series "mm/damon: Fix fake /proc/loadavg reports", v3. This patchset fixes DAMON's fake load report issue. The first patch adds yet another variant of usleep_range() for this fix, and the second patch fixes the DAMON issue by making it use the newly introduced function. This patch (of 2): Some kernel threads such as DAMON could need to repeatedly sleep at the microsecond level. Because usleep_range() sleeps in uninterruptible state, however, such threads would make /proc/loadavg report fake load. To help such cases, this commit implements a variant of usleep_range() called usleep_idle_range(). It is the same as usleep_range() but sets the state of the current task to TASK_IDLE while sleeping. Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
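The difference between the two sleep states can be sketched with plain bit flags. This is a simplified userspace model: the bit values are illustrative assumptions rather than values quoted from the commit, and `contributes_to_load()` is a hypothetical predicate that only captures the idea that an uninterruptible task counts toward the load average unless it is also marked as not contributing.

```c
/* Illustrative flag values (assumed, not taken from the commit). */
#define TASK_UNINTERRUPTIBLE	0x0002u
#define TASK_NOLOAD		0x0400u
/* Idle sleep: uninterruptible, but excluded from load accounting. */
#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/* Hypothetical predicate: does a sleeping task in this state
 * count toward the load reported in /proc/loadavg? */
int contributes_to_load(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_NOLOAD);
}
```

In this model a thread sleeping with TASK_UNINTERRUPTIBLE adds to the load figure, while the same sleep with TASK_IDLE does not, which is the effect the commit message describes.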
2021-12-10 | filemap: remove PageHWPoison check from next_uptodate_page() | Matthew Wilcox (Oracle)
Pages are individually marked as suffering from hardware poisoning. Checking that the head page is not hardware poisoned doesn't make sense; we might be after a subpage. We check each page individually before we use it, so this was an optimisation gone wrong. It will cause us to fall back to the slow path when there was no need to do that. Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Yang Shi <shy828301@gmail.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | mailmap: update email address for Guo Ren | Guo Ren
The ren_guo@c-sky.com address will be deprecated; use guoren@kernel.org as the main email address. Link: https://lkml.kernel.org/r/20211123022741.545541-1-guoren@kernel.org Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | MAINTAINERS: update kdump maintainers | Dave Young
Remove myself from kdump maintainers as I do not have enough time to maintain it now. I can still review patches on demand. Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 | Increase default MLOCK_LIMIT to 8 MiB | Drew DeVault
This limit has not been updated since 2008, when it was increased to 64 KiB at the request of GnuPG. Until recently, the main use-cases for this feature were (1) preventing sensitive memory from being swapped, as in GnuPG's use-case; and (2) real-time use-cases. In the first case, little memory is called for, and in the second case, the user is generally in a position to increase it if they need more. The introduction of IOURING_REGISTER_BUFFERS adds a third use-case: preparing fixed buffers for high-performance I/O. This use-case will take as much of this memory as it can get, but is still limited to 64 KiB by default, which is very little. This increases the limit to 8 MiB, which was chosen fairly arbitrarily as a more generous, but still conservative, default value. It is also possible to raise this limit in userspace. This is easily done, for example, in the use-case of a network daemon: systemd, for instance, provides for this via LimitMEMLOCK in the service file; OpenRC via the rc_ulimit variables. However, there is no established userspace facility for configuring this outside of daemons: end-user applications do not presently have access to a convenient means of raising their limits. The buck, as it were, stops with the kernel. It's much easier to address it here than it is to bring it to hundreds of distributions, and it can only realistically be relied upon to be high-enough by end-user software if it is more-or-less ubiquitous. Most distros don't change this particular rlimit from the kernel-supplied default value, so a change here will easily provide that ubiquity.
Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com Signed-off-by: Drew DeVault <sir@cmpwn.com> Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Cyril Hrubis <chrubis@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Andrew Dona-Couch <andrew@donacou.ch> Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
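An application that wants to know how much locked memory it may use can query the limit at run time. The sketch below uses the standard `getrlimit()` call with `RLIMIT_MEMLOCK`; the helper names `memlock_soft_limit()` and `fits_memlock_limit()` are hypothetical, and the check is deliberately simplistic (it compares one request against the soft limit and ignores memory that is already locked).

```c
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft RLIMIT_MEMLOCK value in bytes.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int memlock_soft_limit(rlim_t *bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	*bytes = rl.rlim_cur;
	return 0;
}

/* Returns nonzero if a request of `need` bytes does not exceed the
 * soft limit. Already-locked memory is not taken into account. */
int fits_memlock_limit(rlim_t need)
{
	rlim_t limit;

	if (memlock_soft_limit(&limit) != 0)
		return 0;
	return limit == RLIM_INFINITY || need <= limit;
}
```

A program preparing fixed I/O buffers could call `fits_memlock_limit()` before registering them and fall back to a smaller buffer set if the request would exceed the default limit.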
2021-12-10 | Merge tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds
Pull thermal control fix from Rafael Wysocki: "Fix the definition of one of the Tiger Lake MMIO registers in the int340x thermal driver (Sumeet Pawnikar)" * tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL
2021-12-10 | Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Jakub Kicinski
Andrii Nakryiko says: ==================== bpf-next 2021-12-10 v2 We've added 115 non-merge commits during the last 26 day(s) which contain a total of 182 files changed, 5747 insertions(+), 2564 deletions(-). The main changes are:
1) Various samples fixes, from Alexander Lobakin.
2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.
3) A batch of new unified APIs for libbpf, logging improvements, version querying, etc. Also a batch of old deprecations for old APIs and various bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.
4) BPF documentation reorganization and improvements, from Christoph Hellwig and Dave Tucker.
5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in libbpf, from Hengqi Chen.
6) Verifier log fixes, from Hou Tao.
7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.
8) Extend branch record capturing to all platforms that support it, from Kajol Jain.
9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.
10) bpftool doc-generating script improvements, from Quentin Monnet.
11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.
12) Deprecation warning fix for perf/bpf_counter, from Song Liu.
13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf, from Tiezhu Yang.
14) BTF_KIND_TYPE_TAG follow-up fixes, from Yonghong Song.
15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun, and others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits) libbpf: Add "bool skipped" to struct bpf_map libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition bpftool: Switch bpf_object__load_xattr() to bpf_object__load() selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr() selftests/bpf: Add test for libbpf's custom log_buf behavior selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load() libbpf: Deprecate bpf_object__load_xattr() libbpf: Add per-program log buffer setter and getter libbpf: Preserve kernel error code and remove kprobe prog type guessing libbpf: Improve logging around BPF program loading libbpf: Allow passing user log setting through bpf_object_open_opts libbpf: Allow passing preallocated log_buf when loading BTF into kernel libbpf: Add OPTS-based bpf_btf_load() API libbpf: Fix bpf_prog_load() log_buf logic for log_level 0 samples/bpf: Remove unneeded variable bpf: Remove redundant assignment to pointer t selftests/bpf: Fix a compilation warning perf/bpf_counter: Use bpf_map_create instead of bpf_create_map samples: bpf: Fix 'unknown warning group' build warning on Clang samples: bpf: Fix xdp_sample_user.o linking with Clang ... ==================== Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 | libbpf: Add "bool skipped" to struct bpf_map | Shuyi Cheng
Fix error: "failed to pin map: Bad file descriptor, path: /sys/fs/bpf/_rodata_str1_1." On old kernels, the global data map will not be created, see [0]. So we should skip the pinning of the global data map to avoid bpf_object__pin_maps returning an error. Therefore, when the map is not created, we mark "map->skipped" as true and then check it during relocation and during pinning. Fixes: 16e0c35c6f7a ("libbpf: Load global data maps lazily on legacy kernels") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-10 | libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition | Vincent Minet
The btf__dedup_deprecated name was misspelled in the definition of the compat symbol for btf__dedup. This leads it to be missing from the shared library. This fixes it. Fixes: 957d350a8b94 ("libbpf: Turn btf_dedup_opts into OPTS-based struct") Signed-off-by: Vincent Minet <vincent@vincent-minet.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211210063112.80047-1-vincent@vincent-minet.net
2021-12-10 | Merge branch 'Enhance and rework logging controls in libbpf' | Alexei Starovoitov
Andrii Nakryiko says: ==================== Add new open options and per-program setters to control BTF and program loading log verboseness and allow providing custom log buffers to capture logs of interest. Note how custom log_buf and log_level are orthogonal, which matches previous (alas less customizable) behavior of libbpf, even though it sort of worked by accident: if someone specified log_level = 1 in bpf_object__load_xattr(), the first attempt to load any BPF program resulted in a wasted bpf() syscall with -EINVAL due to !!log_buf != !!log_level. Then on retry libbpf would allocate log_buffer and try again, after which prog loading would succeed and libbpf would print the verbose program loading log through its print callback. This behavior is now documented and made more efficient, without wasting an unnecessary syscall. But additionally, log_level can be controlled globally on a per-bpf_object level through bpf_object_open_opts, as well as on a per-program basis with bpf_program__set_log_buf() and bpf_program__set_log_level() APIs. Now that we have a more future-proof way to set log_level, deprecate bpf_object__load_xattr(). v2->v3: - added log_buf selftests for bpf_prog_load() and bpf_btf_load(); - fix !log_buf in bpf_prog_load (John); - fix log_level==0 in bpf_btf_load (thanks selftest!); v1->v2: - fix log_level == 0 handling of bpf_prog_load, add as patch #1 (Alexei); - add comments explaining log_buf_size overflow prevention (Alexei). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
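The condition that caused the wasted syscall can be written out as a one-line check. This is a sketch of the quoted condition only: `check_log_args()` is a hypothetical name, and the real kernel-side validation is assumed to involve more than these two arguments (for example the buffer size).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the quoted condition: a log buffer and a
 * non-zero log level must be given together or not at all. */
int check_log_args(const char *log_buf, unsigned int log_level)
{
	if (!!log_buf != !!log_level)
		return -EINVAL;
	return 0;
}
```

With log_level = 1 and no buffer, the first load attempt fails this check, which is why the old code path needed a retry with an allocated buffer.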
2021-12-10 | bpftool: Switch bpf_object__load_xattr() to bpf_object__load() | Andrii Nakryiko
Switch all the uses of to-be-deprecated bpf_object__load_xattr() to simple bpf_object__load() calls, with optional log_level passed through open_opts.kernel_log_level if the -d option is specified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-13-andrii@kernel.org
2021-12-10 | selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr() | Andrii Nakryiko
Switch from bpf_object__load_xattr() to bpf_object__load() and kernel_log_level in bpf_object_open_opts. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-12-andrii@kernel.org
2021-12-10 | selftests/bpf: Add test for libbpf's custom log_buf behavior | Andrii Nakryiko
Add a selftest that validates that per-program and per-object log_buf overrides work as expected. Also test same logic for low-level bpf_prog_load() and bpf_btf_load() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-11-andrii@kernel.org
2021-12-10 | selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load() | Andrii Nakryiko
Switch all selftests uses of to-be-deprecated bpf_load_btf() to equivalent bpf_btf_load() calls. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-10-andrii@kernel.org
2021-12-10 | libbpf: Deprecate bpf_object__load_xattr() | Andrii Nakryiko
Deprecate non-extensible bpf_object__load_xattr() in v0.8 ([0]). With log_level control through bpf_object_open_opts or bpf_program__set_log_level(), we are finally at the point where bpf_object__load_xattr() doesn't provide any functionality that can't be accessed through other (better) ways. The other feature, target_btf_path, is also controllable through bpf_object_open_opts. [0] Closes: https://github.com/libbpf/libbpf/issues/289 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-9-andrii@kernel.org
2021-12-10 | libbpf: Add per-program log buffer setter and getter | Andrii Nakryiko
Allow setting a user-provided log buffer on a per-program basis ([0]). This gives a great deal of flexibility in terms of which programs are loaded with logging enabled and where corresponding logs go. Log buffer set with bpf_program__set_log_buf() overrides kernel_log_buf and kernel_log_size settings set at bpf_object open time through bpf_object_open_opts, if any. Adjust bpf_object_load_prog_instance() logic to not perform its own log buf allocation and load retry if a custom log buffer is provided by the user. [0] Closes: https://github.com/libbpf/libbpf/issues/418 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-8-andrii@kernel.org
2021-12-10 | libbpf: Preserve kernel error code and remove kprobe prog type guessing | Andrii Nakryiko
Instead of rewriting the error code returned by the kernel on prog load with libbpf-specific variants, pass through the original error. There is now also no need to have a backup generic -LIBBPF_ERRNO__LOAD fallback error as bpf_prog_load() guarantees that errno will be properly set no matter what. Also drop a completely outdated and pretty useless BPF_PROG_TYPE_KPROBE guess logic. It's neither necessary nor helpful in modern BPF applications. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-7-andrii@kernel.org
2021-12-10libbpf: Improve logging around BPF program loadingAndrii Nakryiko
Add missing "prog '%s': " prefixes in a few places and consistently use markers for the beginning and end of program load logs. Here's an example of log output: libbpf: prog 'handler': BPF program load failed: Permission denied libbpf: -- BEGIN PROG LOAD LOG --- arg#0 reference type('UNKNOWN ') size cannot be determined: -22 ; out1 = in1; 0: (18) r1 = 0xffffc9000cdcc000 2: (61) r1 = *(u32 *)(r1 +0) ... 81: (63) *(u32 *)(r4 +0) = r5 R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0) invalid access to map value, value_size=16 off=400 size=4 R4 min value is outside of the allowed memory range processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG -- libbpf: failed to load program 'handler' libbpf: failed to load object 'test_skeleton' The entire verifier log, including the BEGIN and END markers, is now always output during a single print callback call. This should make it much easier to post-process or parse it, if necessary. It's not an explicit API guarantee, but it can be reasonably expected to stay like that. Also, __bpf_object__open is renamed to bpf_object_open(), as it's always an adventure to find the exact function that implements bpf_object's open phase, so drop the double underscores and use the internal libbpf naming convention. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org
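Because the whole log arrives in one message, post-processing can be a plain substring search. A minimal sketch, assuming the message has already been formatted into a NUL-terminated string; extract_prog_load_log() is a hypothetical helper, not a libbpf function, and the marker strings are copied from the example output above, which the commit explicitly does not promise as stable API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Marker strings as they appear in the example log output. */
#define LOG_BEGIN "-- BEGIN PROG LOAD LOG ---"
#define LOG_END   "-- END PROG LOAD LOG --"

/*
 * Locate the verifier log between the BEGIN and END markers in msg.
 * Returns a pointer into msg at the first byte of the log body and stores
 * the body length in *len, or returns NULL if either marker is missing.
 */
static const char *extract_prog_load_log(const char *msg, size_t *len)
{
    const char *begin, *end;

    assert(msg != NULL && len != NULL);

    begin = strstr(msg, LOG_BEGIN);
    if (!begin)
        return NULL;
    begin += strlen(LOG_BEGIN);
    if (*begin == '\n')  /* skip the newline that ends the marker line */
        begin++;

    end = strstr(begin, LOG_END);
    if (!end)
        return NULL;

    *len = (size_t)(end - begin);
    return begin;
}
```

The returned pointer aliases the input string, so the caller would copy the body out if it needs to outlive the message.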
2021-12-10libbpf: Allow passing user log setting through bpf_object_open_optsAndrii Nakryiko
Allow users to provide their own custom log_buf, log_size, and log_level at bpf_object level through bpf_object_open_opts. This log_buf will be used during BTF loading. A subsequent patch will use the same log_buf during BPF program loading, unless overridden at per-bpf_program level. When such a custom log_buf is provided, libbpf won't retry loading BTF with its own log buffer to capture the kernel's error log output. The user is responsible for providing a big enough buffer, otherwise they run the risk of getting an -ENOSPC error from the bpf() syscall. See also comments in bpf_object_open_opts regarding log_level and log_buf interactions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-5-andrii@kernel.org
2021-12-10libbpf: Allow passing preallocated log_buf when loading BTF into kernelAndrii Nakryiko
Add libbpf-internal btf_load_into_kernel() that allows a preallocated log_buf and custom log_level to be passed into the kernel during the BPF_BTF_LOAD call. When a custom log_buf is provided, btf_load_into_kernel() won't attempt a retry with an automatically allocated internal temporary buffer to capture the BTF validation log. It's important to note the relation between log_buf and log_level, which slightly deviates from stricter kernel logic. From the kernel's POV, if log_buf is specified, log_level has to be > 0, and vice versa. While the kernel has good reasons to request such "sanity", this, in practice, is a bit inconvenient and restrictive for libbpf's high-level bpf_object APIs. So libbpf will allow setting a non-NULL log_buf with log_level == 0. This is fine and means: attempt to load BTF without logging requested, but if it fails, retry the load with the custom log_buf and log_level 1. Similar logic will be implemented for program loading. In practice this means that users can provide a custom log buffer just in case an error happens, without requesting slower verbose logging all the time. This is also consistent with libbpf behavior when a custom log_buf is not set: libbpf first tries to load everything with log_level=0, and only if an error happens allocates an internal log buffer and retries with log_level=1. Also, while at it, make the BTF validation log more obvious and follow the log pattern libbpf is using for dumping the BPF verifier log during BPF_PROG_LOAD. BTF loading resulting in an error will look like this: libbpf: BTF loading error: -22 libbpf: -- BEGIN BTF LOAD LOG --- magic: 0xeb9f version: 1 flags: 0x0 hdr_len: 24 type_off: 0 type_len: 1040 str_off: 1040 str_len: 2063598257 btf_total_size: 1753 Total section length too long -- END BTF LOAD LOG -- libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring. This makes it much easier to find relevant parts in libbpf log output.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-4-andrii@kernel.org
2021-12-10libbpf: Add OPTS-based bpf_btf_load() APIAndrii Nakryiko
Similar to the previous bpf_prog_load() and bpf_map_create() APIs, add a bpf_btf_load() API which takes an optional OPTS struct. Schedule bpf_load_btf() for deprecation in v0.8 ([0]). This makes naming consistent with the BPF_BTF_LOAD command, sets up an API for extensibility in the future, moves options parameters (log-related fields) into optional options, and also allows passing log_level directly. It also removes log buffer auto-allocation logic from the low-level API (consistent with bpf_prog_load() behavior), but preserves a special treatment of log_level == 0 with non-NULL log_buf, which matches low-level bpf_prog_load() and high-level libbpf APIs for BTF and program loading behaviors. [0] Closes: https://github.com/libbpf/libbpf/issues/419 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-3-andrii@kernel.org
2021-12-10libbpf: Fix bpf_prog_load() log_buf logic for log_level 0Andrii Nakryiko
To unify libbpf API behavior w.r.t. log_buf and log_level, fix bpf_prog_load() to follow the same logic that bpf_btf_load() and the high-level bpf_object__load() API will follow in subsequent patches: - if log_level is 0 and a non-NULL log_buf is provided by the user, attempt the load operation initially with no log_buf and log_level set; - if successful, we are done, return the new FD; - on error, retry the load operation with log_level bumped to 1 and log_buf set; this way verbose logging will be requested only when we are sure that there is a failure, but the load will be fast in the common/expected success case. Of course, the user can still specify log_level > 0 from the very beginning to force log collection. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-2-andrii@kernel.org
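The three steps listed above amount to a "fast path first, log only on failure" policy. The sketch below models that policy without touching libbpf: load_fn, struct fake_kernel, fake_load() and load_with_lazy_log() are all hypothetical names, and the fake loader merely stands in for the bpf() syscall so the control flow can be exercised in isolation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader: returns an FD >= 0 on success or a negative errno. */
typedef int (*load_fn)(void *ctx, char *log_buf, size_t log_size,
                       unsigned int log_level);

/* Load with log_level 0 first; retry with logging only after a failure. */
static int load_with_lazy_log(load_fn load, void *ctx, char *log_buf,
                              size_t log_size, unsigned int log_level)
{
    int fd;

    assert(load != NULL);

    /* Logging explicitly requested, or no buffer at all: single attempt. */
    if (log_level > 0 || log_buf == NULL)
        return load(ctx, log_buf, log_buf ? log_size : 0, log_level);

    /* log_level == 0 with a buffer: fast attempt without any logging. */
    fd = load(ctx, NULL, 0, 0);
    if (fd >= 0)
        return fd;

    /* Failure is now certain: retry once with log_level bumped to 1. */
    return load(ctx, log_buf, log_size, 1);
}

/* Fake "kernel" used only to demonstrate the control flow. */
struct fake_kernel {
    int attempts;  /* number of load attempts seen so far */
    int fail;      /* non-zero: every attempt is rejected */
};

static int fake_load(void *ctx, char *log_buf, size_t log_size,
                     unsigned int log_level)
{
    struct fake_kernel *k = ctx;

    k->attempts++;
    if (!k->fail)
        return 3;  /* pretend file descriptor */
    if (log_level > 0 && log_buf && log_size > 0) {
        strncpy(log_buf, "verifier says no", log_size - 1);
        log_buf[log_size - 1] = '\0';
    }
    return -EACCES;
}
```

With the fake loader, a successful load costs one attempt and leaves the buffer untouched, while a failing load costs two attempts and fills the buffer on the second.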
2021-12-10Merge tag 'acpi-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Create the output directory for the ACPI tools during build if it has not been present before and prevent the compilation from failing in that case (Chen Yu)" * tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: tools: Fix compilation when output directory is not present
2021-12-10Merge tag 'pm-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a kerneldoc comment that doesn't match the behavior of the function documented by it" * tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Fix pm_runtime_active() kerneldoc comment
2021-12-10Merge tag 'hwmon-for-v5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - In the pwm-fan driver, ensure that the internal pwm state matches the state assumed by the pwm code. - Avoid EREMOTEIO errors in sht4 driver - In the nct6775 driver, make it explicit that the register value passed to nct6775_asuswmi_read() is an 8-bit value - Avoid WARNing in dell-smm driver removal after failing to create /proc/i8k - Stop using a plain integer as NULL pointer in corsair-psu driver * tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pwm-fan) Ensure the fan going on in .probe() hwmon: (sht4x) Fix EREMOTEIO errors hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value() hwmon: (dell-smm) Fix warning on /proc/i8k creation error hwmon: (corsair-psu) fix plain integer used as NULL pointer
2021-12-10Merge tag 'trace-v5.16-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Tracing, ftrace and tracefs fixes: - Have tracefs honor the gid mount option - Have new files in tracefs inherit the parent ownership - Have direct_ops unregister when it has no more functions - Properly clean up the ops when unregistering multi direct ops - Add a sample module to test the multiple direct ops - Fix memory leak in error path of __create_synth_event()" * tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible memory leak in __create_synth_event() error path ftrace/samples: Add module to test multi direct modify interface ftrace: Add cleanup to unregister_ftrace_direct_multi ftrace: Use direct_ops hash in unregister_ftrace_direct tracefs: Set all files to the same group ownership as the mount option tracefs: Have new files inherit the ownership of their parent
2021-12-10Merge tag 'aio-poll-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull aio poll fixes from Eric Biggers: "Fix three bugs in aio poll, and one issue with POLLFREE more broadly: - aio poll didn't handle POLLFREE, causing a use-after-free. - aio poll could block while the file is ready. - aio poll called eventfd_signal() when it isn't allowed. - POLLFREE didn't handle multiple exclusive waiters correctly. This has been tested with the libaio test suite, as well as with test programs I wrote that reproduce the first two bugs. I am sending this pull request myself as no one seems to be maintaining this code" * tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: aio: Fix incorrect usage of eventfd_signal_allowed() aio: fix use-after-free due to missing POLLFREE handling aio: keep poll requests on waitqueue until completed signalfd: use wake_up_pollfree() binder: use wake_up_pollfree() wait: add wake_up_pollfree()
2021-12-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "More x86 fixes: - Logic bugs in CR0 writes and Hyper-V hypercalls - Don't use Enlightened MSR Bitmap for L3 - Remove user-triggerable WARN Plus a few selftest fixes and a regression test for the user-triggerable WARN" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode selftests: KVM: avoid failures due to reserved HyperTransport region KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
2021-12-10i2c: mpc: Use atomic read and fix break conditionChris Packham
Maxime points out that the polling code in mpc_i2c_isr should use the _atomic API because it is called in an irq context and that the behaviour of the MCF bit is that it is 1 when the byte transfer is complete. All of this means the original code was effectively a udelay(100). Fix this by using readb_poll_timeout_atomic() and removing the negation of the break condition. Fixes: 4a8ac5e45cda ("i2c: mpc: Poll for MCF") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-10io-wq: check for wq exit after adding new worker task_workJens Axboe
We check IO_WQ_BIT_EXIT before attempting to create a new worker, and wq exit cancels pending work if we have any. But it's possible to have a race between the two, where creation checks exit finding it not set, but we're in the process of exiting. The exit side will cancel pending creation task_work, but there's a gap where we add task_work after we've canceled existing creations at exit time. Fix this by checking the EXIT bit post adding the creation task_work. If it's set, run the same cancelation that exit does. Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com Reviewed-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>