Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The variable 'insert_retries' is set but never effectively used in the
function, so delete it. This silences the following build warning:
lib/test_rhashtable.c:437:18: warning: variable 'insert_retries' set but not used.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3242
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The latest version of grep claims that egrep is now obsolescent, so the
build now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
Fix this up by moving the vdso Makefile to use "grep -E" instead.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20220920170633.3133829-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When misc_register() fails in test_firmware_init(), the memory pointed
to by test_fw_config->name is not released. The memory leak report is
as follows:
unreferenced object 0xffff88810a34cb00 (size 32):
comm "insmod", pid 7952, jiffies 4294948236 (age 49.060s)
hex dump (first 32 bytes):
74 65 73 74 2d 66 69 72 6d 77 61 72 65 2e 62 69 test-firmware.bi
6e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 n...............
backtrace:
[<ffffffff81b21fcb>] __kmalloc_node_track_caller+0x4b/0xc0
[<ffffffff81affb96>] kstrndup+0x46/0xc0
[<ffffffffa0403a49>] __test_firmware_config_init+0x29/0x380 [test_firmware]
[<ffffffffa040f068>] 0xffffffffa040f068
[<ffffffff81002c41>] do_one_initcall+0x141/0x780
[<ffffffff816a72c3>] do_init_module+0x1c3/0x630
[<ffffffff816adb9e>] load_module+0x623e/0x76a0
[<ffffffff816af471>] __do_sys_finit_module+0x181/0x240
[<ffffffff89978f99>] do_syscall_64+0x39/0xb0
[<ffffffff89a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
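The fix is to free the allocated configuration when registration fails.
A minimal sketch of the idea, assuming a helper that undoes
__test_firmware_config_init() (the helper name here is hypothetical):

rc = misc_register(&test_fw_misc_device);
if (rc) {
	/* release test_fw_config->name allocated during config init */
	__test_firmware_config_free();
	pr_err("could not register misc device: %d\n", rc);
	return rc;
}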
Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20221119035721.18268-1-shaozhengchao@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Validate the effect of the __alloc_size attribute on allocators. If the
compiler doesn't support __builtin_dynamic_object_size(), skip the
associated tests.
(For GCC, just remove the "--make_options" line below...)
$ ./tools/testing/kunit/kunit.py run --arch x86_64 \
--kconfig_add CONFIG_FORTIFY_SOURCE=y \
--make_options LLVM=1
fortify
...
[15:16:30] ================== fortify (10 subtests) ===================
[15:16:30] [PASSED] known_sizes_test
[15:16:30] [PASSED] control_flow_split_test
[15:16:30] [PASSED] alloc_size_kmalloc_const_test
[15:16:30] [PASSED] alloc_size_kmalloc_dynamic_test
[15:16:30] [PASSED] alloc_size_vmalloc_const_test
[15:16:30] [PASSED] alloc_size_vmalloc_dynamic_test
[15:16:30] [PASSED] alloc_size_kvmalloc_const_test
[15:16:30] [PASSED] alloc_size_kvmalloc_dynamic_test
[15:16:30] [PASSED] alloc_size_devm_kmalloc_const_test
[15:16:30] [PASSED] alloc_size_devm_kmalloc_dynamic_test
[15:16:30] ===================== [PASSED] fortify =====================
[15:16:30] ============================================================
[15:16:30] Testing complete. Ran 10 tests: passed: 10
[15:16:31] Elapsed time: 8.348s total, 0.002s configuring, 6.923s building, 1.075s running
For GCC prior to version 12, the dynamic tests will be skipped:
[15:18:59] ================== fortify (10 subtests) ===================
[15:18:59] [PASSED] known_sizes_test
[15:18:59] [PASSED] control_flow_split_test
[15:18:59] [PASSED] alloc_size_kmalloc_const_test
[15:18:59] [SKIPPED] alloc_size_kmalloc_dynamic_test
[15:18:59] [PASSED] alloc_size_vmalloc_const_test
[15:18:59] [SKIPPED] alloc_size_vmalloc_dynamic_test
[15:18:59] [PASSED] alloc_size_kvmalloc_const_test
[15:18:59] [SKIPPED] alloc_size_kvmalloc_dynamic_test
[15:18:59] [PASSED] alloc_size_devm_kmalloc_const_test
[15:18:59] [SKIPPED] alloc_size_devm_kmalloc_dynamic_test
[15:18:59] ===================== [PASSED] fortify =====================
[15:18:59] ============================================================
[15:18:59] Testing complete. Ran 10 tests: passed: 6, skipped: 4
[15:18:59] Elapsed time: 11.965s total, 0.002s configuring, 10.540s building, 1.068s running
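For reference, a minimal sketch of what the dynamic-size tests assert,
assuming the usual KUnit macros (the actual tests are structured
differently):

static void alloc_size_kmalloc_dynamic_sketch(struct kunit *test)
{
	size_t n = 128;	/* in the real tests this is not a compile-time constant */
	void *p = kmalloc(n, GFP_KERNEL);

	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p);
	/* __alloc_size on kmalloc() lets the compiler resolve the size */
	KUNIT_EXPECT_EQ(test, __builtin_dynamic_object_size(p, 1), n);
	kfree(p);
}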
Cc: David Gow <davidgow@google.com>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
If KPROBES_SANITY_TEST and ARCH_CORRECT_STACKTRACE_ON_KRETPROBE are
enabled but STACKTRACE is not set, the build fails as below:
lib/test_kprobes.c: In function `stacktrace_return_handler':
lib/test_kprobes.c:228:8: error: implicit declaration of function `stack_trace_save'; did you mean `stacktrace_driver'? [-Werror=implicit-function-declaration]
ret = stack_trace_save(stack_buf, STACK_BUF_SIZE, 0);
^~~~~~~~~~~~~~~~
stacktrace_driver
cc1: all warnings being treated as errors
scripts/Makefile.build:250: recipe for target 'lib/test_kprobes.o' failed
make[2]: *** [lib/test_kprobes.o] Error 1
To fix this error, select STACKTRACE if ARCH_CORRECT_STACKTRACE_ON_KRETPROBE is enabled.
Link: https://lkml.kernel.org/r/20221121030620.63181-1-hucool.lihua@huawei.com
Fixes: 1f6d3a8f5e39 ("kprobes: Add a test case for stacktrace from kretprobe handler")
Signed-off-by: Li Hua <hucool.lihua@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When we specify __GFP_NOWARN, we only expect that no warnings will be
issued for the current caller. But in __should_failslab() and
__should_fail_alloc_page(), the local GFP flags alter the global
{failslab|fail_page_alloc}.attr, which is persistent and shared by all
tasks. This is not what we expected; let's fix it.
[akpm@linux-foundation.org: unexport should_fail_ex()]
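The shape of the fix, as a hedged sketch: pass a per-call flag to the
fault injection core instead of mutating the shared attribute (the
exact flag and helper names here are my reading of the patch):

bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
{
	int flags = 0;

	/*
	 * Suppress the warning for this call only; do not touch the
	 * global failslab.attr that is shared by all tasks.
	 */
	if (gfpflags & __GFP_NOWARN)
		flags |= FAULT_NOWARN;

	return should_fail_ex(&failslab.attr, s->object_size, flags);
}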
Link: https://lkml.kernel.org/r/20221118100011.2634-1-zhengqi.arch@bytedance.com
Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221122134301.69258-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221122134301.69258-3-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Or, depending on the way locking is implemented at the call sites,
some updates could be lost (this has not been observed).
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221122134301.69258-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
kobject_namespace() should take a const struct kobject *, as it does
not modify the kobject passed to it. Change that, and also change
kobj_child_ns_ops() and kobj_ns_ops(), which needed the same const
treatment.
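The resulting signatures, roughly:

const void *kobject_namespace(const struct kobject *kobj);
const struct kobj_ns_type_operations *kobj_child_ns_ops(const struct kobject *parent);
const struct kobj_ns_type_operations *kobj_ns_ops(const struct kobject *kobj);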
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The kobject_get_ownership() call does not modify the kobject passed
into it, so make that argument const. This propagates down into the
kobj_type function callbacks, so make the kobject passed into them
const as well, ensuring that nothing in the kobject is changed there.
This helps make it more obvious what calls and callbacks do, and do not,
modify structures passed to them.
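The resulting prototype, roughly:

void kobject_get_ownership(const struct kobject *kobj,
			   kuid_t *uid, kgid_t *gid);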
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Rather than polling every second, use the new notifier to do this at
exactly the right moment.
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Resolve conflicts between these commits in arch/x86/kernel/asm-offsets.c:
# upstream:
debc5a1ec0d1 ("KVM: x86: use a separate asm-offsets.c file")
# retbleed work in x86/core:
5d8213864ade ("x86/retbleed: Add SKL return thunk")
... and these commits in include/linux/bpf.h:
# upstream:
18acb7fac22f ("bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")")
# x86/core commits:
931ab63664f0 ("x86/ibt: Implement FineIBT")
bea75b33895f ("x86/Kconfig: Introduce function padding")
The latter two modify BPF_DISPATCHER_ATTRIBUTES(), which was removed upstream.
Conflicts:
arch/x86/kernel/asm-offsets.c
include/linux/bpf.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Fix the kernel-doc markings for div64 functions to point to the header
file instead of the lib/ directory. This avoids having implementation
specific comments in generic documentation. Furthermore, given that
some kernel-doc comments are identical, drop them from lib/math64 and
keep only those comments that add implementation details.
Signed-off-by: Liam Beguin <liambeguin@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20221118182309.3824530-1-liambeguin@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Due to compiler optimizations like inlining, there are cases where
MMIO traces using _THIS_IP_ for caller information might not be
sufficient to provide accurate debug traces.
1) With optimizations (seen with GCC):
In this case, _THIS_IP_ works fine and prints the caller information,
since the accessor is inlined into the caller and we get debug traces
showing who made the MMIO access, for example:
rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 addr=0xffff8000087447f4
rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 val=0x0 addr=0xffff8000087447f4
2) Without optimizations (seen with Clang):
_THIS_IP_ will not be sufficient in this case, as it will print only
the MMIO accessor itself, which is not of much use since the accessor
is not inlined, as in this example:
rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4
So in order to handle this second case as well, irrespective of
compiler optimizations, add _RET_IP_ to the MMIO trace to provide more
accurate debug information in all these scenarios.
Before:
rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4
After:
rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 val=0x0 addr=0xffff8000087447f4
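A simplified sketch of the accessor shape after the change (memory
barriers omitted; the extra _RET_IP_ argument is what this patch
threads through the logging helpers):

static inline u32 readl(const volatile void __iomem *addr)
{
	u32 val;

	/*
	 * _THIS_IP_ may resolve to readl itself when it is not inlined;
	 * _RET_IP_ always identifies the call site.
	 */
	log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
	val = __le32_to_cpu(__raw_readl(addr));
	log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
	return val;
}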
Fixes: 210031971cdd ("asm-generic/io: Add logging support for MMIO accessors")
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
We need the kernfs changes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the char/misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make DEBUG_INFO_COMPRESSED a choice: DEBUG_INFO_COMPRESSED_NONE is the
default, DEBUG_INFO_COMPRESSED_ZLIB uses zlib, and
DEBUG_INFO_COMPRESSED_ZSTD uses zstd.
This renames the existing Kconfig option DEBUG_INFO_COMPRESSED to
DEBUG_INFO_COMPRESSED_ZLIB, so users upgrading may need to reset the
new Kconfig options.
Some quick N=1 measurements with du, /usr/bin/time -v, and bloaty:
clang-16, x86_64 defconfig plus
CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_COMPRESSED_NONE=y:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:55.43
488M vmlinux
27.6% 136Mi 0.0% 0 .debug_info
6.1% 30.2Mi 0.0% 0 .debug_str_offsets
3.5% 17.2Mi 0.0% 0 .debug_line
3.3% 16.3Mi 0.0% 0 .debug_loclists
0.9% 4.62Mi 0.0% 0 .debug_str
clang-16, x86_64 defconfig plus
CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_COMPRESSED_ZLIB=y:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:00.35
385M vmlinux
21.8% 85.4Mi 0.0% 0 .debug_info
2.1% 8.26Mi 0.0% 0 .debug_str_offsets
2.1% 8.24Mi 0.0% 0 .debug_loclists
1.9% 7.48Mi 0.0% 0 .debug_line
0.5% 1.94Mi 0.0% 0 .debug_str
clang-16, x86_64 defconfig plus
CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_COMPRESSED_ZSTD=y:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:59.69
373M vmlinux
21.4% 81.4Mi 0.0% 0 .debug_info
2.3% 8.85Mi 0.0% 0 .debug_loclists
1.5% 5.71Mi 0.0% 0 .debug_line
0.5% 1.95Mi 0.0% 0 .debug_str_offsets
0.4% 1.62Mi 0.0% 0 .debug_str
That's only a 3.11% overall binary size savings over zlib, but it
comes with no performance regression.
Link: https://maskray.me/blog/2022-09-09-zstd-compressed-debug-sections
Link: https://maskray.me/blog/2022-01-23-compressed-debug-sections
Suggested-by: Sedat Dilek (DHL Supply Chain) <sedat.dilek@dhl.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Now that we use -funsigned-char, there's no need for this kind of ifdef.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Shifting a signed 32-bit value by 31 bits is undefined behavior, so
make the shifted constant unsigned. The UBSAN warning calltrace is as
below:
UBSAN: shift-out-of-bounds in lib/fonts/fonts.c:139:20
left shift of 1 by 31 places cannot be represented in type 'int'
<TASK>
dump_stack_lvl+0x7d/0xa5
dump_stack+0x15/0x1b
ubsan_epilogue+0xe/0x4e
__ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
get_default_font+0x1c7/0x1f0
fbcon_startup+0x347/0x3a0
do_take_over_console+0xce/0x270
do_fbcon_takeover+0xa1/0x170
do_fb_registered+0x2a8/0x340
fbcon_fb_registered+0x47/0xe0
register_framebuffer+0x294/0x4a0
__drm_fb_helper_initial_config_and_unlock+0x43c/0x880 [drm_kms_helper]
drm_fb_helper_initial_config+0x52/0x80 [drm_kms_helper]
drm_fbdev_client_hotplug+0x156/0x1b0 [drm_kms_helper]
drm_fbdev_generic_setup+0xfc/0x290 [drm_kms_helper]
bochs_pci_probe+0x6ca/0x772 [bochs]
local_pci_probe+0x4d/0xb0
pci_device_probe+0x119/0x320
really_probe+0x181/0x550
__driver_probe_device+0xc6/0x220
driver_probe_device+0x32/0x100
__driver_attach+0x195/0x200
bus_for_each_dev+0xbb/0x120
driver_attach+0x27/0x30
bus_add_driver+0x22e/0x2f0
driver_register+0xa9/0x190
__pci_register_driver+0x90/0xa0
bochs_pci_driver_init+0x52/0x1000 [bochs]
do_one_initcall+0x76/0x430
do_init_module+0x61/0x28a
load_module+0x1f82/0x2e50
__do_sys_finit_module+0xf8/0x190
__x64_sys_finit_module+0x23/0x30
do_syscall_64+0x58/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>
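The class of fix, sketched against the font-matching expression in
get_default_font() (context simplified):

/* before: 1 is a signed int, so a 31-bit shift is undefined */
if (font_w & (1 << (f->width - 1)))
	c += 1000;

/* after: an unsigned constant makes the shift well-defined */
if (font_w & (1U << (f->width - 1)))
	c += 1000;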
Link: https://lkml.kernel.org/r/20221031113829.4183153-1-cuigaosheng1@huawei.com
Fixes: c81f717cb9e0 ("fbcon: Fix typo and bogus logic in get_default_font")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
try_cmpxchg() implicitly assigns the old head->first value to "first"
when the cmpxchg fails. There is no need to re-read the value in the
loop.
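A sketch of the simplified loop in llist_add_batch(), assuming this is
the call site in question:

struct llist_node *first = READ_ONCE(head->first);

do {
	new_last->next = first;
	/*
	 * On failure, try_cmpxchg() updates "first" with the current
	 * head->first, so no explicit re-read is needed.
	 */
} while (!try_cmpxchg(&head->first, &first, new_first));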
Link: https://lkml.kernel.org/r/20221017145226.4044-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The variable num is being assigned a value that is never read; it is
re-assigned a new value in both paths of an if-statement. The
assignment is redundant and can be removed.
Cleans up clang scan build warning:
lib/oid_registry.c:149:3: warning: Value stored to 'num' is
never read [deadcode.DeadStores]
Link: https://lkml.kernel.org/r/20221017214556.863357-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
include/linux/bpf.h
1f6e04a1c7b8 ("bpf: Fix offset calculation error in __copy_map_value and zero_map_value")
aa3496accc41 ("bpf: Refactor kptr_off_tab into btf_record")
f71b2f64177a ("bpf: Refactor map->off_arr handling")
https://lore.kernel.org/all/20221114095000.67a73239@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These cases were done with this Coccinelle:
@@
expression H;
expression L;
@@
- (get_random_u32_below(H) + L)
+ get_random_u32_inclusive(L, H + L - 1)
@@
expression H;
expression L;
expression E;
@@
get_random_u32_inclusive(L,
H
- + E
- - E
)
@@
expression H;
expression L;
expression E;
@@
get_random_u32_inclusive(L,
H
- - E
- + E
)
@@
expression H;
expression L;
expression E;
expression F;
@@
get_random_u32_inclusive(L,
H
- - E
+ F
- + E
)
@@
expression H;
expression L;
expression E;
expression F;
@@
get_random_u32_inclusive(L,
H
- + E
+ F
- - E
)
And then subsequently cleaned up by hand, with several automatic cases
rejected if it didn't make sense contextually.
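A hypothetical before/after illustrating the first rule:

/* before: a value in [50, 199], with the bounds split across the expression */
wait = get_random_u32_below(150) + 50;

/* after: both bounds are explicit and inclusive */
wait = get_random_u32_inclusive(50, 199);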
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
These cases were done with this Coccinelle:
@@
expression E;
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (I > E);
+ I = get_random_u32_below(E + 1);
@@
expression E;
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (I >= E);
+ I = get_random_u32_below(E);
@@
expression E;
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (I < E);
+ I = get_random_u32_above(E - 1);
@@
expression E;
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (I <= E);
+ I = get_random_u32_above(E);
@@
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (!I);
+ I = get_random_u32_above(0);
@@
identifier I;
@@
- do {
... when != I
- I = get_random_u32();
... when != I
- } while (I == 0);
+ I = get_random_u32_above(0);
@@
expression E;
@@
- E + 1 + get_random_u32_below(U32_MAX - E)
+ get_random_u32_above(E)
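A hypothetical before/after for one of the open-coded rejection loops:

/* before: loop until a nonzero value is drawn */
do {
	id = get_random_u32();
} while (id == 0);

/* after */
id = get_random_u32_above(0);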
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This is a simple mechanical transformation done by:
@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Jan reported that the new algorithm as merged might be problematic if
the queue being awakened becomes empty between the waitqueue_active
check inside sbq_wake_ptr and the wake up. If that happens, wake_up_nr
will not wake up any waiter and we lose too many wake ups. In order to
guarantee progress, we need to wake up at least one waiter here, if
there are any. This now requires trying to wake up from every queue.
Instead of walking through all the queues with sbq_wake_ptr, this call
moves the wake up inside that function. In a previous version of the
patch, I found that updating wake_index several times when walking
through queues had a measurable overhead. This ensures we only update
it once, at the end.
Fixes: 4f8126bb2308 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221115224553.23594-4-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When a queue is awakened, the wake_index written by sbq_wake_ptr
currently keeps pointing to the same queue. On the next wake up, it
will thus retry the same queue, which is unfair to other queues and can
lead to starvation. This patch moves the index update to happen before
the queue is returned, such that it will now try a different queue
first on the next wake up, improving fairness.
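A sketch of the reordered loop in sbq_wake_ptr(), as I understand the
change:

for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
	struct sbq_wait_state *ws = &sbq->ws[wake_index];

	/*
	 * Advance the index before checking the current queue, so the
	 * next wake up starts from the following queue.
	 */
	wake_index = sbq_index_inc(wake_index);

	if (waitqueue_active(&ws->wait)) {
		if (wake_index != atomic_read(&sbq->wake_index))
			atomic_set(&sbq->wake_index, wake_index);
		return ws;
	}
}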
Fixes: 4f8126bb2308 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20221115224553.23594-2-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A new initialization macro for linear ranges was added. Slightly
simplify the test code by using this macro for the linear range test,
and at the same time verify the macro is working as intended.
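For illustration, an initialization with the macro might look like
this, assuming the macro takes (min, min_sel, max_sel, step) in that
order (the values are hypothetical):

static const struct linear_range range = LINEAR_RANGE(20000, 0, 31, 1000);
/* equivalent to: .min = 20000, .min_sel = 0, .max_sel = 31, .step = 1000 */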
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Link: https://lore.kernel.org/r/Y3R13IRrs+x5PcZ4@dc75zzyyyyyyyyyyyyydt-3.rev.dnainternet.fi
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
1. The variable debug_objects_allocated tracks valid kmem_cache_alloc
calls, so track it in debug_objects_replace_static_objects. Do similar
things in object_cpu_offline.
2. In debug_objects_mem_init, there is no need to call
cpuhp_setup_state_nocalls when debug_objects_enabled = 0 (out of
memory).
Link: https://lkml.kernel.org/r/20220611130634.99741-1-wuchi.zero@gmail.com
Fixes: 634d61f45d6f ("debugobjects: Percpu pool lookahead freeing/allocation")
Fixes: c4b73aabd098 ("debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done")
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
With CXL security features, and CXL dynamic provisioning, global CPU
cache flushing nvdimm requirements are no longer specific to that
subsystem, even beyond the scope of security_ops. CXL will need such
semantics for features not necessarily limited to persistent memory.
The functionality this is enabling is to be able to instantaneously
secure erase potentially terabytes of memory at once and the kernel
needs to be sure that none of the data from before the erase is still
present in the cache. It is also used when unlocking a memory device
where speculative reads and firmware accesses could have cached poison
from before the device was unlocked. Lastly this facility is used when
mapping new devices, or new capacity into an established physical
address range. I.e. when the driver switches DeviceA mapping AddressX to
DeviceB mapping AddressX then any cached data from DeviceA:AddressX
needs to be invalidated.
This capability is typically only used once per-boot (for unlock), or
once per bare metal provisioning event (secure erase), like when handing
off the system to another tenant or decommissioning a device. It may
also be used for dynamic CXL region provisioning.
Users must first call cpu_cache_has_invalidate_memregion() to know
whether this functionality is available on the architecture. On x86 this
respects the constraints of when wbinvd() is tolerable. It is already
the case that wbinvd() is problematic to allow in VMs due its global
performance impact and KVM, for example, has been known to just trap and
ignore the call. With confidential computing guest execution of wbinvd()
may even trigger an exception. Given guests should not be messing with
the bare metal address map via CXL configuration changes
cpu_cache_has_invalidate_memregion() returns false in VMs.
While this global cache invalidation facility is exported to modules,
since NVDIMM and CXL support can be built as modules, it is not for
general use. The intent is that this facility is not available outside
of specific "device-memory" use cases. To make that expectation as
clear as possible, the API is scoped to a new "DEVMEM" module namespace
that only the NVDIMM and CXL subsystems are expected to import.
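Expected usage by the NVDIMM/CXL drivers, as a hedged sketch:

/* a module that uses the facility must import the namespace */
MODULE_IMPORT_NS(DEVMEM);

if (!cpu_cache_has_invalidate_memregion())
	return -ENXIO;	/* e.g. running in a VM */

/* flush all CPU caches around the address map / unlock event */
cpu_cache_invalidate_memregion(IORES_DESC_PERSISTENT_MEMORY);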
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
RAID6_USE_EMPTY_ZERO_PAGE is unused and hardcoded to 0, so let's drop it.
Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
skb->vlan_present seems redundant.
We can instead derive it from this boolean expression:
vlan_present = skb->vlan_proto != 0 || skb->vlan_tci != 0
Add a new union, to access both fields in a single load/store
when possible.
union {
u32 vlan_all;
struct {
__be16 vlan_proto;
__u16 vlan_tci;
};
};
This allows following patch to remove a conditional test in GRO stack.
Note:
We move remcsum_offload to keep TC_AT_INGRESS_MASK
and SKB_MONO_DELIVERY_TIME_MASK unchanged.
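With the union in place, the presence test can be derived from a
single 32-bit load, e.g. (the helper name here is illustrative):

static inline bool skb_vlan_present(const struct sk_buff *skb)
{
	/* nonzero iff vlan_proto != 0 or vlan_tci != 0 */
	return skb->vlan_all != 0;
}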
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sbitmap suffers from code complexity, as demonstrated by recent fixes,
and eventual lost wake ups on nested I/O completion. The latter
happens, from what I understand, due to the non-atomic nature of the
updates to wait_cnt, which needs to be subtracted and eventually reset
when equal to zero. This two-step process can miss an update when a
nested completion happens to interrupt the CPU in between the wait_cnt
updates. This is very hard to fix, as shown by the recent changes to
this code.
The code complexity arises mostly from the corner cases to avoid missed
wakes in this scenario. In addition, the handling of wake_batch
recalculation plus the synchronization with sbq_queue_wake_up is
non-trivial.
This patchset implements the idea originally proposed by Jan [1], which
removes the need for the two-step updates of wait_cnt. This is done by
tracking the number of completions and wakeups in always increasing,
per-bitmap counters. Instead of having to reset the wait_cnt when it
reaches zero, we simply keep counting, and attempt to wake up N threads
in a single wait queue whenever there is enough space for a batch.
Waking up fewer than wake_batch waiters shouldn't be a problem, because we
haven't changed the conditions for wake up, and the existing batch
calculation guarantees at least enough remaining completions to wake up
a batch for each queue at any time.
Performance-wise, one should expect very similar performance to the
original algorithm for the case where there is no queueing. In both the
old algorithm and this implementation, the first thing is to check
ws_active, which bails out if there is no queueing to be managed. In the
new code, we took care to avoid accounting completions and wakeups when
there is no queueing, to not pay the cost of atomic operations
unnecessarily, since it doesn't skew the numbers.
For more interesting cases, where there is queueing, we need to take
into account the cross-communication of the atomic operations. I've
been benchmarking by running parallel fio jobs against a single hctx
nullb in different hardware queue depth scenarios, and verifying both
IOPS and queueing.
Each experiment was repeated 5 times on a 20-CPU box, with 20 parallel
jobs. fio was issuing fixed-size randwrites with qd=64 against nullb,
varying only the hardware queue length per test.
queue size 2 4 8 16 32 64
6.1-rc2 1681.1K (1.6K) 2633.0K (12.7K) 6940.8K (16.3K) 8172.3K (617.5K) 8391.7K (367.1K) 8606.1K (351.2K)
patched 1721.8K (15.1K) 3016.7K (3.8K) 7543.0K (89.4K) 8132.5K (303.4K) 8324.2K (230.6K) 8401.8K (284.7K)
The following is a similar experiment, ran against a nullb with a single
bitmap shared by 20 hctx spread across 2 NUMA nodes. This has 40
parallel fio jobs operating on the same device:
queue size 2 4 8 16 32 64
6.1-rc2 1081.0K (2.3K) 957.2K (1.5K) 1699.1K (5.7K) 6178.2K (124.6K) 12227.9K (37.7K) 13286.6K (92.9K)
patched 1081.8K (2.8K) 1316.5K (5.4K) 2364.4K (1.8K) 6151.4K (20.0K) 11893.6K (17.5K) 12385.6K (18.4K)
It has also survived blktests and a 12h-stress run against nullb. I also
ran the code against nvme and a scsi SSD, and I didn't observe
performance regression in those. If there are other tests you think I
should run, please let me know and I will follow up with results.
[1] https://lore.kernel.org/all/aef9de29-e9f5-259a-f8be-12d1b734e72@google.com/
Cc: Hugh Dickins <hughd@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Liu Song <liusong@linux.alibaba.com>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20221105231055.25953-1-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Implement a minimal library version of AES-GCM based on the existing
library implementations of AES and multiplication in GF(2^128). Using
these primitives, GCM can be implemented in a straightforward manner.
GCM has a couple of sharp edges, i.e., the amount of input data
processed with the same initialization vector (IV) should be capped to
protect the counter from 32-bit rollover (or carry), and the size of the
authentication tag should be fixed for a given key. [0]
The former concern is addressed trivially, given that the function call
API uses 32-bit signed types for the input lengths. It is still up to
the caller to avoid IV reuse in general, but this is not something we
can police at the implementation level.
As for the latter concern, let's make the authentication tag size part
of the key schedule, and only permit it to be configured as part of the
key expansion routine.
Note that table based AES implementations are susceptible to known
plaintext timing attacks on the encryption key. The AES library already
attempts to mitigate this to some extent, but given that the counter
mode encryption used by GCM operates exclusively on known plaintext by
construction (the IV and therefore the initial counter value are known
to an attacker), let's take some extra care to mitigate this, by calling
the AES library with interrupts disabled.
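A hedged usage sketch of the resulting library interface (the exact
signatures are assumptions; the tag size being fixed at key expansion
time is per the description above):

struct aesgcm_ctx ctx;
u8 tag[16];

/* the authentication tag size becomes part of the key schedule */
if (aesgcm_expandkey(&ctx, key, AES_KEYSIZE_128, sizeof(tag)))
	return -EINVAL;

aesgcm_encrypt(&ctx, dst, src, len, assoc, assoc_len, iv, tag);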
[0] https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-38d.pdf
Link: https://lore.kernel.org/all/c6fb9b25-a4b6-2e4a-2dd1-63adda055a49@amd.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The gf128mul library has different variants with different
memory/performance tradeoffs, where the faster ones use 4k or 64k lookup
tables precomputed at runtime, which are based on one of the
multiplication factors, which is commonly the key for keyed hash
algorithms such as GHASH.
The slowest variant is gf128_mul_lle() [and its bbe/ble counterparts],
which does not use precomputed lookup tables, but it still relies on a
single u16[256] lookup table which is input independent. The use of such
a table may cause the execution time of gf128_mul_lle() to correlate
with the value of the inputs, which is generally something that must be
avoided for cryptographic algorithms. On top of that, the function uses
a sequence of if () statements that conditionally invoke be128_xor()
based on which bits are set in the second argument of the function,
which is usually a pointer to the multiplication factor that represents
the key.
In order to remove the correlation between the execution time of
gf128_mul_lle() and the value of its inputs, let's address the
identified shortcomings:
- add a time invariant version of gf128mul_x8_lle() that replaces the
table lookup with the expression that is used at compile time to
populate the lookup table;
- make the invocations of be128_xor() unconditional, but pass a zero
vector as the third argument if the associated bit in the key is
cleared.
The resulting code is likely to be significantly slower. However, given
that this is the slowest version already, making it even slower in order
to make it more secure is assumed to be justified.
The bbe and ble counterparts could receive the same treatment, but the
former is never used anywhere in the kernel, and the latter is only
used in the driver for an asynchronous crypto h/w accelerator
(Chelsio), where timing variances are unlikely to matter.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The gf128mul library does not depend on the crypto API at all, so it can
be moved into lib/crypto. This will allow us to use it in other library
code in a subsequent patch without having to depend on CONFIG_CRYPTO.
While at it, change the Kconfig symbol name to align with other crypto
library implementations. However, the source file name is retained, as
it is reflected in the module .ko filename, and changing this might
break things for users.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
There are spelling mistakes in config show text. Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220928211637.62529-1-colin.i.king@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
strlen() may go too far when estimating the length of the given string.
In some cases it may go over the boundary and crash the system, which
was the case according to commit 13a55372b64e ("ARM: orion5x: Revert
commit 4904dbda41c8.").
Rectify this by switching to strnlen() with the expected maximum
length of the string.
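The pattern, sketched (the bound name is hypothetical):

/* before: an unbounded scan may run off the end of the buffer */
len = strlen(id);

/* after: never read past the expected maximum */
len = strnlen(id, ID_MAX_LEN);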
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221108141108.62974-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Consecutive zone device pages should not be merged into the same sgl
or bvec segment with other types of pages or if they belong to
different pgmaps. Otherwise, getting the pgmap of a given segment is
not possible without scanning the entire segment. This helper returns
true if either both pages are not zone device pages, or both are zone
device pages with the same pgmap.
Factor out the check for page mergeability into a pages_are_mergable()
helper and add a check with zone_device_pages_are_mergeable().
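The helper's semantics, per the description above (a sketch; the exact
implementation may differ):

static inline bool zone_device_pages_are_mergeable(const struct page *a,
						   const struct page *b)
{
	if (is_zone_device_page(a) != is_zone_device_page(b))
		return false;
	if (!is_zone_device_page(a))
		return true;	/* neither page is a zone device page */
	return a->pgmap == b->pgmap;	/* must share a pgmap */
}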
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20221021174116.7200-6-logang@deltatee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add iov_iter_get_pages_flags() and iov_iter_get_pages_alloc_flags()
which take a flags argument that is passed to get_user_pages_fast().
This is so that FOLL_PCI_P2PDMA can be passed when appropriate.
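The new entry points, roughly (the trailing gup_flags parameter is my
reading of the description; everything else mirrors the existing
iov_iter_get_pages() variants):

ssize_t iov_iter_get_pages_flags(struct iov_iter *i, struct page **pages,
				 size_t maxsize, unsigned int maxpages,
				 size_t *start, unsigned int gup_flags);
ssize_t iov_iter_get_pages_alloc_flags(struct iov_iter *i,
				       struct page ***pages, size_t maxsize,
				       size_t *start, unsigned int gup_flags);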
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221021174116.7200-4-logang@deltatee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Switch KUnit-compatible KASAN tests from using per-task KUnit resources to
console tracepoints.
This allows for two things:
1. Migrating tests that trigger a KASAN report in the context of a task
other than current to KUnit framework.
This is implemented in the patches that follow.
2. Parsing and matching the contents of KASAN reports.
This is not yet implemented.
Link: https://lkml.kernel.org/r/9345acdd11e953b207b0ed4724ff780e63afeb36.1664298455.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In RCU mode, the node limits were being updated to the last pivot,
which may not be correct and would cause the metadata to be set when
it shouldn't be. Fix this by not setting a new limit in this case.
Link: https://lkml.kernel.org/r/20221107163857.867377-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
It is possible to confuse the depth tracking in the maple state by
searching the same node for values. Fix the depth tracking by moving
where the depth is incremented closer to where the node changes level.
Also change the initial depth setting when using the root node.
Link: https://lkml.kernel.org/r/20221107163814.866612-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
As pointed out by Peter Zijlstra, __msan_poison_alloca() does not play
well with IRQ code when PREEMPT_RT is on, because in that mode even
GFP_ATOMIC allocations cannot be performed.
Fixing this would require making stackdepot completely lockless, which is
quite challenging and may be excessive for the time being.
Instead, make sure KMSAN is incompatible with PREEMPT_RT, like other debug
configs are.
Link: https://lkml.kernel.org/r/20221102110611.1085175-4-glider@google.com
Link: https://lore.kernel.org/lkml/20221025221755.3810809-1-glider@google.com/
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
As pointed out by Masahiro Yamada, Kconfig picks up the first default
entry which has true 'if' condition. Hence, the previously added check
for KMSAN was never used, because it followed the checks for 64BIT and
!64BIT.
Put KMSAN check before others to ensure it is always applied.
Link: https://lkml.kernel.org/r/20221102110611.1085175-3-glider@google.com
Link: https://github.com/google/kmsan/issues/89
Link: https://lore.kernel.org/linux-mm/20221024212144.2852069-3-glider@google.com/
Fixes: 921757bc9b61 ("Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default")
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
During the development cycle, the testing code's support for
module/in-kernel compiles was removed. Restore this functionality by
moving any internal API tests to the userspace side, as well as the
threading tests. Fix the lockdep issues and add a way to reduce memory
usage so the tests can complete with KASAN + memleak detection. Make
the tests work on 32 bit hosts where possible and detect 32 bit hosts
in the radix test suite.
[akpm@linux-foundation.org: fix module export]
[akpm@linux-foundation.org: fix it some more]
[liam.howlett@oracle.com: fix compile warnings on 32bit build in check_find()]
Link: https://lkml.kernel.org/r/20221107203816.1260327-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20221028180415.3074673-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
clang-analyzer reported some Dead Stores in mas_anode_descend(). Upon
inspection, a few cleanups would make the code cleaner:
The count variable was set from the mt_slots array and then updated but
never used again. Just use the array reference directly.
Also stop updating the type since it isn't used after the update.
Stop setting the gaps pointer to NULL at the start since it is always
set before the loop begins.
Link: https://lkml.kernel.org/r/20221026151413.4032730-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There is a more direct and cleaner way of implementing the same functional
code. Remove the confusing and unnecessary use of pointers here.
Link: https://lkml.kernel.org/r/20221026151241.4031117-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|