summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-05-12riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. RISC-V uses generic per-cpu implementation where the offsets for CPUs are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores the address of the task_struct in TP register. The first element in task_struct is struct thread_info, and we can get the cpu number by reading from the TP register + offsetof(struct thread_info, cpu). Once we have the cpu number in a register we read the offset for that cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this offset to the destination register. To measure the improvement from this change, the benchmark in [1] was used on Qemu: Before: glob-arr-inc : 1.127 ± 0.013M/s arr-inc : 1.121 ± 0.004M/s hash-inc : 0.681 ± 0.052M/s After: glob-arr-inc : 1.138 ± 0.011M/s arr-inc : 1.366 ± 0.006M/s hash-inc : 0.676 ± 0.001M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12ARC: Add eBPF JIT supportShahab Vahedi
This will add eBPF JIT support to the 32-bit ARCv2 processors. The implementation is qualified by running the BPF tests on a Synopsys HSDK board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU. The test_bpf.ko reports 2-10 fold improvements in execution time of its tests. For instance: test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS test_bpf: #33 tcpdump port 22 jited:1 120 224 260 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 23 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS Deployment and structure ------------------------ The related codes are added to "arch/arc/net": - bpf_jit.h -- The interface that a back-end translator must provide - bpf_jit_core.c -- Knows how to handle the input eBPF byte stream - bpf_jit_arcv2.c -- The back-end code that knows the translation logic The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance to the whole process. Normally, the translation is done in one pass, namely the "normal pass". In case some relocations are not known during this pass, some data (arc_jit_data) is allocated for the next pass to come. This possible next (and last) pass is called the "extra pass". 1. Normal pass # The necessary pass 1a. Dry run # Get the whole JIT length, epilogue offset, etc. 1b. Emit phase # Allocate memory and start emitting instructions 2. Extra pass # Only needed if there are relocations to be fixed 2a. Patch relocations Support status -------------- The JIT compiler supports BPF instructions up to "cpu=v4". However, it does not yet provide support for: - Tail calls - Atomic operations - 64-bit division/remainder - BPF_PROBE_MEM* (exception table) The result of "test_bpf" test suite on an HSDK board is: hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] All the failing test cases are due to the ones that were not JIT'ed. Categorically, they can be represented as: .-----------.------------.-------------. | test type | opcodes | # of cases | |-----------+------------+-------------| | atomic | 0xC3, 0xDB | 149 | | div64 | 0x37, 0x3F | 22 | | mod64 | 0x97, 0x9F | 15 | `-----------^------------+-------------| | (total) 186 | `-------------' Setup: build config ------------------- The following configs must be set to have a working JIT test: CONFIG_BPF_JIT=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_TEST_BPF=m The following options are not necessary for the tests module, but are good to have: CONFIG_DEBUG_INFO=y # prerequisite for below CONFIG_DEBUG_INFO_BTF=y # so bpftool can generate vmlinux.h CONFIG_FTRACE=y # CONFIG_BPF_SYSCALL=y # all these options lead to CONFIG_KPROBE_EVENTS=y # having CONFIG_BPF_EVENTS=y CONFIG_PERF_EVENTS=y # Some BPF programs provide data through /sys/kernel/debug: CONFIG_DEBUG_FS=y arc# mount -t debugfs debugfs /sys/kernel/debug Setup: elfutils --------------- The libdw.{so,a} library that is used by pahole for processing the final binary must come from elfutils 0.189 or newer. The support for ARCv2 [1] has been added since that version. [1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7 Setup: pahole ------------- The line below in linux/scripts/Makefile.btf must be commented out: pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats Or else, the build will fail: $ make V=1 ... BTF .btf.vmlinux.bin.o pahole -J --btf_gen_floats \ -j --lang_exclude=rust \ --skip_encoding_btf_inconsistent_proto \ --btf_gen_optimized .tmp_vmlinux.btf Complex, interval and imaginary float types are not supported Encountered error while encoding BTF. ... BTFIDS vmlinux ./tools/bpf/resolve_btfids/resolve_btfids vmlinux libbpf: failed to find '.BTF' ELF section in vmlinux FAILED: load BTF from vmlinux: No data available This is due to the fact that the ARC toolchains generate "complex float" DIE entries in libgcc and at the moment, pahole can't handle such entries. Running the tests ----------------- host$ scp /bld/linux/lib/test_bpf.ko arc: arc # sysctl net.core.bpf_jit_enable=1 arc # insmod test_bpf.ko test_suite=test_bpf ... test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] Acknowledgments --------------- - Claudiu Zissulescu for his unwavering support - Yuriy Kolerov for testing and troubleshooting - Vladimir Isaev for the pahole workaround - Sergey Matyukevich for paving the road by adding the interpreter support Signed-off-by: Shahab Vahedi <shahab@synopsys.com> Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12hwmon: (nzxt-kraken3) Bail out for unsupported device variantsGuenter Roeck
Dan Carpenter reports: Commit cbeb479ff4cd ("hwmon: (nzxt-kraken3) Decouple device names from kinds") from Apr 28, 2024 (linux-next), leads to the following Smatch static checker warning: drivers/hwmon/nzxt-kraken3.c:957 kraken3_probe() error: uninitialized symbol 'device_name'. Indeed, 'device_name' will be uninitizalized if an unknown product is encountered. In practice this should not matter because the driver should not instantiate on unknown products, but lets play safe and bail out if that happens. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-hwmon/b1738c50-db42-40f0-a899-9c027c131ffb@moroto.mountain/ Cc: Jonas Malaco <jonas@protocubo.io> Cc: Aleksa Savic <savicaleksa83@gmail.com> Fixes: cbeb479ff4cd ("hwmon: (nzxt-kraken3) Decouple device names from kinds") Acked-by: Jonas Malaco <jonas@protocubo.io> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-05-12ksmbd: avoid to send duplicate oplock break notificationsNamjae Jeon
This patch fixes generic/011 when oplocks is enable. Avoid to send duplicate oplock break notifications like smb2 leases case. Fixes: 97c2ec64667b ("ksmbd: avoid to send duplicate lease break notifications") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-12Linux 6.9v6.9Linus Torvalds
2024-05-12jffs2: Fix potential illegal address access in jffs2_free_inodeWang Yong
During the stress testing of the jffs2 file system,the following abnormal printouts were found: [ 2430.649000] Unable to handle kernel paging request at virtual address 0069696969696948 [ 2430.649622] Mem abort info: [ 2430.649829] ESR = 0x96000004 [ 2430.650115] EC = 0x25: DABT (current EL), IL = 32 bits [ 2430.650564] SET = 0, FnV = 0 [ 2430.650795] EA = 0, S1PTW = 0 [ 2430.651032] FSC = 0x04: level 0 translation fault [ 2430.651446] Data abort info: [ 2430.651683] ISV = 0, ISS = 0x00000004 [ 2430.652001] CM = 0, WnR = 0 [ 2430.652558] [0069696969696948] address between user and kernel address ranges [ 2430.653265] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 2430.654512] CPU: 2 PID: 20919 Comm: cat Not tainted 5.15.25-g512f31242bf6 #33 [ 2430.655008] Hardware name: linux,dummy-virt (DT) [ 2430.655517] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2430.656142] pc : kfree+0x78/0x348 [ 2430.656630] lr : jffs2_free_inode+0x24/0x48 [ 2430.657051] sp : ffff800009eebd10 [ 2430.657355] x29: ffff800009eebd10 x28: 0000000000000001 x27: 0000000000000000 [ 2430.658327] x26: ffff000038f09d80 x25: 0080000000000000 x24: ffff800009d38000 [ 2430.658919] x23: 5a5a5a5a5a5a5a5a x22: ffff000038f09d80 x21: ffff8000084f0d14 [ 2430.659434] x20: ffff0000bf9a6ac0 x19: 0169696969696940 x18: 0000000000000000 [ 2430.659969] x17: ffff8000b6506000 x16: ffff800009eec000 x15: 0000000000004000 [ 2430.660637] x14: 0000000000000000 x13: 00000001000820a1 x12: 00000000000d1b19 [ 2430.661345] x11: 0004000800000000 x10: 0000000000000001 x9 : ffff8000084f0d14 [ 2430.662025] x8 : ffff0000bf9a6b40 x7 : ffff0000bf9a6b48 x6 : 0000000003470302 [ 2430.662695] x5 : ffff00002e41dcc0 x4 : ffff0000bf9aa3b0 x3 : 0000000003470342 [ 2430.663486] x2 : 0000000000000000 x1 : ffff8000084f0d14 x0 : fffffc0000000000 [ 2430.664217] Call trace: [ 2430.664528] kfree+0x78/0x348 [ 2430.664855] jffs2_free_inode+0x24/0x48 [ 2430.665233] i_callback+0x24/0x50 [ 2430.665528] rcu_do_batch+0x1ac/0x448 [ 2430.665892] rcu_core+0x28c/0x3c8 [ 2430.666151] rcu_core_si+0x18/0x28 [ 2430.666473] __do_softirq+0x138/0x3cc [ 2430.666781] irq_exit+0xf0/0x110 [ 2430.667065] handle_domain_irq+0x6c/0x98 [ 2430.667447] gic_handle_irq+0xac/0xe8 [ 2430.667739] call_on_irq_stack+0x28/0x54 The parameter passed to kfree was 5a5a5a5a, which corresponds to the target field of the jffs_inode_info structure. It was found that all variables in the jffs_inode_info structure were 5a5a5a5a, except for the first member sem. It is suspected that these variables are not initialized because they were set to 5a5a5a5a during memory testing, which is meant to detect uninitialized memory.The sem variable is initialized in the function jffs2_i_init_once, while other members are initialized in the function jffs2_init_inode_info. The function jffs2_init_inode_info is called after iget_locked, but in the iget_locked function, the destroy_inode process is triggered, which releases the inode and consequently, the target member of the inode is not initialized.In concurrent high pressure scenarios, iget_locked may enter the destroy_inode branch as described in the code. Since the destroy_inode functionality of jffs2 only releases the target, the fix method is to set target to NULL in jffs2_i_init_once. Signed-off-by: Wang Yong <wang.yong12@zte.com.cn> Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn> Reviewed-by: Yang Tao <yang.tao172@zte.com.cn> Cc: Xu Xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-05-12jffs2: Simplify the allocation of slab cachesKunwu Chan
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. And change cache name from 'jffs2_tmp_dnode' to 'jffs2_tmp_dnode_info'. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-05-12jffs2: nodemgmt: fix kernel-doc commentsRandy Dunlap
Update the end of one sentence where a comment was truncated. (dwmw2) Fix a bunch of kernel-doc warnings: nodemgmt.c:72: warning: Function parameter or member 'sumsize' not described in 'jffs2_do_reserve_space' nodemgmt.c:72: warning: expecting prototype for jffs2_reserve_space(). Prototype was for jffs2_do_reserve_space() instead nodemgmt.c:76: warning: Function parameter or member 'sumsize' not described in 'jffs2_reserve_space' nodemgmt.c:76: warning: No description found for return value of 'jffs2_reserve_space' nodemgmt.c:503: warning: Function parameter or member 'ofs' not described in 'jffs2_add_physical_node_ref' nodemgmt.c:503: warning: Function parameter or member 'ic' not described in 'jffs2_add_physical_node_ref' nodemgmt.c:503: warning: Excess function parameter 'new' description in 'jffs2_add_physical_node_ref' nodemgmt.c:503: warning: No description found for return value of 'jffs2_add_physical_node_ref' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: linux-mtd@lists.infradead.org Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-05-12jffs2: print symbolic error name instead of error codeChristian Heusel
Utilize the %pe print specifier to get the symbolic error name as a string (i.e "-ENOMEM") in the log message instead of the error code to increase its readablility. This change was suggested in https://lore.kernel.org/all/92972476-0b1f-4d0a-9951-af3fc8bc6e65@suswa.mountain/ Signed-off-by: Christian Heusel <christian@heusel.eu> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-05-12Merge tag 'kselftest-fix-vfork-2024-05-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull Kselftest fixes from Mickaël Salaün: "Fix Kselftest's vfork() side effects. As reported by Kernel Test Robot and Sean Christopherson, some tests fail since v6.9-rc1 . This is due to the use of vfork() which introduced some side effects. Similarly, while making it more generic, a previous commit made some Landlock file system tests flaky, and subject to the host's file system mount configuration. This fixes all these side effects by replacing vfork() with clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and makes the Kselftest framework more robust" Link: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com Link: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com Link: https://lore.kernel.org/r/20240511171445.904356-1-mic@digikod.net * tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/harness: Handle TEST_F()'s explicit exit codes selftests/harness: Fix vfork() side effects selftests/harness: Share _metadata between forked processes selftests/pidfd: Fix wrong expectation selftests/harness: Constify fixture variants selftests/landlock: Do not allocate memory in fixture data selftests/harness: Fix interleaved scheduling leading to race conditions selftests/harness: Fix fixture teardown selftests/landlock: Fix FS tests when run on a private mount point selftests/pidfd: Fix config for pidfd_setns_test
2024-05-12Merge tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fix from Paolo Bonzini: - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for /dev/kvm * tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M
2024-05-12Merge tag 'edac_urgent_for_v6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Fix a race condition when clearing error count bits and toggling the error interrupt throug the same register, in synopsys_edac * tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/synopsys: Fix ECC status and IRQ control race condition
2024-05-12hwmon: (emc1403) Add support for EMC1428 and EMC1438.Lars Petter Mostad
EMC1428 and EMC1438 are similar to EMC14xx, but have eight temperature channels, as well as signed data and limit registers. Chips currently supported by this driver have unsigned registers only. Signed-off-by: Lars Petter Mostad <larspm@gmail.com> Link: https://lore.kernel.org/r/20240510142824.824332-1-lars.petter.mostad@appear.net Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-05-12Merge tag 'x86_urgent_for_v6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new PCI ID which belongs to a new AMD CPU family 0x1a - Ensure that that last level cache ID is set in all cases, in the AMD CPU topology parsing code, in order to prevent invalid scheduling domain CPU masks * tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/amd: Ensure that LLC ID is initialized x86/amd_nb: Add new PCI IDs for AMD family 0x1a
2024-05-12RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siwZhu Yanjun
When running blktests nvme/rdma, the following kmemleak issue will appear. kmemleak: Kernel memory leak detector initialized (mempool available:36041) kmemleak: Automatic memory scanning thread started kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88855da53400 (size 192): comm "rdma", pid 10630, jiffies 4296575922 hex dump (first 32 bytes): 37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 7............... 10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff .4.].....4.].... backtrace (crc 47f66721): [<ffffffff911251bd>] kmalloc_trace+0x30d/0x3b0 [<ffffffffc2640ff7>] alloc_gid_entry+0x47/0x380 [ib_core] [<ffffffffc2642206>] add_modify_gid+0x166/0x930 [ib_core] [<ffffffffc2643468>] ib_cache_update.part.0+0x6d8/0x910 [ib_core] [<ffffffffc2644e1a>] ib_cache_setup_one+0x24a/0x350 [ib_core] [<ffffffffc263949e>] ib_register_device+0x9e/0x3a0 [ib_core] [<ffffffffc2a3d389>] 0xffffffffc2a3d389 [<ffffffffc2688cd8>] nldev_newlink+0x2b8/0x520 [ib_core] [<ffffffffc2645fe3>] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core] [<ffffffffc264648c>] rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core] [<ffffffff9270e7b5>] netlink_unicast+0x445/0x710 [<ffffffff9270f1f1>] netlink_sendmsg+0x761/0xc40 [<ffffffff9249db29>] __sys_sendto+0x3a9/0x420 [<ffffffff9249dc8c>] __x64_sys_sendto+0xdc/0x1b0 [<ffffffff92db0ad3>] do_syscall_64+0x93/0x180 [<ffffffff92e00126>] entry_SYSCALL_64_after_hwframe+0x71/0x79 The root cause: rdma_put_gid_attr is not called when sgid_attr is set to ERR_PTR(-ENODEV). Reported-and-tested-by: Yi Zhang <yi.zhang@redhat.com> Closes: https://lore.kernel.org/all/19bf5745-1b3b-4b8a-81c2-20d945943aaf@linux.dev/T/ Fixes: f8ef1be816bf ("RDMA/cma: Avoid GID lookups on iWARP devices") Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20240510211247.31345-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-12ALSA: scarlett2: Increase mixer range to +12dBGeoffrey D. Bennett
The values loaded into the mixer are 16-bit values, with 8192 representing 0dB, going up to a current maximum of 16345 (+6dB). All supported interfaces have no problem going up to 32612 (+12dB), so update SCARLETT2_MIXER_MAX_DB and scarlett2_mixer_values[] to allow for this. Tested with: - Scarlett 2nd Gen 6i6, 18i8, 18i20 - Scarlett 3rd Gen 4i4, 8i6, 18i8, 18i20 - Scarlett 4th Gen Solo, 2i2, 4i4 - Clarett+ 2Pre, 4Pre, 8Pre - Vocaster One and Two Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/Zj+gYT4F2XeKTD93@m.b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-12ALSA: scarlett2: Add S/PDIF source selection controlsGeoffrey D. Bennett
Add S/PDIF Source/Digital I/O Mode selection controls for the Scarlett 3rd Gen 18i8/18i20 and Clarett 4Pre/8Pre interfaces. These models have both coax S/PDIF and optical inputs, and the optical inputs are switchable between being used as S/PDIF and ADAT inputs. The Scarlett 3rd Gen 18i20 also has a "Dual ADAT" mode for 8-channel audio at 88.2/96kHz. Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/Zj8zCTjzPsTDENN+@m.b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-12RDMA/IPoIB: Fix format truncation compilation errorsLeon Romanovsky
Truncate the device name to store IPoIB VLAN name. [leonro@5b4e8fba4ddd kernel]$ make -s -j 20 allmodconfig [leonro@5b4e8fba4ddd kernel]$ make -s -j 20 W=1 drivers/infiniband/ulp/ipoib/ drivers/infiniband/ulp/ipoib/ipoib_vlan.c: In function ‘ipoib_vlan_add’: drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:52: error: ‘%04x’ directive output may be truncated writing 4 bytes into a region of size between 0 and 15 [-Werror=format-truncation=] 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~ drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:48: note: directive argument in the range [0, 65535] 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~~~~~~ drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:9: note: ‘snprintf’ output between 6 and 21 bytes into a destination of size 16 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 188 | ppriv->dev->name, pkey); | ~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make[6]: *** [scripts/Makefile.build:244: drivers/infiniband/ulp/ipoib/ipoib_vlan.o] Error 1 make[6]: *** Waiting for unfinished jobs.... Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support") Link: https://lore.kernel.org/r/e9d3e1fef69df4c9beaf402cc3ac342bad680791.1715240029.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2024-05-12KVM: SVM: Add module parameter to enable SEV-SNPBrijesh Singh
Add a module parameter than can be used to enable or disable the SEV-SNP feature. Now that KVM contains the support for the SNP set the GHCB hypervisor feature flag to indicate that SNP is supported. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-18-michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNPAshish Kalra
With SNP/guest_memfd, private/encrypted memory should not be mappable, and MMU notifications for HVA-mapped memory will only be relevant to unencrypted guest memory. Therefore, the rationale behind issuing a wbinvd_on_all_cpus() in sev_guest_memory_reclaimed() should not apply for SNP guests and can be ignored. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> [mdr: Add some clarifications in commit] Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-17-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: x86: Implement hook for determining max NPT mapping levelMichael Roth
In the case of SEV-SNP, whether or not a 2MB page can be mapped via a 2MB mapping in the guest's nested page table depends on whether or not any subpages within the range have already been initialized as private in the RMP table. The existing mixed-attribute tracking in KVM is insufficient here, for instance: - gmem allocates 2MB page - guest issues PVALIDATE on 2MB page - guest later converts a subpage to shared - SNP host code issues PSMASH to split 2MB RMP mapping to 4K - KVM MMU splits NPT mapping to 4K - guest later converts that shared page back to private At this point there are no mixed attributes, and KVM would normally allow for 2MB NPT mappings again, but this is actually not allowed because the RMP table mappings are 4K and cannot be promoted on the hypervisor side, so the NPT mappings must still be limited to 4K to match this. Implement a kvm_x86_ops.private_max_mapping_level() hook for SEV that checks for this condition and adjusts the mapping level accordingly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-16-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Implement gmem hook for invalidating private pagesMichael Roth
Implement a platform hook to do the work of restoring the direct map entries of gmem-managed pages and transitioning the corresponding RMP table entries back to the default shared/hypervisor-owned state. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-15-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Implement gmem hook for initializing private pagesMichael Roth
This will handle the RMP table updates needed to put a page into a private state before mapping it into an SEV-SNP guest. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-14-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Support SEV-SNP AP Creation NAE eventTom Lendacky
Add support for the SEV-SNP AP Creation NAE event. This allows SEV-SNP guests to alter the register state of the APs on their own. This allows the guest a way of simulating INIT-SIPI. A new event, KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, is created and used so as to avoid updating the VMSA pointer while the vCPU is running. For CREATE The guest supplies the GPA of the VMSA to be used for the vCPU with the specified APIC ID. The GPA is saved in the svm struct of the target vCPU, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added to the vCPU and then the vCPU is kicked. For CREATE_ON_INIT: The guest supplies the GPA of the VMSA to be used for the vCPU with the specified APIC ID the next time an INIT is performed. The GPA is saved in the svm struct of the target vCPU. For DESTROY: The guest indicates it wishes to stop the vCPU. The GPA is cleared from the svm struct, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added to vCPU and then the vCPU is kicked. The KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event handler will be invoked as a result of the event or as a result of an INIT. If a new VMSA is to be installed, the VMSA guest page is set as the VMSA in the vCPU VMCB and the vCPU state is set to KVM_MP_STATE_RUNNABLE. If a new VMSA is not to be installed, the VMSA is cleared in the vCPU VMCB and the vCPU state is set to KVM_MP_STATE_HALTED to prevent it from being run. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-13-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add support to handle RMP nested page faultsBrijesh Singh
When SEV-SNP is enabled in the guest, the hardware places restrictions on all memory accesses based on the contents of the RMP table. When hardware encounters RMP check failure caused by the guest memory access it raises the #NPF. The error code contains additional information on the access type. See the APM volume 2 for additional information. When using gmem, RMP faults resulting from mismatches between the state in the RMP table vs. what the guest expects via its page table result in KVM_EXIT_MEMORY_FAULTs being forwarded to userspace to handle. This means the only expected case that needs to be handled in the kernel is when the page size of the entry in the RMP table is larger than the mapping in the nested page table, in which case a PSMASH instruction needs to be issued to split the large RMP entry into individual 4K entries so that subsequent accesses can succeed. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-12-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add support to handle Page State Change VMGEXITMichael Roth
SEV-SNP VMs can ask the hypervisor to change the page state in the RMP table to be private or shared using the Page State Change NAE event as defined in the GHCB specification version 2. Forward these requests to userspace as KVM_EXIT_VMGEXITs, similar to how it is done for requests that don't use a GHCB page. As with the MSR-based page-state changes, use the existing KVM_HC_MAP_GPA_RANGE hypercall format to deliver these requests to userspace via KVM_EXIT_HYPERCALL. Signed-off-by: Michael Roth <michael.roth@amd.com> Co-developed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-11-michael.roth@amd.com> Co-developed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add support to handle MSR based Page State Change VMGEXITMichael Roth
SEV-SNP VMs can ask the hypervisor to change the page state in the RMP table to be private or shared using the Page State Change MSR protocol as defined in the GHCB specification. When using gmem, private/shared memory is allocated through separate pools, and KVM relies on userspace issuing a KVM_SET_MEMORY_ATTRIBUTES KVM ioctl to tell the KVM MMU whether or not a particular GFN should be backed by private memory or not. Forward these page state change requests to userspace so that it can issue the expected KVM ioctls. The KVM MMU will handle updating the RMP entries when it is ready to map a private page into a guest. Use the existing KVM_HC_MAP_GPA_RANGE hypercall format to deliver these requests to userspace via KVM_EXIT_HYPERCALL. Signed-off-by: Michael Roth <michael.roth@amd.com> Co-developed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-10-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add support to handle GHCB GPA register VMGEXITBrijesh Singh
SEV-SNP guests are required to perform a GHCB GPA registration. Before using a GHCB GPA for a vCPU the first time, a guest must register the vCPU GHCB GPA. If hypervisor can work with the guest requested GPA then it must respond back with the same GPA otherwise return -1. On VMEXIT, verify that the GHCB GPA matches with the registered value. If a mismatch is detected, then abort the guest. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-9-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH commandBrijesh Singh
Add a KVM_SEV_SNP_LAUNCH_FINISH command to finalize the cryptographic launch digest which stores the measurement of the guest at launch time. Also extend the existing SNP firmware data structures to support disabling the use of Versioned Chip Endorsement Keys (VCEK) by guests as part of this command. While finalizing the launch flow, the code also issues the LAUNCH_UPDATE SNP firmware commands to encrypt/measure the initial VMSA pages for each configured vCPU, which requires setting the RMP entries for those pages to private, so also add handling to clean up the RMP entries for these pages whening freeing vCPUs during shutdown. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Harald Hoyer <harald@profian.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-8-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE commandBrijesh Singh
A key aspect of a launching an SNP guest is initializing it with a known/measured payload which is then encrypted into guest memory as pre-validated private pages and then measured into the cryptographic launch context created with KVM_SEV_SNP_LAUNCH_START so that the guest can attest itself after booting. Since all private pages are provided by guest_memfd, make use of the kvm_gmem_populate() interface to handle this. The general flow is that guest_memfd will handle allocating the pages associated with the GPA ranges being initialized by each particular call of KVM_SEV_SNP_LAUNCH_UPDATE, copying data from userspace into those pages, and then the post_populate callback will do the work of setting the RMP entries for these pages to private and issuing the SNP firmware calls to encrypt/measure them. For more information see the SEV-SNP specification. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-7-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START commandBrijesh Singh
KVM_SEV_SNP_LAUNCH_START begins the launch process for an SEV-SNP guest. The command initializes a cryptographic digest context used to construct the measurement of the guest. Other commands can then at that point be used to load/encrypt data into the guest's initial launch image. For more information see the SEV-SNP specification. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-ID: <20240501085210.2213060-6-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Add initial SEV-SNP supportBrijesh Singh
SEV-SNP builds upon existing SEV and SEV-ES functionality while adding new hardware-based security protection. SEV-SNP adds strong memory encryption and integrity protection to help prevent malicious hypervisor-based attacks such as data replay, memory re-mapping, and more, to create an isolated execution environment. Define a new KVM_X86_SNP_VM type which makes use of these capabilities and extend the KVM_SEV_INIT2 ioctl to support it. Also add a basic helper to check whether SNP is enabled and set PFERR_PRIVATE_ACCESS for private #NPFs so they are handled appropriately by KVM MMU. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20240501085210.2213060-5-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=yMichael Roth
SEV-SNP relies on private memory support to run guests, so make sure to enable that support via the CONFIG_KVM_GENERIC_PRIVATE_MEM config option. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501085210.2213060-4-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12KVM: MMU: Disable fast path if KVM_EXIT_MEMORY_FAULT is neededMichael Roth
For hardware-protected VMs like SEV-SNP guests, certain conditions like attempting to perform a write to a page which is not in the state that the guest expects it to be in can result in a nested/extended #PF which can only be satisfied by the host performing an implicit page state change to transition the page into the expected shared/private state. This is generally handled by generating a KVM_EXIT_MEMORY_FAULT event that gets forwarded to userspace to handle via KVM_SET_MEMORY_ATTRIBUTES. However, the fast_page_fault() code might misconstrue this situation as being the result of a write-protected access, and treat it as a spurious case when it sees that writes are already allowed for the sPTE. This results in the KVM MMU trying to resume the guest rather than taking any action to satisfy the real source of the #PF such as generating a KVM_EXIT_MEMORY_FAULT, resulting in the guest spinning on nested #PFs. Check for this condition and bail out of the fast path if it is detected. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Cc: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12Merge branch 'kvm-coco-hooks' into HEADPaolo Bonzini
Common patches for the target-independent functionality and hooks that are needed by SEV-SNP and TDX.
2024-05-12Merge tag 'kvm-x86-misc-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 misc changes for 6.10: - Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which is unused by hardware, so that KVM can communicate its inability to map GPAs that set bits 51:48 due to lack of 5-level paging. Guest firmware is expected to use the information to safely remap BARs in the uppermost GPA space, i.e to avoid placing a BAR at a legal, but unmappable, GPA. - Use vfree() instead of kvfree() for allocations that always use vcalloc() or __vcalloc(). - Don't completely ignore same-value writes to immutable feature MSRs, as doing so results in KVM failing to reject accesses to MSR that aren't supposed to exist given the vCPU model and/or KVM configuration. - Don't mark APICv as being inhibited due to ABSENT if APICv is disabled KVM-wide to avoid confusing debuggers (KVM will never bother clearing the ABSENT inhibit, even if userspace enables in-kernel local APIC).
2024-05-12Merge tag 'kvm-x86-mmu-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 MMU changes for 6.10: - Process TDP MMU SPTEs that are are zapped while holding mmu_lock for read after replacing REMOVED_SPTE with '0' and flushing remote TLBs, which allows vCPU tasks to repopulate the zapped region while the zapper finishes tearing down the old, defunct page tables. - Fix a longstanding, likely benign-in-practice race where KVM could fail to detect a write from kvm_mmu_track_write() to a shadowed GPTE if the GPTE is first page table being shadowed.
2024-05-12Merge tag 'kvm-x86-selftests_utils-6.10' of https://github.com/kvm-x86/linux ↵Paolo Bonzini
into HEAD KVM selftests treewide updates for 6.10: - Define _GNU_SOURCE for all selftests to fix a warning that was introduced by a change to kselftest_harness.h late in the 6.9 cycle, and because forcing every test to #define _GNU_SOURCE is painful. - Provide a global psuedo-RNG instance for all tests, so that library code can generate random, but determinstic numbers. - Use the global pRNG to randomly force emulation of select writes from guest code on x86, e.g. to help validate KVM's emulation of locked accesses. - Rename kvm_util_base.h back to kvm_util.h, as the weird layer of indirection was added purely to avoid manually #including ucall_common.h in a handful of locations. - Allocate and initialize x86's GDT, IDT, TSS, segments, and default exception handlers at VM creation, instead of forcing tests to manually trigger the related setup.
2024-05-12Merge tag 'kvm-x86-vmx-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM VMX changes for 6.10: - Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig VM-Exit to L1, as per the SDM. - Move kvm_vcpu_arch's exit_qualification into x86_exception, as the field is used only when synthesizing nested EPT violation, i.e. it's not the vCPU's "real" exit_qualification, which is tracked elsewhere. - Add a sanity check to assert that EPT Violations are the only sources of nested PML Full VM-Exits.
2024-05-12Merge tag 'kvm-x86-selftests-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM selftests cleanups and fixes for 6.10: - Enhance the demand paging test to allow for better reporting and stressing of UFFD performance. - Convert the steal time test to generate TAP-friendly output. - Fix a flaky false positive in the xen_shinfo_test due to comparing elapsed time across two different clock domains. - Skip the MONITOR/MWAIT test if the host doesn't actually support MWAIT. - Avoid unnecessary use of "sudo" in the NX hugepage test to play nice with running in a minimal userspace environment. - Allow skipping the RSEQ test's sanity check that the vCPU was able to complete a reasonable number of KVM_RUNs, as the assert can fail on a completely valid setup. If the test is run on a large-ish system that is otherwise idle, and the test isn't affined to a low-ish number of CPUs, the vCPU task can be repeatedly migrated to CPUs that are in deep sleep states, which results in the vCPU having very little net runtime before the next migration due to high wakeup latencies.
2024-05-12Merge tag 'kvm-x86-generic-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM cleanups for 6.10: - Misc cleanups extracted from the "exit on missing userspace mapping" series, which has been put on hold in anticipation of a "KVM Userfault" approach, which should provide a superset of functionality. - Remove kvm_make_all_cpus_request_except(), which got added to hack around an AVIC bug, and then became dead code when a more robust fix came along. - Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation.
2024-05-12Merge tag 'kvmarm-6.10-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 6.10 - Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. - Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greattly simplified. - Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non directly injected LPIs much cheaper to make visible to the vcpu. - A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! - Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. - Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. - Preserve vcpu-specific ID registers across a vcpu reset. - Various minor cleanups and improvements.
2024-05-11fs/proc: fix softlockup in __read_vmcoreRik van Riel
While taking a kernel core dump with makedumpfile on a larger system, softlockup messages often appear. While softlockup warnings can be harmless, they can also interfere with things like RCU freeing memory, which can be problematic when the kdump kexec image is configured with as little memory as possible. Avoid the softlockup, and give things like work items and RCU a chance to do their thing during __read_vmcore by adding a cond_resched. Link: https://lkml.kernel.org/r/20240507091858.36ff767f@imladris.surriel.com Signed-off-by: Rik van Riel <riel@surriel.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()Ryusuke Konishi
The BUG_ON check performed on the return value of __getblk() in nilfs_finish_roll_forward() assumes that a buffer that has been successfully read once is retrieved with the same parameters and does not fail (__getblk() does not return an error due to memory allocation failure). Also, nilfs_finish_roll_forward() is called at most once during mount. Taking these into consideration, rewrite the check to use WARN_ON() to avoid using BUG_ON(). Link: https://lkml.kernel.org/r/20240508221429.7559-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11scripts: checkpatch: check unused parameters for function-like macroXining Xu
If function-like macros do not utilize a parameter, it might result in a build warning. In our coding style guidelines, we advocate for utilizing static inline functions to replace such macros. This patch verifies compliance with the new rule. For a macro such as the one below, #define test(a) do { } while (0) The test result is as follows. WARNING: Argument 'a' is not used in function-like macro #21: FILE: mm/init-mm.c:20: +#define test(a) do { } while (0) total: 0 errors, 1 warnings, 8 lines checked Link: https://lkml.kernel.org/r/20240507032757.146386-3-21cnbao@gmail.com Signed-off-by: Xining Xu <mac.xxn@outlook.com> Tested-by: Barry Song <v-songbaohua@oppo.com> Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: Joe Perches <joe@perches.com> Cc: Chris Zankel <chris@zankel.net> Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mark Brown <broonie@kernel.org> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Jeff Johnson <quic_jjohnson@quicinc.com> Cc: Charlemagne Lasse <charlemagnelasse@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11Documentation: coding-style: ask function-like macros to evaluate parametersBarry Song
Patch series "codingstyle: avoid unused parameters for a function-like macro", v7. A function-like macro could result in build warnings such as "unused variable." This patchset updates the guidance to recommend always using a static inline function instead and also provides checkpatch support for this new rule. This patch (of 2): Recent commit 77292bb8ca69c80 ("crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem") leads to warnings on xtensa and loongarch, In file included from crypto/scompress.c:12: include/crypto/scatterwalk.h: In function 'scatterwalk_pagedone': include/crypto/scatterwalk.h:76:30: warning: variable 'page' set but not used [-Wunused-but-set-variable] 76 | struct page *page; | ^~~~ crypto/scompress.c: In function 'scomp_acomp_comp_decomp': >> crypto/scompress.c:174:38: warning: unused variable 'dst_page' [-Wunused-variable] 174 | struct page *dst_page = sg_page(req->dst); | The reason is that flush_dcache_page() is implemented as a noop macro on these platforms as below, #define flush_dcache_page(page) do { } while (0) The driver code, for itself, seems be quite innocent and placing maybe_unused seems pointless, struct page *dst_page = sg_page(req->dst); for (i = 0; i < nr_pages; i++) flush_dcache_page(dst_page + i); And it should be independent of architectural implementation differences. Let's provide guidance on coding style for requesting parameter evaluation or proposing the migration to a static inline function. Link: https://lkml.kernel.org/r/20240507032757.146386-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20240507032757.146386-2-21cnbao@gmail.com Signed-off-by: Barry Song <v-songbaohua@oppo.com> Suggested-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Joe Perches <joe@perches.com> Cc: Chris Zankel <chris@zankel.net> Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Xining Xu <mac.xxn@outlook.com> Cc: Charlemagne Lasse <charlemagnelasse@gmail.com> Cc: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11nilfs2: use __field_struct() for a bitwise fieldBart Van Assche
As one can see in include/trace/stages/stage4_event_fields.h, the implementation of __field() uses the is_signed_type() macro. As one can see in commit dcf8e5633e2e ("tracing: Define the is_signed_type() macro once"), there has been an attempt to not make is_signed_type() trigger sparse warnings for bitwise types. Despite that change, sparse complains when passing a bitwise type to is_signed_type(). The reason is that in its definition below, an inequality comparison will be made against bitwise types, which are random collections of bits (the casts to bitwise types themselves are semantically valid and not problematic): #define is_signed_type(type) (((type)(-1)) < (__force type)1) So, as a workaround, follow the example of <trace/events/initcall.h> and suppress the following sparse warnings by changing __field() into __field_struct() that doesn't use is_signed_type(): fs/nilfs2/segment.c: note: in included file (through include/trace/trace_events.h, include/trace/define_trace.h, include/trace/events/nilfs2.h): ./include/trace/events/nilfs2.h:191:1: warning: cast to restricted blk_opf_t ./include/trace/events/nilfs2.h:191:1: warning: restricted blk_opf_t degrades to integer ./include/trace/events/nilfs2.h:191:1: warning: restricted blk_opf_t degrades to integer [konishi.ryusuke: describe the reason for the warnings per Linus's explanation] Link: https://lkml.kernel.org/r/20240507222041.4876-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240507142454.3344-1-konishi.ryusuke@gmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401092241.I4mm9OWl-lkp@intel.com/ Reported-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Closes: https://lore.kernel.org/all/20240430080019.4242-2-konishi.ryusuke@gmail.com/ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11selftests/kcmp: remove unused open modeEdward Liaw
Android bionic warns that open modes are ignored if O_CREAT or O_TMPFILE aren't specified. The permissions for the file are set above: fd1 = open(kpath, O_RDWR | O_CREAT | O_TRUNC, 0644); Link: https://lkml.kernel.org/r/20240429234610.191144-1-edliaw@google.com Fixes: d97b46a64674 ("syscalls, x86: add __NR_kcmp syscall") Signed-off-by: Edward Liaw <edliaw@google.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11nilfs2: remove calls to folio_set_error() and folio_clear_error()Matthew Wilcox (Oracle)
Nobody checks this flag on nilfs2 folios, stop setting and clearing it. That lets us simplify nilfs_end_folio_io() slightly. Link: https://lkml.kernel.org/r/20240420025029.2166544-17-willy@infradead.org Link: https://lkml.kernel.org/r/20240430050901.3239-1-konishi.ryusuke@gmail.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: kernel test robot <lkp@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-11memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_orderXiu Jianfeng
Since commit 857f21397f71 ("memcg, oom: remove unnecessary check in mem_cgroup_oom_synchronize()"), memcg_oom_gfp_mask and memcg_oom_order are no longer used any more. Link: https://lkml.kernel.org/r/20240509032628.1217652-1-xiujianfeng@huawei.com Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Benjamin Segall <bsegall@google.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>