Age  Commit message  Author
2024-11-15Merge tag 'pmdomain-v6.12-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: "pmdomain core: - Add GENPD_FLAG_DEV_NAME_FW flag to generate unique names pmdomain providers: - arm: Use FLAG_DEV_NAME_FW to ensure unique names - imx93-blk-ctrl: Fix the remove path arm_scmi/qcom-cpucp: - Report duplicate OPPs as firmware bugs for arm_scmi - Skip OPP duplicates for arm_scmi - Mark the qcom-cpucp mailbox irq with IRQF_NO_SUSPEND flag" * tag 'pmdomain-v6.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: mailbox: qcom-cpucp: Mark the irq with IRQF_NO_SUSPEND flag firmware: arm_scmi: Report duplicate opps as firmware bugs firmware: arm_scmi: Skip opp duplicates pmdomain: imx93-blk-ctrl: correct remove path pmdomain: arm: Use FLAG_DEV_NAME_FW to ensure unique names pmdomain: core: Add GENPD_FLAG_DEV_NAME_FW flag
2024-11-15Merge tag 'mmc-v6.12-rc3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - dw_mmc: Revert fix for IDMAC operation with pages bigger than 4K - sunxi-mmc: Fix A100 compatible description * tag 'mmc-v6.12-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K" mmc: sunxi-mmc: Fix A100 compatible description
2024-11-15Merge tag 'sound-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few last-minute fixes. All changes are device-specific small fixes that should be pretty safe to apply" * tag 'sound-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - update set GPIO3 to default for Thinkpad with ALC1318 ALSA: hda/realtek: fix mute/micmute LEDs for a HP EliteBook 645 G10 ALSA: hda/realtek - Fixed Clevo platform headset Mic issue ALSA: usb-audio: Fix Yamaha P-125 Quirk Entry ASoC: max9768: Fix event generation for playback mute ASoC: intel: sof_sdw: add quirk for Dell SKU ASoC: audio-graph-card2: Purge absent supplies for device tree nodes
2024-11-15Merge tag 'v6.12-p5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a regression in the MIPS CRC32C code" * tag 'v6.12-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: mips/crc32 - fix the CRC32C implementation
2024-11-15Merge tag 'sched_ext-for-6.12-rc7-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fix from Tejun Heo: "One more fix for v6.12-rc7 ops.cpu_acquire() was being invoked with the wrong kfunc mask allowing the operation to call kfuncs which shouldn't be allowed. Fix it by using SCX_KF_REST instead, which is trivial and low risk" * tag 'sched_ext-for-6.12-rc7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: ops.cpu_acquire() should be called with SCX_KF_REST
2024-11-15Merge tag 'for-6.12-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix that seems urgent and good to have in 6.12 final. It could potentially lead to unexpected transaction aborts, due to wrong comparison and order of processing of delayed refs" * tag 'for-6.12-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix incorrect comparison for delayed refs
2024-11-15io_uring: add memory region registrationPavel Begunkov
Regions will serve multiple purposes. First, with them we can decouple ring/etc. object creation from registration / mapping of the memory they will be placed in. We already have hacks that allow putting both SQ and CQ into the same huge page; in the future we should be able to: region = create_region(io_ring); create_pbuf_ring(io_uring, region, offset=0); create_pbuf_ring(io_uring, region, offset=N); The second use case is efficiently passing parameters. The following patch re-enables IORING_ENTER_EXT_ARG_REG on top of regions, which optimises wait arguments. It'll also be useful for request arguments replacing iovecs, msghdr, etc. pointers. Eventually it would also be handy for BPF if that comes to fruition. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0798cf3a14fad19cfc96fc9feca5f3e11481691d.1731689588.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15io_uring: introduce concept of memory regionsPavel Begunkov
We've got a good number of mappings we share with userspace, including the main rings, provided buffer rings, upcoming rings for zerocopy rx and more. All of them duplicate user argument parsing and some internal details as well (page pinning, huge page optimisations, mmap'ing, etc.). Introduce a notion of regions. For userspace for now it's just a new structure called struct io_uring_region_desc which is supposed to parameterise all such mapping / queue creations. A region either represents a user provided chunk of memory, in which case the user_addr field should point to it, or a request for the kernel to allocate the memory, in which case the user would need to mmap it after using the offset returned in the mmap_offset field. With a uniform userspace API we can avoid additional boilerplate code and apply future optimisations to all of them at once. Internally, there is a new structure struct io_mapped_region holding all relevant runtime information and some helpers to work with it. This patch limits it to user provided regions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0e6fe25818dfbaebd1bd90b870a6cac503fe1a24.1731689588.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15io_uring: temporarily disable registered waitsPavel Begunkov
Disable wait argument registration as it'll be replaced with a more generic feature. We'll still need IORING_ENTER_EXT_ARG_REG parsing in a few commits so leave it be. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/70b1d1d218c41ba77a76d1789c8641dab0b0563e.1731689588.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15io_uring: disable ENTER_EXT_ARG_REG for IOPOLLPavel Begunkov
IOPOLL doesn't use the extended arguments, no need for it to support IORING_ENTER_EXT_ARG_REG. Let's disable it for IOPOLL, if anything it leaves more space for future extensions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a35ecd919dbdc17bd5b7932273e317832c531b45.1731689588.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15io_uring: fortify io_pin_pages with a warningPavel Begunkov
We're a bit too frivolous with the types of nr_pages arguments, converting them to long and back to int, passing an unsigned int pointer as an int pointer and so on. This shouldn't cause any problems but should be carefully reviewed; until then, let's add a WARN_ON_ONCE check to be more confident that callers don't pass poorly checked arguments. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d48e0c097cbd90fb47acaddb6c247596510d8cfc.1731689588.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15switch io_msg_ring() to CLASS(fd)Al Viro
Use CLASS(fd) to get the file for sync message ring requests, rather than open-code the file retrieval dance. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20241115034902.GP3387508@ZenIV [axboe: make a more coherent commit message] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15workqueue: Reduce expensive locks for unbound workqueueWangyang Guo
For unbound workqueue, pwqs usually map to just a few pools. Most of the time, pwqs will be linked sequentially to wq->pwqs list by cpu index. Usually, consecutive CPUs have the same workqueue attribute (e.g. belong to the same NUMA node). This makes pwqs with the same pool cluster together in the pwq list. Only do lock/unlock if the pool has changed in flush_workqueue_prep_pwqs(). This reduces the number of expensive lock operations. The performance data shows this change boosts FIO by 65x in some cases when multiple concurrent threads write to xfs mount points with fsync. FIO Benchmark Details - FIO version: v3.35 - FIO Options: ioengine=libaio,iodepth=64,norandommap=1,rw=write, size=128M,bs=4k,fsync=1 - FIO Job Configs: 64 jobs in total writing to 4 mount points (ramdisks formatted as xfs file system). - Kernel Codebase: v6.12-rc5 - Test Platform: Xeon 8380 (2 sockets) Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Wangyang Guo <wangyang.guo@intel.com> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
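The locking pattern described in the workqueue commit above can be illustrated with a small userspace sketch. This is not the kernel code: `struct pool`, `struct pwq`, `pool_lock()`, and `visit_pwqs()` are hypothetical stand-ins, and the "lock" is only a flag plus a counter so that the number of acquisitions can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a worker pool; the "lock" only counts acquisitions. */
struct pool {
    int lock_count;
    int locked;
};

/* Hypothetical stand-in for a pool_workqueue, which points at its pool. */
struct pwq {
    struct pool *pool;
};

static void pool_lock(struct pool *p)
{
    assert(!p->locked);   /* the sketch never takes the same lock twice */
    p->locked = 1;
    p->lock_count++;
}

static void pool_unlock(struct pool *p)
{
    p->locked = 0;
}

/*
 * Walk the pwqs in list order and switch locks only when the pool changes.
 * Returns the total number of lock acquisitions performed.
 */
int visit_pwqs(struct pwq *pwqs, size_t n)
{
    struct pool *current = NULL;
    int acquisitions = 0;

    for (size_t i = 0; i < n; i++) {
        if (pwqs[i].pool != current) {
            if (current)
                pool_unlock(current);
            current = pwqs[i].pool;
            pool_lock(current);
            acquisitions++;
        }
        /* ... per-pwq work would happen here, under current's lock ... */
    }
    if (current)
        pool_unlock(current);
    return acquisitions;
}
```

When entries that share a pool sit next to each other in the list, the number of acquisitions drops from one per entry to one per run of identical pools; if pools alternate, nothing is saved.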
2024-11-15efi/libstub: Parse builtin command line after bootloader provided oneArd Biesheuvel
When CONFIG_CMDLINE_EXTEND is set, the core kernel command line handling logic appends CONFIG_CMDLINE to the bootloader provided command line. The EFI stub does the opposite, and parses the builtin one first. The usual behavior of command line options is that the last one takes precedence if it appears multiple times, unless there is a meaningful way to combine them. In either case, parsing the builtin command line first while the core kernel does it in the opposite order is likely to produce inconsistent results in such cases. Therefore, switch the order in the stub to match the core kernel. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
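To see why the parsing order matters, here is a minimal userspace sketch of "last occurrence wins" lookup on a command line built by appending the builtin string after the bootloader one. The helpers `last_value()` and `extend_cmdline()` are hypothetical names for illustration only; the EFI stub and the core kernel use their own parsers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up "key=value" in a space-separated command line. Later occurrences
 * overwrite earlier ones, so the last one wins. Returns 1 if found.
 */
int last_value(const char *cmdline, const char *key, char *out, size_t outlen)
{
    size_t klen = strlen(key);
    int found = 0;
    const char *p = cmdline;

    assert(outlen > 0);
    while (*p) {
        while (*p == ' ')
            p++;
        const char *end = p;
        while (*end && *end != ' ')
            end++;
        /* Token must be at least "key=" long and start with "key=". */
        if ((size_t)(end - p) > klen && !strncmp(p, key, klen) && p[klen] == '=') {
            size_t vlen = (size_t)(end - p) - klen - 1;
            if (vlen >= outlen)
                vlen = outlen - 1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            found = 1;
        }
        p = end;
    }
    return found;
}

/* Append the builtin command line after the bootloader-provided one. */
void extend_cmdline(char *dst, size_t dstlen,
                    const char *bootloader, const char *builtin)
{
    snprintf(dst, dstlen, "%s %s", bootloader, builtin);
}
```

With a bootloader string of `console=ttyS0 loglevel=7` and a builtin string of `loglevel=3`, the combined line ends in `loglevel=3`, so a last-wins parser picks the builtin value. Parsing the two strings in the opposite order would pick the other one, which is the inconsistency the commit removes.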
2024-11-15x86/efi: Apply EFI Memory Attributes after kexecNicolas Saenz Julienne
Kexec bypasses EFI's switch to virtual mode. In exchange, it has its own routine, kexec_enter_virtual_mode(), which replays the mappings made by the original kernel. Unfortunately, that function fails to reinstate EFI's memory attributes, which would've otherwise been set after entering virtual mode. Remediate this by calling efi_runtime_update_mappings() within kexec's routine. Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-15x86/efi: Drop support for the EFI_PROPERTIES_TABLENicolas Saenz Julienne
Drop support for the EFI_PROPERTIES_TABLE. It was a failed, short-lived experiment that broke the boot both on Linux and Windows, and was replaced by the EFI_MEMORY_ATTRIBUTES_TABLE shortly after. Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-15bpf: Add necessary migrate_disable to range_tree.Yonghong Song
When running bpf selftest (./test_progs -j), the following warnings showed up: $ ./test_progs -t arena_atomics ... BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u19:0/12501 caller is bpf_mem_free+0x128/0x330 ... Call Trace: <TASK> dump_stack_lvl check_preemption_disabled bpf_mem_free range_tree_destroy arena_map_free bpf_map_free_deferred process_scheduled_works ... For selftests arena_htab and arena_list, similar smp_processor_id() BUGs are dumped, and the following are two stack traces: <TASK> dump_stack_lvl check_preemption_disabled bpf_mem_alloc range_tree_set arena_map_alloc map_create ... <TASK> dump_stack_lvl check_preemption_disabled bpf_mem_alloc range_tree_clear arena_vm_fault do_pte_missing handle_mm_fault do_user_addr_fault ... Add migrate_{disable,enable}() around related bpf_mem_{alloc,free}() calls to fix the issue. Fixes: b795379757eb ("bpf: Introduce range_tree data structure and use it in bpf arena") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241115060354.2832495-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-15bpf: Do not alloc arena on unsupported archesViktor Malik
Do not allocate BPF arena on arches that do not support it, instead return EOPNOTSUPP. This is useful to prevent bugs such as soft lockups while trying to free the arena which we have witnessed on ppc64le [1]. [1] https://lore.kernel.org/bpf/4afdcb50-13f2-4772-8db1-3fd02bd985b3@redhat.com/ Signed-off-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20241115082548.74972-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-15Merge tag 'soc_fsl-6.13-1' of https://github.com/chleroy/linux into soc/driversArnd Bergmann
FSL SOC changes for 6.13: - Fix a missing of_node_put() in RCPM - Fix a missing error code on failure in CPM1 QMC - Switch to using for_each_available_child_of_node_scoped() in CPM1 TSA * tag 'soc_fsl-6.13-1' of https://github.com/chleroy/linux: soc: fsl: cpm1: qmc: Set the ret error code on platform_get_irq() failure soc: fsl: rcpm: fix missing of_node_put() in copy_ippdexpcr1_setting() soc: fsl: cpm1: tsa: switch to for_each_available_child_of_node_scoped() Link: https://lore.kernel.org/r/c3c4961b-fe2a-4fcc-a7a1-f8b5352e09a2@csgroup.eu Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-15block: make struct rq_list available for !CONFIG_BLOCKJens Axboe
A previous commit changed how requests are linked in the plug structure, but unlike the previous method, it uses a new type for it rather than struct request. The latter is available even for !CONFIG_BLOCK, while struct rq_list is not. Move it outside CONFIG_BLOCK. Reported-by: Nathan Chancellor <nathan@kernel.org> Fixes: a3396b99990d ("block: add a rq_list type") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctxKarol Przybylski
In cesa/cipher.c most declarations of struct mv_cesa_op_ctx are uninitialized. This causes one of the values in the struct to be left uninitialized in later usages. This patch fixes it by adding initializations in the same way it is done in cesa/hash.c. Fixes errors discovered in Coverity: 1600942, 1600939, 1600935, 1600934, 1600929, 1600927, 1600925, 1600921, 1600920, 1600919, 1600915, 1600914 Signed-off-by: Karol Przybylski <karprzy7@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()Christophe JAILLET
If do_cpt_init() fails, a previous dma_alloc_coherent() call needs to be undone. Add the needed dma_free_coherent() before returning. Fixes: 9e2c7d99941d ("crypto: cavium - Add Support for Octeon-tx CPT Engine") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: aesni - Move back to module_initHerbert Xu
This patch reverts commit 0fbafd06bdde938884f7326548d3df812b267c3c ("crypto: aesni - fix failing setkey for rfc4106-gcm-aesni") by moving the aesni init function back to module_init from late_initcall. The original patch was needed because tests were synchronous. This is no longer the case so there is no need to postpone the registration. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: lib/mpi - Export mpi_set_bitHerbert Xu
This function is part of the exposed API and should be exported. Otherwise a modular user would fail to build, e.g., crypto/rsa. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: aes-gcm-p10 - Use the correct bit to test for P10Michal Suchanek
A hwcap feature bit is passed to cpu_has_feature, resulting in testing for CPU_FTR_MMCRA instead of the 3.1 platform revision. Fixes: c954b252dee9 ("crypto: powerpc/p10-aes-gcm - Register modules as SIMD") Reported-by: Nicolai Stange <nstange@suse.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15hwrng: amd - remove reference to removed PPC_MAPLE configLukas Bulwahn
Commit 62f8f307c80e ("powerpc/64: Remove maple platform") removes the PPC_MAPLE config as a consequence of the platform’s removal. The config definition of HW_RANDOM_AMD refers to this removed config option in its dependencies. Remove the reference to the removed config option. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: arm/crct10dif - Implement plain NEON variantArd Biesheuvel
The CRC-T10DIF algorithm produces a 16-bit CRC, and this is reflected in the folding coefficients, which are also only 16 bits wide. This means that the polynomial multiplications involving these coefficients can be performed using 8-bit long polynomial multiplication (8x8 -> 16) in only a few steps, and this is an instruction that is part of the base NEON ISA, which is all most real ARMv7 cores implement. (The 64-bit PMULL instruction is part of the crypto extensions, which are only implemented by 64-bit cores) The final reduction is a bit more involved, but we can delegate that to the generic CRC-T10DIF implementation after folding the entire input into a 16 byte vector. This results in a speedup of around 6.6x on Cortex-A72 running in 32-bit mode. On Cortex-A8 (BeagleBone White), the results are substantially better than that, but not sufficiently reproducible (with tcrypt) to quote a number here. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
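For reference, a bit-at-a-time version of the CRC that the final reduction is delegated to can be sketched as below. This is an illustrative userspace function, not the kernel's generic implementation (which is table-driven); the name `crc_t10dif_bitwise()` is made up here. It assumes the standard CRC-T10DIF parameters: polynomial 0x8BB7, initial value 0, most significant bit first, no final XOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-T10DIF: polynomial 0x8BB7, MSB first, no final XOR. */
uint16_t crc_t10dif_bitwise(uint16_t crc, const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        /* Feed the next byte into the top of the 16-bit register. */
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Because the running CRC is passed in and returned unchanged in form, a buffer can be processed in pieces by chaining calls. That property is what makes it possible to hand a folded 16 byte remainder to a generic routine for the last step.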
2024-11-15crypto: arm/crct10dif - Macroify PMULL asm codeArd Biesheuvel
To allow an alternative version to be created of the PMULL based CRC-T10DIF algorithm, turn the bulk of it into a macro, except for the final reduction, which will only be used by the existing version. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: arm/crct10dif - Use existing mov_l macro instead of __adrlArd Biesheuvel
Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback codeArd Biesheuvel
The only remaining user of the fallback implementation of 64x64 polynomial multiplication using 8x8 PMULL instructions is the final reduction from a 16 byte vector to a 16-bit CRC. The fallback code is complicated and messy, and this reduction has little impact on the overall performance, so instead, let's calculate the final CRC by passing the 16 byte vector to the generic CRC-T10DIF implementation when running the fallback version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiplyArd Biesheuvel
The CRC-T10DIF implementation for arm64 has a version that uses 8x8 polynomial multiplication, for cores that lack the crypto extensions, which cover the 64x64 polynomial multiplication instruction that the algorithm was built around. This fallback version rather naively adopted the 64x64 polynomial multiplication algorithm that I ported from ARM for the GHASH driver, which needs 8 PMULL8 instructions to implement one PMULL64. This is reasonable, given that each 8-bit vector element needs to be multiplied with each element in the other vector, producing 8 vectors with partial results that need to be combined to yield the correct result. However, most PMULL64 invocations in the CRC-T10DIF code involve multiplication by a pair of 16-bit folding coefficients, and so all the partial results from higher order bytes will be zero, and there is no need to calculate them to begin with. Then, the CRC-T10DIF algorithm always XORs the output values of the PMULL64 instructions being issued in pairs, and so there is no need to faithfully implement each individual PMULL64 instruction, as long as XORing the results pairwise produces the expected result. Implementing these improvements results in a speedup of 3.3x on low-end platforms such as Raspberry Pi 4 (Cortex-A72) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
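The observation about zero partial results can be checked with a scalar sketch. The code below builds a carryless 64-bit multiply out of 8x8 byte products and lets the caller restrict the second operand to its low bytes. It is only an illustration of the arithmetic: the function names are hypothetical, `unsigned __int128` is a GCC/Clang extension assumed to be available, and the count of scalar byte products is not a model of the number of vector PMULL8 instructions.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension, assumed available */

/* Reference carryless multiply, one bit of b at a time. */
u128 clmul_ref(uint64_t a, uint64_t b)
{
    u128 r = 0;

    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= (u128)a << i;
    return r;
}

/* 8x8 -> 16 bit carryless multiply, the only primitive assumed here. */
uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;

    for (int i = 0; i < 8; i++)
        if ((b >> i) & 1)
            r ^= (uint16_t)(a << i);
    return r;
}

/*
 * Carryless multiply assembled from byte products, using only the low
 * b_bytes bytes of b. Stores the number of 8x8 products in *products.
 */
u128 clmul_bytes(uint64_t a, uint64_t b, int b_bytes, int *products)
{
    u128 r = 0;

    assert(b_bytes >= 0 && b_bytes <= 8);
    *products = 0;
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < b_bytes; j++) {
            uint8_t ai = (uint8_t)(a >> (8 * i));
            uint8_t bj = (uint8_t)(b >> (8 * j));

            r ^= (u128)clmul8(ai, bj) << (8 * (i + j));
            (*products)++;
        }
    }
    return r;
}
```

When `b` is a 16-bit coefficient, calling `clmul_bytes()` with `b_bytes` set to 2 gives the same result as the full 8-byte version, since every product involving the upper six bytes of `b` is zero.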
2024-11-15crypto: arm64/crct10dif - Remove obsolete chunking logicArd Biesheuvel
This is a partial revert of commit fc754c024a343b, which moved the logic into C code which ensures that kernel mode NEON code does not hog the CPU for too long. This is no longer needed now that kernel mode NEON no longer disables preemption, so we can drop this. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: bcm - add error check in the ahash_hmac_init functionChen Ridong
The ahash_init functions may fail, and ahash_hmac_init should not return success when ahash_init returns an error. For example, ahash_init returns -ENOMEM when memory allocation fails. Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver") Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15crypto: caam - add error check to caam_rsa_set_priv_key_formChen Ridong
The caam_rsa_set_priv_key_form did not check for memory allocation errors. Add the checks to the caam_rsa_set_priv_key_form functions. Fixes: 52e26d77b8b3 ("crypto: caam - add support for RSA key form 2") Signed-off-by: Chen Ridong <chenridong@huawei.com> Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15netfilter: bitwise: add support for doing AND, OR and XOR directlyJeremy Sowden
Hitherto, these operations have been converted in user space to mask-and-xor operations on one register and two immediate values, and it is the latter which have been evaluated by the kernel. We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register. Pablo made a few changes to the original patch: - EINVAL if NFTA_BITWISE_SREG2 is used with fast version. - Allow _AND,_OR,_XOR with _DATA != sizeof(u32) - Dump _SREG2 or _DATA with _AND,_OR,_XOR Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
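The user space conversion mentioned above relies on the fact that AND, OR and XOR with a constant can each be written as a mask-and-xor on a single register. A minimal sketch of those identities follows; the function names are illustrative and are not taken from nftables or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* The mask-and-xor form: (x & mask) ^ xor_val, with two immediate values. */
uint32_t mask_and_xor(uint32_t x, uint32_t mask, uint32_t xor_val)
{
    return (x & mask) ^ xor_val;
}

/* x & a: keep the bits selected by a, flip nothing. */
uint32_t and_via_mask_xor(uint32_t x, uint32_t a)
{
    return mask_and_xor(x, a, 0);
}

/* x | a: clear the bits of a first, then set them with the xor. */
uint32_t or_via_mask_xor(uint32_t x, uint32_t a)
{
    return mask_and_xor(x, ~a, a);
}

/* x ^ a: keep every bit, then flip the bits of a. */
uint32_t xor_via_mask_xor(uint32_t x, uint32_t a)
{
    return mask_and_xor(x, 0xffffffffu, a);
}

/* Self-check of the three identities for one pair of operands. */
void check_identities(uint32_t x, uint32_t a)
{
    assert(and_via_mask_xor(x, a) == (x & a));
    assert(or_via_mask_xor(x, a) == (x | a));
    assert(xor_via_mask_xor(x, a) == (x ^ a));
}
```

The conversion works only when `a` is known up front, because the mask and xor immediates are derived from it. That is one way to read why operating on a second register calls for evaluating the operations directly.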
2024-11-15efi/memattr: Ignore table if the size is clearly bogusArd Biesheuvel
There are reports [0] of cases where a corrupt EFI Memory Attributes Table leads to out of memory issues at boot because the descriptor size and entry count in the table header are still used to reserve the entire table in memory, even though the resulting region is gigabytes in size. Given that the EFI Memory Attributes Table is supposed to carry up to 3 entries for each EfiRuntimeServicesCode region in the EFI memory map, and given that there is no reason for the descriptor size used in the table to exceed the one used in the EFI memory map, 3x the size of the entire EFI memory map is a reasonable upper bound for the size of this table. This means that sizes exceeding that are highly likely to be based on corrupted data, and the table should just be ignored instead. [0] https://bugzilla.suse.com/show_bug.cgi?id=1231465 Cc: Gregory Price <gourry@gourry.net> Cc: Usama Arif <usamaarif642@gmail.com> Acked-by: Jiri Slaby <jirislaby@kernel.org> Acked-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/all/20240912155159.1951792-2-ardb+git@google.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
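The bound described above amounts to a simple plausibility check. The following sketch expresses it on plain integers; `memattr_size_plausible()` and its parameters are hypothetical names, the table header's own size is ignored for simplicity, and `__builtin_mul_overflow` is a GCC/Clang builtin assumed to be available.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept the table only if its claimed size (entry count times descriptor
 * size) does not exceed three times the size of the EFI memory map.
 */
bool memattr_size_plausible(uint64_t num_entries, uint64_t desc_size,
                            uint64_t memmap_size)
{
    uint64_t tbl_size;

    /* A product that overflows 64 bits is treated as implausible. */
    if (__builtin_mul_overflow(num_entries, desc_size, &tbl_size))
        return false;

    /* If 3 * memmap_size would overflow, any 64-bit table size is below it. */
    if (memmap_size > UINT64_MAX / 3)
        return true;

    return tbl_size <= 3 * memmap_size;
}
```

A table that fails the check would simply be ignored rather than reserved, which avoids reserving gigabytes of memory based on a corrupted header.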
2024-11-15netfilter: bitwise: rename some boolean operation functionsJeremy Sowden
In the next patch we add support for doing AND, OR and XOR operations directly in the kernel, so rename some functions and an enum constant related to mask-and-xor boolean operations. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
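The idea behind the dscp_t conversions can be sketched in userspace as follows. This is an approximation under stated assumptions: the kernel declares dscp_t as a sparse-checked bitwise type, whereas this sketch wraps the byte in a struct so that an ordinary C compiler rejects accidental mixing. The names `dsfield_to_dscp()`, `dscp_to_dsfield()`, `hdr_dscp()`, and `struct ipv4_hdr_sketch` are illustrative stand-ins, not the kernel helpers.

```c
#include <stdint.h>

/* Distinct wrapper type so a DSCP value cannot be mixed up with a raw byte. */
typedef struct {
    uint8_t v;
} dscp_t;

/* Upper six bits of the DS field carry the DSCP; the low two are ECN. */
#define DSCP_MASK 0xfc

/* Minimal stand-in for the IPv4 header; only the field used here. */
struct ipv4_hdr_sketch {
    uint8_t tos;
};

/* Strip the ECN bits and tag the result as a DSCP value. */
dscp_t dsfield_to_dscp(uint8_t dsfield)
{
    dscp_t d = { (uint8_t)(dsfield & DSCP_MASK) };

    return d;
}

/* Convert back to a plain byte for interfaces that still take one. */
uint8_t dscp_to_dsfield(dscp_t dscp)
{
    return dscp.v;
}

/* Read the DSCP from a header instead of touching ->tos directly. */
dscp_t hdr_dscp(const struct ipv4_hdr_sketch *iph)
{
    return dsfield_to_dscp(iph->tos);
}
```

Call sites that still need a byte wrap the value in `dscp_to_dsfield()`. Once the consumer takes the typed value, that temporary conversion can be dropped without touching the rest of the call site.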
2024-11-15netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: flow_offload: Convert nft_flow_route() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading ip_hdr()->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Also, remove the comment about the net/ip.h include file, since it's now required for the ip4h_dscp() helper too. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15efi/zboot: Fix outdated comment about using LoadImage/StartImageArd Biesheuvel
EFI zboot no longer uses LoadImage/StartImage, but subsumes the arch code to load and start the bare metal image directly. Fix the Kconfig description accordingly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-15efi/libstub: Free correct pointer on failureArd Biesheuvel
cmdline_ptr is an out parameter, which is not allocated by the function itself, and likely points into the caller's stack. cmdline refers to the pool allocation that should be freed when cleaning up after a failure, so pass this instead to free_pool(). Fixes: 42c8ea3dca09 ("efi: libstub: Factor out EFI stub entrypoint ...") Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-15microblaze: mb: Use str_yes_no() helper in show_cpuinfo()Thorsten Blum
Remove hard-coded strings by using the str_yes_no() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20241114224649.57946-4-thorsten.blum@linux.dev Signed-off-by: Michal Simek <michal.simek@amd.com>
2024-11-14ocfs2: uncache inode which has failed entering the groupDmitry Antipov
Syzbot has reported the following BUG: kernel BUG at fs/ocfs2/uptodate.c:509! ... Call Trace: <TASK> ? __die_body+0x5f/0xb0 ? die+0x9e/0xc0 ? do_trap+0x15a/0x3a0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? do_error_trap+0x1dc/0x2c0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? __pfx_do_error_trap+0x10/0x10 ? handle_invalid_op+0x34/0x40 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? exc_invalid_op+0x38/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? ocfs2_set_new_buffer_uptodate+0x2e/0x160 ? ocfs2_set_new_buffer_uptodate+0x144/0x160 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ocfs2_group_add+0x39f/0x15a0 ? __pfx_ocfs2_group_add+0x10/0x10 ? __pfx_lock_acquire+0x10/0x10 ? mnt_get_write_access+0x68/0x2b0 ? __pfx_lock_release+0x10/0x10 ? rcu_read_lock_any_held+0xb7/0x160 ? __pfx_rcu_read_lock_any_held+0x10/0x10 ? smack_log+0x123/0x540 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x226/0x2b0 ocfs2_ioctl+0x65e/0x7d0 ? __pfx_ocfs2_ioctl+0x10/0x10 ? smack_file_ioctl+0x29e/0x3a0 ? __pfx_smack_file_ioctl+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x43d/0x780 ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10 ? __pfx_ocfs2_ioctl+0x10/0x10 __se_sys_ioctl+0xfb/0x170 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... </TASK> When 'ioctl(OCFS2_IOC_GROUP_ADD, ...)' has failed for the particular inode in 'ocfs2_verify_group_and_input()', corresponding buffer head remains cached and subsequent call to the same 'ioctl()' for the same inode issues the BUG() in 'ocfs2_set_new_buffer_uptodate()' (trying to cache the same buffer head of that inode). Fix this by uncaching the buffer head with 'ocfs2_remove_from_cache()' on error path in 'ocfs2_group_add()'. 
Link: https://lkml.kernel.org/r/20241114043844.111847-1-dmantipov@yandex.ru Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+453873f1588c2d75b447@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=453873f1588c2d75b447 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Dmitry Antipov <dmantipov@yandex.ru> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-14mm: fix NULL pointer dereference in alloc_pages_bulk_noprofJinjiang Tu
We triggered a NULL pointer dereference for ac.preferred_zoneref->zone in alloc_pages_bulk_noprof() when the task is migrated between cpusets. When cpuset is enabled, in prepare_alloc_pages(), ac->nodemask may be &current->mems_allowed. When first_zones_zonelist() is called to find preferred_zoneref, the ac->nodemask may be modified concurrently if the task is migrated between different cpusets. Assuming we have 2 NUMA nodes, when traversing Node1 in ac->zonelist, the nodemask is 2, and when traversing Node2 in ac->zonelist, the nodemask is 1. As a result, the ac->preferred_zoneref points to a NULL zone. In alloc_pages_bulk_noprof(), for_each_zone_zonelist_nodemask() finds an allowable zone and calls zonelist_node_idx(ac.preferred_zoneref), leading to a NULL pointer dereference. __alloc_pages_noprof() fixes this issue by checking for a NULL pointer in commit ea57485af8f4 ("mm, page_alloc: fix check for NULL preferred_zone") and commit df76cee6bbeb ("mm, page_alloc: remove redundant checks from alloc fastpath"). To fix it, check preferred_zoneref->zone for a NULL pointer. Link: https://lkml.kernel.org/r/20241113083235.166798-1-tujinjiang@huawei.com Fixes: 387ba26fb1cb ("mm/page_alloc: add a bulk page allocator") Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Lobakin <alobakin@pm.me> Cc: David Hildenbrand <david@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-14mm, doc: update read_ahead_kb for MADV_HUGEPAGEYafang Shao
MADV_HUGEPAGE is a new addition to readahead with behavior distinct from normal pages. To prevent confusion, we should update the documentation accordingly. Link: https://lkml.kernel.org/r/20241113150711.1685-1-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-14 fs/proc/task_mmu: prevent integer overflow in pagemap_scan_get_args() Dan Carpenter
The "arg->vec_len" variable is a u64 that comes from the user at the start of the function. The "arg->vec_len * sizeof(struct page_region)" multiplication can lead to integer wrapping. Use size_mul() to avoid that.

Also, the size_add/mul() functions work on unsigned long, so on 32-bit systems we need to ensure that "arg->vec_len" fits in an unsigned long.

Link: https://lkml.kernel.org/r/39d41335-dd4d-48ed-8a7f-402c57d8ea84@stanley.mountain
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Andrei Vagin <avagin@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
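The two checks described in this commit can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel patch: `sat_size_mul()` is a hypothetical stand-in for the kernel's size_mul(), assumed here to saturate at SIZE_MAX on overflow, and `region_bytes()` is an invented helper that first rejects a 64-bit count that does not fit in size_t and then multiplies with saturation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-in for a saturating multiply: return SIZE_MAX instead of
 * wrapping when a * b does not fit in size_t.
 */
static size_t sat_size_mul(size_t a, size_t b)
{
    if (b != 0 && a > SIZE_MAX / b)
        return SIZE_MAX;
    return a * b;
}

/*
 * Hypothetical helper: byte size of vec_len elements of elem_size bytes.
 * Returns SIZE_MAX when the count does not fit in size_t (possible on
 * 32-bit targets) or when the product would overflow, so that a later
 * bounds check on the result fails rather than seeing a wrapped value.
 */
static size_t region_bytes(uint64_t vec_len, size_t elem_size)
{
    if (vec_len > SIZE_MAX)
        return SIZE_MAX;
    return sat_size_mul((size_t)vec_len, elem_size);
}
```

A saturated result is useful because any subsequent range or access check on a SIZE_MAX-byte buffer fails, whereas a wrapped product could look like a small, valid size.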
2024-11-14 sched/task_stack: fix object_is_on_stack() for KASAN tagged pointers Qun-Wei Lin
When CONFIG_KASAN_SW_TAGS and CONFIG_KASAN_STACK are enabled, the object_is_on_stack() function may produce incorrect results due to the presence of tags in the obj pointer, while the stack pointer does not have tags. This discrepancy can lead to incorrect stack object detection and subsequently trigger warnings if CONFIG_DEBUG_OBJECTS is also enabled.

Example of the warning:

ODEBUG: object 3eff800082ea7bb0 is NOT on stack ffff800082ea0000, but annotated.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:557 __debug_object_init+0x330/0x364
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc5 #4
Hardware name: linux,dummy-virt (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __debug_object_init+0x330/0x364
lr : __debug_object_init+0x330/0x364
sp : ffff800082ea7b40
x29: ffff800082ea7b40 x28: 98ff0000c0164518 x27: 98ff0000c0164534
x26: ffff800082d93ec8 x25: 0000000000000001 x24: 1cff0000c00172a0
x23: 0000000000000000 x22: ffff800082d93ed0 x21: ffff800081a24418
x20: 3eff800082ea7bb0 x19: efff800000000000 x18: 0000000000000000
x17: 00000000000000ff x16: 0000000000000047 x15: 206b63617473206e
x14: 0000000000000018 x13: ffff800082ea7780 x12: 0ffff800082ea78e
x11: 0ffff800082ea790 x10: 0ffff800082ea79d x9 : 34d77febe173e800
x8 : 34d77febe173e800 x7 : 0000000000000001 x6 : 0000000000000001
x5 : feff800082ea74b8 x4 : ffff800082870a90 x3 : ffff80008018d3c4
x2 : 0000000000000001 x1 : ffff800082858810 x0 : 0000000000000050
Call trace:
 __debug_object_init+0x330/0x364
 debug_object_init_on_stack+0x30/0x3c
 schedule_hrtimeout_range_clock+0xac/0x26c
 schedule_hrtimeout+0x1c/0x30
 wait_task_inactive+0x1d4/0x25c
 kthread_bind_mask+0x28/0x98
 init_rescuer+0x1e8/0x280
 workqueue_init+0x1a0/0x3cc
 kernel_init_freeable+0x118/0x200
 kernel_init+0x28/0x1f0
 ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
ODEBUG: object 3eff800082ea7bb0 is NOT on stack ffff800082ea0000, but annotated.
------------[ cut here ]------------

Link: https://lkml.kernel.org/r/20241113042544.19095-1-qun-wei.lin@mediatek.com
Signed-off-by: Qun-Wei Lin <qun-wei.lin@mediatek.com>
Cc: Andrew Yang <andrew.yang@mediatek.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Casper Li <casper.li@mediatek.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
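The mismatch behind this warning can be reproduced with plain integer arithmetic on the two addresses from the log. The following is a userspace sketch, not the kernel implementation: addresses are modeled as `uint64_t`, the tag is assumed to live in the top byte with 0xff as the untagged kernel value, `MODEL_STACK_SIZE` is an assumed stack size chosen for illustration, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_STACK_SIZE 0x8000ULL /* assumed stack size, illustration only */
#define TAG_SHIFT        56        /* assumption: tag occupies the top byte */
#define KERNEL_TAG       0xffULL   /* assumption: untagged pointers carry 0xff */

/* Replace whatever tag is in the top byte with the untagged kernel value. */
static uint64_t reset_tag(uint64_t addr)
{
    return addr | (KERNEL_TAG << TAG_SHIFT);
}

/* Plain range check of an address against [stack, stack + size). */
static bool in_stack_range(uint64_t obj, uint64_t stack)
{
    return obj >= stack && obj < stack + MODEL_STACK_SIZE;
}

/* Compares the tagged address directly, as in the failing case. */
static bool on_stack_naive(uint64_t obj, uint64_t stack)
{
    return in_stack_range(obj, stack);
}

/* Strips the tag before comparing, which is the idea behind the fix. */
static bool on_stack_fixed(uint64_t obj, uint64_t stack)
{
    return in_stack_range(reset_tag(obj), stack);
}
```

With the values from the warning, `on_stack_naive(0x3eff800082ea7bb0, 0xffff800082ea0000)` is false because the tagged address compares below the stack base, while `on_stack_fixed()` with the same arguments is true under the assumed stack size.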