summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-04-18block: propagate partition scanning errors to the BLKRRPART ioctlChristoph Hellwig
Commit 4601b4b130de ("block: reopen the device in blkdev_reread_part") lost the propagation of I/O errors from the low-level read of the partition table to the user space caller of the BLKRRPART. Apparently some user space relies on, so restore the propagation. This isn't exactly pretty as other block device open calls explicitly do not are about these errors, so add a new BLK_OPEN_STRICT_SCAN to opt into the error propagation. Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part") Reported-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240417144743.2277601-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-17gpiolib: swnode: Remove wrong header inclusionAndy Shevchenko
The flags in the software node properties are supposed to be the GPIO lookup flags, which are provided by gpio/machine.h, as the software nodes are the kernel internal thing and doesn't need to rely to any of ABIs. Fixes: e7f9ff5dc90c ("gpiolib: add support for software nodes") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-04-16mm/shmem: inline shmem_is_huge() for disabled transparent hugepagesSumanth Korikkar
In order to minimize code size (CONFIG_CC_OPTIMIZE_FOR_SIZE=y), compiler might choose to make a regular function call (out-of-line) for shmem_is_huge() instead of inlining it. When transparent hugepages are disabled (CONFIG_TRANSPARENT_HUGEPAGE=n), it can cause compilation error. mm/shmem.c: In function `shmem_getattr': ./include/linux/huge_mm.h:383:27: note: in expansion of macro `BUILD_BUG' 383 | #define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; }) | ^~~~~~~~~ mm/shmem.c:1148:33: note: in expansion of macro `HPAGE_PMD_SIZE' 1148 | stat->blksize = HPAGE_PMD_SIZE; To prevent the possible error, always inline shmem_is_huge() when transparent hugepages are disabled. Link: https://lkml.kernel.org/r/20240409155407.2322714-1-sumanthk@linux.ibm.com Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-16mm,swapops: update check in is_pfn_swap_entry for hwpoison entriesOscar Salvador
Tony reported that the Machine check recovery was broken in v6.9-rc1, as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors to DRAM. After some more digging and debugging on his side, he realized that this went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")'. That commit, among other things, introduced swp_offset_pfn(), replacing hwpoison_entry_to_pfn() in its favour. The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), but is_pfn_swap_entry() never got updated to cover hwpoison entries, which means that we would hit the VM_BUG_ON whenever we would call swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. Fix this by updating the check to cover hwpoison entries as well, and update the comment while we are it. Link: https://lkml.kernel.org/r/20240407130537.16977-1-osalvador@suse.de Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Signed-off-by: Oscar Salvador <osalvador@suse.de> Reported-by: Tony Luck <tony.luck@intel.com> Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ Tested-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Cc: <stable@vger.kernel.org> [6.1.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-14Merge tag 'timers-urgent-2024-04-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: - Address a (valid) W=1 build warning - Fix timer self-tests - Annotate a KCSAN warning wrt. accesses to the tick_do_timer_cpu global variable - Address a !CONFIG_BUG build warning * tag 'timers-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests: kselftest: Fix build failure with NOLIBC selftests: timers: Fix abs() warning in posix_timers test selftests: kselftest: Mark functions that unconditionally call exit() as __noreturn selftests: timers: Fix posix_timers ksft_print_msg() warning selftests: timers: Fix valid-adjtimex signed left-shift undefined behavior bug: Fix no-return-statement warning with !CONFIG_BUG timekeeping: Use READ/WRITE_ONCE() for tick_do_timer_cpu selftests/timers/posix_timers: Reimplement check_timer_distribution() irqflags: Explicitly ignore lockdep_hrtimer_exit() argument
2024-04-14Merge tag 'locking-urgent-2024-04-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix a PREEMPT_RT build bug" * tag 'locking-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=y
2024-04-14Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio bugfixes from Michael Tsirkin: "Some small, obvious (in hindsight) bugfixes: - new ioctl in vhost-vdpa has a wrong # - not too late to fix - vhost has apparently been lacking an smp_rmb() - due to code duplication :( The duplication will be fixed in the next merge cycle, this is a minimal fix - an error message in vhost talks about guest moving used index - which of course never happens, guest only ever moves the available index - i2c-virtio didn't set the driver owner so it did not get refcounted correctly" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: correct misleading printing information vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE virtio: store owner from modules with register_virtio_driver() vhost: Add smp_rmb() in vhost_enable_notify() vhost: Add smp_rmb() in vhost_vq_avail_empty()
2024-04-14net: change maximum number of UDP segments to 128Yuri Benditovich
The commit fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") adds check of potential number of UDP segments vs UDP_MAX_SEGMENTS in linux/virtio_net.h. After this change certification test of USO guest-to-guest transmit on Windows driver for virtio-net device fails, for example with packet size of ~64K and mss of 536 bytes. In general the USO should not be more restrictive than TSO. Indeed, in case of unreasonably small mss a lot of segments can cause queue overflow and packet loss on the destination. Limit of 128 segments is good for any practical purpose, with minimal meaningful mss of 536 the maximal UDP packet will be divided to ~120 segments. The number of segments for UDP packets is validated vs UDP_MAX_SEGMENTS also in udp.c (v4,v6), this does not affect quest-to-guest path but does affect packets sent to host, for example. It is important to mention that UDP_MAX_SEGMENTS is kernel-only define and not available to user mode socket applications. In order to request MSS smaller than MTU the applications just uses setsockopt with SOL_UDP and UDP_SEGMENT and there is no limitations on socket API level. Fixes: fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-14bootconfig: use memblock_free_late to free xbc memory to buddyQiang Zhang
On the time to free xbc memory in xbc_exit(), memblock may has handed over memory to buddy allocator. So it doesn't make sense to free memory back to memblock. memblock_free() called by xbc_exit() even causes UAF bugs on architectures with CONFIG_ARCH_KEEP_MEMBLOCK disabled like x86. Following KASAN logs shows this case. This patch fixes the xbc memory free problem by calling memblock_free() in early xbc init error rewind path and calling memblock_free_late() in xbc exit path to free memory to buddy allocator. [ 9.410890] ================================================================== [ 9.418962] BUG: KASAN: use-after-free in memblock_isolate_range+0x12d/0x260 [ 9.426850] Read of size 8 at addr ffff88845dd30000 by task swapper/0/1 [ 9.435901] CPU: 9 PID: 1 Comm: swapper/0 Tainted: G U 6.9.0-rc3-00208-g586b5dfb51b9 #5 [ 9.446403] Hardware name: Intel Corporation RPLP LP5 (CPU:RaptorLake)/RPLP LP5 (ID:13), BIOS IRPPN02.01.01.00.00.19.015.D-00000000 Dec 28 2023 [ 9.460789] Call Trace: [ 9.463518] <TASK> [ 9.465859] dump_stack_lvl+0x53/0x70 [ 9.469949] print_report+0xce/0x610 [ 9.473944] ? __virt_addr_valid+0xf5/0x1b0 [ 9.478619] ? memblock_isolate_range+0x12d/0x260 [ 9.483877] kasan_report+0xc6/0x100 [ 9.487870] ? memblock_isolate_range+0x12d/0x260 [ 9.493125] memblock_isolate_range+0x12d/0x260 [ 9.498187] memblock_phys_free+0xb4/0x160 [ 9.502762] ? __pfx_memblock_phys_free+0x10/0x10 [ 9.508021] ? mutex_unlock+0x7e/0xd0 [ 9.512111] ? __pfx_mutex_unlock+0x10/0x10 [ 9.516786] ? kernel_init_freeable+0x2d4/0x430 [ 9.521850] ? __pfx_kernel_init+0x10/0x10 [ 9.526426] xbc_exit+0x17/0x70 [ 9.529935] kernel_init+0x38/0x1e0 [ 9.533829] ? _raw_spin_unlock_irq+0xd/0x30 [ 9.538601] ret_from_fork+0x2c/0x50 [ 9.542596] ? __pfx_kernel_init+0x10/0x10 [ 9.547170] ret_from_fork_asm+0x1a/0x30 [ 9.551552] </TASK> [ 9.555649] The buggy address belongs to the physical page: [ 9.561875] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x45dd30 [ 9.570821] flags: 0x200000000000000(node=0|zone=2) [ 9.576271] page_type: 0xffffffff() [ 9.580167] raw: 0200000000000000 ffffea0011774c48 ffffea0012ba1848 0000000000000000 [ 9.588823] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 [ 9.597476] page dumped because: kasan: bad access detected [ 9.605362] Memory state around the buggy address: [ 9.610714] ffff88845dd2ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 9.618786] ffff88845dd2ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 9.626857] >ffff88845dd30000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.634930] ^ [ 9.638534] ffff88845dd30080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.646605] ffff88845dd30100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.654675] ================================================================== Link: https://lore.kernel.org/all/20240414114944.1012359-1-qiang4.zhang@linux.intel.com/ Fixes: 40caa127f3c7 ("init: bootconfig: Remove all bootconfig data when the init memory is removed") Cc: Stable@vger.kernel.org Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-12Merge tag 'io_uring-6.9-20240412' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix for sigmask restoring while waiting for events (Alexey) - Typo fix in comment (Haiyue) - Fix for a msg_control retstore on SEND_ZC retries (Pavel) * tag 'io_uring-6.9-20240412' of git://git.kernel.dk/linux: io-uring: correct typo in comment for IOU_F_TWQ_LAZY_WAKE io_uring/net: restore msg_control on sendzc retry io_uring: Fix io_cqring_wait() not restoring sigmask on get_timespec64() failure
2024-04-12Merge tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Looks like everyone woke up after holidays, this weeks pull has a bunch of stuff all over, 2 weeks worth of amdgpu is a lot of it, then i915/xe have a few, a bunch of msm fixes, then some scattered driver fixes. I expect things will settle down for rc5. client: - Protect connector modes with mode_config mutex ast: - Fix soft lockup host1x: - Do not setup DMA for virtual addresses ivpu: - Fix deadlock in context_xa - PCI fixes - Fixes to error handling nouveau: - gsp: Fix OOB access - Fix casting panfrost: - Fix error path in MMU code qxl: - Revert "drm/qxl: simplify qxl_fence_wait" vmwgfx: - Enable DMA for SEV mappings i915: - Couple CDCLK programming fixes - HDCP related fix - 4 Bigjoiner related fixes - Fix for a circular locking around GuC on reset+wedged case xe: - Fix double display mutex initializations - Fix u32 -> u64 implicit conversions - Fix RING_CONTEXT_CONTROL not marked as masked msm: - DP refcount leak fix on disconnect - Add missing newlines to prints in msm_fb and msm_kms - fix dpu debugfs entry permissions - Fix the interface table for the catalog of X1E80100 - fix irq message printing - Bindings fix to add DP node as child of mdss for mdss node - Minor typo fix in DP driver API which handles port status change - fix CHRASHDUMP_READ() - fix HHB (highest bank bit) for a619 to fix UBWC corruption amdgpu: - GPU reset fixes - Fix some confusing logging - UMSCH fix - Aborted suspend fix - DCN 3.5 fixes - S4 fix - MES logging fixes - SMU 14 fixes - SDMA 4.4.2 fix - KASAN fix - SMU 13.0.10 fix - VCN partition fix - GFX11 fixes - DWB fixes - Plane handling fix - FAMS fix - DCN 3.1.6 fix - VSC SDP fixes - OLED panel fix - GFX 11.5 fix amdkfd: - GPU reset fixes - fix ioctl integer overflow" * tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel: (65 commits) amdkfd: use calloc instead of kzalloc to avoid integer overflow drm/xe: Label RING_CONTEXT_CONTROL as masked drm/xe/xe_migrate: Cast to output precision before multiplying operands drm/xe/hwmon: Cast result to output precision on left shift of operand drm/xe/display: Fix double mutex initialization drm/amdgpu: differentiate external rev id for gfx 11.5.0 drm/amd/display: Adjust dprefclk by down spread percentage. drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4 drm/amd/display: fix disable otg wa logic in DCN316 drm/amd/display: Do not recursively call manual trigger programming drm/amd/display: always reset ODM mode in context when adding first plane drm/amdgpu: fix incorrect number of active RBs for gfx11 drm/amd/display: Return max resolution supported by DWB amd/amdkfd: sync all devices to wait all processes being evicted drm/amdgpu: clear set_q_mode_offs when VM changed drm/amdgpu: Fix VCN allocation in CPX partition drm/amd/pm: fix the high voltage issue after unload drm/amd/display: Skip on writeback when it's not applicable drm/amdgpu: implement IRQ_STATE_ENABLE for SDMA v4.4.2 ...
2024-04-11Merge tag 'hyperv-fixes-signed-20240411' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Some cosmetic changes (Erni Sri Satya Vennela, Li Zhijian) - Introduce hv_numa_node_to_pxm_info() (Nuno Das Neves) - Fix KVP daemon to handle IPv4 and IPv6 combination for keyfile format (Shradha Gupta) - Avoid freeing decrypted memory in a confidential VM (Rick Edgecombe and Michael Kelley) * tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Don't free ring buffers that couldn't be re-encrypted uio_hv_generic: Don't free decrypted memory hv_netvsc: Don't free decrypted memory Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails hv/hv_kvp_daemon: Handle IPv4 and Ipv6 combination for keyfile format hv: vmbus: Convert sprintf() family to sysfs_emit() family mshyperv: Introduce hv_numa_node_to_pxm_info() x86/hyperv: Cosmetic changes for hv_apic.c
2024-04-12Merge tag 'drm-misc-fixes-2024-04-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ast: - Fix soft lockup client: - Protect connector modes with mode_config mutex host1x: - Do not setup DMA for virtual addresses ivpu: - Fix deadlock in context_xa - PCI fixes - Fixes to error handling nouveau: - gsp: Fix OOB access - Fix casting panfrost: - Fix error path in MMU code qxl: - Revert "drm/qxl: simplify qxl_fence_wait" vmwgfx: - Enable DMA for SEV mappings Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240411073403.GA9895@localhost.localdomain
2024-04-11Merge tag 'net-6.9-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth. Current release - new code bugs: - netfilter: complete validation of user input - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev Previous releases - regressions: - core: fix u64_stats_init() for lockdep when used repeatedly in one file - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr - bluetooth: fix memory leak in hci_req_sync_complete() - batman-adv: avoid infinite loop trying to resize local TT - drv: geneve: fix header validation in geneve[6]_xmit_skb - drv: bnxt_en: fix possible memory leak in bnxt_rdma_aux_device_init() - drv: mlx5: offset comp irq index in name by one - drv: ena: avoid double-free clearing stale tx_info->xdpf value - drv: pds_core: fix pdsc_check_pci_health deadlock Previous releases - always broken: - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING - bluetooth: fix setsockopt not validating user input - af_unix: clear stale u->oob_skb. - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies - drv: virtio_net: fix guest hangup on invalid RSS update - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow - dsa: mt7530: trap link-local frames regardless of ST Port State" * tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits) net: ena: Set tx_info->xdpf value to NULL net: ena: Fix incorrect descriptor free behavior net: ena: Wrong missing IO completions check order net: ena: Fix potential sign extension issue af_unix: Fix garbage collector racing against connect() net: dsa: mt7530: trap link-local frames regardless of ST Port State Revert "s390/ism: fix receive message buffer allocation" net: sparx5: fix wrong config being used when reconfiguring PCS net/mlx5: fix possible stack overflows net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev net/mlx5e: RSS, Block XOR hash with over 128 channels net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit net/mlx5e: HTB, Fix inconsistencies with QoS SQs number net/mlx5e: Fix mlx5e_priv_init() cleanup flow net/mlx5e: RSS, Block changing channels number when RXFH is configured net/mlx5: Correctly compare pkt reformat ids net/mlx5: Properly link new fs rules into the tree net/mlx5: offset comp irq index in name by one net/mlx5: Register devlink first under devlink lock net/mlx5: E-switch, store eswitch pointer before registering devlink_param ...
2024-04-11Merge tag 'loongarch-fixes-6.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: - make {virt, phys, page, pfn} translation work with KFENCE for LoongArch (otherwise NVMe and virtio-blk cannot work with KFENCE enabled) - update dts files for Loongson-2K series to make devices work correctly - fix a build error * tag 'loongarch-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Include linux/sizes.h in addrspace.h to prevent build errors LoongArch: Update dts for Loongson-2K2000 to support GMAC/GNET LoongArch: Update dts for Loongson-2K2000 to support PCI-MSI LoongArch: Update dts for Loongson-2K2000 to support ISA/LPC LoongArch: Update dts for Loongson-2K1000 to support ISA/LPC LoongArch: Make virt_addr_valid()/__virt_addr_valid() work with KFENCE LoongArch: Make {virt, phys, page, pfn} translation work with KFENCE mm: Move lowmem_page_address() a little later
2024-04-11peci: linux/peci.h: fix Excess kernel-doc description warningRandy Dunlap
Remove the @controller: line to prevent the kernel-doc warning: include/linux/peci.h:84: warning: Excess struct member 'controller' description in 'peci_device' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Iwona Winiarska <iwona.winiarska@intel.com> Cc: openbmc@lists.ozlabs.org Reviewed-by: Iwona Winiarska <iwona.winiarska@intel.com> Fixes: 6523d3b2ffa2 ("peci: Add core infrastructure") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240329182910.29495-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-10Merge tag 'bootconfig-fixes-v6.9-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull bootconfig fixes from Masami Hiramatsu: - show the original cmdline only once, and only if it was modeified by bootconfig * tag 'bootconfig-fixes-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fs/proc: Skip bootloader comment if no embedded kernel parameters fs/proc: remove redundant comments from /proc/bootconfig
2024-04-10Drivers: hv: vmbus: Track decrypted status in vmbus_gpadlRick Edgecombe
In CoCo VMs it is possible for the untrusted host to cause set_memory_encrypted() or set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues. In order to make sure callers of vmbus_establish_gpadl() and vmbus_teardown_gpadl() don't return decrypted/shared pages to allocators, add a field in struct vmbus_gpadl to keep track of the decryption status of the buffers. This will allow the callers to know if they should free or leak the pages. Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20240311161558.1310-3-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240311161558.1310-3-mhklinux@outlook.com>
2024-04-10Merge tag 'hardening-v6.9-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel) - ubsan: fix unused variable warning in test module (Arnd Bergmann) - Improve entropy diffusion in randomize_kstack * tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: randomize_kstack: Improve entropy diffusion ubsan: fix unused variable warning in test module gcc-plugins/stackleak: Avoid .head.text section
2024-04-10mm: Move lowmem_page_address() a little laterHuacai Chen
LoongArch will override page_to_virt() which use page_address() in the KFENCE case (by defining WANT_PAGE_VIRTUAL/HASHED_PAGE_VIRTUAL). So move lowmem_page_address() a little later to avoid such build errors: error: implicit declaration of function 'page_address'. Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-04-09net: add copy_safe_from_sockptr() helperEric Dumazet
copy_from_sockptr() helper is unsafe, unless callers did the prior check against user provided optlen. Too many callers get this wrong, lets add a helper to fix them and avoid future copy/paste bugs. Instead of : if (optlen < sizeof(opt)) { err = -EINVAL; break; } if (copy_from_sockptr(&opt, optval, sizeof(opt)) { err = -EFAULT; break; } Use : err = copy_safe_from_sockptr(&opt, sizeof(opt), optval, optlen); if (err) break; Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240408082845.3957374-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-09io-uring: correct typo in comment for IOU_F_TWQ_LAZY_WAKEHaiyue Wang
The 'r' key is near to 't' key, that makes 'with' to be 'wirh' ? :) Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Link: https://lore.kernel.org/r/20240409173531.846714-1-haiyue.wang@intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-09compiler.h: Add missing quote in macro commentThorsten Blum
Add a missing doublequote in the __is_constexpr() macro comment. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-09firmware: qcom: uefisecapp: Fix memory related IO errors and crashesMaximilian Luz
It turns out that while the QSEECOM APP_SEND command has specific fields for request and response buffers, uefisecapp expects them both to be in a single memory region. Failure to adhere to this has (so far) resulted in either no response being written to the response buffer (causing an EIO to be emitted down the line), the SCM call to fail with EINVAL (i.e., directly from TZ/firmware), or the device to be hard-reset. While this issue can be triggered deterministically, in the current form it seems to happen rather sporadically (which is why it has gone unnoticed during earlier testing). This is likely due to the two kzalloc() calls (for request and response) being directly after each other. Which means that those likely return consecutive regions most of the time, especially when not much else is going on in the system. Fix this by allocating a single memory region for both request and response buffers, properly aligning both structs inside it. This unfortunately also means that the qcom_scm_qseecom_app_send() interface needs to be restructured, as it should no longer map the DMA regions separately. Therefore, move the responsibility of DMA allocation (or mapping) to the caller. Fixes: 759e7a2b62eb ("firmware: Add support for Qualcomm UEFI Secure Application") Cc: stable@vger.kernel.org # 6.7 Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Konrad Dybcio <konrad.dybcio@linaro.org> # X13s Link: https://lore.kernel.org/r/20240406130125.1047436-1-luzmaximilian@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-04-09fs/proc: Skip bootloader comment if no embedded kernel parametersMasami Hiramatsu
If the "bootconfig" kernel command-line argument was specified or if the kernel was built with CONFIG_BOOT_CONFIG_FORCE, but if there are no embedded kernel parameter, omit the "# Parameters from bootloader:" comment from the /proc/bootconfig file. This will cause automation to fall back to the /proc/cmdline file, which will be identical to the comment in this no-embedded-kernel-parameters case. Link: https://lore.kernel.org/all/20240409044358.1156477-2-paulmck@kernel.org/ Fixes: 8b8ce6c75430 ("fs/proc: remove redundant comments from /proc/bootconfig") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-08Merge tag 'fixes-2024-04-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fixes from Mike Rapoport: "Fix build errors in memblock tests: - add stubs to functions that calls to them were recently added to memblock but they were missing in tests - update gfp_types.h to include bits.h so that BIT() definitions won't depend on other includes" * tag 'fixes-2024-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: fix undefined reference to `BIT' memblock tests: fix undefined reference to `panic' memblock tests: fix undefined reference to `early_pfn_to_nid'
2024-04-08locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=ySebastian Andrzej Siewior
The commit cited below broke the build for PREEMPT_RT because rwsem_assert_held_write_nolockdep() passes a struct rw_semaphore but rw_base_assert_held_write() expects struct rwbase_rt. Fixing the type alone leads to the problem that WARN_ON() is not found because bug.h is missing. In order to resolve this: - Keep the assert (WARN_ON()) in rwsem.h (not rwbase_rt.h) - Make rwsem_assert_held_write_nolockdep() do the implementation specific (rw_base) writer check. - Replace the "inline" with __always_inline which was used before. Fixes: f70405afc99b1 ("locking: Add rwsem_assert_held() and rwsem_assert_held_write()") Reported-by: Clark Williams <williams@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20240319182050.U4AzUF3I@linutronix.de
2024-04-08irqflags: Explicitly ignore lockdep_hrtimer_exit() argumentArnd Bergmann
When building with 'make W=1' but CONFIG_TRACE_IRQFLAGS=n, the unused argument to lockdep_hrtimer_exit() causes a warning: kernel/time/hrtimer.c:1655:14: error: variable 'expires_in_hardirq' set but not used [-Werror=unused-but-set-variable] This is intentional behavior, so add a cast to void to shut up the warning. Fixes: 73d20564e0dc ("hrtimer: Don't dereference the hrtimer pointer after the callback") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240408074609.3170807-1-arnd@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202311191229.55QXHVc6-lkp@intel.com/
2024-04-08regmap: Add regmap_read_bypassed()Richard Fitzgerald
Add a regmap_read_bypassed() to allow reads from the hardware registers while the regmap is in cache-only mode. A typical use for this is to keep the cache in cache-only mode until the hardware has reached a valid state, but one or more status registers must be polled to determine when this state is reached. For example, firmware download on the cs35l56 can take several seconds if there are multiple amps sharing limited bus bandwidth. This is too long to block in probe() so it is done as a background task. The device must be soft-reset to reboot the firmware and during this time the registers are not accessible, so the cache should be in cache-only. But the driver must poll a register to detect when reboot has completed. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file") Link: https://msgid.link/r/20240408101803.43183-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-04-08virtio: store owner from modules with register_virtio_driver()Krzysztof Kozlowski
Modules registering driver with register_virtio_driver() might forget to set .owner field. i2c-virtio.c for example has it missing. The field is used by some other kernel parts for reference counting (try_module_get()), so it is expected that drivers will set it. Solve the problem by moving this task away from the drivers to the core virtio code, just like we did for platform_driver in commit 9447057eaff8 ("platform_device: use a macro instead of platform_driver_register"). Fixes: 3cfc88380413 ("i2c: virtio: add a virtio i2c frontend driver") Cc: "Jie Deng" <jie.deng@intel.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Message-Id: <20240331-module-owner-virtio-v2-1-98f04bfaf46a@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-04-08bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueueJason Xing
Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which syzbot reported [1]. [1] BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 sk_psock_put include/linux/skmsg.h:459 [inline] sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 unix_release+0x4b/0x80 net/unix/af_unix.c:1048 __sock_release net/socket.c:659 [inline] sock_close+0x68/0x150 net/socket.c:1421 __fput+0x2c1/0x660 fs/file_table.c:422 __fput_sync+0x44/0x60 fs/file_table.c:507 __do_sys_close fs/open.c:1556 [inline] __se_sys_close+0x101/0x1b0 fs/open.c:1541 __x64_sys_close+0x1f/0x30 fs/open.c:1541 do_syscall_64+0xd3/0x1d0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: sk_psock_data_ready include/linux/skmsg.h:464 [inline] sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 unix_read_skb net/unix/af_unix.c:2546 [inline] unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x140/0x180 net/socket.c:745 ____sys_sendmsg+0x312/0x410 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 do_syscall_64+0xd3/0x1d0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer similarly due to no protection of saved_data_ready. Here is another different caller causing the same issue because of the same reason. So we should protect it with sk_callback_lock read lock because the writer side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". To avoid errors that could happen in future, I move those two pairs of lock into the sk_psock_data_ready(), which is suggested by John Fastabend. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com Signed-off-by: Jason Xing <kernelxing@tencent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d Link: https://lore.kernel.org/all/20240329134037.92124-1-kerneljasonxing@gmail.com Link: https://lore.kernel.org/bpf/20240404021001.94815-1-kerneljasonxing@gmail.com
2024-04-07Merge tag 'x86-urgent-2024-04-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Fix MCE timer reinit locking - Fix/improve CoCo guest random entropy pool init - Fix SEV-SNP late disable bugs - Fix false positive objtool build warning - Fix header dependency bug - Fix resctrl CPU offlining bug * tag 'x86-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk x86/mce: Make sure to grab mce_sysfs_mutex in set_bank() x86/CPU/AMD: Track SNP host status with cc_platform_*() x86/cc: Add cc_platform_set/_clear() helpers x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM x86/coco: Require seeding RNG with RDRAND on CoCo systems x86/numa/32: Include missing <asm/pgtable_areas.h> x86/resctrl: Fix uninitialized memory read when last CPU of domain goes offline
2024-04-07Merge tag 'timers-urgent-2024-04-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Fix various timer bugs: - Fix a timer migration bug that may result in missed events - Fix timer migration group hierarchy event updates - Fix a PowerPC64 build warning - Fix a handful of DocBook annotation bugs" * tag 'timers-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Return early on deactivation timers/migration: Fix ignored event due to missing CPU update vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h timers: Fix text inconsistencies and spelling tick/sched: Fix struct tick_sched doc warnings tick/sched: Fix various kernel-doc warnings timers: Fix kernel-doc format and add Return values time/timekeeping: Fix kernel-doc warnings and typos time/timecounter: Fix inline documentation
2024-04-06Merge branch 'linus' into x86/urgent, to pick up dependent commitIngo Molnar
We want to fix: 0e110732473e ("x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO") So merge in Linus's latest into x86/urgent to have it available. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-05u64_stats: fix u64_stats_init() for lockdep when used repeatedly in one filePetr Tesarik
Fix bogus lockdep warnings if multiple u64_stats_sync variables are initialized in the same file. With CONFIG_LOCKDEP, seqcount_init() is a macro which declares: static struct lock_class_key __key; Since u64_stats_init() is a function (albeit an inline one), all calls within the same file end up using the same instance, effectively treating them all as a single lock-class. Fixes: 9464ca650008 ("net: make u64_stats_init() a function") Closes: https://lore.kernel.org/netdev/ea1567d9-ce66-45e6-8168-ac40a47d1821@roeck-us.net/ Signed-off-by: Petr Tesarik <petr@tesarici.cz> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240404075740.30682-1-petr@tesarici.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-05Merge tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Backport of some fixes that came up during development of the 6.10 io_uring patches. This includes some kbuf cleanups and reference fixes. - Disable multishot read if we don't have NOWAIT support on the target - Fix for a dependency issue with workqueue flushing * tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linux: io_uring/kbuf: hold io_buffer_list reference over mmap io_uring/kbuf: protect io_buffer_list teardown with a reference io_uring/kbuf: get rid of bl->is_ready io_uring/kbuf: get rid of lower BGID lists io_uring: use private workqueue for exit work io_uring: disable io-wq execution of multishot NOWAIT requests io_uring/rw: don't allow multishot reads without NOWAIT support
2024-04-05Merge tag 'devicetree-fixes-for-6.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix NIOS2 boot with external DTB - Add missing synchronization needed between fw_devlink and DT overlay removals - Fix some unit-address regex's to be hex only - Drop some 10+ year old "unstable binding" statements - Add new SoCs to QCom UFS binding - Add TPM bindings to TPM maintainers * tag 'devicetree-fixes-for-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: nios2: Only use built-in devicetree blob if configured to do so dt-bindings: timer: narrow regex for unit address to hex numbers dt-bindings: soc: fsl: narrow regex for unit address to hex numbers dt-bindings: remoteproc: ti,davinci: remove unstable remark dt-bindings: clock: ti: remove unstable remark dt-bindings: clock: keystone: remove unstable remark of: module: prevent NULL pointer dereference in vsnprintf() dt-bindings: ufs: qcom: document SM6125 UFS dt-bindings: ufs: qcom: document SC7180 UFS dt-bindings: ufs: qcom: document SC8180X UFS of: dynamic: Synchronize of_changeset_destroy() with the devlink removals driver core: Introduce device_link_wait_removal() docs: dt-bindings: add missing address/size-cells to example MAINTAINERS: Add TPM DT bindings to TPM maintainers
2024-04-05Merge tag 'mm-hotfixes-stable-2024-04-05-11-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "8 hotfixes, 3 are cc:stable There are a couple of fixups for this cycle's vmalloc changes and one for the stackdepot changes. And a fix for a very old x86 PAT issue which can cause a warning splat" * tag 'mm-hotfixes-stable-2024-04-05-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: stackdepot: rename pool_index to pool_index_plus_1 x86/mm/pat: fix VM_PAT handling in COW mappings MAINTAINERS: change vmware.com addresses to broadcom.com selftests/mm: include strings.h for ffsl mm: vmalloc: fix lockdep warning mm: vmalloc: bail out early in find_vmap_area() if vmap is not init init: open output files from cpio unpacking with O_LARGEFILE mm/secretmem: fix GUP-fast succeeding on secretmem folios
2024-04-05Merge tag 'pm-6.9-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a recent Energy Model change that went against a recent scheduler change made independently (Vincent Guittot)" * tag 'pm-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: EM: fix wrong utilization estimation in em_cpu_energy()
2024-04-05stackdepot: rename pool_index to pool_index_plus_1Peter Collingbourne
Commit 3ee34eabac2a ("lib/stackdepot: fix first entry having a 0-handle") changed the meaning of the pool_index field to mean "the pool index plus 1". This made the code accessing this field less self-documenting, as well as causing debuggers such as drgn to not be able to easily remain compatible with both old and new kernels, because they typically do that by testing for presence of the new field. Because stackdepot is a debugging tool, we should make sure that it is debugger friendly. Therefore, give the field a different name to improve readability as well as enabling debugger backwards compatibility. This is needed in 6.9, which would otherwise become an odd release with the new semantics and old name so debuggers wouldn't recognize the new semantics there. Fixes: 3ee34eabac2a ("lib/stackdepot: fix first entry having a 0-handle") Link: https://lkml.kernel.org/r/20240402001500.53533-1-pcc@google.com Link: https://linux-review.googlesource.com/id/Ib3e70c36c1d230dd0a118dc22649b33e768b9f88 Signed-off-by: Peter Collingbourne <pcc@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Marco Elver <elver@google.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Omar Sandoval <osandov@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05mm/secretmem: fix GUP-fast succeeding on secretmem foliosDavid Hildenbrand
folio_is_secretmem() currently relies on secretmem folios being LRU folios, to save some cycles. However, folios might reside in a folio batch without the LRU flag set, or temporarily have their LRU flag cleared. Consequently, the LRU flag is unreliable for this purpose. In particular, this is the case when secretmem_fault() allocates a fresh page and calls filemap_add_folio()->folio_add_lru(). The folio might be added to the per-cpu folio batch and won't get the LRU flag set until the batch was drained using e.g., lru_add_drain(). Consequently, folio_is_secretmem() might not detect secretmem folios and GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel when we would later try reading/writing to the folio, because the folio has been unmapped from the directmap. Fix it by removing that unreliable check. Link: https://lkml.kernel.org/r/20240326143210.291116-2-david@redhat.com Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: xingwei lee <xrivendell7@gmail.com> Reported-by: yue sun <samsun1006219@gmail.com> Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2rmz0HQ@mail.gmail.com/ Debugged-by: Miklos Szeredi <miklos@szeredi.hu> Tested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05Merge tag 'vfs-6.9-rc3.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "This contains a few small fixes. This comes with some delay because I wanted to wait on people running their reproducers and the Easter Holidays meant that those replies came in a little later than usual: - Fix handling of preventing writes to mounted block devices. Since last kernel we allow to prevent writing to mounted block devices provided CONFIG_BLK_DEV_WRITE_MOUNTED isn't set and the block device is opened with restricted writes. When we switched to opening block devices as files we altered the mechanism by which we recognize when a block device has been opened with write restrictions. The detection logic assumed that only read-write mounted filesystems would apply write restrictions to their block devices from other openers. That of course is not true since it also makes sense to apply write restrictions for filesystems that are read-only. Fix the detection logic using an FMODE_* bit. We still have a few left since we freed up a couple a while ago. I also picked up a patch to free up four additional FMODE_* bits scheduled for the next merge window. - Fix counting the number of writers to a block device. This just changes the logic to be consistent. - Fix a bug in aio causing a NULL pointer derefernce after we implemented batched processing in aio. - Finally, add the changes we discussed that allows to yield block devices early even though file closing itself is deferred. This also allows us to remove two holder operations to get and release the holder to align lifetime of file and holder of the block device" * tag 'vfs-6.9-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: aio: Fix null ptr deref in aio_complete() wakeup fs,block: yield devices early block: count BLK_OPEN_RESTRICT_WRITES openers block: handle BLK_OPEN_RESTRICT_WRITES correctly
2024-04-05Revert "drm/qxl: simplify qxl_fence_wait"Alex Constantino
This reverts commit 5a838e5d5825c85556011478abde708251cc0776. Changes from commit 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") would result in a '[TTM] Buffer eviction failed' exception whenever it reached a timeout. Due to a dependency to DMA_FENCE_WARN this also restores some code deleted by commit d72277b6c37d ("dma-buf: nuke DMA_FENCE_TRACE macros v2"). Fixes: 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") Link: https://lore.kernel.org/regressions/ZTgydqRlK6WX_b29@eldamar.lan/ Reported-by: Timo Lindfors <timo.lindfors@iki.fi> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514 Signed-off-by: Alex Constantino <dreaming.about.electric.sheep@gmail.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240404181448.1643-2-dreaming.about.electric.sheep@gmail.com
2024-04-04Merge tag 'net-6.9-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, bluetooth and bpf. Fairly usual collection of driver and core fixes. The large selftest accompanying one of the fixes is also becoming a common occurrence. Current release - regressions: - ipv6: fix infinite recursion in fib6_dump_done() - net/rds: fix possible null-deref in newly added error path Current release - new code bugs: - net: do not consume a full cacheline for system_page_pool - bpf: fix bpf_arena-related file descriptor leaks in the verifier - drv: ice: fix freeing uninitialized pointers, fixing misuse of the newfangled __free() auto-cleanup Previous releases - regressions: - x86/bpf: fixes the BPF JIT with retbleed=stuff - xen-netfront: add missing skb_mark_for_recycle, fix page pool accounting leaks, revealed by recently added explicit warning - tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6 non-wildcard addresses - Bluetooth: - replace "hci_qca: Set BDA quirk bit if fwnode exists in DT" with better workarounds to un-break some buggy Qualcomm devices - set conn encrypted before conn establishes, fix re-connecting to some headsets which use slightly unusual sequence of msgs - mptcp: - prevent BPF accessing lowat from a subflow socket - don't account accept() of non-MPC client as fallback to TCP - drv: mana: fix Rx DMA datasize and skb_over_panic - drv: i40e: fix VF MAC filter removal Previous releases - always broken: - gro: various fixes related to UDP tunnels - netns crossing problems, incorrect checksum conversions, and incorrect packet transformations which may lead to panics - bpf: support deferring bpf_link dealloc to after RCU grace period - nf_tables: - release batch on table validation from abort path - release mutex after nft_gc_seq_end from abort path - flush pending destroy work before exit_net release - drv: r8169: skip DASH fw status checks when DASH is disabled" * tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) netfilter: validate user input for expected length net/sched: act_skbmod: prevent kernel-infoleak net: usb: ax88179_178a: avoid the interface always configured as random address net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45() net: ravb: Always update error counters net: ravb: Always process TX descriptor ring netfilter: nf_tables: discard table flag update with pending basechain deletion netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() netfilter: nf_tables: reject new basechain after table flag update netfilter: nf_tables: flush pending destroy work before exit_net release netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path netfilter: nf_tables: release batch on table validation from abort path Revert "tg3: Remove residual error handling in tg3_suspend" tg3: Remove residual error handling in tg3_suspend net: mana: Fix Rx DMA datasize and skb_over_panic net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping net: stmmac: fix rx queue priority assignment net: txgbe: fix i2c dev name cannot match clkdev net: fec: Set mac_managed_pm during probe ...
2024-04-04Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-04-04 We've added 7 non-merge commits during the last 5 day(s) which contain a total of 9 files changed, 75 insertions(+), 24 deletions(-). The main changes are: 1) Fix x86 BPF JIT under retbleed=stuff which causes kernel panics due to incorrect destination IP calculation and incorrect IP for relocations, from Uros Bizjak and Joan Bruguera Micó. 2) Fix BPF arena file descriptor leaks in the verifier, from Anton Protopopov. 3) Defer bpf_link deallocation to after RCU grace period as currently running multi-{kprobes,uprobes} programs might still access cookie information from the link, from Andrii Nakryiko. 4) Fix a BPF sockmap lock inversion deadlock in map_delete_elem reported by syzkaller, from Jakub Sitnicki. 5) Fix resolve_btfids build with musl libc due to missing linux/types.h include, from Natanael Copa. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, sockmap: Prevent lock inversion deadlock in map delete elem x86/bpf: Fix IP for relocating call depth accounting x86/bpf: Fix IP after emitting call depth accounting bpf: fix possible file descriptor leaks in verifier tools/resolve_btfids: fix build with musl libc bpf: support deferring bpf_link dealloc to after RCU grace period bpf: put uprobe link's path and task in release callback ==================== Link: https://lore.kernel.org/r/20240404183258.4401-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-04PM: EM: fix wrong utilization estimation in em_cpu_energy()Vincent Guittot
Commit 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division") has added back map_util_perf() in em_cpu_energy() computation which has been removed with the rework of scheduler/cpufreq interface. This is wrong because sugov_effective_cpu_perf() already takes care of mapping the utilization to a performance level. Fixes: 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-04x86/cc: Add cc_platform_set/_clear() helpersBorislav Petkov (AMD)
Add functionality to set and/or clear different attributes of the machine as a confidential computing platform. Add the first one too: whether the machine is running as a host for SEV-SNP guests. Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support") Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Link: https://lore.kernel.org/r/20240327154317.29909-5-bp@alien8.de
2024-04-04memblock tests: fix undefined reference to `BIT'Wei Yang
commit 772dd0342727 ("mm: enumerate all gfp flags") define gfp flags with the help of BIT, while gfp_types.h doesn't include header file for the definition. This through an error on building memblock tests. Let's include linux/bits.h to fix it. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Suren Baghdasaryan <surenb@google.com> CC: Michal Hocko <mhocko@suse.com> Link: https://lore.kernel.org/r/20240402132701.29744-4-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-04-03randomize_kstack: Improve entropy diffusionKees Cook
The kstack_offset variable was really only ever using the low bits for kernel stack offset entropy. Add a ror32() to increase bit diffusion. Suggested-by: Arnd Bergmann <arnd@arndb.de> Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall") Link: https://lore.kernel.org/r/20240309202445.work.165-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-02io_uring/kbuf: get rid of lower BGID listsJens Axboe
Just rely on the xarray for any kind of bgid. This simplifies things, and it really doesn't bring us much, if anything. Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Jens Axboe <axboe@kernel.dk>