Age | Commit message (Collapse) | Author |
|
Use store_release_wake_up() instead of wake_up_var_locked(), because the
waiter cannot retake the nfs_uuid->lock.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Suggested-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/all/175262948827.2234665.1891349021754495573@noble.neil.brown.name/
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
In order for the wait in nfs_uuid_put() to be safe, it is necessary to
ensure that nfs_uuid_add_file() doesn't add a new entry once the
nfs_uuid->net has been NULLed out.
Also fix up the wake_up_var_locked() / wait_var_event_spinlock() to both
use the nfs_uuid address, since nfl, and &nfl->uuid could be used elsewhere.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/all/175262893035.2234665.1735173020338594784@noble.neil.brown.name/
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the struct nfs_file_localio is closed, its list entry will be empty,
but the nfs_uuid->files list might still contain other entries.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
A lot of test cases in the file are related to the idle and unbalanced
timers of resilient nexthop groups and these tests are reported to be
flaky on slow machines running debug kernels.
Rather than marking a lot of individual tests with xfail_on_slow(),
simply mark all the tests. Note that the test is stable on non-debug
machines and that with debug kernels we are mainly interested in the
output of various sanitizers in order to determine pass / fail.
Before:
# make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
TEST_GEN_PROGS="" run_tests
[...]
# TEST: Bucket migration after idle timer (with delete) [FAIL]
# Group expected to still be unbalanced
[...]
not ok 1 selftests: drivers/net/netdevsim: nexthop.sh # exit=1
After:
# make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
TEST_GEN_PROGS="" run_tests
[...]
# TEST: Bucket migration after idle timer (with delete) [XFAIL]
# Group expected to still be unbalanced
[...]
ok 1 selftests: drivers/net/netdevsim: nexthop.sh
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20250729160609.02e0f157@kernel.org/
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250804114320.193203-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Mohsin Bashir says:
====================
eth: fbnic: Fix drop stats support
Fix hardware drop stats support on the TX path of fbnic by addressing two
issues: ensure that tx_dropped stats are correctly copied to the
rtnl_link_stats64 struct, and protect the copying of drop stats from
fdb->hw_stats to the local variable with the hw_stats_lock to
ensure consistency.
====================
Link: https://patch.msgid.link/20250802024636.679317-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Wrap copying of drop stats on TX path from fbd->hw_stats by the
hw_stats_lock. Currently, it is being performed outside the lock and
another thread accessing fbd->hw_stats can lead to inconsistencies.
Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250802024636.679317-3-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Correctly copy the tx_dropped stats from the fbd->hw_stats to the
rtnl_link_stats64 struct.
Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250802024636.679317-2-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Alex added page bias of LONG_MAX, which is admittedly quite
a clever way of catching overflows of the pp ref count.
The page pool code was "optimized" to leave the ref at 1
for freed pages so it can't catch basic bugs by itself any more.
(Something we should probably address under DEBUG_NET...)
Unfortunately for fbnic since commit f7dc3248dcfb ("skbuff: Optimization
of SKB coalescing for page pool") core _may_ actually take two extra
pp refcounts, if one of them is returned before driver gives up the bias
the ret < 0 check in page_pool_unref_netmem() will trigger.
While at it add a FBNIC_ to the name of the driver constant.
Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free")
Link: https://patch.msgid.link/20250801170754.2439577-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After the call to phy_disconnect() netdev->phydev is reset to NULL.
So fixed_phy_unregister() would be called with a NULL pointer as argument.
Therefore cache the phy_device before this call.
Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Link: https://patch.msgid.link/2b80a77a-06db-4dd7-85dc-3a8e0de55a1d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Emails to alexandru.tachici@analog.com bounce permanently:
Remote Server returned '550 5.1.10 RESOLVER.ADR.RecipientNotFound; Recipient not found by SMTP address lookup'
so replace him with Marcelo Schmitt from Analog.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
Link: https://patch.msgid.link/20250724113758.61874-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A large DMA mapping request can loop through dma address pinning for
many pages. In cases where THP can not be used, the repeated vmf_insert_pfn can
be costly, so let the task reschedule as need to prevent CPU stalls. Failure to
do so has potential harmful side effects, like increased memory pressure
as unrelated rcu tasks are unable to make their reclaim callbacks and
result in OOM conditions.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 36-....: (20999 ticks this GP) idle=b01c/1/0x4000000000000000 softirq=35839/35839 fqs=3538
rcu: hardirqs softirqs csw/system
rcu: number: 0 107 0
rcu: cputime: 50 0 10446 ==> 10556(ms)
rcu: (t=21075 jiffies g=377761 q=204059 ncpus=384)
...
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? walk_system_ram_range+0x63/0x120
? walk_system_ram_range+0x46/0x120
? pgprot_writethrough+0x20/0x20
lookup_memtype+0x67/0xf0
track_pfn_insert+0x20/0x40
vmf_insert_pfn_prot+0x88/0x140
vfio_pci_mmap_huge_fault+0xf9/0x1b0 [vfio_pci_core]
__do_fault+0x28/0x1b0
handle_mm_fault+0xef1/0x2560
fixup_user_fault+0xf5/0x270
vaddr_get_pfns+0x169/0x2f0 [vfio_iommu_type1]
vfio_pin_pages_remote+0x162/0x8e0 [vfio_iommu_type1]
vfio_iommu_type1_ioctl+0x1121/0x1810 [vfio_iommu_type1]
? futex_wake+0x1c1/0x260
x64_sys_call+0x234/0x17a0
do_syscall_64+0x63/0x130
? exc_page_fault+0x63/0x130
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20250715184622.3561598-1-kbusch@meta.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Extend the qat_vfio_pci variant driver to support QAT 6xxx Virtual
Functions (VFs). Add the relevant QAT 6xxx VF device IDs to the driver's
probe table, enabling proper detection and initialization of these devices.
Update the module description to reflect that the driver now supports all
QAT generations.
Signed-off-by: Małgorzata Mielnik <malgorzata.mielnik@intel.com>
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Link: https://lore.kernel.org/r/20250715081150.1244466-1-suman.kumar.chakraborty@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Remove myself from VFIO QAT PCI driver maintainers as I'm leaving
Intel.
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Link: https://lore.kernel.org/r/20250715001357.33725-1-xin.zeng@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
This was missed during the initial implementation. The VFIO PCI encodes
the vf_token inside the device name when opening the device from the group
FD, something like:
"0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"
This is used to control access to a VF unless there is co-ordination with
the owner of the PF.
Since we no longer have a device name in the cdev path, pass the token
directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field
indicated by VFIO_DEVICE_BIND_FLAG_TOKEN.
Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD")
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Alienware calls this key "Performance Boost". Dell calls it "G-Mode".
The goal is to have a specific keycode to detect when this key is
pressed, so userspace can act upon it and do what have to do, usually
starting the power profile for performance.
Signed-off-by: Marcos Alano <marcoshalano@gmail.com>
Link: https://lore.kernel.org/r/20250509193708.2190586-1-marcoshalano@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 fixes for v6.17-rc1:
- Fixes around DP LFPS (Low-Frequency Periodic Signaling)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/e1147bede8f219682419d198022cfe8d9d4edc28@intel.com
|
|
Exercise various mmap(), munmap() and mremap() invocations, which might
cause a perf buffer mapping to be split or truncated.
To avoid hard coding the perf event and having dependencies on
architectures and configuration options, scan through event types in sysfs
and try to open them. On success, try to mmap() and if that succeeds try to
mmap() the AUX buffer.
In case that no AUX buffer supporting event is found, only test the base
buffer mapping. If no mappable event is found or permissions are not
sufficient, skip the tests.
Reserve a PROT_NONE region for both rb and aux tests to allow testing the
case where mremap unmaps beyond the end of a mapped VMA to prevent it from
unmapping unrelated mappings.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
|
|
The perf mmap code is careful about mmap()'ing the user page with the
ringbuffer and additionally the auxiliary buffer, when the event supports
it. Once the first mapping is established, subsequent mapping have to use
the same offset and the same size in both cases. The reference counting for
the ringbuffer and the auxiliary buffer depends on this being correct.
Though perf does not prevent that a related mapping is split via mmap(2),
munmap(2) or mremap(2). A split of a VMA results in perf_mmap_open() calls,
which take reference counts, but then the subsequent perf_mmap_close()
calls are not longer fulfilling the offset and size checks. This leads to
reference count leaks.
As perf already has the requirement for subsequent mappings to match the
initial mapping, the obvious consequence is that VMA splits, caused by
resizing of a mapping or partial unmapping, have to be prevented.
Implement the vm_operations_struct::may_split() callback and return
unconditionally -EINVAL.
That ensures that the mapping offsets and sizes cannot be changed after the
fact. Remapping to a different fixed address with the same size is still
possible as it takes the references for the new mapping and drops those of
the old mapping.
Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams")
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27504
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
|
|
After successful allocation of a buffer or a successful attachment to an
existing buffer perf_mmap() tries to map the buffer read only into the page
table. If that fails, the already set up page table entries are zapped, but
the other perf specific side effects of that failure are not handled. The
calling code just cleans up the VMA and does not invoke perf_mmap_close().
This leaks reference counts, corrupts user->vm accounting and also results
in an unbalanced invocation of event::event_mapped().
Cure this by moving the event::event_mapped() invocation before the
map_range() call so that on map_range() failure perf_mmap_close() can be
invoked without causing an unbalanced event::event_unmapped() call.
perf_mmap_close() undoes the reference counts and eventually frees buffers.
Fixes: b709eb872e19 ("perf: map pages in advance")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
|
|
When perf_mmap() fails to allocate a buffer, it still invokes the
event_mapped() callback of the related event. On X86 this might increase
the perf_rdpmc_allowed reference counter. But nothing undoes this as
perf_mmap_close() is never called in this case, which causes another
reference count leak.
Return early on failure to prevent that.
Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
|
|
Failure of the AUX buffer allocation leaks the reference count.
Set the reference count to 1 only when the allocation succeeds.
Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
|
|
A recent overhaul sets the return value to 0 unconditionally after the
allocations, which causes reference count leaks and corrupts the user->vm
accounting.
Preserve the AUX buffer allocation failure return value, so that the
subsequent code works correctly.
Fixes: 0983593f32c4 ("perf/core: Lift event->mmap_mutex in perf_mmap()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
|
|
This is step 3/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
Replace the mid_flags bitmask with dedicated boolean fields to
simplify locking logic and improve code readability:
- Replace MID_DELETED with bool deleted_from_q
- Replace MID_WAIT_CANCELLED with bool wait_cancelled
- Remove mid_flags field entirely
The new boolean fields have clearer semantics:
- deleted_from_q: whether mid has been removed from pending_mid_q
- wait_cancelled: whether request was cancelled during wait
This change reduces memory usage (from 4-byte bitmask to 2 boolean
flags) and eliminates confusion about which lock protects which
flag bits, preparing for per-mid locking in the next patch.
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This is step 2/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
Add a dedicated mid_counter_lock to protect current_mid counter,
separating it from mid_queue_lock which protects pending_mid_q
operations. This reduces lock contention and prepares for finer-
grained locking in subsequent patches.
Changes:
- Add TCP_Server_Info->mid_counter_lock spinlock
- Rename CurrentMid to current_mid for consistency
- Use mid_counter_lock to protect current_mid access
- Update locking documentation in cifsglob.h
This separation allows mid allocation to proceed without blocking
queue operations, improving performance under heavy load.
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This is step 1/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
The current mid_lock name is somewhat ambiguous about what it protects.
To prepare for splitting this lock into separate, more granular locks,
this patch renames mid_lock to mid_queue_lock to clearly indicate its
specific responsibility for protecting the pending_mid_q list and
related queue operations.
No functional changes are made in this patch - it only prepares the
codebase for the lock splitting that follows.
- mid_queue_lock for queue operations
- mid_counter_lock for mid counter operations
- per-mid locks for individual mid state management
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
we have been doing this:
static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
size_t size)
[...]
/* Calculate the number of bytes we need to push, for this page
* specifically */
size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
/* If we can't splice it, then copy it in, as normal */
if (!sendpage_ok(page[i]))
msg.msg_flags &= ~MSG_SPLICE_PAGES;
/* Set the bvec pointing to the page, with len $bytes */
bvec_set_page(&bvec, page[i], bytes, offset);
/* Set the iter to $size, aka the size of the whole sendpages (!!!) */
iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
lock_sock(sk);
/* Sendmsg with $size size (!!!) */
rv = tcp_sendmsg_locked(sk, &msg, size);
This means we've been sending oversized iov_iters and tcp_sendmsg calls
for a while. This has a been a benign bug because sendpage_ok() always
returned true. With the recent slab allocator changes being slowly
introduced into next (that disallow sendpage on large kmalloc
allocations), we have recently hit out-of-bounds crashes, due to slight
differences in iov_iter behavior between the MSG_SPLICE_PAGES and
"regular" copy paths:
(MSG_SPLICE_PAGES)
skb_splice_from_iter
iov_iter_extract_pages
iov_iter_extract_bvec_pages
uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
skb_splice_from_iter gets a "short" read
(!MSG_SPLICE_PAGES)
skb_copy_to_page_nocache copy=iov_iter_count
[...]
copy_from_iter
/* this doesn't help */
if (unlikely(iter->count < len))
len = iter->count;
iterate_bvec
... and we run off the bvecs
Fix this by properly setting the iov_iter's byte count, plus sending the
correct byte count to tcp_sendmsg_locked.
Link: https://patch.msgid.link/r/20250729120348.495568-1-pfalcato@suse.de
Cc: stable@vger.kernel.org
Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: Bernard Metzler <bernard.metzler@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Pull ARM update from Russell King:
"Just one development update this time:
- Finish removing Coresight support"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9449/1: coresight: Finish removal of Coresight support in arch/arm/kernel
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Use generic_write_sync instead of vfs_fsync_range in exfat_file_write_iter.
It will fix an issue where fdatasync would be set incorrectly.
- Fix potential infinite loop by the self-linked chain.
* tag 'exfat-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: add cluster chain loop check for dir
exfat: fdatasync flag should be same like generic_write_sync()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "mseal cleanups" (Lorenzo Stoakes)
Some mseal cleaning with no intended functional change.
- "Optimizations for khugepaged" (David Hildenbrand)
Improve khugepaged throughput by batching PTE operations for large
folios. This gain is mainly for arm64.
- "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport)
A bugfix, additional debug code and cleanups to the execmem code.
- "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song)
Bugfixes, cleanups and performance improvememnts to the mTHP swapin
code"
* tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits)
mm: mempool: fix crash in mempool_free() for zero-minimum pools
mm: correct type for vmalloc vm_flags fields
mm/shmem, swap: fix major fault counting
mm/shmem, swap: rework swap entry and index calculation for large swapin
mm/shmem, swap: simplify swapin path and result handling
mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
mm/shmem, swap: tidy up swap entry splitting
mm/shmem, swap: tidy up THP swapin checks
mm/shmem, swap: avoid redundant Xarray lookup during swapin
x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations
x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations
execmem: drop writable parameter from execmem_fill_trapping_insns()
execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP)
execmem: move execmem_force_rw() and execmem_restore_rox() before use
execmem: rework execmem_cache_free()
execmem: introduce execmem_alloc_rw()
execmem: drop unused execmem_update_copy()
mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
mm/rmap: add anon_vma lifetime debug check
mm: remove mm/io-mapping.c
...
|
|
Make vmem_pte_alloc() consistent by always allocating page table of
PAGE_SIZE granularity, regardless of whether page_table_alloc() (with
slab) or memblock_alloc() is used. This ensures page table can be fully
freed when the corresponding page table entries are removed.
Fixes: d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
Since $(LD) is directly used, hence -nostdlib is unneeded, MIPS has
removed this, we should remove it too.
bdbf2038fbf4 ("MIPS: VDSO: remove -nostdlib compiler flag").
In fact, other architectures also use $(LD) now.
fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO")
691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO")
2ff906994b6c ("MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO")
2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO")
Cc: stable@vger.kernel.org
Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The Loongson-2K2000 integrates one eMMC controller and one SDIO controller.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The Loongson-2K1000 integrates one SDIO controller for SD storage cards
and SDIO cards.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The Loongson-2K0500 integrates two SDIO controllers for SD storage cards
and SDIO cards, supporting SD storage card boot.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to
skip analysis/patching for the respective vulnerability, it is safe to
set both bpf_jit_bypass_spec_v1/v4(), because there is no speculation
barrier instruction for LoongArch.
Suggested-by: Luis Gerhorst <luis.gerhorst@fau.de>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
In specific use cases combining tailcalls and BPF-to-BPF calls,
MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt
back-propagation from callee to caller. This patch fixes this
tailcall issue caused by abusing the tailcall in bpf2bpf feature
on LoongArch like the way of "bpf, x64: Fix tailcall hierarchy".
Push tail_call_cnt_ptr and tail_call_cnt into the stack,
tail_call_cnt_ptr is passed between tailcall and bpf2bpf,
uses tail_call_cnt_ptr to increment tail_call_cnt.
Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The extra pass of bpf_int_jit_compile() skips JIT context initialization
which essentially skips offset calculation leaving out_offset = -1, so
the jmp_offset in emit_bpf_tail_call is calculated by
"#define jmp_offset (out_offset - (cur_offset))"
is a negative number, which is wrong. The final generated assembly are
as follow.
54: bgeu $a2, $t1, -8 # 0x0000004c
58: addi.d $a6, $s5, -1
5c: bltz $a6, -16 # 0x0000004c
60: alsl.d $t2, $a2, $a1, 0x3
64: ld.d $t2, $t2, 264
68: beq $t2, $zero, -28 # 0x0000004c
Before apply this patch, the follow test case will reveal soft lock issues.
cd tools/testing/selftests/bpf/
./test_progs --allow=tailcalls/tailcall_bpf2bpf_1
dmesg:
watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [test_progs:25056]
Cc: stable@vger.kernel.org
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper
prologue and epilogue for this case.
With this patch, all of the struct_ops related testcases (except
struct_ops_multi_pages) passed on LoongArch.
The testcase struct_ops_multi_pages failed is because the actual
image_pages_cnt is 40 which is bigger than MAX_TRAMP_IMAGE_PAGES.
Before:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
After:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
#15 bad_struct_ops:OK
...
#399 struct_ops_autocreate:OK
...
#400 struct_ops_kptr_return:OK
...
#401 struct_ops_maybe_null:OK
...
#402 struct_ops_module:OK
...
#404 struct_ops_no_cfi:OK
...
#405 struct_ops_private_stack:SKIP
...
#406 struct_ops_refcounted:OK
Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
BPF trampoline is the critical infrastructure of the BPF subsystem,
acting as a mediator between kernel functions and BPF programs. Numerous
important features, such as using BPF program for zero overhead kernel
introspection, rely on this key component.
The related tests have passed, including the following technical points:
1. fentry
2. fmod_ret
3. fexit
The following related testcases passed on LoongArch:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
This issue was first reported by Geliang Tang in June 2024 while
debugging MPTCP BPF selftests on a LoongArch machine (see commit
eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca").
Geliang, Huacai, and Tiezhu then worked together to drive the
implementation of this feature, encouraging broader collaboration among
Chinese kernel engineers.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
Reported-by: Geliang Tang <geliang@kernel.org>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
This commit adds support for BPF dynamic code modification on the
LoongArch architecture:
1. Add bpf_arch_text_copy() for instruction block copying.
2. Add bpf_arch_text_poke() for runtime instruction patching.
3. Add bpf_arch_text_invalidate() for code invalidation.
On LoongArch, since symbol addresses in the direct mapping region can't
be reached via relative jump instructions from the paged mapping region,
we use the move_imm+jirl instruction pair as absolute jump instructions.
These require 2-5 instructions, so we reserve 5 NOP instructions in the
program as placeholders for function jumps.
The larch_insn_text_copy() function is solely used for BPF. And the use
of larch_insn_text_copy() requires PAGE_SIZE alignment. Currently, only
the size of the BPF trampoline is page-aligned.
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
1. Rename the existing validate_code() to validate_ctx()
2. Factor out the code validation handling into a new helper
validate_code()
Then:
* validate_code() is used to check the validity of code.
* validate_ctx() is used to check both code validity and table entry
correctness.
The new validate_code() will be used in subsequent changes.
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Commit 7918bb2d19c9 ("vhost: basic in order support") introduces
vq->nheads to store the number of batched used buffers per used elem
but it forgets to initialize the vq->nheads to NULL in
vhost_dev_init() this will cause kfree() that would try to free it
without be allocated if SET_OWNER is not called.
Reported-by: JAEHOON KIM <jhkim@linux.ibm.com>
Reported-by: Breno Leitao <leitao@debian.org>
Fixes: 45347e79b544 ("vhost: basic in order support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250729073916.80647-1-jasowang@redhat.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Tested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Jaehoon Kim <jhkim@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
GICv5 LPI interrupts have an active state hence they cannot retrigger
while the interrupt is being handled.
Therefore, setting the IRQD_RESEND_WHEN_IN_PROGRESS flag on LPIs is
pointless, as the situation this flag caters for cannot happen.
Remove it.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250801-gic-v5-fixes-6-17-v1-3-4fcedaccf9e6@kernel.org
|
|
The 0-day bot reported that on the failure path the driver iounmap()s IWB
resources that are managed through devm_ioremap(), which is clearly wrong
because the driver would end up unmapping the MMIO resource twice on
probing failure.
Fix this by removing the error path altogether and by letting devres manage
the iounmapping on clean-up.
Fixes: 695949d8b16f ("irqchip/gic-v5: Add GICv5 IWB support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250801-gic-v5-fixes-6-17-v1-1-4fcedaccf9e6@kernel.org
Closes: https://lore.kernel.org/oe-kbuild-all/202508010038.N3r4ZmII-lkp@intel.com
|
|
When a kexec'ed kernel boots up, there might be stale unhandled interrupts
pending in the interrupt controller. These are delivered as spurious
interrupts once the boot CPU enables interrupts.
Clear all pending interrupts when the driver is initialized to prevent
these spurious interrupts from locking the CPU in an endless loop.
Signed-off-by: Elad Nachman <enachman@marvell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250803102548.669682-2-enachman@marvell.com
|
|
Commit 8b65db1e93a2 ("irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT
handling") added logic in msi_lib_irq_domain_select() to match the domain
fwnode against the fwnode parent of the fwspec.fwnode.
The fwnode_get_parent() caller must call fwnode_handle_put() on the
returned pointer value, lest fwnode refcounting for the parent ends up
being out of kilter.
Fix this by relying on the fwnode_handle clean-up handlers and by
incrementing the fwnode refcount regardless of whether parent matching is
used or not (the domain selection code already holds a reference before
calling msi_lib_irq_domain_select() but to make the exit path more uniform
if IRQ_DOMAIN_FLAG_FWNODE_PARENT is not set fwnode_handle_get() is called
again on fwspec.fwnode so that the clean-up code is the same for the two
matching patterns).
Fixes: 8b65db1e93a2 ("irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT handling")
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250804145553.795065-1-lpieralisi@kernel.org
|
|
smatch warns about a dereference before check:
drivers/irqchip/irq-riscv-imsic-platform.c:317 imsic_irqdomain_init() warn: variable dereferenced before check 'imsic' (see line 311)
Cure it by moving the firmware not assignement after the checks.
Fixes: 59422904dd98 ("irqchip/riscv-imsic: Convert to msi_create_parent_irq_domain() helper")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Closes: https://lore.kernel.org/r/202507311953.NFVZkr0a-lkp@intel.com/ ---
drivers/irqchip/irq-riscv-imsic-platform.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
|
|
SMB3.1.1 POSIX mounts support native symlinks that are created with
IO_REPARSE_TAG_SYMLINK reparse points, so skip the checking of
FILE_SUPPORTS_REPARSE_POINTS as some servers might not have it set.
Cc: linux-cifs@vger.kernel.org
Cc: Ralph Boehme <slow@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson@ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse
points.
Cc: linux-cifs@vger.kernel.org
Cc: Ralph Boehme <slow@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson@ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When making ZL3073X invisible, it was overlooked that ZL3073X depends on
NET, while ZL3073X_I2C and ZL3073X_SPI do not, causing:
WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_I2C
WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_SPI
WARNING: unmet direct dependencies detected for ZL3073X
Depends on [n]: NET [=n]
Selected by [y]:
- ZL3073X_I2C [=y] && I2C [=y]
Selected by [y]:
- ZL3073X_SPI [=y] && SPI [=y]
Fix this by adding the missing dependencies to ZL3073X_I2C and
ZL3073X_SPI.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508022110.nTqZ5Ylu-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202508022351.NHIxPF8j-lkp@intel.com/
Fixes: a4f0866e3dbbf3fe ("dpll: Make ZL3073X invisible")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250802155302.3673457-1-geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|