Age | Commit message (Collapse) | Author |
|
When can_swapin_thp() is unused, it prevents kernel builds with clang,
`make W=1` and CONFIG_WERROR=y:
mm/memory.c:4184:20: error: unused function 'can_swapin_thp' [-Werror,-Wunused-function]
Fix this by removing the unused stub.
See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
Link: https://lkml.kernel.org/r/20241008191329.2332346-1-andriy.shevchenko@linux.intel.com
Fixes: 242d12c98174 ("mm: support large folios swap-in for sync io devices")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Barry Song <baohua@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Chuanhua Han <hanchuanhua@oppo.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Map my outdated addresses within mailmap.
Link: https://lkml.kernel.org/r/20241009144934.43027-1-andybnac@gmail.com
Signed-off-by: Andy Chiu <andybnac@gmail.com>
Cc: Greentime Hu <greentime.hu@sifive.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Leon Chien <leonchien@synology.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add myself and Liam as co-maintainers of the memory mapping and VMA code
alongside Andrew as we are heavily involved in its implementation and
maintenance.
Link: https://lkml.kernel.org/r/20241009201032.6130-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
show show_smap_vma_flags() has been a using misspelled initializer in
mnemonics[] - it needed to initialize 2 element array of char and it used
NUL-padded 2 character string literals (i.e. 3-element initializer).
This has been spotted by gcc-15[*]; prior to that gcc quietly dropped the
3rd eleemnt of initializers. To fix this we are increasing the size of
mnemonics[] (from mnemonics[BITS_PER_LONG][2] to
mnemonics[BITS_PER_LONG][3]) to accomodate the NUL-padded string literals.
This also helps us in simplyfying the logic for printing of the flags as
instead of printing each character from the mnemonics[], we can just print
the mnemonics[] using seq_printf.
[*]: fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
917 | [0 ... (BITS_PER_LONG-1)] = "??",
| ^~~~
fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization]
...
Stephen pointed out:
: The C standard explicitly allows for a string initializer to be too long
: due to the NUL byte at the end ... so this warning may be overzealous.
but let's make the warning go away anwyay.
Link: https://lkml.kernel.org/r/20241005063700.2241027-1-brahmajit.xyz@gmail.com
Link: https://lkml.kernel.org/r/20241003093040.47c08382@canb.auug.org.au
Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Ben Greear reports following splat:
------------[ cut here ]------------
net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
...
Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
codetag_unload_module+0x19b/0x2a0
? codetag_load_module+0x80/0x80
nf_nat module exit calls kfree_rcu on those addresses, but the free
operation is likely still pending by the time alloc_tag checks for leaks.
Wait for outstanding kfree_rcu operations to complete before checking
resolves this warning.
Reproducer:
unshare -n iptables-nft -t nat -A PREROUTING -p tcp
grep nf_nat /proc/allocinfo # will list 4 allocations
rmmod nft_chain_nat
rmmod nf_nat # will WARN.
[akpm@linux-foundation.org: add comment]
Link: https://lkml.kernel.org/r/20241007205236.11847-1-fw@strlen.de
Fixes: a473573964e5 ("lib: code tagging module support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/netdev/bdaaef9d-4364-4171-b82b-bcfc12e207eb@candelatech.com/
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In mremap(), move_page_tables() looks at the type of the PMD entry and the
specified address range to figure out by which method the next chunk of
page table entries should be moved.
At that point, the mmap_lock is held in write mode, but no rmap locks are
held yet. For PMD entries that point to page tables and are fully covered
by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called,
which first takes rmap locks, then does move_normal_pmd().
move_normal_pmd() takes the necessary page table locks at source and
destination, then moves an entire page table from the source to the
destination.
The problem is: The rmap locks, which protect against concurrent page
table removal by retract_page_tables() in the THP code, are only taken
after the PMD entry has been read and it has been decided how to move it.
So we can race as follows (with two processes that have mappings of the
same tmpfs file that is stored on a tmpfs mount with huge=advise); note
that process A accesses page tables through the MM while process B does it
through the file rmap:
process A process B
========= =========
mremap
mremap_to
move_vma
move_page_tables
get_old_pmd
alloc_new_pmd
*** PREEMPT ***
madvise(MADV_COLLAPSE)
do_madvise
madvise_walk_vmas
madvise_vma_behavior
madvise_collapse
hpage_collapse_scan_file
collapse_file
retract_page_tables
i_mmap_lock_read(mapping)
pmdp_collapse_flush
i_mmap_unlock_read(mapping)
move_pgt_entry(NORMAL_PMD, ...)
take_rmap_locks
move_normal_pmd
drop_rmap_locks
When this happens, move_normal_pmd() can end up creating bogus PMD entries
in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`. The effect
depends on arch-specific and machine-specific details; on x86, you can end
up with physical page 0 mapped as a page table, which is likely
exploitable for user->kernel privilege escalation.
Fix the race by letting process B recheck that the PMD still points to a
page table after the rmap locks have been taken. Otherwise, we bail and
let the caller fall back to the PTE-level copying path, which will then
bail immediately at the pmd_none() check.
Bug reachability: Reaching this bug requires that you can create
shmem/file THP mappings - anonymous THP uses different code that doesn't
zap stuff under rmap locks. File THP is gated on an experimental config
flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need
shmem THP to hit this bug. As far as I know, getting shmem THP normally
requires that you can mount your own tmpfs with the right mount flags,
which would require creating your own user+mount namespace; though I don't
know if some distros maybe enable shmem THP by default or something like
that.
Bug impact: This issue can likely be used for user->kernel privilege
escalation when it is reachable.
Link: https://lkml.kernel.org/r/20241007-move_normal_pmd-vs-collapse-fix-2-v1-1-5ead9631f2ea@google.com
Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Co-developed-by: David Hildenbrand <david@redhat.com>
Closes: https://project-zero.issues.chromium.org/371047675
Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Arnd reported a build failure due to the BUILD_BUG_ON() statement in
alloc_kmem_cache_cpus(). The test
PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu)
The factors that increase the right side of the equation:
- PAGE_SIZE > 4KiB increases KMALLOC_SHIFT_HIGH
- For the local_lock_t in kmem_cache_cpu:
- PREEMPT_RT adds an actual lock.
- LOCKDEP increases the size of the lock.
- LOCK_STAT adds additional bytes plus padding to the lockdep
structure.
The net difference with and without PREEMPT_RT is 88 bytes for the
lock_lock_t, 96 bytes for kmem_cache_cpu due to additional padding. This
is enough to exceed the 80KiB limit with 16KiB page size - the 8KiB page
size is fine.
Increase PERCPU_DYNAMIC_SIZE_SHIFT to 13 on configs with PAGE_SIZE larger
than 4KiB and LOCKDEP enabled.
Link: https://lkml.kernel.org/r/20241007143049.gyMpEu89@linutronix.de
Fixes: d8fccd9ca5f9 ("arm64: Allow to enable PREEMPT_RT.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410020326.iaZIteIx-lkp@intel.com/
Reported-by: Arnd Bergmann <arnd@kernel.org>
Closes: https://lore.kernel.org/20241004095702.637528-1-arnd@kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On Android with arm, there is some synchronization needed to avoid a
deadlock when forking after pthread_create.
Link: https://lkml.kernel.org/r/20241003211716.371786-3-edliaw@google.com
Fixes: cff294582798 ("selftests/mm: extend and rename uffd pagemap test")
Signed-off-by: Edward Liaw <edliaw@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "selftests/mm: fix deadlock after pthread_create".
On Android arm, pthread_create followed by a fork caused a deadlock in the
case where the fork required work to be completed by the created thread.
Update the synchronization primitive to use pthread_barrier instead of
atomic_bool.
Apply the same fix to the wp-fork-with-event test.
This patch (of 2):
Swap synchronization primitive with pthread_barrier, so that stdatomic.h
does not need to be included.
The synchronization is needed on Android ARM64; we see a deadlock with
pthread_create when the parent thread races forward before the child has a
chance to start doing work.
Link: https://lkml.kernel.org/r/20241003211716.371786-1-edliaw@google.com
Link: https://lkml.kernel.org/r/20241003211716.371786-2-edliaw@google.com
Fixes: cff294582798 ("selftests/mm: extend and rename uffd pagemap test")
Signed-off-by: Edward Liaw <edliaw@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
syszbot produced this with a corrupted fs image. In theory, however an IO
error would trigger this also.
This affects just an error report, so should not be a serious error.
Link: https://lkml.kernel.org/r/87r08wjsnh.fsf@mail.parknet.co.jp
Link: https://lkml.kernel.org/r/66ff2c95.050a0220.49194.03e9.GAE@google.com
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: syzbot+ef0d7bc412553291aa86@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Syzbot reported that a task hang occurs in vcs_open() during a fuzzing
test for nilfs2.
The root cause of this problem is that in nilfs_find_entry(), which
searches for directory entries, ignores errors when loading a directory
page/folio via nilfs_get_folio() fails.
If the filesystem images is corrupted, and the i_size of the directory
inode is large, and the directory page/folio is successfully read but
fails the sanity check, for example when it is zero-filled,
nilfs_check_folio() may continue to spit out error messages in bursts.
Fix this issue by propagating the error to the callers when loading a
page/folio fails in nilfs_find_entry().
The current interface of nilfs_find_entry() and its callers is outdated
and cannot propagate error codes such as -EIO and -ENOMEM returned via
nilfs_find_entry(), so fix it together.
Link: https://lkml.kernel.org/r/20241004033640.6841-1-konishi.ryusuke@gmail.com
Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: Lizhi Xu <lizhi.xu@windriver.com>
Closes: https://lkml.kernel.org/r/20240927013806.3577931-1-lizhi.xu@windriver.com
Reported-by: syzbot+8a192e8d090fa9a31135@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8a192e8d090fa9a31135
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit f8d112a4e657 ("mm/mmap: avoid zeroing vma tree in mmap_region()")
changed how error handling is performed in mmap_region().
The error value defaults to -ENOMEM, but then gets reassigned immediately
to the result of vms_gather_munmap_vmas() if we are performing a MAP_FIXED
mapping over existing VMAs (and thus unmapping them).
This overwrites the error value, potentially clearing it.
After this, we invoke may_expand_vm() and possibly vm_area_alloc(), and
check to see if they failed. If they do so, then we perform error-handling
logic, but importantly, we do NOT update the error code.
This means that, if vms_gather_munmap_vmas() succeeds, but one of these
calls does not, the function will return indicating no error, but rather an
address value of zero, which is entirely incorrect.
Correct this and avoid future confusion by strictly setting error on each
and every occasion we jump to the error handling logic, and set the error
code immediately prior to doing so.
This way we can see at a glance that the error code is always correct.
Many thanks to Vegard Nossum who spotted this issue in discussion around
this problem.
Link: https://lkml.kernel.org/r/20241002073932.13482-1-lorenzo.stoakes@oracle.com
Fixes: f8d112a4e657 ("mm/mmap: avoid zeroing vma tree in mmap_region()")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Do not keep the obsolete EDID around after unplugging the display
from the connector.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 2a2391f857cd ("drm/ast: vga: Transparently handle BMC support")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-3-tzimmermann@suse.de
|
|
Do not keep the obsolete EDID around after unplugging the display
from the connector.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: d20c2f846428 ("drm/ast: sil164: Transparently handle BMC support")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-2-tzimmermann@suse.de
|
|
This reverts commit 6c9e14ee9f519ee605a3694fbfa4711284781d22.
This reverts commit d5070c9b29440c270b534bbacd636b8fa558e82b.
This reverts commit 89c6ea2006e2d39b125848fb0195c08fa0b354be.
The VLINE interrupt doesn't work correctly on G200SE-A (at least). We
have also seen missing interrupts on G200ER. So revert vblank support.
Fixes frozen displays and warnings about missed vblanks.
[ 33.818362] [CRTC:34:crtc-0] vblank wait timed out
From the vblank code, the driver only keeps the register constants and
the line that disables all interrupts in mgag200_device_init(). Both
is still useful without vblank handling.
Reported-by: Tony Luck <tony.luck@intel.com>
Closes: https://lore.kernel.org/dri-devel/Zvx6lSi7oq5xvTZb@agluck-desk3.sc.intel.com/raw
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015063932.8620-1-tzimmermann@suse.de
|
|
Treat each completed full size write to /dev/ttyDBC0 as a separate usb
transfer. Make sure the size of the TRBs matches the size of the tty
write by first queuing as many max packet size TRBs as possible up to
the last TRB which will be cut short to match the size of the tty write.
This solves an issue where userspace writes several transfers back to
back via /dev/ttyDBC0 into a kfifo before dbgtty can find available
request to turn that kfifo data into TRBs on the transfer ring.
The boundary between transfer was lost as xhci-dbgtty then turned
everyting in the kfifo into as many 'max packet size' TRBs as possible.
DbC would then send more data to the host than intended for that
transfer, causing host to issue a babble error.
Refuse to write more data to kfifo until previous tty write data is
turned into properly sized TRBs with data size boundaries matching tty
write size
Tested-by: Uday M Bhat <uday.m.bhat@intel.com>
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Some host controllers fail to produce the final completion event on an
isochronous TD which experienced an error mid TD. We deal with it by
flagging such TDs and checking if the next event points at the flagged
TD or at the next one, and giving back the flagged TD if the latter.
This is not enough, because the next TD may be missed by the xHC. Or
there may be no next TD but a ring underrun. We also need to get such
TD quickly out of the way, or errors on later TDs may be handled wrong.
If the next TD experiences a Missed Service Error, we will set the skip
flag on the endpoint and then attempt skipping TDs when yet another
event arrives. In such scenario, we ought to report the 'error mid TD'
transfer as such rather than skip it.
Another problem case are Stopped events. If we see one after an error
mid TD, we naively assume that it's a Force Stopped Event because it
doesn't match the pending TD, but in reality it might be an ordinary
Stopped event for the next TD, which we fail to recognize and handle.
Fix this by moving error mid TD handling before the whole TD skipping
loop. Remove unnecessary conditions, always give back the TD if the new
event points to any TRB outside it or if the pointer is NULL, as may be
the case in Ring Underrun and Overrun events on 1st gen hardware. Only
if the pending TD isn't flagged, consider other actions like skipping.
As a side effect of reordering with skip and FSE cases, error mid TD is
reordered with last_td_was_short check. This is harmless, because the
two cases are mutually exclusive - only one can happen in any given run
of handle_tx_event().
Tested on the NEC host and a USB camera with flaky cable. Dynamic debug
confirmed that Transaction Errors are sometimes seen, sometimes mid-TD,
sometimes followed by Missed Service. In such cases, they were finished
properly before skipping began.
[Rebase on 6.12-rc1 -Mathias]
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Avoid xHC host from processing a cancelled URB by always turning
cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command.
If the command fails then xHC will start processing the cancelled TD
instead of skipping it once endpoint is restarted, causing issues like
Babble error.
This is not a complete solution as a failed 'Set TR Deq' command does not
guarantee xHC TRB caches are cleared.
Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function")
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The stream contex type (SCT) bitfield is used both in the stream context
data structure, and in the 'Set TR Dequeue pointer' command TRB.
In both cases it uses bits 3:1
The SCT_FOR_TRB(p) macro used to set the stream context type (SCT) field
for the 'Set TR Dequeue pointer' command TRB incorrectly shifts the value
1 bit left before masking the three bits.
Fix this by first masking and rshifting, just like the similar
SCT_FOR_CTX(p) macro does
This issue has not been visibile as the lost bit 3 is only used with
secondary stream arrays (SSA). Xhci driver currently only supports using
a primary stream array with Linear stream addressing.
Fixes: 95241dbdf828 ("xhci: Set SCT field for Set TR dequeue on streams")
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241016140000.783905-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The syzbot fuzzer has been encountering "task hung" problems ever
since the dummy-hcd driver was changed to use hrtimers instead of
regular timers. It turns out that the problems are caused by a subtle
difference between the timer_pending() and hrtimer_active() APIs.
The changeover blindly replaced the first by the second. However,
timer_pending() returns True when the timer is queued but not when its
callback is running, whereas hrtimer_active() returns True when the
hrtimer is queued _or_ its callback is running. This difference
occasionally caused dummy_urb_enqueue() to think that the callback
routine had not yet started when in fact it was almost finished. As a
result the hrtimer was not restarted, which made it impossible for the
driver to dequeue later the URB that was just enqueued. This caused
usb_kill_urb() to hang, and things got worse from there.
Since hrtimers have no API for telling when they are queued and the
callback isn't running, the driver must keep track of this for itself.
That's what this patch does, adding a new "timer_pending" flag and
setting or clearing it at the appropriate times.
Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.com/
Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler")
Cc: Marcello Sylvester Bauer <sylv@sylv.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 160c826b4dd0 ("selftest: hid: add missing run-hid-tools-tests.sh")
has added the run-hid-tools-tests.sh script for it to be installed, but
I forgot to add the tests directory together.
If running the test case without the tests directory, will results in
the following error message:
make -C tools/testing/selftests/ TARGETS=hid install \
INSTALL_PATH=$KSFT_INSTALL_PATH
cd $KSFT_INSTALL_PATH
./run_kselftest.sh -t hid:hid-core.sh
/usr/lib/python3.11/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: helpconfig, Hook: pytest_cmdline_parse
UsageError: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...]
__main__.py: error: unrecognized arguments: --udevd
inifile: None
rootdir: /root/linux/kselftest_install/hid
In fact, the run-hid-tools-tests.sh script uses the scripts in the tests
directory to run tests. The tests directory also needs to be added to be
installed.
Fixes: ffb85d5c9e80 ("selftests: hid: import hid-tools hid-core tests")
Cc: stable@vger.kernel.org
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Acked-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
CONFIG_CLK_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y
and CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the following memory leak occurs.
If the KUNIT_ASSERT_*() fails, the latter (exit() or testcases)
clk_put() or clk_hw_unregister() will fail to release the clk resource
and cause memory leaks, use new clk_hw_register_kunit()
and clk_hw_get_clk_kunit() to automatically release them.
unreferenced object 0xffffff80c6af5000 (size 512):
comm "kunit_try_catch", pid 371, jiffies 4294896001
hex dump (first 32 bytes):
20 4c c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff L..............
c0 75 e3 c6 80 ff ff ff 00 00 00 00 00 00 00 00 .u..............
backtrace (crc 8ca788fa):
[<00000000e21852d0>] kmemleak_alloc+0x34/0x40
[<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4
[<00000000d1bc850c>] __clk_register+0x80/0x1ecc
[<00000000b08c78c5>] clk_hw_register+0xc4/0x110
[<00000000b16d6df8>] clk_multiple_parents_mux_test_init+0x238/0x288
[<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac
[<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<0000000066619fb8>] kthread+0x2e8/0x374
[<00000000a1157f53>] ret_from_fork+0x10/0x20
unreferenced object 0xffffff80c6e37880 (size 96):
comm "kunit_try_catch", pid 371, jiffies 4294896002
hex dump (first 32 bytes):
00 50 af c6 80 ff ff ff 00 00 00 00 00 00 00 00 .P..............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc b4b766dd):
[<00000000e21852d0>] kmemleak_alloc+0x34/0x40
[<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4
[<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4
[<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114
[<000000006fab5bfa>] clk_test_multiple_parents_mux_set_range_set_parent_get_rate+0x3c/0xa0
[<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac
[<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<0000000066619fb8>] kthread+0x2e8/0x374
[<00000000a1157f53>] ret_from_fork+0x10/0x20
unreferenced object 0xffffff80c2b56900 (size 96):
comm "kunit_try_catch", pid 395, jiffies 4294896107
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 e0 49 c0 86 e1 ff ff ff .........I......
backtrace (crc 2e59b327):
[<00000000e21852d0>] kmemleak_alloc+0x34/0x40
[<00000000c6c715a8>] __kmalloc_noprof+0x2bc/0x3c0
[<00000000f04a7951>] __clk_register+0x70c/0x1ecc
[<00000000b08c78c5>] clk_hw_register+0xc4/0x110
[<00000000cafa9563>] clk_orphan_transparent_multiple_parent_mux_test_init+0x1a8/0x1dc
[<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac
[<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<0000000066619fb8>] kthread+0x2e8/0x374
[<00000000a1157f53>] ret_from_fork+0x10/0x20
unreferenced object 0xffffff80c87c9400 (size 512):
comm "kunit_try_catch", pid 483, jiffies 4294896907
hex dump (first 32 bytes):
a0 44 c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff .D..............
20 05 a8 c8 80 ff ff ff 00 00 00 00 00 00 00 00 ...............
backtrace (crc c25b43fb):
[<00000000e21852d0>] kmemleak_alloc+0x34/0x40
[<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4
[<00000000d1bc850c>] __clk_register+0x80/0x1ecc
[<00000000b08c78c5>] clk_hw_register+0xc4/0x110
[<000000002688be48>] clk_single_parent_mux_test_init+0x1a0/0x1d4
[<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac
[<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<0000000066619fb8>] kthread+0x2e8/0x374
[<00000000a1157f53>] ret_from_fork+0x10/0x20
unreferenced object 0xffffff80c6dd2380 (size 96):
comm "kunit_try_catch", pid 483, jiffies 4294896908
hex dump (first 32 bytes):
00 94 7c c8 80 ff ff ff 00 00 00 00 00 00 00 00 ..|.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 4401212):
[<00000000e21852d0>] kmemleak_alloc+0x34/0x40
[<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4
[<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4
[<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114
[<0000000063eb2c90>] clk_test_single_parent_mux_set_range_disjoint_child_last+0x3c/0xa0
[<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac
[<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<0000000066619fb8>] kthread+0x2e8/0x374
[<00000000a1157f53>] ret_from_fork+0x10/0x20
......
Fixes: 02cdeace1e1e ("clk: tests: Add tests for single parent mux")
Fixes: 2e9cad1abc71 ("clk: tests: Add some tests for orphan with multiple parents")
Fixes: 433fb8a611ca ("clk: tests: Add missing test case for ranges")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20241016022658.2131826-1-ruanjinjie@huawei.com
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Pull rdma fixes from Jason Gunthorpe:
"Several miscellaneous fixes. A lot of bnxt_re activity, there will be
more rc patches there coming.
- Many bnxt_re bug fixes - Memory leaks, kasn, NULL pointer deref,
soft lockups, error unwinding and some small functional issues
- Error unwind bug in rdma netlink
- Two issues with incorrect VLAN detection for iWarp
- skb_splice_from_iter() splat in siw
- Give SRP slab caches unique names to resolve the merge window
WARN_ON regression"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/bnxt_re: Fix the GID table length
RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages
RDMA/bnxt_re: Change the sequence of updating the CQ toggle value
RDMA/bnxt_re: Fix an error path in bnxt_re_add_device
RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
RDMA/bnxt_re: Fix a possible NULL pointer dereference
RDMA/bnxt_re: Return more meaningful error
RDMA/bnxt_re: Fix incorrect dereference of srq in async event
RDMA/bnxt_re: Fix out of bound check
RDMA/bnxt_re: Fix the max CQ WQEs for older adapters
RDMA/srpt: Make slab cache names unique
RDMA/irdma: Fix misspelling of "accept*"
RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP
RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES
RDMA/core: Fix ENODEV error for iWARP test over vlan
RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event
RDMA/bnxt_re: Fix the max WQEs used in Static WQE mode
RDMA/bnxt_re: Add a check for memory allocation
RDMA/bnxt_re: Fix incorrect AVID type in WQE structure
RDMA/bnxt_re: Fix a possible memory leak
|
|
Add ArrowLake-H to the list of processors where PL4 is supported.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20241016154851.1293654-1-srinivas.pandruvada@linux.intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge an amd-pstate driver fix for 6.12-rc4 from Mario Limonciello:
"Fix a regression introduced where boost control malfunctioned in
amd-pstate"
* tag 'amd-pstate-v6.12-2024-10-16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
|
|
Fake CSR controllers don't seem to handle short-transfer properly which
cause command to time out:
kernel: usb 1-1: new full-speed USB device number 19 using xhci_hcd
kernel: usb 1-1: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=88.91
kernel: usb 1-1: New USB device strings: Mfr=0, Product=2, SerialNumber=0
kernel: usb 1-1: Product: BT DONGLE10
...
Bluetooth: hci1: Opcode 0x1004 failed: -110
kernel: Bluetooth: hci1: command 0x1004 tx timeout
According to USB Spec 2.0 Section 5.7.3 Interrupt Transfer Packet Size
Constraints a interrupt transfer is considered complete when the size is 0
(ZPL) or < wMaxPacketSize:
'When an interrupt transfer involves more data than can fit in one
data payload of the currently established maximum size, all data
payloads are required to be maximum-sized except for the last data
payload, which will contain the remaining data. An interrupt transfer
is complete when the endpoint does one of the following:
• Has transferred exactly the amount of data expected
• Transfers a packet with a payload size less than wMaxPacketSize or
transfers a zero-length packet'
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219365
Fixes: 7b05933340f4 ("Bluetooth: btusb: Fix not handling ZPL/short-transfer")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
There's issue as follows:
KASAN: maybe wild-memory-access in range [0xdead...108-0xdead...10f]
CPU: 3 UID: 0 PID: 2805 Comm: rmmod Tainted: G W
RIP: 0010:proto_unregister+0xee/0x400
Call Trace:
<TASK>
__do_sys_delete_module+0x318/0x580
do_syscall_64+0xc1/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
As bnep_init() ignore bnep_sock_init()'s return value, and bnep_sock_init()
will cleanup all resource. Then when remove bnep module will call
bnep_sock_cleanup() to cleanup sock's resource.
To solve above issue just return bnep_sock_init()'s return value in
bnep_exit().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This partially reverts 81b3e33bb054 ("Bluetooth: btusb: Don't fail
external suspend requests") as it introduced a call to hci_suspend_dev
that assumes the system-suspend which doesn't work well when just the
device is being suspended because wakeup flag is only set for remote
devices that can wakeup the system.
Reported-by: Rafael J. Wysocki <rafael@kernel.org>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Kenneth Crudup <kenny@panix.com>
Fixes: 610712298b11 ("Bluetooth: btusb: Don't fail external suspend requests")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Rafael J. Wysocki <rafael@kernel.org>
|
|
If bt_init() fails, the debugfs directory currently is not removed. If
the module is loaded again after that, the debugfs directory is not set
up properly due to the existing directory.
# modprobe bluetooth
# ls -laF /sys/kernel/debug/bluetooth
total 0
drwxr-xr-x 2 root root 0 Sep 27 14:26 ./
drwx------ 31 root root 0 Sep 27 14:25 ../
-r--r--r-- 1 root root 0 Sep 27 14:26 l2cap
-r--r--r-- 1 root root 0 Sep 27 14:26 sco
# modprobe -r bluetooth
# ls -laF /sys/kernel/debug/bluetooth
ls: cannot access '/sys/kernel/debug/bluetooth': No such file or directory
#
# modprobe bluetooth
modprobe: ERROR: could not insert 'bluetooth': Invalid argument
# dmesg | tail -n 6
Bluetooth: Core ver 2.22
NET: Registered PF_BLUETOOTH protocol family
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: Faking l2cap_init() failure for testing
NET: Unregistered PF_BLUETOOTH protocol family
# ls -laF /sys/kernel/debug/bluetooth
total 0
drwxr-xr-x 2 root root 0 Sep 27 14:31 ./
drwx------ 31 root root 0 Sep 27 14:26 ../
#
# modprobe bluetooth
# dmesg | tail -n 7
Bluetooth: Core ver 2.22
debugfs: Directory 'bluetooth' with parent '/' already present!
NET: Registered PF_BLUETOOTH protocol family
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
# ls -laF /sys/kernel/debug/bluetooth
total 0
drwxr-xr-x 2 root root 0 Sep 27 14:31 ./
drwx------ 31 root root 0 Sep 27 14:26 ../
#
Cc: stable@vger.kernel.org
Fixes: ffcecac6a738 ("Bluetooth: Create root debugfs directory during module init")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
If iso_init() has been called, iso_exit() must be called on module
unload. Without that, the struct proto that iso_init() registered with
proto_register() becomes invalid, which could cause unpredictable
problems later. In my case, with CONFIG_LIST_HARDENED and
CONFIG_BUG_ON_DATA_CORRUPTION enabled, loading the module again usually
triggers this BUG():
list_add corruption. next->prev should be prev (ffffffffb5355fd0),
but was 0000000000000068. (next=ffffffffc0a010d0).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:29!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 4159 Comm: modprobe Not tainted 6.10.11-4+bt2-ao-desktop #1
RIP: 0010:__list_add_valid_or_report+0x61/0xa0
...
__list_add_valid_or_report+0x61/0xa0
proto_register+0x299/0x320
hci_sock_init+0x16/0xc0 [bluetooth]
bt_init+0x68/0xd0 [bluetooth]
__pfx_bt_init+0x10/0x10 [bluetooth]
do_one_initcall+0x80/0x2f0
do_init_module+0x8b/0x230
__do_sys_init_module+0x15f/0x190
do_syscall_64+0x68/0x110
...
Cc: stable@vger.kernel.org
Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
If bt_debugfs is not created successfully, which happens if either
CONFIG_DEBUG_FS or CONFIG_DEBUG_FS_ALLOW_ALL is unset, then iso_init()
returns early and does not set iso_inited to true. This means that a
subsequent call to iso_init() will result in duplicate calls to
proto_register(), bt_sock_register(), etc.
With CONFIG_LIST_HARDENED and CONFIG_BUG_ON_DATA_CORRUPTION enabled, the
duplicate call to proto_register() triggers this BUG():
list_add double add: new=ffffffffc0b280d0, prev=ffffffffbab56250,
next=ffffffffc0b280d0.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:35!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 2 PID: 887 Comm: bluetoothd Not tainted 6.10.11-1-ao-desktop #1
RIP: 0010:__list_add_valid_or_report+0x9a/0xa0
...
__list_add_valid_or_report+0x9a/0xa0
proto_register+0x2b5/0x340
iso_init+0x23/0x150 [bluetooth]
set_iso_socket_func+0x68/0x1b0 [bluetooth]
kmem_cache_free+0x308/0x330
hci_sock_sendmsg+0x990/0x9e0 [bluetooth]
__sock_sendmsg+0x7b/0x80
sock_write_iter+0x9a/0x110
do_iter_readv_writev+0x11d/0x220
vfs_writev+0x180/0x3e0
do_writev+0xca/0x100
...
This change removes the early return. The check for iso_debugfs being
NULL was unnecessary, it is always NULL when iso_inited is false.
Cc: stable@vger.kernel.org
Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This uses more aggressive hueristics than the the bootup default
profile. On windows the OS has a special fullscreen 3D mode
where this is used. Since we don't have the equivalent on Linux
default to this profile for dGPUs.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 336568de918e08c825b3b1cbe2ec809f2fc26d94)
|
|
This XBOX360 compatible gamepad uses the new product id 0x310a under the
8BitDo's vendor id 0x2dc8. The change was tested using the gamepad in a
wired and wireless dongle configuration.
Signed-off-by: Stefan Kerkmann <s.kerkmann@pengutronix.de>
Link: https://lore.kernel.org/r/20241015-8bitdo_2c_ultimate_wireless-v1-1-9c9f9db2e995@pengutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- regression fix: dirty extents tracked in xarray for qgroups must be
adjusted for 32bit platforms
- fix potentially freeing uninitialized name in fscrypt structure
- fix warning about unneeded variable in a send callback
* tag 'for-6.12-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix uninitialized pointer free on read_alloc_one_name() error
btrfs: send: cleanup unneeded return variable in changed_verity()
btrfs: fix uninitialized pointer free in add_inode_ref()
btrfs: use sector numbers as keys for the dirty extents xarray
|
|
Leon Hwang says:
====================
bpf: Fix tailcall infinite loop caused by freplace
Previously, I addressed a tailcall infinite loop issue related to
trampolines[0].
In this patchset, I resolve a similar issue where a tailcall infinite loop
can occur due to the combination of tailcalls and freplace programs. The
fix prevents adding extended programs to the prog_array map and blocks the
extension of a tail callee program with freplace.
Key changes:
1. If a program or its subprogram has been extended by an freplace program,
it can no longer be updated to a prog_array map.
2. If a program has been added to a prog_array map, neither it nor its
subprograms can be extended by an freplace program.
Additionally, an extension program should not be tailcalled. As a result,
return -EINVAL if the program has a type of BPF_PROG_TYPE_EXT when adding
it to a prog_array map.
Changes:
v7 -> v8:
* Address comment from Alexei:
* guard(mutex) should not hold range all the way through
bpf_arch_text_poke().
* Address suggestion from Xu Kuohai:
* Extension prog should not be tailcalled independently.
v6 -> v7:
* Address comments from Alexei:
* Rewrite commit message more imperative and consice with AI.
* Extend bpf_trampoline_link_prog() and bpf_trampoline_unlink_prog()
to link and unlink target prog for freplace prog.
* Use guard(mutex)(&tgt_prog->aux->ext_mutex) instead of
mutex_lock()&mutex_unlock() pair.
* Address comment from Eduard:
* Remove misplaced "Reported-by" and "Closes" tags.
v5 -> v6:
* Fix a build warning reported by kernel test robot.
v4 -> v5:
* Move code of linking/unlinking target prog of freplace to trampoline.c.
* Address comments from Alexei:
* Change type of prog_array_member_cnt to u64.
* Combine two patches to one.
v3 -> v4:
* Address comments from Eduard:
* Rename 'tail_callee_cnt' to 'prog_array_member_cnt'.
* Add comment to 'prog_array_member_cnt'.
* Use a mutex to protect 'is_extended' and 'prog_array_member_cnt'.
v2 -> v3:
* Address comments from Alexei:
* Stop hacking JIT.
* Prevent the specific use case at attach/update time.
v1 -> v2:
* Address comment from Eduard:
* Explain why nop5 and xor/nop3 are swapped at prologue.
* Address comment from Alexei:
* Disallow attaching tail_call_reachable freplace prog to
not-tail_call_reachable target in verifier.
* Update "bpf, arm64: Fix tailcall infinite loop caused by freplace" with
latest arm64 JIT code.
Links:
[0] https://lore.kernel.org/bpf/20230912150442.2009-1-hffilwlqm@gmail.com/
====================
Link: https://lore.kernel.org/r/20241015150207.70264-1-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds test cases for bpf_task_from_vpid() kfunc.
task_kfunc_from_vpid_no_null_check is used to test the case where
the return value is not checked for NULL pointer.
test_task_from_vpid_current is used to test obtaining the
struct task_struct of the process in the pid namespace based on vpid.
test_task_from_vpid_invalid is used to test the case of invalid vpid.
test_task_from_vpid_current and test_task_from_vpid_invalid will run
in the new namespace.
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Link: https://lore.kernel.org/r/AM6PR03MB5848F13435CD650AC4B7BD7099442@AM6PR03MB5848.eurprd03.prod.outlook.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a test case to ensure that attaching a tail callee program with an
freplace program fails, and that updating an extended program to a
prog_array map is also prohibited.
This test is designed to prevent the potential infinite loop issue caused
by the combination of tail calls and freplace, ensuring the correct
behavior and stability of the system.
Additionally, fix the broken tailcalls/tailcall_freplace selftest
because an extension prog should not be tailcalled.
cd tools/testing/selftests/bpf; ./test_progs -t tailcalls
337/25 tailcalls/tailcall_freplace:OK
337/26 tailcalls/tailcall_bpf2bpf_freplace:OK
337 tailcalls:OK
Summary: 1/26 PASSED, 0 SKIPPED, 0 FAILED
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20241015150207.70264-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_task_from_pid() that currently exists looks up the
struct task_struct corresponding to the pid in the root pid
namespace (init_pid_ns).
This patch adds bpf_task_from_vpid() which looks up the
struct task_struct corresponding to vpid in the pid namespace
of the current process.
This is useful for getting information about other processes
in the same pid namespace.
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Link: https://lore.kernel.org/r/AM6PR03MB5848E50DA58F79CDE65433C399442@AM6PR03MB5848.eurprd03.prod.outlook.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
There is a potential infinite loop issue that can occur when using a
combination of tail calls and freplace.
In an upcoming selftest, the attach target for entry_freplace of
tailcall_freplace.c is subprog_tc of tc_bpf2bpf.c, while the tail call in
entry_freplace leads to entry_tc. This results in an infinite loop:
entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc.
The problem arises because the tail_call_cnt in entry_freplace resets to
zero each time entry_freplace is executed, causing the tail call mechanism
to never terminate, eventually leading to a kernel panic.
To fix this issue, the solution is twofold:
1. Prevent updating a program extended by an freplace program to a
prog_array map.
2. Prevent extending a program that is already part of a prog_array map
with an freplace program.
This ensures that:
* If a program or its subprogram has been extended by an freplace program,
it can no longer be updated to a prog_array map.
* If a program has been added to a prog_array map, neither it nor its
subprograms can be extended by an freplace program.
Moreover, an extension program should not be tailcalled. As such, return
-EINVAL if the program has a type of BPF_PROG_TYPE_EXT when adding it to a
prog_array map.
Additionally, fix a minor code style issue by replacing eight spaces with a
tab for proper formatting.
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20241015150207.70264-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Namhyung Kim says:
====================
bpf: Add kmem_cache iterator and kfunc
Hello,
I'm proposing a new iterator and a kfunc for the slab memory allocator
to get information of each kmem_cache like in /proc/slabinfo or
/sys/kernel/slab in more flexible way.
v5 changes
* set PTR_UNTRUSTED for return value of bpf_get_kmem_cache() (Alexei)
* add KF_RCU_PROTECTED to bpf_get_kmem_cache(). See below. (Song)
* add WARN_ON_ONCE and comment in kmem_cache_iter_seq_next() (Song)
* change kmem_cache_iter_seq functions not to call BPF on intermediate stop
* add a subtest to compare the kmem cache info with /proc/slabinfo (Alexei)
v4: https://lore.kernel.org/lkml/20241002180956.1781008-1-namhyung@kernel.org
* skip kmem_cache_destroy() in kmem_cache_iter_seq_stop() if possible (Vlastimil)
* fix a bug in the kmem_cache_iter_seq_start() for the last entry
v3: https://lore.kernel.org/lkml/20241002065456.1580143-1-namhyung@kernel.org/
* rework kmem_cache_iter not to hold slab_mutex when running BPF (Alexei)
* add virt_addr_valid() check (Alexei)
* fix random test failure by running test with the current task (Hyeonggon)
v2: https://lore.kernel.org/lkml/20240927184133.968283-1-namhyung@kernel.org/
* rename it to "kmem_cache_iter"
* fix a build issue
* add Acked-by's from Roman and Vlastimil (Thanks!)
* add error codes in the test for debugging
v1: https://lore.kernel.org/lkml/20240925223023.735947-1-namhyung@kernel.org/
My use case is `perf lock contention` tool which shows contended locks
but many of them are not global locks and don't have symbols. If it
can tranlate the address of the lock in a slab object to the name of
the slab, it'd be much more useful.
I'm not aware of type information in slab yet, but I was told there's
a work to associate BTF ID with it. It'd be definitely helpful to my
use case. Probably we need another kfunc to get the start address of
the object or the offset in the object from an address if the type
info is available. But I want to start with a simple thing first.
The kmem_cache_iter iterates kmem_cache objects under slab_mutex and
will be useful for userspace to prepare some work for specific slabs
like setting up filters in advance. And the bpf_get_kmem_cache()
kfunc will return a pointer to a slab from the address of a lock.
And the test code is to read from the iterator and make sure it finds
a slab cache of the task_struct for the current task.
The code is available at 'bpf/slab-iter-v5' branch in
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
====================
Link: https://lore.kernel.org/r/20241010232505.1339892-1-namhyung@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test traverses all slab caches using the kmem_cache_iter and save
the data into slab_result array map. And check if current task's
pointer is from "task_struct" slab cache using bpf_get_kmem_cache().
Also compare the result array with /proc/slabinfo if available (when
CONFIG_SLUB_DEBUG is on). Note that many of the fields in the slabinfo
are transient, so it only compares the name and objsize fields.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241010232505.1339892-4-namhyung@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The bpf_get_kmem_cache() is to get a slab cache information from a
virtual address like virt_to_cache(). If the address is a pointer
to a slab object, it'd return a valid kmem_cache pointer, otherwise
NULL is returned.
It doesn't grab a reference count of the kmem_cache so the caller is
responsible to manage the access. The returned point is marked as
PTR_UNTRUSTED.
The intended use case for now is to symbolize locks in slab objects
from the lock contention tracepoints.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev> (mm/*)
Acked-by: Vlastimil Babka <vbabka@suse.cz> #mm/slab
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241010232505.1339892-3-namhyung@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull smb server fixes from Steve French:
- fix race between session setup and session logoff
- add supplementary group support
* tag 'v6.12-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd:
ksmbd: add support for supplementary groups
ksmbd: fix user-after-free from session log off
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Remove bogus testmgr ENOENT error messages
- Ensure algorithm is still alive before marking it as tested
- Disable buggy hash algorithms in marvell/cesa
* tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: marvell/cesa - Disable hash algorithms
crypto: testmgr - Hide ENOENT errors better
crypto: api - Fix liveliness check in crypto_alg_tested
|
|
Add assertions/tests to verify `bpf_link_info` fields for netfilter link
are correctly populated.
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20241011193252.178997-2-wudevelops@gmail.com
|
|
This fix correctly populates the `bpf_link_info.netfilter.flags` field
when user passes the `BPF_F_NETFILTER_IP_DEFRAG` flag.
Fixes: 91721c2d02d3 ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Florian Westphal <fw@strlen.de>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/20241011193252.178997-1-wudevelops@gmail.com
|
|
UBLK_F_USER_COPY requires userspace to call write() on ublk char
device for filling request buffer, and unprivileged device can't
be trusted.
So don't allow user copy for unprivileged device.
Cc: stable@vger.kernel.org
Fixes: 1172d5b8beca ("ublk: support user copy")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20241016134847.2911721-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
greater
On display ver 20 onwards tile4 is not supported with horizontal flip
Bspec: 69853
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007182841.2104740-1-juhapekka.heikkila@gmail.com
(cherry picked from commit 73e8e2f9a358caa005ed6e52dcb7fa2bca59d132)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled
on for manual global invalidation to take effect and actually flush
device cache, however this also turns on flushing for things like
pipecontrol, which occurs between submissions for compute/render. This
sounds like massive overkill for our needs, where we already have the
manual flushing on the display side with the global invalidation. Some
observations on BMG:
1. Disabling l2 caching for host writes and stubbing out the driver
global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has
no impact on wb-transient-vs-display IGT, which makes sense since the
pipecontrol is now flushing the device cache after the render copy.
Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also
expected since device cache is now dirty and display engine can't see
the writes.
2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global
invalidation also has no impact on wb-transient-vs-display. This
suggests that the global invalidation still works as expected and is
flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on.
With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since
we no longer flush the device cache between submissions as part of
pipecontrol.
Edit: We now also have clarification from HW side that BSpec was indeed
wrong here.
v2:
- Rebase and update commit message.
BSpec: 71718
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Vitasta Wattal <vitasta.wattal@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com
(cherry picked from commit 67ec9f87bd6c57db1251bb2244d242f7ca5a0b6a)
[ Fix conflict due to changed xe_mmio_write32() signature ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
We can incorrectly think that the fence has signalled, if we get a
non-zero value here from the kmalloc, which is quite plausible. Just use
kzalloc to prevent stuff like this.
Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.10+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241011133633.388008-2-matthew.auld@intel.com
(cherry picked from commit 26f69e88dcc95fffc62ed2aea30ad7b1fdf31fdb)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|