|
The mmap_region() function is somewhat terrifying, with spaghetti-like
control flow and numerous means by which issues can arise and incomplete
state, memory leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently
observed issues with resource leaks and observable inconsistent state.
Taking advantage of previous patches in this series we move a number of
checks earlier in the code, simplifying things by moving the core of the
logic into a static internal function __mmap_region().
Doing this allows us to perform a number of checks up front before we do
any real work, and allows us to unwind the writable unmap check
unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE
validation unconditionally also.
We move a number of things here:
1. We preallocate memory for the iterator before we call the file-backed
memory hook, allowing us to exit early and avoid having to perform
complicated and error-prone close/free logic. We carefully free
iterator state on both success and error paths.
2. The enclosing mmap_region() function handles the mapping_map_writable()
logic early. Previously the logic had the mapping_map_writable() at the
point of mapping a newly allocated file-backed VMA, and a matching
mapping_unmap_writable() on success and error paths.
We now do this unconditionally if this is a file-backed, shared writable
mapping. A driver may change the flags to eliminate VM_MAYWRITE; however,
doing so does not invalidate the seal check we just performed, and in any
case we always decrement the counter in the wrapper.
We perform a debug assert to ensure a driver does not attempt to do the
opposite.
3. We also move arch_validate_flags() up into the mmap_region()
function. This is only relevant on arm64 and sparc64, and the check is
only meaningful for SPARC with ADI enabled. We explicitly add a warning
for this arch if a driver invalidates this check, though the code ought
eventually to be fixed to eliminate the need for this.
With all of these measures in place, we no longer need to explicitly close
the VMA on error paths, as we place all checks which might fail prior to a
call to any driver mmap hook.
This eliminates an entire class of errors, makes the code easier to reason
about and more robust.
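For reference, a minimal sketch of the resulting split, under the assumption
that it looks roughly like the following (the real mm/mmap.c code differs in
detail; mapping_map_writable(), mapping_unmap_writable() and
arch_validate_flags() are the existing kernel helpers referred to above):

  static unsigned long __mmap_region(struct file *file, unsigned long addr,
          unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
          struct list_head *uf);

  unsigned long mmap_region(struct file *file, unsigned long addr,
          unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
          struct list_head *uf)
  {
      unsigned long ret;
      bool writable_file_mapping = false;

      /* Allow architectures to sanity-check the requested flags up front. */
      if (!arch_validate_flags(vm_flags))
          return -EINVAL;

      /* Account a shared, writable file mapping before any real work. */
      if (file && (vm_flags & (VM_SHARED | VM_MAYWRITE)) ==
              (VM_SHARED | VM_MAYWRITE)) {
          int error = mapping_map_writable(file->f_mapping);

          if (error)
              return error;
          writable_file_mapping = true;
      }

      ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf);

      /* Unwind the writable accounting unconditionally, success or error. */
      if (writable_file_mapping)
          mapping_unmap_writable(file->f_mapping);

      return ret;
  }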
Link: https://lkml.kernel.org/r/6e0becb36d2f5472053ac5d544c0edfe9b899e25.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Mark Brown <broonie@kernel.org>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently MTE is permitted in two circumstances (desiring to use MTE
having been specified by the VM_MTE flag) - where MAP_ANONYMOUS is
specified, as checked by arch_calc_vm_flag_bits() and actualised by
setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is
shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap
hook is activated in mmap_region().
The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also
set is the arm64 implementation of arch_validate_flags().
Unfortunately, we intend to refactor mmap_region() to perform this check
earlier, meaning that in the case of a shmem backing we will not have
invoked shmem_mmap() yet, causing the mapping to fail spuriously.
It is inappropriate to set this architecture-specific flag in general mm
code anyway, so a sensible resolution of this issue is to instead move the
check somewhere else.
We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
the arch_calc_vm_flag_bits() call.
This is an appropriate place to do this as we already check for the
MAP_ANONYMOUS case here, and the shmem file case is simply a variant of
the same idea - we permit RAM-backed memory.
This requires a modification to the arch_calc_vm_flag_bits() signature to
pass in a pointer to the struct file associated with the mapping; however,
this is not too egregious as it is only used by two architectures anyway
- arm64 and parisc.
So this patch performs this adjustment and removes the unnecessary
assignment of VM_MTE_ALLOWED in shmem_mmap().
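For reference, a hedged sketch of the arm64 side of this change, assuming it
takes roughly this shape (the exact guard and header placement may differ
upstream; system_supports_mte(), shmem_file() and VM_MTE_ALLOWED are existing
kernel symbols):

  /* arch/arm64/include/asm/mman.h (sketch) */
  static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
                                                     unsigned long flags)
  {
      /*
       * Permit MTE only for RAM-backed memory: anonymous mappings or a
       * shmem-backed file. This replaces the VM_MTE_ALLOWED assignment
       * that used to live in shmem_mmap().
       */
      if (system_supports_mte() &&
          ((flags & MAP_ANONYMOUS) || shmem_file(file)))
          return VM_MTE_ALLOWED;

      return 0;
  }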
[akpm@linux-foundation.org: fix whitespace, per Catalin]
Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Refactor map_deny_write_exec() so it no longer unnecessarily requires a VMA
parameter but rather accepts VMA flags parameters, which allows us to
use this function early in mmap_region() in a subsequent commit.
While we're here, we refactor the function to be more readable and add
some additional documentation.
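For reference, a sketch of the flag-based form (close to, though not
necessarily identical to, the resulting include/linux/mman.h helper):

  static inline bool map_deny_write_exec(unsigned long old, unsigned long new)
  {
      /* If MDWE is disabled, we have nothing to deny. */
      if (!test_bit(MMF_HAS_MDWE, &current->mm->flags))
          return false;

      /* If the new VMA is not executable, we have nothing to deny. */
      if (!(new & VM_EXEC))
          return false;

      /* Under MDWE we do not accept newly writably executable VMAs... */
      if (new & VM_WRITE)
          return true;

      /* ...nor previously non-executable VMAs becoming executable. */
      if (!(old & VM_EXEC))
          return true;

      return false;
  }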
Link: https://lkml.kernel.org/r/6be8bb59cd7c68006ebb006eb9d8dc27104b1f70.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Jann Horn <jannh@google.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Incorrect invocation of VMA callbacks when the VMA is no longer in a
consistent state is bug-prone and risky to perform.
With regards to the important vm_ops->close() callback, we have gone to
great lengths to try to track whether or not we ought to close VMAs.
Rather than doing so and risking making a mistake somewhere, instead
unconditionally close and reset vma->vm_ops to an empty dummy operations
set with a NULL .close operator.
We introduce a new function to do so - vma_close() - and simplify existing
vms logic which tracked whether we needed to close or not.
This simplifies the logic, avoids incorrect double-calling of the .close()
callback and allows us to update error paths to simply call vma_close()
unconditionally - making VMA closure idempotent.
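A minimal sketch of the new helper, assuming a shared dummy
vm_operations_struct (the real definition lives in mm's internal code):

  static const struct vm_operations_struct vma_dummy_vm_ops = {};

  /* Close the VMA exactly once; repeat calls become harmless no-ops. */
  void vma_close(struct vm_area_struct *vma)
  {
      if (vma->vm_ops && vma->vm_ops->close) {
          vma->vm_ops->close(vma);
          /*
           * The mapping is in a potentially unreliable state after
           * .close(), so reset vm_ops to an empty set with no .close
           * operator - any later vma_close() simply does nothing.
           */
          vma->vm_ops = &vma_dummy_vm_ops;
      }
  }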
Link: https://lkml.kernel.org/r/28e89dda96f68c505cb6f8e9fc9b57c3e9f74b42.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Jann Horn <jannh@google.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "fix error handling in mmap_region() and refactor
(hotfixes)", v4.
mmap_region() is somewhat terrifying, with spaghetti-like control flow and
numerous means by which issues can arise and incomplete state, memory
leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently
observed issues with resource leaks and observable inconsistent state.
This series goes to great lengths to simplify how mmap_region() works and
to avoid unwinding errors late on in the process of setting up the VMA for
the new mapping, and equally avoids such operations occurring while the
VMA is in an inconsistent state.
The patches in this series comprise the minimal changes required to
resolve existing issues in mmap_region() error handling, in order that
they can be hotfixed and backported. There is additionally a follow up
series which goes further, separated out from the v1 series and sent and
updated separately.
This patch (of 5):
After an attempted mmap() fails, we are no longer in a situation where we
can safely interact with VMA hooks. This is currently not enforced,
meaning that we need complicated handling to ensure we do not incorrectly
call these hooks.
We can avoid the whole issue by treating the VMA as suspect the moment
that the file->f_op->mmap() function reports an error by replacing
whatever VMA operations were installed with a dummy empty set of VMA
operations.
We do so through a new helper function internal to mm - mmap_file() -
which is both more logically named than the existing call_mmap() function
and correctly isolates handling of the vm_op reassignment to mm.
All the existing invocations of call_mmap() outside of mm are ultimately
nested within the call_mmap() from mm, which we now replace.
It is therefore safe to leave call_mmap() in place as a convenience
function (and to avoid churn). The invokers are:
ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap()
coda_file_operations -> mmap -> coda_file_mmap()
shm_file_operations -> shm_mmap()
shm_file_operations_huge -> shm_mmap()
dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops -> i915_gem_dmabuf_mmap()
None of these callers interact with vm_ops or mappings in a problematic
way on error, quickly exiting out.
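A hedged sketch of the mm-internal wrapper described above (the dummy ops
object name is an assumption; call_mmap() is the existing convenience wrapper
around f_op->mmap()):

  static const struct vm_operations_struct dummy_vm_ops = {};

  static int mmap_file(struct file *file, struct vm_area_struct *vma)
  {
      int err = call_mmap(file, vma);

      if (likely(!err))
          return 0;

      /*
       * The mmap hook failed, so the VMA may now be in an inconsistent
       * state - install empty ops so no further hooks are invoked on it.
       */
      vma->vm_ops = &dummy_vm_ops;
      return err;
  }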
Link: https://lkml.kernel.org/r/cover.1730224667.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/d41fd763496fd0048a962f3fd9407dc72dd4fd86.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Jann Horn <jannh@google.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Recent changes are putting more pressure on THP deferred split queues:
under load revealing long-standing races, causing list_del corruptions,
"Bad page state"s and worse (I keep BUGs in both of those, so usually
don't get to see how badly they end up without). The relevant recent
changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
improved swap allocation, and underused THP splitting.
Before fixing locking: rename misleading folio_undo_large_rmappable(),
which does not undo large_rmappable, to folio_unqueue_deferred_split(),
which is what it does. But that and its out-of-line __callee are mm
internals of very limited usability: add comment and WARN_ON_ONCEs to
check usage; and return a bool to say if a deferred split was unqueued,
which can then be used in WARN_ON_ONCEs around safety checks (sparing
callers the arcane conditionals in __folio_unqueue_deferred_split()).
Just omit the folio_unqueue_deferred_split() from free_unref_folios(), all
of whose callers now call it beforehand (and if any forget then bad_page()
will tell) - except for its caller put_pages_list(), which itself no
longer has any callers (and will be deleted separately).
Swapout: mem_cgroup_swapout() has been resetting folio->memcg_data to 0
without checking and unqueueing a THP folio from deferred split list;
which is unfortunate, since the split_queue_lock depends on the memcg
(when memcg is enabled); so swapout has been unqueueing such THPs later,
when freeing the folio, using the pgdat's lock instead: potentially
corrupting the memcg's list. __remove_mapping() has frozen refcount to 0
here, so no problem with calling folio_unqueue_deferred_split() before
resetting memcg_data.
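A hedged illustration of that ordering (heavily abbreviated, and the wrapper
name is hypothetical; only the unqueue-before-reset step is the point):

  /* Called from mem_cgroup_swapout() with the folio refcount frozen to 0. */
  static void swapout_clear_memcg(struct folio *folio)
  {
      /*
       * Unqueue while folio->memcg_data still selects the memcg's
       * split_queue_lock; leaving it until the folio is freed would
       * use the pgdat lock instead and could corrupt the memcg's list.
       */
      folio_unqueue_deferred_split(folio);
      folio->memcg_data = 0;
  }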
That goes back to 5.4 commit 87eaceb3faa5 ("mm: thp: make deferred split
shrinker memcg aware"): which included a check on swapcache before adding
to deferred queue, but no check on deferred queue before adding THP to
swapcache. That worked fine with the usual sequence of events in reclaim
(though there were a couple of rare ways in which a THP on deferred queue
could have been swapped out), but 6.12 commit dafff3f4c850 ("mm: split
underused THPs") avoids splitting underused THPs in reclaim, which makes
swapcache THPs on deferred queue commonplace.
Keep the check on swapcache before adding to deferred queue? Yes: it is
no longer essential, but preserves the existing behaviour, and is likely
to be a worthwhile optimization (vmstat showed much more traffic on the
queue under swapping load if the check was removed); update its comment.
Memcg-v1 move (deprecated): mem_cgroup_move_account() has been changing
folio->memcg_data without checking and unqueueing a THP folio from the
deferred list, sometimes corrupting "from" memcg's list, like swapout.
Refcount is non-zero here, so folio_unqueue_deferred_split() can only be
used in a WARN_ON_ONCE to validate the fix, which must be done earlier:
have mem_cgroup_move_charge_pte_range() first try to split the THP (splitting
of course unqueues), or skip it if that fails. Not ideal, but moving
charge has been requested, and khugepaged should repair the THP later:
nobody wants new custom unqueueing code just for this deprecated case.
The 87eaceb3faa5 commit did have the code to move from one deferred list
to another (but was not conscious of its unsafety while refcount non-0);
but that was removed by 5.6 commit fac0516b5534 ("mm: thp: don't need care
deferred split queue in memcg charge move path"), which argued that the
existence of a PMD mapping guarantees that the THP cannot be on a deferred
list. As above, false in rare cases, and now commonly false.
Backport to 6.11 should be straightforward. Earlier backports must take
care that other _deferred_list fixes and dependencies are included. There
is not a strong case for backports, but they can fix cornercases.
Link: https://lkml.kernel.org/r/8dc111ae-f6db-2da7-b25c-7a20b1effe3b@google.com
Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
Fixes: dafff3f4c850 ("mm: split underused THPs")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Recent changes are putting more pressure on THP deferred split queues:
under load revealing long-standing races, causing list_del corruptions,
"Bad page state"s and worse (I keep BUGs in both of those, so usually
don't get to see how badly they end up without). The relevant recent
changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
improved swap allocation, and underused THP splitting.
The new unlocked list_del_init() in deferred_split_scan() is buggy. I
gave bad advice, it looks plausible since that's a local on-stack list,
but the fact is that it can race with a third party freeing or migrating
the preceding folio (properly unqueueing it with refcount 0 while holding
split_queue_lock), thereby corrupting the list linkage.
The obvious answer would be to take split_queue_lock there: but it has a
long history of contention, so I'm reluctant to add to that. Instead,
make sure that there is always one safe (raised refcount) folio before, by
delaying its folio_put(). (And of course I was wrong to suggest updating
split_queue_len without the lock: leave that until the splice.)
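A hedged illustration of the delayed folio_put() idea, not the actual
deferred_split_scan() body (list and helper names are illustrative): the
reference held on each folio is only dropped after the scan has finished
operating on the folio that follows it, so the entry just before the cursor
is always pinned and cannot be freed or migrated underneath the unlocked
list_del_init().

  struct folio *folio, *next, *prev = NULL;

  list_for_each_entry_safe(folio, next, &local_list, _deferred_list) {
      /* try to split 'folio', or list_del_init() it unlocked, as before */

      /*
       * Only now drop the previous folio's reference: it kept the
       * preceding list entry from being freed while 'folio' was
       * processed.
       */
      if (prev)
          folio_put(prev);
      prev = folio;
  }
  if (prev)
      folio_put(prev);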
And remove two over-eager partially_mapped checks, restoring those tests
to how they were before: if uncharge_folio() or free_tail_page_prepare()
finds _deferred_list non-empty, it's in trouble whether or not that folio
is partially_mapped (and the flag was already cleared in the latter case).
Link: https://lkml.kernel.org/r/81e34a8b-113a-0701-740e-2135c97eb1d7@google.com
Fixes: dafff3f4c850 ("mm: split underused THPs")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Recently, the net/lib.sh file has been modified to include defer.sh from
the net/lib/sh/ directory. The Makefile from net/lib has been modified
accordingly, but not the ones from the sub-targets using net/lib.sh.
Because of that, the new file is not installed as expected when
installing the Forwarding, MPTCP, and Netfilter targets, e.g.
# make -C tools/testing/selftests TARGETS=net/mptcp install \
INSTALL_PATH=/tmp/kself
# cd /tmp/kself/
# ./run_kselftest.sh -c net/mptcp
TAP version 13
1..7
# timeout set to 1800
# selftests: net/mptcp: mptcp_connect.sh
# ./../lib.sh: line 5: /tmp/kself/net/lib/sh/defer.sh: No such file or directory
# (...)
This can be fixed simply by adding all the .sh files from the net/lib/sh
directory to the TEST_INCLUDES variable in the different Makefiles.
Fixes: a6e263f125cd ("selftests: net: lib: Introduce deferred commands")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20241104-net-next-selftests-lib-sh-deps-v1-1-7c9f7d939fc2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New netns selftest helpers netns_new() and netns_free() have been added
in network_helpers.c, let's use them in mptcp selftests too instead of
using MPTCP's own helpers create_netns() and cleanup_netns().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/c02fda3177b34f9e74a044833fda9761627f4d07.1730338692.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The gettimex64() callback doesn't modify values in the timecounter, so there
is no need to update the sequence counter. This reduces the contention on
the sequence lock for the multi-threaded PHC reading use-case.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241014170103.2473580-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The exec queue timestamp is only really useful when it's being queried
through the fdinfo. There's no need to update it so often, on every
job_free. Tracing a simple app like vkcube running shows an update
rate of ~ 120Hz. In case of discrete, the BO is on vram, creating a lot
of pcie transactions.
The update on job_free() is used to cover a gap: if exec
queue is created and destroyed rapidly, before a new query, the
timestamp still needs to be accumulated and accounted for in the xef.
Initial implementation in commit 6109f24f87d7 ("drm/xe: Add helper to
accumulate exec queue runtime") couldn't do it on the exec_queue_fini
since the xef could be gone at that point. However since commit
ce8c161cbad4 ("drm/xe: Add ref counting for xe_file") the xef is
refcounted and the exec queue always holds a reference, making this safe
now.
Improve the fix in commit 2149ded63079 ("drm/xe: Fix use after free when
client stats are captured") by reducing the frequency in which the
update is needed.
Fixes: 2149ded63079 ("drm/xe: Fix use after free when client stats are captured")
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 83db047d9425d9a649f01573797558eff0f632e1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
In the unlikely event that we fail while sending the new VF GGTT
configuration to the GuC, we will free only the GGTT node data
struct but will fail to release the actual GGTT allocation.
This will later lead to list corruption and a GGTT space leak, and
finally risks a crash when unloading the driver:
[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT
[ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028).
[ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0
[ ] Call Trace:
[ ] drm_mm_insert_node_in_range+0x2c0/0x4e0
[ ] xe_ggtt_node_insert+0x46/0x70 [xe]
[ ] pf_provision_vf_ggtt+0x7f5/0xa70 [xe]
[ ] xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe]
[ ] ggtt_set+0x4b/0x70 [xe]
[ ] simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110
[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT
[ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI
[ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390
[ ] Call Trace:
[ ] <TASK>
[ ] ? die_addr+0x2e/0x80
[ ] ? exc_general_protection+0x1a1/0x3e0
[ ] ? asm_exc_general_protection+0x22/0x30
[ ] ? drm_mm_remove_node+0x1b7/0x390
[ ] ggtt_node_remove+0xa5/0xf0 [xe]
[ ] xe_ggtt_node_remove+0x35/0x70 [xe]
[ ] xe_ttm_bo_destroy+0x123/0x220 [xe]
[ ] intel_user_framebuffer_destroy+0x44/0x70 [xe]
[ ] intel_plane_destroy_state+0x3b/0xc0 [xe]
[ ] drm_atomic_state_default_clear+0x1cd/0x2f0
[ ] intel_atomic_state_clear+0x9/0x20 [xe]
[ ] __drm_atomic_state_free+0x1d/0xb0
Fix that by using pf_release_ggtt() on the error path, which now
works regardless of whether the node has a GGTT allocation or not.
Fixes: 34e804220f69 ("drm/xe: Make xe_ggtt_node struct independent")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com
(cherry picked from commit 43b1dd2b550f0861ce80fbfffd5881b1b26272b1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Upon failure all locks need to be dropped before returning to the user.
Fixes: 58480c1c912f ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com
(cherry picked from commit 7d1a4258e602ffdce529f56686925034c1b3b095)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
In a couple of places, after an exec queue is looked up, the exec IOCTL
returns on input errors without dropping the exec queue ref. Fix this by
ensuring the exec queue ref is dropped on input error.
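A hedged sketch of the pattern (the specific validation and the function name
shown are illustrative; xe_exec_queue_lookup()/xe_exec_queue_put() are the
lookup/put pair in question):

  static int exec_ioctl_sketch(struct xe_device *xe, struct xe_file *xef,
                               struct drm_xe_exec *args)
  {
      struct xe_exec_queue *q;
      int err = 0;

      q = xe_exec_queue_lookup(xef, args->exec_queue_id);
      if (XE_IOCTL_DBG(xe, !q))
          return -ENOENT;

      /* Any input-validation failure after the lookup must drop the ref. */
      if (XE_IOCTL_DBG(xe, args->num_batch_buffer &&
                       q->width != args->num_batch_buffer)) {
          err = -EINVAL;
          goto err_exec_queue;    /* previously: a bare return, leaking q */
      }

      /* ... rest of the exec path ... */

  err_exec_queue:
      xe_exec_queue_put(q);
      return err;
  }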
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com
(cherry picked from commit 07064a200b40ac2195cb6b7b779897d9377e5e6f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Merge tag 'qcom-clk-fixes-for-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into clk-fixes
Pull Qualcomm clk driver fixes from Bjorn Andersson:
- Correct flags for X Elite USB MP GDSC and pcie pipediv2 clocks
- Fix alpha PLL post_div mask for the cases where width is not
specified
- Avoid hangs in the SM8350 video driver (venus) by setting HW_CTRL
trigger feature on the video clocks
* tag 'qcom-clk-fixes-for-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
clk: qcom: gcc-x1e80100: Fix USB MP SS1 PHY GDSC pwrsts flags
clk: qcom: gcc-x1e80100: Fix halt_check for pipediv2 clocks
clk: qcom: clk-alpha-pll: Fix pll post div mask when width is not set
clk: qcom: videocc-sm8350: use HW_CTRL_TRIGGER for vcodec GDSCs
|
|
As reported by Byeonguk, the bad_words test in verifier_bits_iter.c
occasionally fails on s390 host. Quoting Ilya's explanation:
s390 kernel runs in a completely separate address space, there is no
user/kernel split at TASK_SIZE. The same address may be valid in both
the kernel and the user address spaces, there is no way to tell by
looking at it. The config option related to this property is
ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
Also, unfortunately, 0 is a valid address in the s390 kernel address
space.
Fix the issue by using -4095 as the bad address for bits iterator, as
suggested by Ilya. Verify that bpf_iter_bits_new() returns -EINVAL for
NULL address and -EFAULT for bad address.
Fixes: ebafc1e535db ("selftests/bpf: Add three test cases for bits_iter")
Reported-by: Byeonguk Jeong <jungbu2855@gmail.com>
Closes: https://lore.kernel.org/bpf/ZycSXwjH4UTvx-Cn@ub22/
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20241105043057.3371482-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()") converted
scx_ops_bypass_depth from an atomic to an int. Update scx_show_state.py
accordingly.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()")
|
|
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Update of stateful object triggers:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:7759 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/3060:
#0: ffff88810f0578c8 (&nft_net->commit_mutex){+.+.}-{4:4}, [..]
... but this list is not protected by the transaction mutex but by the
nfnl nftables subsystem mutex.
Switch to nft_obj_type_get(), which acquires the rcu read lock,
bumps the refcount, and returns the result.
v3: Dan Carpenter points out nft_obj_type_get returns error pointer, not
NULL, on error.
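A hedged sketch of the call-site change (arguments abbreviated; note the
ERR_PTR() convention mentioned above):

  const struct nft_object_type *type;

  /* Takes the rcu read lock around the lookup and bumps the module ref. */
  type = nft_obj_type_get(net, objtype, family);
  if (IS_ERR(type))
      return PTR_ERR(type);

  /* ... use 'type' for the update as before ... */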
Fixes: dad3bdeef45f ("netfilter: nf_tables: fix memory leak during stateful obj update").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft shell tests trigger:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3125 RCU-list traversed in non-reader section!!
1 lock held by nft/2068:
#0: ffff888106c6f8c8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x3c/0xf0
But the transaction mutex doesn't protect this list, the nfnl subsystem
mutex would, but we can't acquire it here without risk of ABBA
deadlocks.
Acquire the rcu read lock to avoid this issue.
v3: add a comment that explains the ->inner_ops check implies
expression is builtin and lack of a module owner reference is ok.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Like previous patches: iteration is ok if the list cannot be altered in
parallel.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
It's not possible to add or delete elements from hash and bitmap sets
as long as the caller is holding the transaction mutex, so it's ok to iterate
the list outside of the rcu read-side critical section.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The transaction mutex prevents concurrent add/delete, so it's ok to iterate
those lists outside of rcu read-side critical sections.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same as previous patch. All set handling functions here can be called
with the transaction mutex held (but not the rcu read lock).
The transaction mutex prevents concurrent add/delete, so this is fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
On rule delete we get:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
1 lock held by iptables/134:
#0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables
Code is fine, no other CPU can change the list because we're holding the
transaction mutex.
Pass the needed lockdep annotation to the iterator and fix
two comments for functions that are no longer restricted to rcu-only
context.
This is enough to resolve rule delete, but there are several other
missing annotations, added in followup-patches.
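One way to pass such an annotation, sketched with illustrative names (the
real patch introduces a small helper for the commit-mutex condition;
handle_rule() here is a placeholder):

  /* RCU list debugging accepts an optional lockdep condition argument. */
  list_for_each_entry_rcu(rule, &chain->rules, list,
                          lockdep_is_held(&nft_net->commit_mutex)) {
      /* safe: the commit mutex excludes concurrent list mutation */
      handle_rule(rule);
  }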
Fixes: 28875945ba98 ("rcu: Add support for consolidated-RCU reader checking")
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Tested-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/da27f17f-3145-47af-ad0f-7fd2a823623e@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Commit b035f5a6d852 ("mm: slab: reduce the kmalloc() minimum alignment
if DMA bouncing possible") reduced ARCH_KMALLOC_MINALIGN to 8 on arm64.
However, with KASAN_HW_TAGS enabled, arch_slab_minalign() becomes 16.
This causes kmalloc_caches[*][8] to be aliased to kmalloc_caches[*][16],
resulting in kmem_buckets_create() attempting to create a kmem_cache for
size 16 twice. This duplication triggers warnings on boot:
[ 2.325108] ------------[ cut here ]------------
[ 2.325135] kmem_cache of name 'memdup_user-16' already exists
[ 2.325783] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:107 __kmem_cache_create_args+0xb8/0x3b0
[ 2.327957] Modules linked in:
[ 2.328550] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc5mm-unstable-arm64+ #12
[ 2.328683] Hardware name: QEMU QEMU Virtual Machine, BIOS 2024.02-2 03/11/2024
[ 2.328790] pstate: 61000009 (nZCv daif -PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 2.328911] pc : __kmem_cache_create_args+0xb8/0x3b0
[ 2.328930] lr : __kmem_cache_create_args+0xb8/0x3b0
[ 2.328942] sp : ffff800083d6fc50
[ 2.328961] x29: ffff800083d6fc50 x28: f2ff0000c1674410 x27: ffff8000820b0598
[ 2.329061] x26: 000000007fffffff x25: 0000000000000010 x24: 0000000000002000
[ 2.329101] x23: ffff800083d6fce8 x22: ffff8000832222e8 x21: ffff800083222388
[ 2.329118] x20: f2ff0000c1674410 x19: f5ff0000c16364c0 x18: ffff800083d80030
[ 2.329135] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 2.329152] x14: 0000000000000000 x13: 0a73747369786520 x12: 79646165726c6120
[ 2.329169] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : 0000000000000000
[ 2.329194] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 2.329210] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 2.329226] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 2.329291] Call trace:
[ 2.329407] __kmem_cache_create_args+0xb8/0x3b0
[ 2.329499] kmem_buckets_create+0xfc/0x320
[ 2.329526] init_user_buckets+0x34/0x78
[ 2.329540] do_one_initcall+0x64/0x3c8
[ 2.329550] kernel_init_freeable+0x26c/0x578
[ 2.329562] kernel_init+0x3c/0x258
[ 2.329574] ret_from_fork+0x10/0x20
[ 2.329698] ---[ end trace 0000000000000000 ]---
[ 2.403704] ------------[ cut here ]------------
[ 2.404716] kmem_cache of name 'msg_msg-16' already exists
[ 2.404801] WARNING: CPU: 2 PID: 1 at mm/slab_common.c:107 __kmem_cache_create_args+0xb8/0x3b0
[ 2.404842] Modules linked in:
[ 2.404971] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.12.0-rc5mm-unstable-arm64+ #12
[ 2.405026] Tainted: [W]=WARN
[ 2.405043] Hardware name: QEMU QEMU Virtual Machine, BIOS 2024.02-2 03/11/2024
[ 2.405057] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2.405079] pc : __kmem_cache_create_args+0xb8/0x3b0
[ 2.405100] lr : __kmem_cache_create_args+0xb8/0x3b0
[ 2.405111] sp : ffff800083d6fc50
[ 2.405115] x29: ffff800083d6fc50 x28: fbff0000c1674410 x27: ffff8000820b0598
[ 2.405135] x26: 000000000000ffd0 x25: 0000000000000010 x24: 0000000000006000
[ 2.405153] x23: ffff800083d6fce8 x22: ffff8000832222e8 x21: ffff800083222388
[ 2.405169] x20: fbff0000c1674410 x19: fdff0000c163d6c0 x18: ffff800083d80030
[ 2.405185] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 2.405201] x14: 0000000000000000 x13: 0a73747369786520 x12: 79646165726c6120
[ 2.405217] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : 0000000000000000
[ 2.405233] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 2.405248] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 2.405271] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 2.405287] Call trace:
[ 2.405293] __kmem_cache_create_args+0xb8/0x3b0
[ 2.405305] kmem_buckets_create+0xfc/0x320
[ 2.405315] init_msg_buckets+0x34/0x78
[ 2.405326] do_one_initcall+0x64/0x3c8
[ 2.405337] kernel_init_freeable+0x26c/0x578
[ 2.405348] kernel_init+0x3c/0x258
[ 2.405360] ret_from_fork+0x10/0x20
[ 2.405370] ---[ end trace 0000000000000000 ]---
To address this, alias the kmem_cache for sizes smaller than the minimum
alignment to the aligned-size kmem_cache, as is done with the default system
kmalloc bucket.
Fixes: b32801d1255b ("mm/slab: Introduce kmem_buckets_create() and family")
Cc: <stable@vger.kernel.org> # v6.11+
Signed-off-by: Koichiro Den <koichiro.den@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
In case of an error when requesting the ctrl_chan DMA channel, ctrl_chan is
not NULL. So releasing the DMA channel leads to the following issue:
[ 4.879000] st,stm32-spdifrx 500d0000.audio-controller:
dma_request_slave_channel error -19
[ 4.888975] Unable to handle kernel NULL pointer dereference
at virtual address 000000000000003d
[...]
[ 5.096577] Call trace:
[ 5.099099] dma_release_channel+0x24/0x100
[ 5.103235] stm32_spdifrx_remove+0x24/0x60 [snd_soc_stm32_spdifrx]
[ 5.109494] stm32_spdifrx_probe+0x320/0x4c4 [snd_soc_stm32_spdifrx]
To avoid this issue, release the channel only if the pointer is valid.
Fixes: 794df9448edb ("ASoC: stm32: spdifrx: manage rebind issue")
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com>
Link: https://patch.msgid.link/20241105140242.527279-1-olivier.moysan@foss.st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Avoid a possible buffer overflow if size is larger than 4K.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f5d873f5825b40d886d03bd2aede91d4cf002434)
Cc: stable@vger.kernel.org
|
|
Users should not be able to run these.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ba9395430f611cfc101b1c2687732baafa239d5)
Cc: stable@vger.kernel.org
|
|
Regular users shouldn't have read access.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c0cfd2e652553d607b910be47d0cc5a7f3a78641)
Cc: stable@vger.kernel.org
|
|
For DPX mode, the number of memory partitions supported should be less
than or equal to 2.
Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 990c4f580742de7bb78fa57420ffd182fc3ab4cd)
Cc: stable@vger.kernel.org
|
|
Roger Quadros says:
====================
net: ethernet: ti: am65-cpsw: Fixes to multi queue RX feature
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows. Patch 1 fixes this.
The second patch fixes a warning if there was any error in
am65_cpsw_nuss_init_rx_chns() and am65_cpsw_nuss_cleanup_rx_chns()
was called after that.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
====================
Link: https://patch.msgid.link/20241101-am65-cpsw-multi-rx-j7-fix-v3-0-338fdd6a55da@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
flow->irq is initialized to 0, which is a valid IRQ. Set it to -EINVAL
in the error path of am65_cpsw_nuss_init_rx_chns() so we do not try
to free an unallocated IRQ in am65_cpsw_nuss_remove_rx_chns().
If the user tried to change the number of RX queues and am65_cpsw_nuss_init_rx_chns()
failed for any reason, the warning will happen when the user tries to change
the number of RX queues again after the error condition.
root@am62xx-evm:~# ethtool -L eth0 rx 3
[ 40.385293] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
[ 40.393211] am65-cpsw-nuss 8000000.ethernet: Failed to init rx flow2
netlink error: Invalid argument
root@am62xx-evm:~# ethtool -L eth0 rx 2
[ 82.306427] ------------[ cut here ]------------
[ 82.311075] WARNING: CPU: 0 PID: 378 at kernel/irq/devres.c:144 devm_free_irq+0x84/0x90
[ 82.469770] Call trace:
[ 82.472208] devm_free_irq+0x84/0x90
[ 82.475777] am65_cpsw_nuss_remove_rx_chns+0x6c/0xac [ti_am65_cpsw_nuss]
[ 82.482487] am65_cpsw_nuss_update_tx_rx_chns+0x2c/0x9c [ti_am65_cpsw_nuss]
[ 82.489442] am65_cpsw_set_channels+0x30/0x4c [ti_am65_cpsw_nuss]
[ 82.495531] ethnl_set_channels+0x224/0x2dc
[ 82.499713] ethnl_default_set_doit+0xb8/0x1b8
[ 82.504149] genl_family_rcv_msg_doit+0xc0/0x124
[ 82.508757] genl_rcv_msg+0x1f0/0x284
[ 82.512409] netlink_rcv_skb+0x58/0x130
[ 82.516239] genl_rcv+0x38/0x50
[ 82.519374] netlink_unicast+0x1d0/0x2b0
[ 82.523289] netlink_sendmsg+0x180/0x3c4
[ 82.527205] __sys_sendto+0xe4/0x158
[ 82.530779] __arm64_sys_sendto+0x28/0x38
[ 82.534782] invoke_syscall+0x44/0x100
[ 82.538526] el0_svc_common.constprop.0+0xc0/0xe0
[ 82.543221] do_el0_svc+0x1c/0x28
[ 82.546528] el0_svc+0x28/0x98
[ 82.549578] el0t_64_sync_handler+0xc0/0xc4
[ 82.553752] el0t_64_sync+0x190/0x194
[ 82.557407] ---[ end trace 0000000000000000 ]---
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate all RX flows.
This issue is not present on AM62, as a separate pair of
free and completion rings is used for each flow.
Fix this by allocating enough elements for RX free descriptor
ring 0.
However, we can no longer rely on desc_idx (descriptor-based
offsets) to identify the pages in the respective flows, as the
free descriptor ring includes elements for all flows.
To solve this, introduce a new swdata data structure to store
flow_id and page. This can be used to identify which flow (page_pool)
and page the descriptor belonged to when popped out of the
RX rings.
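A hedged sketch of the new per-descriptor software data (the struct and field
names follow the commit text; the exact definition lives in the driver
header):

  struct am65_cpsw_swdata {
      u32 flow_id;        /* RX flow / page_pool this buffer came from */
      struct page *page;  /* backing page for the descriptor */
  };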
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Rick reported that his Pluggable USB4 dock does not work anymore after
upgrading to v6.10 kernel.
It looks like commit c6ca1ac9f472 ("thunderbolt: Increase sideband
access polling delay") makes the device router enumeration happen later
than what might be expected by the dock (although there is no such limit
in the USB4 spec) which probably makes it assume there is something
wrong with the high-speed link and reset it. After the link is reset the
same issue happens again and again.
For this reason lower the sideband access delay from 5ms to 1ms. This
seems to work fine according to Rick's testing.
Reported-by: Rick Lahaye <rick@581238.xyz>
Closes: https://lore.kernel.org/linux-usb/000f01db247b$d10e1520$732a3f60$@581238.xyz/
Tested-by: Rick Lahaye <rick@581238.xyz>
Fixes: c6ca1ac9f472 ("thunderbolt: Increase sideband access polling delay")
Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
If dev_get_regmap() fails, it returns a NULL pointer, not ERR_PTR(), so
replace IS_ERR() with a NULL pointer check and return -ENODEV.
Fixes: d0f8e97866bf ("i2c: muxes: add support for tsd,mule-i2c multiplexer")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
ALSA SoC Sound has Generic Sound Card support (Simple-Card, Audio-Graph-Card,
Audio-Graph-Card2). Add its maintainer.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87ikt2a41c.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The "*cmd" variable can be controlled by the user via debugfs. That means
"new_cam" can be as high as 255 while the size of the uc->updated[] array
is UCSI_MAX_ALTMODES (30).
The call tree is:
ucsi_cmd() // val comes from simple_attr_write_xsigned()
-> ucsi_send_command()
-> ucsi_send_command_common()
-> ucsi_run_command() // calls ucsi->ops->sync_control()
-> ucsi_ccg_sync_control()
Fixes: 170a6726d0e2 ("usb: typec: ucsi: add support for separate DP altmode devices")
Cc: stable <stable@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/325102b3-eaa8-4918-a947-22aca1146586@stanley.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the device was already runtime suspended, then during system suspend
we cannot access the device registers, else it will crash.
Also, we cannot access any registers after dwc3_core_exit() on some
platforms, so move the dwc3_enable_susphy() call to the top.
Cc: stable@vger.kernel.org # v5.15+
Reported-by: William McVicker <willmcvicker@google.com>
Closes: https://lore.kernel.org/all/ZyVfcUuPq56R2m1Y@google.com
Fixes: 705e3ce37bcc ("usb: dwc3: core: Fix system suspend on TI AM62 platforms")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Tested-by: Will McVicker <willmcvicker@google.com>
Link: https://lore.kernel.org/r/20241104-am62-lpm-usb-fix-v1-1-e93df73a4f0d@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the read of USB_PDPHY_RX_ACKNOWLEDGE_REG failed, then hdr_len and
txbuf_len are uninitialized. This commit stops printing the uninitialized
values and misleading/false data.
Cc: stable@vger.kernel.org
Fixes: a4422ff22142 (" usb: typec: qcom: Add Qualcomm PMIC Type-C driver")
Signed-off-by: Rex Nie <rex.nie@jaguarmicro.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Link: https://lore.kernel.org/r/20241030133632.2116-1-rex.nie@jaguarmicro.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When the driver is unloaded and the VFs are disabled concurrently, a
kernel crash occurs. The reason is that both actions call
pci_disable_sriov(), which checks num_VFs to determine whether to
release the corresponding resources. During the second call, num_VFs
is not 0, so the resource release function is called. However, the
corresponding resources have already been released during the first call.
Therefore, the following problem occurs:
[15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[15278.131557][T50670] Call trace:
[15278.134686][T50670] klist_put+0x28/0x12c
[15278.138682][T50670] klist_del+0x14/0x20
[15278.142592][T50670] device_del+0xbc/0x3c0
[15278.146676][T50670] pci_remove_bus_device+0x84/0x120
[15278.151714][T50670] pci_stop_and_remove_bus_device+0x6c/0x80
[15278.157447][T50670] pci_iov_remove_virtfn+0xb4/0x12c
[15278.162485][T50670] sriov_disable+0x50/0x11c
[15278.166829][T50670] pci_disable_sriov+0x24/0x30
[15278.171433][T50670] hnae3_unregister_ae_algo_prepare+0x60/0x90 [hnae3]
[15278.178039][T50670] hclge_exit+0x28/0xd0 [hclge]
[15278.182730][T50670] __se_sys_delete_module.isra.0+0x164/0x230
[15278.188550][T50670] __arm64_sys_delete_module+0x1c/0x30
[15278.193848][T50670] invoke_syscall+0x50/0x11c
[15278.198278][T50670] el0_svc_common.constprop.0+0x158/0x164
[15278.203837][T50670] do_el0_svc+0x34/0xcc
[15278.207834][T50670] el0_svc+0x20/0x30
For details, see the following figure.
rmmod hclge disable VFs
----------------------------------------------------
hclge_exit() sriov_numvfs_store()
... device_lock()
pci_disable_sriov() hns3_pci_sriov_configure()
pci_disable_sriov()
sriov_disable()
sriov_disable() if !num_VFs :
if !num_VFs : return;
return; sriov_del_vfs()
sriov_del_vfs() ...
... klist_put()
klist_put() ...
... num_VFs = 0;
num_VFs = 0; device_unlock();
In this patch, when the driver is being removed, we take the device_lock()
to protect num_VFs, just like sriov_numvfs_store() does.
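A hedged sketch of the locking described above (the wrapper name is
illustrative; device_lock()/device_unlock() and pci_disable_sriov() are the
existing APIs):

  /* Serialize with sriov_numvfs_store(), which runs under device_lock(). */
  static void hclge_disable_sriov_locked(struct pci_dev *pdev)
  {
      device_lock(&pdev->dev);
      pci_disable_sriov(pdev);
      device_unlock(&pdev->dev);
  }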
Fixes: 0dd8a25f355b ("net: hns3: disable sriov before unload hclge layer")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241101091507.3644584-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit 6ed05c68cbca ("usb: musb: sunxi: Explicitly release USB PHY on
exit") will cause that usb phy @glue->xceiv is accessed after released.
1) register platform driver @sunxi_musb_driver
// get the usb phy @glue->xceiv
sunxi_musb_probe() -> devm_usb_get_phy().
2) register and unregister platform driver @musb_driver
musb_probe() -> sunxi_musb_init()
use the phy here
//the phy is released here
musb_remove() -> sunxi_musb_exit() -> devm_usb_put_phy()
3) register @musb_driver again
musb_probe() -> sunxi_musb_init()
use the phy here but the phy has been released at 2).
...
Fixed by reverting the commit, namely, removing devm_usb_put_phy()
from sunxi_musb_exit().
Fixes: 6ed05c68cbca ("usb: musb: sunxi: Explicitly release USB PHY on exit")
Cc: stable@vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20241029-sunxi_fix-v1-1-9431ed2ab826@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Daniel Machon says:
====================
net: lan969x: add VCAP functionality
== Description:
This series is the third of a multi-part series, that prepares and adds
support for the new lan969x switch driver.
The upstreaming efforts is split into multiple series (might change a
bit as we go along):
1) Prepare the Sparx5 driver for lan969x (merged)
2) Add support for lan969x (same basic features as Sparx5
provides excl. FDMA and VCAP, merged).
--> 3) Add lan969x VCAP functionality.
4) Add RGMII and FDMA functionality.
== VCAP support:
The Versatile Content-Aware Processor (VCAP) is a content-aware packet
processor that allows wirespeed packet inspection for rich
implementation of, for example, advanced VLAN and QoS classification and
manipulations, IP source guarding, longest prefix matching for Layer-3
routing, and security features for wireline and wireless applications.
This is all achieved by programming rules into the VCAP.
When a VCAP is enabled, every frame passing through the switch is
analyzed and multiple keys are created based on the contents of the
frame. The frame is examined to determine the frame type (for example,
IPv4 TCP frame), so that the frame information is extracted according to
the frame type, port-specific configuration, and classification results
from the basic classification. Keys are applied to the VCAP and when
there is a match between a key and a rule in the VCAP, the rule is then
applied to the frame from which the key was extracted.
After this series is applied, the lan969x driver will support the same
VCAP functionality as Sparx5.
== Patch breakdown:
Patch #1 exposes some VCAP symbols for lan969x.
Patch #2 replaces VCAP uses of SPX5_PORTS with n_ports from the match
data.
Patch #3 adds new VCAP constants to match data
Patch #4 removes the is_sparx5() check to now initialize the VCAP API on
lan969x.
Patch #5 adds the auto-generated VCAP data for lan969x.
Patch #6 adds the VCAP configuration data for lan969x.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================
Link: https://patch.msgid.link/20241101-sparx5-lan969x-switch-driver-3-v1-0-3c76f22f4bfa@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add configuration data (for consumption by the VCAP API) for the four
VCAPs that we are going to support. The following VCAPs will be
supported:
- VCAP CLM: (also known as IS0) is part of the analyzer and enables
frame classification using VCAP functionality.
- VCAP IS2: is part of ANA_ACL and enables access control lists, using
VCAP functionality.
- VCAP ES0: is part of the rewriter and enables rewriting of frames
using VCAP functionality.
- VCAP ES2: is part of EACL and enables egress access control lists
using VCAP functionality
The two VCAPs, CLM and IS2, use shared resources from the SUPER VCAP.
The SUPER VCAP is a shared pool of 6 blocks that can be distributed
freely among CLM and IS2. Each block in the pool has 3,072 addresses
with entries, actions, and counters. ES0 and ES2 does not use shared
resources.
In the configuration data for lan969x CLM uses blocks 2-4 with a total
of 6 lookups. IS2 uses blocks 0-1 with a total of 4 lookups.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Platform VCAP data for each VCAP instance is auto-generated using an
internal Microchip tool. The generated VCAP data contains information
about keyfields, keyfield sets, actionfields, actionfield sets and
typegroups, which in combination are used to encode and decode rules in
the VCAP.
Add the auto-generated VCAP file lan969x_vcap_ag_api.c and assign the
two structs: lan969x_vcaps and lan969x_vcap_stats to the match data.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The is_sparx5() check was introduced in an earlier series, to make sure
the sparx5_vcap_init() was not executed on lan969x, as it was not
implemented there yet. Now that it is, remove that check.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, add the following three new
VCAP constants to match data:
- vcaps_cfg (contains configuration data for each VCAP).
- vcaps (contains auto-generated information about VCAP keys and
actions).
- vcap_stats: (contains auto-generated string names of all the keys
and actions)
Add these constants to the Sparx5 match data constants and use them to
initialize the VCAPs in sparx5_vcap_init().
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The Sparx5 VCAP implementation uses the SPX5_PORTS symbol to iterate over
the 65 front ports of Sparx5. Replace the use with the n_ports constant
from the match data, which translates to 65 on Sparx5 and 30 on lan969x.
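A hedged sketch of the change (the accessor path into the match data and the
per-port helper are illustrative):

  int portno;

  /* Was: for (portno = 0; portno < SPX5_PORTS; portno++) */
  for (portno = 0; portno < sparx5->data->consts->n_ports; portno++) {
      /* per-port VCAP keyset/port configuration, unchanged otherwise */
      configure_vcap_port(sparx5, portno);
  }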
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, expose the following symbols for
use by the lan969x VCAP implementation:
- The symbols SPARX5_*_LOOKUPS defines the number of lookups in each
VCAP instance. These are the same for lan969x. Move them to the
header file.
- The struct sparx5_vcap_inst encapsulates information about a single
VCAP instance. Move this struct to the header file and declare the
sparx5_vcap_inst_cfg as extern.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
HP 320 FHD Webcam (03f0:654a) seems to have flaky firmware like other
webcam devices that don't like the frequency inquiries. Also, Mic
Capture Volume has an invalid resolution, hence fix it to be 16 (as a
blind shot).
Link: https://bugzilla.suse.com/show_bug.cgi?id=1232768
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20241105120220.5740-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|