Age | Commit message (Collapse) | Author |
|
Harald Freudenberger says:
====================
This series of patches has the goal to open up a do-not-allocate
memory path from the callers of the pkey in-kernel api down to
the crypto cards and back.
The asynch in-kernel cipher implementations (and the s390 PAES
cipher implementations are one of them) may be called in a
context where memory allocations which trigger IO is not acceptable.
So this patch series reworks the AP bus code, the zcrypt layer,
the pkey layer and the pkey handlers to respect this situation
by processing a new parameter xflags (execution hints flags).
There is a flag PKEY_XFLAG_NOMEMALLOC which tells the code to
not allocate memory which may lead to IO operations.
To reach this goal, the actual code changes have been differed.
The zcrypt misc functions which need memory for cprb build
use a pre allocated memory pool for this purpose. The findcard()
functions have one temp memory area preallocated and protected
with a mutex. Some smaller data is not allocated any more but went
to the stack instead. The AP bus also uses a pre-allocated
memory pool for building AP message requests.
Note that the PAES implementation still needs to get reworked
to run the protected key derivation in a real asynchronous way.
However, this rework of AP bus, zcrypt and pkey is the base work
required before reconsidering the PAES implementation.
The patch series starts bottom (AP bus) and goes up the call
chain (PKEY). At any time in the patch stack it should compile.
For easier review I tried to have one logic code change by
each patch and thus keep the patches "small".
====================
Link: https://lore.kernel.org/r/20250424133619.16495-1-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Add a new parameter xflags to the in-kernel API function
pkey_key2protkey(). Currently there is only one flag supported:
* PKEY_XFLAG_NOMEMALLOC:
If this flag is given in the xflags parameter, the pkey
implementation is not allowed to allocate memory but instead should
fall back to use preallocated memory or simple fail with -ENOMEM.
This flag is for protected key derive within a cipher or similar
which must not allocate memory which would cause io operations - see
also the CRYPTO_ALG_ALLOCATES_MEMORY flag in crypto.h.
The one and only user of this in-kernel API - the skcipher
implementations PAES in paes_s390.c set this flag upon request
to derive a protected key from the given raw key material.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-26-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Provide and pass the xflag parameter from pkey ioctls through
the pkey handler and further down to the implementations
(CCA, EP11, PCKMO and UV). So all the code is now prepared
and ready to support xflags ("execution flag").
The pkey layer supports the xflag PKEY_XFLAG_NOMEMALLOC: If this
flag is given in the xflags parameter, the pkey implementation is
not allowed to allocate memory but instead should fall back to use
preallocated memory or simple fail with -ENOMEM. This flag is for
protected key derive within a cipher or similar which must not
allocate memory which would cause io operations - see also the
CRYPTO_ALG_ALLOCATES_MEMORY flag in crypto.h.
Within the pkey handlers this flag is then to be translated to
appropriate zcrypt xflags before any zcrypt related functions
are called. So the PKEY_XFLAG_NOMEMALLOC translates to
ZCRYPT_XFLAG_NOMEMALLOC - If this flag is set, no memory
allocations which may trigger any IO operations are done.
The pkey in-kernel pkey API still does not provide this xflag
param. That's intended to come with a separate patch which
enables this functionality.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-25-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The uv_get_secret_metadata() in-kernel function was only
offered and used by the pkey uv handler. Remove it as there
is no customer any more.
Suggested-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Acked-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-24-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The pkey uv functions may be called in a situation where memory
allocations which trigger IO operations are not allowed. An example:
decryption of the swap partition with protected key (PAES).
The pkey uv code takes care of this by holding one preallocated
struct uv_secret_list to be used with the new UV function
uv_find_secret(). The older function uv_get_secret_metadata()
used before always allocates/frees an ephemeral memory buffer.
The preallocated struct is concurrency protected by a mutex.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-23-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Rename the internal UV function find_secret() to uv_find_secret()
and publish it as new UV API in-kernel function.
The pkey uv handler may be called in a do-not-allocate memory
situation where sleeping is allowed but allocating memory which
may cause IO operations is not. For example when an encrypted
swap file is used and the encryption is done via UV retrievable
secrets with protected keys.
The UV API function uv_get_secret_metadata() allocates memory
and then calls the find_secret() function. By exposing the
find_secret() function as a new UV API function uv_find_secret()
it is possible to retrieve UV secret meta data without any
memory allocations from the UV when the caller offers space
for one struct uv_secret_list.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Acked-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-22-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
There have been some places in the EP11 handler code where relatively
small amounts of memory have been allocated an freed at the end
of the function. This code has been reworked to use the stack instead.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-21-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
There have been some places in the CCA handler code where relatively
small amounts of memory have been allocated an freed at the end
of the function. This code has been reworked to use the stack instead.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-20-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
There are two places in the ep11 misc code where a short term
memory buffer is needed. Rework this code to use the cprb mempool
to satisfy this ephemeral memory requirements.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-19-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Locate the relative small struct ep11_domain_query_info variable
onto the stack instead of kmalloc()/kfree().
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-18-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Propagate the xflags argument from the cca_get_info()
caller down to the lower level functions for proper
memory allocation hints.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-17-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Rework two places in the zcrypt cca misc code using kmalloc() for
ephemeral memory allocation. As there is anyway now a cprb mempool
let's use this pool instead to satisfy these short term memory
allocations.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-16-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Rework the memory usage of the ep11 findcard() implementation:
- findcard does not allocate memory for the list of apqns
any more.
- the callers are now responsible to provide an array of
apqns to store the matching apqns into.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-15-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Rework the memory usage of the cca findcard() implementation:
- findcard does not allocate memory for the list of apqns
any more.
- the callers are now responsible to provide an array of
apqns to store the matching apqns into.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-14-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Remove the caching of the CCA and EP11 card and domain info.
In nearly all places where the card or domain info is fetched
the verify param was enabled and thus the cache was bypassed.
The only real place where info from the cache was used was
in the sysfs pseudo files in cases where the card/queue was
switched to "offline". All other callers insisted on getting
fresh info and thus a communication to the card was enforced.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-13-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The static function findcard() and the zcrypt cca_findcard()
function are both not used any more. Remove this outdated
code and an internal function only called by these.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-12-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Introduce a pre-allocated device status array memory together with
a mutex controlling the occupation to be used by the findcard()
function. Limit the device status array to max 128 cards and max
128 domains to reduce the size of this pre-allocated memory to 64 KB.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-11-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Introduce a pre-allocated device status array memory together with
a mutex controlling the occupation to be used by the findcard2()
function. Limit the device status array to max 128 cards and max
128 domains to reduce the size of this pre-allocated memory to 64 KB.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-10-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Rework the existing function zcrypt_device_status_mask_ext():
Add two new parameters to provide upper limits for
cards and queues. The existing implementation needed an
array of 256 * 256 * 4 = 256 KB which is really huge. The
reworked function is more flexible in the sense that the
caller can decide the upper limit for cards and domains to
be stored into the status array. So for example a caller may
decide to only query for cards 0...127 and queues 0...127
and thus only an array of size 128 * 128 * 4 = 64 KB is needed.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-9-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Introduce a cprb mempool for the zcrypt ep11 misc functions
(zcrypt_ep11misc.*) do some preparation rework to support
a do-not-allocate path through some zcrypt ep11 misc functions.
The mempool is controlled by the zcrypt module parameter
"mempool_threshold" which shall control the minimal amount
of memory items for CCA and EP11.
The mempool shall support "mempool_threshold" requests/replies
in parallel which means for EP11 to hold a send and receive
buffer memory per request. Each of this cprb space items is
limited to 8 KB. So by default the mempool consumes
5 * 2 * 8KB = 80KB
If the mempool is depleted upon one ep11 misc functions is
called with the ZCRYPT_XFLAG_NOMEMALLOC xflag set, the function
will fail with -ENOMEM and the caller is responsible for taking
further actions.
This is only part of an rework to support a new xflag
ZCRYPT_XFLAG_NOMEMALLOC but not yet complete.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-8-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Introduce a new module parameter "zcrypt_mempool_threshold"
for the zcrypt module. This parameter controls the minimal
amount of mempool items which are pre-allocated for urgent
requests/replies and will be used with the support for the
new xflag ZCRYPT_XFLAG_NOMEMALLOC. The default value of 5
shall provide enough memory items to support up to 5 requests
(and their associated reply) in parallel. The minimum value
is 1 and is checked in zcrypt module init().
If the mempool is depleted upon one cca misc functions is called
with the named xflag set, the function will fail with -ENOMEM
and the caller is responsible for taking further actions.
For CCA each mempool item is 16KB, as a CCA CPRB needs to
hold the request and the reply. The pool items only support
requests/replies with a limit of about 8KB.
So by default the CCA mempool consumes
5 * 16KB = 80KB
This is only part of an rework to support a new xflag
ZCRYPT_XFLAG_NOMEMALLOC but not yet complete.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-7-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Introduce a new flag parameter for the both cprb send functions
zcrypt_send_cprb() and zcrypt_send_ep11_cprb(). This new
xflags parameter ("execution flags") shall be used to provide
execution hints and flags for this crypto request.
There are two flags implemented to be used with these functions:
* ZCRYPT_XFLAG_USERSPACE - indicates to the lower layers that
all the ptrs address userspace. So when construction the ap msg
copy_from_user() is to be used. If this flag is NOT set, the ptrs
address kernel memory and thus memcpy() is to be used.
* ZCRYPT_XFLAG_NOMEMALLOC - indicates that this task must not
allocate memory which may be allocated with io operations.
For the AP bus and zcrypt message layer this means:
* The ZCRYPT_XFLAG_USERSPACE is mapped to the already existing
bool variable "userspace" which is propagated to the zcrypt
proto implementations.
* The ZCRYPT_XFLAG_NOMEMALLOC results in setting the AP flag
AP_MSG_FLAG_MEMPOOL when the AP msg buffer is initialized.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-6-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
If there is a target list of APQNs given when an CPRB is
to be send via zcrypt_send_ep11_cprb() there is always a
kmalloc() done and the targets are copied via z_copy_from_user.
As there are callers from kernel space (zcrypt_ep11misc.c)
which signal this via the userspace parameter improve this
code to directly use the given target list in case of
kernelspace thus removing the unnecessary memory alloc
and mem copy.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-5-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
There is a need for a do-not-allocate-memory path through the AP bus
layer. The pkey layer may be triggered via the in-kernel interface
from a protected key crypto algorithm (namely PAES) to convert a
secure key into a protected key. This happens in a workqueue context,
so sleeping is allowed but memory allocations causing IO operations
are not permitted.
To accomplish this, an AP message memory pool with pre-allocated space
is established. When ap_init_apmsg() with use_mempool set to true is
called, instead of kmalloc() the ap message buffer is allocated from
the ap_msg_pool. This pool only holds a limited amount of buffers:
ap_msg_pool_min_items with the item size AP_DEFAULT_MAX_MSG_SIZE and
exactly one of these items (if available) is returned if
ap_init_apmsg() with the use_mempool arg set to true is called. When
this pool is exhausted and use_mempool is set true, ap_init_apmsg()
returns -ENOMEM without any attempt to allocate memory and the caller
has to deal with that.
Default values for this mempool of ap messages is:
* Each buffer is 12KB (that is the default AP bus size
and all the urgent messages should fit into this space).
* Minimum items held in the pool is 8. This value is adjustable
via module parameter ap.msgpool_min_items.
The zcrypt layer may use this flag to indicate to the ap bus that the
processing path for this message should not allocate memory but should
use pre-allocated memory buffer instead. This is to prevent deadlocks
with crypto and io for example with encrypted swap volumes.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-4-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Slight rework on the way how AP message buffers are allocated.
Instead of having multiple places with kmalloc() calls all
the AP message buffers are now allocated and freed on exactly
one place: ap_init_apmsg() allocates the current AP bus max
limit of ap_max_msg_size (defaults to 12KB). The AP message
buffer is then freed in ap_release_apmsg().
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-3-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Move the very small response_type struct into struct ap_msg.
So there is no need to kmalloc this tiny struct with each
ap message preparation.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133619.16495-2-freude@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
In CPUMF attribute definitions for z15 all CPUMF attributes
have configuration values of the form 0x0[0-9a-f]{3} .
However 2 defines do not match this scheme, they have two leading
zeroes instead of one. Adjust this. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The mutex unlock for vdev->submitted_jobs_lock was incorrectly placed
before unlocking file_priv->lock. Change order of unlocks to avoid potential
race conditions.
Fixes: 5bbccadaf33e ("accel/ivpu: Abort all jobs after command queue unregister")
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425093656.2228168-1-jacek.lawrynowicz@linux.intel.com
|
|
Fix deadlocks in ivpu_cmdq_create_ioctl() and ivpu_cmdq_destroy_ioctl()
related to runtime suspend.
Runtime suspend acquires file_priv->lock mutex by calling
ivpu_cmdq_reset_all_contexts(). The same lock is acquired in the cmdq
ioctls. If one of the cmdq ioctls is called while runtime suspend is in
progress, it can lead to a deadlock.
Call stacks from example deadlock below.
Runtime suspend thread:
[ 3443.179717] Call Trace:
[ 3443.179724] __schedule+0x4b6/0x16b0
[ 3443.179732] ? __mod_timer+0x27d/0x3a0
[ 3443.179738] schedule+0x2f/0x140
[ 3443.179741] schedule_preempt_disabled+0x19/0x30
[ 3443.179743] __mutex_lock.constprop.0+0x335/0x7d0
[ 3443.179745] ? xas_find+0x1ed/0x260
[ 3443.179747] ? xa_find+0x8e/0xf0
[ 3443.179749] __mutex_lock_slowpath+0x13/0x20
[ 3443.179751] mutex_lock+0x41/0x60
[ 3443.179757] ivpu_cmdq_reset_all_contexts+0x82/0x150 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.179786] ivpu_pm_runtime_suspend_cb+0x1f1/0x3f0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.179850] pci_pm_runtime_suspend+0x6e/0x1f0
[ 3443.179870] ? __pfx_pci_pm_runtime_suspend+0x10/0x10
[ 3443.179886] __rpm_callback+0x48/0x130
[ 3443.179899] rpm_callback+0x64/0x70
[ 3443.179911] rpm_suspend+0x12c/0x630
[ 3443.179922] ? __schedule+0x4be/0x16b0
[ 3443.179941] pm_runtime_work+0xca/0xf0
[ 3443.179955] process_one_work+0x188/0x3d0
[ 3443.179971] worker_thread+0x2b9/0x3c0
[ 3443.179984] kthread+0xfb/0x220
[ 3443.180001] ? __pfx_worker_thread+0x10/0x10
[ 3443.180013] ? __pfx_kthread+0x10/0x10
[ 3443.180029] ret_from_fork+0x47/0x70
[ 3443.180044] ? __pfx_kthread+0x10/0x10
[ 3443.180059] ret_from_fork_asm+0x1a/0x30
User space thread:
[ 3443.180128] Call Trace:
[ 3443.180138] __schedule+0x4b6/0x16b0
[ 3443.180159] schedule+0x2f/0x140
[ 3443.180163] rpm_resume+0x1a7/0x6a0
[ 3443.180165] ? __pfx_autoremove_wake_function+0x10/0x10
[ 3443.180169] __pm_runtime_resume+0x56/0x90
[ 3443.180171] ivpu_rpm_get+0x28/0xb0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180181] ivpu_ipc_send_receive+0x6d/0x120 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180193] ? free_frozen_pages+0x395/0x670
[ 3443.180199] ? __free_pages+0xa7/0xc0
[ 3443.180202] ivpu_jsm_hws_destroy_cmdq+0x76/0xf0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180213] ? locks_dispose_list+0x6c/0xa0
[ 3443.180219] ? kmem_cache_free+0x342/0x470
[ 3443.180222] ? vm_area_free+0x19/0x30
[ 3443.180225] ? xas_load+0x17/0xf0
[ 3443.180229] ? xa_load+0x72/0xb0
[ 3443.180230] ivpu_cmdq_unregister.isra.0+0xb1/0x100 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180241] ivpu_cmdq_destroy_ioctl+0x8d/0x130 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180251] ? __pfx_ivpu_cmdq_destroy_ioctl+0x10/0x10 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180260] drm_ioctl_kernel+0xb3/0x110
[ 3443.180265] drm_ioctl+0x2ca/0x580
[ 3443.180266] ? __pfx_ivpu_cmdq_destroy_ioctl+0x10/0x10 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180275] ? __fput+0x1ae/0x2f0
[ 3443.180279] ? kmem_cache_free+0x342/0x470
[ 3443.180282] __x64_sys_ioctl+0xa9/0xe0
[ 3443.180286] x64_sys_call+0x13b7/0x26f0
[ 3443.180289] do_syscall_64+0x62/0x180
[ 3443.180291] entry_SYSCALL_64_after_hwframe+0x71/0x79
Fixes: 465a3914b254 ("accel/ivpu: Add API for command queue create/destroy/submit")
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425093341.2202895-1-jacek.lawrynowicz@linux.intel.com
|
|
Increase JMS message state dump command timeout to 100 ms. On some
platforms, the FW may take a bit longer than 50 ms to dump its state
to the log buffer and we don't want to miss any debug info during TDR.
Fixes: 5e162f872d7a ("accel/ivpu: Add FW state dump on TDR")
Cc: stable@vger.kernel.org # v6.13+
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425092822.2194465-1-jacek.lawrynowicz@linux.intel.com
|
|
Restructure SRSO to use select/update/apply functions to create
consistent vulnerability handling. Like with retbleed, the command line
options directly select mitigations which can later be modified.
While at it, remove a comment which doesn't apply anymore due to the
changed mitigation detection flow.
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/20250418161721.1855190-17-david.kaplan@amd.com
|
|
Using guard/scoped_guard() to simplify code. Using guard() to remove
'goto unlock' label is neater especially.
[ tglx: Brought back the scoped_guard()'s which were dropped in v2 and
simplified alarmtimer_rtc_add_device() ]
Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20250430032734.2079290-4-suhui@nfschina.com
|
|
'clockid' can only be ALARM_REALTIME and ALARM_BOOTTIME. It's impossible to
return -1 and callers never check the return value.
Only alarm_clock_get_timespec(), alarm_clock_get_ktime(),
alarm_timer_create() and alarm_timer_nsleep() call clock2alarm(). These
callers use clockid_to_kclock() to get 'struct k_clock', which ensures
that clock2alarm() never returns -1.
Remove the impossible -1 return value, and add a warning to notify about any
future misuse of this function.
Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250430032734.2079290-3-suhui@nfschina.com
|
|
register_refined_jiffies() is only used in setup code and always returns 0.
Mark it as __init to save some bytes and change it to void.
Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250430032734.2079290-2-suhui@nfschina.com
|
|
The "snps,data-width" property[1] defines the AXI data width of the DMA
controller as:
width = 8 × (2^n) bits
(0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits)
where "n" is the value of "snps,data-width".
For the CV18xx DMA controller, the correct AXI data width is 32 bits,
corresponding to "snps,data-width = 2".
Test results on Milkv Duo S can be found here [2].
Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1]
Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2]
Fixes: 514951a81a5e ("riscv: dts: sophgo: cv18xx: add DMA controller")
Co-developed-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Ze Huang <huangze@whut.edu.cn>
Link: https://lore.kernel.org/r/20250428-duo-dma-config-v1-1-eb6ad836ca42@whut.edu.cn
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
|
|
ipcomp_post_acomp currently drops all frags (via pskb_trim_unique(skb,
0)), and then subtracts the old skb->data_len from truesize. This
adjustment has already be done during trimming (in skb_condense), so
we don't need to do it again.
This shows up for example when running fragmented traffic over ipcomp,
we end up hitting the WARN_ON_ONCE in skb_try_coalesce.
Fixes: eb2953d26971 ("xfrm: ipcomp: Use crypto_acomp interface")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
A few ASUS models use the ALC256_FIXUP_ASUS_HEADSET_MODE although they
have no built-in mic pin on NID 0x13, as found in the commit
c1732ede5e80 ("ALSA: hda/realtek - Fix headset and mic on several Asus
laptops with ALC256"). This was relatively harmless in the past as
NID 0x13 was assigned as the secondary mic. But since the fix for the
pin sort order, this pin became the primary one, hence user started
noticing the broken input, and we've fixed already for a few ASUS
models to switch to ALC256_FIXUP_ASUS_MIC_NO_PRESENCE.
This patch corrects the other ASUS models to use the right quirk entry
for fixing the built-in mic regression. Here we cover X541SA
(1043:12e0), X541UV (1043:12f0), Z550SA (1043:13bf0) and X555UB
(1043:1ccd).
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220058
Link: https://patch.msgid.link/20250430053210.31776-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Opt_err is not used in EROFS, we can remove it.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20250429075056.689570-1-lihongbo22@huawei.com
Signed-off-by: Gao Xiang <xiang@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a regression in scompress"
* tag 'v6.15-p6' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: scompress - increment scomp_scratch_users when already allocated
|
|
The sha512 state size in s390_sha_ctx is out by a factor of 8,
fix it so that it stays below HASH_MAX_DESCSIZE. Also fix the
state init function which was writing garbage to the state. Luckily
this is not actually used for anything other than export.
Reported-by: Harald Freudenberger <freude@linux.ibm.com>
Fixes: 572b5c4682c7 ("crypto: s390/sha512 - Use API partial block handling")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The precomputed value for the NAND_READ_LOCATION_2 register should be
stored in 'snandc->regs->read_location2'.
Fix the qcom_spi_set_read_loc_first() function accordingly.
Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Reviewed-by: Md Sadre Alam <quic_mdalam@quicinc.com>
Link: https://patch.msgid.link/20250428-qpic-snand-readloc2-fix-v1-1-50ce0877ff72@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Depending on the architecture __ffs() returns either an 'unsigned long'
or 'unsigned int' result. Compile-testing this driver on targets that
use the latter produces a warning:
sound/soc/intel/catpt/dsp.c: In function 'catpt_dsp_set_srampge':
sound/soc/intel/catpt/dsp.c:181:44: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Werror=format=]
181 | dev_dbg(cdev->dev, "sanitize block %ld: off 0x%08x\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change the type of the local variable to match the format string and
avoid the warning on any architecture.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20250429073545.3558494-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If any address or port is changed, update it in all packets and recalculate
checksum.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250426153210.14044-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vladimir Oltean says:
====================
Fix Felix DSA taprio gates after clock jump
Richie Pearn presented a reproducible situation where traffic would get
blocked on the NXP LS1028A switch if a certain taprio schedule was
applied, and stepping the PTP clock would take place. The latter event
is an expected initial occurrence, but also at runtime, for example when
transitioning from one grandmaster to another.
The issue is completely described in patch 1/4, which also contains
the fix, but it has left me with some doubts regarding the need for
vsc9959_tas_clock_adjust() in general.
In order to prove to myself that vsc9959_tas_clock_adjust() is needed in
general, I have written a selftest for the tc-taprio data path in patch
4/4. On the LS1028A, we can clearly see the following failures without
that function:
INFO: Forcing a backward clock jump
TEST: ping [FAIL]
INFO: Setting up taprio after PTP
TEST: In band with gate [FAIL]
Reception of 100 packets failed
TEST: Out of band with gate [FAIL]
Reception of 100 packets failed
As for testing my fix from patch 1/4, that was quite a bit more complex
to do automatically. In fact, I couldn't find any other schedule that
would fail to be updated by vsc9959_tas_clock_adjust() as cleanly as
the schedule from Richie, so I've added that specific schedule as the
test_clock_jump_backward() test.
The test ordering is also (unfortunately) very strategic. Running the
selftest to the end dirties the GCL RAM, and when running
test_clock_jump_backward() once again, the GCL entries won't be all
zeroes as they were the first time around. They will contain bits and
pieces of old schedules, making it very challenging to make it fail.
Thus, test_clock_jump_backward() is the first in the test suite, and
without patch 1/4, it is only supposed to fail the _first_ time when
running after a clean boot.
====================
Link: https://patch.msgid.link/20250426144859.3128352-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a forwarding path test for tc-taprio, based on isochron. This is
specifically intended for NICs with an offloaded data path (switchdev/DSA)
and requires taprio 'flags 2'. Also, $h1 and $h2 must support hardware
timestamping, and $h1 tc-etf offload, for isochron to work.
Packets received by a switch while the egress port has a taprio schedule
with an open gate for the traffic class must be sent right away.
Packets received by the switch while the traffic class gate must be
delayed until it opens.
Packets received by the switch must be dropped if the gate for the
traffic class never opens.
Packets should pass if the maximum SDU for the traffic class allows it,
and should be dropped otherwise.
The schedule should auto-update itself if clock jumps take place while
taprio is installed. Repeat most of the above tests after forcing two
clock jumps, one backwards (in Jan 1970) and one back into the present.
Symlink it from tools/testing/selftests/drivers/net/dsa, because usually
DSA ports have the same MAC address, and we need STABLE_MAC_ADDRS=yes
from its forwarding.config for the test to run successfully.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250426144859.3128352-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make out-of-band testing (send a packet when its traffic class gate is
closed, expecting it to be delayed) more predictable by allowing the
window size to be customized by isochron_do().
From man isochron-send, the window size alters the advance time (the
delta between the transmission time of the packet, and its expected TX
time when using SO_TXTIME or tc-taprio on the sender). In absence of the
argument, isochron-send defaults to maximizing the advance time (making
it equal to the cycle length).
The default behavior is exactly what is problematic. An advance time
that is too large will make packets intended to be out-of-band still be
potentially in-band with an open gate from the schedule's previous cycle.
We need to allow that advance time to be reduced.
Perhaps a bit confusingly, isochron_do() has a shift_time argument
currently, but that does not help here. The shift time shifts both the
user space wakeup time and the expected TX time by equal amounts, it is
unable of bringing them closer to one another.
Set the window size properly for the Ocelot PSFP selftest as well.
That used to work due to a very carefully chosen SHIFT_TIME_NS.
I've re-tested that the test still works properly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250426144859.3128352-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This snippet will be necessary for a future isochron-based test, so
provide a simpler high-level interface for counting the received
packets.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250426144859.3128352-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simplest setup to reproduce the issue: connect 2 ports of the
LS1028A-RDB together (eno0 with swp0) and run:
$ ip link set eno0 up && ip link set swp0 up
$ tc qdisc replace dev swp0 parent root handle 100 taprio num_tc 8 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 map 0 1 2 3 4 5 6 7 \
base-time 0 sched-entry S 20 300000 sched-entry S 10 200000 \
sched-entry S 20 300000 sched-entry S 48 200000 \
sched-entry S 20 300000 sched-entry S 83 200000 \
sched-entry S 40 300000 sched-entry S 00 200000 flags 2
$ ptp4l -i eno0 -f /etc/linuxptp/configs/gPTP.cfg -m &
$ ptp4l -i swp0 -f /etc/linuxptp/configs/gPTP.cfg -m
One will observe that the PTP state machine on swp0 starts
synchronizing, then it attempts to do a clock step, and after that, it
never fails to recover from the condition below.
ptp4l[82.427]: selected best master clock 00049f.fffe.05f627
ptp4l[82.428]: port 1 (swp0): MASTER to UNCALIBRATED on RS_SLAVE
ptp4l[83.252]: port 1 (swp0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[83.886]: rms 4537731277 max 9075462553 freq -18518 +/- 11467 delay 818 +/- 0
ptp4l[84.170]: timed out while polling for tx timestamp
ptp4l[84.171]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
ptp4l[84.172]: port 1 (swp0): send peer delay request failed
ptp4l[84.173]: port 1 (swp0): clearing fault immediately
ptp4l[84.269]: port 1 (swp0): SLAVE to LISTENING on INIT_COMPLETE
ptp4l[85.303]: timed out while polling for tx timestamp
ptp4l[84.171]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
ptp4l[84.172]: port 1 (swp0): send peer delay request failed
ptp4l[84.173]: port 1 (swp0): clearing fault immediately
ptp4l[84.269]: port 1 (swp0): SLAVE to LISTENING on INIT_COMPLETE
ptp4l[85.303]: timed out while polling for tx timestamp
ptp4l[85.304]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
ptp4l[85.305]: port 1 (swp0): send peer delay response failed
ptp4l[85.306]: port 1 (swp0): clearing fault immediately
ptp4l[86.304]: timed out while polling for tx timestamp
A hint is given by the non-zero statistics for dropped packets which
were expecting hardware TX timestamps:
$ ethtool --include-statistics -T swp0
(...)
Statistics:
tx_pkts: 30
tx_lost: 11
tx_err: 0
We know that when PTP clock stepping takes place (from ocelot_ptp_settime64()
or from ocelot_ptp_adjtime()), vsc9959_tas_clock_adjust() is called.
Another interesting hint is that placing an early return in
vsc9959_tas_clock_adjust(), so as to neutralize this function, fixes the
issue and TX timestamps are no longer dropped.
The debugging function written by me and included below is intended to
read the GCL RAM, after the admin schedule became operational, through
the two status registers available for this purpose:
QSYS_GCL_STATUS_REG_1 and QSYS_GCL_STATUS_REG_2.
static void vsc9959_print_tas_gcl(struct ocelot *ocelot)
{
u32 val, list_length, interval, gate_state;
int i, err;
err = read_poll_timeout(ocelot_read, val,
!(val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING),
10, 100000, false, ocelot, QSYS_PARAM_STATUS_REG_8);
if (err) {
dev_err(ocelot->dev,
"Failed to wait for TAS config pending bit to clear: %pe\n",
ERR_PTR(err));
return;
}
val = ocelot_read(ocelot, QSYS_PARAM_STATUS_REG_3);
list_length = QSYS_PARAM_STATUS_REG_3_LIST_LENGTH_X(val);
dev_info(ocelot->dev, "GCL length: %u\n", list_length);
for (i = 0; i < list_length; i++) {
ocelot_rmw(ocelot,
QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM(i),
QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM_M,
QSYS_GCL_STATUS_REG_1);
interval = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_2);
val = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_1);
gate_state = QSYS_GCL_STATUS_REG_1_GATE_STATE_X(val);
dev_info(ocelot->dev, "GCL entry %d: states 0x%x interval %u\n",
i, gate_state, interval);
}
}
Calling it from two places: after the initial QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE
performed by vsc9959_qos_port_tas_set(), and after the one done by
vsc9959_tas_clock_adjust(), I notice the following difference.
From the tc-taprio process context, where the schedule was initially
configured, the GCL looks like this:
mscc_felix 0000:00:00.5: GCL length: 8
mscc_felix 0000:00:00.5: GCL entry 0: states 0x20 interval 300000
mscc_felix 0000:00:00.5: GCL entry 1: states 0x10 interval 200000
mscc_felix 0000:00:00.5: GCL entry 2: states 0x20 interval 300000
mscc_felix 0000:00:00.5: GCL entry 3: states 0x48 interval 200000
mscc_felix 0000:00:00.5: GCL entry 4: states 0x20 interval 300000
mscc_felix 0000:00:00.5: GCL entry 5: states 0x83 interval 200000
mscc_felix 0000:00:00.5: GCL entry 6: states 0x40 interval 300000
mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 200000
But from the ptp4l clock stepping process context, when the
vsc9959_tas_clock_adjust() hook is called, the GCL RAM of the
operational schedule now looks like this:
mscc_felix 0000:00:00.5: GCL length: 8
mscc_felix 0000:00:00.5: GCL entry 0: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 1: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 2: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 3: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 4: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 5: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 6: states 0x0 interval 0
mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 0
I do not have a formal explanation, just experimental conclusions.
It appears that after triggering QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE
for a port's TAS, the GCL entry RAM is updated anyway, despite what the
documentation claims: "Specify the time interval in
QSYS::GCL_CFG_REG_2.TIME_INTERVAL. This triggers the actual RAM
write with the gate state and the time interval for the entry number
specified". We don't touch that register (through vsc9959_tas_gcl_set())
from vsc9959_tas_clock_adjust(), yet the GCL RAM is updated anyway.
It seems to be updated with effectively stale memory, which in my
testing can hold a variety of things, including even pieces of the
previously applied schedule, for particular schedule lengths.
As such, in most circumstances it is very difficult to pinpoint this
issue, because the newly updated schedule would "behave strangely",
but ultimately might still pass traffic to some extent, due to some
gate entries still being present in the stale GCL entry RAM. It is easy
to miss.
With the particular schedule given at the beginning, the GCL RAM
"happens" to be reproducibly rewritten with all zeroes, and this is
consistent with what we see: when the time-aware shaper has gate entries
with all gates closed, traffic is dropped on TX, no wonder we can't
retrieve TX timestamps.
Rewriting the GCL entry RAM when reapplying the new base time fixes the
observed issue.
Fixes: 8670dc33f48b ("net: dsa: felix: update base time of time-aware shaper when adjusting PTP time")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250426144859.3128352-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the mtk_poll_rx() function detects the MTK_RESETTING flag, it will
jump to release_desc and refill the high word of the SDP on the 4GB RFB.
Subsequently, mtk_rx_clean will process an incorrect SDP, leading to a
panic.
Add patch from MediaTek's SDK to resolve this.
Fixes: 2d75891ebc09 ("net: ethernet: mtk_eth_soc: support 36-bit DMA addressing on MT7988")
Link: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/71f47ea785699c6aa3b922d66c2bdc1a43da25b1
Signed-off-by: Chad Monroe <chad@monroe.io>
Link: https://patch.msgid.link/4adc2aaeb0fb1b9cdc56bf21cf8e7fa328daa345.1745715843.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 1a931c4f5e68 ("igc: add lock preventing multiple simultaneous PTM
transactions") added a new mutex to protect concurrent PTM transactions.
This lock is acquired in igc_ptp_reset() in order to ensure the PTM
registers are properly disabled after a device reset.
The flow where the lock is acquired already holds a spinlock, so acquiring
a mutex leads to a sleep-while-locking bug, reported both by smatch,
and the kernel test robot.
The critical section in igc_ptp_reset() does correctly use the
readx_poll_timeout_atomic variants, but the standard PTM flow uses regular
sleeping variants. This makes converting the mutex to a spinlock a bit
tricky.
Instead, re-order the locking in igc_ptp_reset. Acquire the mutex first,
and then the tmreg_lock spinlock. This is safe because there is no other
ordering dependency on these locks, as this is the only place where both
locks were acquired simultaneously. Indeed, any other flow acquiring locks
in that order would be wrong regardless.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Fixes: 1a931c4f5e68 ("igc: add lock preventing multiple simultaneous PTM transactions")
Link: https://lore.kernel.org/intel-wired-lan/Z_-P-Hc1yxcw0lTB@stanley.mountain/
Link: https://lore.kernel.org/intel-wired-lan/202504211511.f7738f5d-lkp@intel.com/T/#u
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|