Age | Commit message (Collapse) | Author |
|
NPAR (Network interface card partitioning)[1] 1.2 adds a transparent
VLAN tag for all packets between the NIC and the switch. Because of
that, RX VLAN acceleration cannot be supported for any additional
host configured VLANs. The driver has to acknowledge that it can
support no RX VLAN acceleration and set the NPAR 1.2 supported flag
when registering with the FW. Otherwise, the FW call will fail and
the driver will abort on these NPAR 1.2 NICs with this error:
bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2
[1] https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/introduction/features/network-partitioning-npar.html
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250213011240.1640031-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
GCC can see that the value range for "order" is capped, but this leads
it to consider that it might be negative, leading to a false positive
warning (with GCC 15 with -Warray-bounds -fdiagnostics-details):
../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=]
691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o);
| ~~~~~~~~~~~^~~
'mlx4_alloc_db_from_pgdir': events 1-2
691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | | | (2) out of array bounds here
| (1) when the condition is evaluated to true In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53,
from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42:
../include/linux/mlx4/device.h:664:33: note: while referencing 'bits'
664 | unsigned long *bits[2];
| ^~~~
Switch the argument to unsigned int, which removes the compiler needing
to consider negative values.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
functions
Currently disabled EEE modes are shown as supported in ethtool.
Change this by filtering them out when populating data->supported
in genphy_c45_ethtool_get_eee.
Disabled EEE modes are silently filtered out by genphy_c45_write_eee_adv.
This is planned to be removed, therefore ensure in
genphy_c45_ethtool_set_eee that disabled EEE modes are removed from the
user space provided EEE advertisement. For now keep the current behavior
to do this silently.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/5187c86d-9a5a-482c-974f-cc103ce9738c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If pci_alloc_irq_vectors() can't allocate the minimum number of vectors
then it returns -ENOSPC so there is no need to check for that in the
caller. In fact, because pf->msix.min is an unsigned int, it means that
any negative error codes are type promoted to high positive values and
treated as success. So here, the "return -ENOMEM;" is unreachable code.
Check for negatives instead.
Now that we're only dealing with error codes, it's easier to propagate
the error code from pci_alloc_irq_vectors() instead of hardcoding
-ENOMEM.
Fixes: 79d97b8cf9a8 ("ice: remove splitting MSI-X between features")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/b16e4f01-4c85-46e2-b602-fce529293559@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As all PCS are using the neg_mode parameter rather than the legacy
an_mode, remove the ability to use the legacy an_mode. We remove the
tests in the phylink code, unconditionally passing the PCS neg_mode
parameter to PCS methods, and remove setting the flag from drivers.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tidPn-0040hd-2R@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Document the existence of persistent per-NAPI configuration space and
the API that drivers can opt into.
Update stale documentation which suggested that NAPI IDs cannot be
queried from userspace.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250213191535.38792-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Heiner Kallweit says:
====================
net: phy: clean up phy.h
This series is a starting point to clean up phy.h and remove
definitions which are phylib-internal.
====================
Link: https://patch.msgid.link/d14f8a69-dc21-4ff7-8401-574ffe2f4bc5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Helper phy_is_internal() is just used in two places phylib-internally.
So let's remove it from the API.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/f3f35265-80a9-4ed7-ad78-ae22c21e288b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
phy_queue_state_machine() isn't used outside phy.c,
so stop exporting it.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/16986d3d-7baf-4b02-a641-e2916d491264@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Stop exporting feature arrays which aren't used outside phylib.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/01886672-4880-4ca8-b7b0-94d40f6e0ec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
outside phylib
Certain fixup-related definitions aren't used outside phy_device.c.
So make them private and remove them from phy.h.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/ea6fde13-9183-4c7c-8434-6c0eb64fc72c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Heiner Kallweit says:
====================
net: phy: realtek: improve MMD register access for internal PHY's
The integrated PHYs on chip versions from RTL8168g allow to address
MDIO_MMD_VEND2 registers. All c22 standard registers are mapped to
MDIO_MMD_VEND2 registers. So far the paging mechanism is used to
address PHY registers. Add support for c45 ops to address MDIO_MMD_VEND2
registers directly, w/o the paging.
====================
Link: https://patch.msgid.link/c6a969ef-fd7f-48d6-8c48-4bc548831a8d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MDIO bus provided by r8169 for the internal PHY's now supports
c45 ops for the MDIO_MMD_VEND2 device. So we can switch to standard
MMD ops here.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/81416f95-0fac-4225-87b4-828e3738b8ed@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
r8169 provides the MDIO bus for the internal PHY's. It has been extended
with c45 access functions for addressing MDIO_MMD_VEND2 registers.
So we can switch from paged access to directly addressing the
MDIO_MMD_VEND2 registers.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/a5f2333c-dda9-48ad-9801-77049766e632@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The integrated PHYs on chip versions from RTL8168g allow to address
MDIO_MMD_VEND2 registers. All c22 standard registers are mapped to
MDIO_MMD_VEND2 registers. So far the paging mechanism is used to
address PHY registers. Add support for c45 ops to address MDIO_MMD_VEND2
registers directly, w/o the paging.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/d6f97eaa-0f13-468f-89cb-75a41087bc4a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Update a BUILD_BUG_ON() usage that works on current compilers, but
breaks compilation on gcc 5.3.1 (Alex Williamson)
- Avoid use of FLR for Mediatek MT7922 WiFi; the device previously
worked after a long timeout and fallback to SBR, but after a recent
RRS change it doesn't work at all after FLR (Bjorn Helgaas)
* tag 'pci-v6.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Avoid FLR for Mediatek MT7922 WiFi
PCI: Fix BUILD_BUG_ON usage for old gcc
|
|
KVM fixes for 6.14 part 1
- Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated
by KVM to fix a NULL pointer dereference.
- Enter guest mode (L2) from KVM's perspective before initializing the vCPU's
nested NPT MMU so that the MMU is properly tagged for L2, not L1.
- Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the
guest's value may be stale if a VM-Exit is handled in the fastpath.
|
|
Fix issues with enabling SNP host support and effectively SNP support
which is broken with respect to the KVM module being built-in.
SNP host support is enabled in snp_rmptable_init() which is invoked as
device_initcall(). SNP check on IOMMU is done during IOMMU PCI init
(IOMMU_PCI_INIT stage). And for that reason snp_rmptable_init() is
currently invoked via device_initcall() and cannot be invoked via
subsys_initcall() as core IOMMU subsystem gets initialized via
subsys_initcall().
Now, if kvm_amd module is built-in, it gets initialized before SNP host
support is enabled in snp_rmptable_init() :
[ 10.131811] kvm_amd: TSC scaling supported
[ 10.136384] kvm_amd: Nested Virtualization enabled
[ 10.141734] kvm_amd: Nested Paging enabled
[ 10.146304] kvm_amd: LBR virtualization supported
[ 10.151557] kvm_amd: SEV enabled (ASIDs 100 - 509)
[ 10.156905] kvm_amd: SEV-ES enabled (ASIDs 1 - 99)
[ 10.162256] kvm_amd: SEV-SNP enabled (ASIDs 1 - 99)
[ 10.171508] kvm_amd: Virtual VMLOAD VMSAVE supported
[ 10.177052] kvm_amd: Virtual GIF supported
...
...
[ 10.201648] kvm_amd: in svm_enable_virtualization_cpu
And then svm_x86_ops->enable_virtualization_cpu()
(svm_enable_virtualization_cpu) programs MSR_VM_HSAVE_PA as following:
wrmsrl(MSR_VM_HSAVE_PA, sd->save_area_pa);
So VM_HSAVE_PA is non-zero before SNP support is enabled on all CPUs.
snp_rmptable_init() gets invoked after svm_enable_virtualization_cpu()
as following :
...
[ 11.256138] kvm_amd: in svm_enable_virtualization_cpu
...
[ 11.264918] SEV-SNP: in snp_rmptable_init
This triggers a #GP exception in snp_rmptable_init() when snp_enable()
is invoked to set SNP_EN in SYSCFG MSR:
[ 11.294289] unchecked MSR access error: WRMSR to 0xc0010010 (tried to write 0x0000000003fc0000) at rIP: 0xffffffffaf5d5c28 (native_write_msr+0x8/0x30)
...
[ 11.294404] Call Trace:
[ 11.294482] <IRQ>
[ 11.294513] ? show_stack_regs+0x26/0x30
[ 11.294522] ? ex_handler_msr+0x10f/0x180
[ 11.294529] ? search_extable+0x2b/0x40
[ 11.294538] ? fixup_exception+0x2dd/0x340
[ 11.294542] ? exc_general_protection+0x14f/0x440
[ 11.294550] ? asm_exc_general_protection+0x2b/0x30
[ 11.294557] ? __pfx_snp_enable+0x10/0x10
[ 11.294567] ? native_write_msr+0x8/0x30
[ 11.294570] ? __snp_enable+0x5d/0x70
[ 11.294575] snp_enable+0x19/0x20
[ 11.294578] __flush_smp_call_function_queue+0x9c/0x3a0
[ 11.294586] generic_smp_call_function_single_interrupt+0x17/0x20
[ 11.294589] __sysvec_call_function+0x20/0x90
[ 11.294596] sysvec_call_function+0x80/0xb0
[ 11.294601] </IRQ>
[ 11.294603] <TASK>
[ 11.294605] asm_sysvec_call_function+0x1f/0x30
...
[ 11.294631] arch_cpu_idle+0xd/0x20
[ 11.294633] default_idle_call+0x34/0xd0
[ 11.294636] do_idle+0x1f1/0x230
[ 11.294643] ? complete+0x71/0x80
[ 11.294649] cpu_startup_entry+0x30/0x40
[ 11.294652] start_secondary+0x12d/0x160
[ 11.294655] common_startup_64+0x13e/0x141
[ 11.294662] </TASK>
This #GP exception is getting triggered due to the following errata for
AMD family 19h Models 10h-1Fh Processors:
Processor may generate spurious #GP(0) Exception on WRMSR instruction:
Description:
The Processor will generate a spurious #GP(0) Exception on a WRMSR
instruction if the following conditions are all met:
- the target of the WRMSR is a SYSCFG register.
- the write changes the value of SYSCFG.SNPEn from 0 to 1.
- One of the threads that share the physical core has a non-zero
value in the VM_HSAVE_PA MSR.
The document being referred to above:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf
To summarize, with kvm_amd module being built-in, KVM/SVM initialization
happens before host SNP is enabled and this SVM initialization
sets VM_HSAVE_PA to non-zero, which then triggers a #GP when
SYSCFG.SNPEn is being set and this will subsequently cause
SNP_INIT(_EX) to fail with INVALID_CONFIG error as SYSCFG[SnpEn] is not
set on all CPUs.
Essentially SNP host enabling code should be invoked before KVM
initialization, which is currently not the case when KVM is built-in.
Add fix to call snp_rmptable_init() early from iommu_snp_enable()
directly and not invoked via device_initcall() which enables SNP host
support before KVM initialization with kvm_amd module built-in.
Add additional handling for `iommu=off` or `amd_iommu=off` options.
Note that IOMMUs need to be enabled for SNP initialization, therefore,
if host SNP support is enabled but late IOMMU initialization fails
then that will cause PSP driver's SNP_INIT to fail as IOMMU SNP sanity
checks in SNP firmware will fail with invalid configuration error as
below:
[ 9.723114] ccp 0000:23:00.1: sev enabled
[ 9.727602] ccp 0000:23:00.1: psp enabled
[ 9.732527] ccp 0000:a2:00.1: enabling device (0000 -> 0002)
[ 9.739098] ccp 0000:a2:00.1: no command queues available
[ 9.745167] ccp 0000:a2:00.1: psp enabled
[ 9.805337] ccp 0000:23:00.1: SEV-SNP: failed to INIT rc -5, error 0x3
[ 9.866426] ccp 0000:23:00.1: SEV API:1.53 build:5
Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Message-ID: <138b520fb83964782303b43ade4369cd181fdd9c.1739226950.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The kernel's initcall infrastructure lacks the ability to express
dependencies between initcalls, whereas the modules infrastructure
automatically handles dependencies via symbol loading. Ensure the
PSP SEV driver is initialized before proceeding in sev_hardware_setup()
if KVM is built-in as the dependency isn't handled by the initcall
infrastructure.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Message-ID: <f78ddb64087df27e7bcb1ae0ab53f55aa0804fab.1739226950.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM is dependent on the PSP SEV driver and PSP SEV driver needs to be
loaded before KVM module. In case of module loading any dependent
modules are automatically loaded but in case of built-in modules there
is no inherent mechanism available to specify dependencies between
modules and ensure that any dependent modules are loaded implicitly.
Add a new external API interface for PSP module initialization which
allows PSP SEV driver to be loaded explicitly if KVM is built-in.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-ID: <15279ca0cad56a07cf12834ec544310f85ff5edc.1739226950.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.14, take #2
- Large set of fixes for vector handling, specially in the interactions
between host and guest state. This fixes a number of bugs affecting
actual deployments, and greatly simplifies the FP/SIMD/SVE handling.
Thanks to Mark Rutland for dealing with this thankless task.
- Fix an ugly race between vcpu and vgic creation/init, resulting in
unexpected behaviours.
- Fix use of kernel VAs at EL2 when emulating timers with nVHE.
- Small set of pKVM improvements and cleanups.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Fix a regression caused by an inadvertent change of the
THERMAL_GENL_ATTR_CPU_CAPABILITY value in one of the recent thermal
commits (Zhang Rui) and drop a stale piece of documentation (Daniel
Lezcano)"
* tag 'thermal-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/cpufreq_cooling: Remove structure member documentation
thermal/netlink: Prevent userspace segmentation fault by adjusting UAPI header
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- mtk-sd: Fix register settings for hs400(es) mode
- sdhci_am654: Revert patch for start-signal-voltage-switch
* tag 'mmc-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mtk-sd: Fix register settings for hs400(es) mode
Revert "mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch"
|
|
Pull smb client fix from Steve French:
"SMB3 client multichannel fix"
* tag 'v6.14-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: pick channels for individual subrequests
|
|
In bvec_split_segs(), max_bytes is an unsigned, so it must be less than
or equal to UINT_MAX. Remove the unnecessary min().
Prior to commit 67927d220150 ("block/merge: count bytes instead of
sectors"), the min() was with UINT_MAX >> 9, so it did have an effect.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250214193637.234702-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When io_uring submission goes async for the first time on a given task,
we'll try to create a worker thread to handle the submission. Creating
this worker thread can fail due to various transient conditions, such as
an outstanding signal in the forking thread, so we have retry logic with
a limit of 3 retries. However, this retry logic appears to be too
aggressive/fast - we've observed a thread blowing through the retry
limit while having the same outstanding signal the whole time. Here's an
excerpt of some tracing that demonstrates the issue:
First, signal 26 is generated for the process. It ends up getting routed
to thread 92942.
0) cbd-92284 /* signal_generate: sig=26 errno=0 code=-2 comm=psblkdASD pid=92934 grp=1 res=0 */
This causes create_io_thread in the signalled thread to fail with
ERESTARTNOINTR, and thus a retry is queued.
13) task_th-92942 /* io_uring_queue_async_work: ring 000000007325c9ae, request 0000000080c96d8e, user_data 0x0, opcode URING_CMD, flags 0x8240001, normal queue, work 000000006e96dd3f */
13) task_th-92942 io_wq_enqueue() {
13) task_th-92942 _raw_spin_lock();
13) task_th-92942 io_wq_activate_free_worker();
13) task_th-92942 _raw_spin_lock();
13) task_th-92942 create_io_worker() {
13) task_th-92942 __kmalloc_cache_noprof();
13) task_th-92942 __init_swait_queue_head();
13) task_th-92942 kprobe_ftrace_handler() {
13) task_th-92942 get_kprobe();
13) task_th-92942 aggr_pre_handler() {
13) task_th-92942 pre_handler_kretprobe();
13) task_th-92942 /* create_enter: (create_io_thread+0x0/0x50) fn=0xffffffff8172c0e0 arg=0xffff888996bb69c0 node=-1 */
13) task_th-92942 } /* aggr_pre_handler */
...
13) task_th-92942 } /* copy_process */
13) task_th-92942 } /* create_io_thread */
13) task_th-92942 kretprobe_rethook_handler() {
13) task_th-92942 /* create_exit: (create_io_worker+0x8a/0x1a0 <- create_io_thread) arg1=0xfffffffffffffdff */
13) task_th-92942 } /* kretprobe_rethook_handler */
13) task_th-92942 queue_work_on() {
...
The CPU is then handed to a kworker to process the queued retry:
------------------------------------------
13) task_th-92942 => kworker-54154
------------------------------------------
13) kworker-54154 io_workqueue_create() {
13) kworker-54154 io_queue_worker_create() {
13) kworker-54154 task_work_add() {
13) kworker-54154 wake_up_state() {
13) kworker-54154 try_to_wake_up() {
13) kworker-54154 _raw_spin_lock_irqsave();
13) kworker-54154 _raw_spin_unlock_irqrestore();
13) kworker-54154 } /* try_to_wake_up */
13) kworker-54154 } /* wake_up_state */
13) kworker-54154 kick_process();
13) kworker-54154 } /* task_work_add */
13) kworker-54154 } /* io_queue_worker_create */
13) kworker-54154 } /* io_workqueue_create */
And then we immediately switch back to the original task to try creating
a worker again. This fails, because the original task still hasn't
handled its signal.
-----------------------------------------
13) kworker-54154 => task_th-92942
------------------------------------------
13) task_th-92942 create_worker_cont() {
13) task_th-92942 kprobe_ftrace_handler() {
13) task_th-92942 get_kprobe();
13) task_th-92942 aggr_pre_handler() {
13) task_th-92942 pre_handler_kretprobe();
13) task_th-92942 /* create_enter: (create_io_thread+0x0/0x50) fn=0xffffffff8172c0e0 arg=0xffff888996bb69c0 node=-1 */
13) task_th-92942 } /* aggr_pre_handler */
13) task_th-92942 } /* kprobe_ftrace_handler */
13) task_th-92942 create_io_thread() {
13) task_th-92942 copy_process() {
13) task_th-92942 task_active_pid_ns();
13) task_th-92942 _raw_spin_lock_irq();
13) task_th-92942 recalc_sigpending();
13) task_th-92942 _raw_spin_lock_irq();
13) task_th-92942 } /* copy_process */
13) task_th-92942 } /* create_io_thread */
13) task_th-92942 kretprobe_rethook_handler() {
13) task_th-92942 /* create_exit: (create_worker_cont+0x35/0x1b0 <- create_io_thread) arg1=0xfffffffffffffdff */
13) task_th-92942 } /* kretprobe_rethook_handler */
13) task_th-92942 io_worker_release();
13) task_th-92942 queue_work_on() {
13) task_th-92942 clear_pending_if_disabled();
13) task_th-92942 __queue_work() {
13) task_th-92942 } /* __queue_work */
13) task_th-92942 } /* queue_work_on */
13) task_th-92942 } /* create_worker_cont */
The pattern repeats another couple times until we blow through the retry
counter, at which point we give up. All outstanding work is canceled,
and the io_uring command which triggered all this is failed with
ECANCELED:
13) task_th-92942 io_acct_cancel_pending_work() {
...
13) task_th-92942 /* io_uring_complete: ring 000000007325c9ae, req 0000000080c96d8e, user_data 0x0, result -125, cflags 0x0 extra1 0 extra2 0 */
Finally, the task gets around to processing its outstanding signal 26,
but it's too late.
13) task_th-92942 /* signal_deliver: sig=26 errno=0 code=-2 sa_handler=59566a0 sa_flags=14000000 */
Try to address this issue by adding a small scaling delay when retrying
worker creation. This should give the forking thread time to handle its
signal in the above case. This isn't a particularly satisfying solution,
as sufficiently paradoxical scheduling would still have us hitting the
same issue, and I'm open to suggestions for something better. But this
is likely to prevent this (already rare) issue from hitting in practice.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Link: https://lore.kernel.org/r/20250208-wq_retry-v2-1-4f6f5041d303@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"Take the newly introduced EFI_MEMORY_HOT_PLUGGABLE memory attribute
into account when placing the kernel image in memory at boot.
Otherwise, the presence of the kernel image could prevent such a
memory region from being unplugged at runtime if it was 'cold
plugged', i.e., already plugged in at boot time (and exposed via the
EFI memory map).
This should ensure that the new EFI_MEMORY_HOT_PLUGGABLE memory
attribute is used consistently by Linux before it ever turns up in
production, ensuring that we can make meaningful use of it without
running the risk of regressing existing users"
* tag 'efi-fixes-for-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: Use BIT_ULL() constants for memory attributes
efi: Avoid cold plugged memory for placing the kernel
|
|
Russell King says:
====================
net: phylink,xpcs,stmmac: support PCS EEE configuration
This series adds support for phylink managed EEE at the PCS level,
allowing xpcs_config_eee() to be removed. Sadly, we still end up with
a XPCS specific function to configure the clock multiplier.
====================
Link: https://patch.msgid.link/Z6naiPpxfxGr1Ic6@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move xpcs_config_eee() with the other EEE-related functions.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQd-003w7a-MM@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is now no need to pass the mult_fact into xpcs_config_eee(), so
let's remove that argument and use xpcs->eee_mult_fact directly. While
changing the function signature, as we pass true/false for enable, use
"bool" instead of "int" for this.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQY-003w7U-IG@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make xpcs_config_eee() private to the XPCS driver, called only from
the phylink pcs_disable_eee() and pcs_enable_eee() methods.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQT-003w7O-Ec@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the explicit calls to xpcs_config_eee() from the stmmac driver,
preferring instead for phylink to manage the EEE configuration at the
PCS.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQO-003w7I-Ap@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert XPCS to use the new pcs_disable_eee() and pcs_enable_eee()
methods. Since stmmac is the only user of xpcs_config_eee(), we can
make this a no-op along with this change.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQJ-003w7C-6v@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Arrange for stmmac to call the new xpcs_config_eee_mult_fact() function
to configure the EEE clock multiplying factor. This will allow the
removal of the xpcs_config_eee() calls in the next patch.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQE-003w76-3C@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a function to separate out the EEE clock multiplying factor. This
will be called by the stmmac driver to configure this value.
It would have been better had the driver used the CLK API to retrieve
this clock, get its rate and calculate the appropriate multiplier, but
that door has closed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQ8-003w70-VT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are hooks in the stmmac driver into XPCS to control the EEE
settings when LPI is configured at the MAC. This bypasses the layering.
To allow this to be removed from the stmmac driver, add two new
methods for PCS to inform them when the LPI/EEE enablement state
changes at the MAC.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thRQ3-003w6u-RH@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
inet: better inet_sock_set_state() for passive flows
Small series to make inet_sock_set_state() more interesting for
LISTEN -> TCP_SYN_RECV changes : The 4-tuple parts are now correct.
First patch is a cleanup.
====================
Link: https://patch.msgid.link/20250212131328.1514243-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Current inet_sock_set_state trace from inet_csk_clone_lock() is missing
many details :
... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
sport=4901 dport=0 \
saddr=127.0.0.6 daddr=0.0.0.0 \
saddrv6=:: daddrv6=:: \
oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
Only the sport gives the listener port, no other parts of the n-tuple are correct.
In this patch, I initialize relevant fields of the new socket before
calling inet_sk_set_state(newsk, TCP_SYN_RECV).
We now have a trace including all the source/destination bits.
... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
sport=4901 dport=47648 \
saddr=127.0.0.6 daddr=127.0.0.6 \
saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250212131328.1514243-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Return early from inet_csk_clone_lock() if the socket
allocation failed, to reduce the indentation level.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250212131328.1514243-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When using the Qualcomm X55 modem on the ThinkPad X13s, the kernel log is
constantly being filled with errors related to a "sequence number glitch",
e.g.:
[ 1903.284538] sequence number glitch prev=16 curr=0
[ 1913.812205] sequence number glitch prev=50 curr=0
[ 1923.698219] sequence number glitch prev=142 curr=0
[ 2029.248276] sequence number glitch prev=1555 curr=0
[ 2046.333059] sequence number glitch prev=70 curr=0
[ 2076.520067] sequence number glitch prev=272 curr=0
[ 2158.704202] sequence number glitch prev=2655 curr=0
[ 2218.530776] sequence number glitch prev=2349 curr=0
[ 2225.579092] sequence number glitch prev=6 curr=0
Internet connectivity is working fine, so this error seems harmless. It
looks like modem does not preserve the sequence number when entering low
power state; the amount of errors depends on how actively the modem is
being used.
A similar issue has also been seen on USB-based MBIM modems [1]. However,
in cdc_ncm.c the "sequence number glitch" message is a debug message
instead of an error. Apply the same to the mhi_wwan_mbim.c driver to
silence these errors when using the modem.
[1]: https://lists.freedesktop.org/archives/libmbim-devel/2016-November/000781.html
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250212-mhi-wwan-mbim-sequence-glitch-v1-1-503735977cbd@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor clock management in EQoS driver for code reuse and to avoid
redundancy. This way, only minimal changes are required when a new platform
is added.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Swathi K S <swathi.ks@samsung.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250213041559.106111-1-swathi.ks@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For GSO packets, skb_cow_head() will reallocate the skb for TSO header
cloned skbs in airoha_dev_xmit(). For this reason, sinfo pointer can be
no more valid. Fix the issue relying on skb_shinfo() macro directly in
airoha_dev_xmit().
The problem exists since
commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
but it is not a user visible, since we can't currently enable TSO
for DSA user ports since we are missing to initialize net_device
vlan_features field.
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250213-airoha-en7581-flowtable-offload-v4-1-b69ca16d74db@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Updating MAINTAINERS to include active contributers.
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://patch.msgid.link/20250213184523.2002582-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
net: add EXPORT_IPV6_MOD()
In this series I am adding EXPORT_IPV6_MOD and EXPORT_IPV6_MOD_GPL()
so that we can replace some EXPORT_SYMBOL() when IPV6 is
not modular.
This is making all the selected symbols internal to core
linux networking.
v1: https://lore.kernel.org/20250210082805.465241-2-edumazet@google.com
====================
Link: https://patch.msgid.link/20250212132418.1524422-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need
to be exported unless CONFIG_IPV6=m
udp_table is no longer used from any modules, and does not
need to be exported anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need
to be exported unless CONFIG_IPV6=m
tcp_hashinfo and tcp_openreq_init_rwin() are no longer
used from any module anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use EXPORT_IPV6_MOD[_GPL]() for symbols that do not need to
to be exported unless CONFIG_IPV6=m
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have many EXPORT_SYMBOL(x) in networking tree because IPv6
can be built as a module.
CONFIG_IPV6=y is becoming the norm.
Define a EXPORT_IPV6_MOD(x) which only exports x
for modular IPv6.
Same principle applies to EXPORT_IPV6_MOD_GPL()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250212132418.1524422-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The goal is for me to get a kernel.org account and then send pull requests
in order to relieve some pressure from Palmer and make our workflow
smoother.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Enthusiastically-Supported-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241212131134.288819-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The init_rt_signal_env() funciton is called before the alternative patch
is applied, so using the alternative-related API to check the availability
of an extension within this function doesn't have the intended effect.
This patch reorders the init_rt_signal_env() and apply_boot_alternatives()
to get the correct signal_minsigstksz.
Fixes: e92f469b0771 ("riscv: signal: Report signal frame size to userspace via auxv")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Andy Chiu <andybnac@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241220083926.19453-3-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|