The PCIe specification allows three attention indicator states, on, off,
and blink. Enable all three states instead of basic on / off control.
This changes the userspace API (writes to the sysfs "attention" file) to
match the behavior of pciehp. Here's the comparison of previous and new
indicator behavior:
Value  Previous  New Behavior
-----  --------  ------------------------
0      off       (reserved, so undefined)
1      on        on
2      on        blink
3      on        off
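
The new values in the table above match the encoding of the Attention Indicator Control field in the PCIe Slot Control register. The following is a minimal userspace sketch of decoding a value written to the sysfs "attention" file. The enum and function names are hypothetical and are not taken from the pnv_php driver.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not the driver's own identifiers. */
enum attn_state {
	ATTN_RESERVED = 0,	/* reserved by the spec, so behavior is undefined */
	ATTN_ON       = 1,
	ATTN_BLINK    = 2,
	ATTN_OFF      = 3,
};

/* Return a label for the value written to "attention", or NULL if out of range. */
static const char *attn_state_name(unsigned int value)
{
	switch (value) {
	case ATTN_RESERVED: return "reserved";
	case ATTN_ON:       return "on";
	case ATTN_BLINK:    return "blink";
	case ATTN_OFF:      return "off";
	default:            return NULL;
	}
}
```

Scripts that used to write any nonzero value to turn the indicator on would, under this mapping, get blink for 2 and off for 3.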
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: add specifics of behavior change]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1210309411.1359866.1752615582001.JavaMail.zimbra@raptorengineeringinc.com
|
|
The existing PowerNV hotplug code did not handle surprise plug events
correctly, leading to a complete failure of the hotplug system after device
removal and a required reboot to detect new devices.
This comes down to two issues:
1) When a device is surprise removed, often the bridge upstream
port will cause a PE freeze on the PHB. If this freeze is not
cleared, the MSI interrupts from the bridge hotplug notification
logic will not be received by the kernel, stalling all plug events
on all slots associated with the PE.
2) When a device is removed from a slot, regardless of surprise or
programmatic removal, the associated PHB/PE is left frozen.
If this freeze is not cleared via a fundamental reset, skiboot
is unable to clear the freeze and cannot retrain / rescan the
slot. This also requires a reboot to clear the freeze and redetect
the device in the slot.
Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/171044224.1359864.1752615546988.JavaMail.zimbra@raptorengineeringinc.com
|
|
Multiple race conditions existed between the PCIe hotplug driver and the
EEH driver, leading to a variety of kernel oopses of the same general
nature:
<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>
A second class of oops is also seen when the underlying bus disappears
during device recovery.
Refactor the EEH module to be PCI rescan and remove safe. Also clean
up a few minor formatting / readability issues.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1334208367.1359861.1752615503144.JavaMail.zimbra@raptorengineeringinc.com
|
|
The PowerNV hotplug driver needs to be able to clear any frozen PE(s)
on the PHB after surprise removal of a downstream device.
Export the eeh_unfreeze_pe() symbol to allow implementation of this
functionality in the php_nv module.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1778535414.1359858.1752615454618.JavaMail.zimbra@raptorengineeringinc.com
|
|
The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
was observed to incorrectly assert the Presence Detect Set bit in its
capabilities when tested on a Raptor Computing Systems Blackbird system,
resulting in the hot insert path never attempting a rescan of the bus
and any downstream devices not being re-detected.
Work around this by additionally checking whether the PCIe data link is
active or not when performing presence detection on downstream switches'
ports, similar to the pciehp_hpc.c driver.
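
A rough sketch of the combined check, assuming the workaround treats a slot as occupied when either Presence Detect State or Data Link Layer Link Active is set (the approach pciehp takes). The helper name is hypothetical; the bit masks mirror PCI_EXP_SLTSTA_PDS and PCI_EXP_LNKSTA_DLLLA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks mirroring PCI_EXP_SLTSTA_PDS and PCI_EXP_LNKSTA_DLLLA. */
#define SLTSTA_PDS	0x0040	/* Slot Status: Presence Detect State */
#define LNKSTA_DLLLA	0x2000	/* Link Status: Data Link Layer Link Active */

/* Hypothetical helper: the slot counts as occupied if either signal says so. */
static bool slot_occupied(uint16_t slot_status, uint16_t link_status)
{
	return (slot_status & SLTSTA_PDS) || (link_status & LNKSTA_DLLLA);
}
```

With this logic, a port that reports no presence but has an active link is still rescanned.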
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/505981576.1359853.1752615415117.JavaMail.zimbra@raptorengineeringinc.com
|
|
When the root of a nested PCIe bridge configuration is unplugged, the
pnv_php driver leaked the allocated IRQ resources for the child bridges'
hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all their IRQ resources
before calling pci_hp_remove_devices().
Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot(), instead of
pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will
now be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.
The abridged kernel panic that occurs without this patch is as follows:
WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
Call Trace:
msi_device_data_release+0x34/0x9c (unreliable)
release_nodes+0x64/0x13c
devres_release_all+0xc0/0x140
device_del+0x2d4/0x46c
pci_destroy_dev+0x5c/0x194
pci_hp_remove_devices+0x90/0x128
pci_hp_remove_devices+0x44/0x128
pnv_php_disable_slot+0x54/0xd4
power_write_file+0xf8/0x18c
pci_slot_attr_store+0x40/0x5c
sysfs_kf_write+0x64/0x78
kernfs_fop_write_iter+0x1b0/0x290
vfs_write+0x3bc/0x50c
ksys_write+0x84/0x140
system_call_exception+0x124/0x230
system_call_vectored_common+0x15c/0x2ec
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com
|
|
When building with CONFIG_RELOCATABLE, there is a .interp section
which contains the name of the expected ELF interpreter:
Contents of section .interp:
c0000000021c1bac 2f757372 2f6c6962 2f6c642e 736f2e31 /usr/lib/ld.so.1
c0000000021c1bbc 00 .
That information is useless and even likely wrong. Remove it.
Link: https://github.com/linuxppc/issues/issues/434
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/eeaf8fd6628a75d19872ab31cf7e7179e2baef5e.1751366959.git.christophe.leroy@csgroup.eu
|
|
The FSF no longer resides on Franklin Street, so we should not ask
people to write to this address. Fortunately, these header files already
contain a proper SPDX license identifier, so it should be fine to simply
drop all of this license boilerplate here.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250711072553.198777-1-thuth@redhat.com
|
|
In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.
Switch to the regular pointer formatting which is safer and
easier to reason about.
Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250718-restricted-pointers-powerpc-v2-1-fd7bddd809f3@linutronix.de
|
|
This option was removed from the Kconfig in commit
8c710f75256b ("net/sched: Retire tcindex classifier") but it was not
removed from the defconfigs.
Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier")
Signed-off-by: Johan Korsnes <johan.korsnes@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250323191116.113482-1-johan.korsnes@gmail.com
|
|
Replace scnprintf() with sysfs_emit() in sysfs show functions.
These helpers are preferred in sysfs callbacks because they automatically
handle buffer size and improve safety and readability.
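
To illustrate why the size argument disappears, here is a userspace model of a sysfs_emit()-style helper. It is only a sketch: the function name and the 4096-byte page size are assumptions for illustration, and the real helper lives in the kernel.

```c
#include <stdarg.h>
#include <stdio.h>

#define MODEL_PAGE_SIZE 4096	/* assumption: the show() buffer is one page */

/*
 * Userspace model of a sysfs_emit()-style helper: the caller does not pass
 * a size because the helper knows the buffer is one page long.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static int model_sysfs_emit(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, MODEL_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= MODEL_PAGE_SIZE)
		len = MODEL_PAGE_SIZE - 1;	/* output was truncated */
	return len;
}
```

A show function then reduces to a single call such as model_sysfs_emit(buf, "%d\n", value), with no size to get wrong.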
Signed-off-by: Ankit Chauhan <ankitchauhan2065@gmail.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250620024705.11321-1-ankitchauhan2065@gmail.com
|
|
On PLPKS enabled PowerVM LPAR, there is no provision to load signed
third-party kernel modules when the key management mode is static. This
is because keys from secure boot secvars are only loaded when the key
management mode is dynamic.
Allow loading of the trustedcadb and moduledb keys even in the static
key management mode, where the secvar format string takes the form
"ibm,plpks-sb-v0".
Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com>
Tested-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610211907.101384-4-ssrish@linux.ibm.com
|
|
The PLPKS enabled PowerVM LPAR sysfs exposes all of the secure boot
secvars irrespective of the key management mode.
The PowerVM LPAR supports static and dynamic key management for secure
boot. The key management option can be updated in the management
console. The secvars PK, trustedcadb, and moduledb can be consumed both
in the static and dynamic key management modes for the loading of signed
third-party kernel modules. However, other secvars i.e. KEK, grubdb,
grubdbx, sbat, db and dbx, which are used to verify the grub and kernel
images, are consumed only in the dynamic key management mode.
Expose only PK, trustedcadb, and moduledb in the static key management
mode.
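
The filtering rule can be sketched as follows. This is an illustration only, with hypothetical names; it assumes dynamic mode exposes every secvar and static mode exposes just the three named above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Secvars the commit message says remain usable in static mode. */
static const char *const static_mode_secvars[] = {
	"PK", "trustedcadb", "moduledb",
};

/* Hypothetical helper: should 'name' be exposed for the given mode? */
static bool secvar_is_exposed(const char *name, bool dynamic_mode)
{
	size_t count = sizeof(static_mode_secvars) / sizeof(static_mode_secvars[0]);
	size_t i;

	if (dynamic_mode)
		return true;	/* dynamic mode exposes every secvar */

	for (i = 0; i < count; i++) {
		if (strcmp(name, static_mode_secvars[i]) == 0)
			return true;
	}
	return false;
}
```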
Co-developed-by: Souradeep <soura@imap.linux.ibm.com>
Signed-off-by: Souradeep <soura@imap.linux.ibm.com>
Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com>
Tested-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610211907.101384-3-ssrish@linux.ibm.com
|
|
On a PLPKS enabled PowerVM LPAR, the secvar format property for static
key management is misrepresented as "ibm,plpks-sb-unknown", creating
reason for confusion.
Static key management mode uses fixed, built-in keys. Dynamic key
management mode allows keys to be updated in production to handle
security updates without firmware rebuilds.
Define a function named plpks_get_sb_keymgmt_mode() to retrieve the
key management mode based on the existence of the SB_VERSION property
in the firmware.
Set the secvar format property to either "ibm,plpks-sb-v<version>" or
"ibm,plpks-sb-v0" based on the key management mode, and return the
length of the secvar format property.
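
A small sketch of how the format string could be assembled. The function name and parameters are hypothetical; 'has_sb_version' stands in for whether firmware provides SB_VERSION, which the text above uses to distinguish dynamic from static mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: build the secvar format string.
 * 'has_sb_version' models whether firmware exposes SB_VERSION (dynamic
 * mode); 'version' is only used in that case.
 * Returns the length of the string, or -1 if it does not fit.
 */
static int format_secvar_string(char *buf, size_t bufsz,
				bool has_sb_version, unsigned int version)
{
	int len;

	if (!has_sb_version)
		version = 0;	/* static key management reports "v0" */

	len = snprintf(buf, bufsz, "ibm,plpks-sb-v%u", version);
	if (len < 0 || (size_t)len >= bufsz)
		return -1;
	return len;
}
```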
Co-developed-by: Souradeep <soura@imap.linux.ibm.com>
Signed-off-by: Souradeep <soura@imap.linux.ibm.com>
Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com>
Tested-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610211907.101384-2-ssrish@linux.ibm.com
|
|
If the device configuration fails (that is, if `dma_dev->device_config()`
returns an error), `sg_dma_address(&sg)` is not initialized, and the jump
to `err_dma_prep` leads to calling `dma_unmap_single()` on the
uninitialized `sg_dma_address(&sg)`.
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610142918.169540-2-fourier.thomas@gmail.com
|
|
The DMA map functions can fail and should be tested for errors.
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250620075602.12575-1-fourier.thomas@gmail.com
|
|
If we always align the vmemmap start to PAGE_SIZE, there is a
chance that we may end up allocating page-sized vmemmap backing
pages in RAM in the altmap not present case, because a PAGE_SIZE
aligned address is not necessarily PMD_SIZE-aligned.
In this patch, we are aligning the vmemmap start address to
PMD_SIZE if altmap is not present. This ensures that a PMD_SIZE
page is always allocated for the vmemmap mapping if altmap is
not present.
If altmap is present, make sure we align the start vmemmap addr to
PAGE_SIZE so that we calculate the correct start_pfn in the altmap
boundary check to decide whether we should use altmap or RAM based
backing memory allocation. The address also needs to be aligned for the
set_pte operation. If the start addr is already PMD_SIZE aligned
and within the altmap boundary, then we will try to use a pmd size
altmap mapping; otherwise we go for page size mapping.
So if altmap is present, we try to use the maximum number of
altmap pages; otherwise, we allocate a PMD_SIZE RAM page.
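
The alignment choice can be illustrated with a little arithmetic. This sketch assumes the start address is rounded down, and it assumes 64 KiB base pages and 2 MiB PMD mappings purely for the example; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for illustration only: 64 KiB pages, 2 MiB PMD mappings. */
#define SKETCH_PAGE_SIZE	(UINT64_C(1) << 16)
#define SKETCH_PMD_SIZE		(UINT64_C(1) << 21)

/* Round 'addr' down to a multiple of 'align' (which must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/* Hypothetical helper: pick the vmemmap start alignment described above. */
static uint64_t vmemmap_start(uint64_t addr, bool have_altmap)
{
	return align_down(addr, have_altmap ? SKETCH_PAGE_SIZE : SKETCH_PMD_SIZE);
}
```

For an address of 0x212345, the altmap case starts at 0x210000 and the non-altmap case at 0x200000, so the latter can always be backed by one PMD-sized allocation.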
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/895c4afd912c85d344a2065e348fac90529ed48f.1750593372.git.donettom@linux.ibm.com
|
|
Error conditions are not handled properly if altmap is not present
and PMD_SIZE vmemmap_alloc_block_buf fails.
In this patch, if vmemmap_alloc_block_buf fails in the non-altmap
case, we will fall back to the base mapping.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/7f95fe91c827a2fb76367a58dbea724e811fb152.1750593372.git.donettom@linux.ibm.com
|
|
The IO hotplug add event is handled in user space with the drmgr tool.
After the device is enabled, user space uses the /sys/kernel/dlpar
interface with “dt add index <drc_index>” to update the device tree.
The kernel interface (dlpar_hp_dt_add()) finds the parent node for
the specified ‘drc_index’ from the ibm,drc-info property. Firmware has
provided this property since 2017, but KVM guest code in
some releases still uses the older SLOF firmware, which has the
ibm,drc-indexes property instead of ibm,drc-info.
If ibm,drc-info is not available, search for ‘drc_index’ in the
indexes array of the ibm,drc-indexes property to support old FW.
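
The fallback search might look roughly like this. It is a sketch, not the kernel code: it assumes the property cells have already been converted to CPU byte order and that the first cell holds the number of indexes that follow. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch: 'prop' models the ibm,drc-indexes cells after conversion to CPU
 * byte order. prop[0] is the number of entries; the indexes follow it.
 */
static bool drc_indexes_contains(const uint32_t *prop, uint32_t drc_index)
{
	uint32_t count = prop[0];
	uint32_t i;

	for (i = 1; i <= count; i++) {
		if (prop[i] == drc_index)
			return true;
	}
	return false;
}
```

A caller would run this check on each candidate parent node and stop at the first node whose array contains the requested index.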
Fixes: 02b98ff44a57 ("powerpc/pseries/dlpar: Add device tree nodes for DLPAR IO add")
Reported-by: Kowshik Jois <kowsjois@linux.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250531235002.239213-1-haren@linux.ibm.com
|
|
The macro kvm_trace_symbol_exit is used for providing the mappings
for the trap vectors and their names. Add a mapping for H_VIRT so that
the trap reason is displayed as a string instead of a vector number when
using the kvm_guest_exit tracepoint.
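
The idea of a vector-to-name table can be sketched in plain C. The table below is an illustrative subset with hypothetical identifiers; the vector numbers are given as an assumption and should be checked against the kernel's Book3S interrupt definitions.

```c
#include <stddef.h>

struct trap_name {
	unsigned int vector;
	const char *name;
};

/* Illustrative subset; verify the vector numbers against the kernel headers. */
static const struct trap_name trap_names[] = {
	{ 0x500, "EXTERNAL" },
	{ 0x900, "DECREMENTER" },
	{ 0xc00, "SYSCALL" },
	{ 0xea0, "H_VIRT" },	/* the kind of entry this commit adds */
};

/* Return the symbolic name for a trap vector, or NULL if it is unmapped. */
static const char *trap_name_lookup(unsigned int vector)
{
	size_t i;

	for (i = 0; i < sizeof(trap_names) / sizeof(trap_names[0]); i++) {
		if (trap_names[i].vector == vector)
			return trap_names[i].name;
	}
	return NULL;
}
```

Without an entry, a lookup returns NULL and a tracer can only print the raw vector number.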
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250516121225.276466-1-gautam@linux.ibm.com
|
|
Use guard(mutex) for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
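
For readers unfamiliar with the pattern, here is a userspace sketch of the mechanism behind guard(): the compiler's cleanup attribute (GCC and Clang) runs a function when a variable goes out of scope. The toy lock and all names below are hypothetical stand-ins, not the kernel's implementation.

```c
/* Toy lock so the sketch runs in userspace; the kernel uses struct mutex. */
struct toy_lock {
	int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/* Cleanup handler: receives a pointer to the guard variable. */
static void toy_guard_exit(struct toy_lock **guard)
{
	toy_lock_release(*guard);
}

/* Take the lock now and release it automatically when the scope ends. */
#define TOY_GUARD(lock)							\
	struct toy_lock *toy_guard_var					\
		__attribute__((cleanup(toy_guard_exit), unused)) =	\
		(toy_lock_acquire(lock), (lock))

/* Every return path drops the lock without an explicit unlock call. */
static int update_counter(struct toy_lock *lock, int *counter, int delta)
{
	TOY_GUARD(lock);

	if (delta == 0)
		return -1;	/* early return: the lock is still released */

	*counter += delta;
	return 0;
}
```

The benefit is that error paths no longer need a goto to an unlock label.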
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-7-sshegde@linux.ibm.com
|
|
Use guard(mutex) for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-6-sshegde@linux.ibm.com
|
|
Use lock guards for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
This shows the use of both guard and scoped_guard.
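
The difference between the two forms is the extent of the critical section: a scoped guard holds the lock only for the block attached to it. A userspace sketch of that idea follows, using the GCC/Clang cleanup attribute and a toy lock; every name here is hypothetical and the macro is not the kernel's scoped_guard().

```c
#include <stddef.h>

/* Toy lock so the sketch runs in userspace; the kernel uses struct mutex. */
struct toy_lock {
	int held;
};

static struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
	l->held = 1;
	return l;
}

/* Cleanup handler: runs when the guard variable leaves scope. */
static void toy_scope_exit(struct toy_lock **guard)
{
	(*guard)->held = 0;
}

/* Hold the lock for the attached block only; the loop body runs once. */
#define TOY_SCOPED_GUARD(lock)						\
	for (struct toy_lock *toy_scope					\
			__attribute__((cleanup(toy_scope_exit))) =	\
			toy_lock_acquire(lock), *toy_once = toy_scope;	\
	     toy_once; toy_once = NULL)

static int add_sample(struct toy_lock *lock, int *total, int sample,
		      int *held_inside)
{
	TOY_SCOPED_GUARD(lock) {
		*total += sample;
		*held_inside = lock->held;	/* lock is held inside the block */
	}
	/* The lock has been released here; the rest runs unlocked. */
	return lock->held;
}
```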
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-5-sshegde@linux.ibm.com
|
|
Use scoped_guard for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-4-sshegde@linux.ibm.com
|
|
Use guard(mutex) for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-3-sshegde@linux.ibm.com
|
|
Use guard(mutex) for scope-based resource management of the mutex.
This makes the code simpler and easier to maintain.
More details on lock guards can be found at
https://lore.kernel.org/all/20230612093537.614161713@infradead.org/T/#u
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250505075333.184463-2-sshegde@linux.ibm.com
|
|
The kernel uses 3100 to indicate ISA version 3.1, not 3010, so
fix the Microwatt device tree to use 3100.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/aB6taMDWvJwOl9xj@bruin
|
|
Commit 030bdc3fd080 ("powerpc/defconfigs: Set HZ=100 on pseries and ppc64
defconfigs") lowered CONFIG_HZ from 250 to 100, citing reduced need for a
higher tick rate due to high-resolution timers and concerns about timer
interrupt overhead and cascading effects in the timer wheel.
However, improvements have been made to the timer wheel algorithm since
then, particularly in eliminating cascading effects at the cost of minor
timekeeping inaccuracies. More details are available at
https://lwn.net/Articles/646950/. This removes the original concern about
cascading, and the reliance on high-resolution timers is not applicable
to the scheduler, which still depends on periodic ticks set by CONFIG_HZ.
With the introduction of the EEVDF scheduler, users can specify custom
slices for workloads. The default base_slice is 3ms, but with CONFIG_HZ=100
(10ms tick interval), base_slice is ineffective. Workloads like stress-ng
that do not voluntarily yield the CPU run for ~10ms before switching out.
Additionally, setting a custom slice below 3ms (e.g., 2ms) should lower
task latency, but this effect is lost due to the coarse 10ms tick.
By increasing CONFIG_HZ to 1000 (1ms tick), base_slice is properly honored,
and user-defined slices work as expected. Benchmark results support this
change:
Latency improvements in schbench with EEVDF under stress-ng-induced noise:
Scheduler        CONFIG_HZ  Custom Slice  99th Percentile Latency (µs)
--------------------------------------------------------------------
EEVDF            1000       No            0.30x
EEVDF            1000       2 ms          0.29x
EEVDF (default)  100        No            1.00x
Switching to HZ=1000 reduces the 99th percentile latency in schbench by
~70%. This improvement occurs because, with HZ=1000, stress-ng tasks run
for ~3ms before yielding, compared to ~10ms with HZ=100. As a result,
schbench gets CPU time sooner, reducing its latency.
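
The ~3ms versus ~10ms figures follow from simple tick arithmetic. The sketch below is a deliberately simplified model, not scheduler code: it assumes a CPU-bound task is only preempted on a tick boundary, so its run time is the slice rounded up to a whole number of ticks. The function names are hypothetical.

```c
/* Length of one scheduler tick in microseconds for a given CONFIG_HZ. */
static unsigned int tick_us(unsigned int hz)
{
	return 1000000u / hz;
}

/*
 * Simplified model: a CPU-bound task is only preempted on a tick, so its
 * run time is the requested slice rounded up to a whole number of ticks.
 */
static unsigned int effective_slice_us(unsigned int hz, unsigned int slice_us)
{
	unsigned int tick = tick_us(hz);

	return ((slice_us + tick - 1) / tick) * tick;
}
```

In this model a 3ms slice becomes 10ms at HZ=100 and stays 3ms at HZ=1000, and a 2ms custom slice is only honored at HZ=1000.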
Daytrader Performance:
Daytrader results show minor variation within standard deviation,
indicating no significant regression.
Workload (Users/Instances)  Throughput 1000HZ vs 100HZ (Std Dev%)
--------------------------------------------------------------------------
30 u, 1 i                   +3.01% (1.62%)
60 u, 1 i                   +1.46% (2.69%)
90 u, 1 i                   -1.33% (3.09%)
30 u, 2 i                   -1.20% (1.71%)
30 u, 3 i                   -0.07% (1.33%)
Avg. Response Time: No Change (=)
pgbench select queries:
Metric                  1000HZ vs 100HZ (Std Dev%)
------------------------------------------------------------------
Average TPS Change      +2.16% (1.27%)
Average Latency Change  -2.21% (1.21%)
Average TPS: Higher the better
Average Latency: Lower the better
pgbench shows both throughput and latency improvements beyond standard
deviation.
Given these results and the improvements in timer wheel implementation,
increasing CONFIG_HZ to 1000 ensures that powerpc benefits from EEVDF’s
base_slice and allows fine-tuned scheduling for latency-sensitive
workloads.
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Reviewed-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250330074734.16679-1-vineethr@linux.ibm.com
|
|
This adds all symbols required for use cases like
livepatching. Distros already enable this config,
and enabling it increases build time by 3%
(on a 128-CPU Power9 setup) with almost no change
in vmlinux size.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250116073419.344453-1-maddy@linux.ibm.com
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- subsystem: convert drivers to use recent callbacks of struct
i2c_algorithm. A typical after-rc1 cleanup, which I couldn't send in
time for rc2
- tegra: fix YAML conversion of device tree bindings
- k1: re-add a check which got lost during upstreaming
* tag 'i2c-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: k1: check for transfer error
i2c: use inclusive callbacks in struct i2c_algorithm
dt-bindings: i2c: nvidia,tegra20-i2c: Specify the required properties
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Make sure the array tracking which kernel text positions need to be
alternatives-patched doesn't get mishandled by out-of-order
modifications, leading to it overflowing and causing page faults when
patching
- Avoid an infinite loop when early code does a ranged TLB invalidation
before the broadcast TLB invalidation count of how many pages it can
flush, has been read from CPUID
- Fix a CONFIG_MODULES typo
- Disable broadcast TLB invalidation when PTI is enabled to avoid an
overflow of the bitmap tracking dynamic ASIDs which need to be
flushed when the kernel switches between the user and kernel address
space
- Handle the case of a CPU going offline and thus reporting zeroes when
reading top-level events in the resctrl code
* tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Fix int3 handling failure from broken text_poke array
x86/mm: Fix early boot use of INVPLGB
x86/its: Fix an ifdef typo in its_alloc()
x86/mm: Disable INVLPGB when PTI is enabled
x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Fix missing prototypes warnings
- Properly initialize work context when allocating it
- Remove a method tracking when managed interrupts are suspended during
hotplug, in favor of the code using an IRQ disable depth tracking now,
and have interrupts get properly enabled again on restore
- Make sure multiple CPUs getting hotplugged don't cause wrong tracking
of the managed IRQ disable depth
* tag 'irq_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/ath79-misc: Fix missing prototypes warnings
genirq/irq_sim: Initialize work context pointers properly
genirq/cpuhotplug: Restore affinity even for suspended IRQ
genirq/cpuhotplug: Rebalance managed interrupts across multi-CPU hotplug
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Avoid a crash on a heterogeneous machine where not all cores support
the same hw events features
- Avoid a deadlock when throttling events
- Document the perf event states more
- Make sure a number of perf paths switching off or rescheduling events
call perf_cgroup_event_disable()
- Make sure perf does task sampling before its userspace mapping is
torn down, and not after
* tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix crash in icl_update_topdown_event()
perf: Fix the throttle error of some clock events
perf: Add comment to enum perf_event_state
perf/core: Fix WARN in perf_cgroup_switch()
perf: Fix dangling cgroup pointer in cpuctx
perf: Fix cgroup state vs ERROR
perf: Fix sample vs do_exit()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Borislav Petkov:
- Make sure the switch to the global hash is requested always under a
lock so that two threads requesting that simultaneously cannot get to
inconsistent state
- Reject negative NUMA nodes earlier in the futex NUMA interface
handling code
- Selftests fixes
* tag 'locking_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Verify under the lock if hash can be replaced
futex: Handle invalid node numbers supplied by user
selftests/futex: Set the home_node in futex_numa_mpol
selftests/futex: getopt() requires int as return value.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
- amd64: Correct the number of memory controllers on some AMD Zen
clients
- igen6: Handle firmware-disabled memory controllers properly
* tag 'edac_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/igen6: Fix NULL pointer dereference
EDAC/amd64: Correct number of UMCs for family 19h models 70h-7fh
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix another set of FP/SIMD/SVE bugs affecting NV, and plugging some
missing synchronisation
- A small fix for the irqbypass hook fixes, tightening the check and
ensuring that we only deal with MSI for both the old and the new
route entry
- Rework the way the shadow LRs are addressed in a nesting
configuration, plugging an embarrassing bug as well as simplifying
the whole process
- Add yet another fix for the dreaded arch_timer_edge_cases selftest
RISC-V:
- Fix the size parameter check in SBI SFENCE calls
- Don't treat SBI HFENCE calls as NOPs
x86 TDX:
- Complete API for handling complex TDVMCALLs in userspace.
This was delayed because the spec lacked a way for userspace to
deny supporting these calls; the new exit code is now approved"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: TDX: Exit to userspace for GetTdVmCallInfo
KVM: TDX: Handle TDG.VP.VMCALL<GetQuote>
KVM: TDX: Add new TDVMCALL status code for unsupported subfuncs
KVM: arm64: VHE: Centralize ISBs when returning to host
KVM: arm64: Remove cpacr_clear_set()
KVM: arm64: Remove ad-hoc CPTR manipulation from kvm_hyp_handle_fpsimd()
KVM: arm64: Remove ad-hoc CPTR manipulation from fpsimd_sve_sync()
KVM: arm64: Reorganise CPTR trap manipulation
KVM: arm64: VHE: Synchronize CPTR trap deactivation
KVM: arm64: VHE: Synchronize restore of host debug registers
KVM: arm64: selftests: Close the GIC FD in arch_timer_edge_cases
KVM: arm64: Explicitly treat routing entry type changes as changes
KVM: arm64: nv: Fix tracking of shadow list registers
RISC-V: KVM: Don't treat SBI HFENCE calls as NOPs
RISC-V: KVM: Fix the size parameter check in SBI SFENCE calls
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Multichannel channel allocation fix for Kerberos mounts
- Two reconnect fixes
- Fix netfs_writepages crash with smbdirect/RDMA
- Directory caching fix
- Three minor cleanup fixes
- Log error when close cached dirs fails
* tag 'v6.16-rc2-smb3-client-fixes-v2' of git://git.samba.org/sfrench/cifs-2.6:
smb: minor fix to use SMB2_NTLMV2_SESSKEY_SIZE for auth_key size
smb: minor fix to use sizeof to initialize flags_string buffer
smb: Use loff_t for directory position in cached_dirents
smb: Log an error when close_all_cached_dirs fails
cifs: Fix prepare_write to negotiate wsize if needed
smb: client: fix max_sge overflow in smb_extract_folioq_to_rdma()
smb: client: fix first command failure during re-negotiation
cifs: Remove duplicate fattr->cf_dtype assignment from wsl_to_fattr() function
smb: fix secondary channel creation issue with kerberos by populating hostname when adding channels
|
|
If spacemit_i2c_xfer_msg() times out waiting for a message transfer to
complete, or if the hardware reports an error, it returns a negative
error code (-ETIMEDOUT, -EAGAIN, -ENXIO, or -EIO).
The sole caller of spacemit_i2c_xfer_msg() is spacemit_i2c_xfer(),
which is the i2c_algorithm->xfer callback function. It currently
does not save the value returned by spacemit_i2c_xfer_msg().
The result is that transfer errors go unreported, and a caller
has no indication anything is wrong.
When this code was out for review, the return value *was* checked
in early versions. But for some reason, that assignment got dropped
between versions 5 and 6 of the series, perhaps related to reworking
the code to merge spacemit_i2c_xfer_core() into spacemit_i2c_xfer().
Simply assigning the value returned to "ret" fixes the problem.
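
The shape of the bug and its fix can be sketched in userspace. This is not the driver's code: the callback type, the function names, and the loop structure are hypothetical, and the sketch assumes the usual i2c convention of returning the message count on success.

```c
#include <errno.h>

/* Hypothetical per-message transfer: 0 on success, -errno on failure. */
typedef int (*xfer_msg_fn)(int msg_index);

/*
 * Sketch of an xfer-style loop. The fix is the assignment to 'ret':
 * if the result is discarded, the caller sees success after a failure.
 * Returns the number of messages on success or a negative error code.
 */
static int xfer_all(xfer_msg_fn xfer_msg, int num)
{
	int ret = 0;
	int i;

	for (i = 0; i < num; i++) {
		ret = xfer_msg(i);	/* keep the result instead of dropping it */
		if (ret < 0)
			break;
	}
	return ret < 0 ? ret : num;
}

/* Example callbacks used to exercise the loop. */
static int msg_ok(int msg_index)
{
	(void)msg_index;
	return 0;
}

static int msg_timeout_on_second(int msg_index)
{
	return msg_index == 1 ? -ETIMEDOUT : 0;
}
```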
Fixes: 5ea558473fa31 ("i2c: spacemit: add support for SpacemiT K1 SoC")
Signed-off-by: Alex Elder <elder@riscstar.com>
Cc: <stable@vger.kernel.org> # v6.15+
Reviewed-by: Troy Mitchell <troymitchell988@gmail.com>
Link: https://lore.kernel.org/r/20250616125137.1555453-1-elder@riscstar.com
Signed-off-by: Andi Shyti <andi@smida.it>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Two fixes for commits in the nfsd-6.16 merge
- One fix for the recently-added NFSD netlink facility
- One fix for a remote SunRPC crasher
* tag 'nfsd-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error
nfsd: use threads array as-is in netlink interface
SUNRPC: Cleanup/fix initial rq_pages allocation
NFSD: Avoid corruption of a referring call list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Use the mounter’s credentials for file-backed mounts to resolve
Android SELinux permission issues
- Remove the unused trace event `erofs_destroy_inode`
- Error out on crafted out-of-file-range encoded extents
- Remove an incorrect check for encoded extents
* tag 'erofs-for-6.16-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: remove a superfluous check for encoded extents
erofs: refuse crafted out-of-file-range encoded extents
erofs: remove unused trace event erofs_destroy_inode
erofs: impersonate the opener's credentials when accessing backing file
|
|
Replaced hardcoded value 16 with SMB2_NTLMV2_SESSKEY_SIZE
in the auth_key definition and memcpy call.
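As a rough sketch of the pattern (the struct and helper below are hypothetical, only the constant name and its value of 16 come from the text above), sizing the buffer and the copy from the same macro keeps them from drifting apart:

```c
#include <string.h>

/* Value taken from the commit text: the constant replaces a hardcoded 16. */
#define SMB2_NTLMV2_SESSKEY_SIZE 16

/* Hypothetical container; the real auth_key is defined in the cifs code. */
struct demo_session {
	unsigned char auth_key[SMB2_NTLMV2_SESSKEY_SIZE];
};

/* Copy a session key using the named size instead of a magic number. */
static void demo_set_auth_key(struct demo_session *ses, const unsigned char *key)
{
	memcpy(ses->auth_key, key, SMB2_NTLMV2_SESSKEY_SIZE);
}
```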
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Replaced hardcoded length with sizeof(flags_string).
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Change the pos field in struct cached_dirents from int to loff_t
to support large directory offsets. This avoids overflow and
matches kernel conventions for directory positions.
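A small user-space sketch of why the wider type matters. It assumes int64_t as a stand-in for the kernel's loff_t, and the struct and helper names are hypothetical:

```c
#include <limits.h>
#include <stdint.h>

/* int64_t stands in for the kernel's loff_t in this user-space sketch. */
typedef int64_t demo_loff_t;

/* Hypothetical reduced form of the structure after the change. */
struct demo_cached_dirents {
	demo_loff_t pos;
};

/* Returns 1 if the offset could have been stored in the old int field. */
static int demo_fits_in_int(demo_loff_t off)
{
	return off >= INT_MIN && off <= INT_MAX;
}
```

Any directory position for which demo_fits_in_int() returns 0 could not be represented by the old int field, but is held exactly by the 64-bit one.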
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Under low-memory conditions, close_all_cached_dirs() can't move the
dentries to a separate list to dput() them once the locks are dropped.
This will result in a "Dentry still in use" error, so add an error
message that makes it clear this is what happened:
[ 495.281119] CIFS: VFS: \\otters.example.com\share Out of memory while dropping dentries
[ 495.281595] ------------[ cut here ]------------
[ 495.281887] BUG: Dentry ffff888115531138{i=78,n=/} still in use (2) [unmount of cifs cifs]
[ 495.282391] WARNING: CPU: 1 PID: 2329 at fs/dcache.c:1536 umount_check+0xc8/0xf0
Also, bail out of looping through all tcons as soon as a single
allocation fails, since we're already in trouble, and kmalloc() attempts
for subsequent tcons are likely to fail just like the first one did.
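The bail-out behaviour can be sketched in user space as follows. The budgeted allocator and the function names are hypothetical and only simulate memory pressure; this is not the cifs implementation:

```c
#include <stdlib.h>

/* Hypothetical allocation budget used to simulate low-memory conditions. */
static int demo_budget;

static void *demo_alloc(size_t n)
{
	if (demo_budget <= 0)
		return NULL;
	demo_budget--;
	return malloc(n);
}

/*
 * Walk "ntcons" entries, allocating one tracking node per entry.
 * Stop at the first failed allocation: later attempts would most
 * likely fail too. *failed tells the caller to log the error.
 */
static size_t demo_collect(size_t ntcons, int *failed)
{
	size_t done = 0;

	*failed = 0;
	for (size_t i = 0; i < ntcons; i++) {
		void *node = demo_alloc(16);

		if (!node) {
			*failed = 1;
			break;
		}
		free(node);
		done++;
	}
	return done;
}
```

The caller sees both how many entries were handled and whether the loop ended early, which is the point at which an "Out of memory" style message would be logged.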
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Acked-by: Bharath SM <bharathsm@microsoft.com>
Suggested-by: Ruben Devos <rdevos@oxya.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Fix cifs_prepare_write() to negotiate the wsize if it is unset.
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This fixes the following problem:
[ 749.901015] [ T8673] run fstests cifs/001 at 2025-06-17 09:40:30
[ 750.346409] [ T9870] ==================================================================
[ 750.346814] [ T9870] BUG: KASAN: slab-out-of-bounds in smb_set_sge+0x2cc/0x3b0 [cifs]
[ 750.347330] [ T9870] Write of size 8 at addr ffff888011082890 by task xfs_io/9870
[ 750.347705] [ T9870]
[ 750.348077] [ T9870] CPU: 0 UID: 0 PID: 9870 Comm: xfs_io Kdump: loaded Not tainted 6.16.0-rc2-metze.02+ #1 PREEMPT(voluntary)
[ 750.348082] [ T9870] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 750.348085] [ T9870] Call Trace:
[ 750.348086] [ T9870] <TASK>
[ 750.348088] [ T9870] dump_stack_lvl+0x76/0xa0
[ 750.348106] [ T9870] print_report+0xd1/0x640
[ 750.348116] [ T9870] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 750.348120] [ T9870] ? kasan_complete_mode_report_info+0x26/0x210
[ 750.348124] [ T9870] kasan_report+0xe7/0x130
[ 750.348128] [ T9870] ? smb_set_sge+0x2cc/0x3b0 [cifs]
[ 750.348262] [ T9870] ? smb_set_sge+0x2cc/0x3b0 [cifs]
[ 750.348377] [ T9870] __asan_report_store8_noabort+0x17/0x30
[ 750.348381] [ T9870] smb_set_sge+0x2cc/0x3b0 [cifs]
[ 750.348496] [ T9870] smbd_post_send_iter+0x1990/0x3070 [cifs]
[ 750.348625] [ T9870] ? __pfx_smbd_post_send_iter+0x10/0x10 [cifs]
[ 750.348741] [ T9870] ? update_stack_state+0x2a0/0x670
[ 750.348749] [ T9870] ? cifs_flush+0x153/0x320 [cifs]
[ 750.348870] [ T9870] ? cifs_flush+0x153/0x320 [cifs]
[ 750.348990] [ T9870] ? update_stack_state+0x2a0/0x670
[ 750.348995] [ T9870] smbd_send+0x58c/0x9c0 [cifs]
[ 750.349117] [ T9870] ? __pfx_smbd_send+0x10/0x10 [cifs]
[ 750.349231] [ T9870] ? unwind_get_return_address+0x65/0xb0
[ 750.349235] [ T9870] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 750.349242] [ T9870] ? arch_stack_walk+0xa7/0x100
[ 750.349250] [ T9870] ? stack_trace_save+0x92/0xd0
[ 750.349254] [ T9870] __smb_send_rqst+0x931/0xec0 [cifs]
[ 750.349374] [ T9870] ? kernel_text_address+0x173/0x190
[ 750.349379] [ T9870] ? kasan_save_stack+0x39/0x70
[ 750.349382] [ T9870] ? kasan_save_track+0x18/0x70
[ 750.349385] [ T9870] ? __kasan_slab_alloc+0x9d/0xa0
[ 750.349389] [ T9870] ? __pfx___smb_send_rqst+0x10/0x10 [cifs]
[ 750.349508] [ T9870] ? smb2_mid_entry_alloc+0xb4/0x7e0 [cifs]
[ 750.349626] [ T9870] ? cifs_call_async+0x277/0xb00 [cifs]
[ 750.349746] [ T9870] ? cifs_issue_write+0x256/0x610 [cifs]
[ 750.349867] [ T9870] ? netfs_do_issue_write+0xc2/0x340 [netfs]
[ 750.349900] [ T9870] ? netfs_advance_write+0x45b/0x1270 [netfs]
[ 750.349929] [ T9870] ? netfs_write_folio+0xd6c/0x1be0 [netfs]
[ 750.349958] [ T9870] ? netfs_writepages+0x2e9/0xa80 [netfs]
[ 750.349987] [ T9870] ? do_writepages+0x21f/0x590
[ 750.349993] [ T9870] ? filemap_fdatawrite_wbc+0xe1/0x140
[ 750.349997] [ T9870] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.350002] [ T9870] smb_send_rqst+0x22e/0x2f0 [cifs]
[ 750.350131] [ T9870] ? __pfx_smb_send_rqst+0x10/0x10 [cifs]
[ 750.350255] [ T9870] ? local_clock_noinstr+0xe/0xd0
[ 750.350261] [ T9870] ? kasan_save_alloc_info+0x37/0x60
[ 750.350268] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.350271] [ T9870] ? _raw_spin_lock+0x81/0xf0
[ 750.350275] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.350278] [ T9870] ? smb2_setup_async_request+0x293/0x580 [cifs]
[ 750.350398] [ T9870] cifs_call_async+0x477/0xb00 [cifs]
[ 750.350518] [ T9870] ? __pfx_smb2_writev_callback+0x10/0x10 [cifs]
[ 750.350636] [ T9870] ? __pfx_cifs_call_async+0x10/0x10 [cifs]
[ 750.350756] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.350760] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.350763] [ T9870] ? __smb2_plain_req_init+0x933/0x1090 [cifs]
[ 750.350891] [ T9870] smb2_async_writev+0x15ff/0x2460 [cifs]
[ 750.351008] [ T9870] ? sched_clock_noinstr+0x9/0x10
[ 750.351012] [ T9870] ? local_clock_noinstr+0xe/0xd0
[ 750.351018] [ T9870] ? __pfx_smb2_async_writev+0x10/0x10 [cifs]
[ 750.351144] [ T9870] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 750.351150] [ T9870] ? _raw_spin_unlock+0xe/0x40
[ 750.351154] [ T9870] ? cifs_pick_channel+0x242/0x370 [cifs]
[ 750.351275] [ T9870] cifs_issue_write+0x256/0x610 [cifs]
[ 750.351554] [ T9870] ? cifs_issue_write+0x256/0x610 [cifs]
[ 750.351677] [ T9870] netfs_do_issue_write+0xc2/0x340 [netfs]
[ 750.351710] [ T9870] netfs_advance_write+0x45b/0x1270 [netfs]
[ 750.351740] [ T9870] ? rolling_buffer_append+0x12d/0x440 [netfs]
[ 750.351769] [ T9870] netfs_write_folio+0xd6c/0x1be0 [netfs]
[ 750.351798] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.351804] [ T9870] netfs_writepages+0x2e9/0xa80 [netfs]
[ 750.351835] [ T9870] ? __pfx_netfs_writepages+0x10/0x10 [netfs]
[ 750.351864] [ T9870] ? exit_files+0xab/0xe0
[ 750.351867] [ T9870] ? do_exit+0x148f/0x2980
[ 750.351871] [ T9870] ? do_group_exit+0xb5/0x250
[ 750.351874] [ T9870] ? arch_do_signal_or_restart+0x92/0x630
[ 750.351879] [ T9870] ? exit_to_user_mode_loop+0x98/0x170
[ 750.351882] [ T9870] ? do_syscall_64+0x2cf/0xd80
[ 750.351886] [ T9870] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.351890] [ T9870] do_writepages+0x21f/0x590
[ 750.351894] [ T9870] ? __pfx_do_writepages+0x10/0x10
[ 750.351897] [ T9870] filemap_fdatawrite_wbc+0xe1/0x140
[ 750.351901] [ T9870] __filemap_fdatawrite_range+0xba/0x100
[ 750.351904] [ T9870] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 750.351912] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.351916] [ T9870] filemap_write_and_wait_range+0x7d/0xf0
[ 750.351920] [ T9870] cifs_flush+0x153/0x320 [cifs]
[ 750.352042] [ T9870] filp_flush+0x107/0x1a0
[ 750.352046] [ T9870] filp_close+0x14/0x30
[ 750.352049] [ T9870] put_files_struct.part.0+0x126/0x2a0
[ 750.352053] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.352058] [ T9870] exit_files+0xab/0xe0
[ 750.352061] [ T9870] do_exit+0x148f/0x2980
[ 750.352065] [ T9870] ? __pfx_do_exit+0x10/0x10
[ 750.352069] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.352072] [ T9870] ? _raw_spin_lock_irq+0x8a/0xf0
[ 750.352076] [ T9870] do_group_exit+0xb5/0x250
[ 750.352080] [ T9870] get_signal+0x22d3/0x22e0
[ 750.352086] [ T9870] ? __pfx_get_signal+0x10/0x10
[ 750.352089] [ T9870] ? fpregs_assert_state_consistent+0x68/0x100
[ 750.352101] [ T9870] ? folio_add_lru+0xda/0x120
[ 750.352105] [ T9870] arch_do_signal_or_restart+0x92/0x630
[ 750.352109] [ T9870] ? __pfx_arch_do_signal_or_restart+0x10/0x10
[ 750.352115] [ T9870] exit_to_user_mode_loop+0x98/0x170
[ 750.352118] [ T9870] do_syscall_64+0x2cf/0xd80
[ 750.352123] [ T9870] ? __kasan_check_read+0x11/0x20
[ 750.352126] [ T9870] ? count_memcg_events+0x1b4/0x420
[ 750.352132] [ T9870] ? handle_mm_fault+0x148/0x690
[ 750.352136] [ T9870] ? _raw_spin_lock_irq+0x8a/0xf0
[ 750.352140] [ T9870] ? __kasan_check_read+0x11/0x20
[ 750.352143] [ T9870] ? fpregs_assert_state_consistent+0x68/0x100
[ 750.352146] [ T9870] ? irqentry_exit_to_user_mode+0x2e/0x250
[ 750.352151] [ T9870] ? irqentry_exit+0x43/0x50
[ 750.352154] [ T9870] ? exc_page_fault+0x75/0xe0
[ 750.352160] [ T9870] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.352163] [ T9870] RIP: 0033:0x7858c94ab6e2
[ 750.352167] [ T9870] Code: Unable to access opcode bytes at 0x7858c94ab6b8.
[ 750.352175] [ T9870] RSP: 002b:00007858c9248ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000022
[ 750.352179] [ T9870] RAX: fffffffffffffdfe RBX: 00007858c92496c0 RCX: 00007858c94ab6e2
[ 750.352182] [ T9870] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 750.352184] [ T9870] RBP: 00007858c9248d10 R08: 0000000000000000 R09: 0000000000000000
[ 750.352185] [ T9870] R10: 0000000000000000 R11: 0000000000000246 R12: fffffffffffffde0
[ 750.352187] [ T9870] R13: 0000000000000020 R14: 0000000000000002 R15: 00007ffc072d2230
[ 750.352191] [ T9870] </TASK>
[ 750.352195] [ T9870]
[ 750.395206] [ T9870] Allocated by task 9870 on cpu 0 at 750.346406s:
[ 750.395523] [ T9870] kasan_save_stack+0x39/0x70
[ 750.395532] [ T9870] kasan_save_track+0x18/0x70
[ 750.395536] [ T9870] kasan_save_alloc_info+0x37/0x60
[ 750.395539] [ T9870] __kasan_slab_alloc+0x9d/0xa0
[ 750.395543] [ T9870] kmem_cache_alloc_noprof+0x13c/0x3f0
[ 750.395548] [ T9870] mempool_alloc_slab+0x15/0x20
[ 750.395553] [ T9870] mempool_alloc_noprof+0x135/0x340
[ 750.395557] [ T9870] smbd_post_send_iter+0x63e/0x3070 [cifs]
[ 750.395694] [ T9870] smbd_send+0x58c/0x9c0 [cifs]
[ 750.395819] [ T9870] __smb_send_rqst+0x931/0xec0 [cifs]
[ 750.395950] [ T9870] smb_send_rqst+0x22e/0x2f0 [cifs]
[ 750.396081] [ T9870] cifs_call_async+0x477/0xb00 [cifs]
[ 750.396232] [ T9870] smb2_async_writev+0x15ff/0x2460 [cifs]
[ 750.396359] [ T9870] cifs_issue_write+0x256/0x610 [cifs]
[ 750.396492] [ T9870] netfs_do_issue_write+0xc2/0x340 [netfs]
[ 750.396544] [ T9870] netfs_advance_write+0x45b/0x1270 [netfs]
[ 750.396576] [ T9870] netfs_write_folio+0xd6c/0x1be0 [netfs]
[ 750.396608] [ T9870] netfs_writepages+0x2e9/0xa80 [netfs]
[ 750.396639] [ T9870] do_writepages+0x21f/0x590
[ 750.396643] [ T9870] filemap_fdatawrite_wbc+0xe1/0x140
[ 750.396647] [ T9870] __filemap_fdatawrite_range+0xba/0x100
[ 750.396651] [ T9870] filemap_write_and_wait_range+0x7d/0xf0
[ 750.396656] [ T9870] cifs_flush+0x153/0x320 [cifs]
[ 750.396787] [ T9870] filp_flush+0x107/0x1a0
[ 750.396791] [ T9870] filp_close+0x14/0x30
[ 750.396795] [ T9870] put_files_struct.part.0+0x126/0x2a0
[ 750.396800] [ T9870] exit_files+0xab/0xe0
[ 750.396803] [ T9870] do_exit+0x148f/0x2980
[ 750.396808] [ T9870] do_group_exit+0xb5/0x250
[ 750.396813] [ T9870] get_signal+0x22d3/0x22e0
[ 750.396817] [ T9870] arch_do_signal_or_restart+0x92/0x630
[ 750.396822] [ T9870] exit_to_user_mode_loop+0x98/0x170
[ 750.396827] [ T9870] do_syscall_64+0x2cf/0xd80
[ 750.396832] [ T9870] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.396836] [ T9870]
[ 750.397150] [ T9870] The buggy address belongs to the object at ffff888011082800
which belongs to the cache smbd_request_0000000008f3bd7b of size 144
[ 750.397798] [ T9870] The buggy address is located 0 bytes to the right of
allocated 144-byte region [ffff888011082800, ffff888011082890)
[ 750.398469] [ T9870]
[ 750.398800] [ T9870] The buggy address belongs to the physical page:
[ 750.399141] [ T9870] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11082
[ 750.399148] [ T9870] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[ 750.399155] [ T9870] page_type: f5(slab)
[ 750.399161] [ T9870] raw: 000fffffc0000000 ffff888022d65640 dead000000000122 0000000000000000
[ 750.399165] [ T9870] raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
[ 750.399169] [ T9870] page dumped because: kasan: bad access detected
[ 750.399172] [ T9870]
[ 750.399505] [ T9870] Memory state around the buggy address:
[ 750.399863] [ T9870] ffff888011082780: fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 750.400247] [ T9870] ffff888011082800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 750.400618] [ T9870] >ffff888011082880: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 750.400982] [ T9870] ^
[ 750.401370] [ T9870] ffff888011082900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 750.401774] [ T9870] ffff888011082980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 750.402171] [ T9870] ==================================================================
[ 750.402696] [ T9870] Disabling lock debugging due to kernel taint
[ 750.403202] [ T9870] BUG: unable to handle page fault for address: ffff8880110a2000
[ 750.403797] [ T9870] #PF: supervisor write access in kernel mode
[ 750.404204] [ T9870] #PF: error_code(0x0003) - permissions violation
[ 750.404581] [ T9870] PGD 5ce01067 P4D 5ce01067 PUD 5ce02067 PMD 78aa063 PTE 80000000110a2021
[ 750.404969] [ T9870] Oops: Oops: 0003 [#1] SMP KASAN PTI
[ 750.405394] [ T9870] CPU: 0 UID: 0 PID: 9870 Comm: xfs_io Kdump: loaded Tainted: G B 6.16.0-rc2-metze.02+ #1 PREEMPT(voluntary)
[ 750.406510] [ T9870] Tainted: [B]=BAD_PAGE
[ 750.406967] [ T9870] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 750.407440] [ T9870] RIP: 0010:smb_set_sge+0x15c/0x3b0 [cifs]
[ 750.408065] [ T9870] Code: 48 83 f8 ff 0f 84 b0 00 00 00 48 ba 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 80 3c 11 00 0f 85 69 01 00 00 49 8d 7c 24 08 <49> 89 04 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f
[ 750.409283] [ T9870] RSP: 0018:ffffc90005e2e758 EFLAGS: 00010246
[ 750.409803] [ T9870] RAX: ffff888036c53400 RBX: ffffc90005e2e878 RCX: 1ffff11002214400
[ 750.410323] [ T9870] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff8880110a2008
[ 750.411217] [ T9870] RBP: ffffc90005e2e798 R08: 0000000000000001 R09: 0000000000000400
[ 750.411770] [ T9870] R10: ffff888011082800 R11: 0000000000000000 R12: ffff8880110a2000
[ 750.412325] [ T9870] R13: 0000000000000000 R14: ffffc90005e2e888 R15: ffff88801a4b6000
[ 750.412901] [ T9870] FS: 0000000000000000(0000) GS:ffff88812bc68000(0000) knlGS:0000000000000000
[ 750.413477] [ T9870] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 750.414077] [ T9870] CR2: ffff8880110a2000 CR3: 000000005b0a6005 CR4: 00000000000726f0
[ 750.414654] [ T9870] Call Trace:
[ 750.415211] [ T9870] <TASK>
[ 750.415748] [ T9870] smbd_post_send_iter+0x1990/0x3070 [cifs]
[ 750.416449] [ T9870] ? __pfx_smbd_post_send_iter+0x10/0x10 [cifs]
[ 750.417128] [ T9870] ? update_stack_state+0x2a0/0x670
[ 750.417685] [ T9870] ? cifs_flush+0x153/0x320 [cifs]
[ 750.418380] [ T9870] ? cifs_flush+0x153/0x320 [cifs]
[ 750.419055] [ T9870] ? update_stack_state+0x2a0/0x670
[ 750.419624] [ T9870] smbd_send+0x58c/0x9c0 [cifs]
[ 750.420297] [ T9870] ? __pfx_smbd_send+0x10/0x10 [cifs]
[ 750.420936] [ T9870] ? unwind_get_return_address+0x65/0xb0
[ 750.421456] [ T9870] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 750.421954] [ T9870] ? arch_stack_walk+0xa7/0x100
[ 750.422460] [ T9870] ? stack_trace_save+0x92/0xd0
[ 750.422948] [ T9870] __smb_send_rqst+0x931/0xec0 [cifs]
[ 750.423579] [ T9870] ? kernel_text_address+0x173/0x190
[ 750.424056] [ T9870] ? kasan_save_stack+0x39/0x70
[ 750.424813] [ T9870] ? kasan_save_track+0x18/0x70
[ 750.425323] [ T9870] ? __kasan_slab_alloc+0x9d/0xa0
[ 750.425831] [ T9870] ? __pfx___smb_send_rqst+0x10/0x10 [cifs]
[ 750.426548] [ T9870] ? smb2_mid_entry_alloc+0xb4/0x7e0 [cifs]
[ 750.427231] [ T9870] ? cifs_call_async+0x277/0xb00 [cifs]
[ 750.427882] [ T9870] ? cifs_issue_write+0x256/0x610 [cifs]
[ 750.428909] [ T9870] ? netfs_do_issue_write+0xc2/0x340 [netfs]
[ 750.429425] [ T9870] ? netfs_advance_write+0x45b/0x1270 [netfs]
[ 750.429882] [ T9870] ? netfs_write_folio+0xd6c/0x1be0 [netfs]
[ 750.430345] [ T9870] ? netfs_writepages+0x2e9/0xa80 [netfs]
[ 750.430809] [ T9870] ? do_writepages+0x21f/0x590
[ 750.431239] [ T9870] ? filemap_fdatawrite_wbc+0xe1/0x140
[ 750.431652] [ T9870] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.432041] [ T9870] smb_send_rqst+0x22e/0x2f0 [cifs]
[ 750.432586] [ T9870] ? __pfx_smb_send_rqst+0x10/0x10 [cifs]
[ 750.433108] [ T9870] ? local_clock_noinstr+0xe/0xd0
[ 750.433482] [ T9870] ? kasan_save_alloc_info+0x37/0x60
[ 750.433855] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.434214] [ T9870] ? _raw_spin_lock+0x81/0xf0
[ 750.434561] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.434903] [ T9870] ? smb2_setup_async_request+0x293/0x580 [cifs]
[ 750.435394] [ T9870] cifs_call_async+0x477/0xb00 [cifs]
[ 750.435892] [ T9870] ? __pfx_smb2_writev_callback+0x10/0x10 [cifs]
[ 750.436388] [ T9870] ? __pfx_cifs_call_async+0x10/0x10 [cifs]
[ 750.436881] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.437237] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.437579] [ T9870] ? __smb2_plain_req_init+0x933/0x1090 [cifs]
[ 750.438062] [ T9870] smb2_async_writev+0x15ff/0x2460 [cifs]
[ 750.438557] [ T9870] ? sched_clock_noinstr+0x9/0x10
[ 750.438906] [ T9870] ? local_clock_noinstr+0xe/0xd0
[ 750.439293] [ T9870] ? __pfx_smb2_async_writev+0x10/0x10 [cifs]
[ 750.439786] [ T9870] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 750.440143] [ T9870] ? _raw_spin_unlock+0xe/0x40
[ 750.440495] [ T9870] ? cifs_pick_channel+0x242/0x370 [cifs]
[ 750.440989] [ T9870] cifs_issue_write+0x256/0x610 [cifs]
[ 750.441492] [ T9870] ? cifs_issue_write+0x256/0x610 [cifs]
[ 750.441987] [ T9870] netfs_do_issue_write+0xc2/0x340 [netfs]
[ 750.442387] [ T9870] netfs_advance_write+0x45b/0x1270 [netfs]
[ 750.442969] [ T9870] ? rolling_buffer_append+0x12d/0x440 [netfs]
[ 750.443376] [ T9870] netfs_write_folio+0xd6c/0x1be0 [netfs]
[ 750.443768] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.444145] [ T9870] netfs_writepages+0x2e9/0xa80 [netfs]
[ 750.444541] [ T9870] ? __pfx_netfs_writepages+0x10/0x10 [netfs]
[ 750.444936] [ T9870] ? exit_files+0xab/0xe0
[ 750.445312] [ T9870] ? do_exit+0x148f/0x2980
[ 750.445672] [ T9870] ? do_group_exit+0xb5/0x250
[ 750.446028] [ T9870] ? arch_do_signal_or_restart+0x92/0x630
[ 750.446402] [ T9870] ? exit_to_user_mode_loop+0x98/0x170
[ 750.446762] [ T9870] ? do_syscall_64+0x2cf/0xd80
[ 750.447132] [ T9870] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.447499] [ T9870] do_writepages+0x21f/0x590
[ 750.447859] [ T9870] ? __pfx_do_writepages+0x10/0x10
[ 750.448236] [ T9870] filemap_fdatawrite_wbc+0xe1/0x140
[ 750.448595] [ T9870] __filemap_fdatawrite_range+0xba/0x100
[ 750.448953] [ T9870] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 750.449336] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.449697] [ T9870] filemap_write_and_wait_range+0x7d/0xf0
[ 750.450062] [ T9870] cifs_flush+0x153/0x320 [cifs]
[ 750.450592] [ T9870] filp_flush+0x107/0x1a0
[ 750.450952] [ T9870] filp_close+0x14/0x30
[ 750.451322] [ T9870] put_files_struct.part.0+0x126/0x2a0
[ 750.451678] [ T9870] ? __pfx__raw_spin_lock+0x10/0x10
[ 750.452033] [ T9870] exit_files+0xab/0xe0
[ 750.452401] [ T9870] do_exit+0x148f/0x2980
[ 750.452751] [ T9870] ? __pfx_do_exit+0x10/0x10
[ 750.453109] [ T9870] ? __kasan_check_write+0x14/0x30
[ 750.453459] [ T9870] ? _raw_spin_lock_irq+0x8a/0xf0
[ 750.453787] [ T9870] do_group_exit+0xb5/0x250
[ 750.454082] [ T9870] get_signal+0x22d3/0x22e0
[ 750.454406] [ T9870] ? __pfx_get_signal+0x10/0x10
[ 750.454709] [ T9870] ? fpregs_assert_state_consistent+0x68/0x100
[ 750.455031] [ T9870] ? folio_add_lru+0xda/0x120
[ 750.455347] [ T9870] arch_do_signal_or_restart+0x92/0x630
[ 750.455656] [ T9870] ? __pfx_arch_do_signal_or_restart+0x10/0x10
[ 750.455967] [ T9870] exit_to_user_mode_loop+0x98/0x170
[ 750.456282] [ T9870] do_syscall_64+0x2cf/0xd80
[ 750.456591] [ T9870] ? __kasan_check_read+0x11/0x20
[ 750.456897] [ T9870] ? count_memcg_events+0x1b4/0x420
[ 750.457280] [ T9870] ? handle_mm_fault+0x148/0x690
[ 750.457616] [ T9870] ? _raw_spin_lock_irq+0x8a/0xf0
[ 750.457925] [ T9870] ? __kasan_check_read+0x11/0x20
[ 750.458297] [ T9870] ? fpregs_assert_state_consistent+0x68/0x100
[ 750.458672] [ T9870] ? irqentry_exit_to_user_mode+0x2e/0x250
[ 750.459191] [ T9870] ? irqentry_exit+0x43/0x50
[ 750.459600] [ T9870] ? exc_page_fault+0x75/0xe0
[ 750.460130] [ T9870] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 750.460570] [ T9870] RIP: 0033:0x7858c94ab6e2
[ 750.461206] [ T9870] Code: Unable to access opcode bytes at 0x7858c94ab6b8.
[ 750.461780] [ T9870] RSP: 002b:00007858c9248ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000022
[ 750.462327] [ T9870] RAX: fffffffffffffdfe RBX: 00007858c92496c0 RCX: 00007858c94ab6e2
[ 750.462653] [ T9870] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 750.462969] [ T9870] RBP: 00007858c9248d10 R08: 0000000000000000 R09: 0000000000000000
[ 750.463290] [ T9870] R10: 0000000000000000 R11: 0000000000000246 R12: fffffffffffffde0
[ 750.463640] [ T9870] R13: 0000000000000020 R14: 0000000000000002 R15: 00007ffc072d2230
[ 750.463965] [ T9870] </TASK>
[ 750.464285] [ T9870] Modules linked in: siw ib_uverbs ccm cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs softdog vboxsf vboxguest cpuid intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec polyval_clmulni ghash_clmulni_intel sha1_ssse3 aesni_intel rapl i2c_piix4 i2c_smbus joydev input_leds mac_hid sunrpc binfmt_misc kvm_intel kvm irqbypass sch_fq_codel efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs ip_tables x_tables autofs4 hid_generic vboxvideo usbhid drm_vram_helper psmouse vga16fb vgastate drm_ttm_helper serio_raw hid ahci libahci ttm pata_acpi video wmi [last unloaded: vboxguest]
[ 750.467127] [ T9870] CR2: ffff8880110a2000
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Fixes: c45ebd636c32 ("cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
After fabc4ed200f9, server_unresponsive() has an additional condition that decides
whether the client needs to reconnect based on server->lstrp. When the client has
failed to reconnect for some time and aborts the connection, server->lstrp is updated
for the last time. In the following scenario, server->lstrp is too old, which causes the
next command to fail during re-negotiation rather than wait for re-negotiation to finish.
1. mount -t cifs -o username=Everyone,echo_interval=10 //$server_ip/export /mnt
2. ssh $server_ip "echo b > /proc/sysrq-trigger &"
3. ls /mnt
4. sleep 21s
5. ssh $server_ip "service firewalld stop"
6. ls # returns EHOSTDOWN
If the interval between steps 5 and 6 is too small, step 6 may trigger sending a
negotiation request. Before the background cifsd thread tries to receive the negotiation
response from the server in cifs_readv_from_socket(), server_unresponsive() may trigger
cifs_reconnect(), which causes step 6 to fail:
ls thread
----------------
smb2_negotiate
server->tcpStatus = CifsInNegotiate
compound_send_recv
wait_for_compound_request
cifsd thread
----------------
cifs_readv_from_socket
server_unresponsive
server->tcpStatus == CifsInNegotiate && jiffies > server->lstrp + 20s
cifs_reconnect
cifs_abort_connection: mid_state = MID_RETRY_NEEDED
ls thread
----------------
cifs_sync_mid_result return EAGAIN
smb2_negotiate return EHOSTDOWN
Although server->lstrp nominally records the last server response time, it is also
updated in cifs_abort_connection() and cifs_get_tcp_session(). Updating server->lstrp
before switching to the CifsInNegotiate state as well avoids the failure in step 6.
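The timing check can be modelled with a small user-space sketch. Everything here is a simplified assumption: times are plain seconds rather than jiffies, the comparison ignores wraparound, and the demo_* names are hypothetical:

```c
#include <stdbool.h>

enum demo_status { DEMO_GOOD, DEMO_IN_NEGOTIATE };

/* Hypothetical reduced server state; lstrp is in seconds here. */
struct demo_server {
	enum demo_status status;
	unsigned long lstrp;
};

#define DEMO_NEG_TIMEOUT 20UL /* the 20s window from the trace above */

/* Models the negotiate-state check shown in the cifsd thread trace. */
static bool demo_unresponsive(const struct demo_server *s, unsigned long now)
{
	return s->status == DEMO_IN_NEGOTIATE &&
	       now > s->lstrp + DEMO_NEG_TIMEOUT;
}

/* With refresh == true, lstrp is updated before entering negotiation. */
static void demo_start_negotiate(struct demo_server *s, unsigned long now,
				 bool refresh)
{
	if (refresh)
		s->lstrp = now;
	s->status = DEMO_IN_NEGOTIATE;
}
```

With a stale lstrp, demo_unresponsive() fires immediately after negotiation starts. With the refresh, it only fires once the full window has elapsed.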
Fixes: 7ccc1465465d ("smb: client: fix hang in wait_for_response() for negproto")
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Acked-by: Meetakshi Setiya <msetiya@microsoft.com>
Signed-off-by: zhangjian <zhangjian496@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull io_uring fix from Jens Axboe:
"A single fix to hopefully wrap up the saga of receive bundles"
* tag 'io_uring-6.16-20250621' of git://git.kernel.dk/linux:
io_uring/net: always use current transfer count for buffer put
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix a crash in ACPICA while attempting to evaluate a control method
that expects more arguments than are being passed to it, which was
exposed by a defective firmware update from a prominent OEM on
multiple systems (Rafael Wysocki)"
* tag 'acpi-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Refuse to evaluate a method if arguments are missing
|