Age | Commit message (Collapse) | Author |
|
If one of the waveform functions is called for a chip that only supports
.apply(), we want that an error code is returned and not a NULL pointer
exception.
Fixes: 6c5126c6406d ("pwm: Provide new consumer API functions for waveforms")
Cc: stable@vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Trevor Gamblin <tgamblin@baylibre.com>
Link: https://lore.kernel.org/r/20250123172709.391349-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
The Vexia EDU ATLA 10 tablet comes in 2 different versions with
significantly different mainboards. The only outward difference is that
the charging barrel on one is marked 5V and the other is marked 9V.
Both ship with Android 4.4 as factory OS and have the usual broken DSDT
issues for x86 Android tablets.
Add a quirk to skip ACPI I2C client enumeration for the 5V version to
complement the existing quirk for the 9V version.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20250123132202.18209-1-hdegoede@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL
after job completion"), we introduced a change to assign the job pointer
to NULL after completing a job, indicating job completion.
However, this approach created a race condition between the DRM
scheduler workqueue and the IRQ execution thread. As soon as the fence is
signaled in the IRQ execution thread, a new job starts to be executed.
This results in a race condition where the IRQ execution thread sets the
job pointer to NULL simultaneously as the `run_job()` function assigns
a new job to the pointer.
This race condition can lead to a NULL pointer dereference if the IRQ
execution thread sets the job pointer to NULL after `run_job()` assigns
it to the new job. When the new job completes and the GPU emits an
interrupt, `v3d_irq()` is triggered, potentially causing a crash.
[ 466.310099] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
[ 466.318928] Mem abort info:
[ 466.321723] ESR = 0x0000000096000005
[ 466.325479] EC = 0x25: DABT (current EL), IL = 32 bits
[ 466.330807] SET = 0, FnV = 0
[ 466.333864] EA = 0, S1PTW = 0
[ 466.337010] FSC = 0x05: level 1 translation fault
[ 466.341900] Data abort info:
[ 466.344783] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 466.350285] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 466.355350] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 466.360677] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000089772000
[ 466.367140] [00000000000000c0] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 466.375875] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 466.382163] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device algif_hash algif_skcipher af_alg bnep binfmt_misc vc4 snd_soc_hdmi_codec drm_display_helper cec brcmfmac_wcc spidev rpivid_hevc(C) drm_client_lib brcmfmac hci_uart drm_dma_helper pisp_be btbcm brcmutil snd_soc_core aes_ce_blk v4l2_mem2mem bluetooth aes_ce_cipher snd_compress videobuf2_dma_contig ghash_ce cfg80211 gf128mul snd_pcm_dmaengine videobuf2_memops ecdh_generic sha2_ce ecc videobuf2_v4l2 snd_pcm v3d sha256_arm64 rfkill videodev snd_timer sha1_ce libaes gpu_sched snd videobuf2_common sha1_generic drm_shmem_helper mc rp1_pio drm_kms_helper raspberrypi_hwmon spi_bcm2835 gpio_keys i2c_brcmstb rp1 raspberrypi_gpiomem rp1_mailbox rp1_adc nvmem_rmem uio_pdrv_genirq uio i2c_dev drm ledtrig_pattern drm_panel_orientation_quirks backlight fuse dm_mod ip_tables x_tables ipv6
[ 466.458429] CPU: 0 UID: 1000 PID: 2008 Comm: chromium Tainted: G C 6.13.0-v8+ #18
[ 466.467336] Tainted: [C]=CRAP
[ 466.470306] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[ 466.476157] pstate: 404000c9 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 466.483143] pc : v3d_irq+0x118/0x2e0 [v3d]
[ 466.487258] lr : __handle_irq_event_percpu+0x60/0x228
[ 466.492327] sp : ffffffc080003ea0
[ 466.495646] x29: ffffffc080003ea0 x28: ffffff80c0c94200 x27: 0000000000000000
[ 466.502807] x26: ffffffd08dd81d7b x25: ffffff80c0c94200 x24: ffffff8003bdc200
[ 466.509969] x23: 0000000000000001 x22: 00000000000000a7 x21: 0000000000000000
[ 466.517130] x20: ffffff8041bb0000 x19: 0000000000000001 x18: 0000000000000000
[ 466.524291] x17: ffffffafadfb0000 x16: ffffffc080000000 x15: 0000000000000000
[ 466.531452] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 466.538613] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffffd08c527eb0
[ 466.545777] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 466.552941] x5 : ffffffd08c4100d0 x4 : ffffffafadfb0000 x3 : ffffffc080003f70
[ 466.560102] x2 : ffffffc0829e8058 x1 : 0000000000000001 x0 : 0000000000000000
[ 466.567263] Call trace:
[ 466.569711] v3d_irq+0x118/0x2e0 [v3d] (P)
[ 466.573826] __handle_irq_event_percpu+0x60/0x228
[ 466.578546] handle_irq_event+0x54/0xb8
[ 466.582391] handle_fasteoi_irq+0xac/0x240
[ 466.586498] generic_handle_domain_irq+0x34/0x58
[ 466.591128] gic_handle_irq+0x48/0xd8
[ 466.594798] call_on_irq_stack+0x24/0x58
[ 466.598730] do_interrupt_handler+0x88/0x98
[ 466.602923] el0_interrupt+0x44/0xc0
[ 466.606508] __el0_irq_handler_common+0x18/0x28
[ 466.611050] el0t_64_irq_handler+0x10/0x20
[ 466.615156] el0t_64_irq+0x198/0x1a0
[ 466.618740] Code: 52800035 3607faf3 f9442e80 52800021 (f9406018)
[ 466.624853] ---[ end trace 0000000000000000 ]---
[ 466.629483] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 466.636384] SMP: stopping secondary CPUs
[ 466.640320] Kernel Offset: 0x100c400000 from 0xffffffc080000000
[ 466.646259] PHYS_OFFSET: 0x0
[ 466.649141] CPU features: 0x100,00000170,00901250,0200720b
[ 466.654644] Memory Limit: none
[ 466.657706] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
Fix the crash by assigning the job pointer to NULL before signaling the
fence. This ensures that the job pointer is cleared before any new job
starts execution, preventing the race condition and the NULL pointer
dereference crash.
Cc: stable@vger.kernel.org
Fixes: e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Phil Elwell <phil@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250123012403.20447-1-mcanal@igalia.com
|
|
- Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM ACPI
hotplug driver (Thomas Weißschuh)
- Update PCI_EXP_LNKCAP_SLS comment (Lukas Wunner)
- Drop superfluous pm_wakeup.h include (Wolfram Sang)
- Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong Zhang)
- Correct documentation of the 'config_acs=' kernel parameter (Akihiko
Odaki)
* pci/misc:
Documentation: Fix pci=config_acs= example
PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT
PCI: Don't include 'pm_wakeup.h' directly
PCI: Update code comment on PCI_EXP_LNKCAP_SLS for PCIe r3.0
PCI/ACPI: Constify 'struct bin_attribute'
PCI/P2PDMA: Constify 'struct bin_attribute'
PCI/VPD: Constify 'struct bin_attribute'
PCI/sysfs: Constify 'struct bin_attribute'
|
|
- Add DT binding and driver support for Xilinx Versal CPM5 (Thippeswamy
Havalige)
* pci/controller/xilinx-cpm:
PCI: xilinx-cpm: Add support for Versal CPM5 Root Port Controller 1
dt-bindings: PCI: xilinx-cpm: Add compatible string for CPM5 host1
|
|
- Add struct rockchip_pcie_ep kernel-doc to fix warnings (Damien Le Moal)
- Simplify clock and reset handling by using bulk interfaces (Anand Moon)
- Pass typed rockchip_pcie (not void) pointer to
rockchip_pcie_disable_clocks() (Anand Moon)
- Return -ENOMEM, not success, when pci_epc_mem_alloc_addr() fails (Dan
Carpenter)
* pci/controller/rockchip:
PCI: rockchip-ep: Fix error code in rockchip_pcie_ep_init_ob_mem()
PCI: rockchip: Refactor rockchip_pcie_disable_clocks() signature
PCI: rockchip: Simplify reset control handling by using reset_control_bulk*() function
PCI: rockchip: Simplify clock handling by using clk_bulk*() functions
PCI: rockchip: Add missing fields descriptions for struct rockchip_pcie_ep
|
|
- Avoid passing stack buffer as resource name (King Dix)
* pci/controller/rcar-ep:
PCI: rcar-ep: Fix incorrect variable used when calling devm_request_mem_region()
|
|
- Add MODULE_DEVICE_TABLE() to enable module autoloading (Liao Chen)
* pci/controller/mvebu:
PCI: mvebu: Enable module autoloading
|
|
- Set up the inbound address translation based on whether the platform
allows coherent or non-coherent DMA (Daire McNamara)
- Update DT binding such that platforms are DMA-coherent by default and
must specify 'dma-noncoherent' if needed (Conor Dooley)
* pci/controller/microchip:
dt-bindings: PCI: microchip,pcie-host: Allow dma-noncoherent
PCI: microchip: Set inbound address translation for coherent or non-coherent mode
|
|
- Use clk_bulk_prepare_enable() instead of separate clk_bulk_prepare() and
clk_bulk_enable() (Lorenzo Bianconi)
- Rearrange reset assert/deassert so they're both done in the *_power_up()
callbacks (Lorenzo Bianconi)
- Document that Airoha EN7581 requires PHY init and power-on before PHY
reset deassert, unlike other MediaTek Gen3 controllers (Lorenzo Bianconi)
- Move Airoha EN7581 post-reset delay from the en7581 clock .enable()
method to mtk_pcie_en7581_power_up() (Lorenzo Bianconi)
- Sleep instead of delay during Airoha EN7581 power-up, since this is a
non-atomic context (Lorenzo Bianconi)
- Skip PERST# assertion on Airoha EN7581 during probe and suspend/resume to
avoid a hardware defect (Lorenzo Bianconi)
- Enable async probe to reduce system startup time (Douglas Anderson)
* pci/controller/mediatek:
PCI: mediatek-gen3: Enable async probe by default
PCI: mediatek-gen3: Avoid PCIe resetting via PERST# for Airoha EN7581 SoC
PCI: mediatek-gen3: Rely on msleep() in mtk_pcie_en7581_power_up()
PCI: mediatek-gen3: Move reset delay in mtk_pcie_en7581_power_up()
PCI: mediatek-gen3: Add comment about initialization order in mtk_pcie_en7581_power_up()
PCI: mediatek-gen3: Move reset/assert callbacks in .power_up()
PCI: mediatek-gen3: Rely on clk_bulk_prepare_enable() in mtk_pcie_en7581_power_up()
|
|
- Simplify by using syscon_regmap_lookup_by_phandle_args() instead of
syscon_regmap_lookup_by_phandle() followed by
of_property_read_u32_array() (Krzysztof Kozlowski)
* pci/controller/layerscape:
PCI: layerscape: Use syscon_regmap_lookup_by_phandle_args
|
|
- Add DT compatible string 'fsl,imx8q-pcie-ep' and driver support for
i.MX8Q series (i.MX8QM, i.MX8QXP, and i.MX8DXL) Endpoints (Frank Li)
- Add DT binding for optional i.MX95 Refclk and driver support to enable it
if the platform hasn't enabled it (Richard Zhu)
- Configure PHY based on controller being in Root Complex or Endpoint mode
(Frank Li)
- Rely on dbi2 and iATU base addresses from DT via dw_pcie_get_resources()
instead of hardcoding them in imx6 (Richard Zhu)
- Skip controller_id computation for i.MX7D since it only has one
controller (Richard Zhu)
- Deassert apps_reset in imx_pcie_deassert_core_reset() since it is
asserted in imx_pcie_assert_core_reset() (Richard Zhu)
- Add missing reference clock enable or disable logic for IMX6SX, IMX7D,
IMX8MM (Richard Zhu)
- Remove redundant imx7d_pcie_init_phy() since imx7d_pcie_enable_ref_clk()
does the same thing (Richard Zhu)
* pci/controller/imx6:
PCI: imx6: Clean up comments and whitespace
PCI: imx6: Remove surplus imx7d_pcie_init_phy() function
PCI: imx6: Add missing reference clock disable logic
PCI: imx6: Deassert apps_reset in imx_pcie_deassert_core_reset()
PCI: imx6: Skip controller_id generation logic for i.MX7D
PCI: imx6: Fetch dbi2 and iATU base addesses from DT
PCI: imx6: Configure PHY based on Root Complex or Endpoint mode
PCI: imx6: Add Refclk for i.MX95 PCIe
dt-bindings: PCI: fsl,imx6q-pcie: Add Refclk for i.MX95 RC
PCI: imx6: Add i.MX8Q PCIe Endpoint (EP) support
dt-bindings: PCI: fsl,imx6q-pcie-ep: Add compatible string fsl,imx8q-pcie-ep
# Conflicts:
# drivers/pci/controller/dwc/pci-imx6.c
|
|
- Fix potential string truncation in dw_pcie_edma_irq_verify() (Niklas
Cassel)
- Don't wait for link up in DWC core if driver can detect Link Up event
(Krishna chaitanya chundru)
- If qcom 'global' IRQ is supported for detection of Link Up events, tell
DWC core not to wait for link up (Krishna chaitanya chundru)
- Update ICC and OPP votes after Link Up events (Krishna chaitanya chundru)
- Use dw-rockchip dll_link_up IRQ to detect Link Up and enumerate devices
so users don't have to manually rescan (Niklas Cassel)
- In dw-rockchip, the 'sys' interrupt is required and detects Link Up
events, so tell DWC core not to wait for link up (Niklas Cassel)
- Always stop link in dw_pcie_suspend_noirq(), which is required at least
for i.MX8QM to re-establish link on resume (Richard Zhu)
- Drop racy and unnecessary LTSSM state check before sending PME_TURN_OFF
message in dw_pcie_suspend_noirq() (Richard Zhu)
- Add stubs for dw_pcie_suspend_noirq() dw_pcie_resume_noirq() when
CONFIG_PCIE_DW_HOST is not defined so drivers don't need #ifdefs (Bjorn
Helgaas)
- Use DWC core suspend/resume functions for imx6 (Frank Li)
- Add imx6 suspend/resume support for i.MX8MQ, i.MX8Q, and i.MX95 (Richard
Zhu)
- Add struct of_pci_range.parent_bus_addr for devices that need their
immediate parent bus address, not the CPU address, e.g., to program an
internal Address Translation Unit (iATU) (Frank Li)
* pci/controller/dwc:
PCI: dwc: Simplify config resource lookup
of: address: Add parent_bus_addr to struct of_pci_range
PCI: imx6: Add i.MX8MQ, i.MX8Q and i.MX95 PM support
PCI: imx6: Use DWC common suspend resume method
PCI: dwc: Add dw_pcie_suspend_noirq(), dw_pcie_resume_noirq() stubs for !CONFIG_PCIE_DW_HOST
PCI: dwc: Remove LTSSM state test in dw_pcie_suspend_noirq()
PCI: dwc: Always stop link in the dw_pcie_suspend_noirq
PCI: dw-rockchip: Don't wait for link since we can detect Link Up
PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ
PCI: qcom: Update ICC and OPP values after Link Up event
PCI: qcom: Don't wait for link if we can detect Link Up
PCI: dwc: Don't wait for link up if driver can detect Link Up event
PCI: dwc: Fix potential truncation in dw_pcie_edma_irq_verify()
# Conflicts:
# drivers/pci/controller/dwc/pci-imx6.c
|
|
- Simplify by using syscon_regmap_lookup_by_phandle_args() instead of
syscon_regmap_lookup_by_phandle() followed by
of_parse_phandle_with_fixed_args() or of_property_read_u32_index()
(Krzysztof Kozlowski)
* pci/controller/dra7xx:
PCI: dra7xx: Use syscon_regmap_lookup_by_phandle_args
|
|
- Add host bridge .enable_device() and .disable_device() hooks for bridges
that need to configure things like Requester ID to StreamID mapping when
enabling devices (Frank Li)
- Add imx6 Requester ID to StreamID mapping configuration when enabling
devices (Frank Li)
- Extend struct pci_ecam_ops with .enable_device() and .disable_device()
hooks so drivers that use pci_host_common_probe() instead of their own
.probe() have a way to set the .enable_device() callbacks (Marc Zyngier)
- Convert pcie-apple StreamID mapping configuration from a bus notifier to
the .enable_device() and .disable_device() callbacks (Marc Zyngier)
* pci/controller/iommu-map:
PCI: apple: Convert to {en,dis}able_device() callbacks
PCI: host-generic: Allow {en,dis}able_device() to be provided via pci_ecam_ops
PCI: imx6: Add IOMMU and ITS MSI support for i.MX95
PCI: Add enable_device() and disable_device() callbacks for bridges
|
|
- Convert mobiveil-pcie.txt to YAML and update 'interrupt-names' and
'reg-names' (Frank Li)
- Add qcom DT SM8550 and SM8650 optional 'global' interrupt for link events
(Neil Armstrong)
- Add qcom DT 'compatible' strings for IPQ5424 PCIe controller (Manikanta
Mylavarapu)
* pci/dt-bindings:
dt-bindings: PCI: qcom: Document the IPQ5424 PCIe controller
dt-bindings: PCI: qcom,pcie-sm8550: Document 'global' interrupt
dt-bindings: PCI: mobiveil: Convert mobiveil-pcie.txt to YAML
|
|
- Clear pci-epf-test dma_chan_rx, not dma_chan_tx, after freeing
dma_chan_rx (Mohamed Khalfella)
- Correct the DMA MEMCPY test so it doesn't fail if the Endpoint supports
both DMA_PRIVATE and DMA_MEMCPY (Manivannan Sadhasivam)
- Add pci-epf-test and pci_endpoint_test support for capabilities (Niklas
Cassel)
- Add Endpoint test for consecutive BARs (Niklas Cassel)
- Remove redundant comparison from Endpoint BAR test because a > 1MB BAR
can always be exactly covered by iterating with a 1MB buffer (Hans Zhang)
- Correct the PCI Endpoint test IOCTL return value (Manivannan Sadhasivam)
- Move PCI Endpoint tests from tools/pci to Kselftests (Manivannan
Sadhasivam)
- Convert PCI Endpoint tests to the Kselftest framework (Manivannan
Sadhasivam)
* pci/endpoint-test:
selftests: pci_endpoint: Migrate to Kselftest framework
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
misc: pci_endpoint_test: Fix IOCTL return value
misc: pci_endpoint_test: Remove redundant 'remainder' test
misc: pci_endpoint_test: Add consecutive BAR test
misc: pci_endpoint_test: Add support for capabilities
PCI: endpoint: pci-epf-test: Add support for capabilities
PCI: endpoint: pci-epf-test: Fix check for DMA MEMCPY test
PCI: endpoint: pci-epf-test: Set dma_chan_rx pointer to NULL on error
|
|
- Destroy the EPC device in devm_pci_epc_destroy(), which previously didn't
call devres_release() (Zijun Hu)
- Simplify pci_epc_get() with class_find_device_by_name() (Zijun Hu)
- Finish virtual EP removal in pci_epf_remove_vepf(), which previously
caused a subsequent pci_epf_add_vepf() to fail with -EBUSY (Zijun Hu)
- Write BAR_MASK before iATU registers in pci_epc_set_bar() so we don't
depend on the BAR_MASK reset value being larger than the requested BAR
size (Niklas Cassel)
- Prevent changing BAR size/flags in pci_epc_set_bar() to prevent reads
from bypassing the iATU if we reduced the BAR size (Niklas Cassel)
- Verify address alignment when programming iATU so we don't attempt to
write bits that are read-only because of the BAR size, which could lead
to directing accesses to the wrong address (Niklas Cassel)
- Implement artpec6 pci_epc_features so we can rely on all drivers
supporting it so we can use it in EPC core code (Niklas Cassel)
- Check for BARs of fixed size to prevent endpoint drivers from trying to
change their size (Niklas Cassel)
- Verify that requested BAR size is a power of two when endpoint driver
sets the BAR (Niklas Cassel)
* pci/endpoint:
PCI: endpoint: Verify that requested BAR size is a power of two
PCI: endpoint: Add size check for fixed size BARs in pci_epc_set_bar()
PCI: artpec6: Implement dw_pcie_ep operation get_features
PCI: dwc: ep: Add 'address' alignment to 'size' check in dw_pcie_prog_ep_inbound_atu()
PCI: dwc: ep: Prevent changing BAR size/flags in pci_epc_set_bar()
PCI: dwc: ep: Write BAR_MASK before iATU registers in pci_epc_set_bar()
PCI: endpoint: Finish virtual EP removal in pci_epf_remove_vepf()
PCI: endpoint: Simplify pci_epc_get()
PCI: endpoint: Destroy the EPC device in devm_pci_epc_destroy()
PCI: endpoint: Replace magic number '6' by PCI_STD_NUM_BARS
|
|
- Add Microchip PCI100X device IDs (Rakesh Babu Saladi)
* pci/switchtec:
PCI: switchtec: Add Microchip PCI100X device IDs
|
|
- Avoid D3 for Root Ports on TUXEDO Sirius Gen1 with old BIOS because the
system can't wake up from suspend (Werner Sembach)
* pci/pm:
PCI: Avoid putting some root ports into D3 on TUXEDO Sirius Gen1
|
|
- Move reset related sysfs code from pci.c to pci-sysfs.c where other
similar code lives (Ilpo Järvinen)
- Simplify reset_method_store() memory management by using __free() instead
of explicit kfree() cleanup (Ilpo Järvinen)
- Drop unnecessary zero initializer (Ilpo Järvinen)
* pci/pci-sysfs:
PCI/sysfs: Remove unnecessary zero in initializer
PCI/sysfs: Use __free() in reset_method_store()
PCI/sysfs: Move reset related sysfs code to correct file
|
|
- Unexport of_pci_parse_bus_range() since it's only used in of.c (Bjorn
Helgaas)
- Drop 'No bus range found' message so we don't complain when DTs don't
specify the default 'bus-range = <0x00 0xff>' (Bjorn Helgaas)
- Simplify devm_of_pci_get_host_bridge_resources() interface by dropping
parameters that are always the same default values (Bjorn Helgaas)
- Update comment reference to of_pci_get_host_bridge_resources(), which no
longer exists (Bjorn Helgaas)
- Rename the drivers/pci/of_property.c struct of_pci_range to
of_pci_range_entry to avoid confusion with the global of_pci_range in
include/linux/of_address.h (Bjorn Helgaas)
* pci/of:
PCI: of_property: Rename struct of_pci_range to of_pci_range_entry
sparc/PCI: Update reference to devm_of_pci_get_host_bridge_resources()
PCI: of: Simplify devm_of_pci_get_host_bridge_resources() interface
PCI: of: Drop 'No bus range found' message
PCI: Unexport of_pci_parse_bus_range()
|
|
- Unexport pcie_read_tlp_log() to encourage drivers to use PCI core logging
rather than building their own (Ilpo Järvinen)
- Move TLP Log handling to its own file (Ilpo Järvinen)
- Add #defines for TLP Header/Prefix log sizes (Ilpo Järvinen)
- Store number of supported End-End TLP Prefixes always so we can read the
correct number of DWORDs from the TLP Prefix Log (Ilpo Järvinen)
- Read TLP Prefixes in addition to the Header Log in pcie_read_tlp_log()
(Ilpo Järvinen)
- Add pcie_print_tlp_log() to consolidate printing of TLP Header and Prefix
Log (Ilpo Järvinen)
* pci/err:
PCI: Add pcie_print_tlp_log() to print TLP Header and Prefix Log
PCI: Add TLP Prefix reading to pcie_read_tlp_log()
PCI: Store number of supported End-End TLP Prefixes
PCI: Use unsigned int i in pcie_read_tlp_log()
PCI: Use same names in pcie_read_tlp_log() prototype and definition
PCI: Add defines for TLP Header/Prefix log sizes
PCI: Move TLP Log handling to its own file
PCI: Don't expose pcie_read_tlp_log() outside PCI subsystem
|
|
- Batch sizing of multiple BARs while memory decoding is disabled instead
of disabling/enabling decoding for each BAR individually; this optimizes
virtualized environments where toggling decoding enable is expensive
(Alex Williamson)
* pci/enumeration:
PCI: Batch BAR sizing operations
|
|
- Quirk the Intel Raptor Lake-P PIO log size to accommodate vendor BIOSes
that don't configure it correctly (Takashi Iwai)
* pci/dpc:
PCI/DPC: Quirk PIO log size for Intel Raptor Lake-P
|
|
- Update resource request API documentation to encourage callers to supply
a driver name when requesting resources (Philipp Stanner)
- Export pci_intx_unmanaged() and pcim_intx() (always managed) so callers
of pci_intx() (which is sometimes managed) can explicitly choose the one
they need (Philipp Stanner)
- Convert drivers from pci_intx() to always-managed pcim_intx() or
never-managed pci_intx_unmanaged(): amd_sfh, ata (ahci, ata_piix,
pata_rdc, sata_sil24, sata_sis, sata_uli, sata_vsc), bnx2x, bna, ntb,
qtnfmac, rtsx, tifm_7xx1, vfio, xen-pciback (Philipp Stanner)
- Remove pci_intx_unmanaged() since pci_intx() is now always unmanaged and
pcim_intx() is always managed (Philipp Stanner)
* pci/devres:
PCI: Remove devres from pci_intx()
net/ethernet: Use never-managed version of pci_intx()
HID: amd_sfh: Use always-managed version of pcim_intx()
wifi: qtnfmac: use always-managed version of pcim_intx()
ata: Use always-managed version of pci_intx()
PCI/MSI: Use never-managed version of pci_intx()
vfio/pci: Use never-managed version of pci_intx()
misc: Use never-managed version of pci_intx()
ntb: Use never-managed version of pci_intx()
drivers/xen: Use never-managed version of pci_intx()
PCI: Export pci_intx_unmanaged() and pcim_intx()
PCI: Encourage resource request API users to supply driver name
|
|
- Save parent L1 PM Substates config so when we restore it along with an
endpoint's config, the parent info isn't junk (Jian-Hong Pan)
* pci/aspm:
PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()
|
|
Just allow passing in NULL for the cache, if the type in question
doesn't have a cache associated with it.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
init_once is called when an object doesn't come from the cache, and
hence needs initial clearing of certain members. While the whole
struct could get cleared by memset() in that case, a few of the cache
members are large enough that this may cause unnecessary overhead if
the caches used aren't large enough to satisfy the workload. For those
cases, some churn of kmalloc+kfree is to be expected.
Ensure that the 3 users that need clearing put the members they need
cleared at the start of the struct, and wrap the rest of the struct in
a struct group so the offset is known.
While at it, improve the interaction with KASAN such that when/if
KASAN writes to members inside the struct that should be retained over
caching, it won't trip over itself. For rw and net, the retaining of
the iovec over caching is disabled if KASAN is enabled. A helper will
free and clear those members in that case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A few spots in uring_cmd assume that the SQEs copied are always at the
start of the structure, and hence mix req->async_data and the struct
itself.
Clean that up and use the proper indices.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring_cmd_sock() does a normal read of cmd->sqe->cmd_op, where it
really should be using a READ_ONCE() as ->sqe may still be pointing to
the original SQE. Since the prep side already does this READ_ONCE() and
stores it locally, use that value rather than re-read it.
Fixes: 8e9fad0e70b7b ("io_uring: Add io_uring command support for sockets")
Link: https://lore.kernel.org/r/20250121-uring-sockcmd-fix-v1-1-add742802a29@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Merge series from Daniel Baluta <daniel.baluta@nxp.com>:
We introduce SOF support for new board revisions for i.MX8MP/QM/QXP
which wrt audio they replace wm8960 codec with wm8962.
Also add support for cs42888 codec on i.MX8QM/8QXP baseboard.
|
|
Implement a simple TAP-based test engine in bash and a few basic tests
using it, to be used to check for bugs and regressions.
A new "check" target is added to the rtla Makefile that runs the test suite
using the "prove" command implemented by Test::Harness.
The only test format currently supported is running rtla with defined
command arguments per test, checking its exit code. In case the exit
code is non-zero, the output of rtla is displayed, together with the
exit code.
The test cases are adopted from rtla tests in the Continuous Kernel
Integration (CKI) project [1] with the authors' approval.
[1] https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/blob/main/rt-tests/us/rtla/
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Chang Yin <cyin@redhat.com>
Cc: Qiao Zhao <qzhao@redhat.com>
Link: https://lore.kernel.org/20250120135630.802111-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
RV per-task monitors are implemented through a monitor structure
available for each task_struct. This structure is reset every time the
monitor is (re-)started, to avoid inconsistencies if the monitor was
activated previously.
To do so, we reset the monitor on all threads using the macro
for_each_process_thread. However, this macro excludes the idle tasks on
each CPU. Idle tasks could be considered tasks on their own right and it
should be up to the model whether to ignore them or not.
Reset monitors also on the idle tasks for each present CPU whenever we
reset all per-task monitors.
Cc: stable@vger.kernel.org
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20250115151547.605750-2-gmonaco@redhat.com
Fixes: 792575348ff7 ("rv/include: Add deterministic automata monitor definition via C macros")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Toggling memory enable is free on bare metal, but potentially expensive
in virtualized environments as the device MMIO spaces are added and
removed from the VM address space, including DMA mapping of those spaces
through the IOMMU where peer-to-peer is supported. Currently memory
decode is disabled around sizing each individual BAR, even for SR-IOV
BARs while VF Enable is cleared.
This can be better optimized for virtual environments by sizing a set
of BARs at once, stashing the resulting mask into an array, while only
toggling memory enable once. This also naturally improves the SR-IOV
path as the caller becomes responsible for any necessary decode disables
while sizing BARs, therefore SR-IOV BARs are sized relying only on the
VF Enable rather than toggling the PF memory enable in the command
register.
Link: https://lore.kernel.org/r/20250120182202.1878581-1-alex.williamson@redhat.com
Reported-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Link: https://lore.kernel.org/r/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Reviewed-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL
after job completion"), but this commit is not yet in next-fixes,
fast-forward it.
Try #2, first one didn't have v6.13 in it.
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc into soc/dt
ASPEED device tree updates for 6.14
- New machines
* IBM SPB1 AST2600 BMC machine, a Sapphire Rapids x86 server
* Ampre Mt Jefferson AST2600 BMC
- Updates to mtmitchell, rainier, blueridge, everest, fuji, system1,
harma, catalina and minerva
- A big rework of yosemite4
* tag 'aspeed-6.14-devicetree' of https://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc: (55 commits)
ARM: dts: aspeed: yosemite4: adjust secondary flash name
ARM: dts: aspeed: system1: Use crps PSU driver
ARM: dts: aspeed: minerva: add second source RTC
ARM: dts: aspeed: minerva: add bmc ready led setting
ARM: dts: aspeed: minerva: add i/o expanders on each FCB
ARM: dts: aspeed: minerva: add i/o expanders on bus 0
ARM: dts: aspeed: catalina: remove interrupt of GPIOB4 form all IOEXP
ARM: dts: aspeed: catalina: revise ltc4287 shunt-resistor value
arm: dts: aspeed: Blueridge and Rainer: Add VRM presence GPIOs
ARM: dts: aspeed: Blueridge and Fuji: Fix LED node names
arm: dts: aspeed: Everest and Fuji: Add VRM presence gpio expander
ARM: dts: aspeed: sbp1: IBM sbp1 BMC board
dt-bindings: arm: aspeed: add IBM SBP1 board
ARM: dts: aspeed: Add device tree for Ampere's Mt. Jefferson BMC
dt-bindings: arm: aspeed: add Mt. Jefferson board
ARM: dts: aspeed: yosemite4: Add i2c-mux for ADC monitor on Spider Board
ARM: dts: aspeed: yosemite4: Revise adc128d818 adc mode on Fan Boards
ARM: dts: aspeed: yosemite4: Change the address of Fan IC on fan boards
ARM: dts: aspeed: yosemite4: Revise address of i2c-mux for two fan boards
ARM: dts: aspeed: yosemite4: correct the compatible string for max31790
...
Link: https://lore.kernel.org/r/CACPK8Xe8yZLXzEQPp=1D2f0TsKA7hBZG=pHHW6U51FMpp_BiRQ@mail.gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
into soc/newsoc
RISC-V SpacemiT DT changes for 6.14
This adds support for the SpacemiT K1 SoC and the Banana Pi F3
board using it:
- basic device tree support
- pinctrl dt node info
- update MAINTAINERS info
* tag 'spacemit-dt-for-6.14-1' of https://github.com/spacemit-com/linux:
riscv: dts: spacemit: move aliases to board dts
riscv: dts: spacemit: add pinctrl property to uart0 in BPI-F3
riscv: defconfig: enable SpacemiT SoC
riscv: dts: spacemit: add Banana Pi BPI-F3 board device tree
riscv: dts: add initial SpacemiT K1 SoC device tree
riscv: add SpacemiT SoC family Kconfig support
dt-bindings: serial: 8250: Add SpacemiT K1 uart compatible
dt-bindings: interrupt-controller: Add SpacemiT K1 PLIC
dt-bindings: timer: Add SpacemiT K1 CLINT
dt-bindings: riscv: add SpacemiT K1 bindings
dt-bindings: riscv: Add SpacemiT X60 compatibles
MAINTAINERS: setup support for SpacemiT SoC tree
Link: https://wiki.banana-pi.org/Banana_Pi_BPI-F3
Link: https://www.spacemit.com/en/key-stone-k1/
Link: https://lore.kernel.org/r/20250117004911-GYA25021@gentoo
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"A smaller than usual release cycle.
The main changes are:
- Prepare selftest to run with GCC-BPF backend (Ihor Solodrai)
In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile
only mode. Half of the tests are failing, since support for
btf_decl_tag is still WIP, but this is a great milestone.
- Convert various samples/bpf to selftests/bpf/test_progs format
(Alexis Lothoré and Bastien Curutchet)
- Teach verifier to recognize that array lookup with constant
in-range index will always succeed (Daniel Xu)
- Cleanup migrate disable scope in BPF maps (Hou Tao)
- Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao)
- Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin
KaFai Lau)
- Refactor verifier lock support (Kumar Kartikeya Dwivedi)
This is a prerequisite for upcoming resilient spin lock.
- Remove excessive 'may_goto +0' instructions in the verifier that
LLVM leaves when unrolls the loops (Yonghong Song)
- Remove unhelpful bpf_probe_write_user() warning message (Marco
Elver)
- Add fd_array_cnt attribute for prog_load command (Anton Protopopov)
This is a prerequisite for upcoming support for static_branch"
* tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits)
selftests/bpf: Add some tests related to 'may_goto 0' insns
bpf: Remove 'may_goto 0' instruction in opt_remove_nops()
bpf: Allow 'may_goto 0' instruction in verifier
selftests/bpf: Add test case for the freeing of bpf_timer
bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
bpf: Bail out early in __htab_map_lookup_and_delete_elem()
bpf: Free special fields after unlock in htab_lru_map_delete_node()
tools: Sync if_xdp.h uapi tooling header
libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
bpf: selftests: verifier: Add nullness elision tests
bpf: verifier: Support eliding map lookup nullness
bpf: verifier: Refactor helper access type tracking
bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
bpf: verifier: Add missing newline on verbose() call
selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED
libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
libbpf: Fix return zero when elf_begin failed
selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test
veristat: Load struct_ops programs only once
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities updates from Serge Hallyn:
- remove the cap_mmap_file() hook, as it simply returned the default
return value and so doesn't need to exist (Paul Moore)
- add a trace event for cap_capable() (Jordan Rome)
* tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
security: add trace event for cap_capable
capabilities: remove cap_mmap_file()
|
|
The Vexia EDU ATLA 10 tablet comes in 2 different versions with
significantly different mainboards. The only outward difference is that
the charging barrel on one is marked 5V and the other is marked 9V.
The 5V version mostly works with the BYTCR defaults, except that it is
missing a CHAN package in its ACPI tables and the default of using
SSP0-AIF2 is wrong, instead SSP0-AIF1 must be used. That and its jack
detect signal is not inverted as it usually is.
Add a DMI quirk for the 5V version to fix sound not working.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20250123132507.18434-1-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
In mchp_core_pwm_apply_locked(), if hw_period_steps is equal to its max,
an error is reported and .apply fails. The max value is actually a
permitted value however, and so this check can fail where multiple
channels are enabled.
For example, the first channel to be configured requests a period that
sets hw_period_steps to the maximum value, and when a second channel
is enabled the driver reads hw_period_steps back from the hardware and
finds it to be the maximum possible value, triggering the warning on a
permitted value. The value to be avoided is 255 (PERIOD_STEPS_MAX + 1),
as that will produce undesired behaviour, so test for greater than,
rather than equal to.
Fixes: 2bf7ecf7b4ff ("pwm: add microchip soft ip corePWM driver")
Cc: stable@vger.kernel.org
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250122-pastor-fancied-0b993da2d2d2@spud
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
syzbot found that calling mr_mfc_uses_dev() for unres entries
would crash [1], because c->mfc_un.res.minvif / c->mfc_un.res.maxvif
alias to "struct sk_buff_head unresolved", which contain two pointers.
This code never worked, lets remove it.
[1]
Unable to handle kernel paging request at virtual address ffff5fff2d536613
KASAN: maybe wild-memory-access in range [0xfffefff96a9b3098-0xfffefff96a9b309f]
Modules linked in:
CPU: 1 UID: 0 PID: 7321 Comm: syz.0.16 Not tainted 6.13.0-rc7-syzkaller-g1950a0af2d55 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mr_mfc_uses_dev net/ipv4/ipmr_base.c:290 [inline]
pc : mr_table_dump+0x5a4/0x8b0 net/ipv4/ipmr_base.c:334
lr : mr_mfc_uses_dev net/ipv4/ipmr_base.c:289 [inline]
lr : mr_table_dump+0x694/0x8b0 net/ipv4/ipmr_base.c:334
Call trace:
mr_mfc_uses_dev net/ipv4/ipmr_base.c:290 [inline] (P)
mr_table_dump+0x5a4/0x8b0 net/ipv4/ipmr_base.c:334 (P)
mr_rtm_dumproute+0x254/0x454 net/ipv4/ipmr_base.c:382
ipmr_rtm_dumproute+0x248/0x4b4 net/ipv4/ipmr.c:2648
rtnl_dump_all+0x2e4/0x4e8 net/core/rtnetlink.c:4327
rtnl_dumpit+0x98/0x1d0 net/core/rtnetlink.c:6791
netlink_dump+0x4f0/0xbc0 net/netlink/af_netlink.c:2317
netlink_recvmsg+0x56c/0xe64 net/netlink/af_netlink.c:1973
sock_recvmsg_nosec net/socket.c:1033 [inline]
sock_recvmsg net/socket.c:1055 [inline]
sock_read_iter+0x2d8/0x40c net/socket.c:1125
new_sync_read fs/read_write.c:484 [inline]
vfs_read+0x740/0x970 fs/read_write.c:565
ksys_read+0x15c/0x26c fs/read_write.c:708
Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps")
Reported-by: syzbot+5cfae50c0e5f2c500013@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/678fe2d1.050a0220.15cac.00b3.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250121181241.841212-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recent change to add more cases to XFAIL has a broken regex,
the matching needs a real regex not a glob pattern.
While at it add the cases Willem pointed out during review.
Fixes: 3030e3d57ba8 ("selftests/net: packetdrill: make tcp buf limited timing tests benign")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250121143423.215261-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When port is stopped, unlock before returning
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250121005002.3938236-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since upstream commit 8bd76b3d3f3a ("gpio: sim: lock up configfs that an
instantiated device depends on"), rmdir for an active virtual devices
been prohibited.
Update gpio-sim selftest to align with the change.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202501221006.a1ca5dfa-lkp@intel.com
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Link: https://lore.kernel.org/r/20250122043309.304621-1-koichiro.den@canonical.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job
pointer is set to NULL after job completion"), but this commit is not
yet in next-fixes, fast-forward it.
Note that this recreates Linus merge in 96c84703f1cf ("Merge tag
'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel")
because I didn't want to backmerge a random point in the merge window.
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
|
|
The fallback code in blk_mq_map_hw_queues is original from
blk_mq_pci_map_queues and was added to handle the case where
pci_irq_get_affinity will return NULL for !SMP configuration.
blk_mq_map_hw_queues replaces besides blk_mq_pci_map_queues also
blk_mq_virtio_map_queues which used to use blk_mq_map_queues for the
fallback.
It's possible to use blk_mq_map_queues for both cases though.
blk_mq_map_queues creates the same map as blk_mq_clear_mq_map for !SMP
that is CPU 0 will be mapped to hctx 0.
The WARN_ON_ONCE has to be dropped for virtio as the fallback is also
taken for certain configuration on default. Though there is still a
WARN_ON_ONCE check in lib/group_cpus.c:
WARN_ON(nr_present + nr_others < numgrps);
which will trigger if the caller tries to create more hardware queues
than CPUs. It tests the same as the WARN_ON_ONCE in
blk_mq_pci_map_queues did.
Fixes: a5665c3d150c ("virtio: blk/scsi: replace blk_mq_virtio_map_queues with blk_mq_map_hw_queues")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Closes: https://lore.kernel.org/all/20250122093020.6e8a4e5b@gandalf.local.home/
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Link: https://lore.kernel.org/r/20250123-fix-blk_mq_map_hw_queues-v1-1-08dbd01f2c39@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Two mistakes were made back at the time when the teo governor was
rearranged to avoid calling tick_nohz_get_sleep_length() every time it
was looking for a CPU idle state. One of them was to make it always
skip calling tick_nohz_get_sleep_length() when idle state 0 was about
to be selected and the other one was to make skipping the invocation of
tick_nohz_get_sleep_length() in other cases conditional on whether or
not idle state 0 was a "polling" state (that is, a special sequence of
instructions continuously executed by the CPU in a loop).
One problem overlooked at that time was that, even though teo might not
need to call tick_nohz_get_sleep_length() in order to select an idle
state in some cases, it would still need to know its return value to be
able to tell whether or not the subsequent CPU wakeup would be caused by
a timer. Without that information, it has to count the wakeup as a non-
timer one, so if there were many timer wakeups between idle state 0 and
the next enabled idle state, they would all be counted as non-timer
wakeups promoting the selection of idle state 0 in the future. However,
in the absence of an imminent timer, it might have been better to select
the next enabled idle state instead of it.
There was an attempt to address this issue by making teo always call
tick_nohz_get_sleep_length() when idle state 0 was about to be selected
again, but that did not really play well with the systems where idle
state 1 was very shallow and it should have been preferred to idle state
0 (which was a "polling" state), which wanted to reduce the governor
latency as much as possible in the cases when idle state 0 was selected
and clearly calling tick_nohz_get_sleep_length() in those cases did not
align with that goal.
Another problem is that making the governor behavior conditional on any
particular propoerties of idle state 0 is confusing at best and it
should not have been done at all. It was based on the assumtion that
idle state 0 would be a "polling" one only if idle state 1 would be very
close to it, so idle state 0 would only be selected in the cases when
idle duration tends to be extemely short, but that assumption turned out
to be invalid. There are platforms in which idle state 0 is a "polling"
state and the target residency of the next enabled idle state is orders
of magnitude larger than the target residency of idle state 0.
To address the first problem described above, modify teo to count
wakeups occurring after a very small idle duration and use that metric
for avoiding tick_nohz_get_sleep_length() invocations. Namely, if idle
duration is sufficiently small in the majority of cases taken into
account and the idle state about to be selected is shallow enough, it
should be safe to skip calling tick_nohz_get_sleep_length() and count
the subsequent wakeup as non-timer one. This can also be done if the
idle state exit residency constraint value is small enough. Otherwise,
it is necessary to call tick_nohz_get_sleep_length() to avoid
classifying timer wakeups inaccurately, which may adversely affect
future governor decisions.
Doing the above also helps to address the second problem because it can
be done regardless of the properties of idle state 0.
Some preliminary cleanups and preparatory changes are made in addition.
* cpuidle-teo:
cpuidle: teo: Skip sleep length computation for low latency constraints
cpuidle: teo: Replace time_span_ns with a flag
cpuidle: teo: Simplify handling of total events count
cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
cpuidle: teo: Simplify counting events used for tick management
cpuidle: teo: Clarify two code comments
cpuidle: teo: Drop local variable prev_intercept_idx
cpuidle: teo: Combine candidate state index checks against 0
cpuidle: teo: Reorder candidate state index checks
cpuidle: teo: Rearrange idle state lookup code
|
|
blkdev_read_iter() has a few odd checks, like gating the position and
count adjustment on whether or not the result is bigger-than-or-equal to
zero (where bigger than makes more sense), and not checking the return
value of blkdev_direct_IO() before doing an iov_iter_revert(). The
latter can lead to attempting to revert with a negative value, which
when passed to iov_iter_revert() as an unsigned value will lead to
throwing a WARN_ON() because unroll is bigger than MAX_RW_COUNT.
Be sane and don't revert for -EIOCBQUEUED, like what is done in other
spots.
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|