Age | Commit message (Collapse) | Author |
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt
Qualcomm Arm64 DeviceTree updates for v6.11
This introduces 11 new boards, namely:
* ASUS Vivobook S 15
* Lenovo Smart Tab M10 DTS
* Motorola Moto E 2015 LTE (surnia)
* Motorola Moto G 2015 (osprey)
* Motorola Moto G4 Play (harpia)
* Qualcomm AIM300 AIoT development board
* Qualcomm SM8650 Hardware Development Kit (HDK)
* SHIFTphone 8
* Samsung Galaxy Z Fold5
* Schneider HMIBSC board DTS
* TP-Link Archer AX55 v1
Of particular interest here is the Asus Vivobook, the first supported X1
Elite consumer laptop.
For IPQ6018 an SDHCI controller is added and on IPQ9574 an MDIO bus is
described.
The improvements to MSM8916-based devices continues, with sound and
mdoem support added to Acer Iconia Talk S and GPLUS FL8005A, the latter
also gaining BMS support. Samsung Galaxy devices gains PMIC and charger
definitions, NFC support and MUIC. Accelerometer and magnetometer
support is added to the Samsung Galaxy Grand Prime devices.
On MSM8976 definitions for IOMMU, the display subsystem, wifi subsystem,
and Adreno GPU are added.
On MSM8996 UFS core clock frequencies are specified, FastRPC nodes are
added for the audio DSP, glink-edges are described where available, the
display subsystem reset is added.
Venus is introduced on MSM8998 and the "No MSA Ready" quirk is added to
allow ath10k to come up.
GPU support is added to QCM2290 and enabled on the RB1 development
board.
The I2C controller used for communicating with the LT9611UXC HDMI
bridge is temporarily replaced with i2c-gpio while issues with the
builtin controller is diagnosed. The same is done for RB2, on the
QRB4210 platform.
On RB2 TCPM max current draw is corrected and the vreg_l9a regulator is
marked as always on to match expectations.
On the QDU1000 platform, USB is added, secure QFPROM is introduced to
allow LLCC to access OTP data. USB is enabled on the two IDP boards.
SA8775p gains PCIe endpoint definitions, LLCCC support, IMEM and PIL
info regions. Nodes are marked as dma-coherent as needed, a dedicated
carveout for shared memory bridge allocations is introduced.
The SA8775P ride device is split in the two versions r2 and r3.
The SC7180 Trogdor clamshell/detachable fragments are refactored for
convenience, and pwmleds are disabled where unused.
On SC7280 the APR nodes for interfacing with the audio services in audio
DSP firmware are introduced. The Qualcomm SMMU TBUs are described, to
enable improved debug support. QoS clocks are added to interconnects, as
needed in order to operate the QoS settings on some buses.
SuperSpeed in park is disabled for the primary DWC3 instance to address
host controller issues under load.
The PM8008 (camera PMIC) is introduced in Fairphone 5, regulators are
named for better output, and firmware name for IPA is adjusted to the
preferred file format.
The HDMI bridge on Rb3gen2 is described, rtc, gpi-dma and qup nodes are
enabled.
The Type-C port manager found in PM7250b is enabled, for targets not
using pmic-glink firmware for Type-C management.
SC8180X gets a number of smaller corrections, and some cleanups -
related to both functional issues and DeviceTree validation.
The PSHOLD node is marked reserved, after reports that this causes
issues during shutdown. Description of the USB signals are updated to
match the signal path. The PM8008 camera PMIC is added to Lenovo
ThinkPad X13s.
The PM660 PMIC is extended with charger and rradc definitions, and the
SDM670 gains a SMEM region definition.
On SDM845 the Qualcomm SMMU TBU nodes are described, to enable improved
debug output during faults etc. The UFS PHY is associated with its GDSC,
and the DisplayPort controller is wired up to the QMP PHY.
The Lenovo Yoga C630 Embedded Controller is introduced, adding battery
and Type-C port management and altmode support. The C630 also gains WiFI
calibration variant information, to cause selection of the right data.
The missing IPA firmware path is corrected.
For the SDX75 platform, AOSS, IPCC, SDHCI, TCSR, modem SMP2P, I2C and
SPI nodes are introduced. SD-card support is added to the IDP board.
CPUfreq support is introduced for the SM4450 platform.
Missing reset is added to the SDHC controller of SM6115. The UFS PHY
is associated with its GDSC, so is the PHY on SM6350.
On Fairphone 4, the camera pmic (PM8008) is introduced, regulators are
named for more informative debug output, and USB role switching is
enabled.
On the Fairphone 3, vibrator support is added and enabled.
On SM8250, the USB signal paths are properly described in the OF graph,
the UFS PHY gains its required power-domains description.
Thanks to the introduction of PCI power sequence support, the QRB5165
RB5 WiFi chip can now be powered up, so this is added.
Touchscreen interrupt flags are corrected accross a number of Sony
Xperia devices, to remove the unexpected traces from downstream.
On SM8450 an OPP-table is introduced for the PCIe controllers, to
specify the bandwidth and performance state requirements for the
different genrations and link widths. For this the PCIe controllers also
gains interconnect path definitions. The LLCC register layout is
corrected, and the UFS PHY is associated with its GDSC.
On the SM8550 development boards speaker port mapping is added. WiFi
support is finally enabled on the QRD board.
The new AIM300 development platform/board is introduced.
For SM8650 video and camera clock controller are introduced. SCM node
gains details necessary to trigger USB ramdump (download mode) upon a
system crash.
WiFi support and speaker port mapping is added to the QRD and the newly
introduced HDK. On the MTP the USB Type-C connector is describe to be
routed to the PHY.
In addition to the base HDK, a Display Card overlay is also introduced.
For X1 Elite bwmon, fastrpc and GPU support, tsens, and the missing PCIe
6a instance are added. Thermal zones are described. Pmic-glink is
introduced for both CRD and QCP devices, and remaining PMICs are
described. Audio support is also added to the QCP.
An explicit, larger, chunk of CMA memory is added to the various
devices, in order to compensate for the lack of IOMMU for PCIe.
Across a wide range of platforms, the thermal zone polling delays are
removed as supplies are interrupt driven anyways. Also thermal related
is the introduction of GPU thermal throttling, across many SoCs.
The old SMSM implementation is finally transitioned to using the
mailbox-based description and implementation for invoking interrupts on
remote processors. As such interrupt-triggering is converted to use this
mechanism on related platforms.
The usb-role-switch property is removed for all USB instances hard coded
to either host or peripheral across a range of boards.
* tag 'qcom-arm64-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (279 commits)
dt-bindings: arm: qcom: Document samsung,ms013g
arm64: dts: qcom: Add device tree for ASUS Vivobook S 15
dt-bindings: arm: qcom: Add ASUS Vivobook S 15
arm64: dts: qcom: qrb4210-rb2: Correct max current draw for VBUS
arm64: dts: qcom: msm8998: add venus node
arm64: dts: qcom: sa8775p-ride-r3: add new board file
arm64: dts: qcom: move common parts for sa8775p-ride variants into a .dtsi
dt-bindings: arm: qcom: add sa8775p-ride Rev 3
arm64: dts: qcom: sm8550-qrd: add port mapping to speakers
arm64: dts: qcom: sm8550-mtp: add port mapping to speakers
arm64: dts: qcom: sm8550-hdk: add port mapping to speakers
arm64: dts: qcom: sm8650-qrd: add port mapping to speakers
arm64: dts: qcom: sm8650-mtp: add port mapping to speakers
arm64: dts: qcom: sm8650-hdk: add port mapping to speakers
arm64: dts: qcom: sm7225-fairphone-fp4: Name the regulators
arm64: dts: qcom: pm8916: correct thermal zone name
arm64: dts: qcom: x1e80100: Add gpu support
arm64: dts: qcom: x1e80100: Fix USB HS PHY 0.8V supply
arm64: dts: qcom: qcs6490-rb3gen2: enable hdmi bridge
arm64: dts: qcom: sm6115: add resets for sdhc_1
...
Link: https://lore.kernel.org/r/20240706173140.18887-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32 into soc/dt
STM32 DT for v6.11, round 1
Highlights:
----------
-MCU:
- Add syscfg missing clock on stm32f429.
- MPU:
- STM32MP13:
- Add camera support on stm32mp135f-dk bord using DCMIPP and
GC2145 sensor.
- Document PWM output for stm32mp135f-dk
- Add goodix touchscreen support on stm32mp135f-dk board.
- Add new DH DHCOR / DHSBC board (Som + carrier board) based on
STM32MP135F SoC.
SOM part contains: STM32MP135F SoC, 512MB DDR2L RAM and
eMMC/SDIO wifi module.
The carrier boards embedds 2 RGMII ETH ports, USB-A,USB-C
and an extansion connector.
- Add Ethernet controller support on stm32mp135f-dk.
It uses LAN8742A PHY based on RMII.
- STMP32MP15:
- Rework Octavo OSD32MP1 split for USB phy.
- Add OP-TEE IRQ for asynchronous notification support.
It allows OP-TEE to trig Linux.
- STM32MP25:
- Add OP-TEE IRQ for asynchronous notification support.
It allows OP-TEE to trig Linux.
- Enable firewall for RCC.
- Add all U(s)ART nodes for stm32mp25.
- Add 3 power domains for low power modes.
- Add HPDMA support.
- Add Ethernet controller (ETH2) support on stm32mp257f-ev1.
It uses Realtek PHY based on RGMII.
- Add and enable SCMI regulator support.
* tag 'stm32-dt-for-v6.11-1' of https://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32: (31 commits)
arm64: dts: st: describe power supplies for stm32mp257f-ev1 board
arm64: dts: st: add scmi regulators on stm32mp25
regulator: Add STM32MP25 regulator bindings
ARM: dts: stm32: omit unused pinctrl groups from stm32mp13 dtb files
arm64: dts: st: enable Ethernet2 on stm32mp257f-ev1 board
arm64: dts: st: add eth2 pinctrl entries in stm32mp25-pinctrl.dtsi
arm64: dts: st: add ethernet1 and ethernet2 support on stm32mp25
arm64: dts: st: add HPDMA nodes on stm32mp251
ARM: dts: stm32: Add ethernet support for DH STM32MP13xx DHCOR DHSBC board
ARM: dts: stm32: order stm32mp13-pinctrl nodes
ARM: dts: stm32: add ethernet1 for STM32MP135F-DK board
ARM: dts: stm32: add ethernet1/2 RMII pins for STM32MP13F-DK board
ARM: dts: stm32: add ethernet1 and ethernet2 support on stm32mp13
ARM: dts: stm32: Document output pins for PWMs on stm32mp135f-dk
ARM: dts: stm32: OP-TEE async notif interrupt for ST STM32MP15x boards
ARM: dts: stm32: Missing clocks for stm32f429's syscfg.
ARM: dts: stm32: Add support for STM32MP13xx DHCOR SoM and DHSBC board
ARM: dts: stm32: Add pinmux nodes for DH electronics STM32MP13xx DHCOR SoM and DHSBC board
dt-bindings: arm: stm32: Add compatible string for DH electronics STM32MP13xx DHCOR DHSBC board
ARM: dts: stm32: osd32: move pwr_regulators to common
...
Link: https://lore.kernel.org/r/8f10bd29-d067-4060-89ff-2e1a605f3141@foss.st.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into soc/dt
Allwinner SoC device tree changes for 6.11
This includes a commit shared with the clk tree. This commit adds clock
and reset indices to the device tree binding, and thus is needed for
both the device tree and driver changes.
ARM64 device tree and binding-only changes
- Add LRADC (low resolution ADC for resistor network based keys) for H616 SoC
- Add cache information for A64, H6, and H616 SoCs
- Correct model names and descriptions for Pine64 boards
- Add GPADC (general purpose ADC) for H616 SoC
- Add ADC joysticks based on GPADC for anbernic-rg35xx-h board
- Add additional CPU OPPs for the H700 on top of existing H616 ones
- Enable DVFS for rg35xx boards
- Add IOMMU for H616 SoC
RISC-V device tree changes
- Add system LDOs to D1s/T113 SoC
- Add ClockworkPi and DevTerm device trees
* tag 'sunxi-dt-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
riscv: dts: allwinner: Add ClockworkPi and DevTerm devicetrees
riscv: dts: allwinner: d1s-t113: Add system LDOs
arm64: dts: allwinner: h616: add IOMMU node
arm64: dts: allwinner: rg35xx: Enable DVFS CPU frequency scaling
arm64: dts: allwinner: h616: add additional CPU OPPs for the H700
arm64: dts: allwinner: anbernic-rg35xx-h: Add ADC joysticks
arm64: dts: allwinner: h616: Add GPADC device node
dt-bindings: clock: sun50i-h616-ccu: Add GPADC clocks
ARM: dts: sunxi: remove duplicated entries in makefile
arm64: dts: allwinner: Add cache information to the SoC dtsi for H616
arm64: dts: allwinner: Add cache information to the SoC dtsi for A64
arm64: dts: allwinner: Correct the model names for Pine64 boards
dt-bindings: arm: sunxi: Correct the descriptions for Pine64 boards
arm64: dts: allwinner: Add cache information to the SoC dtsi for H6
ARM: dts: sun50i: Add LRADC node
dt-bindings: input: sun4i-lradc-keys: Add H616 compatible
Link: https://lore.kernel.org/r/ZoQa8r1N8yi7FlPV@wens.tw
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The struct instances supplied by the drivers are never modified.
Handle them as const in the regmap core allowing the drivers to put them
into .rodata.
Also add a new entry to const_structs.checkpatch to make sure future
instances of this struct already enter the tree as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20240706-regmap-const-structs-v1-2-d08c776da787@weissschuh.net
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
0c18184de990 ("platform/x86: apple-gmux: support MMIO gmux on T2 Macs")
brought support for T2 Macs in apple-gmux. But in order to use dual GPU,
the integrated GPU has to be enabled. On such dual GPU EFI Macs, the EFI
stub needs to report that it is booting macOS in order to prevent the
firmware from disabling the iGPU.
This patch is also applicable for some non T2 Intel Macs.
Based on this patch for GRUB by Andreas Heider <andreas@heider.io>:
https://lists.gnu.org/archive/html/grub-devel/2013-12/msg00442.html
Credits also goto Kerem Karabay <kekrby@gmail.com> for helping porting
the patch to the Linux kernel.
Cc: Orlando Chamberlain <orlandoch.dev@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
[ardb: limit scope using list of DMI matches provided by Lukas and Orlando]
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The smbios.c source file is not currently included in the x86 build, and
before we can do so, it needs some tweaks to build correctly in
combination with the EFI mixed mode support.
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Get callers out of poking into bvec internals a bit more. Not a huge win
right now, but with the proposed new DMA mapping API we might end up with
a lot more of this otherwise.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20240706075228.2350978-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The folio_migrate_copy() is just a wrapper of folio_copy() and
folio_migrate_flags(), it is simple and only aio use it for now, unfold it
and remove folio_migrate_copy().
Link: https://lkml.kernel.org/r/20240626085328.608006-7-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a #MC variant of folio_copy() which uses copy_mc_highpage() to support
#MC handled during folio copy, it will be used in folio migration soon.
Link: https://lkml.kernel.org/r/20240626085328.608006-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: migrate: support poison recover from migrate folio", v5.
The folio migration is widely used in kernel, memory compaction, memory
hotplug, soft offline page, numa balance, memory demote/promotion, etc,
but once access a poisoned source folio when migrating, the kernel will
panic.
There is a mechanism in the kernel to recover from uncorrectable memory
errors, ARCH_HAS_COPY_MC(eg, Machine Check Safe Memory Copy on x86), which
is already used in NVDIMM or core-mm paths(eg, CoW, khugepaged, coredump,
ksm copy), see copy_mc_to_{user,kernel}, copy_mc_{user_}highpage callers.
This series of patches provide the recovery mechanism from folio copy for
the widely used folio migration. Please note, because folio migration is
no guarantee of success, so we could chose to make folio migration
tolerant of memory failures, adding folio_mc_copy() which is a #MC
versions of folio_copy(), once accessing a poisoned source folio, we could
return error and make the folio migration fail, and this could avoid the
similar panic shown below.
CPU: 1 PID: 88343 Comm: test_softofflin Kdump: loaded Not tainted 6.6.0
pc : copy_page+0x10/0xc0
lr : copy_highpage+0x38/0x50
...
Call trace:
copy_page+0x10/0xc0
folio_copy+0x78/0x90
migrate_folio_extra+0x54/0xa0
move_to_new_folio+0xd8/0x1f0
migrate_folio_move+0xb8/0x300
migrate_pages_batch+0x528/0x788
migrate_pages_sync+0x8c/0x258
migrate_pages+0x440/0x528
soft_offline_in_use_page+0x2ec/0x3c0
soft_offline_page+0x238/0x310
soft_offline_page_store+0x6c/0xc0
dev_attr_store+0x20/0x40
sysfs_kf_write+0x4c/0x68
kernfs_fop_write_iter+0x130/0x1c8
new_sync_write+0xa4/0x138
vfs_write+0x238/0x2d8
ksys_write+0x74/0x110
This patch (of 5):
There is a memory_failure_queue() call after copy_mc_[user]_highpage(),
see callers, eg, CoW/KSM page copy, it is used to mark the source page as
h/w poisoned and unmap it from other tasks, and the upcomming poison
recover from migrate folio will do the similar thing, so let's move the
memory_failure_queue() into the copy_mc_[user]_highpage() instead of
adding it into each user, this should also enhance the handling of
poisoned page in khugepaged.
Link: https://lkml.kernel.org/r/20240626085328.608006-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20240626085328.608006-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
crashes from deferred split racing folio migration", needed by "mm:
migrate: split folio_migrate_mapping()".
|
|
arm64-for-6.11
Merge IPQ9574 interconnect clock DeviceTree binding to gain access to
the interconnect constants.
|
|
Add interconnect-cells to clock provider so that it can be
used as icc provider.
Add master/slave ids for Qualcomm IPQ9574 Network-On-Chip
interfaces. This will be used by the gcc-ipq9574 driver
that will for providing interconnect services using the
icc-clk framework.
Acked-by: Georgi Djakov <djakov@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Varadarajan Narayanan <quic_varada@quicinc.com>
Link: https://lore.kernel.org/r/20240430064214.2030013-3-quic_varada@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Adds a sysfs entry for the LED on F10 above the crossed out camera icon
on 2023 Zenbooks.
Signed-off-by: Devin Bayer <dev@doubly.so>
Link: https://lore.kernel.org/r/20240628084603.217106-1-dev@doubly.so
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mani/mhi into char-misc-next
Manivannan writes:
MHI Host
========
- Used unique 'mhi_pci_dev_info' struct for product families instead of reusing
a shared struct. This allows displaying the actual product name for the MHI
devices during driver probe.
- Added support for Foxconn SDX72 based modems, T99W515 and DW5934E.
- Added a 'name' field to 'struct mhi_controller' for allowing the MHI client
drivers to uniquely identify each MHI device based on its product/device name.
This is useful in applying any device specific quirks in the client drivers.
WWAN MHI Client driver
======================
- Due to the build dependency with the MHI patch exposing 'name' field to client
drivers, the WWAN MHI client driver patch that is making use of the 'name'
field to apply custom data mux id for Foxconn modems is also included.
Collected Ack from Networking maintainer for this patch.
MHI EP
======
- Fixed the MHI EP stack to not allocate memory for MHI objects from DMA zone.
This was done accidentally while adding the slab cache support and causes the
MHI EP stack to run out of memory while doing high bandwidth transfers.
* tag 'mhi-for-v6.11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mani/mhi:
net: wwan: mhi: make default data link id configurable
bus: mhi: host: Allow controller drivers to specify name for the MHI controller
bus: mhi: host: Add support for Foxconn SDX72 modems
bus: mhi: host: pci_generic: Use unique 'mhi_pci_dev_info' for product families
bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone
|
|
MIDI2 Set Tempo message defines the tempo in 10ns unit for finer
accuracy, while MIDI1 was defined in 1us unit. For adapting this
different unit, introduce "tempo_base" field to snd_seq_queue_tempo
struct so that user-space can pass the proper tempo base unit.
The accepted value is limited, it must be either 0, 10 or 1000.
The protocol version is bumped to 1.0.4 along with this.
The access with the older protocol version ignores the tempo-base
value in ioctls and always treats as 1000.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://patch.msgid.link/20240705160344.6481-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
This patch expands the status information provided by ethtool for PSE c33
with available power limit and available power limit ranges. It also adds
a call to pse_ethtool_set_pw_limit() to configure the PSE control power
limit.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-5-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch add a way to get and set the power limit of a PSE PI.
For that it uses regulator API callbacks wrapper like get_voltage() and
get/set_current_limit() as power is simply V * I.
We used mW unit as defined by the IEEE 802.3-2022 standards.
set_current_limit() uses the voltage return by get_voltage() and the
desired power limit to calculate the current limit. get_voltage() callback
is then mandatory to set the power limit.
get_current_limit() callback is by default looking at a driver callback
and fallback to extracting the current limit from _pse_ethtool_get_status()
if the driver does not set its callback. We prefer let the user the choice
because ethtool_get_status return much more information than the current
limit.
expand pse status with c33_pw_limit_ranges to return the ranges available
to configure the power limit.
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-4-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This update expands the status information provided by ethtool for PSE c33.
It includes details such as the detected class, current power delivered,
and extended state information.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-1-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a packet sample is observed, the sampling rate that was used is
important to estimate the real frequency of such event.
Store the probability of the parent sample action in the skb's cb area
and use it in psample action to pass it down to psample module.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-7-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for a new action: psample.
This action accepts a u32 group id and a variable-length cookie and uses
the psample multicast group to make the packet available for
observability.
The maximum length of the user-defined cookie is set to 16, same as
tc_cookie, to discourage using cookies that will not be offloadable.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-6-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Although not explicitly documented in the psample module itself, the
definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.
Quoting tc-sample(8):
"RATE of 100 will lead to an average of one sampled packet out of every
100 observed."
With this semantics, the rates that we can express with an unsigned
32-bits number are very unevenly distributed and concentrated towards
"sampling few packets".
For example, we can express a probability of 2.32E-8% but we
cannot express anything between 100% and 50%.
For sampling applications that are capable of sampling a decent
amount of packets, this sampling rate semantics is not very useful.
Add a new flag to the uAPI that indicates that the sampling rate is
expressed in scaled probability, this is:
- 0 is 0% probability, no packets get sampled.
- U32_MAX is 100% probability, all packets get sampled.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-5-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a user cookie to the sample metadata so that sample emitters can
provide more contextual information to samples.
If present, send the user cookie in a new attribute:
PSAMPLE_ATTR_USER_COOKIE.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-2-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull drm fixes from Daniel Vetter:
"Just small fixes all over here, all quiet as it should.
drivers:
- amd: mostly amdgpu display fixes + radeon vm NULL deref fix
- xe: migration error handling + typoed register name in gt setup
- i915: usb-c fix to shut up warnings on MTL+
- panthor: fix sync-only jobs + ioctl validation fix to not EINVAL
wrongly
- panel quirks
- nouveau: NULL deref in get_modes
drm core:
- fbdev big endian fix for the dma memory backed variant
drivers/firmware:
- fix sysfb refcounting"
* tag 'drm-fixes-2024-07-05' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe/mcr: Avoid clobbering DSS steering
drm/xe: fix error handling in xe_migrate_update_pgtables
drm/ttm: Always take the bo delayed cleanup path for imported bos
drm/fbdev-generic: Fix framebuffer on big endian devices
drm/panthor: Fix sync-only jobs
drm/panthor: Don't check the array stride on empty uobj arrays
drm/amdgpu/atomfirmware: silence UBSAN warning
drm/radeon: check bo_va->bo is non-NULL before using it
drm/amd/display: Fix array-index-out-of-bounds in dml2/FCLKChangeSupport
drm/amd/display: Update efficiency bandwidth for dcn351
drm/amd/display: Fix refresh rate range for some panel
drm/amd/display: Account for cursor prefetch BW in DML1 mode support
drm/amd/display: Add refresh rate range check
drm/amd/display: Reset freesync config before update new state
drm: panel-orientation-quirks: Add labels for both Valve Steam Deck revisions
drm: panel-orientation-quirks: Add quirk for Valve Galileo
drm/i915/display: For MTL+ platforms skip mg dp programming
drm/nouveau: fix null pointer dereference in nouveau_connector_get_modes
firmware: sysfb: Fix reference count of sysfb parent device
|
|
libaokun@huaweicloud.com <libaokun@huaweicloud.com> says:
This is the third version of this patch series, in which another patch set
is subsumed into this one to avoid confusing the two patch sets.
(https://patchwork.kernel.org/project/linux-fsdevel/list/?series=854914)
We've been testing ondemand mode for cachefiles since January, and we're
almost done. We hit a lot of issues during the testing period, and this
patch series fixes some of the issues. The patches have passed internal
testing without regression.
The following is a brief overview of the patches, see the patches for
more details.
Patch 1-2: Add fscache_try_get_volume() helper function to avoid
fscache_volume use-after-free on cache withdrawal.
Patch 3: Fix cachefiles_lookup_cookie() and cachefiles_withdraw_cache()
concurrency causing cachefiles_volume use-after-free.
Patch 4: Propagate error codes returned by vfs_getxattr() to avoid
endless loops.
Patch 5-7: A read request waiting for reopen could be closed maliciously
before the reopen worker is executing or waiting to be scheduled. So
ondemand_object_worker() may be called after the info and object and even
the cache have been freed and trigger use-after-free. So use
cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the
reopen worker or wait for it to finish. Since it makes no sense to wait
for the daemon to complete the reopen request, to avoid this pointless
operation blocking cancel_work_sync(), Patch 1 avoids request generation
by the DROPPING state when the request has not been sent, and Patch 2
flushes the requests of the current object before cancel_work_sync().
Patch 8: Cyclic allocation of msg_id to avoid msg_id reuse misleading
the daemon to cause hung.
Patch 9: Hold xas_lock during polling to avoid dereferencing reqs causing
use-after-free. This issue was triggered frequently in our tests, and we
found that anolis 5.10 had fixed it. So to avoid failing the test, this
patch is pushed upstream as well.
Baokun Li (7):
netfs, fscache: export fscache_put_volume() and add
fscache_try_get_volume()
cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
cachefiles: propagate errors from vfs_getxattr() to avoid infinite
loop
cachefiles: stop sending new request when dropping object
cachefiles: cancel all requests for the object that is being dropped
cachefiles: cyclic allocation of msg_id to avoid reuse
Hou Tao (1):
cachefiles: wait for ondemand_object_worker to finish when dropping
object
Jingbo Xu (1):
cachefiles: add missing lock protection when polling
fs/cachefiles/cache.c | 45 ++++++++++++++++++++++++++++-
fs/cachefiles/daemon.c | 4 +--
fs/cachefiles/internal.h | 3 ++
fs/cachefiles/ondemand.c | 52 ++++++++++++++++++++++++++++++----
fs/cachefiles/volume.c | 1 -
fs/cachefiles/xattr.c | 5 +++-
fs/netfs/fscache_volume.c | 14 +++++++++
fs/netfs/internal.h | 2 --
include/linux/fscache-cache.h | 6 ++++
include/trace/events/fscache.h | 4 +++
10 files changed, 123 insertions(+), 13 deletions(-)
Link: https://lore.kernel.org/r/20240628062930.2467993-1-libaokun@huaweicloud.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
These bindings will be used for the SCMI voltage domain.
Signed-off-by: Pascal Paillet <p.paillet@foss.st.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-next
Updates for v6.11
Core:
- SM7150 support
DPU:
- SM7150 support
- Fix DSC support for DSI panels in video mode
- Fixed TE vsync source support for DSI command-mode panels
- Fix for devices without UBWC in the display controller (ie.
QCM2290)
DSI:
- Remove unused register-writing wrappers
- Fix DSC support for panels in video mode
- Add support for parsing TE vsync source
- Add support for MSM8937 (28nm DSI PHY)
MDP5:
- Add support for MSM8937
- Fix configuration for MSM8953
GPU:
- Split giant device table into per-gen "hw catalog" similar to
what is done on the display side of the driver
- Fix a702 UBWC mode
- Fix unused variably warnings
- GPU memory traces
- Add param for userspace to know if raytracing is supported
- Memory barrier cleanup and GBIF unhalt fix
- X185 support (aka gpu in X1 laptop chips)
- a505 support
- fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvZQpYEHpSCgXGJ2kaHJDK6QFAFfTsfiWm4b2zZOnjXGw@mail.gmail.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for $kernel-version:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- dp/mst: Fix daisy-chaining at resume
- dsc: Add helper to dump the DSC configuration
- tests: Add tests for the new monochrome TV mode variant
Driver Changes:
- ast: Refactor the mode setting code
- panfrost: Fix devfreq job reporting
- stm: Add LDVS support, DSI PHY updates
- panels:
- New panel: AUO G104STN01, K&d kd101ne3-40ti,
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704-curvy-outstanding-lizard-bcea78@houat
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-07-03:
amdgpu:
- Use vmalloc for dc_state
- Replay fixes
- Freesync fixes
- DCN 4.0.1 fixes
- DML fixes
- DCC updates
- Misc code cleanups and bug fixes
- 8K display fixes
- DCN 3.5 fixes
- Restructure DIO code
- DML1 fixes
- DML2 fixes
- GFX11 fix
- GFX12 updates
- GFX12 modifiers fixes
- RAS fixes
- IP dump fixes
- Add some updated IP version checks
_ Silence UBSAN warning
radeon:
- GPUVM fix
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703211314.2041893-1-alexander.deucher@amd.com
|
|
The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.
Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Currently the only STMMAC platform driver using the DW XPCS code is the
Intel mGBE device driver. (It can be determined by finding all the drivers
having the stmmac_mdio_bus_data::has_xpcs flag set.) At the same time the
low-level platform driver masks out the DW XPCS MDIO-address from being
auto-detected as PHY by the MDIO subsystem core. Seeing the PCS MDIO ID is
known the procedure of the DW XPCS device creation can be simplified by
dropping the loop over all the MDIO IDs. From now the DW XPCS device
descriptor will be created for the MDIO-bus address pre-defined by the
platform drivers via the stmmac_mdio_bus_data::pcs_mask field.
Note besides this shall speed up a bit the Intel mGBE probing.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's now possible to have the DW XPCS device defined as a standard
platform device for instance in the platform DT-file. Although that
functionality is useless unless there is a way to have the device found by
the client drivers (STMMAC/DW *MAC, NXP SJA1105 Eth Switch, etc). Provide
such ability by means of the xpcs_create_fwnode() method. It needs to be
called with the device DW XPCS fwnode instance passed. That node will be
then used to find the MDIO-device instance in order to create the DW XPCS
descriptor.
Note the method semantics and name is similar to what has been recently
introduced in the Lynx PCS driver.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Synopsys DesignWare XPCS IP-core can be synthesized with the device CSRs
being accessible over the MCI or APB3 interface instead of the MDIO bus
(see the CSR_INTERFACE HDL parameter). Thus all the PCS registers can be
just memory mapped and be a subject of the standard MMIO operations of
course taking into account the peculiarities of the Clause C45 CSRs
mapping. From that perspective the DW XPCS devices would look as just
normal platform devices for the kernel.
On the other hand in order to have the DW XPCS devices handled by the
pcs-xpcs.c driver they need to be registered in the framework of the
MDIO-subsystem. So the suggested change is about providing a DW XPCS
platform device driver registering a virtual MDIO-bus with a single
MDIO-device representing the DW XPCS device.
DW XPCS platform device is supposed to be described by the respective
compatible string "snps,dw-xpcs" (or with the PMA-specific compatible
string), CSRs memory space and optional peripheral bus and reference clock
sources. Depending on the INDIRECT_ACCESS IP-core synthesize parameter the
memory-mapped reg-space can be represented as either directly or
indirectly mapped Clause 45 space. In the former case the particular
address is determined based on the MMD device and the registers offset (5
+ 16 bits all together) within the device reg-space. In the later case
there is only 8 lower address bits are utilized for the registers mapping
(255 CSRs). The upper bits are supposed to be written into the respective
viewport CSR in order to select the respective MMD sub-page.
Note, only the peripheral bus clock source is requested in the platform
device probe procedure. The core and pad clocks handling has been
implemented in the framework of the xpcs_create() method intentionally
since the clocks-related setups are supposed to be performed later, during
the DW XPCS main configuration procedures. (For instance they will be
required for the DW Gen5 10G PMA configuration.)
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The being introduced structure will preserve the PCS and PMA IDs retrieved
from the respective DW XPCS MMDs or potentially pre-defined by the client
drivers. (The later change will be introduced later in the framework of
the commit adding the memory-mapped DW XPCS devices support.)
The structure fields are filled in in the xpcs_get_id() function, which
used to be responsible for the PCS Device ID getting only. Besides of the
PCS ID the method now fetches the PMA/PMD IDs too from MMD 1, which used
to be done in xpcs_dev_flag(). The retrieved PMA ID will be from now
utilized for the PMA-specific tweaks like it was introduced for the
Wangxun TxGBE PCS in the commit f629acc6f210 ("net: pcs: xpcs: support to
switch mode for Wangxun NICs").
Note 1. The xpcs_get_id() error-handling semantics has been changed. From
now the error number will be returned from the function. There is no point
in the next IOs or saving 0xffs and then looping over the actual device
IDs if device couldn't be reached. -ENODEV will be returned if the very
first IO operation failed thus indicating that no device could be found.
Note 2. The PCS and PMA IDs macros have been converted to enum'es. The
enum'es will be populated later in another commit with the virtual IDs
identifying the DW XPCS devices which have some platform-specifics, but
have been synthesized with the default PCS/PMA ID.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A structure with the PCS/PMA MMD IDs data is being introduced in one of
the next commits. In order to prevent the names ambiguity let's convert
the xpcs_id structure name to dw_xpcs_desc. The later version is more
suitable since the structure content is indeed the device descriptor
containing the data and callbacks required for the driver to correctly set
the device up.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One of the next commits will alter the DW XPCS driver to support setting a
custom device ID for the particular MDIO-device detected on the platform.
The generic DW XPCS ID can be used as a custom ID as well in case if the
DW XPCS-device was erroneously synthesized with no or some undefined ID.
In addition to that having all supported DW XPCS device IDs defined in a
single place will improve the code maintainability and readability.
Note while at it rename the macros to being shorter and looking alike to
the already defined NXP XPCS ID macro.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Zeroout can access a significant capacity and take longer than the user
expected. A user may change their mind about wanting to run that
command and attempt to kill the process and do something else with their
device. But since the task is uninterruptable, they have to wait for it
to finish, which could be many hours.
Add a new BLKDEV_ZERO_KILLABLE flag for blkdev_issue_zeroout that checks
for a fatal signal at each iteration so the user doesn't have to wait for
their regretted operation to complete naturally.
Heavily based on an earlier patch from Keith Busch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240701165219.1571322-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now that device mapper can handle resetting all zones of a mapped zoned
device using REQ_OP_ZONE_RESET_ALL, all zoned block device drivers
support this operation. With this, the request queue feature
BLK_FEAT_ZONE_RESETALL is not necessary and the emulation code in
blk-zone.c can be removed.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-5-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This commit implements processing of the REQ_OP_ZONE_RESET_ALL operation
for zoned mapped devices. Given that this operation always has a BIO
sector of 0 and a 0 size, processing through the regular BIO
__split_and_process_bio() function does not work because this function
would always select the first target. Instead, handling of this
operation is implemented using the function __send_zone_reset_all().
Similarly to the __send_empty_flush() function, the new
__send_zone_reset_all() function manually goes through all targets of a
mapped device table doing the following:
1) If the target can natively support REQ_OP_ZONE_RESET_ALL,
__send_duplicate_bios() is used to forward the reset all operation to
the target. This case is handled with the
__send_zone_reset_all_native() function.
2) For other targets, the function __send_zone_reset_all_emulated() is
executed to emulate the execution of REQ_OP_ZONE_RESET_ALL using
regular REQ_OP_ZONE_RESET operations.
Targets that can natively support REQ_OP_ZONE_RESET_ALL are identified
using the new target field zone_reset_all_supported. This boolean is set
to true in for targets that have reliable zone limits, that is, targets
that map all sequential write required zones of their zoned device(s).
Setting this field is handled in dm_set_zones_restrictions() and
device_get_zone_resource_limits().
For targets with unreliable zone limits, REQ_OP_ZONE_RESET_ALL must be
emulated (case 2 above). This is implemented with
__send_zone_reset_all_emulated() and is similar to the block layer
function blkdev_zone_reset_all_emulated(): first a report zones is done
for the zones of the target to identify zones that need reset, that is,
any sequential write required zone that is not already empty. This is
done using a bitmap and the function dm_zone_get_reset_bitmap() which
sets to 1 the bit corresponding to a zone that needs reset. Next, this
zone bitmap is inspected and a clone BIO modified to use the
REQ_OP_ZONE_RESET operation issued for any zone with its bit set in the
zone bitmap.
This implementation is more efficient than what the block layer does
with blkdev_zone_reset_all_emulated(), which is always used for DM zoned
devices currently: as we can natively use REQ_OP_ZONE_RESET_ALL on
targets mapping all sequential write required zones, resetting all zones
of a zoned mapped device can be much faster compared to always emulating
this operation using regular per-zone reset. In the worst case, this
implementation is as-efficient as the block layer emulation. This
reduction in the time it takes to reset all zones of a zoned mapped
device depends directly on the mapped device targets mapping (reliable
zone limits or not).
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-4-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently UFS clk scaling is getting suspended only when the clks are
scaled down. When high load is generated, a huge amount of latency is added
due to scaling up the clk and completing the request post that.
Suspending the scaling in its existing state when high load is generated
improves the random performance KPI by 28%. So suspending the scaling when
there are no requests. And the clk would be put in low scaled state when
the actual request load is low.
Make this change optional by having the check enabled using vops since for
some devices suspending without bringing the clk in low scaled state might
have impact on power consumption of the SoC.
Signed-off-by: Ram Prakash Gupta <quic_rampraka@quicinc.com>
Link: https://lore.kernel.org/r/20240627083756.25340-2-quic_rampraka@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The struct task_struct's in_user_fault member is not used by the cgroup
v2's memory controller, so it can be put under the CONFIG_MEMCG_V1 config
option. To do so, mem_cgroup_enter_user_fault() and
mem_cgroup_exit_user_fault() are moved under the CONFIG_MEMCG_V1 option as
well.
Link: https://lkml.kernel.org/r/20240628210317.272856-10-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The memcg_in_oom field of the struct task_struct is not used by the cgroup
v2's memory controller, so it can be happily compiled out if
CONFIG_MEMCG_V1 is not set.
Link: https://lkml.kernel.org/r/20240628210317.272856-9-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Put memcg1-specific members of struct mem_cgroup_per_node under the
CONFIG_MEMCG_V1 config option.
Link: https://lkml.kernel.org/r/20240628210317.272856-8-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Put memcg1-specific members of struct mem_cgroup under the CONFIG_MEMCG_V1
config option. Also group them close to the end of struct mem_cgroup just
before the dynamic per-node part.
Link: https://lkml.kernel.org/r/20240628210317.272856-7-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move out the legacy cgroup v1 socket memory accounting code into
mm/memcontrol-v1.c.
This commit introduces three new functions: memcg1_tcpmem_active(),
memcg1_charge_skmem() and memcg1_uncharge_skmem(), which contain all
cgroup v1-specific code and become trivial if CONFIG_MEMCG_V1 isn't set.
Note, that !!memcg->tcpmem_pressure check in
mem_cgroup_under_socket_pressure() can't be easily moved into
memcontrol-v1.h without including memcontrol-v1.h from memcontrol.h which
isn't a good idea, so it's better to just #ifdef it.
Link: https://lkml.kernel.org/r/20240628210317.272856-3-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Allow proactive reclaimers to submit an additional swappiness=<val>
argument to memory.reclaim. This overrides the global or per-memcg
swappiness setting for that reclaim attempt.
For example:
echo "2M swappiness=0" > /sys/fs/cgroup/memory.reclaim
will perform reclaim on the rootcg with a swappiness setting of 0 (no
swap) regardless of the vm.swappiness sysctl setting.
Userspace proactive reclaimers use the memory.reclaim interface to trigger
reclaim. The memory.reclaim interface does not allow for any way to
effect the balance of file vs anon during proactive reclaim. The only
approach is to adjust the vm.swappiness setting. However, there are a few
reasons we look to control the balance of file vs anon during proactive
reclaim, separately from reactive reclaim:
* Swapout should be limited to manage SSD write endurance. In near-OOM
situations we are fine with lots of swap-out to avoid OOMs. As these
are typically rare events, they have relatively little impact on write
endurance. However, proactive reclaim runs continuously and so its
impact on SSD write endurance is more significant. Therefore it is
desireable to control swap-out for proactive reclaim separately from
reactive reclaim
* Some userspace OOM killers like systemd-oomd[1] support OOM killing on
swap exhaustion. This makes sense if the swap exhaustion is triggered
due to reactive reclaim but less so if it is triggered due to proactive
reclaim (e.g. one could see OOMs when free memory is ample but anon is
just particularly cold). Therefore, it's desireable to have proactive
reclaim reduce or stop swap-out before the threshold at which OOM
killing occurs.
In the case of Meta's Senpai proactive reclaimer, we adjust vm.swappiness
before writes to memory.reclaim[2]. This has been in production for
nearly two years and has addressed our needs to control proactive vs
reactive reclaim behavior but is still not ideal for a number of reasons:
* vm.swappiness is a global setting, adjusting it can race/interfere
with other system administration that wishes to control vm.swappiness.
In our case, we need to disable Senpai before adjusting vm.swappiness.
* vm.swappiness is stateful - so a crash or restart of Senpai can leave
a misconfigured setting. This requires some additional management to
record the "desired" setting and ensure Senpai always adjusts to it.
With this patch, we avoid these downsides of adjusting vm.swappiness
globally.
[1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
[2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598
Link: https://lkml.kernel.org/r/20240103164841.2800183-3-schatzberg.dan@gmail.com
Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Suggested-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yue Zhao <findns94@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Add swappiness argument to memory.reclaim", v6.
This patch proposes augmenting the memory.reclaim interface with a
swappiness=<val> argument that overrides the swappiness value for that
instance of proactive reclaim.
Userspace proactive reclaimers use the memory.reclaim interface to trigger
reclaim. The memory.reclaim interface does not allow for any way to
effect the balance of file vs anon during proactive reclaim. The only
approach is to adjust the vm.swappiness setting. However, there are a few
reasons we look to control the balance of file vs anon during proactive
reclaim, separately from reactive reclaim:
* Swapout should be limited to manage SSD write endurance. In near-OOM
situations we are fine with lots of swap-out to avoid OOMs. As these
are typically rare events, they have relatively little impact on write
endurance. However, proactive reclaim runs continuously and so its
impact on SSD write endurance is more significant. Therefore it is
desireable to control swap-out for proactive reclaim separately from
reactive reclaim
* Some userspace OOM killers like systemd-oomd[1] support OOM killing on
swap exhaustion. This makes sense if the swap exhaustion is triggered
due to reactive reclaim but less so if it is triggered due to proactive
reclaim (e.g. one could see OOMs when free memory is ample but anon is
just particularly cold). Therefore, it's desireable to have proactive
reclaim reduce or stop swap-out before the threshold at which OOM
killing occurs.
In the case of Meta's Senpai proactive reclaimer, we adjust vm.swappiness
before writes to memory.reclaim[2]. This has been in production for
nearly two years and has addressed our needs to control proactive vs
reactive reclaim behavior but is still not ideal for a number of reasons:
* vm.swappiness is a global setting, adjusting it can race/interfere
with other system administration that wishes to control vm.swappiness.
In our case, we need to disable Senpai before adjusting vm.swappiness.
* vm.swappiness is stateful - so a crash or restart of Senpai can leave
a misconfigured setting. This requires some additional management to
record the "desired" setting and ensure Senpai always adjusts to it.
With this patch, we avoid these downsides of adjusting vm.swappiness
globally.
Previously, this exact interface addition was proposed by Yosry[3]. In
response, Roman proposed instead an interface to specify precise
file/anon/slab reclaim amounts[4]. More recently Huan also proposed this
as well[5] and others similarly questioned if this was the proper
interface.
Previous proposals sought to use this to allow proactive reclaimers to
effectively perform a custom reclaim algorithm by issuing proactive
reclaim with different settings to control file vs anon reclaim (e.g. to
only reclaim anon from some applications). Responses argued that
adjusting swappiness is a poor interface for custom reclaim.
In contrast, I argue in favor of a swappiness setting not as a way to
implement custom reclaim algorithms but rather to bias the balance of anon
vs file due to differences of proactive vs reactive reclaim. In this
context, swappiness is the existing interface for controlling this balance
and this patch simply allows for it to be configured differently for
proactive vs reactive reclaim.
Specifying explicit amounts of anon vs file pages to reclaim feels
inappropriate for this prupose. Proactive reclaimers are un-aware of the
relative age of file vs anon for a cgroup which makes it difficult to
manage proactive reclaim of different memory pools. A proactive reclaimer
would need some amount of anon reclaim attempts separate from the amount
of file reclaim attempts which seems brittle given that it's difficult to
observe the impact.
[1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
[2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598
[3]https://lore.kernel.org/linux-mm/CAJD7tkbDpyoODveCsnaqBBMZEkDvshXJmNdbk51yKSNgD7aGdg@mail.gmail.com/
[4]https://lore.kernel.org/linux-mm/YoPHtHXzpK51F%2F1Z@carbon/
[5]https://lore.kernel.org/lkml/20231108065818.19932-1-link@vivo.com/
This patch (of 2):
We use the constants 0 and 200 in a few places in the mm code when
referring to the min and max swappiness. This patch adds MIN_SWAPPINESS
and MAX_SWAPPINESS #defines to improve clarity. There are no functional
changes.
Link: https://lkml.kernel.org/r/20240103164841.2800183-1-schatzberg.dan@gmail.com
Link: https://lkml.kernel.org/r/20240103164841.2800183-2-schatzberg.dan@gmail.com
Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Yue Zhao <findns94@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Put legacy cgroup v1 memory controller code under a new CONFIG_MEMCG_V1
config option. The option is turned off by default. Nobody except those
who are still using cgroup v1 should turn it on.
If the option is not set, memory controller can still be mounted under
cgroup v1, but none of memcg-specific control files are present.
Please note, that not all cgroup v1's memory controller code is guarded
yet (but most of it), it's a subject for some follow-up work.
Thanks to Michal Hocko for providing a better Kconfig option description.
[roman.gushchin@linux.dev: better config option description provided by Michal]
Link: https://lkml.kernel.org/r/ZnxXNtvqllc9CDoo@google.com
Link: https://lkml.kernel.org/r/20240625005906.106920-14-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Group all cgroup v1-related declarations at the end of memcontrol.h and
mm/memcontrol-v1.h with an intention to put them all together under a
config option later on. It should make things easier to follow and
maintain too.
Link: https://lkml.kernel.org/r/20240625005906.106920-13-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Cgroup v1's memory controller contains a pretty complicated event
notifications mechanism which is not used on cgroup v2. Let's move the
corresponding code into memcontrol-v1.c.
Please, note, that mem_cgroup_event_ratelimit() remains in memcontrol.c,
otherwise it would require exporting too many details on memcg stats
outside of memcontrol.c.
Link: https://lkml.kernel.org/r/20240625005906.106920-7-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|