|
Convert to refcount_t in preparation for automatically registering the
bvec buffer, which needs the reference counter to be initialized to 2.
kref doesn't provide an interface for that, so convert to refcount_t.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
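
The motivation above comes down to the initial count: kref_init() always starts at 1, whereas refcount_set() takes the starting value as an argument. A minimal userspace sketch of that idea follows, assuming C11 atomics as a stand-in; `sketch_ref`, `sketch_ref_set()` and `sketch_ref_put()` are hypothetical names, not the kernel's refcount_t implementation or the ublk code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference counter. */
struct sketch_ref {
	atomic_uint count;
};

/* Start with an arbitrary count, e.g. 2 when two owners exist from the outset. */
static void sketch_ref_set(struct sketch_ref *r, unsigned int n)
{
	assert(n > 0);
	atomic_init(&r->count, n);
}

/* Drop one reference; returns true when the caller released the last one. */
static bool sketch_ref_put(struct sketch_ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

With the counter set to 2, the first put leaves the object alive and only the second put reports that the last reference is gone.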
|
|
__run_io_and_remove() is used in several stress tests to run heavy IO
while the device is being removed.
However, the fio script uses sequential `readwrite`, which isn't
correct; we should use random IO to saturate the ublk device.
It also turns out that '--num_jobs=4' isn't stressful enough, so change
it to '--num_jobs=$(nproc)'.
Finally, `test_stress_02.sh` doesn't cover the single queue case, so add
a single queue test, which triggers request tag recycling more easily.
With the above changes, the issue in #1 can be reproduced reliably in stress_02.sh.
Link: https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/ #1
Cc: Jared Holzman <jholzman@nvidia.com>
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250519031620.245749-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
for-6.16/block
Pull NVMe updates from Christoph:
"nvme updates for Linux 6.16
- add per-node DMA pools and use them for PRP/SGL allocations
(Caleb Sander Mateos, Keith Busch)
- nvme-fcloop refcounting fixes (Daniel Wagner)
- support delayed removal of the multipath node and optionally support
the multipath node for private namespaces (Nilay Shroff)
- support shared CQs in the PCI endpoint target code (Wilfred Mallawa)
- support admin-queue only authentication (Hannes Reinecke)
- use the crc32c library instead of the crypto API (Eric Biggers)
- misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes Reinecke,
Leon Romanovsky, Gustavo A. R. Silva)"
* tag 'nvme-6.16-2025-05-20' of git://git.infradead.org/nvme: (42 commits)
nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk
nvme: introduce multipath_always_on module param
nvme-multipath: introduce delayed removal of the multipath head node
nvme-pci: derive and better document max segments limits
nvme-pci: use struct_size for allocation struct nvme_dev
nvme-pci: add a symolic name for the small pool size
nvme-pci: use a better encoding for small prp pool allocations
nvme-pci: rename the descriptor pools
nvme-pci: remove struct nvme_descriptor
nvme-pci: store aborted state in flags variable
nvme-pci: don't try to use SGLs for metadata on the admin queue
nvme-pci: make PRP list DMA pools per-NUMA-node
nvme-pci: factor out a nvme_init_hctx_common() helper
dmapool: add NUMA affinity support
nvme-fc: do not reference lsrsp after failure
nvmet-fcloop: don't wait for lport cleanup
nvmet-fcloop: add missing fcloop_callback_host_done
nvmet-fc: take tgtport refs for portentry
nvmet-fc: free pending reqs on tgtport unregister
nvmet-fcloop: drop response if targetport is gone
...
|
|
This adds a compatible string for the SPI controller on RK3528.
Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250520100102.1226725-2-amadeus@jmu.edu.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs fix from Mike Marshall:
"Fix for orangefs page writeout counting"
* tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: adjust counting code to recover from 665575cf
|
|
Block devices can be opened read-write even if they can't be written to
for historic reasons. Remove the check requiring file->f_op->write_iter
when the block device is opened in loop_configure. The call to
loop_check_backing_file just below ensures the ->write_iter is present
for backing files opened for writing, which is the only check that is
actually needed.
Fixes: f5c84eff634b ("loop: Add sanity check for read/write_iter")
Reported-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250520135420.1177312-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A late commit to 6.14-rc7! broke orangefs. 665575cf seems like a
good change, but maybe should have been introduced during the merge
window. This patch adjusts the counting code associated with
writing out pages so that orangefs works in a 665575cf world.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
New HP ZBook with Realtek HDA codec ALC3247 needs the quirk
ALC236_FIXUP_HP_GPIO_LED to fix the micmute LED.
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250520132101.120685-1-chris.chiu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Add support for HP Agusta.
Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C.
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250520124757.12597-1-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
strcpy() is deprecated; use strscpy() instead.
Both the destination and source buffer are of fixed length
so the 2-argument form of strscpy() is used.
No functional changes intended.
Link: https://github.com/KSPP/linux/issues/88
Signed-off-by: Siddarth Gundu <siddarthsgml@gmail.com>
Link: https://patch.msgid.link/20250520113012.70270-1-siddarthsgml@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
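
To illustrate why a bounded copy is preferred over strcpy(), here is a userspace approximation in the spirit of strscpy(): it always NUL-terminates and reports truncation with -E2BIG. `sketch_strscpy()` and `SKETCH_STRSCPY()` are hypothetical names written for this sketch, not the kernel implementation; the two-argument macro only works when the destination is a true array, so that sizeof() yields its length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical bounded copy: copies at most size - 1 characters, always
 * NUL-terminates, and returns the copied length or -E2BIG on truncation.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source continues past what fit, report truncation. */
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/* Two-argument convenience form for fixed-size destination arrays only. */
#define SKETCH_STRSCPY(dst, src) sketch_strscpy((dst), (src), sizeof(dst))
```

Unlike strcpy(), an oversized source cannot write past the destination; the caller sees a negative return value instead.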
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-05-20
this is a pull request of 3 patches for net/main.
The 1st patch is by Rob Herring, and fixes the $id path in the
microchip,mcp2510.yaml device tree bindings documentation.
The last 2 patches are from Oliver Hartkopp and fix a use-after-free
read and an out-of-bounds read in the CAN Broadcast Manager (BCM)
protocol.
linux-can-fixes-for-6.15-20250520
* tag 'linux-can-fixes-for-6.15-20250520' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: bcm: add missing rcu read protection for procfs content
can: bcm: add locking for bcm_op runtime updates
dt-bindings: can: microchip,mcp2510: Fix $id path
====================
Link: https://patch.msgid.link/20250520091424.142121-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This adds a match entry for using all the amps on a CDB35L63-CB2 board
without the CS42L43 codec. Configuration is:
SDW3: 1x CS35L63 (OUT1)
SDW1: 1x CS35L63 (OUT2)
Speaker playback and amp feedback are aggregated.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://patch.msgid.link/20250516152107.210994-3-sbinding@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
CS35L63 is very similar to CS35L56, and uses the same driver, so we
can use the same configuration.
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://patch.msgid.link/20250516152107.210994-2-sbinding@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use the previously parsed DisCo information from ACPI to create the DAI
drivers required to connect an SDCA Function into an ASoC soundcard.
Create DAI driver structures and populate the supported sample rates
and sample widths into them based on the Input/Output Terminal and any
attached Clock Source entities. More complex relationships with channels
etc. will be added later as constraints as part of the DAI startup.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-8-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use the previously parsed DisCo information from ACPI to create the
ALSA controls required by an SDCA Function. This maps all User and
Application level SDCA Controls to ALSA controls. Typically controls
marked with those access levels are just volumes and mutes.
SDCA defines volume controls as an integer in 1/256ths of a dB and
then provides a mechanism to specify what values are valid (range
templates). Currently only a simple case of a single linear volume
range with a power of 2 step size is supported. This allows the code
to expose the volume control using a simple shift. This will need
to be expanded in the future to support more complex ranges and probably
also some additional control types, but it should be sufficient
for a first pass.
For non-dataport terminal widgets also add a pin switch to allow
that endpoint to be turned on/off.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-7-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
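
The "simple shift" mentioned above can be sketched as follows, assuming a single linear range whose minimum, maximum and step are all expressed in 1/256 dB units and whose step is a power of 2. The structure and function names are hypothetical and chosen for illustration; they are not taken from the SDCA code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one linear range, all in 1/256 dB units. */
struct sketch_range {
	int32_t min;   /* lowest valid value */
	int32_t max;   /* highest valid value */
	uint32_t step; /* distance between valid values; must be a power of 2 */
};

/* log2 of a power-of-2 step size. */
static unsigned int sketch_step_shift(uint32_t step)
{
	unsigned int shift = 0;

	assert(step != 0 && (step & (step - 1)) == 0);
	while ((step >> shift) != 1)
		shift++;
	return shift;
}

/* Map a raw value in the range to a zero-based control index. */
static uint32_t sketch_raw_to_index(const struct sketch_range *r, int32_t raw)
{
	assert(raw >= r->min && raw <= r->max);
	return (uint32_t)(raw - r->min) >> sketch_step_shift(r->step);
}

/* Map a control index back to a raw value. */
static int32_t sketch_index_to_raw(const struct sketch_range *r, uint32_t index)
{
	return r->min + (int32_t)(index << sketch_step_shift(r->step));
}
```

For a range of -64 dB to 0 dB in 0.5 dB steps (min -16384, step 128), the shift is 7 and the control has indices 0 through 128.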
|
|
Use the previously parsed DisCo information from ACPI to create DAPM
widgets and routes representing a SDCA Function. For the most part SDCA
maps well to the DAPM abstractions.
The primary point of interest is the SDCA Power Domain Entities
(PDEs), which actually control the power status of the device. Whilst
these PDEs are the primary widgets the other parts of the SDCA graph
are added to maintain consistency with the hardware abstraction,
and allow routing to take effect. As for the PDEs themselves the
code currently only handles PS0 and PS3 (basically on and off),
the two intermediate power states are not commonly used and don't
map well to ASoC/DAPM.
Other minor points of slight complexity include the Group Entities
(GEs), which set the value of several other controls, typically
Selector Units (SUs) for enabling a certain jack configuration. Multiple
SUs being controlled by a GE are easily modelled by creating a single
control and sharing it among the controlled muxes.
SDCA also has a slight habit of having fully connected paths, relying
more on activating the PDEs to enable functionality. This doesn't
map quite so perfectly to DAPM which considers the path a reason to
power the PDE. Whilst in the current specification Mixer Units are
defined as fixed-function, in DAPM we create a virtual control for
each input (which defaults to connected). This allows paths to be
connected/disconnected, providing a more ASoC style approach to
managing the power. PIN_SWITCHs will also be added for non-dataport
terminal entities in a later patch along with the other ALSA controls,
providing greater flexibility in power management.
A top level helper sdca_asoc_populate_component() is exported that
counts and allocates everything, however, the intermediate counting and
population functions are also exported. This will allow end drivers to
do allocation and add custom handling, which is probably fairly likely
for the early SDCA devices.
Clock muxes are currently not fully supported, so some future work will
also be required there.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-6-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The core currently supports pin switches for source/sink widgets, but
only at the card level. SDCA components specify the fabric at the
level of the individual components, to support this add helpers to
allow component level pin switches.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-5-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Move the allocation of the PDE delays array until after the size has
been adjusted, this saves an additional division and simplifies the
code slightly.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-4-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
There is no need to include MODULE_LICENSE() and MODULE_DESCRIPTION() in
sdca_regmap.c as this file is part of a larger module that already
defines these.
Fixes: e3f7caf74b79 ("ASoC: SDCA: Add generic regmap SDCA helpers")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Fix minor typo SDAC -> SDCA.
Fixes: 42b144cb6a2d ("ASoC: SDCA: Add SDCA Control parsing")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20250516131011.221310-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
A few, quite rare, WMI attributes have names that are not compatible with
filenames, e.g. "Intel VT for Directed I/O (VT-d)".
For these cases the '/' gets replaced with '\' for display, but doesn't
get switched again when doing the WMI access.
Fix this by keeping the original attribute name and using that for sending
commands to the BIOS.
Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms")
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20250520005027.3840705-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
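
The fix described above amounts to keeping two copies of the name: the original for firmware access and a filename-safe one for display. A minimal sketch of that idea, with hypothetical structure and field names that are not taken from the think-lmi driver:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute record holding both forms of the name. */
struct sketch_attr {
	char name[64];         /* original name, used when talking to firmware */
	char display_name[64]; /* filename-safe copy, '/' replaced with '\' */
};

/* Store the original name and derive the display name from it. */
static void sketch_attr_set_name(struct sketch_attr *a, const char *name)
{
	size_t i;

	assert(a != NULL && name != NULL);
	for (i = 0; i < sizeof(a->name) - 1 && name[i] != '\0'; i++) {
		a->name[i] = name[i];
		a->display_name[i] = (name[i] == '/') ? '\\' : name[i];
	}
	a->name[i] = '\0';
	a->display_name[i] = '\0';
}
```

Because the original string is never overwritten, no reverse substitution is needed when a command is sent.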
|
|
If the user modifies the battery charge threshold, an ACPI event is generated.
The Lenovo FW team confirmed that this is only generated on a user event. As no
action is needed, ignore the event and prevent spurious kernel logs.
Reported-by: Derek Barbosa <debarbos@redhat.com>
Closes: https://lore.kernel.org/platform-driver-x86/7e9a1c47-5d9c-4978-af20-3949d53fb5dc@app.fastmail.com/T/#m5f5b9ae31d3fbf30d7d9a9d76c15fb3502dfd903
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250517023348.2962591-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Merge series from Geert Uytterhoeven <geert+renesas@glider.be>:
This patch series (A) improves single transfer sizes in the MSIOF
driver, using two methods:
- By increasing the assumed FIFO sizes, impacting both PIO and DMA
transfers,
- By using two groups, impacting DMA transfers,
and (B) lets the recently-introduced MSIOF I2S driver reuse the SPI
driver's register definitions. All of this is covered with a thick
sauce of fixes for (harmless) bugs, cleanups, and refactorings.
Note that the driver uses the limitations as specified in the hardware
documentation. For discovering the actual FIFO sizes, I wrote some
crude test code that can be found at [2].
This is based on spi/for-next and sound-asoc/for-next, and has been
tested on a variety of R-Car SoCs.
[1] https://lore.kernel.org/cover.1746180072.git.geert+renesas@glider.be
[2] https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/log/?h=topic/msiof-fifo
|
|
Merge series from Sumanth Gavini <sumanth.gavini@yahoo.com>:
This series fixes the misspelling of "Electronics" as "Electrnoics"
across multiple subsystems (MFD, NFC, EXTCON). Each patch targets
a different subsystem for easier review.
The changes are mechanical and do not affect functionality.
Sumanth Gavini (6):
nfc: s3fwrn5: Correct Samsung "Electronics" spelling in copyright
headers
nfc: virtual_ncidev: Correct Samsung "Electronics" spelling in
copyright headers
extcon: extcon-max77693: Correct Samsung "Electronics" spelling in
copyright headers
mfd: maxim: Correct Samsung "Electronics" spelling in copyright
headers
mfd: maxim: Correct Samsung "Electronics" spelling in headers
regulator: max8952: Correct Samsung "Electronics" spelling in
copyright headers
drivers/extcon/extcon-max77693.c | 2 +-
drivers/nfc/s3fwrn5/core.c | 2 +-
drivers/nfc/s3fwrn5/firmware.c | 2 +-
drivers/nfc/s3fwrn5/firmware.h | 2 +-
drivers/nfc/s3fwrn5/i2c.c | 2 +-
drivers/nfc/s3fwrn5/nci.c | 2 +-
drivers/nfc/s3fwrn5/nci.h | 2 +-
drivers/nfc/s3fwrn5/phy_common.c | 4 ++--
drivers/nfc/s3fwrn5/phy_common.h | 4 ++--
drivers/nfc/s3fwrn5/s3fwrn5.h | 2 +-
drivers/nfc/virtual_ncidev.c | 2 +-
include/linux/mfd/max14577-private.h | 2 +-
include/linux/mfd/max14577.h | 2 +-
include/linux/mfd/max77686-private.h | 2 +-
include/linux/mfd/max77686.h | 2 +-
include/linux/mfd/max77693-private.h | 2 +-
include/linux/mfd/max77693.h | 2 +-
include/linux/mfd/max8997-private.h | 2 +-
include/linux/mfd/max8997.h | 2 +-
include/linux/mfd/max8998-private.h | 2 +-
include/linux/mfd/max8998.h | 2 +-
include/linux/regulator/max8952.h | 2 +-
22 files changed, 24 insertions(+), 24 deletions(-)
|
|
Merge series from Mohammad Rafi Shaik <mohammad.rafi.shaik@oss.qualcomm.com>:
This patchset adds support for sound card on Qualcomm QCS9100 and
QCS9075 boards.
|
|
If either REGMAP_IRQ or REGMAP_MDIO are set then REGMAP is also set.
This then enables the selecting of IRQ_DOMAIN or MDIO_BUS from REGMAP
based on the above two symbols respectively. This makes it very easy
to end up with "circular dependencies".
Instead select the IRQ_DOMAIN or MDIO_BUS from the symbols that make
use of them. This is almost equivalent to before but makes it less
likely to end up with false circular dependency detections.
Signed-off-by: Andrew Davis <afd@ti.com>
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Closes: https://lore.kernel.org/r/bfe991fa-f54c-4d58-b2e0-34c4e4eb48f4@linaro.org/
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250516141722.13772-1-afd@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The function sdm845_slim_snd_hw_params() calls the function
snd_soc_dai_set_channel_map() but does not check its return
value. A proper implementation can be found in msm_snd_hw_params().
Add error handling for snd_soc_dai_set_channel_map(). If the
function fails and the error is not an unsupported error, return the
error code immediately.
Fixes: 5caf64c633a3 ("ASoC: qcom: sdm845: add support to DB845c and Lenovo Yoga")
Cc: stable@vger.kernel.org # v5.6
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://patch.msgid.link/20250519075739.1458-1-vulab@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
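
The error-handling pattern described above can be sketched in isolation. The stub below stands in for the real call, and the sketch assumes that the "unsupported" error is -ENOTSUPP, a kernel-internal code (524) that userspace errno.h does not define; all function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#ifndef ENOTSUPP
#define ENOTSUPP 524 /* kernel-internal value; not in userspace errno.h */
#endif

/* Hypothetical stand-in for a call that may fail or be unsupported. */
static int sketch_set_channel_map(int simulated_result)
{
	assert(simulated_result <= 0); /* errno-style: zero or negative */
	return simulated_result;
}

/* Propagate real failures, tolerate "not supported". */
static int sketch_hw_params(int simulated_result)
{
	int ret;

	ret = sketch_set_channel_map(simulated_result);
	if (ret != 0 && ret != -ENOTSUPP)
		return ret;

	/* ... remaining setup would continue here ... */
	return 0;
}
```

Success and "not supported" both continue with the remaining setup, while any other error is returned to the caller unchanged.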
|
|
We are already within another `#ifdef CONFIG_GPIOLIB_IRQCHIP` in
gpiochip_to_irq() so there's no need for another guard. Remove it.
Acked-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-3-fe6ba1c6116d@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
This driver uses gpiochip_irq_reqres() and gpiochip_irq_relres() which
are only built with GPIOLIB_IRQCHIP=y. Add the missing Kconfig select.
Fixes: 7688a54d5b53 ("gpio: mpc8xxx: Make irq_chip immutable")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505180309.1nosQMkI-lkp@intel.com/
Acked-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-2-fe6ba1c6116d@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
This driver uses gpiochip_irq_reqres() and gpiochip_irq_relres() which
are only built with GPIOLIB_IRQCHIP=y. Add the missing Kconfig select.
Fixes: 20117cf426b6 ("gpio: pxa: Make irq_chip immutable")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505181429.mzyIatOU-lkp@intel.com/
Acked-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-1-fe6ba1c6116d@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Lenovo Yoga Pro 7 (gen 10) with Realtek ALC3306 and combined CS35L56
amplifiers need quirk ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN to
enable bass
Signed-off-by: Ed Burcher <git@edburcher.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250519224907.31265-2-git@edburcher.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In the NVMe context, the term "shutdown" has a specific technical
meaning. To avoid confusion, this commit renames the
nvme_mpath_shutdown_disk function to nvme_mpath_remove_disk to better
reflect its purpose (i.e. removing the disk from the system). However,
nvme_mpath_remove_disk was already in use, and its functionality
is related to releasing or putting the head node disk. To resolve
this naming conflict and improve clarity, the existing
nvme_mpath_remove_disk function is also renamed to nvme_mpath_put_disk.
This renaming improves code readability and better aligns function
names with their actual roles.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Currently, a multipath head disk node is not created for single-
ported NVMe adapters or private namespaces with non-unique NSID.
However, creating a head node in these cases can help transparently
handle transient PCIe link failures. Without a head node, features
like delayed removal cannot be leveraged, making it difficult to
tolerate such link failures. To address this, this commit introduces
nvme_core module parameter multipath_always_on.
When multipath_always_on is set to true, it forces the creation of a
multipath head node regardless of the NVMe disk or namespace type. So this
option allows the use of delayed removal of head node functionality
even for single-ported NVMe disks and private namespaces with a unique
NSID and thus helps transparently handle transient PCIe link failures.
By default multipath_always_on is set to false, thus preserving the
existing behavior. Setting it to true enables improved fault tolerance
in PCIe setups. Moreover, please note that enabling this option would
also implicitly enable nvme_core.multipath.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Currently, the multipath head node of an NVMe disk is removed
immediately as soon as all paths of the disk are removed. However,
this can cause issues in scenarios where:
- The disk is hot-removed and then re-added.
- Transient PCIe link failures trigger re-enumeration,
temporarily removing and then restoring the disk.
In these cases, removing the head node prematurely may lead to a head
disk node name change upon re-addition, requiring applications to
reopen their handles if they were performing I/O during the failure.
To address this, introduce a delayed removal mechanism of head disk
node. During transient failure, instead of immediate removal of head
disk node, the system waits for a configurable timeout, allowing the
disk to recover.
During a transient disk failure, if the application sends any IO, we
queue it instead of failing it immediately. If the disk comes back
online within the timeout, the queued IOs are resubmitted to the disk,
ensuring seamless operation. If the disk cannot recover from the
failure, the queued IOs are failed and the application
receives the error.
So this way, if disk comes back online within the configured period,
the head node remains unchanged, ensuring uninterrupted workloads
without requiring applications to reopen device handles.
A new sysfs attribute named "delayed_removal_secs" is added under the head
disk blkdev for users who wish to configure the time for the delayed removal
of the head disk node. The default value of this attribute is zero
seconds, ensuring no behavior change unless explicitly configured.
Link: https://lore.kernel.org/linux-nvme/Y9oGTKCFlOscbPc2@infradead.org/
Link: https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@kbusch-mbp.dhcp.thefacebook.com/
Suggested-by: Keith Busch <kbusch@kernel.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
[nilay: reworked based on the original idea/POC from Christoph and Keith]
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
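
The behavior described above can be modelled as a small state machine. The sketch below is a simplified userspace model written for illustration, not the nvme-multipath implementation: time is an integer number of seconds, queued I/O is only counted, and every name apart from delayed_removal_secs is hypothetical.

```c
#include <assert.h>
#include <errno.h>

enum sketch_state { SKETCH_LIVE, SKETCH_WAITING, SKETCH_REMOVED };

struct sketch_head {
	enum sketch_state state;
	unsigned int delayed_removal_secs; /* 0 keeps immediate removal */
	unsigned int deadline;             /* absolute time, in seconds */
	unsigned int queued;               /* I/Os parked while waiting */
};

/* The last path went away at time now. */
static void sketch_last_path_gone(struct sketch_head *h, unsigned int now)
{
	assert(h->state == SKETCH_LIVE);
	if (h->delayed_removal_secs == 0) {
		h->state = SKETCH_REMOVED;
		return;
	}
	h->state = SKETCH_WAITING;
	h->deadline = now + h->delayed_removal_secs;
}

/* Returns 0 if submitted, 1 if queued, -EIO if the head node is gone. */
static int sketch_submit(struct sketch_head *h)
{
	switch (h->state) {
	case SKETCH_LIVE:
		return 0;
	case SKETCH_WAITING:
		h->queued++;
		return 1;
	default:
		return -EIO;
	}
}

/* A path came back; returns the number of queued I/Os to resubmit. */
static unsigned int sketch_path_back(struct sketch_head *h)
{
	unsigned int n = h->queued;

	if (h->state != SKETCH_WAITING)
		return 0;
	h->state = SKETCH_LIVE;
	h->queued = 0;
	return n;
}

/* Timer check; returns the number of queued I/Os that are failed. */
static unsigned int sketch_timeout(struct sketch_head *h, unsigned int now)
{
	unsigned int n = h->queued;

	if (h->state != SKETCH_WAITING || now < h->deadline)
		return 0;
	h->state = SKETCH_REMOVED;
	h->queued = 0;
	return n;
}
```

With the delay left at zero the model removes the head node as soon as the last path disappears, which corresponds to the unchanged default.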
|
|
Redefine the max segments and max integrity limits based on the limiting
factors. This keeps exactly the same values for 4k PAGE_SIZE systems,
but increases the number of segments for larger page size as it properly
derives the scatterlist allocation based limit for them instead of
assuming a 4k PAGE_SIZE.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
This avoids open coding the variable size array arithmetic.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
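
For readers unfamiliar with the idiom, the kernel's struct_size() computes the size of a structure plus a trailing flexible array and saturates to SIZE_MAX on overflow. The userspace sketch below imitates that behavior for one hypothetical structure; `sketch_dev`, `sketch_struct_size()` and `sketch_dev_alloc()` are illustrative names, not the nvme-pci code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure ending in a flexible array member. */
struct sketch_dev {
	unsigned int nr_pools;
	void *pools[];
};

/*
 * Overflow-checked size of the header plus n trailing elements; saturates to
 * SIZE_MAX so that a following allocation fails instead of being too small.
 */
static size_t sketch_struct_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct sketch_dev)) / sizeof(void *))
		return SIZE_MAX;
	return sizeof(struct sketch_dev) + n * sizeof(void *);
}

/* Allocate a zeroed device with room for nr_pools trailing pointers. */
static struct sketch_dev *sketch_dev_alloc(unsigned int nr_pools)
{
	struct sketch_dev *dev = calloc(1, sketch_struct_size(nr_pools));

	if (dev)
		dev->nr_pools = nr_pools;
	return dev;
}
```

Keeping the arithmetic in one helper means the overflow check cannot be forgotten at an individual allocation site.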
|
|
Open coding magic numbers in multiple places is never a good idea.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
|
|
Add a separate flag to encode that the transfer is using the small
page sized pool, and use a normal 0..n count for the number of
descriptors.
Contains improvements and suggestions from Kanchan Joshi
<joshi.k@samsung.com> and Leon Romanovsky <leon@kernel.org>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
|
|
They are used for both PRPs and SGLs, and we use descriptor elsewhere
when referring to their allocations, so use that name here as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
|
|
There is no real point in having a union of two pointer types here, just
use a void pointer as we mix and match types between the arms of the
union between the allocation and freeing side already.
Also rename the nr_allocations field to nr_descriptors to better describe
what it does.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[leon: ported forward to include metadata SGL support]
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
|
|
Instead of keeping a dedicated "bool aborted" variable, switch to a
flags variable that can be used for other flags as well.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
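
A minimal sketch of replacing a dedicated bool with a flags field follows. The flag names, the field width and the helpers are assumptions made for illustration and are not taken from the nvme-pci driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; only the first corresponds to the old bool. */
enum sketch_iod_flags {
	SKETCH_IOD_ABORTED = 1U << 0,
	SKETCH_IOD_OTHER   = 1U << 1, /* placeholder for a future flag */
};

struct sketch_iod {
	uint8_t flags;
};

/* Record that the request was aborted without touching other bits. */
static void sketch_mark_aborted(struct sketch_iod *iod)
{
	assert(iod != NULL);
	iod->flags |= SKETCH_IOD_ABORTED;
}

/* Query the bit that replaces the former bool. */
static bool sketch_is_aborted(const struct sketch_iod *iod)
{
	assert(iod != NULL);
	return (iod->flags & SKETCH_IOD_ABORTED) != 0;
}
```

Further states can then be added as new bits without growing the structure.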
|
|
No admin command defined in an NVMe specification supports metadata,
but to protect against vendor specific commands using metadata ensure
that we don't try to use SGLs for metadata on the admin queue, as NVMe
does not support SGLs on the admin queue for the PCI transport. Do
this by checking if the data transfer has been setup using SGLs as
that is required for using SGLs for metadata.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
|
|
NVMe commands with over 8 KB of discontiguous data allocate PRP list
pages from the per-nvme_device dma_pool prp_page_pool or prp_small_pool.
Each call to dma_pool_alloc() and dma_pool_free() takes the per-dma_pool
spinlock. These device-global spinlocks are a significant source of
contention when many CPUs are submitting to the same NVMe devices. On a
workload issuing 32 KB reads from 16 CPUs (8 hypertwin pairs) across 2
NUMA nodes to 23 NVMe devices, we observed 2.4% of CPU time spent in
_raw_spin_lock_irqsave called from dma_pool_alloc and dma_pool_free.
Ideally, the dma_pools would be per-hctx to minimize contention. But
that could impose considerable resource costs in a system with many NVMe
devices and CPUs.
As a compromise, allocate per-NUMA-node PRP list DMA pools. Map each
nvme_queue to the set of DMA pools corresponding to its device and its
hctx's NUMA node. This reduces the _raw_spin_lock_irqsave overhead by
about half, to 1.2%. Preventing the sharing of PRP list pages across
NUMA nodes also makes them cheaper to initialize.
Link: https://lore.kernel.org/linux-nvme/CADUfDZqa=OOTtTTznXRDmBQo1WrFcDw1hBA7XwM7hzJ-hpckcA@mail.gmail.com/T/#u
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
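
The mapping described above, from a queue to the pools of its hardware context's NUMA node, can be sketched as a per-node array on the device with a pointer cached in each queue. All type and function names below are hypothetical, the node count is fixed at a small assumed maximum, and the pools themselves are reduced to a node id.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4 /* assumed small fixed node count */

/* Hypothetical per-node pool set; real pools would live here. */
struct sketch_pools {
	int node;
};

struct sketch_nvme_dev {
	struct sketch_pools pools[SKETCH_MAX_NODES];
};

struct sketch_queue {
	struct sketch_pools *pools; /* resolved once at queue setup */
};

/* Give each per-node pool set its node id. */
static void sketch_dev_init(struct sketch_nvme_dev *dev)
{
	int node;

	for (node = 0; node < SKETCH_MAX_NODES; node++)
		dev->pools[node].node = node;
}

/* Bind a queue to the pools of the node its hardware context runs on. */
static void sketch_queue_init(struct sketch_queue *q,
			      struct sketch_nvme_dev *dev, int numa_node)
{
	assert(numa_node >= 0 && numa_node < SKETCH_MAX_NODES);
	q->pools = &dev->pools[numa_node];
}
```

Queues on the same node share one pool set, so contention on a pool's lock is limited to the CPUs of that node.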
|
|
nvme_init_hctx() and nvme_admin_init_hctx() are very similar. In
preparation for adding more logic, factor out a nvme_init_hctx_common()
helper.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Introduce dma_pool_create_node(), like dma_pool_create() but taking an
additional NUMA node argument. Allocate struct dma_pool on the desired
node, and store the node on dma_pool for allocating struct dma_page.
Make dma_pool_create() an alias for dma_pool_create_node() with node set
to NUMA_NO_NODE.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
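
The shape of this API change, a node-aware constructor plus the old entry point as a thin wrapper, can be sketched in userspace. The names below are hypothetical, and the sketch only records the node; it does not perform NUMA-aware allocation the way the dmapool code does.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NO_NODE (-1) /* "no preference", like NUMA_NO_NODE */

struct sketch_pool {
	const char *name;
	size_t size;
	int node; /* remembered for later page allocations */
};

/* Node-aware constructor: the caller states where the pool should live. */
static struct sketch_pool *sketch_pool_create_node(const char *name,
						   size_t size, int node)
{
	struct sketch_pool *pool;

	assert(size > 0);
	pool = malloc(sizeof(*pool));
	if (!pool)
		return NULL;
	pool->name = name;
	pool->size = size;
	pool->node = node;
	return pool;
}

/* Existing entry point kept as a wrapper with no node preference. */
static struct sketch_pool *sketch_pool_create(const char *name, size_t size)
{
	return sketch_pool_create_node(name, size, SKETCH_NO_NODE);
}
```

Existing callers keep their behavior through the wrapper, while new callers can pass an explicit node.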
|
|
The lsrsp object is maintained by the LLDD. The lifetime of the lsrsp
object is implicit. Because there is no explicit cleanup/free call into
the LLDD, it is not safe to assume after xml_rsp_fails, that the lsrsp
is still valid. The LLDD could have freed the object already.
With the recent changes how fcloop tracks the resources, this is the
case. Thus don't access lsrsp after xml_rsp_fails.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The lifetime of the fcloop_lsreq is not tied to the lifetime of the
host or target port, thus there is no longer any need to synchronize the
cleanup path.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Add the missing fcloop_call_host_done calls so that the caller
frees resources when something goes wrong.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Ensure that the tgtport is not going away as long as portentry has a
pointer to it.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
When nvmet_fc_unregister_targetport is called by the LLDD, it's not
possible to communicate with the host, thus all pending requests will not
be processed. Thus explicitly free them.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|