summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-22Merge tag 'platform-drivers-x86-v6.4-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Hans de Goede: "One small fix for an AMD PMF driver issue which is causing issues for users of just released AMD laptop models" * tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmf: Register notify handler only if SPS is enabled
2023-06-22Merge tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: "A fix for a race condition with poll removal and linked timeouts, and then a few followup fixes/tweaks for the msg_control patch from last week. Not super important, particularly the sparse fixup, as it was broken before that recent commit. But let's get it sorted for real for this release, rather than just have it broken a bit differently" * tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux: io_uring/net: use the correct msghdr union member in io_sendmsg_copy_hdr io_uring/net: disable partial retries for recvmsg with cmsg io_uring/net: clear msg_controllen on partial sendmsg retry io_uring/poll: serialize poll linked timer start with poll removal
2023-06-23spi: Helper for deriving timeout valuesMark Brown
Merge series from Miquel Raynal <miquel.raynal@bootlin.com>: I recently came across an issue with the Atmel spi controller driver which would stop my transfers after a too small timeout when performing big transfers (reading a 4MiB flash in one transfer). My initial idea was to derive a the maximum amount of time a transfer would take depending on its size and use that as value to avoid erroring-out when not relevant. Mark wanted to go further by creating a core helper doing that, based on the heuristics from the sun6i driver. Here is a small series of 3 patches doing exactly that.
2023-06-22Merge tag 'cgroup-for-6.4-rc7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "It's late but here are two bug fixes. Both fix problems which can be severe but are very confined in scope. The risk to most use cases should be minimal. - Fix for an old bug which triggers if a cgroup subsystem is remounted to a different hierarchy while someone is reading its cgroup.procs/tasks file. The risk is pretty low given how seldom cgroup subsystems are moved across hierarchies. - We moved cpus_read_lock() outside of cgroup internal locks a while ago but forgot to update the legacy_freezer leading to lockdep triggers. Fixed" * tag 'cgroup-for-6.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Do not corrupt task iteration when rebinding subsystem cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}()
2023-06-22spi: sun6i: Use the new helper to derive the xfer timeout valueMiquel Raynal
A helper was recently added to the core to factorize common code between drivers, like the amount of time a driver should wait for a transfer to happen. It is of course possible to use a default value (like eg. 1s) but it is way stronger to adapt this amount of time to the transfer. Indeed, long transfers (eg. 4MiB) on a slow single-spi bus might take more than the usual second of timeout and prevent lengthy transfers. The core helper was heavily inspired by the logic applied in this driver, the only difference being the minimum amount of time which was enlarged from 0.1s to 0.5s. Use this helper instead of open-coding it. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Jernej Škrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/Message-Id: <20230622090634.3411468-4-miquel.raynal@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-22spi: atmel: Prevent false timeouts on long transfersMiquel Raynal
A slow SPI bus clocks at ~20MHz, which means it would transfer about 2500 bytes per second with a single data line. Big transfers, like when dealing with flashes can easily reach a few MiB. The current DMA timeout is set to 1 second, which means any working transfer of about 4MiB will always be cancelled. With the above derivations, on a slow bus, we can assume every byte will take at most 0.4ms. Said otherwise, we could add 4ms to the 1-second timeout delay every 10kiB. On a 4MiB transfer, it would bring the timeout delay up to 2.6s which still seems rather acceptable for a timeout. The consequence of this is that long transfers might be allowed, which hence requires the need to interrupt the transfer if wanted by the user. We can hence switch to the _interruptible variant of wait_for_completion. This leads to a little bit more handling to also handle the interrupted case but looks really acceptable overall. While at it, we drop the useless, noisy and redundant WARN_ON() call. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Ryan Wanner <ryan.wanner@microchip.com> Link: https://lore.kernel.org/r/Message-Id: <20230622090634.3411468-3-miquel.raynal@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-22dt-bindings: mtd: marvell-nand: Convert to YAML DT schemeVadym Kochan
Switch the DT binding to a YAML schema to enable the DT validation. There was also an incorrect reference to dma-names being "rxtx" where the driver and existing device trees actually use dma-names = "data" so this is corrected in the conversion. Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230619040742.1108172-2-chris.packham@alliedtelesis.co.nz
2023-06-22dt-bindings: mtd: ti,am654: Prevent unevaluated propertiesMiquel Raynal
Reference mtd-physmap.yaml which contains all the relevant properties for this device. Add "unevaluatedProperties: false" to avoid any spurious addition of random properties. Cc: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-18-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: mediatek: Prevent NAND chip unevaluated propertiesMiquel Raynal
nand-on-flash-bbt is a generic property which may apply to any raw NAND chip, it does not need to be listed in each controller description. The raw NAND chip description file which contains the property is already referenced, so no need to mention the property here again. Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Xiangsheng Hou <xiangsheng.hou@mediatek.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-17-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: mediatek: Reference raw-nand-chip.yamlMiquel Raynal
The mediatek NAND controller should reference the new raw-nand-chip.yaml binding instead of the original nand-chip.yaml which does not contain *all* the properties that may be used to fully describe the NAND devices, certain properties being actually described under nand-controller.yaml. Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Xiangsheng Hou <xiangsheng.hou@mediatek.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-16-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: stm32: Prevent NAND chip unevaluated propertiesMiquel Raynal
List all the possible properties in the NAND chip as per the example and set unevaluatedProperties to false in the NAND chip section. Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Christophe Kerello <christophe.kerello@foss.st.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-15-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: rockchip: Prevent NAND chip unevaluated propertiesMiquel Raynal
List all the possible properties in the NAND chip as per the example and set unevaluatedProperties to false in the NAND chip section. Cc: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-14-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: intel: Prevent NAND chip unevaluated propertiesMiquel Raynal
nand-ecc-mode is a generic property which may apply to any raw NAND chip, it does not need to be listed in each controller description. Instead, let's reference the raw NAND chip description file which contains the property. The description contained "additionalProperties: false" which is wrong as other properties such as partitions might very well be added in the final .dts, and anyway needs to be converted into "unexpectedProperties: false" to fit the property change new requirements. Cc: Vadivel Murugan <vadivel.muruganx.ramuthevar@linux.intel.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-13-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: denali: Prevent NAND chip unevaluated propertiesMiquel Raynal
Ensure all raw NAND chip properties are valid by referencing the relevant schema and set unevaluatedProperties to false in the NAND chip section to avoid spurious additions of random properties. Doing this in one location also saves us from dupplicating the description of the NAND chip object. Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-12-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: brcmnand: Prevent NAND chip unevaluated propertiesMiquel Raynal
Ensure all raw NAND chip properties are valid by referencing the relevant schema and set unevaluatedProperties to false in the NAND chip section to avoid spurious additions of random properties. Cc: Brian Norris <computersforpeace@gmail.com> Cc: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-11-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: meson: Prevent NAND chip unevaluated propertiesMiquel Raynal
Ensure all raw NAND chip properties are valid by referencing the relevant schema and set unevaluatedProperties to false in the NAND chip section to avoid spurious additions of random properties. Cc: Liang Yang <liang.yang@amlogic.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-10-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: sunxi: Prevent NAND chip unevaluated propertiesMiquel Raynal
nand-ecc-mode is a generic property which may apply to any raw NAND chip, it does not need to be listed in each controller description. Instead, let's reference the raw NAND chip description file which contains the property. The description contained "additionalProperties: false" which is wrong as other properties such as partitions might very well be added in the final .dts, and anyway needs to be converted into "unexpectedProperties: false" to fit the property change new requirements. Cc: Maxime Ripard <mripard@kernel.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Samuel Holland <samuel@sholland.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-9-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: ingenic: Prevent NAND chip unevaluated propertiesMiquel Raynal
List all the possible properties in the NAND chip as per the example and set unevaluatedProperties to false in the NAND chip section. Cc: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-8-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: qcom: Prevent NAND chip unevaluated propertiesMiquel Raynal
List all the possible properties in the NAND chip as per the example and set unevaluatedProperties to false in the NAND chip section. Cc: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-7-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: qcom: Fix a property positionMiquel Raynal
qcom,boot-partitions is a NAND chip property, not a NAND controller property. Move the description of the property into the NAND chip section and just enable the property in the if/else block. Fixes: 5278cc93a97f ("dt-bindings: mtd: qcom_nandc: document qcom,boot-partitions binding") Cc: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-6-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: Describe nand-ecc-modeMiquel Raynal
This property has been extensively used for almost two decades already, a lot of device trees use it, this is not the preferred way to configure the ECC engines but we cannot just ignore it. Describe the property, list the exact strings which have once been supported and mark it deprecated. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-5-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: Mark nand-ecc-placement deprecatedMiquel Raynal
The nand-ecc-placement property has been deprecated for a long time already, it does not really mean something useful for the ECC engines and is anyway in the vast majority of cases totally useless. Just mark it deprecated to avoid appealing people to use it. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-4-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: Create a file for raw NAND chip propertiesMiquel Raynal
In an effort to constrain as much as we can the existing binding, we want to add "unevaluatedProperties: false" in all the NAND chip descriptions part of NAND controller bindings. But in order to do that properly, we also need to reference a file which contains all the "allowed" properties. Right now this file is nand-chip.yaml but in practice raw NAND controllers may use additional properties in their NAND chip children node. These properties are listed under nand-controller.yaml, which makes the "unevaluatedProperties" checks fail while the description are valid. We need to move these NAND chip related properties into another file, because we do not want to pollute nand-chip.yaml which is also referenced by eg. SPI-NAND devices. Let's create a raw-nand-chip.yaml file to reference all the properties a raw NAND chip description can contain. The chain of inheritance becomes: nand-controller.yaml <- raw-nand-chip.yaml raw-nand-chip.yaml <- nand-chip.yaml spi-nand.yaml <- nand-chip.yaml Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-3-miquel.raynal@bootlin.com
2023-06-22dt-bindings: mtd: Accept nand related node namesMiquel Raynal
There is no addition there, but the mtd.yaml file is so generic, it can be referenced by a wide variety of devices, including nand ones which already define the node name to "nand@<cs>". Right now it does not lead to any failure but when we will constrain more the schema, this will become a problem because we want the mtd-wide properties like label or partitions to be available for the callers. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/linux-mtd/20230619092916.3028470-2-miquel.raynal@bootlin.com
2023-06-22mtd: sm_ftl: Fix typos in commentsBo Liu
Fix typo in the description of the 'succesfull'. Signed-off-by: Bo Liu <liubo03@inspur.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230621020331.1508-1-wangdeming@inspur.com
2023-06-22Merge tag 'kvmarm-fixes-6.4-4' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.4, take #4 - Correctly save/restore PMUSERNR_EL0 when host userspace is using PMU counters directly - Fix GICv2 emulation on GICv3 after the locking rework - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and document why...
2023-06-22arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devicesDouglas Anderson
Just like for sc7180 devices using the Chrome bootflow (AKA trogdor and IDP), sc7280 devices using the Chrome bootflow also need their firmware marked dma-coherent. On sc7280 this wasn't causing WiFi to fail to startup, since WiFi works differently there. However, on sc7280 devices we were still getting the message at bootup after commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""): qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 9c900000.memory: assign memory failed qcom_rmtfs_mem: probe of 9c900000.memory failed with error -22 We should mark SCM properly just like we did for trogdor. Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: 7a1f4e7f740d ("arm64: dts: qcom: sc7280: Add basic dts/dtsi files for sc7280 soc") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.4.I21dc14a63327bf81c6bb58fe8ed91dbdc9849ee2@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdorDouglas Anderson
Trogdor devices use firmware backed by TF-A instead of Qualcomm's normal TZ. On TF-A we end up mapping memory as cacheable. Specifically, you can see in Trogdor's TF-A code [1] in qti_sip_mem_assign() that we call qti_mmap_add_dynamic_region() with MT_RO_DATA. This translates down to MT_MEMORY instead of MT_NON_CACHEABLE or MT_DEVICE. Apparently Qualcomm's normal TZ implementation maps the memory as non-cacheable. Let's add the "dma-coherent" attribute to the SCM for trogdor. Adding "dma-coherent" like this fixes WiFi on sc7180-trogdor devices. WiFi was broken as of commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""). Specifically at bootup we'd get: qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 94600000.memory: assign memory failed qcom_rmtfs_mem: probe of 94600000.memory failed with error -22 From discussion on the mailing lists [2] and over IRC [3], it was determined that we should always have been tagging the SCM as dma-coherent on trogdor but that the old "invalidate" happened to make things work most of the time. Tagging it properly like this is a much more robust solution. [1] https://chromium.googlesource.com/chromiumos/third_party/arm-trusted-firmware/+/refs/heads/firmware-trogdor-13577.B/plat/qti/common/src/qti_syscall.c [2] https://lore.kernel.org/r/20230614165904.1.I279773c37e2c1ed8fbb622ca6d1397aea0023526@changeid [3] https://oftc.irclog.whitequark.org/linux-msm/2023-06-15 Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt") Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.3.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDPDouglas Anderson
sc7180-idp is, for most intents and purposes, a trogdor device. Specifically, sc7180-idp is designed to run the same style of firmware as trogdor devices. This can be seen from the fact that IDP has the same "Reserved memory changes" in its device tree that trogdor has. Recently it was realized that we need to mark SCM as dma-coherent to match what trogdor's style of firmware (based on TF-A) does [1]. That means we need this dma-coherent tag on IDP as well. Without this, on newer versions of Linux, specifically those with commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""), WiFi will fail to work. At bootup you'll see: qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 94600000.memory: assign memory failed qcom_rmtfs_mem: probe of 94600000.memory failed with error -22 [1] https://lore.kernel.org/r/20230615145253.1.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: f5ab220d162c ("arm64: dts: qcom: sc7180: Add remoteproc enablers") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.2.I3c17d546d553378aa8a0c68c3fe04bccea7cba17@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherentDouglas Anderson
Trogdor devices use firmware backed by TF-A instead of Qualcomm's normal TZ. On TF-A we end up mapping memory as cacheable. Specifically, you can see in Trogdor's TF-A code [1] in qti_sip_mem_assign() that we call qti_mmap_add_dynamic_region() with MT_RO_DATA. This translates down to MT_MEMORY instead of MT_NON_CACHEABLE or MT_DEVICE. Let's allow devices like trogdor to be described properly by allowing "dma-coherent" in the SCM node. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230616081440.v2.1.Ie79b5f0ed45739695c9970df121e11d724909157@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22KVM: Avoid illegal stage2 mapping on invalid memory slotGavin Shan
We run into guest hang in edk2 firmware when KSM is kept as running on the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash device (TYPE_PFLASH_CFI01) during the operation of sector erasing or buffered write. The status is returned by reading the memory region of the pflash device and the read request should have been forwarded to QEMU and emulated by it. Unfortunately, the read request is covered by an illegal stage2 mapping when the guest hang issue occurs. The read request is completed with QEMU bypassed and wrong status is fetched. The edk2 firmware runs into an infinite loop with the wrong status. The illegal stage2 mapping is populated due to same page sharing by KSM at (C) even the associated memory slot has been marked as invalid at (B) when the memory slot is requested to be deleted. It's notable that the active and inactive memory slots can't be swapped when we're in the middle of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count is elevated, and kvm_swap_active_memslots() will busy loop until it reaches to zero again. Besides, the swapping from the active to the inactive memory slots is also avoided by holding &kvm->srcu in __kvm_handle_hva_range(), corresponding to synchronize_srcu_expedited() in kvm_swap_active_memslots(). CPU-A CPU-B ----- ----- ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION) kvm_vm_ioctl_set_memory_region kvm_set_memory_region __kvm_set_memory_region kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE) kvm_invalidate_memslot kvm_copy_memslot kvm_replace_memslot kvm_swap_active_memslots (A) kvm_arch_flush_shadow_memslot (B) same page sharing by KSM kvm_mmu_notifier_invalidate_range_start : kvm_mmu_notifier_change_pte kvm_handle_hva_range __kvm_handle_hva_range kvm_set_spte_gfn (C) : kvm_mmu_notifier_invalidate_range_end Fix the issue by skipping the invalid memory slot at (C) to avoid the illegal stage2 mapping so that the read request for the pflash's status is forwarded to QEMU and emulated by it. In this way, the correct pflash's status can be returned from QEMU to break the infinite loop in the edk2 firmware. We tried a git-bisect and the first problematic commit is cd4c71835228 (" KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this, clean_dcache_guest_page() is called after the memory slots are iterated in kvm_mmu_notifier_change_pte(). clean_dcache_guest_page() is called before the iteration on the memory slots before this commit. This change literally enlarges the racy window between kvm_mmu_notifier_change_pte() and memory slot removal so that we're able to reproduce the issue in a practical test case. However, the issue exists since commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup"). Cc: stable@vger.kernel.org # v3.9+ Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup") Reported-by: Shuai Hu <hshuai@redhat.com> Reported-by: Zhenyu Zhang <zhenyzha@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Message-Id: <20230615054259.14911-1-gshan@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-22Revert "cgroup: Avoid -Wstringop-overflow warnings"Tejun Heo
This reverts commit 36de5f303ca1bd6fce74815ef17ef3d8ff8737b5. The commit caused boot failures on some configurations due to cgroup hierarchies not being created at all. Signed-off-by: Tejun Heo <tj@kernel.org>
2023-06-22spi: dt-bindings: stm32: do not disable spi-slave property for stm32f4-f7Valentin Caron
STM32F4-F7 are, from hardware point of view, capable to handle device mode. So this property should not be forced at false in dt-bindings. Signed-off-by: Valentin Caron <valentin.caron@foss.st.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/Message-Id: <20230621115523.923176-3-valentin.caron@foss.st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-22ACPI: video: Add backlight=native DMI quirk for Dell Studio 1569Hans de Goede
The Dell Studio 1569 predates Windows 8, so it defaults to using acpi_video# for backlight control, but this is non functional on this model. Add a DMI quirk to use the native intel_backlight interface which does work properly. Reported-by: raycekarneal <raycekarneal@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-22block: don't return -EINVAL for not found names in devt_from_devnameChristoph Hellwig
When we didn't find a device and didn't guess it might be a partition, it might still show up later, so don't disable rootwait for it by returning -EINVAL. Fixes: 079caa35f786 ("init: clear root_wait on all invalid root= strings") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230622150644.600327-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22btrfs: fix remaining u32 overflows when left shifting stripe_nrQu Wenruo
There was regression caused by a97699d1d610 ("btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN") and supposedly fixed by a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr"). To avoid code churn the fix was open coding the type casts but unfortunately missed one which was still possible to hit [1]. The missing place was assignment of bioc->full_stripe_logical inside btrfs_map_block(). Fix it by adding a helper that does the safe calculation of the offset and use it everywhere even though it may not be strictly necessary due to already using u64 types. This replaces all remaining "<< BTRFS_STRIPE_LEN_SHIFT" calls. [1] https://lore.kernel.org/linux-btrfs/20230622065438.86402-1-wqu@suse.com/ Fixes: a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr") Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-22spi: Create a helper to derive adaptive timeoutsMiquel Raynal
Big transfers might take a bit of time, too constraining timeouts might lead to false positives. In order to simplify the drivers work and with the goal of factorizing code in mind, let's add a helper that can be used by any spi controller driver to derive a relevant per-transfer timeout value. The logic is simple: we know how much time it would take to transfer a byte, we can easily derive the total theoretical amount of time involved for each transfer. We multiply it by two to have a bit of margin and enforce a minimum of 500ms. Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/Message-Id: <20230622090634.3411468-2-miquel.raynal@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-22cdrom: Fix spectre-v1 gadgetJordy Zomer
This patch fixes a spectre-v1 gadget in cdrom. The gadget could be triggered by speculatively bypassing the cdi->capacity check. Signed-off-by: Jordy Zomer <jordyzomer@google.com> Link: https://lore.kernel.org/all/20230612110040.849318-2-jordyzomer@google.com Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/all/ZI1+1OG9Ut1MqsUC@equinox Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/r/20230617113828.1230-2-phil@philpotter.co.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22block: make sure local irq is disabled when calling __blkcg_rstat_flushMing Lei
When __blkcg_rstat_flush() is called from cgroup_rstat_flush*() code path, interrupt is always disabled. When we start to flush blkcg per-cpu stats list in __blkg_release() for avoiding to leak blkcg_gq's reference in commit 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq"), local irq isn't disabled yet, then lockdep warning may be triggered because the dependent cgroup locks may be acquired from irq(soft irq) handler. Fix the issue by disabling local irq always. Fixes: 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-block/pz2wzwnmn5tk3pwpskmjhli6g3qly7eoknilb26of376c7kwxy@qydzpvt6zpis/T/#u Cc: stable@vger.kernel.org Cc: Jay Shin <jaeshin@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20230622084249.1208005-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22erofs: clean up zmap.cGao Xiang
Several trivial cleanups which aren't quite necessary to split: - Rename lcluster load functions as well as justify full indexes since they are typically used for global deduplication for compressed data; - Avoid unnecessary lines, comments for simplicity. No logic changes. Reviewed-by: Guo Xuenan <guoxuenan@huaweicloud.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230615064421.103178-1-hsiangkao@linux.alibaba.com
2023-06-22erofs: remove unnecessary gotoYangtao Li
It's redundant, let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Link: https://lore.kernel.org/r/20230615034539.14286-1-frank.li@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-06-22erofs: Fix detection of atomic contextSandeep Dhavale
Current check for atomic context is not sufficient as z_erofs_decompressqueue_endio can be called under rcu lock from blk_mq_flush_plug_list(). See the stacktrace [1] In such case we should hand off the decompression work for async processing rather than trying to do sync decompression in current context. Patch fixes the detection by checking for rcu_read_lock_any_held() and while at it use more appropriate !in_task() check than in_atomic(). Background: Historically erofs would always schedule a kworker for decompression which would incur the scheduling cost regardless of the context. But z_erofs_decompressqueue_endio() may not always be in atomic context and we could actually benefit from doing the decompression in z_erofs_decompressqueue_endio() if we are in thread context, for example when running with dm-verity. This optimization was later added in patch [2] which has shown improvement in performance benchmarks. ============================================== [1] Problem stacktrace [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291 [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi [name:core&]preempt_count: 0, expected: 0 [name:core&]RCU nest depth: 1, expected: 0 CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1 Hardware name: MT6897 (DT) Call trace: dump_backtrace+0x108/0x15c show_stack+0x20/0x30 dump_stack_lvl+0x6c/0x8c dump_stack+0x20/0x48 __might_resched+0x1fc/0x308 __might_sleep+0x50/0x88 mutex_lock+0x2c/0x110 z_erofs_decompress_queue+0x11c/0xc10 z_erofs_decompress_kickoff+0x110/0x1a4 z_erofs_decompressqueue_endio+0x154/0x180 bio_endio+0x1b0/0x1d8 __dm_io_complete+0x22c/0x280 clone_endio+0xe4/0x280 bio_endio+0x1b0/0x1d8 blk_update_request+0x138/0x3a4 blk_mq_plug_issue_direct+0xd4/0x19c blk_mq_flush_plug_list+0x2b0/0x354 __blk_flush_plug+0x110/0x160 blk_finish_plug+0x30/0x4c read_pages+0x2fc/0x370 page_cache_ra_unbounded+0xa4/0x23c page_cache_ra_order+0x290/0x320 do_sync_mmap_readahead+0x108/0x2c0 filemap_fault+0x19c/0x52c __do_fault+0xc4/0x114 handle_mm_fault+0x5b4/0x1168 do_page_fault+0x338/0x4b4 do_translation_fault+0x40/0x60 do_mem_abort+0x60/0xc8 el0_da+0x4c/0xe0 el0t_64_sync_handler+0xd4/0xfc el0t_64_sync+0x1a0/0x1a4 [2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/ Reported-by: Will Shiu <Will.Shiu@mediatek.com> Suggested-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Sandeep Dhavale <dhavale@google.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20230621220848.3379029-1-dhavale@google.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-06-22s390/defconfigs: set CONFIG_NET_TC_SKB_EXT=yNiklas Schnelle
As made explicit by commit 03a283cdc8c8 ("net/mlx5: Kconfig: Make tc offload depend on tc skb extension") tc skb extension is required for offloading tc as well as bridges on switchdev capable ConnectX devices. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-22Merge tag 'nf-23-06-21' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This is v3, including a crash fix for patch 01/14. The following patchset contains Netfilter/IPVS fixes for net: 1) Fix UDP segmentation with IPVS tunneled traffic, from Terin Stock. 2) Fix chain binding transaction logic, add a bound flag to rule transactions. Remove incorrect logic in nft_data_hold() and nft_data_release(). 3) Add a NFT_TRANS_PREPARE_ERROR deactivate state to deal with releasing the set/chain as a follow up to 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE") 4) Drop map element references from preparation phase instead of set destroy path, otherwise bogus EBUSY with transactions such as: flush chain ip x y delete chain ip x w where chain ip x y contains jump/goto from set elements. 5) Pipapo set type does not regard generation mask from the walk iteration. 6) Fix reference count underflow in set element reference to stateful object. 7) Several patches to tighten the nf_tables API: - disallow set element updates of bound anonymous set - disallow unbound anonymous set/chain at the end of transaction. - disallow updates of anonymous set. - disallow timeout configuration for anonymous sets. 8) Fix module reference leak in chain updates. 9) Fix nfnetlink_osf module autoload. 10) Fix deletion of basechain when NFTA_CHAIN_HOOK is specified as in iptables-nft. This Netfilter batch is larger than usual at this stage, I am aware we are fairly late in the -rc cycle, if you prefer to route them through net-next, please let me know. netfilter pull request 23-06-21 * tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets netfilter: nf_tables: reject unbound chain set before commit phase netfilter: nf_tables: reject unbound anonymous set before commit phase netfilter: nf_tables: disallow element updates of bound anonymous sets netfilter: nf_tables: fix underflow in object reference counter netfilter: nft_set_pipapo: .walk does not deal with generations netfilter: nf_tables: drop map element references from preparation phase netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain netfilter: nf_tables: fix chain binding transaction logic ipvs: align inner_mac_header for encapsulation ==================== Link: https://lore.kernel.org/r/20230621100731.68068-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22s390/cpum_cf: rework PER_CPU_DEFINE of struct cpu_cf_eventsThomas Richter
Struct cpu_cf_events is a large data structure and is statically defined for each possible CPU. Rework this and replace it by dynamically allocated data structures created when a perf_event_open() system call is invoked or an access via character device /dev/hwctr takes place. It is replaced by an array of pointers to all possible CPUs and reference counting. The array of pointers is allocated when the first event is created. For each online CPU an event is installed on, a struct cpu_cf_events is allocated and a pointer to struct cpu_cf_events is stored in the array: CPU 0 1 2 3 ... N +---+---+---+---+---+---+ cpu_cf_root::cpucf--> | * | | | |...| | +-|-+---+---+---+---+---+ | | \|/ +-------------+ |cpu_cf_events| | | +-------------+ With this approach the large data structure is only allocated when an event is actually installed and used. Also implement proper reference counting for allocation and removal. During interrupt processing make sure the pointer to cpu_cf_events is valid. The interrupt handler is shared and might be called when no event is active. This requires checking for a valid pointer to struct cpu_cf_events. When the pointer to the per-cpu cpu_cf_events is NULL, simply return. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-22revert "net: align SO_RCVMARK required privileges with SO_MARK"Maciej Żenczykowski
This reverts commit 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") because the reasoning in the commit message is not really correct: SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check and retrieves the socket mark, rather than 'setsockopt(SO_MARK) which sets the socket mark and does require privs. Additionally incoming skb->mark may already be visible if sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled. Furthermore, it is easier to block the getsockopt via bpf (either cgroup setsockopt hook, or via syscall filters) then to unblock it if it requires CAP_NET_RAW/ADMIN. On Android the socket mark is (among other things) used to store the network identifier a socket is bound to. Setting it is privileged, but retrieving it is not. We'd like unprivileged userspace to be able to read the network id of incoming packets (where mark is set via iptables [to be moved to bpf])... An alternative would be to add another sysctl to control whether setting SO_RCVMARK is privilged or not. (or even a MASK of which bits in the mark can be exposed) But this seems like over-engineering... Note: This is a non-trivial revert, due to later merged commit e42c7beee71d ("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()") which changed both 'ns_capable' into 'sockopt_ns_capable' calls. Fixes: 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") Cc: Larysa Zaremba <larysa.zaremba@intel.com> Cc: Simon Horman <simon.horman@corigine.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eyal Birger <eyal.birger@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Patrick Rohr <prohr@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22net: wwan: iosm: Convert single instance struct member to flexible arrayKees Cook
struct mux_adth actually ends with multiple struct mux_adth_dg members. This is seen both in the comments about the member: /** * struct mux_adth - Structure of the Aggregated Datagram Table Header. ... * @dg: datagramm table with variable length */ and in the preparation for populating it: adth_dg_size = offsetof(struct mux_adth, dg) + ul_adb->dg_count[i] * sizeof(*dg); ... adth_dg_size -= offsetof(struct mux_adth, dg); memcpy(&adth->dg, ul_adb->dg[i], adth_dg_size); This was reported as a run-time false positive warning: memcpy: detected field-spanning write (size 16) of single field "&adth->dg" at drivers/net/wwan/iosm/iosm_ipc_mux_codec.c:852 (size 8) Adjust the struct mux_adth definition and associated sizeof() math; no binary output differences are observed in the resulting object file. Reported-by: Florian Klink <flokli@flokli.de> Closes: https://lore.kernel.org/lkml/dbfa25f5-64c8-5574-4f5d-0151ba95d232@gmail.com/ Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Cc: M Chetan Kumar <m.chetan.kumar@intel.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Intel Corporation <linuxwwan@intel.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620194234.never.023-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22dt-bindings: mmc: fsl-imx-esdhc: Add imx6ul supportOleksij Rempel
Add the 'fsl,imx6ul-usdhc' value to the compatible properties list in the fsl-imx-esdhc.yaml file. This is required to match the compatible strings present in the 'mmc@2190000' node of 'imx6ul-prti6g.dtb'. This commit addresses the following dtbs_check warning: imx6ul-prti6g.dtb:0:0: /soc/bus@2100000/mmc@2190000: failed to match any schema with compatible: ['fsl,imx6ul-usdhc', 'fsl,imx6sx-usdhc'] Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230621093245.78130-2-o.rempel@pengutronix.de Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-22mmc: mmci: Add support for SW busy-end timeoutsUlf Hansson
The ux500 variant doesn't have a HW based timeout to use for busy-end IRQs. To avoid hanging and waiting for the card to stop signaling busy, let's schedule a delayed work, according to the corresponding cmd->busy_timeout for the command. If the work gets to run, let's kick the IRQ handler to complete the currently running request/command. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20230620091113.33393-1-ulf.hansson@linaro.org
2023-06-22sch_netem: acquire qdisc lock in netem_change()Eric Dumazet
syzbot managed to trigger a divide error [1] in netem. It could happen if q->rate changes while netem_enqueue() is running, since q->rate is read twice. It turns out netem_change() always lacked proper synchronization. [1] divide error: 0000 [#1] SMP KASAN CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 RIP: 0010:div64_u64 include/linux/math64.h:69 [inline] RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline] RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576 Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03 RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246 RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000 RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297 R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358 R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000 FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0 Call Trace: <TASK> [<ffffffff84714385>] __dev_xmit_skb net/core/dev.c:3931 [inline] [<ffffffff84714385>] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290 [<ffffffff84d22df2>] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [<ffffffff84d22df2>] neigh_hh_output include/net/neighbour.h:531 [inline] [<ffffffff84d22df2>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff84d22df2>] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235 [<ffffffff84d21e63>] __ip_finish_output+0xc3/0x2b0 [<ffffffff84d10a81>] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323 [<ffffffff84d10f14>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff84d10f14>] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437 [<ffffffff84d123b5>] dst_output include/net/dst.h:444 [inline] [<ffffffff84d123b5>] ip_local_out net/ipv4/ip_output.c:127 [inline] [<ffffffff84d123b5>] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542 [<ffffffff84d12fdc>] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>