summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2022-02-06Merge tag 'irq_urgent_for_v5.17_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Borislav Petkov: "Remove a bogus warning introduced by the recent PCI MSI irq affinity overhaul" * tag 'irq_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Remove bogus warning in pci_irq_get_affinity()
2022-02-06Merge tag 'edac_urgent_for_v5.17_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: "Fix altera and xgene EDAC drivers to propagate the correct error code from platform_get_irq() so that deferred probing still works" * tag 'edac_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/xgene: Fix deferred probing EDAC/altera: Fix deferred probing
2022-02-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Some medium sized bugs in the various drivers. A couple are more recent regressions: - Fix two panics in hfi1 and two allocation problems - Send the IGMP to the correct address in cma - Squash a syzkaller bug related to races reading the multicast list - Memory leak in siw and cm - Fix a corner case spec compliance for HFI/QIB - Correct the implementation of fences in siw - Error unwind bug in mlx4" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx4: Don't continue event handler after memory allocation failure RDMA/siw: Fix broken RDMA Read Fence/Resume logic. IB/rdmavt: Validate remote_addr during loopback atomic tests IB/cm: Release previously acquired reference counter in the cm_id_priv RDMA/siw: Fix refcounting leak in siw_create_qp() RDMA/ucma: Protect mc during concurrent multicast leaves RDMA/cma: Use correct address when leaving multicast group IB/hfi1: Fix tstats alloc and dealloc IB/hfi1: Fix AIP early init panic IB/hfi1: Fix alloc failure with larger txqueuelen IB/hfi1: Fix panic with larger ipoib send_queue_size
2022-02-04Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Seven fixes, six of which are fairly obvious driver fixes. The one core change to the device budget depth is to try to ensure that if the default depth is large (which can produce quite a sizeable bitmap allocation per device), we give back the memory we don't need if there's a queue size reduction in slave_configure (which happens to a lot of devices)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: hisi_sas: Fix setting of hisi_sas_slot.is_internal scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task scsi: pm8001: Fix use-after-free for aborted TMF sas_task scsi: pm8001: Fix warning for undescribed param in process_one_iomb() scsi: core: Reallocate device's budget map on queue depth change scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe scsi: pm80xx: Fix double completion for SATA devices
2022-02-04Merge tag 'pci-v5.17-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Restructure j721e_pcie_probe() so we don't dereference a NULL pointer (Bjorn Helgaas) - Add a kirin_pcie_data struct to identify different Kirin variants to fix probe failure for controllers with an internal PHY (Bjorn Helgaas) * tag 'pci-v5.17-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: kirin: Add dev struct for of_device_get_match_data() PCI: j721e: Initialize pcie->cdns_pcie before using it
2022-02-04PCI: kirin: Add dev struct for of_device_get_match_data()Bjorn Helgaas
Bean reported that a622435fbe1a ("PCI: kirin: Prefer of_device_get_match_data()") broke kirin_pcie_probe() because it assumed match data of 0 was a failure when in fact, it meant the match data was "(void *)PCIE_KIRIN_INTERNAL_PHY". Therefore, probing of "hisilicon,kirin960-pcie" devices failed with -EINVAL and an "OF data missing" message. Add a struct kirin_pcie_data to encode the PHY type. Then the result of of_device_get_match_data() should always be a non-NULL pointer to a struct kirin_pcie_data that contains the PHY type. Fixes: a622435fbe1a ("PCI: kirin: Prefer of_device_get_match_data()") Link: https://lore.kernel.org/r/20220202162659.GA12603@bhelgaas Link: https://lore.kernel.org/r/20220201215941.1203155-1-huobean@gmail.com Reported-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-02-04Merge tag 'block-5.17-2022-02-04' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request - fix use-after-free in rdma and tcp controller reset (Sagi Grimberg) - fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar) - MD nowait null pointer fix (Song) - blk-integrity seed advance fix (Martin) - Fix a dio regression in this merge window (Ilya) * tag 'block-5.17-2022-02-04' of git://git.kernel.dk/linux-block: block: bio-integrity: Advance seed correctly for larger interval sizes nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts() md: fix NULL pointer deref with nowait but no mddev->queue block: fix DIO handling regressions in blkdev_read_iter() nvme-rdma: fix possible use-after-free in transport error_recovery work nvme-tcp: fix possible use-after-free in transport error_recovery work nvme: fix a possible use-after-free in controller reset during load
2022-02-04Merge tag 'ata-5.17-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ATA fixes from Damien Le Moal: - Sergey volunteered to be a reviewer for the Renesas R-Car SATA driver and PATA drivers. Update the MAINTAINERS file accordingly. - Regression fix: add a horkage flag to prevent accessing the log directory log page with SATADOM-ML 3ME SATA devices as they react badly to reading that log page (from Anton). * tag 'ata-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-core: Introduce ATA_HORKAGE_NO_LOG_DIR horkage MAINTAINERS: add myself as Renesas R-Car SATA driver reviewer MAINTAINERS: add myself as PATA drivers reviewer
2022-02-04Merge tag 'iommu-fixes-v5.17-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Warning fixes and a fix for a potential use-after-free in IOMMU core code - Another potential memory leak fix for the Intel VT-d driver - Fix for an IO polling loop timeout issue in the AMD IOMMU driver * tag 'iommu-fixes-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix loop timeout issue in iommu_ga_log_enable() iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping() iommu: Fix some W=1 warnings iommu: Fix potential use-after-free during probe
2022-02-04Merge tag 'random-5.17-rc3-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fixes from Jason Donenfeld: "For this week, we have: - A fix to make more frequent use of hwgenerator randomness, from Dominik. - More cleanups to the boot initialization sequence, from Dominik. - A fix for an old shortcoming with the ZAP ioctl, from me. - A workaround for a still unfixed Clang CFI/FullLTO compiler bug, from me. On one hand, it's a bummer to commit workarounds for experimental compiler features that have bugs. But on the other, I think this actually improves the code somewhat, independent of the bug. So a win-win" * tag 'random-5.17-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: only call crng_finalize_init() for primary_crng random: access primary_pool directly rather than through pointer random: wake up /dev/random writers after zap random: continually use hwgenerator randomness lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI
2022-02-04Merge tag 'acpi-5.17-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix compilation in the case when ACPI is selected and CRC32, depended on by ACPI after recent changes, is not (Randy Dunlap)" * tag 'acpi-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: require CRC32 to build
2022-02-04Merge tag 'sound-5.17-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. The major changes are ASoC core fixes, addressing the DPCM locking issue after the recent code changes and the potentially invalid register accesses via control API. Also, HD-audio got a core fix for Oops at dynamic unbinding. The rest are device-specific small fixes, including the usual stuff like HD-audio and USB-audio quirks" * tag 'sound-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits) ALSA: hda: Skip codec shutdown in case the codec is not registered ALSA: usb-audio: Correct quirk for VF0770 ALSA: Replace acpi_bus_get_device() Input: wm97xx: Simplify resource management ALSA: hda/realtek: Add quirk for ASUS GU603 ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset) ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks ALSA: hda: realtek: Fix race at concurrent COEF updates ASoC: ops: Check for negative values before reading them ASoC: rt5682: Fix deadlock on resume ASoC: hdmi-codec: Fix OOB memory accesses ASoC: soc-pcm: Move debugfs removal out of spinlock ASoC: soc-pcm: Fix DPCM lockdep warning due to nested stream locks ASoC: fsl: Add missing error handling in pcm030_fabric_probe ALSA: hda: Fix signedness of sscanf() arguments ALSA: usb-audio: initialize variables that could ignore errors ALSA: hda: Fix UAF of leds class devs at unbinding ASoC: qdsp6: q6apm-dai: only stop graphs that are started ASoC: codecs: wcd938x: fix return value of mixer put function ...
2022-02-04Merge tag 'drm-fixes-2022-02-04' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular fixes for the week. Daniel has agreed to bring back the fbcon hw acceleration under a CONFIG option for the non-drm fbdev users, we don't advise turning this on unless you are in the niche that is old fbdev drivers, Since it's essentially a revert and shouldn't be high impact seemed like a good time to do it now. Otherwise, i915 and amdgpu fixes are most of it, along with some minor fixes elsewhere. fbdev: - readd fbcon acceleration i915: - fix DP monitor via type-c dock - fix for engine busyness and read timeout with GuC - use ALLOW_FAIL for error capture buffer allocs - don't use interruptible lock on error paths - smatch fix to reject zero sized overlays. amdgpu: - mGPU fan boost fix for beige goby - S0ix fixes - Cyan skillfish hang fix - DCN fixes for DCN 3.1 - DCN fixes for DCN 3.01 - Apple retina panel fix - ttm logic inversion fix dma-buf: - heaps: fix potential spectre v1 gadget kmb: - fix potential oob access mxsfb: - fix NULL ptr deref nouveau: - fix potential oob access during BIOS decode" * tag 'drm-fixes-2022-02-04' of git://anongit.freedesktop.org/drm/drm: (24 commits) drm: mxsfb: Fix NULL pointer dereference drm/amdgpu: fix logic inversion in check drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels drm/amd/display: revert "Reset fifo after enable otg" drm/amd/display: watermark latencies is not enough on DCN31 drm/amd/display: Update watermark values for DCN301 drm/amdgpu: fix a potential GPU hang on cyan skillfish drm/amd: Only run s3 or s0ix if system is configured properly drm/amd: add support to check whether the system is set to s3 fbcon: Add option to enable legacy hardware acceleration Revert "fbcon: Disable accelerated scrolling" Revert "fbdev: Garbage collect fbdev scrolling acceleration, part 1 (from TODO list)" drm/i915/pmu: Fix KMD and GuC race on accessing busyness dma-buf: heaps: Fix potential spectre v1 gadget drm/amd: Warn users about potential s0ix problems drm/amd/pm: correct the MGpuFanBoost support for Beige Goby drm/nouveau: fix off by one in BIOS boundary checking drm/i915/adlp: Fix TypeC PHY-ready status readout drm/i915/pmu: Use PM timestamp instead of RING TIMESTAMP for reference ...
2022-02-04random: only call crng_finalize_init() for primary_crngDominik Brodowski
crng_finalize_init() returns instantly if it is called for another pool than primary_crng. The test whether crng_finalize_init() is still required can be moved to the relevant caller in crng_reseed(), and crng_need_final_init can be reset to false if crng_finalize_init() is called with workqueues ready. Then, no previous callsite will call crng_finalize_init() unless it is needed, and we can get rid of the superfluous function parameter. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04random: access primary_pool directly rather than through pointerDominik Brodowski
Both crng_initialize_primary() and crng_init_try_arch_early() are only called for the primary_pool. Accessing it directly instead of through a function parameter simplifies the code. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04random: wake up /dev/random writers after zapJason A. Donenfeld
When account() is called, and the amount of entropy dips below random_write_wakeup_bits, we wake up the random writers, so that they can write some more in. However, the RNDZAPENTCNT/RNDCLEARPOOL ioctl sets the entropy count to zero -- a potential reduction just like account() -- but does not unblock writers. This commit adds the missing logic to that ioctl to unblock waiting writers. Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04random: continually use hwgenerator randomnessDominik Brodowski
The rngd kernel thread may sleep indefinitely if the entropy count is kept above random_write_wakeup_bits by other entropy sources. To make best use of multiple sources of randomness, mix entropy from hardware RNGs into the pool at least once within CRNG_RESEED_INTERVAL. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()Joerg Roedel
The polling loop for the register change in iommu_ga_log_enable() needs to have a udelay() in it. Otherwise the CPU might be faster than the IOMMU hardware and wrongly trigger the WARN_ON() further down the code stream. Use a 10us for udelay(), has there is some hardware where activation of the GA log can take more than a 100ms. A future optimization should move the activation check of the GA log to the point where it gets used for the first time. But that is a bigger change and not suitable for a fix. Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220204115537.3894-1-joro@8bytes.org
2022-02-04PCI/MSI: Remove bogus warning in pci_irq_get_affinity()Thomas Gleixner
The recent overhaul of pci_irq_get_affinity() introduced a regression when pci_irq_get_affinity() is called for an MSI-X interrupt which was not allocated with affinity descriptor information. The original code just returned a NULL pointer in that case, but the rework added a WARN_ON() under the assumption that the corresponding WARN_ON() in the MSI case can be applied to MSI-X as well. In fact the MSI warning in the original code does not make sense either because it's legitimate to invoke pci_irq_get_affinity() for a MSI interrupt which was not allocated with affinity descriptor information. Remove it and just return NULL as the original code did. Fixes: f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87ee4n38sm.ffs@tglx
2022-02-04ata: libata-core: Introduce ATA_HORKAGE_NO_LOG_DIR horkageAnton Lundin
06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls") introduced additional calls to ata_identify_page_supported(), thus also adding indirectly accesses to the device log directory log page through ata_log_supported(). Reading this log page causes SATADOM-ML 3ME devices to lock up. Introduce the horkage flag ATA_HORKAGE_NO_LOG_DIR to prevent accesses to the log directory in ata_log_supported() and add a blacklist entry with this flag for "SATADOM-ML 3ME" devices. Fixes: 636f6e2af4fb ("libata: add horkage for missing Identify Device log") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Anton Lundin <glance@acc.umu.se> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-04Merge tag 'drm-intel-fixes-2022-02-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Fix GitLab issue #4698: DP monitor through Type-C dock(Dell DA310) doesn't work. Fixes for inconsistent engine busyness value and read timeout with GuC. Fix to use ALLOW_FAIL for error capture buffer allocation. Don't use interruptible lock on error path. Smatch fix to reject zero sized overlays. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YfuiG8SKMKP5V/Dm@jlahtine-mobl.ger.corp.intel.com
2022-02-04Merge tag 'drm-misc-fixes-2022-02-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes * dma-buf/heaps: Fix potential spectre v1 gadget * drm/kmb: Fix potential out-of-bounds access * drm/mxsfb: Fix NULL-pointer dereference * drm/nouveau: Fix potential out-of-bounds access in BIOS decoding * fbdev: Re-add support for fbcon hardware acceleration Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/Yfu8mTZQUNt1RwZd@linux-uq9g
2022-02-03Merge tag 'net-5.17-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, netfilter, and ieee802154. Current release - regressions: - Partially revert "net/smc: Add netlink net namespace support", fix uABI breakage - netfilter: - nft_ct: fix use after free when attaching zone template - nft_byteorder: track register operations Previous releases - regressions: - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback - phy: qca8081: fix speeds lower than 2.5Gb/s - sched: fix use-after-free in tc_new_tfilter() Previous releases - always broken: - tcp: fix mem under-charging with zerocopy sendmsg() - tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() - neigh: do not trigger immediate probes on NUD_FAILED from neigh_managed_work, avoid a deadlock - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN false-positives - netfilter: nft_reject_bridge: fix for missing reply from prerouting - smc: forward wakeup to smc socket waitqueue after fallback - ieee802154: - return meaningful error codes from the netlink helpers - mcr20a: fix lifs/sifs periods - at86rf230, ca8210: stop leaking skbs on error paths - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent - ax25: add refcount in ax25_dev to avoid UAF bugs - eth: mlx5e: - fix SFP module EEPROM query - fix broken SKB allocation in HW-GRO - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows - eth: amd-xgbe: - fix skb data length underflow - ensure reset of the tx_timer_active flag, avoid Tx timeouts - eth: stmmac: fix runtime pm use in stmmac_dvr_remove() - eth: e1000e: handshake with CSME starts from Alder Lake platforms" * tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) ax25: fix reference count leaks of ax25_dev net: stmmac: ensure PTP time register reads are consistent net: ipa: request IPA register values be retained dt-bindings: net: qcom,ipa: add optional qcom,qmp property tools/resolve_btfids: Do not print any commands when building silently bpf: Use VM_MAP instead of VM_ALLOC for ringbuf net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() net: sparx5: do not refer to skb after passing it on Partially revert "net/smc: Add netlink net namespace support" net/mlx5e: Avoid field-overflowing memcpy() net/mlx5e: Use struct_group() for memcpy() region net/mlx5e: Avoid implicit modify hdr for decap drop rule net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic net/mlx5e: Don't treat small ceil values as unlimited in HTB offload net/mlx5: E-Switch, Fix uninitialized variable modact net/mlx5e: Fix handling of wrong devices during bond netevent net/mlx5e: Fix broken SKB allocation in HW-GRO net/mlx5e: Fix wrong calculation of header index in HW_GRO ...
2022-02-03net: stmmac: ensure PTP time register reads are consistentYannick Vignon
Even if protected from preemption and interrupts, a small time window remains when the 2 register reads could return inconsistent values, each time the "seconds" register changes. This could lead to an about 1-second error in the reported time. Add logic to ensure the "seconds" and "nanoseconds" values are consistent. Fixes: 92ba6888510c ("stmmac: add the support for PTP hw clock driver") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03Merge tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme into block-5.17Jens Axboe
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.17 - fix a use-after-free in rdm and tcp controller reset (Sagi Grimberg) - fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar)" * tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme: nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts() nvme-rdma: fix possible use-after-free in transport error_recovery work nvme-tcp: fix possible use-after-free in transport error_recovery work nvme: fix a possible use-after-free in controller reset during load
2022-02-03net: ipa: request IPA register values be retainedAlex Elder
In some cases, the IPA hardware needs to request the always-on subsystem (AOSS) to coordinate with the IPA microcontroller to retain IPA register values at power collapse. This is done by issuing a QMP request to the AOSS microcontroller. A similar request ondoes that request. We must get and hold the "QMP" handle early, because we might get back EPROBE_DEFER for that. But the actual request should be sent while we know the IPA clock is active, and when we know the microcontroller is operational. Fixes: 1aac309d3207 ("net: ipa: use autosuspend") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03drm: mxsfb: Fix NULL pointer dereferenceAlexander Stein
mxsfb should not ever dereference the NULL pointer which drm_atomic_get_new_bridge_state is allowed to return. Assume a fixed format instead. Fixes: b776b0f00f24 ("drm: mxsfb: Use bus_format from the nearest bridge if present") Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Marek Vasut <marex@denx.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220202081755.145716-3-alexander.stein@ew.tq-group.com
2022-02-03nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts()Uday Shankar
Controller deletion/reset, immediately followed by or concurrent with a reconnect, is hard failing the connect attempt resulting in a complete loss of connectivity to the controller. In the connect request, fabrics looks for an existing controller with the same address components and aborts the connect if a controller already exists and the duplicate connect option isn't set. The match routine filters out controllers that are dead or dying, so they don't interfere with the new connect request. When NVME_CTRL_DELETING_NOIO was added, it missed updating the state filters in the nvmf_ctlr_matches_baseopts() routine. Thus, when in this new state, it's seen as a live controller and fails the connect request. Correct by adding the DELETING_NIO state to the match checks. Fixes: ecca390e8056 ("nvme: fix deadlock in disconnect during scan_work and/or ana_work") Cc: <stable@vger.kernel.org> # v5.7+ Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-02-02drm/amdgpu: fix logic inversion in checkChristian König
We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabledMario Limonciello
dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: e25443d2765f4 ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina ↵Aun-Ali Zaidi
panels The eDP link rate reported by the DP_MAX_LINK_RATE dpcd register (0xa) is contradictory to the highest rate supported reported by EDID (0xc = LINK_RATE_RBR2). The effects of this compounded with commit '4a8ca46bae8a ("drm/amd/display: Default max bpc to 16 for eDP")' results in no display modes being found and a dark panel. For now, simply force the maximum supported link rate for the eDP attached 2018 15" Apple Retina panels. Additionally, we must also check the firmware revision since the device ID reported by the DPCD is identical to that of the more capable 16,1, incorrectly quirking it. We also use said firmware check to quirk the refreshed 15,1 models with Vega graphics as they use a slightly newer firmware version. Tested-by: Aun-Ali Zaidi <admin@kodeit.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Aun-Ali Zaidi <admin@kodeit.net> Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-02-02drm/amd/display: revert "Reset fifo after enable otg"Zhan Liu
[Why] This change causes regression, that prevents some systems from lighting up internal displays. [How] Revert this patch until a new solution is ready. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Zhan Liu <Zhan.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-02-02drm/amd/display: watermark latencies is not enough on DCN31Paul Hsieh
[Why] The original latencies were causing underflow in some modes. Resolution: 2880x1620@60p when HDR enable [How] 1. Replace with the up-to-date watermark values based on new measurments 2. Correct the ddr_wm_table name to DDR5 on DCN31 Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Paul Hsieh <paul.hsieh@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-02-02drm/amd/display: Update watermark values for DCN301Agustin Gutierrez
[Why] There is underflow / visual corruption DCN301, for high bandwidth MST DSC configurations such as 2x1440p144 or 2x4k60. [How] Use up-to-date watermark values for DCN301. Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-02-02drm/amdgpu: fix a potential GPU hang on cyan skillfishLang Yu
We observed a GPU hang when querying GMC CG state(i.e., cat amdgpu_pm_info) on cyan skillfish. Acctually, cyan skillfish doesn't support any CG features. Just prevent it from accessing GMC CG registers. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-02-02drm/amd: Only run s3 or s0ix if system is configured properlyMario Limonciello
This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02drm/amd: add support to check whether the system is set to s3Mario Limonciello
This will be used to help make decisions on what to do in misconfigured systems. v2: squash in semicolon fix from Stephen Rothwell Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02md: fix NULL pointer deref with nowait but no mddev->queueSong Liu
Leon reported NULL pointer deref with nowait support: [ 15.123761] device-mapper: raid: Loading target version 1.15.1 [ 15.124185] device-mapper: raid: Ignoring chunk size parameter for RAID 1 [ 15.124192] device-mapper: raid: Choosing default region size of 4MiB [ 15.129524] BUG: kernel NULL pointer dereference, address: 0000000000000060 [ 15.129530] #PF: supervisor write access in kernel mode [ 15.129533] #PF: error_code(0x0002) - not-present page [ 15.129535] PGD 0 P4D 0 [ 15.129538] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 15.129541] CPU: 5 PID: 494 Comm: ldmtool Not tainted 5.17.0-rc2-1-mainline #1 9fe89d43dfcb215d2731e6f8851740520778615e [ 15.129546] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F36e 10/14/2021 [ 15.129549] RIP: 0010:blk_queue_flag_set+0x7/0x20 [ 15.129555] Code: 00 00 00 0f 1f 44 00 00 48 8b 35 e4 e0 04 02 48 8d 57 28 bf 40 01 \ 00 00 e9 16 c1 be ff 66 0f 1f 44 00 00 0f 1f 44 00 00 89 ff <f0> 48 0f ab 7e 60 \ 31 f6 89 f7 c3 66 66 2e 0f 1f 84 00 00 00 00 00 [ 15.129559] RSP: 0018:ffff966b81987a88 EFLAGS: 00010202 [ 15.129562] RAX: ffff8b11c363a0d0 RBX: ffff8b11e294b070 RCX: 0000000000000000 [ 15.129564] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001d [ 15.129566] RBP: ffff8b11e294b058 R08: 0000000000000000 R09: 0000000000000000 [ 15.129568] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11e294b070 [ 15.129570] R13: 0000000000000000 R14: ffff8b11e294b000 R15: 0000000000000001 [ 15.129572] FS: 00007fa96e826780(0000) GS:ffff8b18deb40000(0000) knlGS:0000000000000000 [ 15.129575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.129577] CR2: 0000000000000060 CR3: 000000010b8ce000 CR4: 00000000003506e0 [ 15.129580] Call Trace: [ 15.129582] <TASK> [ 15.129584] md_run+0x67c/0xc70 [md_mod 1e470c1b6bcf1114198109f42682f5a2740e9531] [ 15.129597] raid_ctr+0x134a/0x28ea [dm_raid 6a645dd7519e72834bd7e98c23497eeade14cd63] [ 15.129604] ? dm_split_args+0x63/0x150 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129615] dm_table_add_target+0x188/0x380 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129625] table_load+0x13b/0x370 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129635] ? dev_suspend+0x2d0/0x2d0 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129644] ctl_ioctl+0x1bd/0x460 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129655] dm_ctl_ioctl+0xa/0x20 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e] [ 15.129663] __x64_sys_ioctl+0x8e/0xd0 [ 15.129667] do_syscall_64+0x5c/0x90 [ 15.129672] ? syscall_exit_to_user_mode+0x23/0x50 [ 15.129675] ? do_syscall_64+0x69/0x90 [ 15.129677] ? do_syscall_64+0x69/0x90 [ 15.129679] ? syscall_exit_to_user_mode+0x23/0x50 [ 15.129682] ? do_syscall_64+0x69/0x90 [ 15.129684] ? do_syscall_64+0x69/0x90 [ 15.129686] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 15.129689] RIP: 0033:0x7fa96ecd559b [ 15.129692] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c \ c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff \ ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48 [ 15.129696] RSP: 002b:00007ffcaf85c258 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [ 15.129699] RAX: ffffffffffffffda RBX: 00007fa96f1b48f0 RCX: 00007fa96ecd559b [ 15.129701] RDX: 00007fa97017e610 RSI: 00000000c138fd09 RDI: 0000000000000003 [ 15.129702] RBP: 00007fa96ebab583 R08: 00007fa97017c9e0 R09: 00007ffcaf85bf27 [ 15.129704] R10: 0000000000000001 R11: 0000000000000206 R12: 00007fa97017e610 [ 15.129706] R13: 00007fa97017e640 R14: 00007fa97017e6c0 R15: 00007fa97017e530 [ 15.129709] </TASK> This is caused by missing mddev->queue check for setting QUEUE_FLAG_NOWAIT Fix this by moving the QUEUE_FLAG_NOWAIT logic to under mddev->queue check. Fixes: f51d46d0e7cb ("md: add support for REQ_NOWAIT") Reported-by: Leon Möller <jkhsjdhjs@totally.rip> Tested-by: Leon Möller <jkhsjdhjs@totally.rip> Cc: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org>
2022-02-02Merge tag 'pinctrl-v5.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Most interesting and urgent is the Intel stuff affecting Chromebooks and laptops. - Fix up group name building on the Intel Thunderbay - Fix interrupt problems on the Intel Cherryview - Fix some pin data on the Sunxi H616 - Fix up the CONFIG_PINCTRL_ST Kconfig sort order as noted during the merge window - Fix an unexpected interrupt problem on the Intel Sunrisepoint - Fix a glitch when updating IRQ flags on all Intel pin controllers - Revert a Zynqmp patch to unify the pin naming, let's find some better solution - Fix some error paths in the Broadcom BCM2835 driver - Fix a Kconfig problem pertaining to the BCM63XX drivers - Fix the regmap support in the Microchip SGPIO driver" * tag 'pinctrl-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: microchip-sgpio: Fix support for regmap pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP pinctrl: bcm2835: Fix a few error paths pinctrl: zynqmp: Revert "Unify pin naming" pinctrl: intel: Fix a glitch when updating IRQ flags on a preconfigured line pinctrl: intel: fix unexpected interrupt pinctrl: Place correctly CONFIG_PINCTRL_ST in the Makefile pinctrl: sunxi: Fix H616 I2S3 pin data pinctrl: cherryview: Trigger hwirq0 for interrupt-lines without a mapping pinctrl: thunderbay: rework loops looking for groups names pinctrl: thunderbay: comment process of building functions a bit
2022-02-02net: sparx5: do not refer to skb after passing it onSteen Hegelund
Do not try to use any SKB fields after the packet has been passed up in the receive stack. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20220202083039.3774851-1-steen.hegelund@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02Merge tag 'mlx5-fixes-2022-02-01' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-02-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. Sorry about the long series, but I had to move the top two patches from net-next to net to help avoiding a build break when kspp branch is merged into linus-next on next merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02fbcon: Add option to enable legacy hardware accelerationHelge Deller
Add a config option CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION to enable bitblt and fillrect hardware acceleration in the framebuffer console. If disabled, such acceleration will not be used, even if it is supported by the graphics hardware driver. If you plan to use DRM as your main graphics output system, you should disable this option since it will prevent compiling in code which isn't used later on when DRM takes over. For all other configurations, e.g. if none of your graphic cards support DRM (yet), DRM isn't available for your architecture, or you can't be sure that the graphic card in the target system will support DRM, you most likely want to enable this option. In the non-accelerated case (e.g. when DRM is used), the inlined fb_scrollmode() function is hardcoded to return SCROLL_REDRAW and as such the compiler is able to optimize much unneccesary code away. In this v3 patch version I additionally changed the GETVYRES() and GETVXRES() macros to take a pointer to the fbcon_display struct. This fixes the build when console rotation is enabled and helps the compiler again to optimize out code. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220202135531.92183-4-deller@gmx.de
2022-02-02Revert "fbcon: Disable accelerated scrolling"Helge Deller
This reverts commit 39aead8373b3c20bb5965c024dfb51a94e526151. Revert the first (of 2) commits which disabled scrolling acceleration in fbcon/fbdev. It introduced a regression for fbdev-supported graphic cards because of the performance penalty by doing screen scrolling by software instead of using the existing graphic card 2D hardware acceleration. Console scrolling acceleration was disabled by dropping code which checked at runtime the driver hardware capabilities for the BINFO_HWACCEL_COPYAREA or FBINFO_HWACCEL_FILLRECT flags and if set, it enabled scrollmode SCROLL_MOVE which uses hardware acceleration to move screen contents. After dropping those checks scrollmode was hard-wired to SCROLL_REDRAW instead, which forces all graphic cards to redraw every character at the new screen position when scrolling. This change effectively disabled all hardware-based scrolling acceleration for ALL drivers, because now all kind of 2D hardware acceleration (bitblt, fillrect) in the drivers isn't used any longer. The original commit message mentions that only 3 DRM drivers (nouveau, omapdrm and gma500) used hardware acceleration in the past and thus code for checking and using scrolling acceleration is obsolete. This statement is NOT TRUE, because beside the DRM drivers there are around 35 other fbdev drivers which depend on fbdev/fbcon and still provide hardware acceleration for fbdev/fbcon. The original commit message also states that syzbot found lots of bugs in fbcon and thus it's "often the solution to just delete code and remove features". This is true, and the bugs - which actually affected all users of fbcon, including DRM - were fixed, or code was dropped like e.g. the support for software scrollback in vgacon (commit 973c096f6a85). So to further analyze which bugs were found by syzbot, I've looked through all patches in drivers/video which were tagged with syzbot or syzkaller back to year 2005. The vast majority fixed the reported issues on a higher level, e.g. when screen is to be resized, or when font size is to be changed. The few ones which touched driver code fixed a real driver bug, e.g. by adding a check. But NONE of those patches touched code of either the SCROLL_MOVE or the SCROLL_REDRAW case. That means, there was no real reason why SCROLL_MOVE had to be ripped-out and just SCROLL_REDRAW had to be used instead. The only reason I can imagine so far was that SCROLL_MOVE wasn't used by DRM and as such it was assumed that it could go away. That argument completely missed the fact that SCROLL_MOVE is still heavily used by fbdev (non-DRM) drivers. Some people mention that using memcpy() instead of the hardware acceleration is pretty much the same speed. But that's not true, at least not for older graphic cards and machines where we see speed decreases by factor 10 and more and thus this change leads to console responsiveness way worse than before. That's why the original commit is to be reverted. By reverting we reintroduce hardware-based scrolling acceleration and fix the performance regression for fbdev drivers. There isn't any impact on DRM when reverting those patches. Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220202135531.92183-3-deller@gmx.de
2022-02-02Revert "fbdev: Garbage collect fbdev scrolling acceleration, part 1 (from ↵Helge Deller
TODO list)" This reverts commit b3ec8cdf457e5e63d396fe1346cc788cf7c1b578. Revert the second (of 2) commits which disabled scrolling acceleration in fbcon/fbdev. It introduced a regression for fbdev-supported graphic cards because of the performance penalty by doing screen scrolling by software instead of using the existing graphic card 2D hardware acceleration. Console scrolling acceleration was disabled by dropping code which checked at runtime the driver hardware capabilities for the BINFO_HWACCEL_COPYAREA or FBINFO_HWACCEL_FILLRECT flags and if set, it enabled scrollmode SCROLL_MOVE which uses hardware acceleration to move screen contents. After dropping those checks scrollmode was hard-wired to SCROLL_REDRAW instead, which forces all graphic cards to redraw every character at the new screen position when scrolling. This change effectively disabled all hardware-based scrolling acceleration for ALL drivers, because now all kind of 2D hardware acceleration (bitblt, fillrect) in the drivers isn't used any longer. The original commit message mentions that only 3 DRM drivers (nouveau, omapdrm and gma500) used hardware acceleration in the past and thus code for checking and using scrolling acceleration is obsolete. This statement is NOT TRUE, because beside the DRM drivers there are around 35 other fbdev drivers which depend on fbdev/fbcon and still provide hardware acceleration for fbdev/fbcon. The original commit message also states that syzbot found lots of bugs in fbcon and thus it's "often the solution to just delete code and remove features". This is true, and the bugs - which actually affected all users of fbcon, including DRM - were fixed, or code was dropped like e.g. the support for software scrollback in vgacon (commit 973c096f6a85). So to further analyze which bugs were found by syzbot, I've looked through all patches in drivers/video which were tagged with syzbot or syzkaller back to year 2005. The vast majority fixed the reported issues on a higher level, e.g. when screen is to be resized, or when font size is to be changed. The few ones which touched driver code fixed a real driver bug, e.g. by adding a check. But NONE of those patches touched code of either the SCROLL_MOVE or the SCROLL_REDRAW case. That means, there was no real reason why SCROLL_MOVE had to be ripped-out and just SCROLL_REDRAW had to be used instead. The only reason I can imagine so far was that SCROLL_MOVE wasn't used by DRM and as such it was assumed that it could go away. That argument completely missed the fact that SCROLL_MOVE is still heavily used by fbdev (non-DRM) drivers. Some people mention that using memcpy() instead of the hardware acceleration is pretty much the same speed. But that's not true, at least not for older graphic cards and machines where we see speed decreases by factor 10 and more and thus this change leads to console responsiveness way worse than before. That's why the original commit is to be reverted. By reverting we reintroduce hardware-based scrolling acceleration and fix the performance regression for fbdev drivers. There isn't any impact on DRM when reverting those patches. Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220202135531.92183-2-deller@gmx.de
2022-02-02nvme-rdma: fix possible use-after-free in transport error_recovery workSagi Grimberg
While nvme_rdma_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state. Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2022-02-02nvme-tcp: fix possible use-after-free in transport error_recovery workSagi Grimberg
While nvme_tcp_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state. Tested-by: Chris Leech <cleech@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2022-02-02nvme: fix a possible use-after-free in controller reset during loadSagi Grimberg
Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl readiness for AER submission. This may lead to a use-after-free condition that was observed with nvme-tcp. The race condition may happen in the following scenario: 1. driver executes its reset_ctrl_work 2. -> nvme_stop_ctrl - flushes ctrl async_event_work 3. ctrl sends AEN which is received by the host, which in turn schedules AEN handling 4. teardown admin queue (which releases the queue socket) 5. AEN processed, submits another AER, calling the driver to submit 6. driver attempts to send the cmd ==> use-after-free In order to fix that, add ctrl state check to validate the ctrl is actually able to accept the AER submission. This addresses the above race in controller resets because the driver during teardown should: 1. change ctrl state to RESETTING 2. flush async_event_work (as well as other async work elements) So after 1,2, any other AER command will find the ctrl state to be RESETTING and bail out without submitting the AER. Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2022-02-01Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-01 This series contains updates to e1000e driver only. Sasha removes CSME handshake with TGL platform as this is not supported and is causing hardware unit hangs to be reported. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Handshake with CSME starts from ADL platforms e1000e: Separate ADP board type from TGP ==================== Link: https://lore.kernel.org/r/20220201173754.580305-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01net/mlx5e: Avoid field-overflowing memcpy()Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use flexible arrays instead of zero-element arrays (which look like they are always overflowing) and split the cross-field memcpy() into two halves that can be appropriately bounds-checked by the compiler. We were doing: #define ETH_HLEN 14 #define VLAN_HLEN 4 ... #define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN) ... struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi); ... struct mlx5_wqe_eth_seg *eseg = &wqe->eth; struct mlx5_wqe_data_seg *dseg = wqe->data; ... memcpy(eseg->inline_hdr.start, xdptxd->data, MLX5E_XDP_MIN_INLINE); target is wqe->eth.inline_hdr.start (which the compiler sees as being 2 bytes in size), but copying 18, intending to write across start (really vlan_tci, 2 bytes). The remaining 16 bytes get written into wqe->data[0], covering byte_count (4 bytes), lkey (4 bytes), and addr (8 bytes). struct mlx5e_tx_wqe { struct mlx5_wqe_ctrl_seg ctrl; /* 0 16 */ struct mlx5_wqe_eth_seg eth; /* 16 16 */ struct mlx5_wqe_data_seg data[]; /* 32 0 */ /* size: 32, cachelines: 1, members: 3 */ /* last cacheline: 32 bytes */ }; struct mlx5_wqe_eth_seg { u8 swp_outer_l4_offset; /* 0 1 */ u8 swp_outer_l3_offset; /* 1 1 */ u8 swp_inner_l4_offset; /* 2 1 */ u8 swp_inner_l3_offset; /* 3 1 */ u8 cs_flags; /* 4 1 */ u8 swp_flags; /* 5 1 */ __be16 mss; /* 6 2 */ __be32 flow_table_metadata; /* 8 4 */ union { struct { __be16 sz; /* 12 2 */ u8 start[2]; /* 14 2 */ } inline_hdr; /* 12 4 */ struct { __be16 type; /* 12 2 */ __be16 vlan_tci; /* 14 2 */ } insert; /* 12 4 */ __be32 trailer; /* 12 4 */ }; /* 12 4 */ /* size: 16, cachelines: 1, members: 9 */ /* last cacheline: 16 bytes */ }; struct mlx5_wqe_data_seg { __be32 byte_count; /* 0 4 */ __be32 lkey; /* 4 4 */ __be64 addr; /* 8 8 */ /* size: 16, cachelines: 1, members: 3 */ /* last cacheline: 16 bytes */ }; So, split the memcpy() so the compiler can reason about the buffer sizes. "pahole" shows no size nor member offset changes to struct mlx5e_tx_wqe nor struct mlx5e_umr_wqe. "objdump -d" shows no meaningful object code changes (i.e. only source line number induced differences and optimizations). Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01net/mlx5e: Use struct_group() for memcpy() regionKees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use struct_group() in struct vlan_ethhdr around members h_dest and h_source, so they can be referenced together. This will allow memcpy() and sizeof() to more easily reason about sizes, improve readability, and avoid future warnings about writing beyond the end of h_dest. "pahole" shows no size nor member offset changes to struct vlan_ethhdr. "objdump -d" shows no object code changes. Fixes: 34802a42b352 ("net/mlx5e: Do not modify the TX SKB") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>