summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2023-02-16cxl/trace: Standardize device information outputIra Weiny
The trace points were written to take a struct device input for the trace. In CXL multiple device objects are associated with each CXL hardware device. Using different device objects in the trace point can lead to confusion for users. The PCIe device is nice to have, but the user space tooling relies on the memory device naming. It is better to have those device names reported. Change all trace points to take struct cxl_memdev as a standard and report that name. Furthermore, standardize on the name 'memdev' in both /sys/kernel/tracing/trace and cxl-cli monitor output. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230208-cxl-event-names-v2-1-fca130c2c68b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-30cxl/pci: Fix irq oneshot expectationsDan Williams
The IRQ core expects that users of the default hardirq handler specify IRQF_ONESHOT to keep interrupts disabled until the threaded handler runs. That meets the CXL driver's expectations since it is an edge triggered MSI and this flag would have been passed by default using pci_request_irq() instead of devm_request_threaded_irq(). Fixes: a49aa8141b65 ("cxl/mem: Wire up event interrupts") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-30cxl/pci: Set the device timestampJonathan Cameron
CXL r3.0 section 8.2.9.4.2 "Set Timestamp" recommends that the host sets the timestamp after every Conventional or CXL Reset to ensure accurate timestamps. This should include on initial boot up. The time base that is being set is used by a device for the poison list overflow timestamp and all event timestamps. Note that the command is optional and if not supported and the device cannot return accurate timestamps it will fill the fields in with an appropriate marker (see the specification description of each timestamp). Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230130151327.32415-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-30cxl/mbox: Add missing parameter to docs.Jonathan Cameron
Kernel-doc should be complete, so add documentation for the status parameter. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230130153437.3153-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26cxl/mem: Trace Memory Module Event RecordIra Weiny
CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record. Determine if the event read is memory module record and if so trace the record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-5-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26cxl/mem: Trace DRAM Event RecordIra Weiny
CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record. Determine if the event read is a DRAM event record and if so trace the record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-4-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26cxl/mem: Trace General Media Event RecordIra Weiny
CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record. Determine if the event read is a general media record and if so trace the record as a General Media Event Record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-3-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26cxl/mem: Wire up event interruptsDavidlohr Bueso
Currently the only CXL features targeted for irq support require their message numbers to be within the first 16 entries. The device may however support less than 16 entries depending on the support it provides. Attempt to allocate these 16 irq vectors. If the device supports less then the PCI infrastructure will allocate that number. Upon successful allocation, users can plug in their respective isr at any point thereafter. CXL device events are signaled via interrupts. Each event log may have a different interrupt message number. These message numbers are reported in the Get Event Interrupt Policy mailbox command. Add interrupt support for event logs. Interrupts are allocated as shared interrupts. Therefore, all or some event logs can share the same message number. In addition all logs are queried on any interrupt in order of the most to least severe based on the status register. Finally place all event configuration logic into cxl_event_config(). Previously the logic was a simple 'read all' on start up. But interrupts must be configured prior to any reads to ensure no events are missed. A single event configuration function results in a cleaner over all implementation. Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-2-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26cxl/mem: Read, trace, and clear events on driver loadIra Weiny
CXL devices have multiple event logs which can be queried for CXL event records. Devices are required to support the storage of at least one event record in each event log type. Devices track event log overflow by incrementing a counter and tracking the time of the first and last overflow event seen. Software queries events via the Get Event Record mailbox command; CXL rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section 8.2.9.2.3 Clear Event Records mailbox command. If the result of negotiating CXL Error Reporting Control is OS control, read and clear all event logs on driver load. Ensure a clean slate of events by reading and clearing the events on driver load. The status register is not used because a device may continue to trigger events and the only requirement is to empty the log at least once. This allows for the required transition from empty to non-empty for interrupt generation. Handling of interrupts is in a follow on patch. The device can return up to 1MB worth of event records per query. Allocate a shared large buffer to handle the max number of records based on the mailbox payload size. This patch traces a raw event record and leaves specific event record type tracing to subsequent patches. Macros are created to aid in tracing the common CXL Event header fields. Each record is cleared explicitly. A clear all bit is specified but is only valid when the log overflows. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-1-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-25cxl/port: Link the 'parent_dport' in portX/ and endpointX/ sysfsDan Williams
Similar to the justification in: 1b58b4cac6fc ("cxl/port: Record parent dport when adding ports") ...userspace wants to know the routing information for ports for calculating the memdev order for region creation among other things. Cache the information the kernel discovers at enumeration time in a 'parent_dport' attribute to save userspace the time of trawling sysfs to recover the same information. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167124082375.1626103.6047000000121298560.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-25cxl/region: Clarify when a cxld->commit() callback is mandatoryDan Williams
Both cxl_switch_decoders() and cxl_endpoint_decoders() are considered by cxl_region_decode_commit(). Flag cases where cxl_switch_decoders with multiple targets, or cxl_endpoint_decoders do not have a commit callback set. The switch case is unlikely to happen since switches are only enumerated by the CXL core, but the endpoint case may support decoders defined by drivers outside of drivers/cxl, like accerator drivers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167124081824.1626103.1555704405392757219.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-24cxl/pci: Show opcode in debug messages when sending a commandRobert Richter
For debugging it is very helpful to see which commands are sent. Add it to the debug message. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20230103210151.1126873-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-05cxl/region: Only warn about cpu_cache_invalidate_memregion() onceDavidlohr Bueso
No need for more than once per module load. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20221215183836.24136-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-05PCI/CXL: Export native CXL error reporting controlIra Weiny
CXL _OSC Error Reporting Control is used by the OS to determine if Firmware has control of various CXL error reporting capabilities including the event logs. Expose the result of negotiating CXL Error Reporting Control in struct pci_host_bridge for consumption by the CXL drivers. Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: linux-pci@vger.kernel.org Cc: linux-acpi@vger.kernel.org Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-04cxl/pci: Move tracepoint definitions to drivers/cxl/core/Dan Williams
CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This also organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders, however that would also entail a move away from the expander-specific cxl_dev_state context, save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system, however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-01Merge tag 'drm-fixes-2023-01-01' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Daniel Vetter: "I'm just back from the mountains, and Dave is out at the beach and should be back in a week again. Just i915 fixes and since Rodrigo bothered to make the pull last week I figured I should warm up gpg and forward this in a nice signed tag as a new years present! - i915 fixes for newer platforms - i915 locking rework to not give up in vm eviction fallback path too early" * tag 'drm-fixes-2023-01-01' of git://anongit.freedesktop.org/drm/drm: drm/i915/dsi: fix MIPI_BKLT_EN_1 native GPIO index drm/i915/dsi: add support for ICL+ native MIPI GPIO sequence drm/i915/uc: Fix two issues with over-size firmware files drm/i915: improve the catch-all evict to handle lock contention drm/i915: Remove __maybe_unused from mtl_info drm/i915: fix TLB invalidation for Gen12.50 video and compute engines
2022-12-31Merge tag 'ata-6.2-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fix from Damien Le Moal: "A single fix to address an issue with wake from suspend with PCS adapters, from Adam" * tag 'ata-6.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: ahci: Fix PCS quirk application for suspend
2022-12-30Merge tag 'acpi-6.2-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These are new ACPI IRQ override quirks, low-power S0 idle (S0ix) support adjustments and ACPI backlight handling fixes, mostly for platforms using AMD chips. Specifics: - Add ACPI IRQ override quirks for Asus ExpertBook B2502, Lenovo 14ALC7, and XMG Core 15 (Hans de Goede, Adrian Freund, Erik Schumacher). - Adjust ACPI video detection fallback path to prevent non-operational ACPI backlight devices from being created on systems where the native driver does not detect a suitable panel (Mario Limonciello). - Fix Apple GMUX backlight detection (Hans de Goede). - Add a low-power S0 idle (S0ix) handling quirk for HP Elitebook 865 and stop using AMD-specific low-power S0 idle code path for systems with Rembrandt chips and newer (Mario Limonciello)" * tag 'acpi-6.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+ ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865 ACPI: video: Fix Apple GMUX backlight detection ACPI: resource: Add Asus ExpertBook B2502 to Asus quirks ACPI: resource: do IRQ override on Lenovo 14ALC7 ACPI: resource: do IRQ override on XMG Core 15 ACPI: video: Don't enable fallback path for creating ACPI backlight by default drm/amd/display: Report to ACPI video if no panels were found ACPI: video: Allow GPU drivers to report no panels
2022-12-30Merge branches 'acpi-resource' and 'acpi-video'Rafael J. Wysocki
Merge ACPI resource handling quirks and ACPI backlight handling fixes for 6.2-rc2: - Add ACPI IRQ override quirks for Asus ExpertBook B2502, Lenovo 14ALC7, and XMG Core 15 (Hans de Goede, Adrian Freund, Erik Schumacher). - Adjust ACPI video detection fallback path to prevent non-operational ACPI backlight devices from being created on systems where the native driver does not detect a suitable panel (Mario Limonciello). - Fix Apple GMUX backlight detection (Hans de Goede). * acpi-resource: ACPI: resource: Add Asus ExpertBook B2502 to Asus quirks ACPI: resource: do IRQ override on Lenovo 14ALC7 ACPI: resource: do IRQ override on XMG Core 15 * acpi-video: ACPI: video: Fix Apple GMUX backlight detection ACPI: video: Don't enable fallback path for creating ACPI backlight by default drm/amd/display: Report to ACPI video if no panels were found ACPI: video: Allow GPU drivers to report no panels
2022-12-30drm/i915/dsi: fix MIPI_BKLT_EN_1 native GPIO indexJani Nikula
Due to copy-paste fail, MIPI_BKLT_EN_1 would always use PPS index 1, never 0. Fix the sloppiest commit in recent memory. Fixes: 963bbdb32b47 ("drm/i915/dsi: add support for ICL+ native MIPI GPIO sequence") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221220140105.313333-1-jani.nikula@intel.com (cherry picked from commit a561933c571798868b5fa42198427a7e6df56c09) Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-30drm/i915/dsi: add support for ICL+ native MIPI GPIO sequenceJani Nikula
Starting from ICL, the default for MIPI GPIO sequences seems to be using native GPIOs i.e. GPIOs available in the GPU. These native GPIOs reuse many pins that quite frankly seem scary to poke based on the VBT sequences. We pretty much have to trust that the board is configured such that the relevant HPD, PP_CONTROL and GPIO bits aren't used for anything else. MIPI sequence v4 also adds a flag to fall back to non-native sequences. v5: - Wrap SHOTPLUG_CTL_DDI modification in spin_lock() in icp_irq_handler() too (Ville) - References instead of Closes issue 6131 because this does not fix everything v4: - Wrap SHOTPLUG_CTL_DDI modification in spin_lock_irq() (Ville) v3: - Fix -Wbitwise-conditional-parentheses (kernel test robot <lkp@intel.com>) v2: - Fix HPD pin output set (impacts GPIOs 0 and 5) - Fix GPIO data output direction set (impacts GPIOs 4 and 9) - Reduce register accesses to single intel_de_rwm() References: https://gitlab.freedesktop.org/drm/intel/-/issues/6131 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219105955.4014451-1-jani.nikula@intel.com (cherry picked from commit f087cfe6fcff58044f7aa3b284965af47f472fb0) Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-30drm/i915/uc: Fix two issues with over-size firmware filesJohn Harrison
In the case where a firmware file is too large (e.g. someone downloaded a web page ASCII dump from github...), the firmware object is released but the pointer is not zerod. If no other firmware file was found then release would be called again leading to a double kfree. Also, the size check was only being applied to the initial firmware load not any of the subsequent attempts. So move the check into a wrapper that is used for all loads. Fixes: 016241168dc5 ("drm/i915/uc: use different ggtt pin offsets for uc loads") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221221193031.687266-4-John.C.Harrison@Intel.com (cherry picked from commit 4071d98b296a5bc5fd4b15ec651bd05800ec9510) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-30drm/i915: improve the catch-all evict to handle lock contentionMatthew Auld
The catch-all evict can fail due to object lock contention, since it only goes as far as trylocking the object, due to us already holding the vm->mutex. Doing a full object lock here can deadlock, since the vm->mutex is always our inner lock. Add another execbuf pass which drops the vm->mutex and then tries to grab the object will the full lock, before then retrying the eviction. This should be good enough for now to fix the immediate regression with userspace seeing -ENOSPC from execbuf due to contended object locks during GTT eviction. v2 (Mani) - Also revamp the docs for the different passes. Testcase: igt@gem_ppgtt@shrink-vs-evict-* Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.") References: https://gitlab.freedesktop.org/drm/intel/-/issues/7627 References: https://gitlab.freedesktop.org/drm/intel/-/issues/7570 References: https://bugzilla.mozilla.org/show_bug.cgi?id=1779558 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Mani Milani <mani@chromium.org> Cc: <stable@vger.kernel.org> # v5.18+ Reviewed-by: Mani Milani <mani@chromium.org> Tested-by: Mani Milani <mani@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221216113456.414183-1-matthew.auld@intel.com (cherry picked from commit 801fa7a81f6da533cc5442fc40e32c72b76cd42a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-30drm/i915: Remove __maybe_unused from mtl_infoLucas De Marchi
The attribute __maybe_unused should remain only until the respective info is not in the pciidlist. The info can't be added together with its definition because that would cause the driver to automatically probe for the device, while it's still not ready for that. However once pciidlist contains it, the attribute can be removed. Fixes: 7835303982d1 ("drm/i915/mtl: Add MeteorLake PCI IDs") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221214194944.3670344-1-lucas.demarchi@intel.com (cherry picked from commit 50490ce05b7a50b0bd4108fa7d6db3ca2972fa83) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-30drm/i915: fix TLB invalidation for Gen12.50 video and compute enginesAndrzej Hajda
In case of Gen12.50 video and compute engines, TLB_INV registers are masked - to modify one bit, corresponding bit in upper half of the register must be enabled, otherwise nothing happens. Fixes: 77fa9efc16a9 ("drm/i915/xehp: Create separate reg definitions for new MCR registers") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221214075439.402485-1-andrzej.hajda@intel.com (cherry picked from commit 4d5cf7b1680a1e6db327e3c935ef58325cbedb2c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-29Merge tag 'block-6.2-2022-12-29' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Mostly just NVMe, but also a single fixup for BFQ for a regression that happened during the merge window. In detail: - NVMe pull requests via Christoph: - Fix doorbell buffer value endianness (Klaus Jensen) - Fix Linux vs NVMe page size mismatch (Keith Busch) - Fix a potential use memory access beyong the allocation limit (Keith Busch) - Fix a multipath vs blktrace NULL pointer dereference (Yanjun Zhang) - Fix various problems in handling the Command Supported and Effects log (Christoph Hellwig) - Don't allow unprivileged passthrough of commands that don't transfer data but modify logical block content (Christoph Hellwig) - Add a features and quirks policy document (Christoph Hellwig) - Fix some really nasty code that was correct but made smatch complain (Sagi Grimberg) - Use-after-free regression in BFQ from this merge window (Yu)" * tag 'block-6.2-2022-12-29' of git://git.kernel.dk/linux: nvme-auth: fix smatch warning complaints nvme: consult the CSE log page for unprivileged passthrough nvme: also return I/O command effects from nvme_command_effects nvmet: don't defer passthrough commands with trivial effects to the workqueue nvmet: set the LBCC bit for commands that modify data nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition docs, nvme: add a feature and quirk policy document nvme-pci: update sqsize when adjusting the queue depth nvme: fix setting the queue depth in nvme_alloc_io_tag_set block, bfq: fix uaf for bfqq in bfq_exit_icq_bfqq nvme: fix multipath crash caused by flush request when blktrace is enabled nvme-pci: fix page size checks nvme-pci: fix mempool alloc size nvme-pci: fix doorbell buffer value endianness
2022-12-28nvme-auth: fix smatch warning complaintsSagi Grimberg
When initializing auth context, there may be no secrets passed by the user. Make return code explicit when returning successfully. smatch warnings: drivers/nvme/host/auth.c:950 nvme_auth_init_ctrl() warn: missing error code? 'ret' Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-12-28nvme: consult the CSE log page for unprivileged passthroughChristoph Hellwig
Commands like Write Zeros can change the contents of a namespaces without actually transferring data. To protect against this, check the Commands Supported and Effects log is supported by the controller for any unprivileg command passthrough and refuse unprivileged passthrough if the command has any effects that can change data or metadata. Note: While the Commands Support and Effects log page has only been mandatory since NVMe 2.0, it is widely supported because Windows requires it for any command passthrough from userspace. Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28nvme: also return I/O command effects from nvme_command_effectsChristoph Hellwig
To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28nvmet: don't defer passthrough commands with trivial effects to the workqueueChristoph Hellwig
Mask out the "Command Supported" and "Logical Block Content Change" bits and only defer execution of commands that have non-trivial effects to the workqueue for synchronous execution. This allows to execute admin commands asynchronously on controllers that provide a Command Supported and Effects log page, and will keep allowing to execute Write commands asynchronously once command effects on I/O commands are taken into account. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28nvmet: set the LBCC bit for commands that modify dataChristoph Hellwig
Write, Write Zeroes, Zone append and a Zone Reset through Zone Management Send modify the logical block content of a namespace, so make sure the LBCC bit is reported for them. Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-28nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding itChristoph Hellwig
Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-27ata: ahci: Fix PCS quirk application for suspendAdam Vodopjan
Since kernel 5.3.4 my laptop (ICH8M controller) does not see Kingston SV300S37A60G SSD disk connected into a SATA connector on wake from suspend. The problem was introduced in c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond"): the quirk is not applied on wake from suspend as it originally was. It is worth to mention the commit contained another bug: the quirk is not applied at all to controllers which require it. The fix commit 09d6ac8dc51a ("libata/ahci: Fix PCS quirk application") landed in 5.3.8. So testing my patch anywhere between commits c312ef176399 and 09d6ac8dc51a is pointless. Not all disks trigger the problem. For example nothing bad happens with Western Digital WD5000LPCX HDD. Test hardware: - Acer 5920G with ICH8M SATA controller - sda: some SATA HDD connnected into the DVD drive IDE port with a SATA-IDE caddy. It is a boot disk - sdb: Kingston SV300S37A60G SSD connected into the only SATA port Sample "dmesg --notime | grep -E '^(sd |ata)'" output on wake: sd 0:0:0:0: [sda] Starting disk sd 2:0:0:0: [sdb] Starting disk ata4: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out ata1.00: ACPI cmd ef/03:42:00:00:00:a0 (SET FEATURES) filtered out ata1: FORCE: cable set to 80c ata5: SATA link down (SStatus 0 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3.00: disabled sd 2:0:0:0: rejecting I/O to offline device ata3.00: detaching (SCSI 2:0:0:0) sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Synchronizing SCSI cache sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Stopping disk sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Commit c312ef176399 dropped ahci_pci_reset_controller() which internally calls ahci_reset_controller() and applies the PCS quirk if needed after that. It was called each time a reset was required instead of just ahci_reset_controller(). This patch puts the function back in place. Fixes: c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond") Signed-off-by: Adam Vodopjan <grozzly@protonmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-12-26nvme-pci: update sqsize when adjusting the queue depthChristoph Hellwig
Update the core sqsize field in addition to the PCIe-specific q_depth field as the core tagset allocation helpers rely on it. Fixes: 0da7feaa5913 ("nvme-pci: use the tagset alloc/free helpers") Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/20221225103234.226794-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-26nvme: fix setting the queue depth in nvme_alloc_io_tag_setChristoph Hellwig
While the CAP.MQES field in NVMe is a 0s based filed with a natural one off, we also need to account for the queue wrap condition and fix undo the one off again in nvme_alloc_io_tag_set. This was never properly done by the fabrics drivers, but they don't seem to care because there is no actual physical queue that can wrap around, but it became a problem when converting over the PCIe driver. Also add back the BLK_MQ_MAX_DEPTH check that was lost in the same commit. Fixes: 0da7feaa5913 ("nvme-pci: use the tagset alloc/free helpers") Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/20221225103234.226794-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-25treewide: Convert del_timer*() to timer_shutdown*()Steven Rostedt (Google)
Due to several bugs caused by timers being re-armed after they are shutdown and just before they are freed, a new state of timers was added called "shutdown". After a timer is set to this state, then it can no longer be re-armed. The following script was run to find all the trivial locations where del_timer() or del_timer_sync() is called in the same function that the object holding the timer is freed. It also ignores any locations where the timer->function is modified between the del_timer*() and the free(), as that is not considered a "trivial" case. This was created by using a coccinelle script and the following commands: $ cat timer.cocci @@ expression ptr, slab; identifier timer, rfield; @@ ( - del_timer(&ptr->timer); + timer_shutdown(&ptr->timer); | - del_timer_sync(&ptr->timer); + timer_shutdown_sync(&ptr->timer); ) ... when strict when != ptr->timer ( kfree_rcu(ptr, rfield); | kmem_cache_free(slab, ptr); | kfree(ptr); ) $ spatch timer.cocci . > /tmp/t.patch $ patch -p1 < /tmp/t.patch Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ] Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ] Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-23Merge tag 'spi-fix-v6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "One driver specific change here which handles the case where a SPI device for some reason tries to change the bus speed during a message on fsl_spi hardware, this should be very unusual" * tag 'spi-fix-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: fsl_spi: Don't change speed while chipselect is active
2022-12-23Merge tag 'regulator-fix-v6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two core fixes here, one for a long standing race which some Qualcomm systems have started triggering with their UFS driver and another fixing a problem with supply lookup introduced by the fixes for devm related use after free issues that were introduced in this merge window" * tag 'regulator-fix-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: fix deadlock on regulator enable regulator: core: Fix resolve supply lookup issue
2022-12-23Merge tag 'hardening-v6.2-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening fixes from Kees Cook: - Fix CFI failure with KASAN (Sami Tolvanen) - Fix LKDTM + CFI under GCC 7 and 8 (Kristina Martsenko) - Limit CONFIG_ZERO_CALL_USED_REGS to Clang > 15.0.6 (Nathan Chancellor) - Ignore "contents" argument in LoadPin's LSM hook handling - Fix paste-o in /sys/kernel/warn_count API docs - Use READ_ONCE() consistently for oops/warn limit reading * tag 'hardening-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: cfi: Fix CFI failure with KASAN exit: Use READ_ONCE() for all oops/warn limit reads security: Restrict CONFIG_ZERO_CALL_USED_REGS to gcc or clang > 15.0.6 lkdtm: cfi: Make PAC test work with GCC 7 and 8 docs: Fix path paste-o for /sys/kernel/warn_count LoadPin: Ignore the "contents" argument of the LSM hooks
2022-12-23Merge tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Holiday fixes! Two batches from amd, and one group of i915 changes. amdgpu: - Spelling fix - BO pin fix - Properly handle polaris 10/11 overlap asics - GMC9 fix - SR-IOV suspend fix - DCN 3.1.4 fix - KFD userptr locking fix - SMU13.x fixes - GDS/GWS/OA handling fix - Reserved VMID handling fixes - FRU EEPROM fix - BO validation fixes - Avoid large variable on the stack - S0ix fixes - SMU 13.x fixes - VCN fix - Add missing fence reference amdkfd: - Fix init vm error handling - Fix double release of compute pasid i915 - Documentation fixes - OA-perf related fix - VLV/CHV HDMI/DP audio fix - Display DDI/Transcoder fix - Migrate fixes" * tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm: (39 commits) drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency drm/amdgpu: enable VCN DPG for GC IP v11.0.4 drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0 drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34 drm/amdgpu: skip MES for S0ix as well since it's part of GFX drm/amd/pm: avoid large variable on kernel stack drm/amdkfd: Fix double release compute pasid drm/amdkfd: Fix kfd_process_device_init_vm error handling drm/amd/pm: update SMU13.0.0 reported maximum shader clock drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings drm/amd/pm: enable GPO dynamic control support for SMU13.0.7 drm/amd/pm: enable GPO dynamic control support for SMU13.0.0 drm/amdgpu: revert "generally allow over-commit during BO allocation" drm/amdgpu: Remove unnecessary domain argument drm/amdgpu: Fix size validation for non-exclusive domains (v4) drm/amdgpu: Check if fru_addr is not NULL (v2) drm/i915/ttm: consider CCS for backup objects drm/i915/migrate: fix corner case in CCS aux copying drm/amdgpu: rework reserved VMID handling ...
2022-12-22Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull more SCSI updates from James Bottomley: "Mostly small bug fixes and small updates. The only things of note is a qla2xxx fix for crash on hotplug and timeout and the addition of a user exposed abstraction layer for persistent reservation error return handling (which necessitates the conversion of nvme.c as well as SCSI)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix crash when I/O abort times out nvme: Convert NVMe errors to PR errors scsi: sd: Convert SCSI errors to PR errors scsi: core: Rename status_byte to sg_status_byte block: Add error codes for common PR failures scsi: sd: sd_zbc: Trace zone append emulation scsi: libfc: Include the correct header
2022-12-22ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+Mario Limonciello
After we introduced a module parameter and quirk infrastructure for picking the Microsoft GUID over the SOC vendor GUID we discovered that lots and lots of systems are getting this wrong. The table continues to grow, and is becoming unwieldy. We don't really have any benefit to forcing vendors to populate the AMD GUID. This is just extra work, and more and more vendors seem to mess it up. As the Microsoft GUID is used by Windows as well, it's very likely that it won't be messed up like this. So drop all the quirks forcing it and the Rembrandt behavior. This means that Cezanne or later effectively only run the Microsoft GUID codepath with the exception of HP Elitebook 8*5 G9. Fixes: fd894f05cf30 ("ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt") Cc: stable@vger.kernel.org # 6.1 Reported-by: Benjamin Cheng <ben@bcheng.me> Reported-by: bilkow@tutanota.com Reported-by: Paul <paul@zogpog.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2292 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216768 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com> Tested-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865Mario Limonciello
HP Elitebook 865 supports both the AMD GUID w/ _REV 2 and Microsoft GUID with _REV 0. Both have very similar code but the AMD GUID has a special workaround that is specific to a problem with spurious wakeups on systems with Qualcomm WLAN. This is believed to be a bug in the Qualcomm WLAN F/W (it doesn't affect any other WLAN H/W). If this WLAN firmware is fixed this quirk can be dropped. Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: video: Fix Apple GMUX backlight detectionHans de Goede
The apple-gmux driver only binds to old GMUX devices which have an IORESOURCE_IO resource (using inb()/outb()) rather then memory-mapped IO (IORESOURCE_MEM). T2 MacBooks use the new style GMUX devices (with IORESOURCE_MEM access), so these are not supported by the apple-gmux driver. This is not a problem since they have working ACPI video backlight support. But the apple_gmux_present() helper only checks if an ACPI device with the "APP000B" HID is present, causing acpi_video_get_backlight_type() to return acpi_backlight_apple_gmux disabling the acpi_video backlight device. Add a new apple_gmux_backlight_present() helper which checks that the "APP000B" device actually is an old GMUX device with an IORESOURCE_IO resource. This fixes the acpi_video0 backlight no longer registering on T2 MacBooks. Note people are working to add support for the new style GMUX to Linux: https://github.com/kekrby/linux-t2/commits/wip/hybrid-graphics Once this lands this patch should be reverted so that acpi_video_get_backlight_type() also prefers the gmux on new style GMUX MacBooks, but for now this is necessary to avoid regressing backlight control on T2 Macs. Fixes: 21245df307cb ("ACPI: video: Add Apple GMUX brightness control detection") Reported-and-tested-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: resource: Add Asus ExpertBook B2502 to Asus quirksHans de Goede
The Asus ExpertBook B2502 has the same keyboard issue as Asus Vivobook K3402ZA/K3502ZA. The kernel overrides IRQ 1 to Edge_High when it should be Active_Low. This patch adds the ExpertBook B2502 model to the existing quirk list of Asus laptops with this issue. Fixes: b5f9223a105d ("ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142574 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: resource: do IRQ override on Lenovo 14ALC7Adrian Freund
Commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad") added an override for Lenovo IdeaPad 5 16ALC7. The 14ALC7 variant also suffers from a broken touchscreen and trackpad. Fixes: 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216804 Signed-off-by: Adrian Freund <adrian@freund.io> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: resource: do IRQ override on XMG Core 15Erik Schumacher
The Schenker XMG CORE 15 (M22) is Ryzen-6 based and needs IRQ overriding for the keyboard to work. Adding an entry for this laptop to the override_table makes the internal keyboard functional again. Signed-off-by: Erik Schumacher <ofenfisch@googlemail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: video: Don't enable fallback path for creating ACPI backlight by defaultMario Limonciello
The ACPI video detection code has a module parameter `register_backlight_delay` which is currently configured to 8 seconds. This means that if after 8 seconds of booting no native driver has created a backlight device then the code will attempt to make an ACPI video backlight device. This was intended as a safety mechanism with the backlight overhaul that occurred in kernel 6.1, but as it doesn't appear necesssary set it to be disabled by default. Suggested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22drm/amd/display: Report to ACPI video if no panels were foundMario Limonciello
On desktop APUs amdgpu doesn't create a native backlight device as no eDP panels are found. However if the BIOS has reported backlight control methods in the ACPI tables then an acpi_video0 backlight device will be made 8 seconds after boot. This has manifested in a power slider on a number of desktop APUs ranging from Ryzen 5000 through Ryzen 7000 on various motherboard manufacturers. To avoid this, report to the acpi video detection that the system does not have any panel connected in the native driver. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783786 Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22ACPI: video: Allow GPU drivers to report no panelsMario Limonciello
The current logic for the ACPI backlight detection will create a backlight device if no native or vendor drivers have created 8 seconds after the system has booted if the ACPI tables included backlight control methods. If the GPU drivers have loaded, they may be able to report whether any LCD panels were found. Allow using this information to factor in whether to enable the fallback logic for making an acpi_video0 backlight device. Suggested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>