summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2022-02-25iavf: Rework mutexes for better synchronisationSlawomir Laba
The driver used to crash in multiple spots when put to stress testing of the init, reset and remove paths. The user would experience call traces or hangs when creating, resetting, removing VFs. Depending on the machines, the call traces are happening in random spots, like reset restoring resources racing with driver remove. Make adapter->crit_lock mutex a mandatory lock for guarding the operations performed on all workqueues and functions dealing with resource allocation and disposal. Make __IAVF_REMOVE a final state of the driver respected by workqueues that shall not requeue, when they fail to obtain the crit_lock. Make the IRQ handler not to queue the new work for adminq_task when the __IAVF_REMOVE state is set. Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25Merge tag 'usb-5.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB driver fixes for 5.17-rc6 to resolve reported problems and add new device ids. They include: - dwc3: - device mapping fix - new device ids - driver fixes - xhci driver fixes - gadget driver fixes - usb-serial driver device id updates All of these have been in linux-next with no reported problems" * tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: gadget: rndis: add spinlock for rndis response list usb: dwc3: gadget: Let the interrupt handler disable bottom halves. USB: gadget: validate endpoint index for xilinx udc USB: serial: option: add Telit LE910R1 compositions USB: serial: option: add support for DW5829e Revert "USB: serial: ch341: add new Product ID for CH341A" usb: dwc2: drd: fix soft connect when gadget is unconfigured usb: dwc3: pci: Fix Bay Trail phy GPIO mappings tps6598x: clear int mask on probe failure xhci: Prevent futile URB re-submissions due to incorrect return value. xhci: re-initialize the HC during resume if HCE was set usb: dwc3: pci: Add "snps,dis_u2_susphy_quirk" for Intel Bay Trail usb: dwc3: pci: add support for the Intel Raptor Lake-S
2022-02-25Merge tag 'ata-5.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: "Two fixes for the pata_hpt37x driver, both from Sergey: - Fix a PCI register access using an incorrect size (8bits instead of 16bits) - Make sure to always disable the primary channel as it is unused" * tag 'ata-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_hpt37x: disable primary channel on HPT371 ata: pata_hpt37x: fix PCI clock detection
2022-02-25net: stmmac: fix return value of __setup handlerRandy Dunlap
__setup() handlers should return 1 on success, i.e., the parameter has been handled. A return of 0 causes the "option=value" string to be added to init's environment strings, polluting it. Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.") Fixes: f3240e2811f0 ("stmmac: remove warning when compile as built-in (V2)") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru> Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Jose Abreu <joabreu@synopsys.com> Link: https://lore.kernel.org/r/20220224033536.25056-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25net: sxgbe: fix return value of __setup handlerRandy Dunlap
__setup() handlers should return 1 on success, i.e., the parameter has been handled. A return of 0 causes the "option=value" string to be added to init's environment strings, polluting it. Fixes: acc18c147b22 ("net: sxgbe: add EEE(Energy Efficient Ethernet) for Samsung sxgbe") Fixes: 1edb9ca69e8a ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru> Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Cc: Siva Reddy <siva.kallam@samsung.com> Cc: Girish K S <ks.giri@samsung.com> Cc: Byungho An <bh74.an@samsung.com> Link: https://lore.kernel.org/r/20220224033528.24640-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when ↵Lad Prabhakar
fully ready Register the CAN device only when all the necessary initialization is completed. This patch makes sure all the data structures and locks are initialized before registering the CAN device. Link: https://lore.kernel.org/all/20220221225935.12300-1-prabhakar.mahadev-lad.rj@bp.renesas.com Reported-by: Pavel Machek <pavel@denx.de> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Pavel Machek <pavel@denx.de> Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-25Merge tag 'soc-fsl-fix-v5.17' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/fixes NXP/FSL SoC driver fixes for v5.17 - Add missing SoC compatible in existing binding - Replace kernel.h with the necessary inclusions - MAINTAINERS file fixes - Fix memory allocation failure check in guts driver - Various cleanups and minor fixes * tag 'soc-fsl-fix-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux: soc: fsl: qe: Check of ioremap return value soc: fsl: qe: fix typo in a comment soc: fsl: guts: Add a missing memory allocation failure check soc: fsl: guts: Revert commit 3c0d64e867ed soc: fsl: Correct MAINTAINERS database (SOC) soc: fsl: Correct MAINTAINERS database (QUICC ENGINE LIBRARY) soc: fsl: Replace kernel.h with the necessary inclusions dt-bindings: fsl,layerscape-dcfg: add missing compatible for lx2160a dt-bindings: qoriq-clock: add missing compatible for lx2160a Link: https://lore.kernel.org/r/20220219012208.21835-1-leoyang.li@nxp.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25net: sparx5: Fix add vlan when invalid operationCasper Andersson
Check if operation is valid before changing any settings in hardware. Otherwise it results in changes being made despite it not being a valid operation. Fixes: 78eab33bb68b ("net: sparx5: add vlan support") Signed-off-by: Casper Andersson <casper.casan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: chelsio: cxgb3: check the return value of pci_find_capability()Jia-Ju Bai
The function pci_find_capability() in t3_prep_adapter() can fail, so its return value should be checked. Fixes: 4d22de3e6cc4 ("Add support for the latest 1G/10G Chelsio adapter, T3") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: Allow queueing resets during probeSukadev Bhattiprolu
We currently don't allow queuing resets when adapter is in VNIC_PROBING state - instead we throw away the reset and return EBUSY. The reasoning is probably that during ibmvnic_probe() the ibmvnic_adapter itself is being initialized so performing a reset during this time can lead us to accessing fields in the ibmvnic_adapter that are not fully initialized. A review of the code shows that all the adapter state neede to process a reset is initialized before registering the CRQ so that should no longer be a concern. Further the expectation is that if we do get a reset (transport event) during probe, the do..while() loop in ibmvnic_probe() will handle this by reinitializing the CRQ. While that is true to some extent, it is possible that the reset might occur _after_ the CRQ is registered and CRQ_INIT message was exchanged but _before_ the adapter state is set to VNIC_PROBED. As mentioned above, such a reset will be thrown away. While the client assumes that the adapter is functional, the vnic server will wait for the client to reinit the adapter. This disconnect between the two leaves the adapter down needing manual intervention. Because ibmvnic_probe() has other work to do after initializing the CRQ (such as registering the netdev at a minimum) and because the reset event can occur at any instant after the CRQ is initialized, there will always be a window between initializing the CRQ and considering the adapter ready for resets (ie state == PROBED). So rather than discarding resets during this window, allow queueing them - but only process them after the adapter is fully initialized. To do this, introduce a new completion state ->probe_done and have the reset worker thread wait on this before processing resets. This change brings up two new situations in or just after ibmvnic_probe(). First after one or more resets were queued, we encounter an error and decide to retry the initialization. At that point the queued resets are no longer relevant since we could be talking to a new vnic server. So we must purge/flush the queued resets before restarting the initialization. As a side note, since we are still in the probing stage and we have not registered the netdev, it will not be CHANGE_PARAM reset. Second this change opens up a potential race between the worker thread in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the following sequence of events: 1. Register CRQ 2. Get transport event before CRQ_INIT completes. 3. Tasklet schedules reset: a) add rwi to list b) schedule_work() to start worker thread which runs and waits for ->probe_done. 4. ibmvnic_probe() decides to retry, purges rwi_list 5. Re-register crq and this time rest of probe succeeds - register netdev and complete(->probe_done). 6. Worker thread resumes in __ibmvnic_reset() from 3b. 7. Worker thread sets ->resetting bit 8. ibmvnic_open() comes in, notices ->resetting bit, sets state to IBMVNIC_OPEN and returns early expecting worker thread to finish the open. 9. Worker thread finds rwi_list empty and returns without opening the interface. If this happens, the ->ndo_open() call is effectively lost and the interface remains down. To address this, ensure that ->rwi_list is not empty before setting the ->resetting bit. See also comments in __ibmvnic_reset(). Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: clear fop when retrying probeSukadev Bhattiprolu
Clear ->failover_pending flag that may have been set in the previous pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open() call would be misled into thinking a failover is pending and assuming that the reset worker thread would open the adapter. If this pass of registering the CRQ succeeds (i.e there is no transport event), there wouldn't be a reset worker thread. This would leave the adapter unconfigured and require manual intervention to bring it up during boot. Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: init init_done_rc earlierSukadev Bhattiprolu
We currently initialize the ->init_done completion/return code fields before issuing a CRQ_INIT command. But if we get a transport event soon after registering the CRQ the taskslet may already have recorded the completion and error code. If we initialize here, we might overwrite/ lose that and end up issuing the CRQ_INIT only to timeout later. If that timeout happens during probe, we will leave the adapter in the DOWN state rather than retrying to register/init the CRQ. Initialize the completion before registering the CRQ so we don't lose the notification. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: register netdev after init of adapterSukadev Bhattiprolu
Finish initializing the adapter before registering netdev so state is consistent. Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: complete init_done on transport eventsSukadev Bhattiprolu
If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a868bb ("ibmvnic: Fix completion structure initialization") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: define flush_reset_queue helperSukadev Bhattiprolu
Define and use a helper to flush the reset queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: initialize rc before completing waitSukadev Bhattiprolu
We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0cb378 ("ibmvnic delay complete()") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25ibmvnic: free reset-work-item when flushingSukadev Bhattiprolu
Fix a tiny memory leak when flushing the reset work queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: stmmac: only enable DMA interrupts when readyVincent Whitchurch
In this driver's ->ndo_open() callback, it enables DMA interrupts, starts the DMA channels, then requests interrupts with request_irq(), and then finally enables napi. If RX DMA interrupts are received before napi is enabled, no processing is done because napi_schedule_prep() will return false. If the network has a lot of broadcast/multicast traffic, then the RX ring could fill up completely before napi is enabled. When this happens, no further RX interrupts will be delivered, and the driver will fail to receive any packets. Fix this by only enabling DMA interrupts after all other initialization is complete. Fixes: 523f11b5d4fd72efb ("net: stmmac: move hardware setup for stmmac_open to new function") Reported-by: Lars Persson <larper@axis.com> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25xen/netfront: destroy queues before real_num_tx_queues is zeroedMarek Marczykowski-Górecki
xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to delete queues. Since d7dac083414eb5bb99a6d2ed53dc2c1b405224e5 ("net-sysfs: update the queue counts in the unregistration path"), unregister_netdev() indirectly sets real_num_tx_queues to 0. Those two facts together means, that xennet_destroy_queues() called from xennet_remove() cannot do its job, because it's called after unregister_netdev(). This results in kfree-ing queues that are still linked in napi, which ultimately crashes: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 52 Comm: xenwatch Tainted: G W 5.16.10-1.32.fc32.qubes.x86_64+ #226 RIP: 0010:free_netdev+0xa3/0x1a0 Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00 RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050 R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680 FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0 Call Trace: <TASK> xennet_remove+0x13d/0x300 [xen_netfront] xenbus_dev_remove+0x6d/0xf0 __device_release_driver+0x17a/0x240 device_release_driver+0x24/0x30 bus_remove_device+0xd8/0x140 device_del+0x18b/0x410 ? _raw_spin_unlock+0x16/0x30 ? klist_iter_exit+0x14/0x20 ? xenbus_dev_request_and_reply+0x80/0x80 device_unregister+0x13/0x60 xenbus_dev_changed+0x18e/0x1f0 xenwatch_thread+0xc0/0x1a0 ? do_wait_intr_irq+0xa0/0xa0 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 </TASK> Fix this by calling xennet_destroy_queues() from xennet_uninit(), when real_num_tx_queues is still available. This ensures that queues are destroyed when real_num_tx_queues is set to 0, regardless of how unregister_netdev() was called. Originally reported at https://github.com/QubesOS/qubes-issues/issues/7257 Fixes: d7dac083414eb5bb9 ("net-sysfs: update the queue counts in the unregistration path") Cc: stable@vger.kernel.org Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25Merge tag 'omap-for-v5.17/fixes-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps Fixes for devkit8000 timer regression. Similar to the earlier beagleboard fixes, we must not configure the clocksource drivers to use an alternative timer configuration. It causes unnecessary issues with power management. Only some old designs based on early beagleboard revisions with a miswired timer need to use the alternative timer. * tag 'omap-for-v5.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Use 32KiHz oscillator on devkit8000 ARM: dts: switch timer config to common devkit8000 devicetree Link: https://lore.kernel.org/r/pull-1645606483-876944@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25can: gs_usb: change active_channels's type from atomic_t to u8Vincent Mailhol
The driver uses an atomic_t variable: gs_usb:active_channels to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. However, the driver does not decrement the counter when an error occurs in gs_can_open(). This issue is fixed by changing the type from atomic_t to u8 and by simplifying the logic accordingly. It is safe to use an u8 here because the network stack big kernel lock (a.k.a. rtnl_mutex) is being hold. For details, please refer to [1]. [1] https://lore.kernel.org/linux-can/CAMZ6Rq+sHpiw34ijPsmp7vbUpDtJwvVtdV7CvRZJsLixjAFfrg@mail.gmail.com/T/#t Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://lore.kernel.org/all/20220214234814.1321599-1-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-25can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8Vincent Mailhol
The driver uses an atomic_t variable: struct es58x_device::opened_channel_cnt to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. While the intent was to prevent race conditions, the choice of an atomic_t turns out to be a bad idea for several reasons: - implementation is incorrect and fails to decrement opened_channel_cnt when the URB allocation fails as reported in [1]. - even if opened_channel_cnt were to be correctly decremented, atomic_t is insufficient to cover edge cases: there can be a race condition in which 1/ a first process fails to allocate URBs memory 2/ a second process enters es58x_open() before the first process does its cleanup and decrements opened_channed_cnt. In which case, the second process would successfully return despite the URBs memory not being allocated. - actually, any kind of locking mechanism was useless here because it is redundant with the network stack big kernel lock (a.k.a. rtnl_lock) which is being hold by all the callers of net_device_ops:ndo_open() and net_device_ops:ndo_close(). c.f. the ASSERST_RTNL() calls in __dev_open() [2] and __dev_close_many() [3]. The atmomic_t is thus replaced by a simple u8 type and the logic to increment and decrement es58x_device:opened_channel_cnt is simplified accordingly fixing the bug reported in [1]. We do not check again for ASSERST_RTNL() as this is already done by the callers. [1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u [2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463 [3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541 Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Link: https://lore.kernel.org/all/20220212112713.577957-1-mailhol.vincent@wanadoo.fr Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A couple driver fixes in the clk subsystem - Fix a hang due to bad clk parent in the Ingenic jz4725b driver - Fix SD controllers on Qualcomm MSM8994 SoCs by removing clks that shouldn't be touched" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: jz4725b: fix mmc0 clock gating clk: qcom: gcc-msm8994: Remove NoC clocks
2022-02-24Merge tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular drm fixes pull, i915, amdgpu and tegra mostly, all pretty small. core: - edid: Always set RGB444 tegra: - tegra186 suspend/resume fixes - syncpoint wait fix - build warning fix - eDP on older devices fix amdgpu: - Display FP fix - PCO powergating fix - RDNA2 OEM SKU stability fixes - Display PSR fix - PCI ASPM fix - Display link encoder fix for TEST_COMMIT - Raven2 suspend/resume fix - Fix a regression in virtual display support - GPUVM eviction fix i915: - Fix QGV handling on ADL-P+ - Fix bw atomic check when switching between SAGV vs. no SAGV - Disconnect PHYs left connected by BIOS on disabled ports - Fix SAVG to no SAGV transitions on TGL+ - Print PHY name properly on calibration error (DG2) imx: - dcss: Select GEM CMA helpers radeon: - Fix some variables's type vc4: - Fix codec cleanup - Fix PM reference counting" * tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm: (24 commits) drm/amdgpu: check vm ready by amdgpu_vm->evicting flag drm/amdgpu: bypass tiling flag check in virtual display case (v2) Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" drm/amdgpu: do not enable asic reset for raven2 drm/amd/display: Fix stream->link_enc unassigned during stream removal drm/amd: Check if ASPM is enabled from PCIe subsystem drm/edid: Always set RGB444 drm/tegra: dpaux: Populate AUX bus drm/radeon: fix variable type drm/amd/display: For vblank_disable_immediate, check PSR is really used drm/amd/pm: fix some OEM SKU specific stability issues drm/amdgpu: disable MMHUB PG for Picasso drm/amd/display: Protect update_bw_bounding_box FPU code. drm/i915/dg2: Print PHY name properly on calibration error drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV drm/i915: Correctly populate use_sagv_wm for all pipes drm/i915: Disconnect PHYs left connected by BIOS on disabled ports drm/i915: Widen the QGV point mask drm/imx/dcss: i.MX8MQ DCSS select DRM_GEM_CMA_HELPER drm/vc4: crtc: Fix runtime_pm reference counting ...
2022-02-24clk: lan966x: Fix linking errorHoratiu Vultur
If the config options HAS_IOMEM is not set then the driver fails to link with the following error: clk-lan966x.c:(.text+0x950): undefined reference to `devm_platform_ioremap_resource' Therefor add missing dependencies: HAS_IOMEM and OF. Fixes: 54104ee02333 ("clk: lan966x: Add lan966x SoC clock driver") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20220219141536.460812-1-horatiu.vultur@microchip.com Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-25drm/exynos: Search for TE-gpio in DSI panel's nodeMarek Szyprowski
TE-gpio, if defined, is placed in the panel's node, not the parent DSI node. Change the devm_gpiod_get_optional() to gpiod_get_optional() and pass proper device node to it. The code already has a proper cleanup path, so it looks that the devm_* variant has been applied accidentally during the conversion to gpiod API. Fixes: ee6c8b5afa62 ("drm/exynos: Replace legacy gpio interface for gpiod interface") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixed a typo. Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos: Don't fail if no TE-gpio is defined for DSI driverMarek Szyprowski
TE-gpio is optional and if it is not found then gpiod_get_optional() returns NULL. In such case the code will continue and try to convert NULL gpiod to irq what in turn fails. The failure is then propagated and driver is not registered. Fix this by returning early from exynos_dsi_register_te_irq() if no TE-gpio is found. Fixes: ee6c8b5afa62 ("drm/exynos: Replace legacy gpio interface for gpiod interface") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos: gsc: Use platform_get_irq() to get the interruptLad Prabhakar
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypassed the hierarchical setup and messed up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos/fimc: Use platform_get_irq() to get the interruptLad Prabhakar
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypassed the hierarchical setup and messed up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get the interruptLad Prabhakar
platform_get_resource_byname(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypassed the hierarchical setup and messed up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq_byname(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos: mixer: Use platform_get_irq() to get the interruptLad Prabhakar
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypassed the hierarchical setup and messed up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to get the interruptLad Prabhakar
platform_get_resource_byname(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypassed the hierarchical setup and messed up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq_byname(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-24clk: qcom: dispcc: Update the transition delay for MDSS GDSCTaniya Das
On SC7180 we observe black screens because the gdsc is being enabled/disabled very rapidly and the GDSC FSM state does not work as expected. This is due to the fact that the GDSC reset value is being updated from SW. The recommended transition delay for mdss core gdsc updated for SC7180/SC7280/SM8250. Fixes: dd3d06622138 ("clk: qcom: Add display clock controller driver for SC7180") Fixes: 1a00c962f9cd ("clk: qcom: Add display clock controller driver for SC7280") Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250") Signed-off-by: Taniya Das <tdas@codeaurora.org> Link: https://lore.kernel.org/r/20220223185606.3941-2-tdas@codeaurora.org Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> [sboyd@kernel.org: lowercase hex] Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-24clk: qcom: gdsc: Add support to update GDSC transition delayTaniya Das
GDSCs have multiple transition delays which are used for the GDSC FSM states. Older targets/designs required these values to be updated from gdsc code to certain default values for the FSM state to work as expected. But on the newer targets/designs the values updated from the GDSC driver can hamper the FSM state to not work as expected. On SC7180 we observe black screens because the gdsc is being enabled/disabled very rapidly and the GDSC FSM state does not work as expected. This is due to the fact that the GDSC reset value is being updated from SW. Thus add support to update the transition delay from the clock controller gdscs as required. Fixes: 45dd0e55317cc ("clk: qcom: Add support for GDSCs) Signed-off-by: Taniya Das <tdas@codeaurora.org> Link: https://lore.kernel.org/r/20220223185606.3941-1-tdas@codeaurora.org Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-24Merge tag 'imx-fixes-5.17-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 5.17, round 2: - Drop reset signal from i.MX8MM vpumix power domain to fix a system hang. - Fix a dtbs_check warning caused by #thermal-sensor-cells in i.MX8ULP device tree. - Fix a clock disabling imbalance in gpcv2 driver. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-24Merge tag 'pci-v5.17-fixes-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Fix a merge error that broke PCI device enumeration on mvebu platforms, including Turris Omnia (Armada 385) (Pali Rohár) - Avoid using ATS on all AMD Navi10 and Navi14 GPUs because some VBIOSes don't account for "harvested" (disabled) parts of the chip when initializing caches (Alex Deucher) * tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken PCI: mvebu: Fix device enumeration regression
2022-02-24Merge tag 'net-5.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - regressions: - bpf: fix crash due to out of bounds access into reg2btf_ids - mvpp2: always set port pcs ops, avoid null-deref - eth: marvell: fix driver load from initrd - eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC" Current release - new code bugs: - mptcp: fix race in overlapping signal events Previous releases - regressions: - xen-netback: revert hotplug-status changes causing devices to not be configured - dsa: - avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held - fix panic when removing unoffloaded port from bridge - dsa: microchip: fix bridging with more than two member ports Previous releases - always broken: - bpf: - fix crash due to incorrect copy_map_value when both spin lock and timer are present in a single value - fix a bpf_timer initialization issue with clang - do not try bpf_msg_push_data with len 0 - add schedule points in batch ops - nf_tables: - unregister flowtable hooks on netns exit - correct flow offload action array size - fix a couple of memory leaks - vsock: don't check owner in vhost_vsock_stop() while releasing - gso: do not skip outer ip header in case of ipip and net_failover - smc: use a mutex for locking "struct smc_pnettable" - openvswitch: fix setting ipv6 fields causing hw csum failure - mptcp: fix race in incoming ADD_ADDR option processing - sysfs: add check for netdevice being present to speed_show - sched: act_ct: fix flow table lookup after ct clear or switching zones - eth: intel: fixes for SR-IOV forwarding offloads - eth: broadcom: fixes for selftests and error recovery - eth: mellanox: flow steering and SR-IOV forwarding fixes Misc: - make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends not report freed skbs as drops - force inlining of checksum functions in net/checksum.h" * tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: mv643xx_eth: process retval from of_get_mac_address ping: remove pr_err from ping_lookup Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC" openvswitch: Fix setting ipv6 fields causing hw csum failure ipv6: prevent a possible race condition with lifetimes net/smc: Use a mutex for locking "struct smc_pnettable" bnx2x: fix driver load from initrd Revert "xen-netback: Check for hotplug-status existence before watching" Revert "xen-netback: remove 'hotplug-status' once it has served its purpose" net/mlx5e: Fix VF min/max rate parameters interchange mistake net/mlx5e: Add missing increment of count net/mlx5e: MPLSoUDP decap, fix check for unsupported matches net/mlx5e: Fix MPLSoUDP encap to use MPLS action information net/mlx5e: Add feature check for set fec counters net/mlx5e: TC, Skip redundant ct clear actions net/mlx5e: TC, Reject rules with forward and drop actions net/mlx5e: TC, Reject rules with drop and modify hdr action net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets net/mlx5e: Fix wrong return value on ioctl EEPROM query failure net/mlx5: Fix possible deadlock on rule deletion ...
2022-02-25Merge tag 'drm-intel-fixes-2022-02-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix QGV handling on ADL-P+ (Ville Syrjälä) - Fix bw atomic check when switching between SAGV vs. no SAGV (Ville Syrjälä) - Disconnect PHYs left connected by BIOS on disabled ports (Imre Deak) - Fix SAVG to no SAGV transitions on TGL+ (Ville Syrjälä) - Print PHY name properly on calibration error (DG2) (Matt Roper) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YhdyHwRWkOTWwlqi@tursulin-mobl2
2022-02-24Merge tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request: - send H2CData PDUs based on MAXH2CDATA (Varun Prakash) - fix passthrough to namespaces with unsupported features (Christoph Hellwig) - Clear iocb->private at poll completion (Stefano) * tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block: nvme-tcp: send H2CData PDUs based on MAXH2CDATA nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info nvme: don't return an error from nvme_configure_metadata block: clear iocb->private in blkdev_bio_end_io_async()
2022-02-24thermal: int340x: fix memory leak in int3400_notify()Chuansheng Liu
It is easy to hit the below memory leaks in my TigerLake platform: unreferenced object 0xffff927c8b91dbc0 (size 32): comm "kworker/0:2", pid 112, jiffies 4294893323 (age 83.604s) hex dump (first 32 bytes): 4e 41 4d 45 3d 49 4e 54 33 34 30 30 20 54 68 65 NAME=INT3400 The 72 6d 61 6c 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 rmal.kkkkkkkkkk. backtrace: [<ffffffff9c502c3e>] __kmalloc_track_caller+0x2fe/0x4a0 [<ffffffff9c7b7c15>] kvasprintf+0x65/0xd0 [<ffffffff9c7b7d6e>] kasprintf+0x4e/0x70 [<ffffffffc04cb662>] int3400_notify+0x82/0x120 [int3400_thermal] [<ffffffff9c8b7358>] acpi_ev_notify_dispatch+0x54/0x71 [<ffffffff9c88f1a7>] acpi_os_execute_deferred+0x17/0x30 [<ffffffff9c2c2c0a>] process_one_work+0x21a/0x3f0 [<ffffffff9c2c2e2a>] worker_thread+0x4a/0x3b0 [<ffffffff9c2cb4dd>] kthread+0xfd/0x130 [<ffffffff9c201c1f>] ret_from_fork+0x1f/0x30 Fix it by calling kfree() accordingly. Fixes: 38e44da59130 ("thermal: int3400_thermal: process "thermal table changed" event") Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-24Merge branch 'cpufreq/arm/fixes' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq fixes for 5.18-rc6 from Viresh Kumar: "This fixes issues related to throttle IRQ for Qcom SoCs." * 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-hw: Delay enabling throttle_irq cpufreq: Reintroduce ready() callback
2022-02-24Merge tag 'platform-drivers-x86-v5.17-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull more x86 platform driver fixes from Hans de Goede: "Two more fixes: - Fix suspend/resume regression on AMD Cezanne APUs in >= 5.16 - Fix Microsoft Surface 3 battery readings" * tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: surface: surface3_power: Fix battery readings on batteries without a serial number platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
2022-02-24net: mv643xx_eth: process retval from of_get_mac_addressMauri Sandberg
Obtaining a MAC address may be deferred in cases when the MAC is stored in an NVMEM block, for example, and it may not be ready upon the first retrieval attempt and return EPROBE_DEFER. It is also possible that a port that does not rely on NVMEM has been already created when getting the defer request. Thus, also the resources allocated previously must be freed when doing a roll-back. Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Mauri Sandberg <maukka@ext.kapsi.fi> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220223142337.41757-1-maukka@ext.kapsi.fi Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"Mateusz Palczewski
Revert of a patch that instead of fixing a AQ error when trying to reset BW limit introduced several regressions related to creation and managing TC. Currently there are errors when creating a TC on both PF and VF. Error log: [17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391 [17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391 This reverts commit 3d2504663c41104b4359a15f35670cfa82de1bbf. Fixes: 3d2504663c41 (i40e: Fix reset bw limit when DCB enabled with 1 TC) Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220223175347.1690692-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24bnx2x: fix driver load from initrdManish Chopra
Commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added new firmware support in the driver with maintaining older firmware compatibility. However, older firmware was not added in MODULE_FIRMWARE() which caused missing firmware files in initrd image leading to driver load failure from initrd. This patch adds MODULE_FIRMWARE() for older firmware version to have firmware files included in initrd. Fixes: b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215627 Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220223085720.12021-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Revert "xen-netback: Check for hotplug-status existence before watching"Marek Marczykowski-Górecki
This reverts commit 2afeec08ab5c86ae21952151f726bfe184f6b23d. The reasoning in the commit was wrong - the code expected to setup the watch even if 'hotplug-status' didn't exist. In fact, it relied on the watch being fired the first time - to check if maybe 'hotplug-status' is already set to 'connected'. Not registering a watch for non-existing path (which is the case if hotplug script hasn't been executed yet), made the backend not waiting for the hotplug script to execute. This in turns, made the netfront think the interface is fully operational, while in fact it was not (the vif interface on xen-netback side might not be configured yet). This was a workaround for 'hotplug-status' erroneously being removed. But since that is reverted now, the workaround is not necessary either. More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Michael Brown <mbrown@fensystems.co.uk> Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingslab.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"Marek Marczykowski-Górecki
This reverts commit 1f2565780e9b7218cf92c7630130e82dcc0fe9c2. The 'hotplug-status' node should not be removed as long as the vif device remains configured. Otherwise the xen-netback would wait for re-running the network script even if it was already called (in case of the frontent re-connecting). But also, it _should_ be removed when the vif device is destroyed (for example when unbinding the driver) - otherwise hotplug script would not configure the device whenever it re-appear. Moving removal of the 'hotplug-status' node was a workaround for nothing calling network script after xen-netback module is reloaded. But when vif interface is re-created (on xen-netback unbind/bind for example), the script should be called, regardless of who does that - currently this case is not handled by the toolstack, and requires manual script call. Keeping hotplug-status=connected to skip the call is wrong and leads to not configured interface. More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingslab.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Merge tag 'usb-serial-5.17-rc6' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for 5.17-rc6 Here's a revert of a commit which erroneously added a device id used for the EPP/MEM mode of ch341 devices. Included are also some new modem device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-5.17-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Telit LE910R1 compositions USB: serial: option: add support for DW5829e Revert "USB: serial: ch341: add new Product ID for CH341A"
2022-02-24surface: surface3_power: Fix battery readings on batteries without a serial ↵Hans de Goede
number The battery on the 2nd hand Surface 3 which I recently bought appears to not have a serial number programmed in. This results in any I2C reads from the registers containing the serial number failing with an I2C NACK. This was causing mshw0011_bix() to fail causing the battery readings to not work at all. Ignore EREMOTEIO (I2C NACK) errors when retrieving the serial number and continue with an empty serial number to fix this. Fixes: b1f81b496b0d ("platform/x86: surface3_power: MSHW0011 rev-eng implementation") BugLink: https://github.com/linux-surface/linux-surface/issues/608 Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20220224101848.7219-1-hdegoede@redhat.com
2022-02-24platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeupMario Limonciello
commit 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup") adds support for using another platform timer in lieu of the RTC which doesn't work properly on some systems. This path was validated and worked well before submission. During the 5.16-rc1 merge window other patches were merged that caused this to stop working properly. When this feature was used with 5.16-rc1 or later some OEM laptops with the matching firmware requirements from that commit would shutdown instead of program a timer based wakeup. This was bisected to commit 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path"). This wasn't supposed to cause any negative impacts and also tested well on both Intel and ARM platforms. However this changed the semantics of when CPUs are allowed to be in the deepest state. For the AMD systems in question it appears this causes a firmware crash for timer based wakeup. It's hypothesized to be caused by the `amd-pmc` driver sending `OS_HINT` and all the CPUs going into a deep state while the timer is still being programmed. It's likely a firmware bug, but to avoid it don't allow setting CPUs into the deepest state while using CZN timer wakeup path. If later it's discovered that this also occurs from "regular" suspends without a timer as well or on other silicon, this may be later expanded to run in the suspend path for more scenarios. Cc: stable@vger.kernel.org # 5.16+ Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/linux-acpi/BL1PR12MB51570F5BD05980A0DCA1F3F4E23A9@BL1PR12MB5157.namprd12.prod.outlook.com/T/#mee35f39c41a04b624700ab2621c795367f19c90e Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path") Fixes: 23f62d7ab25b ("PM: sleep: Pause cpuidle later and resume it earlier during system transitions") Fixes: 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup" Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20220223175237.6209-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>