linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-09-13	scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command	Michal Grzedzicki
	Some cards have more than one SAS address. Using an incorrect address causes communication issues with some devices like expanders. Closes: https://lore.kernel.org/linux-kernel/A57AEA84-5CA0-403E-8053-106033C73C70@fb.com/ Signed-off-by: Michal Grzedzicki <mge@meta.com> Link: https://lore.kernel.org/r/20230913155611.3183612-1-mge@meta.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13	Merge branch '6.6/scsi-staging' into 6.6/scsi-fixes	Martin K. Petersen
	Pull in staged fixes for 6.6. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13	Merge tag 'pmdomain-v6.6-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull genpm / pmdomain rename from Ulf Hansson: "This renames the genpd subsystem to pmdomain. As discussed on LKML, using 'genpd' as the name of a subsystem isn't very self-explanatory and the acronym itself that means Generic PM Domain, is known only by a limited group of people. The suggestion to improve the situation is to rename the subsystem to 'pmdomain', which there seems to be a good consensus around using. Ideally it should indicate that its purpose is to manage Power Domains or 'PM domains' as we often also use within the Linux Kernel terminology" * tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: Rename the genpd subsystem to pmdomain
2023-09-13	Merge tag 'tpmdd-v6.6-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fix from Jarkko Sakkinen. * tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Fix typo in tpmrm class definition
2023-09-13	Merge tag 'parisc-for-6.6-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: - fix reference to exported symbols for parisc64 [Masahiro Yamada] - Block-TLB (BTLB) support on 32-bit CPUs - sparse and build-warning fixes * tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: linux/export: fix reference to exported functions for parisc64 parisc: BTLB: Initialize BTLB tables at CPU startup parisc: firmware: Simplify calling non-PA20 functions parisc: BTLB: _edata symbol has to be page aligned for BTLB support parisc: BTLB: Add BTLB insert and purge firmware function wrappers parisc: BTLB: Clear possibly existing BTLB entries parisc: Prepare for Block-TLB support on 32-bit kernel parisc: shmparam.h: Document aliasing requirements of PA-RISC parisc: irq: Make irq_stack_union static to avoid sparse warning parisc: drivers: Fix sparse warning parisc: iosapic.c: Fix sparse warnings parisc: ccio-dma: Fix sparse warnings parisc: sba-iommu: Fix sparse warnigs parisc: sba: Fix compile warning wrt list of SBA devices parisc: sba_iommu: Fix build warning if procfs if disabled
2023-09-13	firmware: cirrus: cs_dsp: Only log list of algorithms in debug build	Richard Fitzgerald
	Change the logging of each algorithm from info level to debug level. On the original devices supported by this code there were typically only one or two algorithms in a firmware and one or two DSPs so this logging only used a small number of log lines. However, for the latest devices there could be 30-40 algorithms in a firmware and 8 DSPs being loaded in parallel, so using 300+ lines of log for information that isn't particularly important to have logged. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230913160523.3701189-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-09-13	igb: clean up in all error paths when enabling SR-IOV	Corinna Vinschen
	After commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), removing the igb module could hang or crash (depending on the machine) when the module has been loaded with the max_vfs parameter set to some value != 0. In case of one test machine with a dual port 82580, this hang occurred: [ 232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1 [ 233.093257] igb 0000:41:00.1: IOV Disabled [ 233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0 [ 233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata) [ 233.352248] igb 0000:41:00.0: device [8086:1516] error status/mask=00100000 [ 233.361088] igb 0000:41:00.0: [20] UnsupReq (First) [ 233.368183] igb 0000:41:00.0: AER: TLP Header: 40000001 0000040f cdbfc00c c [ 233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata) [ 233.388779] igb 0000:41:00.1: device [8086:1516] error status/mask=00100000 [ 233.397629] igb 0000:41:00.1: [20] UnsupReq (First) [ 233.404736] igb 0000:41:00.1: AER: TLP Header: 40000001 0000040f cdbfc00c c [ 233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback) [ 233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0 [ 233.546197] pcieport 0000:40:01.0: AER: device recovery failed [ 234.157244] igb 0000:41:00.0: IOV Disabled [ 371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds. [ 371.627489] Not tainted 6.4.0-dirty #2 [ 371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this. [ 371.641000] task:irq/35-aerdrv state:D stack:0 pid:257 ppid:2 f0 [ 371.650330] Call Trace: [ 371.653061] <TASK> [ 371.655407] __schedule+0x20e/0x660 [ 371.659313] schedule+0x5a/0xd0 [ 371.662824] schedule_preempt_disabled+0x11/0x20 [ 371.667983] __mutex_lock.constprop.0+0x372/0x6c0 [ 371.673237] ? __pfx_aer_root_reset+0x10/0x10 [ 371.678105] report_error_detected+0x25/0x1c0 [ 371.682974] ? __pfx_report_normal_detected+0x10/0x10 [ 371.688618] pci_walk_bus+0x72/0x90 [ 371.692519] pcie_do_recovery+0xb2/0x330 [ 371.696899] aer_process_err_devices+0x117/0x170 [ 371.702055] aer_isr+0x1c0/0x1e0 [ 371.705661] ? __set_cpus_allowed_ptr+0x54/0xa0 [ 371.710723] ? __pfx_irq_thread_fn+0x10/0x10 [ 371.715496] irq_thread_fn+0x20/0x60 [ 371.719491] irq_thread+0xe6/0x1b0 [ 371.723291] ? __pfx_irq_thread_dtor+0x10/0x10 [ 371.728255] ? __pfx_irq_thread+0x10/0x10 [ 371.732731] kthread+0xe2/0x110 [ 371.736243] ? __pfx_kthread+0x10/0x10 [ 371.740430] ret_from_fork+0x2c/0x50 [ 371.744428] </TASK> The reproducer was a simple script: #!/bin/sh for i in `seq 1 5`; do modprobe -rv igb modprobe -v igb max_vfs=1 sleep 1 modprobe -rv igb done It turned out that this could only be reproduce on 82580 (quad and dual-port), but not on 82576, i350 and i210. Further debugging showed that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because dev->is_physfn is 0 on 82580. Prior to commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), igb_enable_sriov() jumped into the "err_out" cleanup branch. After this commit it only returned the error code. So the cleanup didn't take place, and the incorrect VF setup in the igb_adapter structure fooled the igb driver into assuming that VFs have been set up where no VF actually existed. Fix this problem by cleaning up again if pci_enable_sriov() fails. Fixes: 50f303496d92 ("igb: Enable SR-IOV after reinit") Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13	ixgbe: fix timestamp configuration code	Vadim Fedorenko
	The commit in fixes introduced flags to control the status of hardware configuration while processing packets. At the same time another structure is used to provide configuration of timestamper to user-space applications. The way it was coded makes this structures go out of sync easily. The repro is easy for 82599 chips: [root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1 current settings: tx_type 0 rx_filter 0 new settings: tx_type 1 rx_filter 12 The eth0 device is properly configured to timestamp any PTPv2 events. [root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1 current settings: tx_type 1 rx_filter 12 SIOCSHWTSTAMP failed: Numerical result out of range The requested time stamping mode is not supported by the hardware. The error is properly returned because HW doesn't support all packets timestamping. But the adapter->flags is cleared of timestamp flags even though no HW configuration was done. From that point no RX timestamps are received by user-space application. But configuration shows good values: [root@hostname ~]# hwstamp_ctl -i eth0 current settings: tx_type 1 rx_filter 12 Fix the issue by applying new flags only when the HW was actually configured. Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13	i2c: cadence: Fix the kernel-doc warnings	Shubhrajyoti Datta
	This fixes the below warnings drivers/i2c/busses/i2c-cadence.c:221: warning: Function parameter or member 'rinfo' not described in 'cdns_i2c' Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308171510.bKHBcZQW-lkp@intel.com/ Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-13	pmdomain: Rename the genpd subsystem to pmdomain	Ulf Hansson
	It has been pointed out that naming a subsystem "genpd" isn't very self-explanatory and the acronym itself that means Generic PM Domain, is known only by a limited group of people. In a way to improve the situation, let's rename the subsystem to pmdomain, which ideally should indicate that this is about so called Power Domains or "PM domains" as we often also use within the Linux Kernel terminology. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230912221127.487327-1-ulf.hansson@linaro.org
2023-09-13	i2c: aspeed: Reset the i2c controller when timeout occurs	Tommy Huang
	Reset the i2c controller when an i2c transfer timeout occurs. The remaining interrupts and device should be reset to avoid unpredictable controller behavior. Fixes: 2e57b7cebb98 ("i2c: aspeed: Add multi-master use case support") Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Tommy Huang <tommy_huang@aspeedtech.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-13	i2c: I2C_MLXCPLD on ARM64 should depend on ACPI	Geert Uytterhoeven
	The "i2c_mlxcpld" platform device is only instantiated on X86 systems (through drivers/platform/x86/mlx-platform.c), or on ARM64 systems with ACPI (through drivers/platform/mellanox/nvsw-sn2201.c). Hence further restrict the dependency on ARM64 to ACPI, to prevent asking the user about this driver when configuring an ARM64 kernel without ACPI support. While at it, document in the Kconfig help text that the driver supports ARM64/ACPI based systems, too. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Acked-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-13	i2c: Make I2C_ATR invisible	Geert Uytterhoeven
	I2C Address Translator (ATR) support is not a stand-alone driver, but a library. All of its users select I2C_ATR. Hence there is no need for the user to enable this symbol manually, except when compile-testing. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-13	w1: ds2482: Switch back to use struct i2c_driver's .probe()	Uwe Kleine-König
	After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/lkml/20230612072807.839689-1-u.kleine-koenig@pengutronix.de/ Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-12	drm/amdkfd: Insert missing TLB flush on GFX10 and later	Harish Kasiviswanathan
	Heavy-weight TLB flush is required after unmap on all GPUs for correctness and security. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-09-12	tpm: Fix typo in tpmrm class definition	Justin M. Forbes
	Commit d2e8071bed0be ("tpm: make all 'class' structures const") unfortunately had a typo for the name on tpmrm. Fixes: d2e8071bed0b ("tpm: make all 'class' structures const") Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-09-12	nvme-pci: do not set the NUMA node of device if it has none	Pratyush Yadav
	If a device has no NUMA node information associated with it, the driver puts the device in node first_memory_node (say node 0). Not having a NUMA node and being associated with node 0 are completely different things and it makes little sense to mix the two. Signed-off-by: Pratyush Yadav <ptyadav@amazon.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-09-12	veth: Update XDP feature set when bringing up device	Toke Høiland-Jørgensen
	There's an early return in veth_set_features() if the device is in a down state, which leads to the XDP feature flags not being updated when enabling GRO while the device is down. Which in turn leads to XDP_REDIRECT not working, because the redirect code now checks the flags. Fix this by updating the feature flags after bringing the device up. Before this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: no NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: no After this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: yes NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: yes Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230911135826.722295-1-toke@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12	driver core: return an error when dev_set_name() hasn't happened	Andy Shevchenko
	The commit d21fdd07cea4 ("driver core: Return proper error code when dev_set_name() fails") rewrote the logic of handling the dev_set_name() error codes, but missed the point that initially set error value to -EINVAL might be rewritten and hence the error path can't be triggered at some circumstances. To fix this, make sure that error variable is set to -EINVAL when other conditionals are false. Reported-by: syzbot+bdfb03b1ec8b342c12cb@syzkaller.appspotmail.com Fixes: d21fdd07cea4 ("driver core: Return proper error code when dev_set_name() fails") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230828145824.3895288-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-12	Revert "comedi: add HAS_IOPORT dependencies"	Ian Abbott
	This reverts commit b5c75b68b7ded84d4c82118974ce3975a4dcaa74. The commit makes it impossible to select configuration options that depend on COMEDI_8254, COMEDI_DAS08, COMEDI_NI_LABPC, or COMEDI_AMPLC_DIO200 options due to changing 'select' directives to 'depends on' directives and there being no other way to select those codependent configuration options. Fixes: b5c75b68b7de ("comedi: add HAS_IOPORT dependencies") Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: <stable@vger.kernel.org> # v6.5+ Acked-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20230905090922.3314-1-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-12	net: macb: fix sleep inside spinlock	Sascha Hauer
	macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() which can sleep. This results in: \| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 \| pps pps1: new PPS source ptp1 \| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 \| preempt_count: 1, expected: 0 \| RCU nest depth: 0, expected: 0 \| 4 locks held by kworker/u4:3/40: \| #0: ffff000003409148 \| macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. \| ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c \| #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c \| #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 \| #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac \| irq event stamp: 113998 \| hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 \| hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 \| softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 \| softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c \| CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 \| Hardware name: ... ZynqMP ... (DT) \| Workqueue: events_power_efficient phylink_resolve \| Call trace: \| dump_backtrace+0x98/0xf0 \| show_stack+0x18/0x24 \| dump_stack_lvl+0x60/0xac \| dump_stack+0x18/0x24 \| __might_resched+0x144/0x24c \| __might_sleep+0x48/0x98 \| __mutex_lock+0x58/0x7b0 \| mutex_lock_nested+0x24/0x30 \| clk_prepare_lock+0x4c/0xa8 \| clk_set_rate+0x24/0x8c \| macb_mac_link_up+0x25c/0x2ac \| phylink_resolve+0x178/0x4e8 \| process_one_work+0x1ec/0x51c \| worker_thread+0x1ec/0x3e4 \| kthread+0x120/0x124 \| ret_from_fork+0x10/0x20 The obvious fix is to move the call to macb_set_tx_clk() out of the protected area. This seems safe as rx and tx are both disabled anyway at this point. It is however not entirely clear what the spinlock shall protect. It could be the read-modify-write access to the NCFGR register, but this is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well without holding the spinlock. It could also be the register accesses done in mog_init_rings() or macb_init_buffers(), but again these functions are called without holding the spinlock in macb_hresp_error_task(). The locking seems fishy in this driver and it might deserve another look before this patch is applied. Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/r/20230908112913.1701766-1-s.hauer@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12	drm/i915: Only check eDP HPD when AUX CH is shared	Ville Syrjälä
	Apparently Acer Chromebook C740 (BDW-ULT) doesn't have the eDP HPD line properly connected, and thus fails the new HPD check during eDP probe. The result is that we lose the eDP output. I suspect all such machines would be Chromebooks or other Linux exclusive systems as the Windows driver likely wouldn't work either. I did check a few other BDW machines here and those do have eDP HPD connected, one of them even is a different Chromebook (Samus). To account for these funky machines let's skip the HPD check when it looks like the eDP port is the only one using that specific AUX channel. In case of multiple ports sharing the same AUX CH (eg. on Asrock B250M-HDV) we still do the check and thus should correctly ignore the eDP port in favor of the other DP port (usually a DP->VGA converter). v2: Don't oops during list iteration Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9264 Fixes: cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230908052527.685-1-ville.syrjala@linux.intel.com Reviewed-by: Luca Coelho <luciano.coelho@intel.com> (cherry picked from commit 70052100fabec5d8c1b09c9959817a2f4517e6b5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-09-12	Merge drm/drm-fixes into drm-misc-fixes	Thomas Zimmermann
	Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2023-09-11	drm/amd/display: Fix 2nd DPIA encoder Assignment	Mustapha Ghaddar
	[HOW & Why] There seems to be an issue with 2nd DPIA acquiring link encoder for tiled displays. Solution is to remove check for eng_id before we get first dynamic encoder for it Reviewed-by: Cruise Hung <cruise.hung@amd.com> Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amd/display: Add DPIA Link Encoder Assignment Fix	Mustapha Ghaddar
	For DPIA we should have preferred DIG assignment based on DPIA selected as per the ASIC design. Reviewed-by: George Shen <george.shen@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-09-11	drm/amd/display: fix replay_mode kernel-doc warning	Randy Dunlap
	Fix the typo in the kernel-doc for @replay_mode to prevent kernel-doc warnings: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:623: warning: Incorrect use of kernel-doc format: * @replay mode: Replay supported drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:626: warning: Function parameter or member 'replay_mode' not described in 'amdgpu_hdmi_vsdb_info' Fixes: ec8e59cb4e0c ("drm/amd/display: Get replay info from VSDB") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu: Handle null atom context in VBIOS info ioctl	David Francis
	On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	cxl/pci: Replace host_bridge->native_aer with pcie_aer_is_native()	Smita Koralahalli
	Use pcie_aer_is_native() to determine the native AER ownership as the usage of host_bride->native_aer does not cover command line override of AER ownership. Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230823234305.27333-4-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-09-11	PCI/AER: Export pcie_aer_is_native()	Smita Koralahalli
	Export and move the declaration of pcie_aer_is_native() to a common header file to be reused by cxl/pci module. Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230823234305.27333-3-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-09-11	cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registers	Smita Koralahalli
	cxl_pci fails to unmask CXL protocol errors when CXL memory error reporting is not granted native control. Given that CXL memory error reporting uses the event interface and protocol errors use AER, unmask protocol errors based only on the native AER setting. Without this change end user deployments will fail to report protocol errors in the case where native memory error handling is not granted to Linux. Also, return zero instead of an error code to not block the communication with the cxl device when in native memory error reporting mode. Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") Cc: <stable@vger.kernel.org> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230823234305.27333-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-09-11	drm/amdkfd: Checkpoint and restore queues on GFX11	David Francis
	The code in kfd_mqd_manager_v11.c to support criu dump and restore of queue state was missing. Added it; should be equivalent to kfd_mqd_manager_v10.c. CC: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amd/display: Adjust the MST resume flow	Wayne Lin
	[Why] In drm_dp_mst_topology_mgr_resume() today, it will resume the mst branch to be ready handling mst mode and also consecutively do the mst topology probing. Which will cause the dirver have chance to fire hotplug event before restoring the old state. Then Userspace will react to the hotplug event based on a wrong state. [How] Adjust the mst resume flow as: 1. set dpcd to resume mst branch status 2. restore source old state 3. Do mst resume topology probing For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to pull out topology probing work into a 2nd part procedure of the mst resume. Will have a follow up patch in drm. Reviewed-by: Chao-kai Wang <stylon.wang@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram	Hawking Zhang
	So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV	Alex Deucher
	Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV	Alex Deucher
	This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amd/display: Don't check registers, if using AUX BL control	Swapnil Patel
	[Why] Currently the driver looks DCN registers to access if BL is on or not. This check is not valid if we are using AUX based brightness control. This causes driver to not send out "backlight off" command during power off sequence as it already thinks it is off. [How] Only check DCN registers if we aren't using AUX based brightness control. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Swapnil Patel <swapnil.patel@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu: fix retry loop test	Dan Carpenter
	This loop will exit with "retry" set to -1 if it fails but the code checks for if "retry" is zero. Fix this by changing post-op to a pre-op. --retry vs retry--. Fixes: e01eeffc3f86 ("drm/amd/pm: avoid driver getting empty metrics table for the first time") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amd/display: Add dirty rect support for Replay	Bhawanpreet Lakha
	Dirty rect can be used with replay, so enable them to allow for more powersaving. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory"	Hamza Mahfooz
	This reverts commit 70e64c4d522b732e31c6475a3be2349de337d321. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amd/display: fix the white screen issue when >= 64GB DRAM	Yifan Zhang
	Dropping bit 31:4 of page table base is wrong, it makes page table base points to wrong address if phys addr is beyond 64GB; dropping page_table_start/end bit 31:4 is unnecessary since dcn20_vmid_setup will do that. Also, while we are at it, cleanup the assignments using upper_32_bits()/lower_32_bits() and AMDGPU_GPU_PAGE_SHIFT. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)") Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Co-developed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdkfd: Update CU masking for GFX 9.4.3	Mukul Joshi
	The CU mask passed from user-space will change based on different spatial partitioning mode. As a result, update CU masking code for GFX9.4.3 to work for all partitioning modes. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdkfd: Update cache info reporting for GFX v9.4.3	Mukul Joshi
	Update cache info reporting in sysfs to report the correct number of CUs and associated cache information based on different spatial partitioning modes. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3	Mukul Joshi
	Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdkfd: Fix unaligned 64-bit doorbell warning	Mukul Joshi
	This patch fixes the following unaligned 64-bit doorbell warning seen when submitting packets on HIQ on GFX v9.4.3 by making the HIQ doorbell 64-bit aligned. The warning is seen when GPU is loaded in any mode other than SPX mode. [ +0.000301] ------------[ cut here ]------------ [ +0.000003] Unaligned 64-bit doorbell [ +0.000030] WARNING: /amdkfd/kfd_doorbell.c:339 write_kernel_doorbell64+0x72/0x80 [ +0.000003] RIP: 0010:write_kernel_doorbell64+0x72/0x80 [ +0.000004] RSP: 0018:ffffc90004287730 EFLAGS: 00010246 [ +0.000005] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ +0.000003] RDX: 0000000000000001 RSI: ffffffff82837c71 RDI: 00000000ffffffff [ +0.000003] RBP: ffffc90004287748 R08: 0000000000000003 R09: 0000000000000001 [ +0.000002] R10: 000000000000001a R11: ffff88a034008198 R12: ffffc900013bd004 [ +0.000003] R13: 0000000000000008 R14: ffffc900042877b0 R15: 000000000000007f [ +0.000003] FS: 00007fa8c7b62000(0000) GS:ffff889f88400000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000056111c45aaf0 CR3: 00000001414f2002 CR4: 0000000000770ee0 [ +0.000003] PKRU: 55555554 [ +0.000002] Call Trace: [ +0.000004] <TASK> [ +0.000006] kq_submit_packet+0x45/0x50 [amdgpu] [ +0.000524] pm_send_set_resources+0x7f/0xc0 [amdgpu] [ +0.000500] set_sched_resources+0xe4/0x160 [amdgpu] [ +0.000503] start_cpsch+0x1c5/0x2a0 [amdgpu] [ +0.000497] kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu] [ +0.000743] amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu] [ +0.000602] amdgpu_device_init.cold+0x1813/0x2176 [amdgpu] [ +0.000684] ? pci_bus_read_config_word+0x4a/0x80 [ +0.000012] ? do_pci_enable_device+0xdc/0x110 [ +0.000008] amdgpu_driver_load_kms+0x1a/0x110 [amdgpu] [ +0.000545] amdgpu_pci_probe+0x197/0x400 [amdgpu] Fixes: c31866651086 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	drm/amdkfd: Fix reg offset for setting CWSR grace period	Mukul Joshi
	This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11	md/raid1: fix error: ISO C90 forbids mixed declarations	Nigel Croxon
	There is a compile error when this commit is added: md: raid1: fix potential OOB in raid1_remove_disk() drivers/md/raid1.c: In function 'raid1_remove_disk': drivers/md/raid1.c:1844:9: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 1844 \| struct raid1_info *p = conf->mirrors + number; \| ^~~~~~ That's because the new code was inserted before the struct. The change is move the struct command above this commit. Fixes: 8b0472b50bcf ("md: raid1: fix potential OOB in raid1_remove_disk()") Signed-off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/46d929d0-2aab-4cf2-b2bf-338963e8ba5a@redhat.com
2023-09-11	spi: intel-pci: Add support for Granite Rapids SPI serial flash	Mika Westerberg
	Intel Granite Rapids has a flash controller that is compatible with the other Cannon Lake derivatives. Add Granite Rapids PCI ID to the driver list of supported devices. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230911074616.3473347-1-mika.westerberg@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-09-11	thermal: Constify the trip argument of the .get_trend() zone callback	Rafael J. Wysocki
	Add 'const' to the definition of the 'trip' argument of the .get_trend() thermal zone callback to indicate that the trip point passed to it should not be modified by it and adjust the callback functions implementing it, thermal_get_trend() in the ACPI thermal driver and __ti_thermal_get_trend(), accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com>
2023-09-11	thermal/of: add missing of_node_put()	Julia Lawall
	for_each_child_of_node performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. This was done using the Coccinelle semantic patch iterators/for_each_child.cocci Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-11	Merge tag 'drm-misc-next-fixes-2023-09-11' of ↵	Daniel Vetter
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * nouveau: Lockdep workaround * fbdev/g364fb: Build fix Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230911141915.GA983@linux-uq9g