Age | Commit message (Collapse) | Author |
|
The mana_hwc_rx_event_handler() / mana_hwc_handle_resp() calls
complete(&ctx->comp_event) before posting the wqe back. It's
possible that other callers, like mana_create_txq(), start the
next round of mana_hwc_send_request() before the posting of wqe.
And if the HW is fast enough to respond, it can hit no_wqe error
on the HW channel, then the response message is lost. The mana
driver may fail to create queues and open, because of waiting for
the HW response and timed out.
Sample dmesg:
[ 528.610840] mana 39d4:00:02.0: HWC: Request timed out!
[ 528.614452] mana 39d4:00:02.0: Failed to send mana message: -110, 0x0
[ 528.618326] mana 39d4:00:02.0 enP14804s2: Failed to create WQ object: -110
To fix it, move posting of rx wqe before complete(&ctx->comp_event).
Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PMC driver has capability to get the idle mask values and STB data from
the PMFW. Extend this support to the platforms that belong to family 1Ah
model 60h series.
Co-developed-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240822095357.395808-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The commit 426463d94d45 ("platform/x86/amd/pmc: Send OS_HINT command for
new AMD platform") was introduced to enable sending mailbox commands to
PMFW on newer platforms. However, it was later discovered that the commit
did not configure the correct message port ID (i.e., S2D or PMC). Without
this configuration, all command submissions to PMFW are treated as
invalid, leading to command failures.
To address this issue, the CPU ID association for the new platform needs
to be added in amd_pmc_get_ip_info(). This ensures that the correct SMU
port IDs are selected.
Fixes: 426463d94d45 ("platform/x86/amd/pmc: Send OS_HINT command for new AMD platform")
Co-developed-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240822095357.395808-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Knowing the bus name is helpful when we want to expose the link topology
to userspace, add a helper to return the SFP bus name.
This call will always be made while holding the RTNL which ensures
that the SFP driver won't unbind from the device. The returned pointer
to the bus name will only be used while RTNL is held.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.
By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.
With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.
The reason being that most of the logic for these configuration, coming
from either ethtool netlink or ioctls tend to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.
Introduce a numbering scheme allowing to enumerate PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.
The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.
This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.
The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The irq_domain passed to msi_lib_irq_domain_select() may not have
msi_parent_ops set. There is a NULL pointer check for it, but unfortunately
there is a dereference of the parent ops pointer before that.
Move the NULL pointer test before the first use of that pointer.
This was found on a MacchiatoBin (Marvell Armada 8K SoC), which uses the
irq-mvebu-sei driver.
Fixes: 72e257c6f058 ("irqchip: Provide irq-msi-lib")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240823100733.1900666-1-maxime.chevallier@bootlin.com
Closes: https://lore.kernel.org/all/20240821165034.1af97bad@fedora-3.home/
|
|
The GICv3 driver pokes GICv3 system registers in gic_prio_init() before
gic_cpu_sys_reg_init() ensures that GICv3 system registers have been
enabled by writing to ICC_SRE_EL1.SRE.
On arm64 this is benign as has_useable_gicv3_cpuif() runs earlier during
cpufeature detection, and this enables the GICv3 system registers.
On 32-bit arm when booting on an FVP using the boot-wrapper, the accesses
in gic_prio_init() end up being UNDEFINED and crashes the kernel during
boot.
This is a regression introduced by the addition of gic_prio_init().
Fix this by factoring out the SRE initialization into a new function and
calling it early in the three paths where SRE may not have been
initialized:
(1) gic_init_bases(), before the primary CPU pokes GICv3 sysregs in
gic_prio_init().
(2) gic_starting_cpu(), before secondary CPUs initialize GICv3 sysregs
in gic_cpu_init().
(3) gic_cpu_pm_notifier(), before CPUs re-initialize GICv3 sysregs in
gic_cpu_sys_reg_init().
Fixes: d447bf09a4013541 ("irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
It has turned out that having _set_required_opps() to recursively call
dev_pm_opp_set_opp() to set the required OPPs, doesn't really work as well
as we expected.
More precisely, at each recursive call to dev_pm_opp_set_opp() we are
changing an OPP for a required_dev that belongs to a required-OPP table.
The problem with this, is that we may have several devices sharing the same
required-OPP table, which leads to an incorrect behaviour in regards to
aggregating the per device votes.
To fix the problem for a required-OPP table belonging to a PM domain, which
is the only existing usecase for now, let's simply replace the call to
dev_pm_opp_set_opp() in _set_required_opps() by a call to _set_opp_level().
Moving forward we may potentially need to add support for other types of
required-OPP tables. In this case, the aggregation needs to be thought of.
Fixes: e37440e7e2c2 ("OPP: Call dev_pm_opp_set_opp() for required OPPs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20240822224547.385095-2-ulf.hansson@linaro.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull power sequencing fix from Bartosz Golaszewski:
- request the wlan-enable GPIO "as-is" to fix an issue with the wifi
module being already powered up before linux boots
* tag 'pwrseq-fixes-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
power: sequencing: request the WLAN enable GPIO as-is
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- imx: Remove duplicated clocks for scu power domain
- imx: Wait for SSAR to complete power-on for i.MX93 power domain
* tag 'pmdomain-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: imx: wait SSAR when i.MX93 power domain on
pmdomain: imx: scu-pd: Remove duplicated clocks
|
|
The iommu_report_device_fault function was updated to return void while
assuming that drivers only need to call iommu_report_device_fault() for
reporting an iopf. This implementation causes following problems:
1. The drivers rely on the core code to call it's page_reponse,
however, when a fault is received and no fault capable domain is
attached / iopf_param is NULL, the ops->page_response is NOT called
causing the device to stall in case the fault type was PAGE_REQ.
2. The arm_smmu_v3 driver relies on the returned value to log errors
returning void from iommu_report_device_fault causes these events to
be missed while logging.
Modify the iommu_report_device_fault function to return -EINVAL for
cases where no fault capable domain is attached or iopf_param was NULL
and calls back to the driver (ops->page_response) in case the fault type
was IOMMU_FAULT_PAGE_REQ. The returned value can be used by the drivers
to log the fault/event as needed.
Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Closes: https://lore.kernel.org/all/6147caf0-b9a0-30ca-795e-a1aa502a5c51@huawei.com/
Fixes: 3dfa64aecbaf ("iommu: Make iommu_report_device_fault() return void")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20240816104906.1010626-1-praan@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:
- Fix the max segment size and max number of segments supported by the
pata_macio driver to fix issues with BIO splitting leading to an
overflow of the adapter DMA table (from Michael)
- Related to the previous fix, change BUG_ON() calls for incorrect
command buffer segmentation into WARN_ON() and an error return (from
Michael)
* tag 'ata-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: pata_macio: Use WARN instead of BUG
ata: pata_macio: Fix DMA table overflow
|
|
If formatting a suspended disk (such as formatting with different DIF
type), the disk will be resuming first, and then the format command will
submit to the disk through SG_IO ioctl.
When the disk is processing the format command, the system does not
submit other commands to the disk. Therefore, the system attempts to
suspend the disk again and sends the SYNCHRONIZE CACHE command. However,
the SYNCHRONIZE CACHE command will fail because the disk is in the
formatting process. This will cause the runtime_status of the disk to
error and it is difficult for user to recover it. Error info like:
[ 669.925325] sd 6:0:6:0: [sdg] Synchronizing SCSI cache
[ 670.202371] sd 6:0:6:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[ 670.216300] sd 6:0:6:0: [sdg] Sense Key : 0x2 [current]
[ 670.221860] sd 6:0:6:0: [sdg] ASC=0x4 ASCQ=0x4
To solve the issue, ignore the error and return success/0 when format is
in progress.
Cc: stable@vger.kernel.org
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Link: https://lore.kernel.org/r/20240819090934.2130592-1-liyihang9@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
aac_probe_one() calls hardware-specific init functions through the
aac_driver_ident::init pointer, all of which eventually call down to
aac_init_adapter().
If aac_init_adapter() fails after allocating memory for aac_dev::queues,
it frees the memory but does not clear that member.
After the hardware-specific init function returns an error,
aac_probe_one() goes down an error path that frees the memory pointed to
by aac_dev::queues, resulting.in a double-free.
Reported-by: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
Link: https://bugs.debian.org/1075855
Fixes: 8e0c5ebde82b ("[SCSI] aacraid: Newer adapter communication iterface support")
Signed-off-by: Ben Hutchings <benh@debian.org>
Link: https://lore.kernel.org/r/ZsZvfqlQMveoL5KQ@decadent.org.uk
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Build failed while enabling "CONFIG_GCOV_KERNEL=y" and
"CONFIG_GCOV_PROFILE_ALL=y" with following error:
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_get_cgnbuf_info':
BUILDSTDERR: ./include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' accessing 18446744073709551615 bytes at offsets 0 and 0 overlaps 9223372036854775807 bytes at offset -9223372036854775808 [-Werror=restrict]
BUILDSTDERR: 114 | #define __underlying_memcpy __builtin_memcpy
BUILDSTDERR: | ^
BUILDSTDERR: ./include/linux/fortify-string.h:637:9: note: in expansion of macro '__underlying_memcpy'
BUILDSTDERR: 637 | __underlying_##op(p, q, __fortify_size); \
BUILDSTDERR: | ^~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/fortify-string.h:682:26: note: in expansion of macro '__fortify_memcpy_chk'
BUILDSTDERR: 682 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
BUILDSTDERR: | ^~~~~~~~~~~~~~~~~~~~
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c:5468:9: note: in expansion of macro 'memcpy'
BUILDSTDERR: 5468 | memcpy(cgn_buff, cp, cinfosz);
BUILDSTDERR: | ^~~~~~
This happens from the commit 06bb7fc0feee ("kbuild: turn on -Wrestrict by
default"). Address this issue by using size_t type.
Signed-off-by: Sherry Yang <sherry.yang@oracle.com>
Link: https://lore.kernel.org/r/20240821065131.1180791-1-sherry.yang@oracle.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/broadcom/bnxt/bnxt.h
c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops")
f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC")
Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds with GCC 14.2.0 warn that:
.../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~
.../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~
.../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
the range is 0 - 255. Further, on inspecting the code, it seems
that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
the range is actually 0 - 8.
So, it seems that the condition that GCC flags will not occur.
But, nonetheless, it would be nice if it didn't emit the warning.
It seems that this can be achieved by changing the format specifier
from %d to %u, in which case I believe GCC recognises an upper bound
on the range of tc of 0 - 255. After some experimentation I think
this is due to the combination of the use of %u and the type of
cfg->tcs (u8).
Empirically, updating the type of the tc variable to unsigned int
has the same effect.
As both of these changes seem to make sense in relation to what the code
is actually doing - iterating over unsigned values - do both.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth and netfilter.
Current release - regressions:
- virtio_net: avoid crash on resume - move netdev_tx_reset_queue()
call before RX napi enable
Current release - new code bugs:
- net/mlx5e: fix page leak and incorrect header release w/ HW GRO
Previous releases - regressions:
- udp: fix receiving fraglist GSO packets
- tcp: prevent refcount underflow due to concurrent execution of
tcp_sk_exit_batch()
Previous releases - always broken:
- ipv6: fix possible UAF when incrementing error counters on output
- ip6: tunnel: prevent merging of packets with different L2
- mptcp: pm: fix IDs not being reusable
- bonding: fix potential crashes in IPsec offload handling
- Bluetooth: HCI:
- MGMT: add error handling to pair_device() to avoid a crash
- invert LE State quirk to be opt-out rather then opt-in
- fix LE quote calculation
- drv: dsa: VLAN fixes for Ocelot driver
- drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings
- drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192
Misc:
- netpoll: do not export netpoll_poll_[disable|enable]()
- MAINTAINERS: update the list of networking headers"
* tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
s390/iucv: Fix vargs handling in iucv_alloc_device()
net: ovs: fix ovs_drop_reasons error
net: xilinx: axienet: Fix dangling multicast addresses
net: xilinx: axienet: Always disable promiscuous mode
MAINTAINERS: Mark JME Network Driver as Odd Fixes
MAINTAINERS: Add header files to NETWORKING sections
MAINTAINERS: Add limited globs for Networking headers
MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
octeontx2-af: Fix CPT AF register offset calculation
net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
net: ngbe: Fix phy mode set to external phy
netfilter: flowtable: validate vlan header
bnxt_en: Fix double DMA unmapping for XDP_REDIRECT
ipv6: prevent possible UAF in ip6_xmit()
ipv6: fix possible UAF in ip6_finish_output2()
ipv6: prevent UAF in ip6_send_skb()
netpoll: do not export netpoll_poll_[disable|enable]()
selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
udp: fix receiving fraglist GSO packets
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Eliminate the fdtoverlay command duplication in scripts/Makefile.lib
- Fix 'make compile_commands.json' for external modules
- Ensure scripts/kconfig/merge_config.sh handles missing newlines
- Fix some build errors on macOS
* tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix typos "prequisites" to "prerequisites"
Documentation/llvm: turn make command for ccache into code block
kbuild: avoid scripts/kallsyms parsing /dev/null
treewide: remove unnecessary <linux/version.h> inclusion
scripts: kconfig: merge_config: config files: add a trailing newline
Makefile: add $(srctree) to dependency of compile_commands.json target
kbuild: clean up code duplication in cmd_fdtoverlay
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5 (Geert)
Driver Changes:
- Fix opregion leak (Lucas)
- Fix OA sysfs entry (Ashutosh)
- Fix VM dma-resv lock (Brost)
- Fix tile fini sequence (Brost)
- Prevent UAF around preempt fence (Auld)
- Fix DGFX display suspend/resume (Maarten)
- Many Xe/Xe2 critical workarounds (Auld, Ngai-Mint, Bommu, Tejas, Daniele)
- Fix devm/drmm issues (Daniele)
- Fix missing workqueue destroy in xe_gt_pagefault (Stuart)
- Drop HW fence pointer to HW fence ctx (Brost)
- Free job before xe_exec_queue_put (Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZsdVe0XI2Pq8C-ON@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
nouveau:
- firmware: use dma non-coherent allocator
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240822123907.GA234335@localhost.localdomain
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Fix for HDCP timeouts
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZsbPMm6XfzimmZW0@jlahtine-mobl.ger.corp.intel.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.11-2024-08-21:
amdgpu:
- GFX10 firmware loading fix
- SDMA 5.2 fix
- Debugfs parameter validation fix
- eGPU hotplug fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821172810.302416-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.11-rc5
1) Fixes from the virtual plane series, namely
- fix the list of formats for QCM2290 since it has no YUV support
- minor fix in dpu_plane_atomic_check_pipe() to check only for csc and
not csc and scaler while allowing yuv formats
- take rotation into account while allocating virtual planes
2) Fix to cleanup FB if dpu_format_populate_layout() fails. This fixes the
warning splat during DRM file closure
3) Fix to reset the phy link params before re-starting link training. This
fixes the 100% link training failure when someone starts modetest while
cable is connected
4) Long pending fix to fix a visual corruption seen for 4k modes. Root-cause
was we cannot support 4k@30 with 30bpp with 2 lanes so this is a critical
fix to use 24bpp for such cases
5) Fix to move dpu encoder's connector assignment to atomic_enable(). This
fixes the NULL ptr crash for cases when there is an atomic_enable()
without atomic_modeset() after atomic_disable() . This happens for
connectors_changed case of crtc. It fixes a NULL ptr crash reported
during hotplug.
6) Fix to simplify DPU's debug macros without which dynamic debug does not
work as expected
7) Fix the highest bank bit setting for sc7180
8) adreno: fix error return if missing firmware-name
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvxF2p3-AsjUydmSYrA0Vb+Ea7nh3VtNX0pT0Ae_Me-Kw@mail.gmail.com
|
|
logical_die_id()
After the commit 63edbaa48a57 ("x86/cpu/topology: Add support for the
AMD 0x80000026 leaf"), the topolgy_logical_die_id() function returns
the logical Core Chiplet Die (CCD) ID instead of the logical socket
ID.
Since this is currently used to set MSR_AMD_CPPC_ENABLE, which needs
to be set on any one of the threads of the socket, it is prudent to
use topology_logical_package_id() in place of
topology_logical_die_id().
Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
cc: stable@vger.kernel.org # 6.10
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Link: https://lore.kernel.org/lkml/20240801124509.3650-1-Dhananjay.Ugwekar@amd.com/
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
Smatch complains that "ret" could be uninitialized:
drivers/cpufreq/amd-pstate.c:734 amd_pstate_cpu_boost_update()
error: uninitialized symbol 'ret'.
This seems like it probably is a real issue. Initialize "ret" to zero to
be safe.
Fixes: c8c68c38b56f ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/lkml/7ff53543-6c04-48a0-8d99-7dc010b93b3a@stanley.mountain/T/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
The "name" field in struct nvme_ctrl is unsued so removing it.
This would help save 12 bytes of space for each nvme_ctrl instance
created.
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Commit 4733b65d82bd ("nvme: start keep-alive after admin queue setup")
moves starting keep-alive from nvme_start_ctrl() into
nvme_init_ctrl_finish(), but don't move stopping keep-alive into
nvme_uninit_ctrl(), so keep-alive work can be started and keep pending
after failing to start controller, finally use-after-free is triggered if
nvme host driver is unloaded.
This patch fixes kernel panic when running nvme/004 in case that connection
failure is triggered, by moving stopping keep-alive into nvme_uninit_ctrl().
This way is reasonable because keep-alive is now started in
nvme_init_ctrl_finish().
Fixes: 3af755a46881 ("nvme: move nvme_stop_keep_alive() back to original position")
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mark O'Donovan <shiftee@posteo.net>
Reported-by: Changhui Zhong <czhong@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
If a multicast address is removed but there are still some multicast
addresses, that address would remain programmed into the frame filter.
Fix this by explicitly setting the enable bit for each filter.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If promiscuous mode is disabled when there are fewer than four multicast
addresses, then it will not be reflected in the hardware. Fix this by
always clearing the promiscuous mode flag even when we program multicast
addresses.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
success
After successfully programming the SPI flash with an MFPS auto update
image, the error sysfs attribute reports programming:timeout.
This is caused by an incorrect check on the return value from
wait_for_completion_timeout() in mpfs_auto_update_poll_complete().
Fixes: ec5b0f1193ad ("firmware: microchip: add PolarFire SoC Auto Update support")
Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Checking for index < 0 is useless because the find_deepest_state()
function never really returns a negative value.
Since this hasn't been reported in over 9 years it's dead code, so
remove it.
Signed-off-by: Dhruva Gole <d-gole@ti.com>
Link: https://patch.msgid.link/20240821114250.1416421-1-d-gole@ti.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Terminating for_each_available_child_of_node() loop requires dropping OF
node reference, so bailing out on errors misses this. Solve the OF node
reference leak with scoped for_each_available_child_of_node_scoped().
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
thermal_of_zone_register() calls of_thermal_zone_find() which will
iterate over OF nodes with for_each_available_child_of_node() to find
matching thermal zone node. When it finds such, it exits the loop and
returns the node. Prematurely ending for_each_available_child_of_node()
loops requires dropping OF node reference, thus success of
of_thermal_zone_find() means that caller must drop the reference.
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Terminating for_each_child_of_node() loop requires dropping OF node
reference, so bailing out after thermal_of_populate_trip() error misses
this. Solve the OF node reference leak with scoped
for_each_child_of_node_scoped().
Fixes: d0c75fa2c17f ("thermal/of: Initialize trip points separately")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Make the imx_thermal driver use the .should_bind() thermal zone callback
to provide the thermal core with the information on whether or not to
bind the given cooling device to the given trip point in the given
thermal zone. If it returns 'true', the thermal core will bind the
cooling device to the trip and the corresponding unbinding will be
taken care of automatically by the core on the removal of the involved
thermal zone or cooling device.
In the imx_thermal case, it only needs to return 'true' for the passive
trip point and it will match any cooling device passed to it, in
analogy with the old-style imx_bind() callback function.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/2485070.jE0xQCEvom@rjwysocki.net
|
|
Make the mlxsw core_thermal driver use the .should_bind() thermal zone
callback to provide the thermal core with the information on whether or
not to bind the given cooling device to the given trip point in the
given thermal zone. If it returns 'true', the thermal core will bind
the cooling device to the trip and the corresponding unbinding will be
taken care of automatically by the core on the removal of the involved
thermal zone or cooling device.
It replaces the .bind() and .unbind() thermal zone callbacks (in 3
places) which assumed the same trip points ordering in the driver
and in the thermal core (that may not be true any more in the
future). The .bind() callbacks used loops over trip point indices
to call thermal_zone_bind_cooling_device() for the same cdev (once
it had been verified) and all of the trip points, but they passed
different 'upper' and 'lower' values to it for each trip.
To retain the original functionality, the .should_bind() callbacks
need to use the same 'upper' and 'lower' values that would be used
by the corresponding .bind() callbacks when they are about to return
'true'. To that end, the 'priv' field of each trip is set during the
thermal zone initialization to point to the corresponding 'state'
object containing the maximum and minimum cooling states of the
cooling device.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/2216931.Icojqenx9y@rjwysocki.net
|
|
Make the acerhdf driver use the .should_bind() thermal zone
callback to provide the thermal core with the information on whether or
not to bind the given cooling device to the given trip point in the
given thermal zone. If it returns 'true', the thermal core will bind
the cooling device to the trip and the corresponding unbinding will be
taken care of automatically by the core on the removal of the involved
thermal zone or cooling device.
The previously existing acerhdf_bind() function bound cooling devices
to thermal trip point 0 only, so the new callback needs to return 'true'
for trip point 0. However, it is straightforward to observe that trip
point 0 is an active trip point and the only other trip point in the
driver's thermal zone is a critical one, so it is sufficient to return
'true' from that callback if the type of the given trip point is
THERMAL_TRIP_ACTIVE.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Peter Kästle <peter@piie.net>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/3779411.MHq7AAxBmi@rjwysocki.net
|
|
thermal_unbind_cdev_from_trip()
Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip()
are only called locally in the thermal core now, they can be static,
so change their definitions accordingly and drop their headers from
the global thermal header file.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/3512161.QJadu78ljV@rjwysocki.net
|
|
Make the ACPI thermal zone driver use the .should_bind() thermal zone
callback to provide the thermal core with the information on whether or
not to bind the given cooling device to the given trip point in the
given thermal zone. If it returns 'true', the thermal core will bind
the cooling device to the trip and the corresponding unbinding will be
taken care of automatically by the core on the removal of the involved
thermal zone or cooling device.
This replaces the .bind() and .unbind() thermal zone callbacks which
allows the code to be simplified quite significantly while providing
the same functionality.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/1812827.VLH7GnMWUR@rjwysocki.net
|
|
The current design of the code binding cooling devices to trip points in
thermal zones is convoluted and hard to follow.
Namely, a driver that registers a thermal zone can provide .bind()
and .unbind() operations for it, which are required to call either
thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip(),
respectively, or thermal_zone_bind_cooling_device() and
thermal_zone_unbind_cooling_device(), respectively, for every relevant
trip point and the given cooling device. Moreover, if .bind() is
provided and .unbind() is not, the cleanup necessary during the removal
of a thermal zone or a cooling device may not be carried out.
In other words, the core relies on the thermal zone owners to do the
right thing, which is error prone and far from obvious, even though all
of that is not really necessary. Specifically, if the core could ask
the thermal zone owner, through a special thermal zone callback, whether
or not a given cooling device should be bound to a given trip point in
the given thermal zone, it might as well carry out all of the binding
and unbinding by itself. In particular, the unbinding can be done
automatically without involving the thermal zone owner at all because
all of the thermal instances associated with a thermal zone or cooling
device going away must be deleted regardless.
Accordingly, introduce a new thermal zone operation, .should_bind(),
that can be invoked by the thermal core for a given thermal zone,
trip point and cooling device combination in order to check whether
or not the cooling device should be bound to the trip point at hand.
It takes an additional cooling_spec argument allowing the thermal
zone owner to specify the highest and lowest cooling states of the
cooling device and its weight for the given trip point binding.
Make the thermal core use this operation, if present, in the absence of
.bind() and .unbind(). Note that .should_bind() will be called under
the thermal zone lock.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/9334403.CDJkKcVGEf@rjwysocki.net
|
|
Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip()
acquire the thermal zone lock, the locking rules for their callers get
complicated. In particular, the thermal zone lock cannot be acquired
in any code path leading to one of these functions even though it might
be useful to do so.
To address this, remove the thermal zone locking from both these
functions, add lockdep assertions for the thermal zone lock to both
of them and make their callers acquire the thermal zone lock instead.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/3837835.kQq0lBPeGt@rjwysocki.net
|
|
Two sysfs show/store functions for attributes representing thermal
instances, trip_point_show() and weight_store(), retrieve the thermal
zone pointer from the instance object at hand, but they may also get
it from their dev argument, which is more consistent with what the
other thermal sysfs functions do, so make them do so.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/1987669.PYKUYFuaPT@rjwysocki.net
|
|
Because the trip and cdev pointers are sufficient to identify a thermal
instance holding them unambiguously, drop the additional thermal zone
checks from two loops walking the list of thermal instances in a
thermal zone.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/10527734.nUPlyArG6x@rjwysocki.net
|
|
It is not necessary to look up the thermal zone and the cooling device
in the respective global lists to check whether or not they are
registered. It is sufficient to check whether or not their respective
list nodes are empty for this purpose.
Use the above observation to simplify thermal_bind_cdev_to_trip(). In
addition, eliminate an unnecessary ternary operator from it.
Moreover, add lockdep_assert_held() for thermal_list_lock to it because
that lock must be held by its callers when it is running.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/3324214.44csPzL39Z@rjwysocki.net
|
|
Fold bind_cdev() into __thermal_cooling_device_register() and bind_tz()
into thermal_zone_device_register_with_trips() to reduce code bloat and
make it somewhat easier to follow the code flow.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/2962184.e9J7NaK4W3@rjwysocki.net
|
|
Introduce a facility allowing the thermal core functionality to be
exercised in a controlled way in order to verify its behavior, without
affecting its regular users noticeably.
It is based on the idea of preparing thermal zone templates along with
their trip points by writing to files in debugfs. When ready, those
templates can be used for registering test thermal zones with the
thermal core.
The temperature of a test thermal zone created this way can be adjusted
via debugfs, which also triggers a __thermal_zone_device_update() call
for it. By manipulating the temperature of a test thermal zone, one can
check if the thermal core reacts to the changes of it as expected.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/6065927.lOV4Wx5bFT@rjwysocki.net
[ rjw: Fixed ordering of kcalloc() arguments ]
[ rjw: Fixed debugfs_create_dir() return value checks ]
[ rjw: Fixed two kerneldoc comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Let the kememdup_array() take care about multiplication and possible
overflows.
Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821081447.12430-1-yujiaoliang@vivo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|