linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-12-02	ibmvnic: drop bad optimization in reuse_tx_pools()	Sukadev Bhattiprolu
	When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02	ibmvnic: drop bad optimization in reuse_rx_pools()	Sukadev Bhattiprolu
	When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02	ethernet: aquantia: Try MAC address from device tree	Tianhao Chai
	Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the card, but instead need to obtain MAC addresses from the device tree. In this case the hardware will report an invalid MAC. Currently atlantic driver does not query the DT for MAC address and will randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in. This patch causes the driver to perfer a valid MAC address from OF (if present) over HW self-reported MAC and only fall back to a random MAC address when neither of them is valid. Signed-off-by: Tianhao Chai <cth451@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02	irqchip: nvic: Fix offset for Interrupt Priority Offsets	Vladimir Murzin
	According to ARM(v7M) ARM Interrupt Priority Offsets located at 0xE000E400-0xE000E5EC, while 0xE000E300-0xE000E33C covers read-only Interrupt Active Bit Registers Fixes: 292ec080491d ("irqchip: Add support for ARMv7-M NVIC") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211201110259.84857-1-vladimir.murzin@arm.com
2021-12-02	ata: replace snprintf in show functions with sysfs_emit	Yang Guang
	coccinelle report： ./drivers/ata/libata-sata.c:830:8-16: WARNING: use scnprintf or sprintf Use sysfs_emit instead of scnprintf or sprintf makes more sense. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Yang Guang <yang.guang5@zte.com.cn> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-12-01	Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"	Marek Behún
	This reverts commit 239edf686c14a9ff926dec2f350289ed7adfefe2. 239edf686c14 ("PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge") added support for the Type 1 Expansion ROM BAR at config offset 0x38, based on the register being listed in the Marvell Armada A3720 spec. But the spec doesn't document it at all for RC mode, and there is no ROM in the SOC, so remove this emulation for now. The PCI bridge which represents aardvark's PCIe Root Port has an Expansion ROM Base Address register at offset 0x30, but its meaning is different than PCI's Expansion ROM BAR register, although the layout is the same. (This is why we thought it does the same thing.) First: there is no ROM (or part of BootROM) in the A3720 SOC dedicated for PCIe Root Port (or controller in RC mode) containing executable code that would initialize the Root Port, suitable for execution in bootloader (this is how Expansion ROM BAR is used on x86). Second: in A3720 spec the register (address 0xD0070030) is not documented at all for Root Complex mode, but similar to other BAR registers, it has an "entangled partner" in register 0xD0075920, which does address translation for the BAR in 0xD0070030: - the BAR register sets the address from the view of PCIe bus - the translation register sets the address from the view of the CPU The other BAR registers also have this entangled partner, and they can be used to: - in RC mode: address-checking on the receive side of the RC (they can define address ranges for memory accesses from remote Endpoints to the RC) - in Endpoint mode: allow the remote CPU to access memory on A3720 The Expansion ROM BAR has only the Endpoint part documented, but from the similarities we think that it can also be used in RC mode in that way. So either Expansion ROM BAR has different meaning (if the hypothesis above is true), or we don't know it's meaning (since it is not documented for RC mode). Remove the register from the emulated bridge accessing functions. [bhelgaas: summarize reason for removal (first paragraph)] Fixes: 239edf686c14 ("PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge") Link: https://lore.kernel.org/r/20211125160148.26029-3-kabel@kernel.org Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Pali Rohár <pali@kernel.org>
2021-12-01	octeontx2-af: Fix a memleak bug in rvu_mbox_init()	Zhou Qingyang
	In rvu_mbox_init(), mbox_regions is not freed or passed out under the switch-default region, which could lead to a memory leak. Fix this bug by changing 'return err' to 'goto free_regions'. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_OCTEONTX2_AF=y show no new warnings, and our static analyzer no longer warns about this code. Fixes: 98c561116360 (“octeontx2-af: cn10k: Add mbox support for CN10K platform”) Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130165039.192426-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01	net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()	Zhou Qingyang
	In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv(). After that mlx4_en_alloc_resources() is called and there is a dereference of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to a use after free problem on failure of mlx4_en_copy_priv(). Fix this bug by adding a check of mlx4_en_copy_priv() This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_MLX4_EN=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01	vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit	Stephen Suryaputra
	IPCB/IP6CB need to be initialized when processing outbound v4 or v6 pkts in the codepath of vrf device xmit function so that leftover garbage doesn't cause futher code that uses the CB to incorrectly process the pkt. One occasion of the issue might occur when MPLS route uses the vrf device as the outgoing device such as when the route is added using "ip -f mpls route add <label> dev <vrf>" command. The problems seems to exist since day one. Hence I put the day one commits on the Fixes tags. Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Fixes: 35402e313663 ("net: Add IPv6 support to VRF device") Cc: stable@vger.kernel.org Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211130162637.3249-1-ssuryaextr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01	net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()	Zhou Qingyang
	In qlcnic_83xx_add_rings(), the indirect function of ahw->hw_ops->alloc_mbx_args will be called to allocate memory for cmd.req.arg, and there is a dereference of it in qlcnic_83xx_add_rings(), which could lead to a NULL pointer dereference on failure of the indirect function like qlcnic_83xx_alloc_mbx_args(). Fix this bug by adding a check of alloc_mbx_args(), this patch imitates the logic of mbx_cmd()'s failure handling. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_QLCNIC=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: 7f9664525f9c ("qlcnic: 83xx memory map and HW access routine") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130110848.109026-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02	Merge tag 'amd-drm-fixes-5.16-2021-12-01' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.16-2021-12-01: amdgpu: - IP discovery based enumeration fixes - vkms fixes - DSC fixes for DP MST - Audio fix for hotplug with tiled displays - Misc display fixes - DP tunneling fix - DP fix - Aldebaran fix amdkfd: - Locking fix - Static checker fix - Fix double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211201232802.5801-1-alexander.deucher@amd.com
2021-12-02	Merge tag 'drm-msm-fixes-2021-11-28' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/msm into drm-fixes msm misc fixes, build, display Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsV-ntO_u323XMKuD6bgbgvXporwi1sbyXwNDAuA52Afw@mail.gmail.com
2021-12-01	drm/amdkfd: process_info lock not needed for svm	Philip Yang
	process_info->lock is used to protect kfd_bo_list, vm_list_head, n_vms and userptr valid/inval list, svm_range_restore_work and svm_range_set_attr don't access those, so do not need to take process_info lock. This will avoid potential circular locking issue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdgpu: adjust the kfd reset sequence in reset sriov function	shaoyunl
	This change revert previous commits: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amd/display: add connector type check for CRC source set	Perry Yuan
	[Why] IGT bypass test will set crc source as DPRX,and display DM didn`t check connection type, it run the test on the HDMI connector ,then the kernel will be crashed because aux->transfer is set null for HDMI connection. This patch will skip the invalid connection test and fix kernel crash issue. [How] Check the connector type while setting the pipe crc source as DPRX or auto,if the type is not DP or eDP, the crtc crc source will not be set and report error code to IGT test,IGT will show the this subtest as no valid crtc/connector combinations found. 116.779714] [IGT] amd_bypass: starting subtest 8bpc-bypass-mode [ 117.730996] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 117.731001] #PF: supervisor instruction fetch in kernel mode [ 117.731003] #PF: error_code(0x0010) - not-present page [ 117.731004] PGD 0 P4D 0 [ 117.731006] Oops: 0010 [#1] SMP NOPTI [ 117.731009] CPU: 11 PID: 2428 Comm: amd_bypass Tainted: G OE 5.11.0-34-generic #36~20.04.1-Ubuntu [ 117.731011] Hardware name: AMD CZN/, BIOS AB.FD 09/07/2021 [ 117.731012] RIP: 0010:0x0 [ 117.731015] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 117.731016] RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246 [ 117.731017] RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e [ 117.731018] RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8 [ 117.731022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 117.731023] CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0 [ 117.731023] PKRU: 55555554 [ 117.731024] Call Trace: [ 117.731027] drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper] [ 117.731036] drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper] [ 117.731040] drm_dp_start_crc+0x38/0xb0 [drm_kms_helper] [ 117.731047] amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu] [ 117.731149] crtc_crc_open+0x174/0x220 [drm] [ 117.731162] full_proxy_open+0x168/0x1f0 [ 117.731165] ? open_proxy_open+0x100/0x100 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1546 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdkfd: fix double free mem structure	Philip Yang
	drm_gem_object_put calls release_notify callback to free the mem structure and unreserve_mem_limit, move it down after the last access of mem and make it conditional call. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdkfd: set "r = 0" explicitly before goto	Philip Yang
	To silence the following Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2615 svm_range_restore_pages() warn: missing error code here? 'get_task_mm()' failed. 'r' = '0' Signed-off-by: Philip Yang <Philip.Yang@amd.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amd/display: Add work around for tunneled MST.	Jimmy Kizito
	[Why] Certain USB4 docks do not seem to be able to handle disabling DSC once it has been enabled on an MST stream. This can result in blank displays. [How] As a work around, always enable DSC on docks exhibiting this issue. The flag to indicate the use of DSC for MST streams on a USB4 dock is set during detection of the dock and only cleared when the USB4 dock is disconnected. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amd/display: Fix for the no Audio bug with Tiled Displays	Mustapha Ghaddar
	[WHY] It seems like after a series of plug/unplugs we end up in a situation where tiled display doesnt support Audio. [HOW] The issue seems to be related to when we check streams changed after an HPD, we should be checking the audio_struct as well to see if any of its values changed. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Mustapha Ghaddar <mustapha.ghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amd/display: Clear DPCD lane settings after repeater training	Shen, George
	[Why] VS and PE requested by repeater should not persist for the sink. [How] Clear DPCD lane settings after repeater link training finishes. Reviewed-by: Wesley Chalmers <wesley.chalmers@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: George Shen <George.Shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amd/display: Allow DSC on supported MST branch devices	Nicholas Kazlauskas
	[Why] When trying to lightup two 4k60 non-DSC displays behind a branch device that supports DSC we can't lightup both at once due to bandwidth limitations - each requires 48 VCPI slots but we only have 63. [How] The workaround already exists in the code but is guarded by a CONFIG that cannot be set by the user and shouldn't need to be. Check for specific branch device IDs to device whether to enable the workaround for multiple display scenarios. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-12-01	drm/amdgpu: Don't halt RLC on GFX suspend	Lijo Lazar
	On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend and keep it running till PMFW disables all DPMs. [ 578.019986] amdgpu 0000:23:00.0: amdgpu: GPU reset begin! [ 583.245566] amdgpu 0000:23:00.0: amdgpu: Failed to disable smu features. [ 583.245621] amdgpu 0000:23:00.0: amdgpu: Fail to disable dpm features! [ 583.245639] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 [ 583.248504] [drm] free PSP TMR buffer Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdgpu: fix the missed handling for SDMA2 and SDMA3	Guchun Chen
	There is no base reg offset or ip_version set for SDMA2 and SDMA3 on SIENNA_CICHLID, so add them. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kevin Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdgpu: check atomic flag to differeniate with legacy path	Flora Cui
	since vkms support atomic KMS interface Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <aleander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdgpu: cancel the correct hrtimer on exit	Flora Cui
	Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID	Jane Jian
	[WHY] for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature), which will be mismatched with original vcn0 revision [HOW] add new version check for vcn0 disabled revision(3, 0, 192), typically modified under sriov mode Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01	iavf: restore MSI state on reset	Mitch Williams
	If the PF experiences an FLR, the VF's MSI and MSI-X configuration will be conveniently and silently removed in the process. When this happens, reset recovery will appear to complete normally but no traffic will pass. The netdev watchdog will helpfully notify everyone of this issue. To prevent such public embarrassment, restore MSI configuration at every reset. For normal resets, this will do no harm, but for VF resets resulting from a PF FLR, this will keep the VF working. Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-01	cpufreq: Fix a comment in cpufreq_policy_free	Tang Yizhou
	Make the comment in blocking_notifier_call_chain() easier to understand. Signed-off-by: Tang Yizhou <tangyizhou@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-01	powercap/drivers/dtpm: Disable DTPM at boot time	Daniel Lezcano
	The DTPM framework misses a mechanism to set it up. That is currently under review but will come after the next cycle. As the distro are enabling all the kernel options, the DTPM framework is enabled on platforms where the energy model is not implemented, thus making the framework inconsistent and disrupting the CPU frequency scaling service. Remove the initialization at boot time as a hot fix. Fixes: 7a89d7eacf8e ("powercap/drivers/dtpm: Simplify the dtpm table") Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reported-By: Doug Smythies <dsmythies@telus.net> Tested-By: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-01	cpufreq: Fix get_cpu_device() failure in add_cpu_dev_symlink()	Xiongfeng Wang
	When I hot added a CPU, I found 'cpufreq' directory was not created below /sys/devices/system/cpu/cpuX/. It is because get_cpu_device() failed in add_cpu_dev_symlink(). cpufreq_add_dev() is the .add_dev callback of a CPU subsys interface. It will be called when the CPU device registered into the system. The call chain is as follows: register_cpu() ->device_register() ->device_add() ->bus_probe_device() ->cpufreq_add_dev() But only after the CPU device has been registered, we can get the CPU device by get_cpu_device(), otherwise it will return NULL. Since we already have the CPU device in cpufreq_add_dev(), pass it to add_cpu_dev_symlink(). I noticed that the 'kobj' of the CPU device has been added into the system before cpufreq_add_dev(). Fixes: 2f0ba790df51 ("cpufreq: Fix creation of symbolic links to policy directories") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-01	Merge tag 'wireless-drivers-2021-12-01' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.16 First set of fixes for v5.16. Mostly crash and driver initialisation fixes, the fix for rtw89 being most important. iwlwifi * compiler, lockdep and smatch warning fixes * fix for a rare driver initialisation failure * fix a memory leak rtw89 * fix const buffer modification causing a kernel crash mt76 * fix null pointer access * fix idr leak rt2x00 * fix driver initialisation errors, a regression since v5.2-rc1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	Merge tag 'mlx5-fixes-2021-11-30' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-11-30 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed	Marek Behún
	Function mv88e6xxx_serdes_pcs_get_state() currently does not report link up if AN is enabled, Link bit is set, but Speed and Duplex Resolved bit is not set, which testing shows is the case for when auto-negotiation was bypassed (we have AN enabled but link partner does not). An example of such link partner is Marvell 88X3310 PHY, when put into the mode where host interface changes between 10gbase-r, 5gbase-r, 2500base-x and sgmii according to copper speed. The 88X3310 does not enable AN in 2500base-x, and so SerDes on mv88e6xxx currently does not link with it. Fix this. Fixes: a5a6858b793f ("net: dsa: mv88e6xxx: extend phylink to Serdes PHYs") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family	Marek Behún
	Inband AN is broken on Amethyst in 2500base-x mode when set by standard mechanism (via cmode). (There probably is some weird setting done by default in the switch for this mode that make it cycle in some state or something, because when the peer is the mvneta controller, it receives link change interrupts every ~0.3ms, but the link is always down.) Get around this by configuring the PCS mode to 1000base-x (where inband AN works), and then changing the SerDes frequency while SerDes transmitter and receiver are disabled, before enabling SerDes PHY. After disabling SerDes PHY, change the PCS mode back to 2500base-x, to avoid confusing the device (if we leave it at 1000base-x PCS mode but with different frequency, and then change cmode to sgmii, the device won't change the frequency because it thinks it already has the correct one). The register which changes the frequency is undocumented. I discovered it by going through all registers in the ranges 4.f000-4.f100 and 1e.8000-1e.8200 for all SerDes cmodes (sgmii, 1000base-x, 2500base-x, 5gbase-r, 10gbase-r, usxgmii) and filtering out registers that didn't make sense (the value was the same for modes which have different frequency). The result of this was: reg sgmii 1000base-x 2500base-x 5gbase-r 10gbase-r usxgmii 04.f002 005b 0058 0059 005c 005d 005f 04.f076 3000 0000 1000 4000 5000 7000 04.f07c 0950 0950 1850 0550 0150 0150 1e.8000 0059 0059 0058 0055 0051 0051 1e.8140 0e20 0e20 0e28 0e21 0e42 0e42 Register 04.f002 is the documented Port Operational Confiuration register, it's last 3 bits select PCS type, so changing this register also changes the frequency to the appropriate value. Registers 04.f076 and 04.f07c are not writable. Undocumented register 1e.8000 was the one: changing bits 3:0 from 9 to 8 changed SerDes frequency to 3.125 GHz, while leaving the value of PCS mode in register 04.f002.2:0 at 1000base-x. Inband autonegotiation started working correctly. (I didn't try anything with register 1e.8140 since 1e.8000 solved the problem.) Since I don't have documentation for this register 1e.8000.3:0, I am using the constants without names, but my hypothesis is that this register selects PHY frequency. If in the future I have access to an oscilloscope able to handle these frequencies, I will try to test this hypothesis. Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Add fix for erratum 5.2 of 88E6393X family	Marek Behún
	Add fix for erratum 5.2 of the 88E6393X (Amethyst) family: for 10gbase-r mode, some undocumented registers need to be written some special values. Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Save power by disabling SerDes trasmitter and receiver	Marek Behún
	Save power on 88E6393X by disabling SerDes receiver and transmitter after SerDes is SerDes is disabled. Signed-off-by: Marek Behún <kabel@kernel.org> Cc: stable@vger.kernel.org # de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Drop unnecessary check in mv88e6393x_serdes_erratum_4_6()	Marek Behún
	The check for lane is unnecessary, since the function is called only with allowed lane argument. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	net: dsa: mv88e6xxx: Fix application of erratum 4.8 for 88E6393X	Marek Behún
	According to SERDES scripts for 88E6393X, erratum 4.8 has to be applied every time before SerDes is powered on. Split the code for erratum 4.8 into separate function and call it in mv88e6393x_serdes_power(). Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01	drm/i915/dp: Perform 30ms delay after source OUI write	Lyude Paul
	While working on supporting the Intel HDR backlight interface, I noticed that there's a couple of laptops that will very rarely manage to boot up without detecting Intel HDR backlight support - even though it's supported on the system. One example of such a laptop is the Lenovo P17 1st generation. Following some investigation Ville Syrjälä did through the docs they have available to them, they discovered that there's actually supposed to be a 30ms wait after writing the source OUI before we begin setting up the rest of the backlight interface. This seems to be correct, as adding this 30ms delay seems to have completely fixed the probing issues I was previously seeing. So - let's start performing a 30ms wait after writing the OUI, which we do in a manner similar to how we keep track of PPS delays (e.g. record the timestamp of the OUI write, and then wait for however many ms are left since that timestamp right before we interact with the backlight) in order to avoid waiting any longer then we need to. As well, this also avoids us performing this delay on systems where we don't end up using the HDR backlight interface. V3: * Move last_oui_write into intel_dp V2: * Move panel delays into intel_pps Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Fixes: 4a8d79901d5b ("drm/i915/dp: Enable Intel's HDR backlight interface (only SDR for now)") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.12+ Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211130212912.212044-1-lyude@redhat.com (cherry picked from commit c7c90b0b8418a97d3aa8b39aae1992908948efad) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-12-01	dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow	Guangming
	For previous version, it uses 'sg_table.nent's to traverse sg_table in pages free flow. However, 'sg_table.nents' is reassigned in 'dma_map_sg', it means the number of created entries in the DMA adderess space. So, use 'sg_table.nents' in pages free flow will case some pages can't be freed. Here we should use sg_table.orig_nents to free pages memory, but use the sgtable helper 'for each_sgtable_sg'(, instead of the previous rather common helper 'for_each_sg' which maybe cause memory leak) is much better. Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available") Signed-off-by: Guangming <Guangming.Cao@mediatek.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Cc: <stable@vger.kernel.org> # 5.11.* Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211126074904.88388-1-guangming.cao@mediatek.com
2021-11-30	net/mlx5e: SHAMPO, Fix constant expression result	Ben Ben-Ishay
	mlx5e_build_shampo_hd_umr uses counters i and index incorrectly as unsigned, thus the err state err_unmap could stuck in endless loop. Change i to int to solve the first issue. Reduce index check to solve the second issue, the caller function validates that index could not rotate. Fixes: 64509b052525 ("net/mlx5e: Add data path for SHAMPO feature") Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: Fix access to a non-supported register	Aya Levin
	Validate MRTC register is supported before triggering a delayed work which accesses it. Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: Fix too early queueing of log timestamp work	Gal Pressman
	The log timestamp work should not be queued before the command interface is initialized, move it to a later stage in the init flow. Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: Fix use after free in mlx5_health_wait_pci_up	Amir Tzin
	The device health recovery flow calls mlx5_health_wait_pci_up() which queries the device for FW_RESET timeout after freeing the device timeouts structure on mlx5_function_teardown(). Fix this bug by moving timeouts structure init/cleanup to the device's init/uninit phases. Since it is necessary to reset default software timeouts on function reload, extract setting of defaults values from mlx5_tout_init() and call mlx5_tout_set_def_val() directly from mlx5_function_setup(). Fixes: 5945e1adeab5 ("net/mlx5: Read timeout values from init segment") Reported by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: E-Switch, Use indirect table only if all destinations support it	Maor Dickman
	When adding rule with multiple destinations, indirect table is used for all of the destinations if at least one of the destinations support it, this can cause creation of invalid indirect tables for the destinations that doesn't support it. Fixed it by using indirect table only if all destinations support it. Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: E-Switch, Check group pointer before reading bw_share value	Dmytro Linkin
	If log_esw_max_sched_depth is not supported group pointer of the vport is NULL. Hence, check the pointer before reading bw_share value. Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: E-Switch, fix single FDB creation on BlueField	Mark Bloch
	Always use MLX5_FLOW_TABLE_OTHER_VPORT flag when creating egress ACL table for single FDB. Not doing so on BlueField will make firmware fail the command. On BlueField the E-Switch manager is the ECPF (vport 0xFFFE) which is filled in the flow table creation command but as the other_vport field wasn't set the firmware complains about a bad parameter. This is different from a regular HCA where the E-Switch manager vport is the PF (vport 0x0). Passing MLX5_FLOW_TABLE_OTHER_VPORT will make the firmware happy both on BlueField and on regular HCAs without special condition for each. This fixes the bellow firmware syndrome: mlx5_cmd_check:819:(pid 571): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x754a4) Fixes: db202995f503 ("net/mlx5: E-Switch, add logic to enable shared FDB") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: E-switch, Respect BW share of the new group	Dmytro Linkin
	To enable transmit schduler on vport FW require non-zero configuration for vport's TSAR. If vport added to the group which has configured BW share value and TX rate values of the vport are zero, then scheduler wouldn't be enabled on this vport. Fix that by calling BW normalization if BW share of the new group is configured. Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: Lag, Fix recreation of VF LAG	Maor Gottlieb
	Driver needs to nullify the port select attributes of the LAG when port selection is destroyed, otherwise it breaks recreation of the LAG. It fixes the below kernel oops: [ 587.906377] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 587.908843] #PF: supervisor read access in kernel mode [ 587.910730] #PF: error_code(0x0000) - not-present page [ 587.912580] PGD 0 P4D 0 [ 587.913632] Oops: 0000 [#1] SMP PTI [ 587.914644] CPU: 5 PID: 165 Comm: kworker/u20:5 Tainted: G OE 5.9.0_mlnx #1 [ 587.916152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 587.918332] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core] [ 587.919479] RIP: 0010:mlx5_del_flow_rules+0x10/0x270 [mlx5_core] [ 587.920568] mlx5_core 0000:08:00.1 enp8s0f1: Link up [ 587.920680] Code: c0 09 80 a0 e8 cf 42 a4 e0 48 c7 c3 f4 ff ff ff e8 8a 88 dd e0 e9 ab fe ff ff 0f 1f 44 00 00 41 56 41 55 49 89 fd 41 54 55 53 <48> 8b 47 08 48 8b 68 28 48 85 ed 74 2e 48 8d 7d 38 e8 6a 64 34 e1 [ 587.925116] bond0: (slave enp8s0f1): Enslaving as an active interface with an up link [ 587.930415] RSP: 0018:ffffc9000048fd88 EFLAGS: 00010282 [ 587.930417] RAX: ffff88846c14fac0 RBX: ffff88846cddcb80 RCX: 0000000080400007 [ 587.930417] RDX: 0000000080400008 RSI: ffff88846cddcb80 RDI: 0000000000000000 [ 587.930419] RBP: ffff88845fd80140 R08: 0000000000000001 R09: ffffffffa074ba00 [ 587.938132] R10: ffff88846c14fec0 R11: 0000000000000001 R12: ffff88846c122f10 [ 587.939473] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88846d7a0000 [ 587.940800] FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000 [ 587.942416] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 587.943536] CR2: 0000000000000008 CR3: 000000000240a002 CR4: 0000000000770ee0 [ 587.944904] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 587.946308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 587.947639] PKRU: 55555554 [ 587.948236] Call Trace: [ 587.948834] mlx5_lag_destroy_definer.isra.3+0x16/0x90 [mlx5_core] [ 587.950033] mlx5_lag_destroy_definers+0x5b/0x80 [mlx5_core] [ 587.951128] mlx5_deactivate_lag+0x6e/0x80 [mlx5_core] [ 587.952146] mlx5_do_bond+0x150/0x450 [mlx5_core] [ 587.953086] mlx5_do_bond_work+0x3e/0x50 [mlx5_core] [ 587.954086] process_one_work+0x1eb/0x3e0 [ 587.954899] worker_thread+0x2d/0x3c0 [ 587.955656] ? process_one_work+0x3e0/0x3e0 [ 587.956493] kthread+0x115/0x130 [ 587.957174] ? kthread_park+0x90/0x90 [ 587.957929] ret_from_fork+0x1f/0x30 [ 587.973055] ---[ end trace 71ccd6eca89f5513 ]--- Fixes: b7267869e923 ("net/mlx5: Lag, add support to create/destroy/modify port selection") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30	net/mlx5: Move MODIFY_RQT command to ignore list in internal error state	Moshe Shemesh
	When the device is in internal error state, command interface isn't accessible and the driver decides which commands to fail and which to ignore. Move the MODIFY_RQT command to the ignore list in order to avoid the following redundant warning messages in internal error state: mlx5_core 0000:82:00.1: mlx5e_rss_disable:419:(pid 23754): Failed to redirect RQT 0x0 to drop RQ 0xc00848: err = -5 mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:598:(pid 23754): Failed to redirect direct RQT 0x1 to drop RQ 0xc00848 (channel 0): err = -5 mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:607:(pid 23754): Failed to redirect XSK RQT 0x19 to drop RQ 0xc00848 (channel 0): err = -5 Fixes: 43ec0f41fa73 ("net/mlx5e: Hide all implementation details of mlx5e_rx_res") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>