linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-01-10	ACPI: video: Allow selecting NVidia-WMI-EC or Apple GMUX backlight from the ↵	Hans de Goede
	cmdline The patches adding NVidia-WMI-EC and Apple GMUX backlight detection support to acpi_video_get_backlight_type(), forgot to update acpi_video_parse_cmdline() to allow manually selecting these from the commandline. Add support for these to acpi_video_parse_cmdline(). Fixes: fe7aebb40d42 ("ACPI: video: Add Nvidia WMI EC brightness control detection (v3)") Fixes: 21245df307cb ("ACPI: video: Add Apple GMUX brightness control detection") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-10	ACPI: resource: Skip IRQ override on Asus Expertbook B2402CBA	Tamim Khan
	Like the Asus Expertbook B2502CBA and various Asus Vivobook laptops, the Asus Expertbook B2402CBA has an ACPI DSDT table that describes IRQ 1 as ActiveLow while the kernel overrides it to Edge_High. This prevents the keyboard from working. To fix this issue, add this laptop to the skip_override_table so that the kernel does not override IRQ 1. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216864 Tested-by: zelenat <zelenat@gmail.com> Signed-off-by: Tamim Khan <tamim@fusetak.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-10	vfio/type1: Respect IOMMU reserved regions in vfio_test_domain_fgsp()	Niklas Schnelle
	Since commit cbf7827bc5dc ("iommu/s390: Fix potential s390_domain aperture shrinking") the s390 IOMMU driver uses reserved regions for the system provided DMA ranges of PCI devices. Previously it reduced the size of the IOMMU aperture and checked it on each mapping operation. On current machines the system denies use of DMA addresses below 2^32 for all PCI devices. Usually mapping IOVAs in a reserved regions is harmless until a DMA actually tries to utilize the mapping. However on s390 there is a virtual PCI device called ISM which is implemented in firmware and used for cross LPAR communication. Unlike real PCI devices this device does not use the hardware IOMMU but inspects IOMMU translation tables directly on IOTLB flush (s390 RPCIT instruction). If it detects IOVA mappings outside the allowed ranges it goes into an error state. This error state then causes the device to be unavailable to the KVM guest. Analysing this we found that vfio_test_domain_fgsp() maps 2 pages at DMA address 0 irrespective of the IOMMUs reserved regions. Even if usually harmless this seems wrong in the general case so instead go through the freshly updated IOVA list and try to find a range that isn't reserved, and fits 2 pages, is PAGE_SIZE * 2 aligned. If found use that for testing for fine grained super pages. Fixes: af029169b8fd ("vfio/type1: Check reserved region conflict and update iova list") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230110164427.4051938-2-schnelle@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-01-10	drm/i915/gt: Reset twice	Chris Wilson
	After applying an engine reset, on some platforms like Jasperlake, we occasionally detect that the engine state is not cleared until shortly after the resume. As we try to resume the engine with volatile internal state, the first request fails with a spurious CS event (it looks like it reports a lite-restore to the hung context, instead of the expected idle->active context switch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212161338.1007659-1-andi.shyti@linux.intel.com (cherry picked from commit 3db9d590557da3aa2c952f2fecd3e9b703dad790) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-10	PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe()	Christophe JAILLET
	If an error occurs after a successful pm_genpd_init() call, it should be undone by a corresponding pm_genpd_remove(). Add the missing call in the error handling path, as already done in the remove function. Fixes: bf6910abf548 ("power: avs: Add support for CPR (Core Power Reduction)") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/0f520597dbad89ab99c217c8986912fa53eaf5f9.1671293108.git.christophe.jaillet@wanadoo.fr
2023-01-10	drm/amdgpu: fix pipeline sync v2	Christian König
	This fixes a potential memory leak of dma_fence objects in the CS code as well as glitches in firefox because of missing pipeline sync. v2: use the scheduler instead of the fence context Signed-off-by: Christian König <christian.koenig@amd.com> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2323 Tested-by: Michal Kubecek mkubecek@suse.cz Tested-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230109130120.73389-1-christian.koenig@amd.com
2023-01-10	IB/hfi1: Remove user expected buffer invalidate race	Dean Luick
	During setup, there is a possible race between a page invalidate and hardware programming. Add a covering invalidate over the user target range during setup. If anything within that range is invalidated during setup, fail the setup. Once set up, each TID will have its own invalidate callback and invalidate. Fixes: 3889551db212 ("RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv") Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/167328549178.1472310.9867497376936699488.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	IB/hfi1: Immediately remove invalid memory from hardware	Dean Luick
	When a user expected receive page is unmapped, it should be immediately removed from hardware rather than depend on a reaction from user space. Fixes: 2677a7680e77 ("IB/hfi1: Fix memory leak during unexpected shutdown") Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/167328548663.1472310.7871808081861622659.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	IB/hfi1: Fix expected receive setup error exit issues	Dean Luick
	Fix three error exit issues in expected receive setup. Re-arrange error exits to increase readability. Issues and fixes: 1. Possible missed page unpin if tidlist copyout fails and not all pinned pages where made part of a TID. Fix: Unpin the unused pages. 2. Return success with unset return values tidcnt and length when no pages were pinned. Fix: Return -ENOSPC if no pages were pinned. 3. Return success with unset return values tidcnt and length when no rcvarray entries available. Fix: Return -ENOSPC if no rcvarray entries are available. Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body") Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres") Fixes: f404ca4c7ea8 ("IB/hfi1: Refactor hfi_user_exp_rcv_setup() IOCTL") Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/167328548150.1472310.1492305874804187634.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	IB/hfi1: Reserve user expected TIDs	Dean Luick
	To avoid a race, reserve the number of user expected TIDs before setup. Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body") Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/167328547636.1472310.7419712824785353905.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	IB/hfi1: Reject a zero-length user expected buffer	Dean Luick
	A zero length user buffer makes no sense and the code does not handle it correctly. Instead, reject a zero length as invalid. Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres") Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/167328547120.1472310.6362802432127399257.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	RDMA/core: Fix ib block iterator counter overflow	Yonatan Nachum
	When registering a new DMA MR after selecting the best aligned page size for it, we iterate over the given sglist to split each entry to smaller, aligned to the selected page size, DMA blocks. In given circumstances where the sg entry and page size fit certain sizes and the sg entry is not aligned to the selected page size, the total size of the aligned pages we need to cover the sg entry is >= 4GB. Under this circumstances, while iterating page aligned blocks, the counter responsible for counting how much we advanced from the start of the sg entry is overflowed because its type is u32 and we pass 4GB in size. This can lead to an infinite loop inside the iterator function because the overflow prevents the counter to be larger than the size of the sg entry. Fix the presented problem by changing the advancement condition to eliminate overflow. Backtrace: [ 192.374329] efa_reg_user_mr_dmabuf [ 192.376783] efa_register_mr [ 192.382579] pgsz_bitmap 0xfffff000 rounddown 0x80000000 [ 192.386423] pg_sz [0x80000000] umem_length[0xc0000000] [ 192.392657] start 0x0 length 0xc0000000 params.page_shift 31 params.page_num 3 [ 192.399559] hp_cnt[3], pages_in_hp[524288] [ 192.403690] umem->sgt_append.sgt.nents[1] [ 192.407905] number entries: [1], pg_bit: [31] [ 192.411397] biter->__sg_nents [1] biter->__sg [0000000008b0c5d8] [ 192.415601] biter->__sg_advance [665837568] sg_dma_len[3221225472] [ 192.419823] biter->__sg_nents [1] biter->__sg [0000000008b0c5d8] [ 192.423976] biter->__sg_advance [2813321216] sg_dma_len[3221225472] [ 192.428243] biter->__sg_nents [1] biter->__sg [0000000008b0c5d8] [ 192.432397] biter->__sg_advance [665837568] sg_dma_len[3221225472] Fixes: a808273a495c ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks") Signed-off-by: Yonatan Nachum <ynachum@amazon.com> Link: https://lore.kernel.org/r/20230109133711.13678-1-ynachum@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10	octeontx2-pf: Fix resource leakage in VF driver unbind	Hariprasad Kelam
	resources allocated like mcam entries to support the Ntuple feature and hash tables for the tc feature are not getting freed in driver unbind. This patch fixes the issue. Fixes: 2da489432747 ("octeontx2-pf: devlink params support to set mcam entry count") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Link: https://lore.kernel.org/r/20230109061325.21395-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-10	nvme: don't allow unprivileged passthrough on partitions	Christoph Hellwig
	Passthrough commands can always access the entire device, and thus submitting them on partitions is an privelege escalation. In hindsight we should have never allowed any passthrough commands on partitions, but it's probably too late to change that decision now. Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2023-01-10	nvme: replace the "bool vec" arguments with flags in the ioctl path	Christoph Hellwig
	To prepare for passing down more information, replace the boolean vec argument with a more extensible flags one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2023-01-10	nvme: remove __nvme_ioctl	Christoph Hellwig
	Open code __nvme_ioctl in the two callers to make future changes that pass down additional paramters in the ioctl path easier. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2023-01-10	nvme-pci: fix error handling in nvme_pci_enable()	Tong Zhang
	There are two issues in nvme_pci_enable(): 1) If pci_alloc_irq_vectors() fails, device is left enabled. Fix this by adding a goto disable statement. 2) nvme_pci_configure_admin_queue could return -ENODEV, in this case, we will need to free IRQ properly. Otherwise the following warning could be triggered: [ 5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140 [ 5.290547] Call Trace: [ 5.290626] <TASK> [ 5.290695] msi_remove_device_irq_domain+0xc9/0xf0 [ 5.290843] msi_device_data_release+0x15/0x80 [ 5.290978] release_nodes+0x58/0x90 [ 5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80 [ 5.297573] Call Trace: [ 5.297651] <TASK> [ 5.297719] release_nodes+0x58/0x90 [ 5.297831] devres_release_all+0xef/0x140 [ 5.298339] device_unbind_cleanup+0x11/0xc0 [ 5.298479] really_probe+0x296/0x320 Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable") Co-developed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-01-10	nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers	Hector Martin
	This mirrors the quirk added to Apple Silicon controllers in apple.c. These controllers do not support the Active NS ID List command and behave identically to the SoC version judging by existing user reports/syslogs, so will need the same fix. This quirk reverts back to NVMe 1.0 behavior and disables the broken commands. Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") Signed-off-by: Hector Martin <marcan@marcan.st> Tested-by: Orlando Chamberlain <orlandoch.dev@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-01-10	nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression	Hector Martin
	From the get-go, this driver and the ANS syslog have been complaining about namespace identification. In 6.2-rc1, commit 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") regressed the driver by no longer allowing fallback to sequential namespace scans, leaving us with no namespaces. It turns out that the real problem is that this controller claiming NVMe 1.1 compat is treating the CNS field as a binary field, as in NVMe 1.0. This already has a quirk, NVME_QUIRK_IDENTIFY_CNS, so set it for the controller to fix all this nonsense (including other errors triggered by other CNS commands). Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") Fixes: 5bd2927aceba ("nvme-apple: Add initial Apple SoC NVMe driver") Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-01-09	net/mlx5e: Fix macsec possible null dereference when updating MAC security ↵	Emeel Hakim
	entity (SecY) Upon updating MAC security entity (SecY) in hw offload path, the macsec security association (SA) initialization routine is called. In case of extended packet number (epn) is enabled the salt and ssci attributes are retrieved using the MACsec driver rx_sa context which is unavailable when updating a SecY property such as encoding-sa hence the null dereference. Fix by using the provided SA to set those attributes. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: Fix macsec ssci attribute handling in offload path	Emeel Hakim
	Currently when macsec offload is set with extended packet number (epn) enabled, the driver wrongly deduce the short secure channel identifier (ssci) from the salt instead of the stand alone ssci attribute as it should, consequently creating a mismatch between the kernel and driver's ssci values. Fix by using the ssci value from the relevant attribute. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5: E-switch, Coverity: overlapping copy	Shay Drory
	When a capability is set via port function caps callbacks, a memcpy() is performed in which the source and the target are the same address, e.g.: the copy is redundant. Hence, Remove it. Discovered by Coverity. Fixes: 7db98396ef45 ("net/mlx5: E-Switch, Implement devlink port function cmds to control RoCE") Fixes: e5b9642a33be ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: Don't support encap rules with gbp option	Gavin Li
	Previously, encap rules with gbp option would be offloaded by mistake but driver does not support gbp option offload. To fix this issue, check if the encap rule has gbp option and don't offload the rule Fixes: d8f9dfae49ce ("net: sched: allow flower to match vxlan options") Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5: Fix ptp max frequency adjustment range	Rahul Rameshbabu
	.max_adj of ptp_clock_info acts as an absolute value for the amount in ppb that can be set for a single call of .adjfine. This means that a single call to .getfine cannot be greater than .max_adj or less than -(.max_adj). Provides correct value for max frequency adjustment value supported by devices. Fixes: 3d8c38af1493 ("net/mlx5e: Add PTP Hardware Clock (PHC) support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: Fix memory leak on updating vport counters	Aya Levin
	When updating statistics driver queries the vport's counters. On fail, add error path releasing the allocated buffer avoiding memory leak. Fixes: 64b68e369649 ("net/mlx5: Refactor and expand rep vport stat group") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: TC, Restore pkt rate policing support	Oz Shlomo
	The offending commit removed the support for all packet rate metering. Restore the pkt rate metering support by removing the restriction. Fixes: 3603f26633e7 ("net/mlx5e: TC, allow meter jump control action") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: TC, ignore match level for post meter rules	Oz Shlomo
	The post meter table only matches on reg_c5. As such, the inner/outer match levels are irrelevant for the match critieria. The cited patch only sets the outer criteria to none, thus setting the inner match level for encapsulated packets. This caused rules with police action on tunnel devices to not find an existing flow group for the match criteria, thus failing to offload the rule. Set both the inner and outer match levels to none for post_meter rules. Fixes: 0d8c38d44f33 ("net/mlx5e: TC, init post meter rules with branching attributes") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: IPoIB, Fix child PKEY interface stats on rx path	Dragos Tatulea
	The current code always does the accounting using the stats from the parent interface (linked in the rq). This doesn't work when there are child interfaces configured. Fix this behavior by always using the stats from the child interface priv. This will also work for parent only interfaces: the child (netdev) and parent netdev (rq->netdev) will point to the same thing. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: IPoIB, Block PKEY interfaces with less rx queues than parent	Dragos Tatulea
	A user is able to configure an arbitrary number of rx queues when creating an interface via netlink. This doesn't work for child PKEY interfaces because the child interface uses the parent receive channels. Although the child shares the parent's receive channels, the number of rx queues is important for the channel_stats array: the parent's rx channel index is used to access the child's channel_stats. So the array has to be at least as large as the parent's rx queue size for the counting to work correctly and to prevent out of bound accesses. This patch checks for the mentioned scenario and returns an error when trying to create the interface. The error is propagated to the user. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: IPoIB, Block queue count configuration when sub interfaces are ↵	Dragos Tatulea
	present PKEY sub interfaces share the receive queues with the parent interface. While setting the sub interface queue count is not supported, it is currently possible to change the number of queues of the parent interface. Thus we can end up with inconsistent queue sizes between the parent and its sub interfaces. This change disallows setting the queue count on the parent interface when sub interfaces are present. This is achieved by introducing an explicit reference to the parent netdev in the mlx5i_priv of the child interface. An additional counter is also required on the parent side to detect when sub interfaces are attached and for proper cleanup. The rtnl lock is taken during the ethtool op and the sub interface ndo_init/uninit ops. There is no race here around counting the sub interfaces, reading the sub interfaces and setting the number of channels. The ASSERT_RTNL was added to document that. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: Verify dev is present for fix features ndo	Roy Novich
	The native NIC port net device instance is being used as Uplink representor. While changing profiles private resources are not available, fix features ndo does not check if the netdev is present. Add driver protection to verify private resources are ready. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5: Fix command stats access after free	Moshe Shemesh
	Command may fail while driver is reloading and can't accept FW commands till command interface is reinitialized. Such command failure is being logged to command stats. This results in NULL pointer access as command stats structure is being freed and reallocated during mlx5 devlink reload (see kernel log below). Fix it by making command stats statically allocated on driver probe. Kernel log: [ 2394.808802] BUG: unable to handle kernel paging request at 000000000002a9c0 [ 2394.810610] PGD 0 P4D 0 [ 2394.811811] Oops: 0002 [#1] SMP NOPTI ... [ 2394.815482] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0 ... [ 2394.829505] Call Trace: [ 2394.830667] _raw_spin_lock_irq+0x23/0x26 [ 2394.831858] cmd_status_err+0x55/0x110 [mlx5_core] [ 2394.833020] mlx5_access_reg+0xe7/0x150 [mlx5_core] [ 2394.834175] mlx5_query_port_ptys+0x78/0xa0 [mlx5_core] [ 2394.835337] mlx5e_ethtool_get_link_ksettings+0x74/0x590 [mlx5_core] [ 2394.836454] ? kmem_cache_alloc_trace+0x140/0x1c0 [ 2394.837562] __rh_call_get_link_ksettings+0x33/0x100 [ 2394.838663] ? __rtnl_unlock+0x25/0x50 [ 2394.839755] __ethtool_get_link_ksettings+0x72/0x150 [ 2394.840862] duplex_show+0x6e/0xc0 [ 2394.841963] dev_attr_show+0x1c/0x40 [ 2394.843048] sysfs_kf_seq_show+0x9b/0x100 [ 2394.844123] seq_read+0x153/0x410 [ 2394.845187] vfs_read+0x91/0x140 [ 2394.846226] ksys_read+0x4f/0xb0 [ 2394.847234] do_syscall_64+0x5b/0x1a0 [ 2394.848228] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 34f46ae0d4b3 ("net/mlx5: Add command failures data to debugfs") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5e: TC, Keep mod hdr actions after mod hdr alloc	Ariel Levkovich
	When offloading TC NIC rule which has mod_hdr action, the mod_hdr actions list is freed upon mod_hdr allocation. In the new format of handling multi table actions and CT in particular, the mod_hdr actions list is still relevant when setting the pre and post rules and therefore, freeing the list may cause adding rules which don't set the FTE_ID. Therefore, the mod_hdr actions list needs to be kept for the pre/post flows as well and should be left for these handler to be freed. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5: check attr pointer validity before dereferencing it	Ariel Levkovich
	Fix attr pointer validity checks after it was already dereferenced. Fixes: cb0d54cbf948 ("net/mlx5e: Fix wrong source vport matching on tunnel rule") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-09	net/mlx5: DR, Fix 'stack frame size exceeds limit' error in dr_rule	Yevgeny Kliteynik
	If the kernel configuration asks the compiler to check frame limit of 1K, dr_rule_create_rule_nic exceed this limit: "stack frame size (1184) exceeds limit (1024)" Fixing this issue by checking configured frame limit and using the optimization STE array only for cases with the usual 2K (or larger) stack size warning. Fixes: b9b81e1e9382 ("net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2023-01-09	Revert "r8169: disable detection of chip version 36"	Heiner Kallweit
	This reverts commit 42666b2c452ce87894786aae05e3fad3cfc6cb59. This chip version seems to be very rare, but it exits in consumer devices, see linked report. Link: https://stackoverflow.com/questions/75049473/cant-setup-a-wired-network-in-archlinux-fresh-install Fixes: 42666b2c452c ("r8169: disable detection of chip version 36") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/42e9674c-d5d0-a65a-f578-e5c74f244739@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10	cpufreq: armada-37xx: stop using 0 as NULL pointer	Miles Chen
	Use NULL for NULL pointer to fix the following sparse warning: drivers/cpufreq/armada-37xx-cpufreq.c:448:32: sparse: warning: Using plain integer as NULL pointer Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-01-09	drm/vmwgfx: Remove rcu locks from user resources	Zack Rusin
	User resource lookups used rcu to avoid two extra atomics. Unfortunately the rcu paths were buggy and it was easy to make the driver crash by submitting command buffers from two different threads. Because the lookups never show up in performance profiles replace them with a regular spin lock which fixes the races in accesses to those shared resources. Fixes kernel oops'es in IGT's vmwgfx execution_buffer stress test and seen crashes with apps using shared resources. Fixes: e14c02e6b699 ("drm/vmwgfx: Look up objects without taking a reference") Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221207172907.959037-1-zack@kde.org
2023-01-10	drm/virtio: Fix GEM handle creation UAF	Rob Clark
	Userspace can guess the handle value and try to race GEM object creation with handle close, resulting in a use-after-free if we dereference the object after dropping the handle's reference. For that reason, dropping the handle's reference must be done after we are done dereferencing the object. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Fixes: 62fb7a5e1096 ("virtio-gpu: add 3d/virgl support") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221216233355.542197-2-robdclark@gmail.com
2023-01-09	Merge tag 'imx-fixes-6.2' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.2: - Fix i.MX8MP DT for missing GPC Interrupt, power-domain typo and USB clock error. - Fix mach-imx cpu code to add missing of_node_put() call. - A couple of verdin-imx8mm DT fixes for audio playback support. - Fix pca9547 i2c-mux node name for i.MX and Vybrid device trees. - Fix an imx93-11x11-evk uSDHC pad setting problem that causes Micron eMMC CMD8 CRC error in HS400ES/HS400 mode. - A couple of imx8mp-blk-ctrl driver fixes from Lucas Stach, enabling pixclk with HDMI_TX_PHY PD, dropping power device name setting. - Fix the error check for of_clk_get_by_name() in soc-imx8m driver. - Other various DT fixes and cleanups. * tag 'imx-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: (22 commits) soc: imx8m: Fix incorrect check for of_clk_get_by_name() arm64: dts: imx8mm-venice-gw7901: fix USB2 controller OC polarity arm64: dts: imx8mp-evk: pcie0-refclk cosmetic cleanup arm64: dts: imx8mp: Fix power-domain typo arm64: dts: imx8mp: Fix missing GPC Interrupt soc: imx: imx8mp-blk-ctrl: don't set power device name arm64: dts: imx8mm: Drop xtal clock specifier from eDM SBC ARM: imx: add missing of_node_put() arm64: dts: imx93-11x11-evk: correct clock and strobe pad setting arm64: dts: verdin-imx8mm: fix dev board audio playback arm64: dts: imx8mq-thor96: fix no-mmc property for SDHCI arm64: dts: imx8mm-beacon: Fix ecspi2 pinmux arm64: dts: freescale: Fix pca954x i2c-mux node names ARM: dts: vf610: Fix pca9548 i2c-mux node names ARM: dts: imx: Fix pca9547 i2c-mux node name arm64: dts: verdin-imx8mm: fix dahlia audio playback ARM: dts: imx6qdl-gw560x: Remove incorrect 'uart-has-rtscts' ARM: dts: imx7d-pico: Use 'clock-frequency' ARM: dts: imx6ul-pico-dwarf: Use 'clock-frequency' arm64: dts: imx8mp-phycore-som: Remove invalid PMIC property ... Link: https://lore.kernel.org/r/20230102132016.GA10699@T480 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-09	Merge tag 'reset-fixes-for-v6.2' of git://git.pengutronix.de/pza/linux into ↵	Arnd Bergmann
	arm/fixes Reset controller fixes for v6.2 Avoid a build error by disabling the invalid combination of building reset-ti-sci built-in but ti-sci as a module. Fix a possible NULL pointer dereference in reset-uniphier-glue in case platform_get_resource() fails. * tag 'reset-fixes-for-v6.2' of git://git.pengutronix.de/pza/linux: reset: uniphier-glue: Fix possible null-ptr-deref reset: ti-sci: honor TI_SCI_PROTOCOL setting when not COMPILE_TEST Link: https://lore.kernel.org/r/20230104095855.3809733-1-p.zabel@pengutronix.de Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-09	Merge tag 'scmi-fixes-6.2' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm SCMI fixes for v6.2 Few fixes addressing: 1. Possible compromise with the shorter message size from a misbheaving SCMI platform firmware. The shmem accesses are now hardened to handle the same in fetch_notification and fetch_response. 2. Possible unsafe locking scenario which is solved by calling virtio_break_device() before getting hold of vioch->lock. 3. Possible stale error status reported from a previous message being used again as it is not cleared. * tag 'scmi-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Fix virtio channels cleanup on shutdown firmware: arm_scmi: Harden shared memory access in fetch_notification firmware: arm_scmi: Harden shared memory access in fetch_response firmware: arm_scmi: Clear stale xfer->hdr.status Link: https://lore.kernel.org/r/20230106093909.652657-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-09	Merge tag 'memory-controller-drv-fixes-6.2' of ↵	Arnd Bergmann
	https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/fixes Memory controller drivers - fixes for v6.2 Broken in v6.2: 1. OMAP GPMC: do not fail if "gpmc,wait-pin" optional property (introduced for v6.2) is missing. Broken earlier: 1. Tegra MC: Drop SID override programming as it is handled by bootloader and doing it in the kernel can cause unexpected results. 2. Atmel SDRAMC, MVEBU devbus: Add missing clock unprepare/disable in exit and error paths. * tag 'memory-controller-drv-fixes-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl: memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe() memory: atmel-sdramc: Fix missing clk_disable_unprepare in atmel_ramc_probe() memory: tegra: Remove clients SID override programming memory: omap-gpmc: fix wait pin validation Link: https://lore.kernel.org/r/20230109150322.329614-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-09	ice: Add check for kzalloc	Jiasheng Jiang
	Add the check for the return value of kzalloc in order to avoid NULL pointer dereference. Moreover, use the goto-label to share the clean code. Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-09	ice: Fix potential memory leak in ice_gnss_tty_write()	Yuan Can
	The ice_gnss_tty_write() return directly if the write_buf alloc failed, leaking the cmd_buf. Fix by free cmd_buf if write_buf alloc failed. Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-09	drm/amdgpu: Fixed bug on error when unloading amdgpu	YiPeng Chai
	Fixed bug on error when unloading amdgpu. The error message is as follows: [ 377.706202] kernel BUG at drivers/gpu/drm/drm_buddy.c:278! [ 377.706215] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 377.706222] CPU: 4 PID: 8610 Comm: modprobe Tainted: G IOE 6.0.0-thomas #1 [ 377.706231] Hardware name: ASUS System Product Name/PRIME Z390-A, BIOS 2004 11/02/2021 [ 377.706238] RIP: 0010:drm_buddy_free_block+0x26/0x30 [drm_buddy] [ 377.706264] Code: 00 00 00 90 0f 1f 44 00 00 48 8b 0e 89 c8 25 00 0c 00 00 3d 00 04 00 00 75 10 48 8b 47 18 48 d3 e0 48 01 47 28 e9 fa fe ff ff <0f> 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 48 89 f5 53 [ 377.706282] RSP: 0018:ffffad2dc4683cb8 EFLAGS: 00010287 [ 377.706289] RAX: 0000000000000000 RBX: ffff8b1743bd5138 RCX: 0000000000000000 [ 377.706297] RDX: ffff8b1743bd5160 RSI: ffff8b1743bd5c78 RDI: ffff8b16d1b25f70 [ 377.706304] RBP: ffff8b1743bd59e0 R08: 0000000000000001 R09: 0000000000000001 [ 377.706311] R10: ffff8b16c8572400 R11: ffffad2dc4683cf0 R12: ffff8b16d1b25f70 [ 377.706318] R13: ffff8b16d1b25fd0 R14: ffff8b1743bd59c0 R15: ffff8b16d1b25f70 [ 377.706325] FS: 00007fec56c72c40(0000) GS:ffff8b1836500000(0000) knlGS:0000000000000000 [ 377.706334] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 377.706340] CR2: 00007f9b88c1ba50 CR3: 0000000110450004 CR4: 00000000003706e0 [ 377.706347] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 377.706354] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 377.706361] Call Trace: [ 377.706365] <TASK> [ 377.706369] drm_buddy_free_list+0x2a/0x60 [drm_buddy] [ 377.706376] amdgpu_vram_mgr_fini+0xea/0x180 [amdgpu] [ 377.706572] amdgpu_ttm_fini+0x12e/0x1a0 [amdgpu] [ 377.706650] amdgpu_bo_fini+0x22/0x90 [amdgpu] [ 377.706727] gmc_v11_0_sw_fini+0x26/0x30 [amdgpu] [ 377.706821] amdgpu_device_fini_sw+0xa1/0x3c0 [amdgpu] [ 377.706897] amdgpu_driver_release_kms+0x12/0x30 [amdgpu] [ 377.706975] drm_dev_release+0x20/0x40 [drm] [ 377.707006] release_nodes+0x35/0xb0 [ 377.707014] devres_release_all+0x8b/0xc0 [ 377.707020] device_unbind_cleanup+0xe/0x70 [ 377.707027] device_release_driver_internal+0xee/0x160 [ 377.707033] driver_detach+0x44/0x90 [ 377.707039] bus_remove_driver+0x55/0xe0 [ 377.707045] pci_unregister_driver+0x3b/0x90 [ 377.707052] amdgpu_exit+0x11/0x6c [amdgpu] [ 377.707194] __x64_sys_delete_module+0x142/0x2b0 [ 377.707201] ? fpregs_assert_state_consistent+0x22/0x50 [ 377.707208] ? exit_to_user_mode_prepare+0x3e/0x190 [ 377.707215] do_syscall_64+0x38/0x90 [ 377.707221] entry_SYSCALL_64_after_hwframe+0x63/0xcd Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-09	drm/amd: Delay removal of the firmware framebuffer	Mario Limonciello
	Removing the firmware framebuffer from the driver means that even if the driver doesn't support the IP blocks in a GPU it will no longer be functional after the driver fails to initialize. This change will ensure that unsupported IP blocks at least cause the driver to work with the EFI framebuffer. Cc: stable@vger.kernel.org Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09	drm/amdgpu: Fix potential NULL dereference	Luben Tuikov
	Fix potential NULL dereference, in the case when "man", the resource manager might be NULL, when/if we print debug information. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Cc: Dan Carpenter <error27@gmail.com> Cc: kernel test robot <lkp@intel.com> Fixes: 7554886daa31ea ("drm/amdgpu: Fix size validation for non-exclusive domains (v4)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09	rtc: efi: Enable SET/GET WAKEUP services as optional	Shanker Donthineni
	The current implementation of rtc-efi is expecting all the 4 time services GET{SET}_TIME{WAKEUP} must be supported by UEFI firmware. As per the EFI_RT_PROPERTIES_TABLE, the platform specific implementations can choose to enable selective time services based on the RTC device capabilities. This patch does the following changes to provide GET/SET RTC services on platforms that do not support the WAKEUP feature. 1) Relax time services cap check when creating a platform device. 2) Clear RTC_FEATURE_ALARM bit in the absence of WAKEUP services. 3) Conditional alarm entries in '/proc/driver/rtc'. Cc: <stable@vger.kernel.org> # v6.0+ Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Link: https://lore.kernel.org/r/20230102230630.192911-1-sdonthineni@nvidia.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-01-09	cxl: fix cxl_report_and_clear() RAS UE addr mis-assignment	Dave Jiang
	'addr' that contains RAS UE register address is re-assigned to RAS_CAP_CONTROL offset if there are multiple UE errors. Use different addr variable to avoid the reassignment mistake. Fixes: 2905cb5236cb ("cxl/pci: Add (hopeful) error handling support") Reported-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167302318779.580155.15233596744650706167.stgit@djiang5-mobl3.local Signed-off-by: Dan Williams <dan.j.williams@intel.com>