git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-03-05	btrfs: mark btrfs_put_caching_control() static	Lijuan Li
	btrfs_put_caching_control() is only used in block-group.c, so mark it static. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Lijuan Li <lilijuan@iscas.ac.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: remove SLAB_MEM_SPREAD flag use	Chengming Zhou
	The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was removed as of v6.8-rc1, so it became a dead flag since the commit 16a1d968358a ("mm/slab: remove mm/slab.c and slab_def.h"). And the series[1] went on to mark it obsolete to avoid confusion for users. Here we can just remove all its users, which has no functional change. [1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: qgroup: always free reserved space for extent records	Qu Wenruo
	[BUG] If qgroup is marked inconsistent (e.g. caused by operations needing full subtree rescan, like creating a snapshot and assign to a higher level qgroup), btrfs would immediately start leaking its data reserved space. The following script can easily reproduce it: mkfs.btrfs -O quota -f $dev mount $dev $mnt btrfs subvolume create $mnt/subv1 btrfs qgroup create 1/0 $mnt # This snapshot creation would mark qgroup inconsistent, # as the ownership involves different higher level qgroup, thus # we have to rescan both source and snapshot, which can be very # time consuming, thus here btrfs just choose to mark qgroup # inconsistent, and let users to determine when to do the rescan. btrfs subv snapshot -i 1/0 $mnt/subv1 $mnt/snap1 # Now this write would lead to qgroup rsv leak. xfs_io -f -c "pwrite 0 64k" $mnt/file1 # And at unmount time, btrfs would report 64K DATA rsv space leaked. umount $mnt And we would have the following dmesg output for the unmount: BTRFS info (device dm-1): last unmount of filesystem 14a3d84e-f47b-4f72-b053-a8a36eef74d3 BTRFS warning (device dm-1): qgroup 0/5 has unreleased space, type 0 rsv 65536 [CAUSE] Since commit e15e9f43c7ca ("btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting"), we introduce a mode for btrfs qgroup to skip the timing consuming backref walk, if the qgroup is already inconsistent. But this skip also covered the data reserved freeing, thus the qgroup reserved space for each newly created data extent would not be freed, thus cause the leakage. [FIX] Make the data extent reserved space freeing mandatory. The qgroup reserved space handling is way cheaper compared to the backref walking part, and we always have the super sensitive leak detector, thus it's definitely worth to always free the qgroup reserved data space. Reported-by: Fabian Vogt <fvogt@suse.com> Fixes: e15e9f43c7ca ("btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting") CC: stable@vger.kernel.org # 6.1+ Link: https://bugzilla.suse.com/show_bug.cgi?id=1216196 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: tree-checker: dump the page status if hit something wrong	Qu Wenruo
	[BUG] There is a bug report about very suspicious tree-checker got triggered: BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] SELinux: inode_doinit_use_xattr: getxattr returned 117 for dev=dm-0 ino=5737268 [ANALYZE] The root cause is still unclear, but there are some clues already: - Unaligned eb bytenr The block bytenr is 8550954455682405139, which is not even aligned to 2. This bytenr is fetched from extent buffer header, not from eb->start. This means, at the initial time of read, eb header bytenr is still correct (the very basis check to continue read), but later something wrong happened, got at least the first page corrupted. Thus we got such obviously incorrect value. - Invalid extent buffer header owner The read itself is triggered for subvolume 256, but the eb header owner is 11858205567642294356, which is not really possible. The problem here is, subvolume id is limited to (1 << 48 - 1), and this one definitely goes beyond that limit. So this value is another garbage. We already got two garbage from an extent buffer, which passed the initial bytenr and csum checks, but later the contents become garbage at some point. This looks like a page lifespan problem (e.g. we didn't properly hold the page). [ENHANCEMENT] The current tree-checker only outputs things from the extent buffer, nothing with the page status. So this patch would enhance the tree-checker output by also dumping the first page, which would look like this: page:00000000aa9f3ce8 refcount:4 mapcount:0 mapping:00000000169aa6b6 index:0x1d0c pfn:0x1022e5 memcg:ffff888103456000 aops:btree_aops [btrfs] ino:1 flags: 0x2ffff0000008000(private\|node=0\|zone=2\|lastcpupid=0xffff) page_type: 0xffffffff() raw: 02ffff0000008000 0000000000000000 dead000000000122 ffff88811e06e220 raw: 0000000000001d0c ffff888102fdb1d8 00000004ffffffff ffff888103456000 page dumped because: eb page dump BTRFS critical (device dm-3): corrupt leaf: root=5 block=30457856 slot=6 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 10617606235235216665, should be aligned to 4096 BTRFS error (device dm-3): read time tree block corruption detected on logical 30457856 mirror 1 From the dump we can see some extra info, something can help us to do extra cross-checks: - Page refcount if it's too low, it definitely means something bad. - Page aops Any mapped eb page should have btree_aops with inode number 1. - Page index Since a mapped eb page should has its bytenr matching the page position, (index << PAGE_SHIFT) should match the bytenr of the bytenr from the critical line. - Page Private flags A mapped eb page should have Private flag set to indicate it's managed by btrfs. Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: compression: remove dead comments in btrfs_compress_heuristic()	Qu Wenruo
	Since commit a440d48c7f93 ("Btrfs: heuristic: implement sampling logic"), btrfs_compress_heuristic() is no longer a simple "return true", but more complex to determine if we should compress. Thus the comment is dead and can be confusing, just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: subpage: make writer lock utilize bitmap	Qu Wenruo
	For the writer counter, it's pretty much the same as the reader counter, and they are exclusive. So move them to the new locked bitmap. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: subpage: make reader lock utilize bitmap	Qu Wenruo
	Currently btrfs_subpage utilizes its atomic member @reader to manage the reader counter. However it is only utilized to prevent the page to be released/unlocked when we still have reads underway. In that use case, we don't really allow multiple readers on the same subpage sector. So here we can introduce a new locked bitmap to represent exactly which subpage range is locked for read. In theory we can remove btrfs_subpage::reader as it's just the set bits of the new locked bitmap. But unfortunately bitmap doesn't provide such handy API yet, so we still keep the reader counter. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: unexport btrfs_subpage_start_writer() and ↵	Qu Wenruo
	btrfs_subpage_end_and_test_writer() Both functions were introduced in commit 1e1de38792e0 ("btrfs: make process_one_page() to handle subpage locking"), but they have never been utilized out of subpage code. So just unexport them. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	btrfs: pass a valid extent map cache pointer to __get_extent_map()	David Sterba
	We can pass a valid em cache pointer down to __get_extent_map() and drop the validity check. This avoids the special case, the call stacks are simple: btrfs_read_folio btrfs_do_readpage __get_extent_map extent_readahead contiguous_readpages btrfs_do_readpage __get_extent_map Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05	nvme: fcloop: make fcloop_class constant	Ricardo B. Marliere
	Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the fcloop_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-05	nvme: fabrics: make nvmf_class constant	Ricardo B. Marliere
	Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the nvmf_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-05	nvme: core: constify struct class usage	Ricardo B. Marliere
	Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the structures nvme_class, nvme_subsys_class and nvme_ns_chr_class to be declared at build time placing them into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-05	ASoC: dt-bindings: nvidia: Fix 'lge' vendor prefix	Rob Herring
	The documented vendor prefix for LG Electronics is 'lg' not 'lge'. Just change the example to 'lg' as there doesn't appear to be any dependency on the existing compatible string. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://msgid.link/r/20240305152131.3424326-1-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05	NFSD: send OP_CB_RECALL_ANY to clients when number of delegations reaches ↵	Dai Ngo
	its limit The NFS server should ask clients to voluntarily return unused delegations when the number of granted delegations reaches the max_delegations. This is so that the server can continue to grant delegations for new requests. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-05	NFSD: Document nfsd_setattr() fill-attributes behavior	Chuck Lever
	Add an explanation to prevent the future removal of the fill- attribute call sites in nfsd_setattr(). Some NFSv3 client implementations don't behave correctly if wcc data is not present in an NFSv3 SETATTR reply. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-05	mei: gsc_proxy: match component when GSC is on different bus	Alexander Usyskin
	On Arrow Lake S systems, MEI is no longer strictly connected to bus 0, while graphics remain exclusively on bus 0. Adapt the component matching logic to accommodate this change: Original behavior: Required both MEI and graphics to be on the same bus 0. New behavior: Only enforces graphics to be on bus 0 (integrated), allowing MEI to reside on any bus. This ensures compatibility with Arrow Lake S and maintains functionality for the legacy systems. Fixes: 1dd924f6885b ("mei: gsc_proxy: add gsc proxy driver") Cc: stable@vger.kernel.org # v6.3+ Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240220200020.231192-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	misc: fastrpc: Pass proper arguments to scm call	Ekansh Gupta
	For CMA memory allocation, ownership is assigned to DSP to make it accessible by the PD running on the DSP. With current implementation HLOS VM is stored in the channel structure during rpmsg_probe and this VM is passed to qcom_scm call as the source VM. The qcom_scm call will overwrite the passed source VM with the next VM which would cause a problem in case the scm call is again needed. Adding a local copy of source VM whereever scm call is made to avoid this problem. Fixes: 0871561055e6 ("misc: fastrpc: Add support for audiopd") Cc: stable <stable@kernel.org> Signed-off-by: Ekansh Gupta <quic_ekangupt@quicinc.com> Reviewed-by: Elliot Berman <quic_eberman@quicinc.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20240224114247.85953-2-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	comedi: comedi_test: Prevent timers rescheduling during deletion	Ian Abbott
	The comedi_test devices have a couple of timers (ai_timer and ao_timer) that can be started to simulate hardware interrupts. Their expiry functions normally reschedule the timer. The driver code calls either del_timer_sync() or del_timer() to delete the timers from the queue, but does not currently prevent the timers from rescheduling themselves so synchronized deletion may be ineffective. Add a couple of boolean members (one for each timer: ai_timer_enable and ao_timer_enable) to the device private data structure to indicate whether the timers are allowed to reschedule themselves. Set the member to true when adding the timer to the queue, and to false when deleting the timer from the queue in the waveform_ai_cancel() and waveform_ao_cancel() functions. The del_timer_sync() function is also called from the waveform_detach() function, but the timer enable members will already be set to false when that function is called, so no change is needed there. Fixes: 403fe7f34e33 ("staging: comedi: comedi_test: fix timer race conditions") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20240214100747.16203-1-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	comedi: comedi_8255: Correct error in subdevice initialization	Frej Drejhammar
	The refactoring done in commit 5c57b1ccecc7 ("comedi: comedi_8255: Rework subdevice initialization functions") to the initialization of the io field of struct subdev_8255_private broke all cards using the drivers/comedi/drivers/comedi_8255.c module. Prior to 5c57b1ccecc7, __subdev_8255_init() initialized the io field in the newly allocated struct subdev_8255_private to the non-NULL callback given to the function, otherwise it used a flag parameter to select between subdev_8255_mmio and subdev_8255_io. The refactoring removed that logic and the flag, as subdev_8255_mm_init() and subdev_8255_io_init() now explicitly pass subdev_8255_mmio and subdev_8255_io respectively to __subdev_8255_init(), only __subdev_8255_init() never sets spriv->io to the supplied callback. That spriv->io is NULL leads to a later BUG: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0010 [#1] SMP PTI CPU: 1 PID: 1210 Comm: systemd-udevd Not tainted 6.7.3-x86_64 #1 Hardware name: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffa3f1c02d7b78 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff91f847aefd00 RCX: 000000000000009b RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff91f840f6fc00 RBP: ffff91f840f6fc00 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 000000000000005f R12: 0000000000000000 R13: 0000000000000000 R14: ffffffffc0102498 R15: ffff91f847ce6ba8 FS: 00007f72f4e8f500(0000) GS:ffff91f8d5c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000010540e000 CR4: 00000000000406f0 Call Trace: <TASK> ? __die_body+0x15/0x57 ? page_fault_oops+0x2ef/0x33c ? insert_vmap_area.constprop.0+0xb6/0xd5 ? alloc_vmap_area+0x529/0x5ee ? exc_page_fault+0x15a/0x489 ? asm_exc_page_fault+0x22/0x30 __subdev_8255_init+0x79/0x8d [comedi_8255] pci_8255_auto_attach+0x11a/0x139 [8255_pci] comedi_auto_config+0xac/0x117 [comedi] ? __pfx___driver_attach+0x10/0x10 pci_device_probe+0x88/0xf9 really_probe+0x101/0x248 __driver_probe_device+0xbb/0xed driver_probe_device+0x1a/0x72 __driver_attach+0xd4/0xed bus_for_each_dev+0x76/0xb8 bus_add_driver+0xbe/0x1be driver_register+0x9a/0xd8 comedi_pci_driver_register+0x28/0x48 [comedi_pci] ? __pfx_pci_8255_driver_init+0x10/0x10 [8255_pci] do_one_initcall+0x72/0x183 do_init_module+0x5b/0x1e8 init_module_from_file+0x86/0xac __do_sys_finit_module+0x151/0x218 do_syscall_64+0x72/0xdb entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7f72f50a0cb9 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 47 71 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffd47e512d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 0000562dd06ae070 RCX: 00007f72f50a0cb9 RDX: 0000000000000000 RSI: 00007f72f52d32df RDI: 000000000000000e RBP: 0000000000000000 R08: 00007f72f5168b20 R09: 0000000000000000 R10: 0000000000000050 R11: 0000000000000246 R12: 00007f72f52d32df R13: 0000000000020000 R14: 0000562dd06785c0 R15: 0000562dcfd0e9a8 </TASK> Modules linked in: 8255_pci(+) comedi_8255 comedi_pci comedi intel_gtt e100(+) acpi_cpufreq rtc_cmos usbhid CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffa3f1c02d7b78 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff91f847aefd00 RCX: 000000000000009b RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff91f840f6fc00 RBP: ffff91f840f6fc00 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 000000000000005f R12: 0000000000000000 R13: 0000000000000000 R14: ffffffffc0102498 R15: ffff91f847ce6ba8 FS: 00007f72f4e8f500(0000) GS:ffff91f8d5c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000010540e000 CR4: 00000000000406f0 This patch simply corrects the above mistake by initializing spriv->io to the given io callback. Fixes: 5c57b1ccecc7 ("comedi: comedi_8255: Rework subdevice initialization functions") Signed-off-by: Frej Drejhammar <frej.drejhammar@gmail.com> Cc: stable@vger.kernel.org Acked-by: Ian Abbott <abbotti@mev.co.uk> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20240211175822.1357-1-frej.drejhammar@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf	Karol Herbst
	If VM_BIND is enabled on the client the legacy submission ioctl can't be used, however if a client tries to do so regardless it will return an error. In this case the clients mutex remained unlocked leading to a deadlock inside nouveau_drm_postclose or any other nouveau ioctl call. Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Cc: Danilo Krummrich <dakr@redhat.com> Cc: <stable@vger.kernel.org> # v6.6+ Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240305133853.2214268-1-kherbst@redhat.com
2024-03-05	xhci: Fix failure to detect ring expansion need.	Mathias Nyman
	Ring expansion checker may incorrectly assume a completely full ring is empty, missing the need for expansion. This is due to a special empty ring case where the dequeue ends up ahead of the enqueue pointer. This is seen when enqueued TRBs fill up exactly a segment, with enqueue then pointing to the end link TRB. Once those TRBs are handled the dequeue pointer will follow the link TRB and end up pointing to the first entry on the next segment, past the enqueue. This same enqueue - dequeue condition can be true if a ring is full, with enqueue ending on that last link TRB before the dequeue pointer on the next segment. This can be seen when queuing several ~510 small URBs via usbfs in one go before a single one is handled (i.e. dequeue not moved from first entry in segment). Expand the ring already when enqueue reaches the link TRB before the dequeue segment, instead of expanding it when enqueue moves into the dequeue segment. Reported-by: Chris Yokum <linux-usb@mail.totalphase.com> Closes: https://lore.kernel.org/all/949223224.833962.1709339266739.JavaMail.zimbra@totalphase.com Tested-by: Chris Yokum <linux-usb@mail.totalphase.com> Fixes: f5af638f0609 ("xhci: Fix transfer ring expansion size calculation") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240305132312.955171-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	Revert "tty: serial: simplify qcom_geni_serial_send_chunk_fifo()"	Douglas Anderson
	This reverts commit 5c7e105cd156fc9adf5294a83623d7a40c15f9b9. As identified by KASAN, the simplification done by the cleanup patch was not legal. >From tracing through the code, it can be seen that we're transmitting from a 4096-byte circular buffer. We copy anywhere from 1-4 bytes from it each time. The simplification runs into trouble when we get near the end of the circular buffer. For instance, we might start out with xmit->tail = 4094 and we want to transfer 4 bytes. With the code before simplification this was no problem. We'd read buf[4094], buf[4095], buf[0], and buf[1]. With the new code we'll do a memcpy(&buf[4094], 4) which reads 2 bytes past the end of the buffer and then skips transmitting what's at buf[0] and buf[1]. KASAN isn't 100% consistent at reporting this for me, but to be extra confident in the analysis, I added traces of the tail and tx_bytes and then wrote a test program: while true; do echo -n "abcdefghijklmnopqrstuvwxyz0" > /dev/ttyMSM0 sleep .1 done I watched the traces over SSH and saw: qcom_geni_serial_send_chunk_fifo: 4093 4 qcom_geni_serial_send_chunk_fifo: 1 3 Which indicated that one byte should be missing. Sure enough the output that should have been: abcdefghijklmnopqrstuvwxyz0 In one case was actually missing a byte: abcdefghijklmnopqrstuvwyz0 Running "ls -al" on large directories also made the missing bytes obvious since columns didn't line up. While the original code may not be the most elegant, we only talking about copying up to 4 bytes here. Let's just go back to the code that worked. Fixes: 5c7e105cd156 ("tty: serial: simplify qcom_geni_serial_send_chunk_fifo()") Cc: stable <stable@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Jiri Slaby <jirislaby@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240304174952.1.I920a314049b345efd1f69d708e7f74d2213d0b49@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	tty: serial: fsl_lpuart: avoid idle preamble pending if CTS is enabled	Sherry Sun
	If the remote uart device is not connected or not enabled after booting up, the CTS line is high by default. At this time, if we enable the flow control when opening the device(for example, using “stty -F /dev/ttyLP4 crtscts” command), there will be a pending idle preamble(first writing 0 and then writing 1 to UARTCTRL_TE will queue an idle preamble) that cannot be sent out, resulting in the uart port fail to close(waiting for TX empty), so the user space stty will have to wait for a long time or forever. This is an LPUART IP bug(idle preamble has higher priority than CTS), here add a workaround patch to enable TX CTS after enabling UARTCTRL_TE, so that the idle preamble does not get stuck due to CTS is deasserted. Fixes: 380c966c093e ("tty: serial: fsl_lpuart: add 32-bit register interface support") Cc: stable <stable@kernel.org> Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://lore.kernel.org/r/20240305015706.1050769-1-sherry.sun@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	usb: port: Don't try to peer unused USB ports based on location	Mathias Nyman
	Unused USB ports may have bogus location data in ACPI PLD tables. This causes port peering failures as these unused USB2 and USB3 ports location may match. Due to these failures the driver prints a "usb: port power management may be unreliable" warning, and unnecessarily blocks port power off during runtime suspend. This was debugged on a couple DELL systems where the unused ports all returned zeroes in their location data. Similar bugreports exist for other systems. Don't try to peer or match ports that have connect type set to USB_PORT_NOT_USED. Fixes: 3bfd659baec8 ("usb: find internal hub tier mismatch via acpi") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218465 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218486 Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/linux-usb/5406d361-f5b7-4309-b0e6-8c94408f7d75@molgen.mpg.de Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218490 Link: https://lore.kernel.org/r/20240222233343.71856-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	usb: gadget: ncm: Fix handling of zero block length packets	Krishna Kurapati
	While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX set to 65536, it has been observed that we receive short packets, which come at interval of 5-10 seconds sometimes and have block length zero but still contain 1-2 valid datagrams present. According to the NCM spec: "If wBlockLength = 0x0000, the block is terminated by a short packet. In this case, the USB transfer must still be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent, and the size is a multiple of wMaxPacketSize for the given pipe, then no ZLP shall be sent. wBlockLength= 0x0000 must be used with extreme care, because of the possibility that the host and device may get out of sync, and because of test issues. wBlockLength = 0x0000 allows the sender to reduce latency by starting to send a very large NTB, and then shortening it when the sender discovers that there’s not sufficient data to justify sending a large NTB" However, there is a potential issue with the current implementation, as it checks for the occurrence of multiple NTBs in a single giveback by verifying if the leftover bytes to be processed is zero or not. If the block length reads zero, we would process the same NTB infintely because the leftover bytes is never zero and it leads to a crash. Fix this by bailing out if block length reads zero. Cc: stable@vger.kernel.org Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call") Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20240228115441.2105585-1-quic_kriskura@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	usb: typec: altmodes/displayport: create sysfs nodes as driver's default ↵	RD Babiera
	device attribute group The DisplayPort driver's sysfs nodes may be present to the userspace before typec_altmode_set_drvdata() completes in dp_altmode_probe. This means that a sysfs read can trigger a NULL pointer error by deferencing dp->hpd in hpd_show or dp->lock in pin_assignment_show, as dev_get_drvdata() returns NULL in those cases. Remove manual sysfs node creation in favor of adding attribute group as default for devices bound to the driver. The ATTRIBUTE_GROUPS() macro is not used here otherwise the path to the sysfs nodes is no longer compliant with the ABI. Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Link: https://lore.kernel.org/r/20240229001101.3889432-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	usb: typec: tpcm: Fix PORT_RESET behavior for self powered devices	Badhri Jagan Sridharan
	While commit 69f89168b310 ("usb: typec: tpcm: Fix issues with power being removed during reset") fixes the boot issues for bus powered devices such as LibreTech Renegade Elite/Firefly, it trades off the CC pins NOT being Hi-Zed during errory recovery (i.e PORT_RESET) for devices which are NOT bus powered(a.k.a self powered). This change Hi-Zs the CC pins only for self powered devices, thus preventing brown out for bus powered devices Adhering to spec is gaining more importance due to the Common charger initiative enforced by the European Union. Quoting from the spec: 4.5.2.2.2.1 ErrorRecovery State Requirements The port shall not drive VBUS or VCONN, and shall present a high-impedance to ground (above zOPEN) on its CC1 and CC2 pins. Hi-Zing the CC pins is the inteded behavior for PORT_RESET. CC pins are set to default state after tErrorRecovery in PORT_RESET_WAIT_OFF. 4.5.2.2.2.2 Exiting From ErrorRecovery State A Sink shall transition to Unattached.SNK after tErrorRecovery. A Source shall transition to Unattached.SRC after tErrorRecovery. Fixes: 69f89168b310 ("usb: typec: tpcm: Fix issues with power being removed during reset") Cc: stable@vger.kernel.org Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Tested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240228000512.746252-1-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	usb: typec: ucsi: fix UCSI on SM8550 & SM8650 Qualcomm devices	Neil Armstrong
	On SM8550 and SM8650 Qualcomm platforms a call to UCSI_GET_PDOS for non-PD partners will cause a firmware crash with no easy way to recover from it. Add UCSI_NO_PARTNER_PDOS quirk for those platform until we find a way to properly handle the crash. Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240223-topic-sm8550-upstream-ucsi-no-pdos-v1-1-8900ad510944@linaro.org Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-05	string: Convert helpers selftest to KUnit	Kees Cook
	Convert test-string_helpers.c to KUnit so it can be easily run with everything else. Failure reporting doesn't need to be open-coded in most places, for example, forcing a failure in the expected output for upper/lower testing looks like this: [12:18:43] # test_upper_lower: EXPECTATION FAILED at lib/string_helpers_kunit.c:579 [12:18:43] Expected dst == strings_upper[i].out, but [12:18:43] dst == "ABCDEFGH1234567890TEST" [12:18:43] strings_upper[i].out == "ABCDEFGH1234567890TeST" [12:18:43] [FAILED] test_upper_lower Currently passes without problems: $ ./tools/testing/kunit/kunit.py run string_helpers ... [12:23:55] Starting KUnit Kernel (1/1)... [12:23:55] ============================================================ [12:23:55] =============== string_helpers (3 subtests) ================ [12:23:55] [PASSED] test_get_size [12:23:55] [PASSED] test_upper_lower [12:23:55] [PASSED] test_unescape [12:23:55] ================= [PASSED] string_helpers ================== [12:23:55] ============================================================ [12:23:55] Testing complete. Ran 3 tests: passed: 3 [12:23:55] Elapsed time: 6.709s total, 0.001s configuring, 6.591s building, 0.066s running Link: https://lore.kernel.org/r/20240301202732.2688342-2-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-05	string: Convert selftest to KUnit	Kees Cook
	Convert test_string.c to KUnit so it can be easily run with everything else. Additional text context is retained for failure reporting. For example, when forcing a bad match, we can see the loop counters reported for the memset() tests: [09:21:52] # test_memset64: ASSERTION FAILED at lib/string_kunit.c:93 [09:21:52] Expected v == 0xa2a1a1a1a1a1a1a1ULL, but [09:21:52] v == -6799976246779207263 (0xa1a1a1a1a1a1a1a1) [09:21:52] 0xa2a1a1a1a1a1a1a1ULL == -6727918652741279327 (0xa2a1a1a1a1a1a1a1) [09:21:52] i:0 j:0 k:0 [09:21:52] [FAILED] test_memset64 Currently passes without problems: $ ./tools/testing/kunit/kunit.py run string ... [09:37:40] Starting KUnit Kernel (1/1)... [09:37:40] ============================================================ [09:37:40] =================== string (6 subtests) ==================== [09:37:40] [PASSED] test_memset16 [09:37:40] [PASSED] test_memset32 [09:37:40] [PASSED] test_memset64 [09:37:40] [PASSED] test_strchr [09:37:40] [PASSED] test_strnchr [09:37:40] [PASSED] test_strspn [09:37:40] ===================== [PASSED] string ====================== [09:37:40] ============================================================ [09:37:40] Testing complete. Ran 6 tests: passed: 6 [09:37:40] Elapsed time: 6.730s total, 0.001s configuring, 6.562s building, 0.131s running Link: https://lore.kernel.org/r/20240301202732.2688342-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-05	xfrm: set skb control buffer based on packet offload as well	Mike Yu
	In packet offload, packets are not encrypted in XFRM stack, so the next network layer which the packets will be forwarded to should depend on where the packet came from (either xfrm4_output or xfrm6_output) rather than the matched SA's family type. Test: verified IPv6-in-IPv4 packets on Android device with IPsec packet offload enabled Signed-off-by: Mike Yu <yumike@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-03-05	xfrm: fix xfrm child route lookup for packet offload	Mike Yu
	In current code, xfrm_bundle_create() always uses the matched SA's family type to look up a xfrm child route for the skb. The route returned by xfrm_dst_lookup() will eventually be used in xfrm_output_resume() (skb_dst(skb)->ops->local_out()). If packet offload is used, the above behavior can lead to calling ip_local_out() for an IPv6 packet or calling ip6_local_out() for an IPv4 packet, which is likely to fail. This change fixes the behavior by checking if the matched SA has packet offload enabled. If not, keep the same behavior; if yes, use the matched SP's family type for the lookup. Test: verified IPv6-in-IPv4 packets on Android device with IPsec packet offload enabled Signed-off-by: Mike Yu <yumike@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-03-05	fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion	Bart Van Assche
	The first kiocb_set_cancel_fn() argument may point at a struct kiocb that is not embedded inside struct aio_kiocb. With the current code, depending on the compiler, the req->ki_ctx read happens either before the IOCB_AIO_RW test or after that test. Move the req->ki_ctx read such that it is guaranteed that the IOCB_AIO_RW test happens first. Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: Benjamin LaHaise <ben@communityfibre.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Fixes: b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240304235715.3790858-1-bvanassche@acm.org Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-05	drm/i915: Don't explode when the dig port we don't have an AUX CH	Ville Syrjälä
	The icl+ power well code currently assumes that every AUX power well maps to an encoder which is using said power well. That is by no menas guaranteed as we: - only register encoders for ports declared in the VBT - combo PHY HDMI-only encoder no longer get an AUX CH since commit 9856308c94ca ("drm/i915: Only populate aux_ch if really needed") However we have places such as intel_power_domains_sanitize_state() that blindly traverse all the possible power wells. So these bits of code may very well encounbter an aux power well with no associated encoder. In this particular case the BIOS seems to have left one AUX power well enabled even though we're dealing with a HDMI only encoder on a combo PHY. We then proceed to turn off said power well and explode when we can't find a matching encoder. As a short term fix we should be able to just skip the PHY related parts of the power well programming since we know this situation can only happen with combo PHYs. Another option might be to go back to always picking an AUX CH for all encoders. However I'm a bit wary about that since we might in theory end up conflicting with the VBT AUX CH assignment. Also that wouldn't help with encoders not declared in the VBT, should we ever need to poke the corresponding power wells. Longer term we need to figure out what the actual relationship is between the PHY vs. AUX CH vs. AUX power well. Currently this is entirely unclear. Cc: stable@vger.kernel.org Fixes: 9856308c94ca ("drm/i915: Only populate aux_ch if really needed") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10184 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223203216.15210-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit 6a8c66bf0e565c34ad0a18f820e0bb17951f7f91) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-03-05	platform/x86/amd/pmf: Fix missing error code in amd_pmf_init_smart_pc()	Harshit Mogalapalli
	On the error path, assign -ENOMEM to ret when memory allocation of "dev->prev_data" fails. Fixes: e70961505808 ("platform/x86/amd/pmf: Fixup error handling for amd_pmf_init_smart_pc()") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240226144011.2100804-1-harshit.m.mogalapalli@oracle.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-03-05	platform/x86: p2sb: On Goldmont only cache P2SB and SPI devfn BAR	Hans de Goede
	On Goldmont p2sb_bar() only ever gets called for 2 devices, the actual P2SB devfn 13,0 and the SPI controller which is part of the P2SB, devfn 13,2. But the current p2sb code tries to cache BAR0 info for all of devfn 13,0 to 13,7 . This involves calling pci_scan_single_device() for device 13 functions 0-7 and the hw does not seem to like pci_scan_single_device() getting called for some of the other hidden devices. E.g. on an ASUS VivoBook D540NV-GQ065T this leads to continuous ACPI errors leading to high CPU usage. Fix this by only caching BAR0 info and thus only calling pci_scan_single_device() for the P2SB and the SPI controller. Fixes: 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe") Reported-by: Danil Rybakov <danilrybakov249@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218531 Tested-by: Danil Rybakov <danilrybakov249@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240304134356.305375-2-hdegoede@redhat.com
2024-03-05	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook	Andy Chi
	The HP EliteBook using ALC236 codec which using 0x02 to control mute LED and 0x01 to control micmute LED. Therefore, add a quirk to make it works. Signed-off-by: Andy Chi <andy.chi@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240304134033.773348-1-andy.chi@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-03-05	Revert "fs/aio: Make io_cancel() generate completions again"	Bart Van Assche
	Patch "fs/aio: Make io_cancel() generate completions again" is based on the assumption that calling kiocb->ki_cancel() does not complete R/W requests. This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers set a cancellation function that calls usb_ep_dequeue(). According to its documentation, usb_ep_dequeue() calls the completion routine with status -ECONNRESET. Hence this revert. Cc: Benjamin LaHaise <ben@communityfibre.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Reported-by: syzbot+b91eb2ed18f599dd3c31@syzkaller.appspotmail.com Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240304182945.3646109-1-bvanassche@acm.org Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-05	Merge tag 'drm-intel-fixes-2024-03-01' of ↵	Daniel Vetter
	https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes - Fix to extract HDCP information from primary connector - Check for NULL mmu_interval_notifier before removing Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZeGOUTfiA0_FNKLg@jlahtine-mobl.ger.corp.intel.com
2024-03-05	MAINTAINERS: Update email address for Tvrtko Ursulin	Tvrtko Ursulin
	I will lose access to my @.*intel.com e-mail addresses soon so let me adjust the maintainers entry and update the mailmap too. While at it consolidate a few other of my old emails to point to the main one. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240228142240.2539358-1-tvrtko.ursulin@linux.intel.com
2024-03-04	Merge tag 'mlx5-fixes-2024-03-01' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2024-03-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2024-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Switch to using _bh variant of of spinlock API in port timestamping NAPI poll context net/mlx5e: Use a memory barrier to enforce PTP WQ xmit submission tracking occurs after populating the metadata_map net/mlx5e: Fix MACsec state loss upon state update in offload path net/mlx5e: Change the warning when ignore_flow_level is not supported net/mlx5: Check capability for fw_reset net/mlx5: Fix fw reporter diagnose output net/mlx5: E-switch, Change flow rule destination checking Revert "net/mlx5e: Check the number of elements before walk TC rhashtable" Revert "net/mlx5: Block entering switchdev mode with ns inconsistency" ==================== Link: https://lore.kernel.org/r/20240302070318.62997-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04	Merge branch '10GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-03-01 (ixgbe, i40e, ice) This series contains updates to ixgbe, i40e, and ice drivers. Maciej corrects disable flow for ixgbe, i40e, and ice drivers which could cause non-functional interface with AF_XDP. Michal restores host configuration when changing MSI-X count for VFs on ice driver. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: reconfig host after changing MSI-X on VF ice: reorder disabling IRQ and NAPI in ice_qp_dis i40e: disable NAPI right after disabling irqs when handling xsk_pool ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}able ==================== Link: https://lore.kernel.org/r/20240301192549.2993798-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04	net: sparx5: Fix use after free inside sparx5_del_mact_entry	Horatiu Vultur
	Based on the static analyzis of the code it looks like when an entry from the MAC table was removed, the entry was still used after being freed. More precise the vid of the mac_entry was used after calling devm_kfree on the mac_entry. The fix consists in first using the vid of the mac_entry to delete the entry from the HW and after that to free it. Fixes: b37a1bae742f ("net: sparx5: add mactable support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240301080608.3053468-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04	mailmap: fix Kishon's email	Niklas Cassel
	Kishon updated his email in commit e6aa4edd2f5b ("MAINTAINERS: Update Kishon's email address in PCI endpoint subsystem"). However, as he is no longer at TI, his TI email now bounces. Add the same email as he has in MAINTAINERS to a mailmap, so that get_maintainer.pl will not output an email that bounces. (This is neeed as get_maintainer.pl will use "git author" to CC people who have significantly modified the same file as you.) Link: https://lkml.kernel.org/r/20240229134318.1201935-1-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Cc: Kishon Vijay Abraham I <kishon@kernel.org> Cc: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04	init/Kconfig: lower GCC version check for -Warray-bounds	Kees Cook
	We continue to see false positives from -Warray-bounds even in GCC 10, which is getting reported in a few places[1] still: security/security.c:811:2: warning: `memcpy' offset 32 is out of the bounds [0, 0] [-Warray-bounds] Lower the GCC version check from 11 to 10. Link: https://lkml.kernel.org/r/20240223170824.work.768-kees@kernel.org Reported-by: Lu Yao <yaolu@kylinos.cn> Closes: https://lore.kernel.org/lkml/20240117014541.8887-1-yaolu@kylinos.cn/ Link: https://lore.kernel.org/linux-next/65d84438.620a0220.7d171.81a7@mx.google.com [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Paul Moore <paul@paul-moore.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Marc Aurèle La France <tsi@tuyoix.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04	mm, mmap: fix vma_merge() case 7 with vma_ops->close	Vlastimil Babka
	When debugging issues with a workload using SysV shmem, Michal Hocko has come up with a reproducer that shows how a series of mprotect() operations can result in an elevated shm_nattch and thus leak of the resource. The problem is caused by wrong assumptions in vma_merge() commit 714965ca8252 ("mm/mmap: start distinguishing if vma can be removed in mergeability test"). The shmem vmas have a vma_ops->close callback that decrements shm_nattch, and we remove the vma without calling it. vma_merge() has thus historically avoided merging vma's with vma_ops->close and commit 714965ca8252 was supposed to keep it that way. It relaxed the checks for vma_ops->close in can_vma_merge_after() assuming that it is never called on a vma that would be a candidate for removal. However, the vma_merge() code does also use the result of this check in the decision to remove a different vma in the merge case 7. A robust solution would be to refactor vma_merge() code in a way that the vma_ops->close check is only done for vma's that are actually going to be removed, and not as part of the preliminary checks. That would both solve the existing bug, and also allow additional merges that the checks currently prevent unnecessarily in some cases. However to fix the existing bug first with a minimized risk, and for easier stable backports, this patch only adds a vma_ops->close check to the buggy case 7 specifically. All other cases of vma removal are covered by the can_vma_merge_before() check that includes the test for vma_ops->close. The reproducer code, adapted from Michal Hocko's code: int main(int argc, char argv[]) { int segment_id; size_t segment_size = 20 PAGE_SIZE; char * sh_mem; struct shmid_ds shmid_ds; key_t key = 0x1234; segment_id = shmget(key, segment_size, IPC_CREAT \| IPC_EXCL \| S_IRUSR \| S_IWUSR); sh_mem = (char )shmat(segment_id, NULL, 0); mprotect(sh_mem + 2PAGE_SIZE, PAGE_SIZE, PROT_NONE); mprotect(sh_mem + PAGE_SIZE, PAGE_SIZE, PROT_WRITE); mprotect(sh_mem + 2*PAGE_SIZE, PAGE_SIZE, PROT_WRITE); shmdt(sh_mem); shmctl(segment_id, IPC_STAT, &shmid_ds); printf("nattch after shmdt(): %lu (expected: 0)\n", shmid_ds.shm_nattch); if (shmctl(segment_id, IPC_RMID, 0)) printf("IPCRM failed %d\n", errno); return (shmid_ds.shm_nattch) ? 1 : 0; } Link: https://lkml.kernel.org/r/20240222215930.14637-2-vbabka@suse.cz Fixes: 714965ca8252 ("mm/mmap: start distinguishing if vma can be removed in mergeability test") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04	mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails	Qi Zheng
	After ptep_clear_flush(), if we find that src_folio is pinned we will fail UFFDIO_MOVE and put src_folio back to src_pte entry, but the change to src_folio->{mapping,index} is not restored in this process. This is not what we expected, so fix it. This can cause the rmap for that page to be invalid, possibly resulting in memory corruption. At least swapout+migration would no longer work, because we might fail to locate the mappings of that folio. Link: https://lkml.kernel.org/r/20240222080815.46291-1-zhengqi.arch@bytedance.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04	mm, vmscan: prevent infinite loop for costly GFP_NOIO \| __GFP_RETRY_MAYFAIL ↵	Vlastimil Babka
	allocations Sven reports an infinite loop in __alloc_pages_slowpath() for costly order __GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO. Such combination can happen in a suspend/resume context where a GFP_KERNEL allocation can have __GFP_IO masked out via gfp_allowed_mask. Quoting Sven: 1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER) with __GFP_RETRY_MAYFAIL set. 2. page alloc's __alloc_pages_slowpath tries to get a page from the freelist. This fails because there is nothing free of that costly order. 3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim, which bails out because a zone is ready to be compacted; it pretends to have made a single page of progress. 4. page alloc tries to compact, but this always bails out early because __GFP_IO is not set (it's not passed by the snd allocator, and even if it were, we are suspending so the __GFP_IO flag would be cleared anyway). 5. page alloc believes reclaim progress was made (because of the pretense in item 3) and so it checks whether it should retry compaction. The compaction retry logic thinks it should try again, because: a) reclaim is needed because of the early bail-out in item 4 b) a zonelist is suitable for compaction 6. goto 2. indefinite stall. (end quote) The immediate root cause is confusing the COMPACT_SKIPPED returned from __alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO to be indicating a lack of order-0 pages, and in step 5 evaluating that in should_compact_retry() as a reason to retry, before incrementing and limiting the number of retries. There are however other places that wrongly assume that compaction can happen while we lack __GFP_IO. To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO evaluation and switch the open-coded test in try_to_compact_pages() to use it. Also use the new helper in: - compaction_ready(), which will make reclaim not bail out in step 3, so there's at least one attempt to actually reclaim, even if chances are small for a costly order - in_reclaim_compaction() which will make should_continue_reclaim() return false and we don't over-reclaim unnecessarily - in __alloc_pages_slowpath() to set a local variable can_compact, which is then used to avoid retrying reclaim/compaction for costly allocations (step 5) if we can't compact and also to skip the early compaction attempt that we do in some cases Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz Fixes: 3250845d0526 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sven van Ashbrook <svenva@chromium.org> Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzPUVOZF%2Bg@mail.gmail.com/ Tested-by: Karthikeyan Ramasubramanian <kramasub@chromium.org> Cc: Brian Geffon <bgeffon@google.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Takashi Iwai <tiwai@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04	io_uring/net: fix overflow check in io_recvmsg_mshot_prep()	Dan Carpenter
	The "controllen" variable is type size_t (unsigned long). Casting it to int could lead to an integer underflow. The check_add_overflow() function considers the type of the destination which is type int. If we add two positive values and the result cannot fit in an integer then that's counted as an overflow. However, if we cast "controllen" to an int and it turns negative, then negative values can fit into an int type so there is no overflow. Good: 100 + (unsigned long)-4 = 96 <-- overflow Bad: 100 + (int)-4 = 96 <-- no overflow I deleted the cast of the sizeof() as well. That's not a bug but the cast is unnecessary. Fixes: 9b0fc3c054ff ("io_uring: fix types in io_recvmsg_multishot_overflow") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/138bd2e2-ede8-4bcc-aa7b-f3d9de167a37@moroto.mountain Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-04	io_uring/net: correct the type of variable	Muhammad Usama Anjum
	The namelen is of type int. It shouldn't be made size_t which is unsigned. The signed number is needed for error checking before use. Fixes: c55978024d12 ("io_uring/net: move receive multishot out of the generic msghdr path") Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240301144349.2807544-1-usama.anjum@collabora.com Signed-off-by: Jens Axboe <axboe@kernel.dk>