summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-01Merge tag 'firewire-fixes-6.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A single patch to use a flexible array rather than a zero-length one" * tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: Replace zero-length array with flexible-array member
2023-06-01Merge tag 'mailbox-fixes-6.4-rc5' of ↵Linus Torvalds
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox fix from Jassi Brar: "Fix missing mutex unlock in mailbox-test" * tag 'mailbox-fixes-6.4-rc5' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()
2023-06-01net: dsa: mv88e6xxx: Increase wait after reset deactivationAndreas Svensson
A switch held in reset by default needs to wait longer until we can reliably detect it. An issue was observed when testing on the Marvell 88E6393X (Link Street). The driver failed to detect the switch on some upstarts. Increasing the wait time after reset deactivation solves this issue. The updated wait time is now also the same as the wait time in the mv88e6xxx_hardware_reset function. Fixes: 7b75e49de424 ("net: dsa: mv88e6xxx: wait after reset deactivation") Signed-off-by: Andreas Svensson <andreas.svensson@axis.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230530145223.1223993-1-andreas.svensson@axis.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01firewire: Replace zero-length array with flexible-array memberGustavo A. R. Silva
Zero-length and one-element arrays are deprecated, and we are moving towards adopting C99 flexible-array members, instead. Address the following warnings found with GCC-13 and -fstrict-flex-arrays=3 enabled: sound/firewire/amdtp-stream.c: In function ‘build_it_pkt_header’: sound/firewire/amdtp-stream.c:694:17: warning: ‘generate_cip_header’ accessing 8 bytes in a region of size 0 [-Wstringop-overflow=] 694 | generate_cip_header(s, cip_header, data_block_counter, syt); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sound/firewire/amdtp-stream.c:694:17: note: referencing argument 2 of type ‘__be32[2]’ {aka ‘unsigned int[2]’} sound/firewire/amdtp-stream.c:667:13: note: in a call to function ‘generate_cip_header’ 667 | static void generate_cip_header(struct amdtp_stream *s, __be32 cip_header[2], | ^~~~~~~~~~~~~~~~~~~ This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [1]. Link: https://github.com/KSPP/linux/issues/21 Link: https://github.com/KSPP/linux/issues/303 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/ZHT0V3SpvHyxCv5W@work Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2023-06-01btrfs: zoned: fix dev-replace after the scrub reworkQu Wenruo
[BUG] After commit e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure"), scrub no longer works for zoned device at all. Even an empty zoned btrfs cannot be replaced: # mkfs.btrfs -f /dev/nvme0n1 # mount /dev/nvme0n1 /mnt/btrfs # btrfs replace start -Bf 1 /dev/nvme0n2 /mnt/btrfs Resetting device zones /dev/nvme1n1 (160 zones) ... ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/btrfs/": Input/output error And we can hit kernel crash related to that: BTRFS info (device nvme1n1): host-managed zoned block device /dev/nvme3n1, 160 zones of 134217728 bytes BTRFS info (device nvme1n1): dev_replace from /dev/nvme2n1 (devid 2) to /dev/nvme3n1 started nvme3n1: Zone Management Append(0x7d) @ LBA 65536, 4 blocks, Zone Is Full (sct 0x1 / sc 0xb9) DNR I/O error, dev nvme3n1, sector 786432 op 0xd:(ZONE_APPEND) flags 0x4000 phys_seg 3 prio class 2 BTRFS error (device nvme1n1): bdev /dev/nvme3n1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 BUG: kernel NULL pointer dereference, address: 00000000000000a8 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40 Call Trace: <IRQ> btrfs_lookup_ordered_extent+0x31/0x190 btrfs_record_physical_zoned+0x18/0x40 btrfs_simple_end_io+0xaf/0xc0 blk_update_request+0x153/0x4c0 blk_mq_end_request+0x15/0xd0 nvme_poll_cq+0x1d3/0x360 nvme_irq+0x39/0x80 __handle_irq_event_percpu+0x3b/0x190 handle_irq_event+0x2f/0x70 handle_edge_irq+0x7c/0x210 __common_interrupt+0x34/0xa0 common_interrupt+0x7d/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40 [CAUSE] Dev-replace reuses scrub code to iterate all extents and write the existing content back to the new device. And for zoned devices, we call fill_writer_pointer_gap() to make sure all the writes into the zoned device is sequential, even if there may be some gaps between the writes. However we have several different bugs all related to zoned dev-replace: - We are using ZONE_APPEND operation for metadata style write back For zoned devices, btrfs has two ways to write data: * ZONE_APPEND for data This allows higher queue depth, but will not be able to know where the write would land. Thus needs to grab the real on-disk physical location in it's endio. * WRITE for metadata This requires single queue depth (new writes can only be submitted after previous one finished), and all writes must be sequential. For scrub, we go single queue depth, but still goes with ZONE_APPEND, which requires btrfs_bio::inode being populated. This is the cause of that crash. - No correct tracing of write_pointer After a write finished, we should forward sctx->write_pointer, or fill_writer_pointer_gap() would not work properly and cause more than necessary zero out, and fill the whole zone prematurely. - Incorrect physical bytenr passed to fill_writer_pointer_gap() In scrub_write_sectors(), one call site passes logical address, which is completely wrong. The other call site passes physical address of current sector, but we should pass the physical address of the btrfs_bio we're submitting. This is the cause of the -EIO errors. [FIX] - Do not use ZONE_APPEND for btrfs_submit_repair_write(). - Manually forward sctx->write_pointer after successful writeback - Use the physical address of the to-be-submitted btrfs_bio for fill_writer_pointer_gap() Now zoned device replace would work as expected. Reported-by: Christoph Hellwig <hch@lst.de> Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-01Merge tag 'for-linus-2023060101' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - Regression fix for overlong long timeouts during initialization on some Logitech Unifying devices (Bastien Nocera) - error handling and overflow fixes for Wacom driver (Denis Arefev, Jason Gerecke, Nikita Zhandarovich) * tag 'for-linus-2023060101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: logitech-hidpp: Handle timeout differently from busy HID: wacom: Add error check to wacom_parse_and_register() HID: google: add jewel USB id HID: wacom: avoid integer overflow in wacom_intuos_inout() HID: wacom: Check for string overflow from strscpy calls
2023-06-01Merge tag 'ata-6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fix from Damien Le Moal: - Fix ata_find_dev() use of the device number to find a struct ata_device for a port. This addresses issues with some passthrough commands with libsas managed devices. * tag 'ata-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-scsi: Use correct device no in ata_find_dev()
2023-06-01Merge tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: "Eight server fixes (most also for stable): - Two fixes for uninitialized pointer reads (rename and link) - Fix potential UAF in oplock break - Two fixes for potential out of bound reads in negotiate - Fix crediting bug - Two fixes for xfstests (allocation size fix for test 694 and lookup issue shown by test 464)" * tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: call putname after using the last component ksmbd: fix incorrect AllocationSize set in smb2_get_info ksmbd: fix UAF issue from opinfo->conn ksmbd: fix multiple out-of-bounds read during context decoding ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate ksmbd: fix credit count leakage ksmbd: fix uninitialized pointer read in smb2_create_link() ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
2023-06-01net: ipa: Use correct value for IPA_STATUS_SIZEBert Karwatzki
IPA_STATUS_SIZE was introduced in commit b8dc7d0eea5a as a replacement for the size of the removed struct ipa_status which had size sizeof(__le32[8]). Use this value as IPA_STATUS_SIZE. Fixes: b8dc7d0eea5a ("net: ipa: stop using sizeof(status)") Signed-off-by: Bert Karwatzki <spasswolf@web.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230531103618.102608-1-spasswolf@web.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01tcp: fix mishandling when the sack compression is deferred.fuyuanli
In this patch, we mainly try to handle sending a compressed ack correctly if it's deferred. Here are more details in the old logic: When sack compression is triggered in the tcp_compressed_ack_kick(), if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED and then defer to the release cb phrase. Later once user releases the sock, tcp_delack_timer_handler() should send a ack as expected, which, however, cannot happen due to lack of ICSK_ACK_TIMER flag. Therefore, the receiver would not sent an ack until the sender's retransmission timeout. It definitely increases unnecessary latency. Fixes: 5d9f4262b7ea ("tcp: add SACK compression") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: fuyuanli <fuyuanli@didiglobal.com> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/ Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01net/sched: flower: fix possible OOB write in fl_set_geneve_opt()Hangyu Hua
If we send two TCA_FLOWER_KEY_ENC_OPTS_GENEVE packets and their total size is 252 bytes(key->enc_opts.len = 252) then key->enc_opts.len = opt->length = data_len / 4 = 0 when the third TCA_FLOWER_KEY_ENC_OPTS_GENEVE packet enters fl_set_geneve_opt. This bypasses the next bounds check and results in an out-of-bounds. Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Link: https://lore.kernel.org/r/20230531102805.27090-1-hbh25y@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01iommu/mediatek: Flush IOTLB completely only if domain has been attachedChen-Yu Tsai
If an IOMMU domain was never attached, it lacks any linkage to the actual IOMMU hardware. Attempting to do flush_iotlb_all() on it will result in a NULL pointer dereference. This seems to happen after the recent IOMMU core rework in v6.4-rc1. Unable to handle kernel read from unreadable memory at virtual address 0000000000000018 Call trace: mtk_iommu_flush_iotlb_all+0x20/0x80 iommu_create_device_direct_mappings.part.0+0x13c/0x230 iommu_setup_default_domain+0x29c/0x4d0 iommu_probe_device+0x12c/0x190 of_iommu_configure+0x140/0x208 of_dma_configure_id+0x19c/0x3c0 platform_dma_configure+0x38/0x88 really_probe+0x78/0x2c0 Check if the "bank" field has been filled in before actually attempting the IOTLB flush to avoid it. The IOTLB is also flushed when the device comes out of runtime suspend, so it should have a clean initial state. Fixes: 08500c43d4f7 ("iommu/mediatek: Adjust the structure") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20230526085402.394239-1-wenst@chromium.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-06-01drm/i915/perf: Clear out entire reports after reading if not power of 2 sizeAshutosh Dixit
Clearing out report id and timestamp as means to detect unlanded reports only works if report size is power of 2. That is, only when report size is a sub-multiple of the OA buffer size can we be certain that reports will land at the same place each time in the OA buffer (after rewind). If report size is not a power of 2, we need to zero out the entire report to be able to detect unlanded reports reliably. v2: Add Fixes tag (Umesh) Fixes: 1cc064dce4ed ("drm/i915/perf: Add support for OA media units") Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230523204042.4180641-1-ashutosh.dixit@intel.com (cherry picked from commit 09a36015d9a0940214c080f95afc605c47648bbd) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2023-05-31sfc: fix error unwinds in TC offloadEdward Cree
Failure ladders weren't exactly unwinding what the function had done up to that point; most seriously, when we encountered an already offloaded rule, the failure path tried to remove the new rule from the hashtable, which would in fact remove the already-present 'old' rule (since it has the same key) from the table, and leak its resources. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202305200745.xmIlkqjH-lkp@intel.com/ Fixes: d902e1a737d4 ("sfc: bare bones TC offload on EF100") Fixes: 17654d84b47c ("sfc: add offloading of 'foreign' TC (decap) rules") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230530202527.53115-1-edward.cree@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31net/mlx5: Read embedded cpu after init bit clearedMoshe Shemesh
During driver load it reads embedded_cpu bit from initialization segment, but the initialization segment is readable only after initialization bit is cleared. Move the call to mlx5_read_embedded_cpu() right after initialization bit cleared. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Fixes: 591905ba9679 ("net/mlx5: Introduce Mellanox SmartNIC and modify page management logic") Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-05-31net/mlx5e: Fix error handling in mlx5e_refresh_tirsSaeed Mahameed
Allocation failure is outside the critical lock section and should return immediately rather than jumping to the unlock section. Also unlock as soon as required and remove the now redundant jump label. Fixes: 80a2a9026b24 ("net/mlx5e: Add a lock on tir list") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-05-31net/mlx5: Ensure af_desc.mask is properly initializedChuck Lever
[ 9.837087] mlx5_core 0000:02:00.0: firmware version: 16.35.2000 [ 9.843126] mlx5_core 0000:02:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link) [ 10.311515] mlx5_core 0000:02:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps [ 10.321948] mlx5_core 0000:02:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048) [ 10.344324] mlx5_core 0000:02:00.0: mlx5_pcie_event:301:(pid 88): PCIe slot advertised sufficient power (27W). [ 10.354339] BUG: unable to handle page fault for address: ffffffff8ff0ade0 [ 10.361206] #PF: supervisor read access in kernel mode [ 10.366335] #PF: error_code(0x0000) - not-present page [ 10.371467] PGD 81ec39067 P4D 81ec39067 PUD 81ec3a063 PMD 114b07063 PTE 800ffff7e10f5062 [ 10.379544] Oops: 0000 [#1] PREEMPT SMP PTI [ 10.383721] CPU: 0 PID: 117 Comm: kworker/0:6 Not tainted 6.3.0-13028-g7222f123c983 #1 [ 10.391625] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0b 06/12/2017 [ 10.398750] Workqueue: events work_for_cpu_fn [ 10.403108] RIP: 0010:__bitmap_or+0x10/0x26 [ 10.407286] Code: 85 c0 0f 95 c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 c9 31 c0 48 83 c1 3f 48 c1 e9 06 39 c> [ 10.426024] RSP: 0000:ffffb45a0078f7b0 EFLAGS: 00010097 [ 10.431240] RAX: 0000000000000000 RBX: ffffffff8ff0adc0 RCX: 0000000000000004 [ 10.438365] RDX: ffff9156801967d0 RSI: ffffffff8ff0ade0 RDI: ffff9156801967b0 [ 10.445489] RBP: ffffb45a0078f7e8 R08: 0000000000000030 R09: 0000000000000000 [ 10.452613] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ec [ 10.459737] R13: ffffffff8ff0ade0 R14: 0000000000000001 R15: 0000000000000020 [ 10.466862] FS: 0000000000000000(0000) GS:ffff9165bfc00000(0000) knlGS:0000000000000000 [ 10.474936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 10.480674] CR2: ffffffff8ff0ade0 CR3: 00000001011ae003 CR4: 00000000003706f0 [ 10.487800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 10.494922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 10.502046] Call Trace: [ 10.504493] <TASK> [ 10.506589] ? matrix_alloc_area.constprop.0+0x43/0x9a [ 10.511729] ? prepare_namespace+0x84/0x174 [ 10.515914] irq_matrix_reserve_managed+0x56/0x10c [ 10.520699] x86_vector_alloc_irqs+0x1d2/0x31e [ 10.525146] irq_domain_alloc_irqs_hierarchy+0x39/0x3f [ 10.530284] irq_domain_alloc_irqs_parent+0x1a/0x2a [ 10.535155] intel_irq_remapping_alloc+0x59/0x5e9 [ 10.539859] ? kmem_cache_debug_flags+0x11/0x26 [ 10.544383] ? __radix_tree_lookup+0x39/0xb9 [ 10.548649] irq_domain_alloc_irqs_hierarchy+0x39/0x3f [ 10.553779] irq_domain_alloc_irqs_parent+0x1a/0x2a [ 10.558650] msi_domain_alloc+0x8c/0x120 [ 10.567697] irq_domain_alloc_irqs_locked+0x11d/0x286 [ 10.572741] __irq_domain_alloc_irqs+0x72/0x93 [ 10.577179] __msi_domain_alloc_irqs+0x193/0x3f1 [ 10.581789] ? __xa_alloc+0xcf/0xe2 [ 10.585273] msi_domain_alloc_irq_at+0xa8/0xfe [ 10.589711] pci_msix_alloc_irq_at+0x47/0x5c The crash is due to matrix_alloc_area() attempting to access per-CPU memory for CPUs that are not present on the system. The CPU mask passed into reserve_managed_vector() via it's @irqd parameter is corrupted because it contains uninitialized stack data. Fixes: bbac70c74183 ("net/mlx5: Use newer affinity descriptor") Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-05-31net/mlx5: Fix setting of irq->map.index for static IRQ caseNiklas Schnelle
When dynamic IRQ allocation is not supported all IRQs are allocated up front in mlx5_irq_table_create() instead of dynamically as part of mlx5_irq_alloc(). In the latter dynamic case irq->map.index is set via the mapping returned by pci_msix_alloc_irq_at(). In the static case and prior to commit 1da438c0ae02 ("net/mlx5: Fix indexing of mlx5_irq") irq->map.index was set in mlx5_irq_alloc() twice once initially to 0 and then to the requested index before storing in the xarray. After this commit it is only set to 0 which breaks all other IRQ mappings. Fix this by setting irq->map.index to the requested index together with irq->map.virq and improve the related comment to make it clearer which cases it deals with. Cc: Chuck Lever III <chuck.lever@oracle.com> Tested-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Fixes: 1da438c0ae02 ("net/mlx5: Fix indexing of mlx5_irq") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-05-31net/mlx5: Remove rmap also in case dynamic MSIX not supportedShay Drory
mlx5 add IRQs to rmap upon MSIX request, and mlx5 remove rmap from MSIX only if msi_map.index is populated. However, msi_map.index is populated only when dynamic MSIX is supported. This results in freeing IRQs without removing them from rmap, which triggers the bellow WARN_ON[1]. rmap is a feature which have no relation to dynamic MSIX. Hence, remove the check of msi_map.index when removing IRQ from rmap. [1] [ 200.307160 ] WARNING: CPU: 20 PID: 1702 at kernel/irq/manage.c:2034 free_irq+0x2ac/0x358 [ 200.316990 ] CPU: 20 PID: 1702 Comm: modprobe Not tainted 6.4.0-rc3_for_upstream_min_debug_2023_05_24_14_02 #1 [ 200.318939 ] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 200.321659 ] pc : free_irq+0x2ac/0x358 [ 200.322400 ] lr : free_irq+0x20/0x358 [ 200.337865 ] Call trace: [ 200.338360 ] free_irq+0x2ac/0x358 [ 200.339029 ] irq_release+0x58/0xd0 [mlx5_core] [ 200.340093 ] mlx5_irqs_release_vectors+0x80/0xb0 [mlx5_core] [ 200.341344 ] destroy_comp_eqs+0x120/0x170 [mlx5_core] [ 200.342469 ] mlx5_eq_table_destroy+0x1c/0x38 [mlx5_core] [ 200.343645 ] mlx5_unload+0x8c/0xc8 [mlx5_core] [ 200.344652 ] mlx5_uninit_one+0x78/0x118 [mlx5_core] [ 200.345745 ] remove_one+0x80/0x108 [mlx5_core] [ 200.346752 ] pci_device_remove+0x40/0xd8 [ 200.347554 ] device_remove+0x50/0x88 [ 200.348272 ] device_release_driver_internal+0x1c4/0x228 [ 200.349312 ] driver_detach+0x54/0xa0 [ 200.350030 ] bus_remove_driver+0x74/0x100 [ 200.350833 ] driver_unregister+0x34/0x68 [ 200.351619 ] pci_unregister_driver+0x28/0xa0 [ 200.352476 ] mlx5_cleanup+0x14/0x2210 [mlx5_core] [ 200.353536 ] __arm64_sys_delete_module+0x190/0x2e8 [ 200.354495 ] el0_svc_common.constprop.0+0x6c/0x1d0 [ 200.355455 ] do_el0_svc+0x38/0x98 [ 200.356122 ] el0_svc+0x1c/0x80 [ 200.356739 ] el0t_64_sync_handler+0xb4/0x130 [ 200.357604 ] el0t_64_sync+0x174/0x178 [ 200.358345 ] ---[ end trace 0000000000000000 ]--- Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-05-31drm/amdgpu: enable tmz by default for GC 11.0.1Ikshwaku Chauhan
Add IP GC 11.0.1 in the list of target to have tmz enabled by default. Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-05-31Merge tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Four small smb3 client fixes: - two small fixes suggested by kernel test robot - small cleanup fix - update Paulo's email address in the maintainer file" * tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: address unused variable warning smb: delete an unnecessary statement smb3: missing null check in SMB2_change_notify smb3: update a reviewer email in MAINTAINERS file
2023-05-31drm/amd/pm: resolve reboot exception for si olandGuchun Chen
During reboot test on arm64 platform, it may failure on boot. The error message are as follows: [ 1.706570][ 3] [ T273] [drm:si_thermal_enable_alert [amdgpu]] *ERROR* Could not enable thermal interrupts. [ 1.716547][ 3] [ T273] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22 [ 1.727064][ 3] [ T273] amdgpu 0000:02:00.0: amdgpu_device_ip_late_init failed [ 1.734367][ 3] [ T273] amdgpu 0000:02:00.0: Fatal error during GPU init v2: squash in built warning fix (Alex) Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v4_0Horatio Zhang
Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in jpeg_v4_0_hw_fini. [ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246 [ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000 [ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8 [ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000 [ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000 [ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0 [ 50.497624] PKRU: 55555554 [ 50.497625] Call Trace: [ 50.497625] <TASK> [ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu] [ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu] [ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 50.498060] process_one_work+0x21f/0x400 [ 50.498063] worker_thread+0x200/0x3f0 [ 50.498064] ? process_one_work+0x400/0x400 [ 50.498065] kthread+0xee/0x120 [ 50.498067] ? kthread_complete_and_exit+0x20/0x20 [ 50.498068] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v2_6Horatio Zhang
Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amdgpu: separate ras irq from jpeg instance irq for UVD_POISONHoratio Zhang
Separate jpegbRAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from jpeg instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amdgpu: add RAS POISON interrupt funcs for vcn_v4_0Horatio Zhang
Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in vcn_v4_0_hw_fini. [ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246 [ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000 [ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8 [ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8 [ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000 [ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0 [ 44.563633] PKRU: 55555554 [ 44.563633] Call Trace: [ 44.563634] <TASK> [ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu] [ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu] [ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 44.564061] process_one_work+0x21f/0x400 [ 44.564062] worker_thread+0x200/0x3f0 [ 44.564063] ? process_one_work+0x400/0x400 [ 44.564064] kthread+0xee/0x120 [ 44.564065] ? kthread_complete_and_exit+0x20/0x20 [ 44.564066] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amdgpu: add RAS POISON interrupt funcs for vcn_v2_6Horatio Zhang
Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amdgpu: separate ras irq from vcn instance irq for UVD_POISONHoratio Zhang
Separate vcn RAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from vcn instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31Revert "drm/amd/display: Do not set drr on pipe commit"Michel Dänzer
This reverts commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be. Caused a regression: Samsung Odyssey Neo G9, running at 5120x1440@240/VRR, connected to Navi 21 via DisplayPort, blanks and the GPU hangs while starting the Steam game Assetto Corsa Competizione (via Proton 7.0). Example dmesg excerpt: amdgpu 0000:0c:00.0: [drm] ERROR [CRTC:82:crtc-0] flip_done timed out NMI watchdog: Watchdog detected hard LOCKUP on cpu 6 [...] RIP: 0010:amdgpu_device_rreg.part.0+0x2f/0xf0 [amdgpu] Code: 41 54 44 8d 24 b5 00 00 00 00 55 89 f5 53 48 89 fb 4c 3b a7 60 0b 00 00 73 6a 83 e2 02 74 29 4c 03 a3 68 0b 00 00 45 8b 24 24 <48> 8b 43 08 0f b7 70 3e 66 90 44 89 e0 5b 5d 41 5c 31 d2 31 c9 31 RSP: 0000:ffffb39a119dfb88 EFLAGS: 00000086 RAX: ffffffffc0eb96a0 RBX: ffff9e7963dc0000 RCX: 0000000000007fff RDX: 0000000000000000 RSI: 0000000000004ff6 RDI: ffff9e7963dc0000 RBP: 0000000000004ff6 R08: ffffb39a119dfc40 R09: 0000000000000010 R10: ffffb39a119dfc40 R11: ffffb39a119dfc44 R12: 00000000000e05ae R13: 0000000000000000 R14: ffff9e7963dc0010 R15: 0000000000000000 FS: 000000001012f6c0(0000) GS:ffff9e805eb80000(0000) knlGS:000000007fd40000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000461ca000 CR3: 00000002a8a20000 CR4: 0000000000350ee0 Call Trace: <TASK> dm_read_reg_func+0x37/0xc0 [amdgpu] generic_reg_get2+0x22/0x60 [amdgpu] optc1_get_crtc_scanoutpos+0x6a/0xc0 [amdgpu] dc_stream_get_scanoutpos+0x74/0x90 [amdgpu] dm_crtc_get_scanoutpos+0x82/0xf0 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x91/0x190 [amdgpu] ? dm_read_reg_func+0x37/0xc0 [amdgpu] amdgpu_get_vblank_counter_kms+0xb4/0x1a0 [amdgpu] dm_pflip_high_irq+0x213/0x2f0 [amdgpu] amdgpu_dm_irq_handler+0x8a/0x200 [amdgpu] amdgpu_irq_dispatch+0xd4/0x220 [amdgpu] amdgpu_ih_process+0x7f/0x110 [amdgpu] amdgpu_irq_handler+0x1f/0x70 [amdgpu] __handle_irq_event_percpu+0x46/0x1b0 handle_irq_event+0x34/0x80 handle_edge_irq+0x9f/0x240 __common_interrupt+0x66/0x110 common_interrupt+0x5c/0xd0 asm_common_interrupt+0x22/0x40 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31Revert "drm/amd/display: Block optimize on consecutive FAMS enables"Michel Dänzer
This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2. It depends on its parent commit, which we want to revert. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> [Hamza: fix a whitespace issue in dcn30_prepare_bandwidth()] Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-31drm/amd/pm: reverse mclk and fclk clocks levels for renoirTim Huang
This patch reverses the DPM clocks levels output of pp_dpm_mclk and pp_dpm_fclk for renoir. On dGPUs and older APUs we expose the levels from lowest clocks to highest clocks. But for some APUs, the clocks levels are given the reversed orders by PMFW. Like the memory DPM clocks that are exposed by pp_dpm_mclk. It's not intuitive that they are reversed on these APUs. All tools and software that talks to the driver then has to know different ways to interpret the data depending on the asic. So we need to reverse them to expose the clocks levels from the driver consistently. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31drm/amd/pm: reverse mclk and fclk clocks levels for vangoghTim Huang
This patch reverses the DPM clocks levels output of pp_dpm_mclk and pp_dpm_fclk. On dGPUs and older APUs we expose the levels from lowest clocks to highest clocks. But for some APUs, the clocks levels that from the DFPstateTable are given the reversed orders by PMFW. Like the memory DPM clocks that are exposed by pp_dpm_mclk. It's not intuitive that they are reversed on these APUs. All tools and software that talks to the driver then has to know different ways to interpret the data depending on the asic. So we need to reverse them to expose the clocks levels from the driver consistently. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31drm/amd/pm: reverse mclk and fclk clocks levels for yellow carpTim Huang
This patch reverses the DPM clocks levels output of pp_dpm_mclk and pp_dpm_fclk. On dGPUs and older APUs we expose the levels from lowest clocks to highest clocks. But for some APUs, the clocks levels that from the DFPstateTable are given the reversed orders by PMFW. Like the memory DPM clocks that are exposed by pp_dpm_mclk. It's not intuitive that they are reversed on these APUs. All tools and software that talks to the driver then has to know different ways to interpret the data depending on the asic. So we need to reverse them to expose the clocks levels from the driver consistently. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31drm/amd/pm: reverse mclk clocks levels for SMU v13.0.5Tim Huang
This patch reverses the DPM clocks levels output of pp_dpm_mclk. On dGPUs and older APUs we expose the levels from lowest clocks to highest clocks. But for some APUs, the clocks levels that from the DFPstateTable are given the reversed orders by PMFW. Like the memory DPM clocks that are exposed by pp_dpm_mclk. It's not intuitive that they are reversed on these APUs. All tools and software that talks to the driver then has to know different ways to interpret the data depending on the asic. So we need to reverse them to expose the clocks levels from the driver consistently. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31drm/amd/pm: reverse mclk and fclk clocks levels for SMU v13.0.4Tim Huang
This patch reverses the DPM clocks levels output of pp_dpm_mclk and pp_dpm_fclk. On dGPUs and older APUs we expose the levels from lowest clocks to highest clocks. But for some APUs, the clocks levels that from the DFPstateTable are given the reversed orders by PMFW. Like the memory DPM clocks that are exposed by pp_dpm_mclk. It's not intuitive that they are reversed on these APUs. All tools and software that talks to the driver then has to know different ways to interpret the data depending on the asic. So we need to reverse them to expose the clocks levels from the driver consistently. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-31mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()Dan Carpenter
There was a bug where this code forgot to unlock the tdev->mutex if the kzalloc() failed. Fix this issue, by moving the allocation outside the lock. Fixes: 2d1e952a2b8e ("mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Lee Jones <lee@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2023-05-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix 64K ARM page size support in bnxt_re and efa - bnxt_re fixes for a memory leak, incorrect error handling and a remove a bogus FW failure when running on a VF - Update MAINTAINERS for hns and efa - Fix two rxe regressions added this merge window in error unwind and incorrect spinlock primitives - hns gets a better algorithm for allocating page tables to avoid running out of resources, and a timeout adjustment - Fix a text case failure in hns - Use after free in irdma and fix incorrect construction of a WQE causing mis-execution * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/irdma: Fix Local Invalidate fencing RDMA/irdma: Prevent QP use after free MAINTAINERS: Update maintainer of Amazon EFA driver RDMA/bnxt_re: Do not enable congestion control on VFs RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx RDMA/bnxt_re: Fix a possible memory leak RDMA/hns: Modify the value of long message loopback slice RDMA/hns: Fix base address table allocation RDMA/hns: Fix timeout attr in query qp for HIP08 RDMA/efa: Fix unsupported page sizes in device RDMA/rxe: Convert spin_{lock_bh,unlock_bh} to spin_{lock_irqsave,unlock_irqrestore} RDMA/rxe: Fix double unlock in rxe_qp.c MAINTAINERS: Update maintainers of HiSilicon RoCE RDMA/bnxt_re: Fix the page_size used during the MR creation
2023-05-31Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix two regressions in ext4 and a number of issues reported by syzbot" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: enable the lazy init thread when remounting read/write ext4: fix fsync for non-directories ext4: add lockdep annotations for i_data_sem for ea_inode's ext4: disallow ea_inodes with extended attributes ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find() ext4: add EA_INODE checking to ext4_iget()
2023-05-31nvme: fix the name of Zone Append for verbose loggingChristoph Hellwig
No Management involved in Zone Appened. Fixes: bd83fe6f2cd2 ("nvme: add verbose error logging") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-05-31scsi: stex: Fix gcc 13 warningsBart Van Assche
gcc 13 may assign another type to enumeration constants than gcc 12. Split the large enum at the top of source file stex.c such that the type of the constants used in time expressions is changed back to the same type chosen by gcc 12. This patch suppresses compiler warnings like this one: In file included from ./include/linux/bitops.h:7, from ./include/linux/kernel.h:22, from drivers/scsi/stex.c:13: drivers/scsi/stex.c: In function ‘stex_common_handshake’: ./include/linux/typecheck.h:12:25: error: comparison of distinct pointer types lacks a cast [-Werror] 12 | (void)(&__dummy == &__dummy2); \ | ^~ ./include/linux/jiffies.h:106:10: note: in expansion of macro ‘typecheck’ 106 | typecheck(unsigned long, b) && \ | ^~~~~~~~~ drivers/scsi/stex.c:1035:29: note: in expansion of macro ‘time_after’ 1035 | if (time_after(jiffies, before + MU_MAX_DELAY * HZ)) { | ^~~~~~~~~~ See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107405. Cc: stable@vger.kernel.org Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230529195034.3077-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31HID: logitech-hidpp: Handle timeout differently from busyBastien Nocera
If an attempt at contacting a receiver or a device fails because the receiver or device never responds, don't restart the communication, only restart it if the receiver or device answers that it's busy, as originally intended. This was the behaviour on communication timeout before commit 586e8fede795 ("HID: logitech-hidpp: Retry commands when device is busy"). This fixes some overly long waits in a critical path on boot, when checking whether the device is connected by getting its HID++ version. Signed-off-by: Bastien Nocera <hadess@hadess.net> Suggested-by: Mark Lord <mlord@pobox.com> Fixes: 586e8fede795 ("HID: logitech-hidpp: Retry commands when device is busy") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217412 Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2023-05-31riscv: Fix relocatable kernels with early alternatives using -fno-pieAlexandre Ghiti
Early alternatives are called with the mmu disabled, and then should not access any global symbols through the GOT since it requires relocations, relocations that we do before but *virtually*. So only use medany code model for this early code. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> # booted on nezha & unmatched Fixes: 39b33072941f ("riscv: Introduce CONFIG_RELOCATABLE") Link: https://lore.kernel.org/r/20230526154630.289374-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-05-31nfsd: fix double fget() bug in __write_ports_addfd()Dan Carpenter
The bug here is that you cannot rely on getting the same socket from multiple calls to fget() because userspace can influence that. This is a kind of double fetch bug. The fix is to delete the svc_alien_sock() function and instead do the checking inside the svc_addsock() function. Fixes: 3064639423c4 ("nfsd: check passed socket's net matches NFSd superblock's one") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-05-31tracing/probe: trace_probe_primary_from_call(): checked list_first_entryPietro Borrello
All callers of trace_probe_primary_from_call() check the return value to be non NULL. However, the function returns list_first_entry(&tpe->probes, ...) which can never be NULL. Additionally, it does not check for the list being possibly empty, possibly causing a type confusion on empty lists. Use list_first_entry_or_null() which solves both problems. Link: https://lore.kernel.org/linux-trace-kernel/20230128-list-entry-null-check-v1-1-8bde6a3da2ef@diag.uniroma1.it/ Fixes: 60d53e2c3b75 ("tracing/probe: Split trace_event related data from trace_probe") Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Mukesh Ojha <quic_mojha@quicinc.com> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-05-31udp6: Fix race condition in udp6_sendmsg & connectVladislav Efanov
Syzkaller got the following report: BUG: KASAN: use-after-free in sk_setup_caps+0x621/0x690 net/core/sock.c:2018 Read of size 8 at addr ffff888027f82780 by task syz-executor276/3255 The function sk_setup_caps (called by ip6_sk_dst_store_flow-> ip6_dst_store) referenced already freed memory as this memory was freed by parallel task in udpv6_sendmsg->ip6_sk_dst_lookup_flow-> sk_dst_check. task1 (connect) task2 (udp6_sendmsg) sk_setup_caps->sk_dst_set | | sk_dst_check-> | sk_dst_set | dst_release sk_setup_caps references | to already freed dst_entry| The reason for this race condition is: sk_setup_caps() keeps using the dst after transferring the ownership to the dst cache. Found by Linux Verification Center (linuxtesting.org) with syzkaller. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-31net/netlink: fix NETLINK_LIST_MEMBERSHIPS length reportPedro Tammela
The current code for the length calculation wrongly truncates the reported length of the groups array, causing an under report of the subscribed groups. To fix this, use 'BITS_TO_BYTES()' which rounds up the division by 8. Fixes: b42be38b2778 ("netlink: add API to retrieve all group memberships") Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230529153335.389815-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30net: sched: fix NULL pointer dereference in mq_attachZhengchao Shao
When use the following command to test: 1)ip link add bond0 type bond 2)ip link set bond0 up 3)tc qdisc add dev bond0 root handle ffff: mq 4)tc qdisc replace dev bond0 parent ffff:fff1 handle ffff: mq The kernel reports NULL pointer dereference issue. The stack information is as follows: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Internal error: Oops: 0000000096000006 [#1] SMP Modules linked in: pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mq_attach+0x44/0xa0 lr : qdisc_graft+0x20c/0x5cc sp : ffff80000e2236a0 x29: ffff80000e2236a0 x28: ffff0000c0e59d80 x27: ffff0000c0be19c0 x26: ffff0000cae3e800 x25: 0000000000000010 x24: 00000000fffffff1 x23: 0000000000000000 x22: ffff0000cae3e800 x21: ffff0000c9df4000 x20: ffff0000c9df4000 x19: 0000000000000000 x18: ffff80000a934000 x17: ffff8000f5b56000 x16: ffff80000bb08000 x15: 0000000000000000 x14: 0000000000000000 x13: 6b6b6b6b6b6b6b6b x12: 6b6b6b6b00000001 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : ffff0000c0be0730 x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008 x5 : ffff0000cae3e864 x4 : 0000000000000000 x3 : 0000000000000001 x2 : 0000000000000001 x1 : ffff8000090bc23c x0 : 0000000000000000 Call trace: mq_attach+0x44/0xa0 qdisc_graft+0x20c/0x5cc tc_modify_qdisc+0x1c4/0x664 rtnetlink_rcv_msg+0x354/0x440 netlink_rcv_skb+0x64/0x144 rtnetlink_rcv+0x28/0x34 netlink_unicast+0x1e8/0x2a4 netlink_sendmsg+0x308/0x4a0 sock_sendmsg+0x64/0xac ____sys_sendmsg+0x29c/0x358 ___sys_sendmsg+0x90/0xd0 __sys_sendmsg+0x7c/0xd0 __arm64_sys_sendmsg+0x2c/0x38 invoke_syscall+0x54/0x114 el0_svc_common.constprop.1+0x90/0x174 do_el0_svc+0x3c/0xb0 el0_svc+0x24/0xec el0t_64_sync_handler+0x90/0xb4 el0t_64_sync+0x174/0x178 This is because when mq is added for the first time, qdiscs in mq is set to NULL in mq_attach(). Therefore, when replacing mq after adding mq, we need to initialize qdiscs in the mq before continuing to graft. Otherwise, it will couse NULL pointer dereference issue in mq_attach(). And the same issue will occur in the attach functions of mqprio, taprio and htb. ffff:fff1 means that the repalce qdisc is ingress. Ingress does not allow any qdisc to be attached. Therefore, ffff:fff1 is incorrectly used, and the command should be dropped. Fixes: 6ec1c69a8f64 ("net_sched: add classful multiqueue dummy scheduler") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Tested-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20230527093747.3583502-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30Merge branch 'net-sched-fixes-for-sch_ingress-and-sch_clsact'Jakub Kicinski
Peilin Ye says: ==================== net/sched: Fixes for sch_ingress and sch_clsact These are v6 fixes for ingress and clsact Qdiscs, including only first 4 patches (already tested and reviewed) from v5. Patch 5 and 6 from previous versions are still under discussion and will be sent separately. [a] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b Link to v5: https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/ Link to v4: https://lore.kernel.org/r/cover.1684825171.git.peilin.ye@bytedance.com/ Link to v3 (incomplete): https://lore.kernel.org/r/cover.1684821877.git.peilin.ye@bytedance.com/ Link to v2: https://lore.kernel.org/r/cover.1684796705.git.peilin.ye@bytedance.com/ Link to v1: https://lore.kernel.org/r/cover.1683326865.git.peilin.ye@bytedance.com/ ==================== Link: https://lore.kernel.org/r/cover.1685388545.git.peilin.ye@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30net/sched: Prohibit regrafting ingress or clsact QdiscsPeilin Ye
Currently, after creating an ingress (or clsact) Qdisc and grafting it under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under e.g. a TBF Qdisc: $ ip link add ifb0 type ifb $ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000 $ tc qdisc add dev ifb0 clsact $ tc qdisc link dev ifb0 handle ffff: parent 1:1 $ tc qdisc show dev ifb0 qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms qdisc clsact ffff: parent ffff:fff1 refcnt 2 ^^^^^^^^ clsact's refcount has increased: it is now grafted under both TC_H_CLSACT and 1:1. ingress and clsact Qdiscs should only be used under TC_H_INGRESS (TC_H_CLSACT). Prohibit regrafting them. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: 1f211a1b929c ("net, sched: add clsact qdisc") Tested-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) QdiscsPeilin Ye
Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1 (TC_H_INGRESS, TC_H_CLSACT): $ ip link add name ifb0 type ifb $ tc qdisc add dev ifb0 parent ffff:fff1 htb $ tc qdisc add dev ifb0 clsact Error: Exclusivity flag on, cannot modify. $ drgn ... >>> ifb0 = netdev_get_by_name(prog, "ifb0") >>> qdisc = ifb0.ingress_queue.qdisc_sleeping >>> print(qdisc.ops.id.string_().decode()) htb >>> qdisc.flags.value_() # TCQ_F_INGRESS 2 Only allow ingress and clsact Qdiscs under ffff:fff1. Return -EINVAL for everything else. Make TCQ_F_INGRESS a static flag of ingress and clsact Qdiscs. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: 1f211a1b929c ("net, sched: add clsact qdisc") Tested-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>