summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-16scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()Dan Carpenter
The "mybuf" string comes from the user, so we need to ensure that it is NUL terminated. Link: https://lore.kernel.org/r/20211214070527.GA27934@kili Fixes: bd2cdd5e400f ("scsi: lpfc: NVME Initiator: Add debugfs support") Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-13scsi: pm8001: Fix phys_to_virt() usage on dma_addr_tJohn Garry
The driver supports a "direct" mode of operation, where the SMP req frame is directly copied into the command payload (and vice-versa for the SMP resp). To get at the SMP req frame data in the scatterlist the driver uses phys_to_virt() on the DMA mapped memory dma_addr_t . This is broken, and subsequently crashes as follows when an IOMMU is enabled: Unable to handle kernel paging request at virtual address ffff0000fcebfb00 ... pc : pm80xx_chip_smp_req+0x2d0/0x3d0 lr : pm80xx_chip_smp_req+0xac/0x3d0 pm80xx_chip_smp_req+0x2d0/0x3d0 pm8001_task_exec.constprop.0+0x368/0x520 pm8001_queue_command+0x1c/0x30 smp_execute_task_sg+0xdc/0x204 sas_discover_expander.part.0+0xac/0x6cc sas_discover_root_expander+0x8c/0x150 sas_discover_domain+0x3ac/0x6a0 process_one_work+0x1d0/0x354 worker_thread+0x13c/0x470 kthread+0x17c/0x190 ret_from_fork+0x10/0x20 Code: 371806e1 910006d6 6b16033f 54000249 (38766b05) ---[ end trace b91d59aaee98ea2d ]--- note: kworker/u192:0[7] exited with preempt_count 1 Instead use kmap_atomic(). -- Difference to v1: - use kmap_atomic() in both locations Difference to v2: - add whitespace around arithmetic (Damien) Link: https://lore.kernel.org/r/1639390248-213603-1-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06scsi: qla2xxx: Format log strings only if neededRoman Bolshakov
Commit 598a90f2002c ("scsi: qla2xxx: add ring buffer for tracing debug logs") introduced unconditional log string formatting to ql_dbg() even if ql_dbg_log event is disabled. It harms performance because some strings are formatted in fastpath and/or interrupt context. Link: https://lore.kernel.org/r/20211112145446.51210-1-r.bolshakov@yadro.com Fixes: 598a90f2002c ("scsi: qla2xxx: add ring buffer for tracing debug logs") Cc: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06scsi: scsi_debug: Fix buffer size of REPORT ZONES commandShin'ichiro Kawasaki
According to ZBC and SPC specifications, the unit of ALLOCATION LENGTH field of REPORT ZONES command is byte. However, current scsi_debug implementation handles it as number of zones to calculate buffer size to report zones. When the ALLOCATION LENGTH has a large number, this results in too large buffer size and causes memory allocation failure. Fix the failure by handling ALLOCATION LENGTH as byte unit. Link: https://lore.kernel.org/r/20211207010638.124280-1-shinichiro.kawasaki@wdc.com Fixes: f0d1cf9378bd ("scsi: scsi_debug: Add ZBC zone commands") Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issueManish Rangankar
When issued LUN reset under heavy I/O we hit the qedi WARN_ON because of a mismatch in firmware I/O cmd cleanup request count and I/O cmd cleanup response count received. The mismatch is because of a race caused by the postfix increment of cmd_cleanup_cmpl. [qedi_clearsq:1295]:18: fatal error, need hard reset, cid=0x0 WARNING: CPU: 48 PID: 110963 at drivers/scsi/qedi/qedi_fw.c:1296 qedi_clearsq+0xa5/0xd0 [qedi] CPU: 48 PID: 110963 Comm: kworker/u130:0 Kdump: loaded Tainted: G W Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 04/15/2020 Workqueue: iscsi_conn_cleanup iscsi_cleanup_conn_work_fn [scsi_transport_iscsi] RIP: 0010:qedi_clearsq+0xa5/0xd0 [qedi] RSP: 0018:ffffac2162c7fd98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff975213c40ab8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9761bf816858 RDI: ffff9761bf816858 RBP: ffff975247018628 R08: 000000000000522c R09: 000000000000005b R10: 0000000000000000 R11: ffffac2162c7fbd8 R12: ffff97522e1b2be8 R13: 0000000000000000 R14: ffff97522e1b2800 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff9761bf800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1a34e3e1a0 CR3: 0000000108bb2000 CR4: 0000000000350ee0 Call Trace: qedi_ep_disconnect+0x533/0x550 [qedi] ? iscsi_dbg_trace+0x63/0x80 [scsi_transport_iscsi] ? _cond_resched+0x15/0x30 ? iscsi_suspend_queue+0x19/0x40 [libiscsi] iscsi_ep_disconnect+0xb0/0x130 [scsi_transport_iscsi] iscsi_cleanup_conn_work_fn+0x82/0x130 [scsi_transport_iscsi] process_one_work+0x1a7/0x360 ? create_worker+0x1a0/0x1a0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x22/0x40 ---[ end trace 5f1441f59082235c ]--- Link: https://lore.kernel.org/r/20211203095218.5477-1-mrangankar@marvell.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-02scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()Igor Pylypiv
Calling scsi_remove_host() before scsi_add_host() results in a crash: BUG: kernel NULL pointer dereference, address: 0000000000000108 RIP: 0010:device_del+0x63/0x440 Call Trace: device_unregister+0x17/0x60 scsi_remove_host+0xee/0x2a0 pm8001_pci_probe+0x6ef/0x1b90 [pm80xx] local_pci_probe+0x3f/0x90 We cannot call scsi_remove_host() in pm8001_alloc() because scsi_add_host() has not been called yet at that point in time. Function call tree: pm8001_pci_probe() | `- pm8001_pci_alloc() | | | `- pm8001_alloc() | | | `- scsi_remove_host() | `- scsi_add_host() Link: https://lore.kernel.org/r/20211201041627.1592487-1-ipylypiv@google.com Fixes: 05c6c029a44d ("scsi: pm80xx: Increase number of supported queues") Reviewed-by: Vishakha Channapattan <vishakhavc@google.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-29scsi: ufs: ufs-pci: Add support for Intel ADLAdrian Hunter
Add PCI ID and callbacks to support Intel Alder Lake. Link: https://lore.kernel.org/r/20211124204218.1784559-1-adrian.hunter@intel.com Cc: stable@vger.kernel.org # v5.15+ Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-23scsi: lpfc: Fix non-recovery of remote ports following an unsolicited LOGOJames Smart
A commit introduced formal regstration of all Fabric nodes to the SCSI transport as well as REG/UNREG RPI mailbox requests. The commit introduced the NLP_RELEASE_RPI flag for rports set in the lpfc_cmpl_els_logo_acc() routine to help clean up the RPIs. This new code caused the driver to release the RPI value used for the remote port and marked the RPI invalid. When the driver later attempted to re-login, it would use the invalid RPI and the adapter rejected the PLOGI request. As no login occurred, the devloss timer on the rport expired and connectivity was lost. This patch corrects the code by removing the snippet that requests the rpi to be unregistered. This change only occurs on a node that is already marked to be rediscovered. This puts the code back to its original behavior, preserving the already-assigned rpi value (registered or not) which can be used on the re-login attempts. Link: https://lore.kernel.org/r/20211123165646.62740-1-jsmart2021@gmail.com Fixes: fe83e3b9b422 ("scsi: lpfc: Fix node handling for Fabric Controller and Domain Controller") Cc: <stable@vger.kernel.org> # v5.14+ Co-developed-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-22scsi: scsi_debug: Zero clear zones at reset write pointerShin'ichiro Kawasaki
When a reset is requested the position of the write pointer is updated but the data in the corresponding zone is not cleared. Instead scsi_debug returns any data written before the write pointer was reset. This is an error and prevents using scsi_debug for stale page cache testing of the BLKRESETZONE ioctl. Zero written data in the zone when resetting the write pointer. Link: https://lore.kernel.org/r/20211122061223.298890-1-shinichiro.kawasaki@wdc.com Fixes: f0d1cf9378bd ("scsi: scsi_debug: Add ZBC zone commands") Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-22scsi: core: sysfs: Fix setting device state to SDEV_RUNNINGMike Christie
This fixes an issue added in commit 4edd8cd4e86d ("scsi: core: sysfs: Fix hang when device state is set via sysfs") where if userspace is requesting to set the device state to SDEV_RUNNING when the state is already SDEV_RUNNING, we return -EINVAL instead of count. The commmit above set ret to count for this case, when it should have set it to 0. Link: https://lore.kernel.org/r/20211120164917.4924-1-michael.christie@oracle.com Fixes: 4edd8cd4e86d ("scsi: core: sysfs: Fix hang when device state is set via sysfs") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-22scsi: scsi_debug: Sanity check block descriptor length in resp_mode_select()George Kennedy
In resp_mode_select() sanity check the block descriptor len to avoid UAF. BUG: KASAN: use-after-free in resp_mode_select+0xa4c/0xb40 drivers/scsi/scsi_debug.c:2509 Read of size 1 at addr ffff888026670f50 by task scsicmd/15032 CPU: 1 PID: 15032 Comm: scsicmd Not tainted 5.15.0-01d0625 #15 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Call Trace: <TASK> dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:107 print_address_description.constprop.9+0x28/0x160 mm/kasan/report.c:257 kasan_report.cold.14+0x7d/0x117 mm/kasan/report.c:443 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report_generic.c:306 resp_mode_select+0xa4c/0xb40 drivers/scsi/scsi_debug.c:2509 schedule_resp+0x4af/0x1a10 drivers/scsi/scsi_debug.c:5483 scsi_debug_queuecommand+0x8c9/0x1e70 drivers/scsi/scsi_debug.c:7537 scsi_queue_rq+0x16b4/0x2d10 drivers/scsi/scsi_lib.c:1521 blk_mq_dispatch_rq_list+0xb9b/0x2700 block/blk-mq.c:1640 __blk_mq_sched_dispatch_requests+0x28f/0x590 block/blk-mq-sched.c:325 blk_mq_sched_dispatch_requests+0x105/0x190 block/blk-mq-sched.c:358 __blk_mq_run_hw_queue+0xe5/0x150 block/blk-mq.c:1762 __blk_mq_delay_run_hw_queue+0x4f8/0x5c0 block/blk-mq.c:1839 blk_mq_run_hw_queue+0x18d/0x350 block/blk-mq.c:1891 blk_mq_sched_insert_request+0x3db/0x4e0 block/blk-mq-sched.c:474 blk_execute_rq_nowait+0x16b/0x1c0 block/blk-exec.c:63 sg_common_write.isra.18+0xeb3/0x2000 drivers/scsi/sg.c:837 sg_new_write.isra.19+0x570/0x8c0 drivers/scsi/sg.c:775 sg_ioctl_common+0x14d6/0x2710 drivers/scsi/sg.c:941 sg_ioctl+0xa2/0x180 drivers/scsi/sg.c:1166 __x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:52 do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:50 entry_SYSCALL_64_after_hwframe+0x44/0xae arch/x86/entry/entry_64.S:113 Link: https://lore.kernel.org/r/1637262208-28850-1-git-send-email-george.kennedy@oracle.com Reported-by: syzkaller <syzkaller@googlegroups.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: target: configfs: Delete unnecessary checks for NULLDan Carpenter
The "item" pointer is always going to be valid pointer and does not need to be checked. But if "item" were NULL then item_to_lun() would not return a NULL, but instead, the container_of() pointer math would return a value in the error pointer range. This confuses static checkers since it looks like a NULL vs IS_ERR() bug. Delete the bogus checks. Link: https://lore.kernel.org/r/20211118084900.GA24550@kili Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: target: core: Use RCU helpers for INQUIRY t10_alua_tg_pt_gpMike Christie
Fix the sparse warnings about t10_alua_tg_pt_gp accesses in target_core_spc.c caused by commit 7324f47d4293 ("scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path") That commit replaced the lun_tg_pt_gp_lock use in the I/O path, but it didn't update the INQUIRY code. Link: https://lore.kernel.org/r/20211117213928.8634-1-michael.christie@oracle.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: mpt3sas: Fix incorrect system timestampSreekanth Reddy
For updating the IOC firmware's timestamp with system timestamp, the driver issues the Mpi26IoUnitControlRequest message. While framing the Mpi26IoUnitControlRequest, the driver should copy the lower 32 bits of the current timestamp into IOCParameterValue field and the higher 32 bits into Reserved7 field. Link: https://lore.kernel.org/r/20211117123215.25487-1-sreekanth.reddy@broadcom.com Fixes: f98790c00375 ("scsi: mpt3sas: Sync time periodically between driver and firmware") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: mpt3sas: Fix system going into read-only modeSreekanth Reddy
While determining the SAS address of a drive, the driver checks whether the handle number is less than the HBA phy count or not. If the handle number is less than the HBA phy count then driver assumes that this handle belongs to HBA and hence it assigns the HBA SAS address. During IOC firmware downgrade operation, if the number of HBA phys is reduced and the OS drive's device handle drops below the phy count while determining the drive's SAS address, the driver ends up using the HBA's SAS address. This leads to a mismatch of drive's SAS address and hence the driver unregisters the OS drive and the system goes into read-only mode. Update the IOC's num_phys to the HBA phy count provided by actual loaded firmware. Link: https://lore.kernel.org/r/20211117105058.3505-1-sreekanth.reddy@broadcom.com Fixes: a5e99fda0172 ("scsi: mpt3sas: Update hba_port objects after host reset") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: mpt3sas: Fix kernel panic during drive powercycle testSreekanth Reddy
While looping over shost's sdev list it is possible that one of the drives is getting removed and its sas_target object is freed but its sdev object remains intact. Consequently, a kernel panic can occur while the driver is trying to access the sas_address field of sas_target object without also checking the sas_target object for NULL. Link: https://lore.kernel.org/r/20211117104909.2069-1-sreekanth.reddy@broadcom.com Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: ufs: ufs-mediatek: Add put_device() after of_find_device_by_node()Ye Guojin
This was found by coccicheck: ./drivers/scsi/ufs/ufs-mediatek.c, 211, 1-7, ERROR missing put_device; call of_find_device_by_node on line 1185, but without a corresponding object release within this function. Link: https://lore.kernel.org/r/20211110105133.150171-1-ye.guojin@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: scsi_debug: Fix type in min_t to avoid stack OOBGeorge Kennedy
Change min_t() to use type "u32" instead of type "int" to avoid stack out of bounds. With min_t() type "int" the values get sign extended and the larger value gets used causing stack out of bounds. BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:191 [inline] BUG: KASAN: stack-out-of-bounds in sg_copy_buffer+0x1de/0x240 lib/scatterlist.c:976 Read of size 127 at addr ffff888072607128 by task syz-executor.7/18707 CPU: 1 PID: 18707 Comm: syz-executor.7 Not tainted 5.15.0-syzk #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:106 print_address_description.constprop.9+0x28/0x160 mm/kasan/report.c:256 __kasan_report mm/kasan/report.c:442 [inline] kasan_report.cold.14+0x7d/0x117 mm/kasan/report.c:459 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x1a3/0x210 mm/kasan/generic.c:189 memcpy+0x23/0x60 mm/kasan/shadow.c:65 memcpy include/linux/fortify-string.h:191 [inline] sg_copy_buffer+0x1de/0x240 lib/scatterlist.c:976 sg_copy_from_buffer+0x33/0x40 lib/scatterlist.c:1000 fill_from_dev_buffer.part.34+0x82/0x130 drivers/scsi/scsi_debug.c:1162 fill_from_dev_buffer drivers/scsi/scsi_debug.c:1888 [inline] resp_readcap16+0x365/0x3b0 drivers/scsi/scsi_debug.c:1887 schedule_resp+0x4d8/0x1a70 drivers/scsi/scsi_debug.c:5478 scsi_debug_queuecommand+0x8c9/0x1ec0 drivers/scsi/scsi_debug.c:7533 scsi_dispatch_cmd drivers/scsi/scsi_lib.c:1520 [inline] scsi_queue_rq+0x16b0/0x2d40 drivers/scsi/scsi_lib.c:1699 blk_mq_dispatch_rq_list+0xb9b/0x2700 block/blk-mq.c:1639 __blk_mq_sched_dispatch_requests+0x28f/0x590 block/blk-mq-sched.c:325 blk_mq_sched_dispatch_requests+0x105/0x190 block/blk-mq-sched.c:358 __blk_mq_run_hw_queue+0xe5/0x150 block/blk-mq.c:1761 __blk_mq_delay_run_hw_queue+0x4f8/0x5c0 block/blk-mq.c:1838 blk_mq_run_hw_queue+0x18d/0x350 block/blk-mq.c:1891 blk_mq_sched_insert_request+0x3db/0x4e0 block/blk-mq-sched.c:474 blk_execute_rq_nowait+0x16b/0x1c0 block/blk-exec.c:62 sg_common_write.isra.18+0xeb3/0x2000 drivers/scsi/sg.c:836 sg_new_write.isra.19+0x570/0x8c0 drivers/scsi/sg.c:774 sg_ioctl_common+0x14d6/0x2710 drivers/scsi/sg.c:939 sg_ioctl+0xa2/0x180 drivers/scsi/sg.c:1165 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Link: https://lore.kernel.org/r/1636484247-21254-1-git-send-email-george.kennedy@oracle.com Reported-by: syzkaller <syzkaller@googlegroups.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: qla2xxx: edif: Fix off by one bug in qla_edif_app_getfcinfo()Dan Carpenter
The > comparison needs to be >= to prevent accessing one element beyond the end of the app_reply->ports[] array. Link: https://lore.kernel.org/r/20211109115219.GE16587@kili Fixes: 7878f22a2e03 ("scsi: qla2xxx: edif: Add getfcinfo and statistic bsgs") Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-18scsi: ufs: ufshpb: Fix warning in ufshpb_set_hpb_read_to_upiu()Bean Huo
Fix the following sparse warnings in ufshpb_set_hpb_read_to_upiu(): sparse warnings: (new ones prefixed by >>) drivers/scsi/ufs/ufshpb.c:335:27: sparse: sparse: cast from restricted __be64 drivers/scsi/ufs/ufshpb.c:335:25: sparse: expected restricted __be64 [usertype] ppn_tmp drivers/scsi/ufs/ufshpb.c:335:25: sparse: got unsigned long long [usertype] Link: https://lore.kernel.org/r/20211111222452.384089-1-huobean@gmail.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()Ewan D. Milne
The SCM changes set the flags in mcp->out_mb instead of mcp->in_mb so the data was not actually being read into the mcp->mb[] array from the adapter. Link: https://lore.kernel.org/r/20211108183012.13895-1-emilne@redhat.com Fixes: 9f2475fe7406 ("scsi: qla2xxx: SAN congestion management implementation") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Fix another task management completion raceAdrian Hunter
hba->outstanding_tasks, which is read under host_lock spinlock, tells the interrupt handler what task management tags are in use by the driver. The doorbell register bits indicate which tags are in use by the hardware. A doorbell bit that is 0 is because the bit has yet to be set by the driver, or because the task is complete. It is only possible to disambiguate the 2 cases, if reading/writing the doorbell register is synchronized with reading/writing hba->outstanding_tasks. For that reason, reading REG_UTP_TASK_REQ_DOOR_BELL must be done under spinlock. Link: https://lore.kernel.org/r/20211108064815.569494-3-adrian.hunter@intel.com Fixes: f5ef336fd2e4 ("scsi: ufs: core: Fix task management completion") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Fix task management completion timeout raceAdrian Hunter
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, which races with the completion function ufshcd_tmc_handler() which expects req->end_io_data to have a value. Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already synchronized using hba->tmf_rqs and hba->outstanding_tasks under the host_lock spinlock. It is also not necessary (nor typical) to clear req->end_io_data because the block layer does it before allocating out requests e.g. via blk_get_request(). So fix by not clearing it. Link: https://lore.kernel.org/r/20211108064815.569494-2-adrian.hunter@intel.com Fixes: f5ef336fd2e4 ("scsi: ufs: core: Fix task management completion") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: core: sysfs: Fix hang when device state is set via sysfsMike Christie
This fixes a regression added with: commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") The problem is that after iSCSI recovery, iscsid will call into the kernel to set the dev's state to running, and with that patch we now call scsi_rescan_device() with the state_mutex held. If the SCSI error handler thread is just starting to test the device in scsi_send_eh_cmnd() then it's going to try to grab the state_mutex. We are then stuck, because when scsi_rescan_device() tries to send its I/O scsi_queue_rq() calls -> scsi_host_queue_ready() -> scsi_host_in_recovery() which will return true (the host state is still in recovery) and I/O will just be requeued. scsi_send_eh_cmnd() will then never be able to grab the state_mutex to finish error handling. To prevent the deadlock move the rescan-related code to after we drop the state_mutex. This also adds a check for if we are already in the running state. This prevents extra scans and helps the iscsid case where if the transport class has already onlined the device during its recovery process then we don't need userspace to do it again plus possibly block that daemon. Link: https://lore.kernel.org/r/20211105221048.6541-3-michael.christie@oracle.com Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") Cc: Bart Van Assche <bvanassche@acm.org> Cc: lijinlin <lijinlin3@huawei.com> Cc: Wu Bo <wubo40@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: iscsi: Unblock session then wake up error handlerMike Christie
We can race where iscsi_session_recovery_timedout() has woken up the error handler thread and it's now setting the devices to offline, and session_recovery_timedout()'s call to scsi_target_unblock() is also trying to set the device's state to transport-offline. We can then get a mix of states. For the case where we can't relogin we want the devices to be in transport-offline so when we have repaired the connection __iscsi_unblock_session() can set the state back to running. Set the device state then call into libiscsi to wake up the error handler. Link: https://lore.kernel.org/r/20211105221048.6541-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-16scsi: ufs: core: Improve SCSI abort handlingBart Van Assche
The following has been observed on a test setup: WARNING: CPU: 4 PID: 250 at drivers/scsi/ufs/ufshcd.c:2737 ufshcd_queuecommand+0x468/0x65c Call trace: ufshcd_queuecommand+0x468/0x65c scsi_send_eh_cmnd+0x224/0x6a0 scsi_eh_test_devices+0x248/0x418 scsi_eh_ready_devs+0xc34/0xe58 scsi_error_handler+0x204/0x80c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 That warning is triggered by the following statement: WARN_ON(lrbp->cmd); Fix this warning by clearing lrbp->cmd from the abort handler. Link: https://lore.kernel.org/r/20211104181059.4129537-1-bvanassche@acm.org Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-14Linux 5.16-rc1Linus Torvalds
2021-11-14kconfig: Add support for -Wimplicit-fallthroughGustavo A. R. Silva
Add Kconfig support for -Wimplicit-fallthrough for both GCC and Clang. The compiler option is under configuration CC_IMPLICIT_FALLTHROUGH, which is enabled by default. Special thanks to Nathan Chancellor who fixed the Clang bug[1][2]. This bugfix only appears in Clang 14.0.0, so older versions still contain the bug and -Wimplicit-fallthrough won't be enabled for them, for now. This concludes a long journey and now we are finally getting rid of the unintentional fallthrough bug-class in the kernel, entirely. :) Link: https://github.com/llvm/llvm-project/commit/9ed4a94d6451046a51ef393cd62f00710820a7e8 [1] Link: https://bugs.llvm.org/show_bug.cgi?id=51094 [2] Link: https://github.com/KSPP/linux/issues/115 Link: https://github.com/ClangBuiltLinux/linux/issues/236 Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-14Merge tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs cleanups from Darrick Wong: "The most 'exciting' aspect of this branch is that the xfsprogs maintainer and I have worked through the last of the code discrepancies between kernel and userspace libxfs such that there are no code differences between the two except for #includes. IOWs, diff suffices to demonstrate that the userspace tools behave the same as the kernel, and kernel-only bits are clearly marked in the /kernel/ source code instead of just the userspace source. Summary: - Clean up open-coded swap() calls. - A little bit of #ifdef golf to complete the reunification of the kernel and userspace libxfs source code" * tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: sync xfs_btree_split macros with userspace libxfs xfs: #ifdef out perag code for userspace xfs: use swap() to make dabtree code cleaner
2021-11-14Merge tag 'for-5.16/parisc-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull more parisc fixes from Helge Deller: "Fix a build error in stracktrace.c, fix resolving of addresses to function names in backtraces, fix single-stepping in assembly code and flush userspace pte's when using set_pte_at()" * tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc/entry: fix trace test in syscall exit path parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page parisc: Fix implicit declaration of function '__kernel_text_address' parisc: Fix backtrace to always include init funtion names
2021-11-14Merge tag 'sh-for-5.16' of git://git.libc.org/linux-shLinus Torvalds
Pull arch/sh updates from Rich Felker. * tag 'sh-for-5.16' of git://git.libc.org/linux-sh: sh: pgtable-3level: Fix cast to pointer from integer of different size sh: fix READ/WRITE redefinition warnings sh: define __BIG_ENDIAN for math-emu sh: math-emu: drop unused functions sh: fix kconfig unmet dependency warning for FRAME_POINTER sh: Cleanup about SPARSE_IRQ sh: kdump: add some attribute to function maple: fix wrong return value of maple_bus_init(). sh: boot: avoid unneeded rebuilds under arch/sh/boot/compressed/ sh: boot: add intermediate vmlinux.bin* to targets instead of extra-y sh: boards: Fix the cacography in irq.c sh: check return code of request_irq sh: fix trivial misannotations
2021-11-14Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: - Fix early_iounmap - Drop cc-option fallbacks for architecture selection * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9156/1: drop cc-option fallbacks for architecture selection ARM: 9155/1: fix early early_iounmap()
2021-11-14Merge tag 'devicetree-fixes-for-5.16-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Two fixes due to DT node name changes on Arm, Ltd. boards - Treewide rename of Ingenic CGU headers - Update ST email addresses - Remove Netlogic DT bindings - Dropping few more cases of redundant 'maxItems' in schemas - Convert toshiba,tc358767 bridge binding to schema * tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: watchdog: sunxi: fix error in schema bindings: media: venus: Drop redundant maxItems for power-domain-names dt-bindings: Remove Netlogic bindings clk: versatile: clk-icst: Ensure clock names are unique of: Support using 'mask' in making device bus id dt-bindings: treewide: Update @st.com email address to @foss.st.com dt-bindings: media: Update maintainers for st,stm32-hwspinlock.yaml dt-bindings: media: Update maintainers for st,stm32-cec.yaml dt-bindings: mfd: timers: Update maintainers for st,stm32-timers dt-bindings: timer: Update maintainers for st,stm32-timer dt-bindings: i2c: imx: hardware do not restrict clock-frequency to only 100 and 400 kHz dt-bindings: display: bridge: Convert toshiba,tc358767.txt to yaml dt-bindings: Rename Ingenic CGU headers to ingenic,*.h
2021-11-14Merge tag 'timers-urgent-2021-11-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for POSIX CPU timers to address a problem where POSIX CPU timer delivery stops working for a new child task because copy_process() copies state information which is only valid for the parent task" * tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
2021-11-14Merge tag 'irq-urgent-2021-11-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for the interrupt subsystem Core code: - A regression fix for the Open Firmware interrupt mapping code where a interrupt controller property in a node caused a map property in the same node to be ignored. Interrupt chip drivers: - Workaround a limitation in SiFive PLIC interrupt chip which silently ignores an EOI when the interrupt line is masked. - Provide the missing mask/unmask implementation for the CSKY MP interrupt controller. PCI/MSI: - Prevent a use after free when PCI/MSI interrupts are released by destroying the sysfs entries before freeing the memory which is accessed in the sysfs show() function. - Implement a mask quirk for the Nvidia ION AHCI chip which does not advertise masking capability despite implementing it. Even worse the chip comes out of reset with all MSI entries masked, which due to the missing masking capability never get unmasked. - Move the check which prevents accessing the MSI[X] masking for XEN back into the low level accessors. The recent consolidation missed that these accessors can be invoked from places which do not have that check which broke XEN. Move them back to he original place instead of sprinkling tons of these checks all over the code" * tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: of/irq: Don't ignore interrupt-controller when interrupt-map failed irqchip/sifive-plic: Fixup EOI failed when masked irqchip/csky-mpintc: Fixup mask/unmask implementation PCI/MSI: Destroy sysfs before freeing entries PCI: Add MSI masking quirk for Nvidia ION AHCI PCI/MSI: Deal with devices lying about their MSI mask capability PCI/MSI: Move non-mask check back into low level accessors
2021-11-14Merge tag 'locking-urgent-2021-11-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 static call update from Thomas Gleixner: "A single fix for static calls to make the trampoline patching more robust by placing explicit signature bytes after the call trampoline to prevent patching random other jumps like the CFI jump table entries" * tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: static_call,x86: Robustify trampoline patching
2021-11-14Merge tag 'sched_urgent_for_v5.16_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Avoid touching ~100 config files in order to be able to select the preemption model - clear cluster CPU masks too, on the CPU unplug path - prevent use-after-free in cfs - Prevent a race condition when updating CPU cache domains - Factor out common shared part of smp_prepare_cpus() into a common helper which can be called by both baremetal and Xen, in order to fix a booting of Xen PV guests * tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: preempt: Restore preemption model selection configs arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology() sched/fair: Prevent dead task groups from regaining cfs_rq's sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() x86/smp: Factor out parts of native_smp_prepare_cpus()
2021-11-14Merge tag 'perf_urgent_for_v5.16_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Prevent unintentional page sharing by checking whether a page reference to a PMU samples page has been acquired properly before that - Make sure the LBR_SELECT MSR is saved/restored too - Reset the LBR_SELECT MSR when resetting the LBR PMU to clear any residual data left * tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Avoid put_page() when GUP fails perf/x86/vlbr: Add c->flags to vlbr event constraints perf/x86/lbr: Reset LBR_SELECT during vlbr reset
2021-11-14Merge tag 'x86_urgent_for_v5.16_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add the model number of a new, Raptor Lake CPU, to intel-family.h - Do not log spurious corrected MCEs on SKL too, due to an erratum - Clarify the path of paravirt ops patches upstream - Add an optimization to avoid writing out AMX components to sigframes when former are in init state * tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Raptor Lake to Intel family x86/mce: Add errata workaround for Skylake SKX37 MAINTAINERS: Add some information to PARAVIRT_OPS entry x86/fpu: Optimize out sigframe xfeatures when in init state
2021-11-14Merge tag 'perf-tools-for-v5.16-2021-11-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: "Hardware tracing: - ARM: * Print the size of the buffer size consistently in hexadecimal in ARM Coresight. * Add Coresight snapshot mode support. * Update --switch-events docs in 'perf record'. * Support hardware-based PID tracing. * Track task context switch for cpu-mode events. - Vendor events: * Add metric events JSON file for power10 platform perf test: - Get 'perf test' unit tests closer to kunit. - Topology tests improvements. - Remove bashisms from some tests. perf bench: - Fix memory leak of perf_cpu_map__new() in the futex benchmarks. libbpf: - Add some more weak libbpf functions o allow building with the libbpf versions, old ones, present in distros. libbeauty: - Translate [gs]setsockopt 'level' argument integer values to strings. tools headers UAPI: - Sync futex_waitv, arch prctl, sound, i195_drm and msr-index files with the kernel sources. Documentation: - Add documentation to 'struct symbol'. - Synchronize the definition of enum perf_hw_id with code in tools/perf/design.txt" * tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits) perf tests: Remove bash constructs from stat_all_pmu.sh perf tests: Remove bash construct from record+zstd_comp_decomp.sh perf test: Remove bash construct from stat_bpf_counters.sh test perf bench futex: Fix memory leak of perf_cpu_map__new() tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers UAPI: Sync sound/asound.h with the kernel sources tools headers UAPI: Sync linux/prctl.h with the kernel sources tools headers UAPI: Sync arch prctl headers with the kernel sources perf tools: Add more weak libbpf functions perf bpf: Avoid memory leak from perf_env__insert_btf() perf symbols: Factor out annotation init/exit perf symbols: Bit pack to save a byte perf symbols: Add documentation to 'struct symbol' tools headers UAPI: Sync files changed by new futex_waitv syscall perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning perf arm-spe: Support hardware-based PID tracing perf arm-spe: Save context ID in record perf arm-spe: Update --switch-events docs in 'perf record' perf arm-spe: Track task context switch for cpu-mode events ...
2021-11-14Merge tag 'irqchip-fixes-5.16-1' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Address an issue with the SiFive PLIC being unable to EOI a masked interrupt - Move the disable/enable methods in the CSky mpintc to mask/unmask - Fix a regression in the OF irq code where an interrupt-controller property in the same node as an interrupt-map property would get ignored Link: https://lore.kernel.org/all/20211112173459.4015233-1-maz@kernel.org
2021-11-13Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linuxLinus Torvalds
Pull zstd update from Nick Terrell: "Update to zstd-1.4.10. Add myself as the maintainer of zstd and update the zstd version in the kernel, which is now 4 years out of date, to a much more recent zstd release. This includes bug fixes, much more extensive fuzzing, and performance improvements. And generates the kernel zstd automatically from upstream zstd, so it is easier to keep the zstd verison up to date, and we don't fall so far out of date again. This includes 5 commits that update the zstd library version: - Adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API changes to be kernel style so that the symbols don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API and preserves the semantics, so that none of the callers need to be updated. All callers are updated in the commit, because there are zero functional changes. - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't depend on the layout of `lib/zstd/` to include every source file. This allows the next patch to be automatically generated. - Imports the zstd-1.4.10 source code. This commit is automatically generated from upstream zstd (https://github.com/facebook/zstd). - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`. - Fixes a newly added build warning for clang. The discussion around this patchset has been pretty long, so I've included a FAQ-style summary of the history of the patchset, and why we are taking this approach. Why do we need to update? ------------------------- The zstd version in the kernel is based off of zstd-1.3.1, which is was released August 20, 2017. Since then zstd has seen many bug fixes and performance improvements. And, importantly, upstream zstd is continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to older versions. So the only way to sanely get these fixes is to keep up to date with upstream zstd. There are no known security issues that affect the kernel, but we need to be able to update in case there are. And while there are no known security issues, there are relevant bug fixes. For example the problem with large kernel decompression has been fixed upstream for over 2 years [1] Additionally the performance improvements for kernel use cases are significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz: - BtrFS zstd compression at levels 1 and 3 is 5% faster - BtrFS zstd decompression+read is 15% faster - SquashFS zstd decompression+read is 15% faster - F2FS zstd compression+write at level 3 is 8% faster - F2FS zstd decompression+read is 20% faster - ZRAM decompression+read is 30% faster - Kernel zstd decompression is 35% faster - Initramfs zstd decompression+build is 5% faster On top of this, there are significant performance improvements coming down the line in the next zstd release, and the new automated update patch generation will allow us to pull them easily. How is the update patch generated? ---------------------------------- The first two patches are preparation for updating the zstd version. Then the 3rd patch in the series imports upstream zstd into the kernel. This patch is automatically generated from upstream. A script makes the necessary changes and imports it into the kernel. The changes are: - Replace all libc dependencies with kernel replacements and rewrite includes. - Remove unncessary portability macros like: #if defined(_MSC_VER). - Use the kernel xxhash instead of bundling it. This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel. The automated process makes it easy to keep the kernel version of zstd up to date. The current zstd in the kernel shares the guts of the code, but has a lot of API and minor changes to work in the kernel. This is because at the time upstream zstd was not ready to be used in the kernel envrionment as-is. But, since then upstream zstd has evolved to support being used in the kernel as-is. Why are we updating in one big patch? ------------------------------------- The 3rd patch in the series is very large. This is because it is restructuring the code, so it both deletes the existing zstd, and re-adds the new structure. Future updates will be directly proportional to the changes in upstream zstd since the last import. They will admittidly be large, as zstd is an actively developed project, and has hundreds of commits between every release. However, there is no other great alternative. One option ruled out is to replay every upstream zstd commit. This is not feasible for several reasons: - There are over 3500 upstream commits since the zstd version in the kernel. - The automation to automatically generate the kernel update was only added recently, so older commits cannot easily be imported. - Not every upstream zstd commit builds. - Only zstd releases are "supported", and individual commits may have bugs that were fixed before a release. Another option to reduce the patch size would be to first reorganize to the new file structure, and then apply the patch. However, the current kernel zstd is formatted with clang-format to be more "kernel-like". But, the new method imports zstd as-is, without additional formatting, to allow for closer correlation with upstream, and easier debugging. So the patch wouldn't be any smaller. It also doesn't make sense to import upstream zstd commit by commit going forward. Upstream zstd doesn't support production use cases running of the development branch. We have a lot of post-commit fuzzing that catches many bugs, so indiviudal commits may be buggy, but fixed before a release. So going forward, I intend to import every (important) zstd release into the Kernel. So, while it isn't ideal, updating in one big patch is the only patch I see forward. Who is responsible for this code? --------------------------------- I am. This patchset adds me as the maintainer for zstd. Previously, there was no tree for zstd patches. Because of that, there were several patches that either got ignored, or took a long time to merge, since it wasn't clear which tree should pick them up. I'm officially stepping up as maintainer, and setting up my tree as the path through which zstd patches get merged. I'll make sure that patches to the kernel zstd get ported upstream, so they aren't erased when the next version update happens. How is this code tested? ------------------------ I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS, Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and aarch64. I checked both performance and correctness. Also, thanks to many people in the community who have tested these patches locally. Lastly, this code will bake in linux-next before being merged into v5.16. Why update to zstd-1.4.10 when zstd-1.5.0 has been released? ------------------------------------------------------------ This patchset has been outstanding since 2020, and zstd-1.4.10 was the latest release when it was created. Since the update patch is automatically generated from upstream, I could generate it from zstd-1.5.0. However, there were some large stack usage regressions in zstd-1.5.0, and are only fixed in the latest development branch. And the latest development branch contains some new code that needs to bake in the fuzzer before I would feel comfortable releasing to the kernel. Once this patchset has been merged, and we've released zstd-1.5.1, we can update the kernel to zstd-1.5.1, and exercise the update process. You may notice that zstd-1.4.10 doesn't exist upstream. This release is an artifical release based off of zstd-1.4.9, with some fixes for the kernel backported from the development branch. I will tag the zstd-1.4.10 release after this patchset is merged, so the Linux Kernel is running a known version of zstd that can be debugged upstream. Why was a wrapper API added? ---------------------------- The first versions of this patchset migrated the kernel to the upstream zstd API. It first added a shim API that supported the new upstream API with the old code, then updated callers to use the new shim API, then transitioned to the new code and deleted the shim API. However, Cristoph Hellwig suggested that we transition to a kernel style API, and hide zstd's upstream API behind that. This is because zstd's upstream API is supports many other use cases, and does not follow the kernel style guide, while the kernel API is focused on the kernel's use cases, and follows the kernel style guide. Where is the previous discussion? --------------------------------- Links for the discussions of the previous versions of the patch set below. The largest changes in the design of the patchset are driven by the discussions in v11, v5, and v1. Sorry for the mix of links, I couldn't find most of the the threads on lkml.org" Link: https://lkml.org/lkml/2020/9/29/27 [1] Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12] Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11] Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10] Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9] Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8] Link: https://lkml.org/lkml/2020/12/3/1195 [v7] Link: https://lkml.org/lkml/2020/12/2/1245 [v6] Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5] Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4] Link: https://lkml.org/lkml/2020/9/23/1074 [v3] Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2] Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1] Signed-off-by: Nick Terrell <terrelln@fb.com> Tested By: Paul Jones <paul@pauljones.id.au> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64 Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf> * tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux: lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical MAINTAINERS: Add maintainer entry for zstd lib: zstd: Upgrade to latest upstream zstd version 1.4.10 lib: zstd: Add decompress_sources.h for decompress_unzstd lib: zstd: Add kernel-specific API
2021-11-13Merge tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linuxLinus Torvalds
Pull virtio-mem update from David Hildenbrand: "Support the VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature in virtio-mem, now that "accidential" access to logically unplugged memory inside added Linux memory blocks is no longer possible, because we: - Removed /dev/kmem in commit bbcd53c96071 ("drivers/char: remove /dev/kmem for good") - Disallowed access to virtio-mem device memory via /dev/mem in commit 2128f4e21aa ("virtio-mem: disallow mapping virtio-mem memory via /dev/mem") - Sanitized access to virtio-mem device memory via /proc/kcore in commit 0daa322b8ff9 ("fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages") - Sanitized access to virtio-mem device memory via /proc/vmcore in commit ce2814622e84 ("virtio-mem: kdump mode to sanitize /proc/vmcore access") The new VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature that will be required by some hypervisors implementing virtio-mem in the near future, so let's support it now that we safely can" * tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux: virtio-mem: support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
2021-11-13perf tests: Remove bash constructs from stat_all_pmu.shJames Clark
The tests were passing but without testing and were printing the following: $ ./perf test -v 90 90: perf all PMU test : --- start --- test child forked, pid 51650 Testing cpu/branch-instructions/ ./tests/shell/stat_all_pmu.sh: 10: [: Performance counter stats for 'true': 137,307 cpu/branch-instructions/ 0.001686672 seconds time elapsed 0.001376000 seconds user 0.000000000 seconds sys: unexpected operator Changing the regexes to a grep works in sh and prints this: $ ./perf test -v 90 90: perf all PMU test : --- start --- test child forked, pid 60186 [...] Testing tlb_flush.stlb_any test child finished with 0 ---- end ---- perf all PMU test: Ok Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20211028134828.65774-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13perf tests: Remove bash construct from record+zstd_comp_decomp.shJames Clark
Commit 463538a383a2 ("perf tests: Fix test 68 zstd compression for s390") inadvertently removed the -g flag from all platforms rather than just s390, because the [[ ]] construct fails in sh. Changing to single brackets restores testing of call graphs and removes the following error from the output: $ ./perf test -v 85 85: Zstd perf.data compression/decompression : --- start --- test child forked, pid 50643 Collecting compressed record file: ./tests/shell/record+zstd_comp_decomp.sh: 15: [[: not found Fixes: 463538a383a2 ("perf tests: Fix test 68 zstd compression for s390") Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20211028134828.65774-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13perf test: Remove bash construct from stat_bpf_counters.sh testJames Clark
Currently the test skips with an error because == only works in bash: $ ./perf test 91 -v Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 91: perf stat --bpf-counters test : --- start --- test child forked, pid 44586 ./tests/shell/stat_bpf_counters.sh: 26: [: -v: unexpected operator test child finished with -2 ---- end ---- perf stat --bpf-counters test: Skip Changing == to = does the same thing, but doesn't result in an error: ./perf test 91 -v Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 91: perf stat --bpf-counters test : --- start --- test child forked, pid 45833 Skipping: --bpf-counters not supported Error: unknown option `bpf-counters' [...] test child finished with -2 ---- end ---- perf stat --bpf-counters test: Skip Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20211028134828.65774-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13perf bench futex: Fix memory leak of perf_cpu_map__new()Sohaib Mohamed
ASan reports memory leaks while running: $ sudo ./perf bench futex all The leaks are caused by perf_cpu_map__new not being freed. This patch adds the missing perf_cpu_map__put since it calls cpu_map_delete implicitly. Fixes: 9c3516d1b850ea93 ("libperf: Add perf_cpu_map__new()/perf_cpu_map__read() functions") Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: André Almeida <andrealmeid@collabora.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lore.kernel.org/lkml/20211112201134.77892-1-sohaib.amhmd@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: dae1bd58389615d4 ("x86/msr-index: Add MSRs for XFD") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h --- tools/arch/x86/include/asm/msr-index.h 2021-07-15 16:17:01.819817827 -0300 +++ arch/x86/include/asm/msr-index.h 2021-11-06 15:49:33.738517311 -0300 @@ -625,6 +625,8 @@ #define MSR_IA32_BNDCFGS_RSVD 0x00000ffc +#define MSR_IA32_XFD 0x000001c4 +#define MSR_IA32_XFD_ERR 0x000001c5 #define MSR_IA32_XSS 0x00000da0 #define MSR_IA32_APICBASE 0x0000001b $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/after $ diff -u /tmp/before /tmp/after --- /tmp/before 2021-11-13 11:10:39.964201505 -0300 +++ /tmp/after 2021-11-13 11:10:47.902410873 -0300 @@ -93,6 +93,8 @@ [0x000001b0] = "IA32_ENERGY_PERF_BIAS", [0x000001b1] = "IA32_PACKAGE_THERM_STATUS", [0x000001b2] = "IA32_PACKAGE_THERM_INTERRUPT", + [0x000001c4] = "IA32_XFD", + [0x000001c5] = "IA32_XFD_ERR", [0x000001c8] = "LBR_SELECT", [0x000001c9] = "LBR_TOS", [0x000001d9] = "IA32_DEBUGCTLMSR", $ And this gets rebuilt: CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o INSTALL trace_plugins LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o LD /tmp/build/perf/trace/beauty/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf Now one can trace systemwide asking to see backtraces to where those MSRs are being read/written with: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR" ^C# # If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR" <SNIP> New filter for msr:read_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781) New filter for msr:write_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781) <SNIP> ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 pipewire/2479 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) schedule_hrtimeout_range_clock ([kernel.kallsyms]) do_epoll_wait ([kernel.kallsyms]) __x64_sys_epoll_wait ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) epoll_wait (/usr/lib64/libc-2.33.so) [0x76c4] (/usr/lib64/spa-0.2/support/libspa-support.so) [0x4cf0] (/usr/lib64/spa-0.2/support/libspa-support.so) 0.027 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) start_kernel ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Borislav Petkov <bp@suse.de> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/YY%2FJdb6on7swsn+C@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13tools headers UAPI: Sync drm/i915_drm.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: e5e32171a2cf1e43 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface") 9409eb35942713d0 ("drm/i915: Expose logical engine instance to user") ea673f17ab763879 ("drm/i915/uapi: Add comment clarifying purpose of I915_TILING_* values") d3ac8d42168a9be7 ("drm/i915/pxp: interfaces for using protected objects") cbbd3764b2399ad8 ("drm/i915/pxp: Create the arbitrary session after boot") That don't add any new ioctl, so no changes in tooling. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Huang, Sean Z <sean.z.huang@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13tools headers UAPI: Sync sound/asound.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: 5aec579e08e4f2be ("ALSA: uapi: Fix a C++ style comment in asound.h") That is just changing a // style comment to /* */. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h' diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>