linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-08-17	scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba	Arun Easi
	qla_nvme_register_hba() puts out a warning when there are not enough queue pairs available for FC-NVME. Just fail the NVME registration rather than a WARNING + call Trace. Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set ↵	Arun Easi
	anytime ql2xextended_error_logging can now be set to 1 to get the default mask value, as opposed to at module load time only. Link: https://lore.kernel.org/r/20200806111014.28434-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Reduce noisy debug message	Quinn Tran
	Update debug level and message for ELS IOCB done. Link: https://lore.kernel.org/r/20200806111014.28434-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Fix login timeout	Quinn Tran
	Multipath errors were seen during failback due to login timeout. The remote device sent LOGO, the local host tore down the session and did relogin. The RSCN arrived indicates remote device is going through failover after which the relogin is in a 20s timeout phase. At this point the driver is stuck in the relogin process. Add a fix to delete the session as part of abort/flush the login. Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Indicate correct supported speeds for Mezz card	Quinn Tran
	Correct the supported speeds for 16G Mezz card. Link: https://lore.kernel.org/r/20200806111014.28434-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Flush I/O on zone disable	Quinn Tran
	Perform implicit logout to flush I/O on zone disable. Link: https://lore.kernel.org/r/20200806111014.28434-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Flush all sessions on zone disable	Quinn Tran
	On Zone Disable, certain switches would ignore all commands. This causes timeout for both switch scan command and abort of that command. On detection of this condition, all sessions will be shutdown. Link: https://lore.kernel.org/r/20200806111014.28434-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values	Enzo Matsumiya
	Improves readability of qla_mbx.c. Link: https://lore.kernel.org/r/20200805200546.22497-1-ematsumiya@suse.de Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: scsi_debug: Fix scp is NULL errors	Douglas Gilbert
	John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The problem was tracked down to a missing critical section on a "short circuit" path. Namely, the time to process the current command so far has already exceeded the requested command duration (i.e. the number of nanoseconds in the ndelay parameter). The random=1 parameter setting was pivotal in finding this error. The failure scenario involved first taking that "short circuit" path (due to a very short command duration) and then taking the more likely hrtimer_start() path (due to a longer command duration). With random=1 each command's duration is taken from the uniformly distributed [0..ndelay) interval. The fio utility also helped by reliably generating the error scenario at about once per minute on a RPi 4 (64 bit OS). Link: https://lore.kernel.org/r/20200813155738.109298-1-dgilbert@interlog.com Reported-by: John Garry <john.garry@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: zfcp: Fix use-after-free in request timeout handlers	Steffen Maier
	Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: No need to send Abort Task if the task in DB was cleared	Bean Huo
	If the bit corresponding to a task in the Doorbell register has been cleared, no need to poll the status of the task on the device side and to send an Abort Task TM. Instead, let it directly goto cleanup. In addition, to keep original debug output, move the goto below the debug print. Link: https://lore.kernel.org/r/20200811141859.27399-3-huobean@gmail.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: Clean up completed request without interrupt notification	Stanley Chu
	If somehow no interrupt notification is raised for a completed request and its doorbell bit is cleared by host, UFS driver needs to cleanup its outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally in the following scenario: After ufshcd_abort() returns, this request will be requeued by SCSI layer with its outstanding bit set. Any future completed request will trigger ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At this time the "abnormal outstanding bit" will be detected and the "requeued request" will be chosen to execute request post-processing flow. This is wrong because this request is still "alive". Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com Reviewed-by: Can Guo <cang@codeaurora.org> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: Improve interrupt handling for shared interrupts	Adrian Hunter
	For shared interrupts, the interrupt status might be zero, so check that first. Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: Fix interrupt error message for shared interrupts	Adrian Hunter
	The interrupt might be shared, in which case it is not an error for the interrupt handler to be called when the interrupt status is zero, so don't print the message unless there was enabled interrupt status. Link: https://lore.kernel.org/r/20200811133936.19171-1-adrian.hunter@intel.com Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL	Adrian Hunter
	Intel EHL UFS host controller advertises auto-hibernate capability but it does not work correctly. Add a quirk for that. [mkp: checkpatch fix] Link: https://lore.kernel.org/r/20200810141024.28859-1-adrian.hunter@intel.com Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL") Acked-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs-mediatek: Fix incorrect time to wait link status	Stanley Chu
	Fix incorrect calculation of "ms" based waiting time in function ufs_mtk_setup_clocks(). Link: https://lore.kernel.org/r/20200809055702.20140-1-stanley.chu@mediatek.com Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: Fix possible infinite loop in ufshcd_hold	Stanley Chu
	In ufshcd_suspend(), after clk-gating is suspended and link is set as Hibern8 state, ufshcd_hold() is still possibly invoked before ufshcd_suspend() returns. For example, MediaTek's suspend vops may issue UIC commands which would call ufshcd_hold() during the command issuing flow. Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled, then ufshcd_hold() may enter infinite loops because there is no clk-ungating work scheduled or pending. In this case, ufshcd_hold() shall just bypass, and keep the link as Hibern8 state. Link: https://lore.kernel.org/r/20200809050734.18740-1-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Co-developed-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: fcoe: Fix I/O path allocation	Mike Christie
	ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of GFP_KERNEL. Link: https://lore.kernel.org/r/1596831813-9839-1-git-send-email-michael.christie@oracle.com cc: Hannes Reinecke <hare@suse.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	scsi: ufs: ti-j721e-ufs: Fix error return in ti_j721e_ufs_probe()	Jing Xiangfeng
	Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200806070135.67797-1-jingxiangfeng@huawei.com Fixes: 22617e216331 ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from David Miller: "Another batch of fixes: 1) Remove nft_compat counter flush optimization, it generates warnings from the refcount infrastructure. From Florian Westphal. 2) Fix BPF to search for build id more robustly, from Jiri Olsa. 3) Handle bogus getopt lengths in ebtables, from Florian Westphal. 4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and Oleksij Rempel. 5) Reset iter properly on mptcp sendmsg() error, from Florian Westphal. 6) Show a saner speed in bonding broadcast mode, from Jarod Wilson. 7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones. 8) Fix double unregister in bonding during namespace tear down, from Cong Wang. 9) Disable RP filter during icmp_redirect selftest, from David Ahern" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits) otx2_common: Use devm_kcalloc() in otx2_config_npa() net: qrtr: fix usage of idr in port assignment to socket selftests: disable rp_filter for icmp_redirect.sh Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" phylink: <linux/phylink.h>: fix function prototype kernel-doc warning mptcp: sendmsg: reset iter on error redux net: devlink: Remove overzealous WARN_ON with snapshots tipc: not enable tipc when ipv6 works as a module tipc: fix uninit skb->data in tipc_nl_compat_dumpit() net: Fix potential wrong skb->protocol in skb_vlan_untag() net: xdp: pull ethernet header off packet after computing skb->protocol ipvlan: fix device features bonding: fix a potential double-unregister can: j1939: add rxtimer for multipacket broadcast session can: j1939: abort multipacket broadcast session when timeout occurs can: j1939: cancel rxtimer on multipacket broadcast session complete can: j1939: fix support for multipacket broadcast message net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs' net: fddi: skfp: cfm: Remove set but unused variable 'oldstate' net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs' ...
2020-08-17	otx2_common: Use devm_kcalloc() in otx2_config_npa()	Xu Wang
	A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "devm_kcalloc". Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-17	PCI/P2PDMA: Fix build without DMA ops	Christoph Hellwig
	My commit to make DMA ops support optional missed the reference in the p2pdma code. And while the build bot didn't manage to find a config where this can happen, Matthew did. Fix this by replacing two IS_ENABLED checks with ifdefs. Fixes: 2f9237d4f6df ("dma-mapping: make support for dma ops optional") Link: https://lore.kernel.org/r/20200810124843.1532738-1-hch@lst.de Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2020-08-17	libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group	Zqiang
	Because the last member of the "nvdimm_firmware_attributes" array was not assigned a null ptr, when traversal of "grp->attrs" array is out of bounds in "create_files" func. func: create_files: ->for (i = 0, attr = grp->attrs; *attr && !error; i++, attr++) ->.... BUG: KASAN: global-out-of-bounds in create_files fs/sysfs/group.c:43 [inline] BUG: KASAN: global-out-of-bounds in internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 Read of size 8 at addr ffffffff8a2e4cf0 by task kworker/u17:10/959 CPU: 2 PID: 959 Comm: kworker/u17:10 Not tainted 5.8.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound async_run_entry_fn Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 create_files fs/sysfs/group.c:43 [inline] internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 internal_create_groups.part.0+0x90/0x140 fs/sysfs/group.c:189 internal_create_groups fs/sysfs/group.c:185 [inline] sysfs_create_groups+0x25/0x50 fs/sysfs/group.c:215 device_add_groups drivers/base/core.c:2024 [inline] device_add_attrs drivers/base/core.c:2178 [inline] device_add+0x7fd/0x1c40 drivers/base/core.c:2881 nd_async_device_register+0x12/0x80 drivers/nvdimm/bus.c:506 async_run_entry_fn+0x121/0x530 kernel/async.c:123 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 The buggy address belongs to the variable: nvdimm_firmware_attributes+0x10/0x40 Link: https://lore.kernel.org/r/20200812085501.30963-1-qiang.zhang@windriver.com Link: https://lore.kernel.org/r/20200814150509.225615-1-vaibhav@linux.ibm.com Fixes: 48001ea50d17f ("PM, libnvdimm: Add runtime firmware activation support") Reported-by: syzbot+1cf0ffe61aecf46f588f@syzkaller.appspotmail.com Reported-by: Sandipan Das <sandipan@linux.ibm.com> Reported-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-08-17	drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR	Sharat Masetty
	This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	of/address: check for invalid range.cpu_addr	Colin Ian King
	Currently invalid CPU addresses are not being sanity checked resulting in SATA setup failure on a SynQuacer SC2A11 development machine. The original check was removed by and earlier commit, so add a sanity check back in to avoid this regression. Fixes: 7a8b64d17e35 ("of/address: use range parser for of_dma_get_range") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200817113208.523805-1-colin.king@canonical.com Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-17	drm/msm/gpu: make ringbuffer readonly	Rob Clark
	The GPU has no business writing into the ringbuffer, let's make it readonly to the GPU. Fixes: 7198e6b03155 ("drm/msm: add a3xx gpu support") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	drm/msm/adreno: fix updating ring fence	Rob Clark
	We need to set it to the most recent completed fence, not the most recent submitted. Otherwise we have races where we think we can retire submits that the GPU is not finished with, if the GPU doesn't manage to overwrite the seqno before we look at it. This can show up with hang recovery if one of the submits after the crashing submit also hangs after it is replayed. Fixes: f97decac5f4c ("drm/msm: Support multiple ringbuffers") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	drm/msm/dpu: fix unitialized variable error	Rob Clark
	drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c:817 dpu_crtc_enable() error: uninitialized symbol 'request_bandwidth'. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	drm/msm/dpu: Fix scale params in plane validation	Kalyan Thota
	Plane validation uses an API drm_calc_scale which will return src/dst value as a scale ratio. when viewing the range on a scale the values should fall in as Upscale ratio < Unity scale < Downscale ratio for src/dst formula Fix the min and max scale ratios to suit the API accordingly. Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org> Tested-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	drm/msm/dpu: Fix reservation failures in modeset	Kalyan Thota
	In TEST_ONLY commit, rm global_state will duplicate the object and request for new reservations, once they pass then the new state will be swapped with the old and will be available for the Atomic Commit. This patch fixes some of missing links in the resource reservation sequence mentioned above. 1) Creation of duplicate state in test_only commit (Rob) 2) Allocate and release the resources on every modeset. 3) Avoid allocation only when active is false. In a modeset operation, swap state happens well before disable. Hence clearing reservations in disable will cause failures in modeset enable. Allow reservations to be cleared/allocated before swap, such that only newly committed resources are pushed to HW. Changes in v1: - Move the rm release to atomic_check. - Ensure resource allocation and free happens when active is not changed i.e only when mode is changed.(Rob) Changes in v2: - Handle dpu_kms_get_global_state API failure as it may return EDEADLK (swboyd). Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-08-17	drm/modeset-lock: Take the modeset BKL for legacy drivers	Daniel Vetter
	This fell off in the conversion in commit 9bcaa3fe58ab7559e71df798bcff6e0795158695 Author: Michal Orzel <michalorzel.eng@gmail.com> Date: Tue Apr 28 19:10:04 2020 +0200 drm: Replace drm_modeset_lock/unlock_all with DRM_MODESET_LOCK_ALL_* helpers but it's caught by the drm_warn_on_modeset_not_all_locked() that the legacy modeset code uses. Since this is the bkl and it's unclear what's all protected, play it safe and grab it again for legacy drivers. Unfortunately this means we need to sprinkle a few more #includes around. Also we need to add the drm_device as a parameter to the _END macro. Finally remove the mute_lock() from setcrtc, since that's now done by the macro. Cc: Alex Deucher <alexdeucher@gmail.com> References: https://gitlab.freedesktop.org/drm/amd/-/issues/1224 Fixes: 9bcaa3fe58ab ("drm: Replace drm_modeset_lock/unlock_all with DRM_MODESET_LOCK_ALL_* helpers") Cc: Michal Orzel <michalorzel.eng@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200814093842.3048472-1-daniel.vetter@ffwll.ch
2020-08-17	vfio/type1: Add proper error unwind for vfio_iommu_replay()	Alex Williamson
	The vfio_iommu_replay() function does not currently unwind on error, yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma structure to indicate IOMMU mapping. The IOMMU mappings are torn down when the domain is destroyed, but the other actions go on to cause trouble later. For example, the iommu->domain_list can be empty if we only have a non-IOMMU backed mdev attached. We don't currently check if the list is empty before getting the first entry in the list, which leads to a bogus domain pointer. If a vfio_dma entry is erroneously marked as iommu_mapped, we'll attempt to use that bogus pointer to retrieve the existing physical page addresses. This is the scenario that uncovered this issue, attempting to hot-add a vfio-pci device to a container with an existing mdev device and DMA mappings, one of which could not be pinned, causing a failure adding the new group to the existing container and setting the conditions for a subsequent attempt to explode. To resolve this, we can first check if the domain_list is empty so that we can reject replay of a bogus domain, should we ever encounter this inconsistent state again in the future. The real fix though is to add the necessary unwind support, which means cleaning up the current pinning if an IOMMU mapping fails, then walking back through the r-b tree of DMA entries, reading from the IOMMU which ranges are mapped, and unmapping and unpinning those ranges. To be able to do this, we also defer marking the DMA entry as IOMMU mapped until all entries are processed, in order to allow the unwind to know the disposition of each entry. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-08-17	vfio-pci: Avoid recursive read-lock usage	Alex Williamson
	A down_read on memory_lock is held when performing read/write accesses to MMIO BAR space, including across the copy_to/from_user() callouts which may fault. If the user buffer for these copies resides in an mmap of device MMIO space, the mmap fault handler will acquire a recursive read-lock on memory_lock. Avoid this by reducing the lock granularity. Sequential accesses requiring multiple ioread/iowrite cycles are expected to be rare, therefore typical accesses should not see additional overhead. VGA MMIO accesses are expected to be non-fatal regardless of the PCI memory enable bit to allow legacy probing, this behavior remains with a comment added. ioeventfds are now included in memory access testing, with writes dropped while memory space is disabled. Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-08-17	drm/dp_mst: Don't return error code when crtc is null	Bhawanpreet Lakha
	[Why] In certain cases the crtc can be NULL and returning -EINVAL causes atomic check to fail when it shouln't. This leads to valid configurations failing because atomic check fails. [How] Don't early return if crtc is null Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> [added stable cc] Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 8ec046716ca8 ("drm/dp_mst: Add helper to trigger modeset on affected DSC MST CRTCs") Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200814170140.24917-1-Bhawanpreet.Lakha@amd.com
2020-08-17	block: virtio_blk: fix handling single range discard request	Ming Lei
	1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts to support multi-range discard for virtio-blk. However, the virtio-blk disk may report max discard segment as 1, at least that is exactly what qemu is doing. So far, block layer switches to normal request merge if max discard segment limit is 1, and multiple bios can be merged to single segment. This way may cause memory corruption in virtblk_setup_discard_write_zeroes(). Fix the issue by handling single max discard segment in straightforward way. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Changpeng Liu <changpeng.liu@intel.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-17	block: loop: set discard granularity and alignment for block device backed loop	Ming Lei
	In case of block device backend, if the backend supports write zeros, the loop device will set queue flag of QUEUE_FLAG_DISCARD. However, limits.discard_granularity isn't setup, and this way is wrong, see the following description in Documentation/ABI/testing/sysfs-block: A discard_granularity of 0 means that the device does not support discard functionality. Especially 9b15d109a6b2 ("block: improve discard bio alignment in __blkdev_issue_discard()") starts to take q->limits.discard_granularity for computing max discard sectors. And zero discard granularity may cause kernel oops, or fail discard request even though the loop queue claims discard support via QUEUE_FLAG_DISCARD. Fix the issue by setup discard granularity and alignment. Fixes: c52abf563049 ("loop: Better discard support for block devices") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Coly Li <colyli@suse.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Xiao Ni <xni@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Evan Green <evgreen@chromium.org> Cc: Gwendal Grignou <gwendal@chromium.org> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-17	usb: dwc3: gadget: Handle ZLP for sg requests	Thinh Nguyen
	Currently dwc3 doesn't handle usb_request->zero for SG requests. This change checks and prepares extra TRBs for the ZLP for SG requests. Cc: <stable@vger.kernel.org> # v4.5+ Fixes: 04c03d10e507 ("usb: dwc3: gadget: handle request->zero") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-08-17	usb: dwc3: gadget: Fix handling ZLP	Thinh Nguyen
	The usb_request->zero doesn't apply for isoc. Also, if we prepare a 0-length (ZLP) TRB for the OUT direction, we need to prepare an extra TRB to pad up to the MPS alignment. Use the same bounce buffer for the ZLP TRB and the extra pad TRB. Cc: <stable@vger.kernel.org> # v4.5+ Fixes: d6e5a549cc4d ("usb: dwc3: simplify ZLP handling") Fixes: 04c03d10e507 ("usb: dwc3: gadget: handle request->zero") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-08-17	usb: dwc3: gadget: Don't setup more than requested	Thinh Nguyen
	The SG list may be set up with entry size more than the requested length. Check the usb_request->length and make sure that we don't setup the TRBs to send/receive more than requested. This case may occur when the SG entry is allocated up to a certain minimum size, but the request length is less than that. It can also occur when the request is reused for a different request length. Cc: <stable@vger.kernel.org> # v4.18+ Fixes: a31e63b608ff ("usb: dwc3: gadget: Correct handling of scattergather lists") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-08-17	regulator: remove superfluous lock in regulator_resolve_coupling()	Michał Mirosław
	The code modifies rdev, but locks c_rdev instead. Remove the lock as this is held together by regulator_list_mutex taken in the caller. Fixes: f9503385b187 ("regulator: core: Mutually resolve regulators coupling") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/25eb81cefb37a646f3e44eaaf1d8ae8881cfde52.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: cleanup regulator_ena_gpio_free()	Michał Mirosław
	Since only regulator_ena_gpio_request() allocates rdev->ena_pin, and it guarantees that same gpiod gets same pin structure, it is enough to compare just the pointers. Also we know there can be only one matching entry on the list. Rework the code take advantage of the facts. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/3ff002c7aa3bd774491af4291a9df23541fcf892.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: plug of_node leak in regulator_register()'s error path	Michał Mirosław
	By calling device_initialize() earlier and noting that kfree(NULL) is ok, we can save a bit of code in error handling and plug of_node leak. Fixed commit already did part of the work. Fixes: 9177514ce349 ("regulator: fix memory leak on error path of regulator_register()") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Reviewed-by: Vladimir Zapolskiy <vz@mleia.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/f5035b1b4d40745e66bacd571bbbb5e4644d21a1.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: push allocation in set_consumer_device_supply() out of lock	Michał Mirosław
	Pull regulator_list_mutex into set_consumer_device_supply() and keep allocations outside of it. Fourth of the fs_reclaim deadlock case. Fixes: 45389c47526d ("regulator: core: Add early supply resolution for regulators") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/f0380bdb3d60aeefa9693c4e234d2dcda7e56747.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: push allocations in create_regulator() outside of lock	Michał Mirosław
	Move all allocations outside of the regulator_lock()ed section. ====================================================== WARNING: possible circular locking dependency detected 5.7.13+ #535 Not tainted ------------------------------------------------------ f2fs_discard-179:7/702 is trying to acquire lock: c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0 but task is already holding lock: cb95b080 (&dcc->cmd_lock){+.+.}-{3:3}, at: __issue_discard_cmd+0xec/0x5f8 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: [...] -> #3 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire.part.11+0x40/0x50 fs_reclaim_acquire+0x24/0x28 __kmalloc_track_caller+0x54/0x218 kstrdup+0x40/0x5c create_regulator+0xf4/0x368 regulator_resolve_supply+0x1a0/0x200 regulator_register+0x9c8/0x163c [...] other info that might help us debug this: Chain exists of: regulator_list_mutex --> &sit_i->sentry_lock --> &dcc->cmd_lock [...] Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/6eebc99b2474f4ffaa0405b15178ece0e7e4f608.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: push allocation in regulator_ena_gpio_request() out of lock	Michał Mirosław
	Move another allocation out of regulator_list_mutex-protected region, as reclaim might want to take the same lock. WARNING: possible circular locking dependency detected 5.7.13+ #534 Not tainted ------------------------------------------------------ kswapd0/383 is trying to acquire lock: c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0 but task is already holding lock: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire.part.11+0x40/0x50 fs_reclaim_acquire+0x24/0x28 kmem_cache_alloc_trace+0x40/0x1e8 regulator_register+0x384/0x1630 devm_regulator_register+0x50/0x84 reg_fixed_voltage_probe+0x248/0x35c [...] other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(regulator_list_mutex); lock(fs_reclaim); lock(regulator_list_mutex); * DEADLOCK * [...] 2 locks held by kswapd0/383: #0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50 #1: cb70e5e0 (hctx->srcu){....}-{0:0}, at: hctx_lock+0x60/0xb8 [...] Fixes: 541d052d7215 ("regulator: core: Only support passing enable GPIO descriptors") [this commit only changes context] Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking") [this is when the regulator_list_mutex was introduced in reclaim locking path] Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/41fe6a9670335721b48e8f5195038c3d67a3bf92.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	regulator: push allocation in regulator_init_coupling() outside of lock	Michał Mirosław
	Allocating memory with regulator_list_mutex held makes lockdep unhappy when memory pressure makes the system do fs_reclaim on eg. eMMC using a regulator. Push the lock inside regulator_init_coupling() after the allocation. ====================================================== WARNING: possible circular locking dependency detected 5.7.13+ #533 Not tainted ------------------------------------------------------ kswapd0/383 is trying to acquire lock: cca78ca4 (&sbi->write_io[i][j].io_rwsem){++++}-{3:3}, at: __submit_merged_write_cond+0x104/0x154 but task is already holding lock: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire.part.11+0x40/0x50 fs_reclaim_acquire+0x24/0x28 __kmalloc+0x54/0x218 regulator_register+0x860/0x1584 dummy_regulator_probe+0x60/0xa8 [...] other info that might help us debug this: Chain exists of: &sbi->write_io[i][j].io_rwsem --> regulator_list_mutex --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(regulator_list_mutex); lock(fs_reclaim); lock(&sbi->write_io[i][j].io_rwsem); * DEADLOCK * 1 lock held by kswapd0/383: #0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50 [...] Fixes: d8ca7d184b33 ("regulator: core: Introduce API for regulators coupling customization") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1a889cf7f61c6429c9e6b34ddcdde99be77a26b6.1597195321.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-08-17	s390/pci: re-introduce zpci_remove_device()	Niklas Schnelle
	For fixing the PF to VF link removal we need to perform some action on every removal of a zdev from the common PCI subsystem. So in preparation re-introduce zpci_remove_device() and use that instead of directly calling the common code functions. This was actually still declared from earlier code but no longer implemented. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17	s390/cio: add cond_resched() in the slow_eval_known_fn() loop	Vineeth Vijayan
	The scanning through subchannels during the time of an event could take significant amount of time in case of platforms with lots of known subchannels. This might result in higher scheduling latencies for other tasks especially on systems with a single CPU. Add cond_resched() call, as the loop in slow_eval_known_fn() can be executed for a longer duration. Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17	usb: gadget: f_tcm: Fix some resource leaks in some error paths	Christophe JAILLET
	If a memory allocation fails within a 'usb_ep_alloc_request()' call, the already allocated memory must be released. Fix a mix-up in the code and free the correct requests. Fixes: c52661d60f63 ("usb-gadget: Initial merge of target module for UASP + BOT") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-08-17	HID: hiddev: Fix slab-out-of-bounds write in hiddev_ioctl_usage()	Peilin Ye
	`uref->usage_index` is not always being properly checked, causing hiddev_ioctl_usage() to go out of bounds under some cases. Fix it. Reported-by: syzbot+34ee1b45d88571c2fa8b@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=f2aebe90b8c56806b050a20b36f51ed6acabe802 Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>