linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-12-10	usb: roles: fix a potential use after free	Wen Yang
	Free the sw structure only after we are done using it. This patch just moves the put_device() down a bit to avoid the use after free. Fixes: 5c54fcac9a9d ("usb: roles: Take care of driver module reference counting") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Peter Chen <peter.chen@nxp.com> Cc: stable <stable@vger.kernel.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Chunfeng Yun <chunfeng.yun@mediatek.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/20191124142236.25671-1-wenyang@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	drm/i915/gt: Detect if we miss WaIdleLiteRestore	Chris Wilson
	In order to avoid confusing the HW, we must never submit an empty ring during lite-restore, that is we should always advance the RING_TAIL before submitting to stay ahead of the RING_HEAD. Normally this is prevented by keeping a couple of spare NOPs in the request->wa_tail so that on resubmission we can advance the tail. This relies on the request only being resubmitted once, which is the normal condition as it is seen once for ELSP[1] and then later in ELSP[0]. On preemption, the requests are unwound and the tail reset back to the normal end point (as we know the request is incomplete and therefore its RING_HEAD is even earlier). However, if this w/a should fail we would try and resubmit the request with the RING_TAIL already set to the location of this request's wa_tail potentially causing a GPU hang. We can spot when we do try and incorrectly resubmit without advancing the RING_TAIL and spare any embarrassment by forcing the context restore. In the case of preempt-to-busy, we leave the requests running on the HW while we unwind. As the ring is still live, we cannot rewind our rq->tail without forcing a reload so leave it set to rq->wa_tail and only force a reload if we resubmit after a lite-restore. (Normally, the forced reload will be a part of the preemption event.) Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Closes: https://gitlab.freedesktop.org/drm/intel/issues/673 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@kernel.vger.org Link: https://patchwork.freedesktop.org/patch/msgid/20191209023215.3519970-1-chris@chris-wilson.co.uk (cherry picked from commit 82c69bf58650e644c61aa2bf5100b63a1070fd2f) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-12-10	drm/i915/hdcp: Nuke intel_hdcp_transcoder_config()	Ville Syrjälä
	intel_hdcp_transcoder_config() is clobbering some globally visible state in .compute_config(). That is a big no no as .compute_config() is supposed to have no visible side effects when either the commit fails or it's just a TEST_ONLY commit. Inline this stuff into intel_hdcp_enable() so that the state only gets modified when we actually commit the state to the hardware. Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Fixes: 39e2df090c3c ("drm/i915/hdcp: update current transcoder into intel_hdcp") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191204180549.1267-2-ville.syrjala@linux.intel.com Reviewed-by: Ramalingam C <ramalingam.c@intel.com> (cherry picked from commit 67e1d5ed85a83e232a9e0b995f5778a86722b96e) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-12-10	drm/i915/fbc: Disable fbc by default on all glk+	Ville Syrjälä
	We're missing a workaround in the fbc code for all glk+ platforms which can cause corruption around the top of the screen. So enabling fbc by default is a bad idea. I'm not keen to backport the w/a so let's start by disabling fbc by default on all glk+. We'll lift the restriction once the w/a is in place. Cc: stable@vger.kernel.org Cc: Daniel Drake <drake@endlessm.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Jian-Hong Pan <jian-hong@endlessm.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127201222.16669-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit cd8c021b36a66833cefe2c90a79a9e312a2a5690) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-12-10	drm/i915/perf: Configure OAR for specific context	Umesh Nerlige Ramappa
	Gen12 supports saving/restoring render counters per context. Apply OAR configuration only for the context that is passed in to perf. v2: - Fix OACTXCONTROL value to only stop/resume counters. - Remove gen12_update_reg_state_unlocked as power state is already applied by the caller. v3: (Lionel) - Move register initialization into the array - Assume a valid oa_config in enable_metric_set Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit ccdeed497042676e13fc1625e2a341880eff5da5) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-12-10	drm/i915/perf: Allow non-privileged access when OA buffer is not sampled	Umesh Nerlige Ramappa
	SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer. Since reports from OA buffer had system wide visibility, collecting samples from the OA buffer was a privileged operation on previous platforms. Prior to TGL, it was also necessary to sample the OA buffer to normalize reports from MI REPORT PERF COUNT. TGL has a dedicated OAR unit to sample perf reports for a specific render context. This removes the necessity to sample OA buffer. - If not sampling the OA buffer, allow non-privileged access. An earlier patch allows the non-privilege access: https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1 - Clear up the path for non-privileged access in this patch Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206194339.31356-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 322d56aa3145a28445907ecc638a2c3aa3295c6b) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-12-10	staging: gigaset: add endpoint-type sanity check	Johan Hovold
	Add missing endpoint-type sanity checks to probe. This specifically prevents a warning in USB core on URB submission when fuzzing USB descriptors. Signed-off-by: Johan Hovold <johan@kernel.org> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191202085610.12719-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: gigaset: fix illegal free on probe errors	Johan Hovold
	The driver failed to initialise its receive-buffer pointer, something which could lead to an illegal free on late probe errors. Fix this by making sure to clear all driver data at allocation. Fixes: 2032e2c2309d ("usb_gigaset: code cleanup") Cc: stable <stable@vger.kernel.org> # 2.6.33 Cc: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191202085610.12719-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: gigaset: fix general protection fault on probe	Johan Hovold
	Fix a general protection fault when accessing the endpoint descriptors which could be triggered by a malicious device due to missing sanity checks on the number of endpoints. Reported-by: syzbot+35b1c403a14f5c89eba7@syzkaller.appspotmail.com Fixes: 07dc1f9f2f80 ("[PATCH] isdn4linux: Siemens Gigaset drivers - M105 USB DECT adapter") Cc: stable <stable@vger.kernel.org> # 2.6.17 Cc: Hansjoerg Lipp <hjlipp@web.de> Cc: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191202085610.12719-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: vchiq: call unregister_chrdev_region() when driver registration fails	Marcelo Diop-Gonzalez
	This undoes the previous call to alloc_chrdev_region() on failure, and is probably what was meant originally given the label name. Signed-off-by: Marcelo Diop-Gonzalez <marcgonzalez@google.com> Cc: stable <stable@vger.kernel.org> Fixes: 187ac53e590c ("staging: vchiq_arm: rework probe and init functions") Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Link: https://lore.kernel.org/r/20191203153921.70540-1-marcgonzalez@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: exfat: fix multiple definition error of `rename_file'	Brendan Higgins
	`rename_file' was exported but not properly namespaced causing a multiple definition error because `rename_file' is already defined in fs/hostfs/hostfs_user.c: ld: drivers/staging/exfat/exfat_core.o: in function `rename_file': drivers/staging/exfat/exfat_core.c:2327: multiple definition of `rename_file'; fs/hostfs/hostfs_user.o:fs/hostfs/hostfs_user.c:350: first defined here make: *** [Makefile:1077: vmlinux] Error 1 This error can be reproduced on ARCH=um by selecting: CONFIG_EXFAT_FS=y CONFIG_HOSTFS=y Add a namespace prefix exfat_* to fix this error. Reported-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Brendan Higgins <brendanhiggins@google.com> Cc: stable <stable@vger.kernel.org> Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu> Tested-by: David Gow <davidgow@google.com> Reviewed-by: David Gow <davidgow@google.com> Link: https://lore.kernel.org/r/20191204234522.42855-1-brendanhiggins@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging/wlan-ng: add CRC32 dependency in Kconfig	Kay Friedrich
	wlan-ng uses the function crc32_le, but CRC32 wasn't a dependency of wlan-ng Co-developed-by: Michael Kupfer <michael.kupfer@fau.de> Signed-off-by: Michael Kupfer <michael.kupfer@fau.de> Signed-off-by: Kay Friedrich <kay.friedrich@fau.de> Link: https://lore.kernel.org/r/20191127112457.2301-1-kay.friedrich@fau.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: hp100: Fix build error without ETHERNET	YueHaibing
	It should depends on ETHERNET, otherwise building fails: drivers/staging/hp/hp100.o: In function `hp100_pci_remove': hp100.c:(.text+0x165): undefined reference to `unregister_netdev' hp100.c:(.text+0x214): undefined reference to `free_netdev' Fixes: 52340b82cf1a ("hp100: Move 100BaseVG AnyLAN driver to staging") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20191113021306.35464-1-yuehaibing@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: fbtft: Do not hardcode SPI CS polarity inversion	Linus Walleij
	The current use of the mode flag SPI_CS_HIGH is fragile: it overwrites anything already assigned by the SPI core. Assign ^= SPI_CS_HIGH since we might be active high already, and that is usually the case with GPIOs used for chip select, even if they are in practice active low. Add a comment clarifying why ^= SPI_CS_HIGH is the right choice here. Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20191204233230.22309-1-linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	staging: exfat: properly support discard in clr_alloc_bitmap()	Andrea Righi
	Currently the discard code in clr_alloc_bitmap() is just dead code. Move code around so that the discard operation is properly attempted when enabled. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Link: https://lore.kernel.org/r/20191205152913.GJ3276@xps-13 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	ocxl: Fix concurrent AFU open and device removal	Frederic Barrat
	If an ocxl device is unbound through sysfs at the same time its AFU is being opened by a user process, the open code may dereference freed stuctures, which can lead to kernel oops messages. You'd have to hit a tiny time window, but it's possible. It's fairly easy to test by making the time window bigger artificially. Fix it with a combination of 2 changes: - when an AFU device is found in the IDR by looking for the device minor number, we should hold a reference on the device until after the context is allocated. A reference on the AFU structure is kept when the context is allocated, so we can release the reference on the device after the context allocation. - with the fix above, there's still another even tinier window, between the time the AFU device is found in the IDR and the reference on the device is taken. We can fix this one by removing the IDR entry earlier, when the device setup is removed, instead of waiting for the 'release' device callback. With proper locking around the IDR. Fixes: 75ca758adbaf ("ocxl: Create a clear delineation between ocxl backend & frontend") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190624144148.32022-1-fbarrat@linux.ibm.com
2019-12-10	staging/octeon: Mark Ethernet driver as BROKEN	Guenter Roeck
	The code doesn't compile due to incompatible pointer errors such as drivers/staging/octeon/ethernet-tx.c:649:50: error: passing argument 1 of 'cvmx_wqe_get_grp' from incompatible pointer type This is due to mixing, for example, cvmx_wqe_t with 'struct cvmx_wqe'. Unfortunately, one can not just revert the primary offending commit, as doing so results in secondary errors. This is made worse by the fact that the "removed" typedefs still exist and are used widely outside the staging directory, making the entire set of "remove typedef" changes pointless and wrong. Reflect reality and mark the driver as BROKEN. Fixes: ef1fe6b7369a ("staging: octeon: remove typedef declaration for cvmx_wqe") Fixes: 73aef0c9d2c6 ("staging: octeon: remove typedef declaration for cvmx_helper_link_info") Cc: Wambui Karuga <wambui.karugax@gmail.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191202141836.9363-1-linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10	iwlwifi: pcie: move power gating workaround earlier in the flow	Luca Coelho
	We need to reset the NIC after setting the bits to enable power gating and that cannot be done too late in the flow otherwise it cleans other registers and things that were already configured, causing initialization to fail. In order to fix this, move the function to the common code in trans.c so it can be called directly from there at an earlier point, just after the reset we already do during initialization. Fixes: 9a47cb988338 ("iwlwifi: pcie: add workaround for power gating in integrated 22000") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205719 Cc: stable@ver.kernel.org # 5.4+ Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-10	Revert "iwlwifi: assign directly to iwl_trans->cfg in QuZ detection"	Anders Kaseorg
	This reverts commit 968dcfb4905245dc64d65312c0d17692fa087b99. Both that commit and commit 809805a820c6445f7a701ded24fdc6bbc841d1e4 attempted to fix the same bug (dead assignments to the local variable cfg), but they did so in incompatible ways. When they were both merged, independently of each other, the combination actually caused the bug to reappear, leading to a firmware crash on boot for some cards. https://bugzilla.kernel.org/show_bug.cgi?id=205719 Signed-off-by: Anders Kaseorg <andersk@mit.edu> Acked-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-09	NFC: nxp-nci: Fix probing without ACPI	Stephan Gerhold
	devm_acpi_dev_add_driver_gpios() returns -ENXIO if CONFIG_ACPI is disabled (e.g. on device tree platforms). In this case, nxp-nci will silently fail to probe. The other NFC drivers only log a debug message if devm_acpi_dev_add_driver_gpios() fails. Do the same in nxp-nci to fix this problem. Fixes: ad0acfd69add ("NFC: nxp-nci: Get rid of code duplication in ->probe()") Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-09	scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func	Bo Wu
	In iscsi_if_rx func, after receiving one request through iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to reply to the request in a do-while loop. If the iscsi_if_send_reply function keeps returning -EAGAIN, a deadlock will occur. For example, a client only send msg without calling recvmsg func, then it will result in the watchdog soft lockup. The details are given as follows: sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI); retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr); while (1) { state_msg = sendmsg(sock_fd, &msg, 0); //Note: recvmsg(sock_fd, &msg, 0) is not processed here. } close(sock_fd); watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat: curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0 deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq: TIMER: 992 SCHED: 8 Sample irqstat: irq 2: delta 1003, curr: 3103802, arch_timer CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G OE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 40400005 (nZcv daif +PAN -UAO) pc : __alloc_skb+0x104/0x1b0 lr : __alloc_skb+0x9c/0x1b0 sp : ffff000033603a30 x29: ffff000033603a30 x28: 00000000000002dd x27: ffff800b34ced810 x26: ffff800ba7569f00 x25: 00000000ffffffff x24: 0000000000000000 x23: ffff800f7c43f600 x22: 0000000000480020 x21: ffff0000091d9000 x20: ffff800b34eff200 x19: ffff800ba7569f00 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0001000101000100 x13: 0000000101010000 x12: 0101000001010100 x11: 0001010101010001 x10: 00000000000002dd x9 : ffff000033603d58 x8 : ffff800b34eff400 x7 : ffff800ba7569200 x6 : ffff800b34eff400 x5 : 0000000000000000 x4 : 00000000ffffffff x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace: __alloc_skb+0x104/0x1b0 iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi] netlink_unicast+0x1e0/0x258 netlink_sendmsg+0x310/0x378 sock_sendmsg+0x4c/0x70 sock_write_iter+0x90/0xf0 __vfs_write+0x11c/0x190 vfs_write+0xac/0x1c0 ksys_write+0x6c/0xd8 __arm64_sys_write+0x24/0x30 el0_svc_common+0x78/0x130 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: iscsi: Fix a potential deadlock in the timeout handler	Bart Van Assche
	Some time ago the block layer was modified such that timeout handlers are called from thread context instead of interrupt context. Make it safe to run the iSCSI timeout handler in thread context. This patch fixes the following lockdep complaint: ================================ WARNING: inconsistent lock state 5.5.1-dbg+ #11 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/7:1H/206 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff88802d9827e8 (&(&session->frwd_lock)->rlock){+.?.}, at: iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] {IN-SOFTIRQ-W} state was registered at: lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_check_transport_timeouts+0x3e/0x210 [libiscsi] call_timer_fn+0x132/0x470 __run_timers.part.0+0x39f/0x5b0 run_timer_softirq+0x63/0xc0 __do_softirq+0x12d/0x5fd irq_exit+0xb3/0x110 smp_apic_timer_interrupt+0x131/0x3d0 apic_timer_interrupt+0xf/0x20 default_idle+0x31/0x230 arch_cpu_idle+0x13/0x20 default_idle_call+0x53/0x60 do_idle+0x38a/0x3f0 cpu_startup_entry+0x24/0x30 start_secondary+0x222/0x290 secondary_startup_64+0xa4/0xb0 irq event stamp: 1383705 hardirqs last enabled at (1383705): [<ffffffff81aace5c>] _raw_spin_unlock_irq+0x2c/0x50 hardirqs last disabled at (1383704): [<ffffffff81aacb98>] _raw_spin_lock_irq+0x18/0x50 softirqs last enabled at (1383690): [<ffffffffa0e2efea>] iscsi_queuecommand+0x76a/0xa20 [libiscsi] softirqs last disabled at (1383682): [<ffffffffa0e2e998>] iscsi_queuecommand+0x118/0xa20 [libiscsi] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&session->frwd_lock)->rlock); <Interrupt> lock(&(&session->frwd_lock)->rlock); * DEADLOCK * 2 locks held by kworker/7:1H/206: #0: ffff8880d57bf928 ((wq_completion)kblockd){+.+.}, at: process_one_work+0x472/0xab0 #1: ffff88802b9c7de8 ((work_completion)(&q->timeout_work)){+.+.}, at: process_one_work+0x476/0xab0 stack backtrace: CPU: 7 PID: 206 Comm: kworker/7:1H Not tainted 5.5.1-dbg+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kblockd blk_mq_timeout_work Call Trace: dump_stack+0xa5/0xe6 print_usage_bug.cold+0x232/0x23b mark_lock+0x8dc/0xa70 __lock_acquire+0xcea/0x2af0 lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] scsi_times_out+0xf4/0x440 [scsi_mod] scsi_timeout+0x1d/0x20 [scsi_mod] blk_mq_check_expired+0x365/0x3a0 bt_iter+0xd6/0xf0 blk_mq_queue_tag_busy_iter+0x3de/0x650 blk_mq_timeout_work+0x1af/0x380 process_one_work+0x56d/0xab0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 Fixes: 287922eb0b18 ("block: defer timeouts to a workqueue") Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Chris Leech <cleech@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191209173457.187370-1-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: libsas: stop discovering if oob mode is disconnected	Jason Yan
	The discovering of sas port is driven by workqueue in libsas. When libsas is processing port events or phy events in workqueue, new events may rise up and change the state of some structures such as asd_sas_phy. This may cause some problems such as follows: ==>thread 1 ==>thread 2 ==>phy up ==>phy_up_v3_hw() ==>oob_mode = SATA_OOB_MODE; ==>phy down quickly ==>hisi_sas_phy_down() ==>sas_ha->notify_phy_event() ==>sas_phy_disconnected() ==>oob_mode = OOB_NOT_CONNECTED ==>workqueue wakeup ==>sas_form_port() ==>sas_discover_domain() ==>sas_get_port_device() ==>oob_mode is OOB_NOT_CONNECTED and device is wrongly taken as expander This at last lead to the panic when libsas trying to issue a command to discover the device. [183047.614035] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [183047.622896] Mem abort info: [183047.625762] ESR = 0x96000004 [183047.628893] Exception class = DABT (current EL), IL = 32 bits [183047.634888] SET = 0, FnV = 0 [183047.638015] EA = 0, S1PTW = 0 [183047.641232] Data abort info: [183047.644189] ISV = 0, ISS = 0x00000004 [183047.648100] CM = 0, WnR = 0 [183047.651145] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000b7df67be [183047.657834] [0000000000000058] pgd=0000000000000000 [183047.662789] Internal error: Oops: 96000004 [#1] SMP [183047.667740] Process kworker/u16:2 (pid: 31291, stack limit = 0x00000000417c4974) [183047.675208] CPU: 0 PID: 3291 Comm: kworker/u16:2 Tainted: G W OE 4.19.36-vhulk1907.1.0.h410.eulerosv2r8.aarch64 #1 [183047.687015] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10, BIOS 0.15 10/22/2019 [183047.695007] Workqueue: 0000:74:02.0_disco_q sas_discover_domain [183047.700999] pstate: 20c00009 (nzCv daif +PAN +UAO) [183047.705864] pc : prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw] [183047.711510] lr : prep_ata_v3_hw+0xb0/0x230 [hisi_sas_v3_hw] [183047.717153] sp : ffff00000f28ba60 [183047.720541] x29: ffff00000f28ba60 x28: ffff8026852d7228 [183047.725925] x27: ffff8027dba3e0a8 x26: ffff8027c05fc200 [183047.731310] x25: 0000000000000000 x24: ffff8026bafa8dc0 [183047.736695] x23: ffff8027c05fc218 x22: ffff8026852d7228 [183047.742079] x21: ffff80007c2f2940 x20: ffff8027c05fc200 [183047.747464] x19: 0000000000f80800 x18: 0000000000000010 [183047.752848] x17: 0000000000000000 x16: 0000000000000000 [183047.758232] x15: ffff000089a5a4ff x14: 0000000000000005 [183047.763617] x13: ffff000009a5a50e x12: ffff8026bafa1e20 [183047.769001] x11: ffff0000087453b8 x10: ffff00000f28b870 [183047.774385] x9 : 0000000000000000 x8 : ffff80007e58f9b0 [183047.779770] x7 : 0000000000000000 x6 : 000000000000003f [183047.785154] x5 : 0000000000000040 x4 : ffffffffffffffe0 [183047.790538] x3 : 00000000000000f8 x2 : 0000000002000007 [183047.795922] x1 : 0000000000000008 x0 : 0000000000000000 [183047.801307] Call trace: [183047.803827] prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw] [183047.809127] hisi_sas_task_prep+0x750/0x888 [hisi_sas_main] [183047.814773] hisi_sas_task_exec.isra.7+0x88/0x1f0 [hisi_sas_main] [183047.820939] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] [183047.826757] smp_execute_task_sg+0xec/0x218 [183047.831013] smp_execute_task+0x74/0xa0 [183047.834921] sas_discover_expander.part.7+0x9c/0x5f8 [183047.839959] sas_discover_root_expander+0x90/0x160 [183047.844822] sas_discover_domain+0x1b8/0x1e8 [183047.849164] process_one_work+0x1b4/0x3f8 [183047.853246] worker_thread+0x54/0x470 [183047.856981] kthread+0x134/0x138 [183047.860283] ret_from_fork+0x10/0x18 [183047.863931] Code: f9407a80 528000e2 39409281 72a04002 (b9405800) [183047.870097] kernel fault(0x1) notification starting on CPU 0 [183047.875828] kernel fault(0x1) notification finished on CPU 0 [183047.881559] Modules linked in: unibsp(OE) hns3(OE) hclge(OE) hnae3(OE) mem_drv(OE) hisi_sas_v3_hw(OE) hisi_sas_main(OE) [183047.892418] ---[ end trace 4cc26083fc11b783 ]--- [183047.897107] Kernel panic - not syncing: Fatal exception [183047.902403] kernel fault(0x5) notification starting on CPU 0 [183047.908134] kernel fault(0x5) notification finished on CPU 0 [183047.913865] SMP: stopping secondary CPUs [183047.917861] Kernel Offset: disabled [183047.921422] CPU features: 0x2,a2a00a38 [183047.925243] Memory Limit: none [183047.928372] kernel reboot(0x2) notification starting on CPU 0 [183047.934190] kernel reboot(0x2) notification finished on CPU 0 [183047.940008] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com Reported-by: Gao Chuan <gaochuan4@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: ufs: Disable autohibern8 feature in Cadence UFS	sheebab
	This patch disables autohibern8 feature in Cadence UFS. The autohibern8 feature has issues due to which unexpected interrupt trigger is happening. After the interrupt issue is sorted out, autohibern8 feature will be re-enabled Link: https://lore.kernel.org/r/1575367635-22662-1-git-send-email-sheebab@cadence.com Cc: <stable@vger.kernel.org> Signed-off-by: sheebab <sheebab@cadence.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Tested-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: iscsi: qla4xxx: fix double free in probe	Dan Carpenter
	On this error path we call qla4xxx_mem_free() and then the caller also calls qla4xxx_free_adapter() which calls qla4xxx_mem_free(). It leads to a couple double frees: drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->chap_dma_pool' double freed drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->fw_ddb_dma_pool' double freed Fixes: afaf5a2d341d ("[SCSI] Initial Commit of qla4xxx") Link: https://lore.kernel.org/r/20191203094421.hw7ex7qr3j2rbsmx@kili.mountain Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: ufs: Give an unique ID to each ufs-bsg	Can Guo
	Considering there can be multiple UFS hosts in SoC, give each ufs-bsg an unique ID by appending the scsi host number to its device name. Link: https://lore.kernel.org/r/0101016eca8dc9d7-d24468d3-04d2-4ef3-a906-abe8b8cbcd3d-000000@us-west-2.amazonses.com Fixes: df032bf27a41 ("scsi: ufs: Add a bsg endpoint that supports UPIUs") Signed-off-by: Can Guo <cang@codeaurora.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Add debug dump of LOGO payload and ELS IOCB	Roman Bolshakov
	The change adds a way to debug LOGO ELS, likewise PLOGI. Link: https://lore.kernel.org/r/20191125165702.1013-14-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Acked-by: Quinn Tran <qutran@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Ignore PORT UPDATE after N2N PLOGI	Roman Bolshakov
	PORT UPDATE asynchronous event is generated on the host that issues PLOGI ELS (in the case of higher WWPN). In that case, the event shouldn't be handled as it sets unwanted DPC flags (i.e. LOOP_RESYNC_NEEDED) that trigger link flap. Ignore the event if the host has higher WWPN, but handle otherwise. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-13-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Don't defer relogin unconditonally	Roman Bolshakov
	qla2x00_configure_local_loop sets RELOGIN_NEEDED bit and calls qla24xx_fcport_handle_login to perform the login. This bit triggers a wake up of DPC later after a successful login. The deferred call is not needed if login succeeds, and it's set in qla24xx_fcport_handle_login in case of errors, hence it should be safe to drop. Link: https://lore.kernel.org/r/20191125165702.1013-12-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Acked-by: Quinn Tran <qutran@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Send Notify ACK after N2N PLOGI	Roman Bolshakov
	qlt_handle_login schedules session for deletion even if a login is in progress. That causes login bouncing, i.e. a few logins are made before it settles down. Complete the first login by sending Notify Acknowledge IOCB via qlt_plogi_ack_unref if the session is pending login completion. Fixes: 9cd883f07a54 ("scsi: qla2xxx: Fix session cleanup for N2N") Cc: Krishna Kant <krishna.kant@purestorage.com> Cc: Alexei Potashnik <alexei@purestorage.com> Link: https://lore.kernel.org/r/20191125165702.1013-11-r.bolshakov@yadro.com Acked-by: Quinn Tran <qutran@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Configure local loop for N2N target	Roman Bolshakov
	qla2x00_configure_local_loop initializes PLOGI payload for PLOGI ELS using Get Parameters mailbox command. In the case when the driver is running in target mode, the topology is N2N and the target port has higher WWPN, LOCAL_LOOP_UPDATE bit is cleared too early and PLOGI payload is not initialized by the Get Parameters command. That causes a failure of ELS IOCB carrying the PLOGI with 0x15 aka Data Underrun error. LOCAL_LOOP_UPDATE has to be set to initialize PLOGI payload. Fixes: 48acad099074 ("scsi: qla2xxx: Fix N2N link re-connect") Link: https://lore.kernel.org/r/20191125165702.1013-10-r.bolshakov@yadro.com Acked-by: Quinn Tran <qutran@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Fix PLOGI payload and ELS IOCB dump length	Roman Bolshakov
	The size of the buffer is hardcoded as 0x70 or 112 bytes, while the size of ELS IOCB is 0x40 and the size of PLOGI payload returned by Get Parameters command is 0x74. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-9-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Don't call qlt_async_event twice	Roman Bolshakov
	MBA_PORT_UPDATE generates duplicate log lines in target mode because qlt_async_event is called twice. Drop the calls within the case as the function will be called right after the switch statement. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-8-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvel.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Allow PLOGI in target mode	Roman Bolshakov
	According to FC-LS-3 (Fibre Channel Link Services) 6.3.2.4 "N_Port Login - No Fabric present", if both parties in the point-to-point connection know N_Port_Names of each other, Nx_Port with the highest N_Port_name shall transmit PLOGI. The specification sets no restrictions on the port role that should send PLOGI. However, FCP-4 (Fibre Channel Protocol for SCSI, Fourth Version) 6.2 "Overview of Process Login and Process Logout", instructs that in point-to-point topology, initiator shall send explicit PRLI ELS. The change fixes stuck P2P login, when target WWPN is higher than initiator WWPN. Cc: Quinn Tran <qutran@marvell.com> Cc: Himanshu Madhani <hmadhani@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-7-r.bolshakov@yadro.com Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Change discovery state before PLOGI	Roman Bolshakov
	When a port sends PLOGI, discovery state should be changed to login pending, otherwise RELOGIN_NEEDED bit is set in qla24xx_handle_plogi_done_event(). RELOGIN_NEEDED triggers another PLOGI, and it never goes out of the loop until login timer expires. Fixes: 8777e4314d397 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine") Fixes: 8b5292bcfcacf ("scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag") Cc: Quinn Tran <qutran@marvell.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191125165702.1013-6-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Drop superfluous INIT_WORK of del_work	Roman Bolshakov
	del_work is already initialized inside qla2x00_alloc_fcport, there's no need to overwrite it. Indeed, it might prevent complete traversal of workqueue list. Fixes: a01c77d2cbc45 ("scsi: qla2xxx: Move session delete to driver work queue") Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-5-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Initialize free_work before flushing it	Roman Bolshakov
	Target creation triggers a new BUG_ON introduced in in commit 4d43d395fed1 ("workqueue: Try to catch flush_work() without INIT_WORK()."). The BUG_ON reveals an attempt to flush free_work in qla24xx_do_nack_work before it's initialized in qlt_unreg_sess: WARNING: CPU: 7 PID: 211 at kernel/workqueue.c:3031 __flush_work.isra.38+0x40/0x2e0 CPU: 7 PID: 211 Comm: kworker/7:1 Kdump: loaded Tainted: G E 5.3.0-rc7-vanilla+ #2 Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx] NIP: c000000000159620 LR: c0080000009d91b0 CTR: c0000000001598c0 REGS: c000000005f3f730 TRAP: 0700 Tainted: G E (5.3.0-rc7-vanilla+) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002222 XER: 00000000 CFAR: c0000000001598d0 IRQMASK: 0 GPR00: c0080000009d91b0 c000000005f3f9c0 c000000001670a00 c0000003f8655ca8 GPR04: c0000003f8655c00 000000000000ffff 0000000000000011 ffffffffffffffff GPR08: c008000000949228 0000000000000000 0000000000000001 c0080000009e7780 GPR12: 0000000000002200 c00000003fff6200 c000000000161bc8 0000000000000004 GPR16: c0000003f9d68280 0000000002000000 0000000000000005 0000000000000003 GPR20: 0000000000000002 000000000000ffff 0000000000000000 fffffffffffffef7 GPR24: c000000004f73848 c000000004f73838 c000000004f73f28 c000000005f3fb60 GPR28: c000000004f73e48 c000000004f73c80 c000000004f73818 c0000003f9d68280 NIP [c000000000159620] __flush_work.isra.38+0x40/0x2e0 LR [c0080000009d91b0] qla24xx_do_nack_work+0x88/0x180 [qla2xxx] Call Trace: [c000000005f3f9c0] [c000000000159644] __flush_work.isra.38+0x64/0x2e0 (unreliable) [c000000005f3fa50] [c0080000009d91a0] qla24xx_do_nack_work+0x78/0x180 [qla2xxx] [c000000005f3fae0] [c0080000009496ec] qla2x00_do_work+0x604/0xb90 [qla2xxx] [c000000005f3fc40] [c008000000949cd8] qla2x00_iocb_work_fn+0x60/0xe0 [qla2xxx] [c000000005f3fc80] [c000000000157bb8] process_one_work+0x2c8/0x5b0 [c000000005f3fd10] [c000000000157f28] worker_thread+0x88/0x660 [c000000005f3fdb0] [c000000000161d64] kthread+0x1a4/0x1b0 [c000000005f3fe20] [c00000000000b960] ret_from_kernel_thread+0x5c/0x7c Instruction dump: 3d22001d 892966b1 7d908026 91810008 f821ff71 69290001 0b090000 2e290000 40920200 e9230018 7d2a0074 794ad182 <0b0a0000> 2fa90000 419e01e8 7c0802a6 ---[ end trace 5ccf335d4f90fcb8 ]--- Fixes: 1021f0bc2f3d6 ("scsi: qla2xxx: allow session delete to finish before create.") Cc: Quinn Tran <qutran@marvell.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191125165702.1013-4-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Use explicit LOGO in target mode	Quinn Tran
	Target makes implicit LOGO on session teardown. LOGO ELS is not send on the wire and initiator is not aware that target no longer wants talking to it. Initiator keeps sending I/O requests, target responds with BA_RJT, they time out and then initiator sends ABORT TASK (ABTS-LS). Current behaviour incurs unneeded I/O timeout and can be fixed for some initiators by making explicit LOGO on session deletion. Link: https://lore.kernel.org/r/20191125165702.1013-3-r.bolshakov@yadro.com Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd	Roman Bolshakov
	If ABTS cannot be completed in target mode, the driver attempts to free related management command and crashes: NIP [d000000019181ee8] tcm_qla2xxx_free_mcmd+0x40/0x80 [tcm_qla2xxx] LR [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx] Call Trace: [c000003fff27bb50] [c000003fff27bc10] 0xc000003fff27bc10 (unreliable) [c000003fff27bb70] [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx] [c000003fff27bc10] [d00000001dbc2be0] qla24xx_process_response_queue+0x5d8/0xbd0 [qla2xxx] [c000003fff27bd50] [d00000001dbc632c] qla24xx_msix_rsp_q+0x64/0x150 [qla2xxx] [c000003fff27bde0] [c000000000187200] __handle_irq_event_percpu+0x90/0x310 [c000003fff27bea0] [c0000000001874b8] handle_irq_event_percpu+0x38/0x90 [c000003fff27bee0] [c000000000187574] handle_irq_event+0x64/0xb0 [c000003fff27bf10] [c00000000018cd38] handle_fasteoi_irq+0xe8/0x280 [c000003fff27bf40] [c000000000185ccc] generic_handle_irq+0x4c/0x70 [c000003fff27bf60] [c000000000016cec] __do_irq+0x7c/0x1d0 [c000003fff27bf90] [c00000000002a530] call_do_irq+0x14/0x24 [c00000207d2cba90] [c000000000016edc] do_IRQ+0x9c/0x130 [c00000207d2cbae0] [c000000000008bf4] hardware_interrupt_common+0x114/0x120 --- interrupt: 501 at arch_local_irq_restore+0x74/0x90 LR = arch_local_irq_restore+0x74/0x90 [c00000207d2cbdd0] [c0000000001c64fc] tick_broadcast_oneshot_control+0x4c/0x60 (unreliable) [c00000207d2cbdf0] [c0000000007ac840] cpuidle_enter_state+0xf0/0x450 [c00000207d2cbe50] [c00000000016b81c] call_cpuidle+0x4c/0x90 [c00000207d2cbe70] [c00000000016bc30] do_idle+0x2b0/0x330 [c00000207d2cbec0] [c00000000016beec] cpu_startup_entry+0x3c/0x50 [c00000207d2cbef0] [c00000000004a06c] start_secondary+0x63c/0x670 [c00000207d2cbf90] [c00000000000aa6c] start_secondary_prolog+0x10/0x14 The crash can be triggered by ACL deletion when there's active I/O. During ACL deletion, qla2xxx performs implicit LOGO that's invisible for the initiator. Only the driver and firmware are aware of the logout. Therefore the initiator continues to send SCSI commands and the target always responds with SAM STATUS BUSY as it can't find the session. The command times out after a while and initiator invokes ABORT TASK TMF for the command. The TMF is mapped to ABTS-LS in FCP. The target can't find session for S_ID originating ABTS-LS so it never allocates mcmd. And since N_Port handle was deleted after LOGO, it is no longer valid and ABTS Response IOCB is returned from firmware with status 31. Then free_mcmd is invoked on NULL pointer and the kernel crashes. [ 7734.578642] qla2xxx [0000:00:0c.0]-e837:6: ABTS_RECV_24XX: instance 0 [ 7734.578644] qla2xxx [0000:00:0c.0]-f811:6: qla_target(0): task abort (s_id=1:2:0, tag=1209504, param=0) [ 7734.578645] find_sess_by_s_id: 0x010200 [ 7734.578645] Unable to locate s_id: 0x010200 [ 7734.578646] qla2xxx [0000:00:0c.0]-f812:6: qla_target(0): task abort for non-existent session [ 7734.578648] qla2xxx [0000:00:0c.0]-e806:6: Sending task mgmt ABTS response (ha=c0000000d5819000, atio=c0000000d3fd4700, status=4 [ 7734.578730] qla2xxx [0000:00:0c.0]-e838:6: ABTS_RESP_24XX: compl_status 31 [ 7734.578732] qla2xxx [0000:00:0c.0]-e863:6: qla_target(0): ABTS_RESP_24XX failed 31 (subcode 19:a) [ 7734.578740] Unable to handle kernel paging request for data at address 0x00000200 Fixes: 6b0431d6fa20b ("scsi: qla2xxx: Fix out of order Termination and ABTS response") Cc: Quinn Tran <qutran@marvell.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Thomas Abraham <tabraham@suse.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191125165702.1013-2-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-10	ACPI: PM: Avoid attaching ACPI PM domain to certain devices	Rafael J. Wysocki
	Certain ACPI-enumerated devices represented as platform devices in Linux, like fans, require special low-level power management handling implemented by their drivers that is not in agreement with the ACPI PM domain behavior. That leads to problems with managing ACPI fans during system-wide suspend and resume. For this reason, make acpi_dev_pm_attach() skip the affected devices by adding a list of device IDs to avoid to it and putting the IDs of the affected devices into that list. Fixes: e5cc8ef31267 (ACPI / PM: Provide ACPI PM callback routines for subsystems) Reported-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-12-09	soc: amlogic: meson-ee-pwrc: propagate errors from pm_genpd_init()	Martin Blumenstingl
	pm_genpd_init() can return an error. Propagate the error code to prevent the driver from indicating that it successfully probed while there were errors during pm_genpd_init(). Fixes: eef3c2ba0a42a6 ("soc: amlogic: Add support for Everything-Else power domains controller") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-12-09	soc: amlogic: meson-ee-pwrc: propagate PD provider registration errors	Martin Blumenstingl
	of_genpd_add_provider_onecell() can return an error. Propagate the error so the driver registration fails when of_genpd_add_provider_onecell() did not work. Fixes: eef3c2ba0a42a6 ("soc: amlogic: Add support for Everything-Else power domains controller") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2019-12-09	scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func	Bo Wu
	When phba->mbox_ext_buf_ctx.seqNum != phba->mbox_ext_buf_ctx.numBuf, dd_data should be freed before return SLI_CONFIG_HANDLED. When lpfc_sli_issue_mbox func return fails, pmboxq should be also freed in job_error tag. Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E7A966@DGGEML525-MBS.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Fix incorrect SFUB length used for Secure Flash Update MB Cmd	Michael Hernandez
	SFUB length should be in DWORDs when passed to FW. Fixes: 3f006ac342c03 ("scsi: qla2xxx: Secure flash update support for ISP28XX") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191203223657.22109-4-hmadhani@marvell.com Signed-off-by: Michael Hernandez <mhernandez@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Added support for MPI and PEP regions for ISP28XX	Michael Hernandez
	This patch adds support for MPI/PEP region updates which is required with secure flash updates for ISP28XX. Fixes: 3f006ac342c0 ("scsi: qla2xxx: Secure flash update support for ISP28XX") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191203223657.22109-3-hmadhani@marvell.com Signed-off-by: Michael Hernandez <mhernandez@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	scsi: qla2xxx: Correctly retrieve and interpret active flash region	Himanshu Madhani
	ISP27XX/28XX supports multiple flash regions. This patch fixes issue where active flash region was not interpreted correctly during secure flash update process. [mkp: typo] Fixes: 5fa8774c7f38c ("scsi: qla2xxx: Add 28xx flash primary/secondary status/image mechanism") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191203223657.22109-2-hmadhani@marvell.com Signed-off-by: Michael Hernandez <mhernandez@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-09	fjes: fix missed check in fjes_acpi_add	Chuhong Yuan
	fjes_acpi_add() misses a check for platform_device_register_simple(). Add a check to fix it. Fixes: 658d439b2292 ("fjes: Introduce FUJITSU Extended Socket Network Device driver") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-09	net: ethernet: ti: davinci_cpdma: fix warning "device driver frees DMA ↵	Grygorii Strashko
	memory with different size" The TI CPSW(s) driver produces warning with DMA API debug options enabled: WARNING: CPU: 0 PID: 1033 at kernel/dma/debug.c:1025 check_unmap+0x4a8/0x968 DMA-API: cpsw 48484000.ethernet: device driver frees DMA memory with different size [device address=0x00000000abc6aa02] [map size=64 bytes] [unmap size=42 bytes] CPU: 0 PID: 1033 Comm: ping Not tainted 5.3.0-dirty #41 Hardware name: Generic DRA72X (Flattened Device Tree) [<c0112c60>] (unwind_backtrace) from [<c010d270>] (show_stack+0x10/0x14) [<c010d270>] (show_stack) from [<c09bc564>] (dump_stack+0xd8/0x110) [<c09bc564>] (dump_stack) from [<c013b93c>] (__warn+0xe0/0x10c) [<c013b93c>] (__warn) from [<c013b9ac>] (warn_slowpath_fmt+0x44/0x6c) [<c013b9ac>] (warn_slowpath_fmt) from [<c01e0368>] (check_unmap+0x4a8/0x968) [<c01e0368>] (check_unmap) from [<c01e08a8>] (debug_dma_unmap_page+0x80/0x90) [<c01e08a8>] (debug_dma_unmap_page) from [<c0752414>] (__cpdma_chan_free+0x114/0x16c) [<c0752414>] (__cpdma_chan_free) from [<c07525c4>] (__cpdma_chan_process+0x158/0x17c) [<c07525c4>] (__cpdma_chan_process) from [<c0753690>] (cpdma_chan_process+0x3c/0x5c) [<c0753690>] (cpdma_chan_process) from [<c0758660>] (cpsw_tx_mq_poll+0x48/0x94) [<c0758660>] (cpsw_tx_mq_poll) from [<c0803018>] (net_rx_action+0x108/0x4e4) [<c0803018>] (net_rx_action) from [<c010230c>] (__do_softirq+0xec/0x598) [<c010230c>] (__do_softirq) from [<c0143914>] (do_softirq.part.4+0x68/0x74) [<c0143914>] (do_softirq.part.4) from [<c0143a44>] (__local_bh_enable_ip+0x124/0x17c) [<c0143a44>] (__local_bh_enable_ip) from [<c0871590>] (ip_finish_output2+0x294/0xb7c) [<c0871590>] (ip_finish_output2) from [<c0875440>] (ip_output+0x210/0x364) [<c0875440>] (ip_output) from [<c0875e2c>] (ip_send_skb+0x1c/0xf8) [<c0875e2c>] (ip_send_skb) from [<c08a7fd4>] (raw_sendmsg+0x9a8/0xc74) [<c08a7fd4>] (raw_sendmsg) from [<c07d6b90>] (sock_sendmsg+0x14/0x24) [<c07d6b90>] (sock_sendmsg) from [<c07d8260>] (__sys_sendto+0xbc/0x100) [<c07d8260>] (__sys_sendto) from [<c01011ac>] (__sys_trace_return+0x0/0x14) Exception stack(0xea9a7fa8 to 0xea9a7ff0) ... The reason is that cpdma_chan_submit_si() now stores original buffer length (sw_len) in CPDMA descriptor instead of adjusted buffer length (hw_len) used to map the buffer. Hence, fix an issue by passing correct buffer length in CPDMA descriptor. Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Fixes: 6670acacd59e ("net: ethernet: ti: davinci_cpdma: add dma mapped submit") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-09	Merge tag 'thermal-5.5-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal fixes from Zhang Rui: "Starting from this release cycle, we have Daniel Lezcano work as the new thermal co-maintainer because Eduardo's email is bouncing for sometime and we can not reach him. We also have a new shared git tree so that both Daniel and I can actively working on it. Specifics: - Update MAINTAINER file for new thermal co-maintainer and new thermal git tree address. (Daniel Lezcano, Florian Fainelli, Zhang Rui) - Fix a Kconfig warning. (YueHaibing)" * tag 'thermal-5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: MAINTAINERS: thermal: Change the git tree location MAINTAINERS: thermal: Add Daniel Lezcano as the thermal maintainer MAINTAINERS: thermal: Eduardo's email is bouncing thermal: power_allocator: Fix Kconfig warning
2019-12-09	rxe: correctly calculate iCRC for unaligned payloads	Steve Wise
	If RoCE PDUs being sent or received contain pad bytes, then the iCRC is miscalculated, resulting in PDUs being emitted by RXE with an incorrect iCRC, as well as ingress PDUs being dropped due to erroneously detecting a bad iCRC in the PDU. The fix is to include the pad bytes, if any, in iCRC computations. Note: This bug has caused broken on-the-wire compatibility with actual hardware RoCE devices since the soft-RoCE driver was first put into the mainstream kernel. Fixing it will create an incompatibility with the original soft-RoCE devices, but is necessary to be compatible with real hardware devices. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Steve Wise <larrystevenwise@gmail.com> Link: https://lore.kernel.org/r/20191203020319.15036-2-larrystevenwise@gmail.com Signed-off-by: Doug Ledford <dledford@redhat.com>