linux.git - Linus' kernel tree

Age	Commit message	Author
2016-09-23	tools: move pcmcia crc32hash tool from Documentation	Shuah Khan
	Move pcmcia crc32hash tool from Documentation to tools/pcmcia and remove it from Documentation Makefile. Update location information for this tool. Create a new Makefile to build pcmcia. It can be built from top level directory or from pcmcia directory: Run make -C tools/pcmcia or cd tools/pcmcia; make Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	tools: move laptops dslm tool from Documentation	Shuah Khan
	Move laptops dslm tool to tools/laptop/dslm and remove it from Documentation Makefile. Update location information for this tool. Create a new Makefile to build dslm. It can be built from top level directory or from laptops directory: Run make -C tools/laptop/dslm or cd tools/laptop/dslm; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	tools: move accounting tool from Documentation	Shuah Khan
	Move accounting tool to tools and remove it from Documentation Makefile. Update location information for this tool. Create a new Makefile to build accounting. It can be built from top level directory or from accounting directory: Run make -C tools/accounting or cd tools/accounting; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	Merge tag 'regmap-fix-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap	Linus Torvalds
	Pull regmap fix from Mark Brown: "A fix for an issue with double locking that was introduced earlier this release. I'd missed in review that we were already in a locked region when trying to drop part of the cache" * tag 'regmap-fix-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: fix deadlock on _regmap_raw_write() error path
2016-09-23	netns: move {inc,dec}_net_namespaces into #ifdef	Arnd Bergmann
	With the newly enforced limit on the number of namespaces, we get a build warning if CONFIG_NETNS is disabled: net/core/net_namespace.c:273:13: error: 'dec_net_namespaces' defined but not used [-Werror=unused-function] net/core/net_namespace.c:268:24: error: 'inc_net_namespaces' defined but not used [-Werror=unused-function] This moves the two added functions inside the #ifdef that guards their callers. Fixes: 703286608a22 ("netns: Add a limit on the number of net namespaces") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
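The fix pattern described above can be sketched outside the kernel. This is a minimal illustration, not the actual net/core/net_namespace.c code: `DEMO_CONFIG_NETNS`, `DEMO_MAX_NAMESPACES` and every `demo_*` name are hypothetical stand-ins. The point is only that static helpers sit inside the same `#ifdef` as their callers, so a build with the option disabled sees neither the callers nor the unused helpers.

```c
/* Hypothetical stand-in for CONFIG_NETNS; remove it to compile the callers out. */
#define DEMO_CONFIG_NETNS 1
/* Hypothetical namespace limit, chosen only for this demo. */
#define DEMO_MAX_NAMESPACES 2

static int demo_ns_count;

#ifdef DEMO_CONFIG_NETNS
/*
 * The helpers live inside the guard that also contains their only
 * callers, so -Wunused-function has nothing to report when the
 * option is disabled.
 */
static int demo_inc_net_namespaces(void)
{
	if (demo_ns_count >= DEMO_MAX_NAMESPACES)
		return -1;	/* limit reached */
	demo_ns_count++;
	return 0;
}

static void demo_dec_net_namespaces(void)
{
	demo_ns_count--;
}

/* Caller that only exists when the option is enabled. */
int demo_copy_net_ns(void)
{
	return demo_inc_net_namespaces();
}

/* Second caller, also guarded. */
void demo_cleanup_net(void)
{
	demo_dec_net_namespaces();
}
#endif /* DEMO_CONFIG_NETNS */

/* Unguarded accessor so the count can be inspected either way. */
int demo_namespace_count(void)
{
	return demo_ns_count;
}
```

If the helpers were left outside the guard and `DEMO_CONFIG_NETNS` were undefined, a compiler run with `-Wunused-function` would flag both of them, which is the situation the commit message reports.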
2016-09-23	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds
	Pull crypto fixes from Herbert Xu: "This fixes a regression in RSA that was only half-fixed earlier in the cycle. It also fixes an older regression that breaks the keyring subsystem" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa-pkcs1pad - Handle leading zero for decryption KEYS: Fix skcipher IV clobbering
2016-09-23	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux	Linus Torvalds
	Pull arm64 fixes from Catalin Marinas: "A couple of last-minute arm64 fixes for 4.8: - Fix secondary CPU to NUMA node assignment - Fix kgdb breakpoint insertion in read-only text sections (when CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: kgdb: handle read-only text / modules arm64: Call numa_store_cpu_info() earlier.
2016-09-23	Merge tag 'tags/nand-fixes-for-4.8-rc8' of git://git.infradead.org/linux-ubifs	Linus Torvalds
	Pull MTD fixes from Richard Weinberger: "NAND Fixes for 4.8-rc8. This contains fixes for bugs which got introduced in -rc1. Usually Brian takes NAND patches from Boris, but since Brian is very busy these days with other stuff and Boris is not yet member of the kernel.org web of trust I stepped in. Boris will be in Berlin at ELCE, I'll sign his key and hopefully other Kernel developers too such that he can issue his own pull requests soon. Summary: - Fix a wrong OOB layout definition in the mxc driver - Fix incorrect ECC handling in the mtk driver" * tag 'tags/nand-fixes-for-4.8-rc8' of git://git.infradead.org/linux-ubifs: mtd: nand: mxc: fix obiwan error in mxc_nand_v[12]_ooblayout_free() functions mtd: nand: fix chances to create incomplete ECC data when writing mtd: nand: fix generating over-boundary ECC data when writing
2016-09-23	Merge tag 'mmc-v4.8-rc7' of git://git.linaro.org/people/ulf.hansson/mmc	Linus Torvalds
	Pull MMC fix from Ulf Hansson: "MMC host: - dw_mmc: fix the spamming log message" * tag 'mmc-v4.8-rc7' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: dw_mmc: fix the spamming log message
2016-09-23	samples: move auxdisplay example code from Documentation	Shuah Khan
	Move auxdisplay examples to samples and remove it from Documentation Makefile. Create a new Makefile to build auxdisplay. It can be built from top level directory or from auxdisplay directory: Run make -C samples/auxdisplay or cd samples/auxdisplay; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	samples: move watchdog example code from Documentation	Shuah Khan
	Move watchdog examples to samples and remove it from Documentation Makefile. Create a new Makefile to build watchdog. It can be built from top level directory or from watchdog directory: Run make -C samples/watchdog or cd samples/watchdog; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	samples: move timers example code from Documentation	Shuah Khan
	Move timers examples to samples and remove it from Documentation Makefile. Create a new Makefile to build timers. It can be built from top level directory or from timers directory: Run make -C samples/timers or cd samples/timers; make Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	samples: move misc-devices/mei example code from Documentation	Shuah Khan
	Move misc-devices/mei examples to samples/mei and remove it from Documentation Makefile. Delete misc-devices/Makefile. Create a new Makefile to build samples/mei. It can be built from top level directory or from mei directory: Run make -C samples/mei or cd samples/mei; make Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23	IB/core: remove ib_get_dma_mr	Christoph Hellwig
	We now only use it from ib_alloc_pd to create a local DMA lkey if the device doesn't provide one, or a global rkey if the ULP requests it. This patch removes ib_get_dma_mr and open codes the functionality in ib_alloc_pd so that we can simplify the code and prevent abuse of the functionality. As a side effect we can also simplify things by removing the valid access bit check, and the PD refcounting. In the future I hope to also remove the per-PD global MR entirely by shifting this work into the HW drivers, as one step towards avoiding the struct ib_mr overload for various different use cases. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	nvme-rdma: use IB_PD_UNSAFE_GLOBAL_RKEY	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	IB/srp: use IB_PD_UNSAFE_GLOBAL_RKEY	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	IB/iser: use IB_PD_UNSAFE_GLOBAL_RKEY	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	IB/core: add support to create a unsafe global rkey to ib_create_pd	Christoph Hellwig
	Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or less unchecked, this moves the capability of creating a global rkey into the RDMA core, where it can be easily audited. It also prints a warning every time this feature is used. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	IB/core: rename pd->local_mr to pd->__internal_mr	Christoph Hellwig
	This has two reasons: a) to clearly mark that drivers don't have any business using it, and b) because we're going to use it for the (dangerous) global rkey soon, so that drivers don't create one themselves. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23	nbd: use BLK_MQ_F_BLOCKING	Josef Bacik
	Because we take a mutex when sending commands and send stuff over the network, we need to have queue_rq called asynchronously. Signed-off-by: Josef Bacik <jbacik@fb.com> Fixes: fd8383fd88a2 ("nbd: convert to blkmq") Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-23	blkcg: Annotate blkg_hint correctly	Bart Van Assche
	Avoid that sparse complains about blkg_hint manipulations. Fixes: a637120e4902 ("blkcg: use radix tree to index blkgs from blkcg") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-23	Staging: ks7010: remove unused function in ks_wlan_net.c	Baoyou Xie
	We get 1 warning when building kernel with W=1: drivers/staging/ks7010/ks_wlan_net.c:3520:5: warning: no previous prototype for 'ks_wlan_reset' [-Wmissing-prototypes] In fact, this function is unused in ks_wlan_net.c and should be removed. So this patch removes the unused function. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	staging: rtl8192u: remove unused functions in r8192U_core.c	Baoyou Xie
	We get 3 warnings when building kernel with W=1: drivers/staging/rtl8192u/r8192U_core.c:925:12: warning: no previous declaration for 'ieeerate2rtlrate' [-Wmissing-declarations] drivers/staging/rtl8192u/r8192U_core.c:958:12: warning: no previous declaration for 'rtl8192_rate2rate' [-Wmissing-declarations] drivers/staging/rtl8192u/r8192U_core.c:1322:11: warning: no previous declaration for 'rtl8192_IsWirelessBMode' [-Wmissing-declarations] In fact, these functions are unused in r8192U_core.c and should be removed. So this patch removes the unused functions. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	staging: rtl8192u: ieee80211: ieee80211_softmac: mark symbols static where possible	Baoyou Xie
	We get 5 warnings when building kernel with W=1: drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c:287:13: warning: no previous declaration for 'softmac_ps_mgmt_xmit' [-Wmissing-declarations] drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c:323:24: warning: no previous declaration for 'ieee80211_probe_req' [-Wmissing-declarations] drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c:643:24: warning: no previous declaration for 'ieee80211_authentication_req' [-Wmissing-declarations] drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c:981:24: warning: no previous declaration for 'ieee80211_association_req' [-Wmissing-declarations] drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c:3094:24: warning: no previous declaration for 'ieee80211_disassociate_skb' [-Wmissing-declarations] In fact, these functions are only used in the file in which they are declared and don't need a declaration, but can be made static. So this patch marks these functions with 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	staging: android: ion: mark symbols static where possible	Baoyou Xie
	We get 4 warnings when building kernel with W=1: drivers/staging/android/ion/ion_carveout_heap.c:36:17: warning: no previous prototype for 'ion_carveout_allocate' [-Wmissing-prototypes] drivers/staging/android/ion/ion_carveout_heap.c:50:6: warning: no previous prototype for 'ion_carveout_free' [-Wmissing-prototypes] drivers/staging/android/ion/ion_of.c:28:5: warning: no previous prototype for 'ion_parse_dt_heap_common' [-Wmissing-prototypes] drivers/staging/android/ion/ion_of.c:54:5: warning: no previous prototype for 'ion_setup_heap_common' [-Wmissing-prototypes] In fact, these functions are only used in the file in which they are declared and don't need a declaration, but can be made static. So this patch marks these functions with 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	staging: most: aim-cdev: make syscall write accept buffers of arbitrary size	Christian Gromm
	This patch allows calling the write() function for synchronous and isochronous channels with buffers of any size. The AIM simply waits for data to fill up the MOST buffer object according to the network interface controller specification for streaming channels, before it submits the buffer to the HDM. The new behavior is backward compatible with the old applications, since all known applications needed to fill the buffer completely anyway. Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	staging: greybus: Use setup_timer function	sayli karnik
	This patch uses setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: sayli karnik <karniksayli1995@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-23	Merge tag 'configfs-for-4.8-2' of git://git.infradead.org/users/hch/configfs	Linus Torvalds
	Pull configfs fix from Christoph Hellwig: "One more trivial fix for the binary attribute code from Phil Turnbull" * tag 'configfs-for-4.8-2' of git://git.infradead.org/users/hch/configfs: configfs: Return -EFBIG from configfs_write_bin_file.
2016-09-23	blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx	Christoph Hellwig
	This gives the caller feedback that a given hctx is not mapped and thus no command can be sent on it. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-23	MIPS: Fix pre-r6 emulation FPU initialisation	Paul Burton
	In the mipsr2_decoder() function, used to emulate pre-MIPSr6 instructions that were removed in MIPSr6, the init_fpu() function is called if a removed pre-MIPSr6 floating point instruction is the first floating point instruction used by the task. However, init_fpu() performs various actions that rely upon not being migrated. For example in the most basic case it sets the coprocessor 0 Status.CU1 bit to enable the FPU & then loads FP register context into the FPU registers. If the task were to migrate during this time, it may end up attempting to load FP register context on a different CPU where it hasn't set the CU1 bit, leading to errors such as: do_cpu invoked from kernel context![#2]: CPU: 2 PID: 7338 Comm: fp-prctl Tainted: G D 4.7.0-00424-g49b0c82 #2 task: 838e4000 ti: 88d38000 task.ti: 88d38000 $ 0 : 00000000 00000001 ffffffff 88d3fef8 $ 4 : 838e4000 88d38004 00000000 00000001 $ 8 : 3400fc01 801f8020 808e9100 24000000 $12 : dbffffff 807b69d8 807b0000 00000000 $16 : 00000000 80786150 00400fc4 809c0398 $20 : 809c0338 0040273c 88d3ff28 808e9d30 $24 : 808e9d30 00400fb4 $28 : 88d38000 88d3fe88 00000000 8011a2ac Hi : 0040273c Lo : 88d3ff28 epc : 80114178 _restore_fp+0x10/0xa0 ra : 8011a2ac mipsr2_decoder+0xd5c/0x1660 Status: 1400fc03 KERNEL EXL IE Cause : 1080002c (ExcCode 0b) PrId : 0001a920 (MIPS I6400) Modules linked in: Process fp-prctl (pid: 7338, threadinfo=88d38000, task=838e4000, tls=766527d0) Stack : 00000000 00000000 00000000 88d3fe98 00000000 00000000 809c0398 809c0338 808e9100 00000000 88d3ff28 00400fc4 00400fc4 0040273c 7fb69e18 004a0000 004a0000 004a0000 7664add0 8010de18 00000000 00000000 88d3fef8 88d3ff28 808e9100 00000000 766527d0 8010e534 000c0000 85755000 8181d580 00000000 00000000 00000000 004a0000 00000000 766527d0 7fb69e18 004a0000 80105c20 ... 
Call Trace: [<80114178>] _restore_fp+0x10/0xa0 [<8011a2ac>] mipsr2_decoder+0xd5c/0x1660 [<8010de18>] do_ri+0x90/0x6b8 [<80105c20>] ret_from_exception+0x0/0x10 Fix this by disabling preemption around the call to init_fpu(), ensuring that it starts & completes on one CPU. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: b0a668fb2038 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6") Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.0+ Patchwork: https://patchwork.linux-mips.org/patch/14305/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-23	arm/arm64: arch_timer: Use archdata to indicate vdso suitability	Scott Wood
	Instead of comparing the name to a magic string, use archdata to explicitly communicate whether the arch timer is suitable for direct vdso access. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-23	arm64: arch_timer: Work around QorIQ Erratum A-008585	Scott Wood
	Erratum A-008585 says that the ARM generic timer counter "has the potential to contain an erroneous value for a small number of core clock cycles every time the timer value changes". Accesses to TVAL (both read and write) are also affected due to the implicit counter read. Accesses to CVAL are not affected. The workaround is to reread TVAL and count registers until successive reads return the same value. Writes to TVAL are replaced with an equivalent write to CVAL. The workaround is enabled if the fsl,erratum-a008585 property is found in the timer node in the device tree. This can be overridden with the clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM users to enable the workaround until a mechanism is implemented to automatically communicate this information. This erratum can be found on LS1043A and LS2080A. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Scott Wood <oss@buserror.net> [will: renamed read macro to reflect that it's not usually unstable] Signed-off-by: Will Deacon <will.deacon@arm.com>
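The reread-until-stable idea can be shown with a userspace simulation. This is only a sketch under loud assumptions: the "register" is a fake that returns a corrupted value on every third read, the counter does not advance, and all `demo_*` names are hypothetical rather than taken from the arch timer driver.

```c
#include <stdint.h>

/* Simulated hardware state (hypothetical): a fixed counter value. */
static uint64_t demo_true_value = 1000;
static uint64_t demo_reads;

/* Fake raw register read: every third read returns a glitched value. */
static uint64_t demo_read_counter_raw(void)
{
	demo_reads++;
	if (demo_reads % 3 == 0)
		return demo_true_value ^ 0xff00;	/* transient bad value */
	return demo_true_value;
}

/* Reread until two successive reads agree, then trust the value. */
uint64_t demo_read_counter_stable(void)
{
	uint64_t prev, cur;

	cur = demo_read_counter_raw();
	do {
		prev = cur;
		cur = demo_read_counter_raw();
	} while (cur != prev);

	return cur;
}

/*
 * A relative (TVAL-style) timeout turned into an absolute
 * (CVAL-style) compare value, using the stable counter read.
 */
uint64_t demo_tval_to_cval(uint32_t tval)
{
	return demo_read_counter_stable() + tval;
}
```

In this simulation a glitched read is always followed by a good one, so the loop ends after at most two extra reads. How many rereads real hardware needs is not stated in the commit message.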
2016-09-23	acpi: Fix broken error check in map_processor()	Thomas Gleixner
	map_processor() checks the cpuid value returned by acpi_map_cpuid() for -1 but acpi_map_cpuid() returns -EINVAL in case of error. As a consequence the error is ignored and the following access into percpu data with that negative cpuid results in a boot crash. This happens always when NR_CPUS/nr_cpu_ids is smaller than the number of processors listed in the ACPI tables. Use a proper error check for id < 0 so the function returns instead of trying to map CPU#(-EINVAL). Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> Cc: akpm@linux-foundation.org Cc: chen.tang@easystack.cn Cc: cl@linux.com Cc: gongzhaogang@inspur.com Cc: isimatu.yasuaki@jp.fujitsu.com Cc: izumi.taku@jp.fujitsu.com Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: len.brown@intel.com Cc: lenb@kernel.org Cc: linux-acpi@vger.kernel.org Cc: linux-mm@kvack.org Cc: mika.j.penttila@gmail.com Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: tj@kernel.org Cc: yasu.isimatu@gmail.com Fixes: dc6db24d2476 ("x86/acpi: Set persistent cpuid <-> nodeid mapping when booting") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1609231705570.5640@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
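The bug class described above, comparing against -1 when the callee returns a negative errno, can be reproduced in a few lines. The sketch below is not the ACPI code: `DEMO_NR_CPUS`, the per-CPU array and the `demo_*` functions are hypothetical, and only the shape of the check follows the commit message.

```c
#include <errno.h>

/* Hypothetical CPU limit for the demo. */
#define DEMO_NR_CPUS 4

/* Stand-in for per-CPU data indexed by logical CPU id. */
static int demo_percpu_node[DEMO_NR_CPUS];

/* Returns a logical CPU id, or -EINVAL when the id cannot be mapped. */
int demo_map_cpuid(int apic_id)
{
	if (apic_id < 0 || apic_id >= DEMO_NR_CPUS)
		return -EINVAL;
	return apic_id;
}

/*
 * Testing "id == -1" would miss -EINVAL and let a negative index
 * reach the array below. Testing "id < 0" catches every negative
 * error code.
 */
int demo_map_processor(int apic_id, int node)
{
	int id = demo_map_cpuid(apic_id);

	if (id < 0)
		return id;

	demo_percpu_node[id] = node;
	return 0;
}
```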
2016-09-23	cfq: fix starvation of asynchronous writes	Glauber Costa
	While debugging timeouts happening in my application workload (ScyllaDB), I have observed calls to open() taking a long time, ranging everywhere from 2 seconds - the first ones that are enough to time out my application - to more than 30 seconds. The problem seems to happen because XFS may block on pending metadata updates under certain circumstances, and that's confirmed with the following backtrace taken by the offcputime tool (iovisor/bcc): ffffffffb90c57b1 finish_task_switch ffffffffb97dffb5 schedule ffffffffb97e310c schedule_timeout ffffffffb97e1f12 __down ffffffffb90ea821 down ffffffffc046a9dc xfs_buf_lock ffffffffc046abfb _xfs_buf_find ffffffffc046ae4a xfs_buf_get_map ffffffffc046babd xfs_buf_read_map ffffffffc0499931 xfs_trans_read_buf_map ffffffffc044a561 xfs_da_read_buf ffffffffc0451390 xfs_dir3_leaf_read.constprop.16 ffffffffc0452b90 xfs_dir2_leaf_lookup_int ffffffffc0452e0f xfs_dir2_leaf_lookup ffffffffc044d9d3 xfs_dir_lookup ffffffffc047d1d9 xfs_lookup ffffffffc0479e53 xfs_vn_lookup ffffffffb925347a path_openat ffffffffb9254a71 do_filp_open ffffffffb9242a94 do_sys_open ffffffffb9242b9e sys_open ffffffffb97e42b2 entry_SYSCALL_64_fastpath 00007fb0698162ed [unknown] Inspecting my run with blktrace, I can see that the xfsaild kthread exhibits very high "Dispatch wait" times, in the range of dozens of seconds and consistent with the open() times I have seen in that run. 
Still from the blktrace output, we can after searching a bit, identify the request that wasn't dispatched: 8,0 11 152 81.092472813 804 A WM 141698288 + 8 <- (8,1) 141696240 8,0 11 153 81.092472889 804 Q WM 141698288 + 8 [xfsaild/sda1] 8,0 11 154 81.092473207 804 G WM 141698288 + 8 [xfsaild/sda1] 8,0 11 206 81.092496118 804 I WM 141698288 + 8 ( 22911) [xfsaild/sda1] <==== 'I' means Inserted (into the IO scheduler) ===================================> 8,0 0 289372 96.718761435 0 D WM 141698288 + 8 (15626265317) [swapper/0] <==== Only 15s later the CFQ scheduler dispatches the request ======================> As we can see above, in this particular example CFQ took 15 seconds to dispatch this request. Going back to the full trace, we can see that the xfsaild queue had plenty of opportunity to run, and it was selected as the active queue many times. It would just always be preempted by something else (example): 8,0 1 0 81.117912979 0 m N cfq1618SN / insert_request 8,0 1 0 81.117913419 0 m N cfq1618SN / add_to_rr 8,0 1 0 81.117914044 0 m N cfq1618SN / preempt 8,0 1 0 81.117914398 0 m N cfq767A / slice expired t=1 8,0 1 0 81.117914755 0 m N cfq767A / resid=40 8,0 1 0 81.117915340 0 m N / served: vt=1948520448 min_vt=1948520448 8,0 1 0 81.117915858 0 m N cfq767A / sl_used=1 disp=0 charge=0 iops=1 sect=0 where cfq767 is the xfsaild queue and cfq1618 corresponds to one of the ScyllaDB IO dispatchers. The requests preempting the xfsaild queue are synchronous requests. That's a characteristic of ScyllaDB workloads, as we only ever issue O_DIRECT requests. While it can be argued that preempting ASYNC requests in favor of SYNC is part of the CFQ logic, I don't believe that doing so for 15+ seconds is anyone's goal. Moreover, unless I am misunderstanding something, that breaks the expectation set by the "fifo_expire_async" tunable, which in my system is set to the default. 
Looking at the code, it seems to me that the issue is that after we make an async queue active, there is no guarantee that it will execute any request. When the queue itself tests if it cfq_may_dispatch() it can bail if it sees SYNC requests in flight. An incoming request from another queue can also preempt it in such situation before we have the chance to execute anything (as seen in the trace above). This patch sets the must_dispatch flag if we notice that we have requests that are already fifo_expired. This flag is always cleared after cfq_dispatch_request() returns from cfq_dispatch_requests(), so it won't pin the queue for subsequent requests (unless they are themselves expired) Care is taken during preempt to still allow rt requests to preempt us regardless. Testing my workload with this patch applied produces much better results. From the application side I see no timeouts, and the open() latency histogram generated by systemtap looks much better, with the worst outlier at 131ms: Latency histogram of xfs_buf_lock acquisition (microseconds): value \|-------------------------------------------------- count 0 \| 11 1 \|@@@@ 161 2 \|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1966 4 \|@ 54 8 \| 36 16 \| 7 32 \| 0 64 \| 0 ~ 1024 \| 0 2048 \| 0 4096 \| 1 8192 \| 1 16384 \| 2 32768 \| 0 65536 \| 0 131072 \| 1 262144 \| 0 524288 \| 0 Signed-off-by: Glauber Costa <glauber@scylladb.com> CC: Jens Axboe <axboe@kernel.dk> CC: linux-block@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Glauber Costa <glauber@scylladb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
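The must_dispatch behaviour described above can be modelled with a small state sketch. It is a simplification under stated assumptions, not the CFQ implementation: the struct, its fields and the three `demo_*` functions are hypothetical, time is a plain integer, and preemption is reduced to "anything may preempt unless the active queue must dispatch, except that rt always may".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an async queue holding one pending request. */
struct demo_queue {
	bool has_request;	/* a request is waiting in the queue */
	bool must_dispatch;	/* set once the request is fifo-expired */
	uint64_t fifo_deadline;	/* time at which the request expires */
};

/* May the active queue dispatch now? */
bool demo_may_dispatch(struct demo_queue *q, uint64_t now, bool sync_in_flight)
{
	/* An expired request pins the queue until it has dispatched. */
	if (q->has_request && q->fifo_deadline <= now)
		q->must_dispatch = true;

	/* Without the flag, async work bails while sync I/O is in flight. */
	if (sync_in_flight && !q->must_dispatch)
		return false;

	return q->has_request;
}

/* May an incoming request preempt the active queue? */
bool demo_may_preempt(const struct demo_queue *active, bool new_is_rt)
{
	if (new_is_rt)
		return true;	/* rt requests preempt regardless */
	return !active->must_dispatch;
}

/* Clear the flag after a dispatch so later requests are not pinned. */
void demo_dispatch_done(struct demo_queue *q)
{
	q->has_request = false;
	q->must_dispatch = false;
}
```

With a deadline of 100, the queue is refused dispatch and can be preempted at time 50 while sync I/O is in flight. At time 100 it is allowed to dispatch and only an rt request can still preempt it.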
2016-09-23	blk-mq: fixup "Convert to new hotplug state machine"	Sebastian Andrzej Siewior
	The "blk_mq_queue_reinit_dead()" just cleared the cpumask instead of doing a copy. Since we might never have had an online callback we could end up with a ZERO mask, which in turn leads to a crash, as the test robot demonstrated. Fixes: 65d5291eee66 ("blk-mq: Convert to new hotplug state machine") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-23	drm/amdgpu: fix amdgpu_vm_bo_update param error	Flora Cui
	Signed-off-by: Flora Cui <Flora.Cui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-23	Merge branch 'fixes' into devel	Linus Walleij

2016-09-23	of: Add vendor prefix for Engicam s.r.l company	Jagan Teki
	Engicam provides design services for electronic systems with high technology content, relying on long experience in electronic design. For more info visit http://www.engicam.com/en/ Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Fabio Estevam <fabio.estevam@nxp.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Matteo Lisi <matteo.lisi@engicam.com> Cc: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Jagan Teki <jagan@amarulasolutions.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-09-23	ARM: gic-v3: Work around definition of gic_write_bpr1	Marc Zyngier
	A new accessor for gic_write_bpr1 is added to arch_gicv3.h in 4.9, whilst the CP15 accessors are redefined in a separate branch. This leads to a horrible clash, where the new accessor ends up with a crap "asm volatile" definition. Work around this by carrying our own definition of gic_write_bpr1, creating a small conflict which will be obvious to resolve. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-09-23	arm64: arch_timer: Add device tree binding for A-008585 erratum	Scott Wood
	This erratum describes a bug in logic outside the core, so MIDR can't be used to identify its presence, and reading an SoC-specific revision register from common arch timer code would be awkward. So, describe it in the device tree. Signed-off-by: Scott Wood <oss@buserror.net> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-23	drm/amdgpu: Constify tables	Nils Wallménius
	Mark some powerplay tables as 'const' and adjust pointers accessing them to avoid introducing warnings. Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-23	rxrpc: Add a tracepoint to log which packets will be retransmitted	David Howells
	Add a tracepoint to log in rxrpc_resend() which packets will be retransmitted. Note that if a positive ACK comes in whilst we have dropped the lock to retransmit another packet, the actual retransmission may not happen, though some of the effects will (such as altering the congestion management). Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Add tracepoint for ACK proposal	David Howells
	Add a tracepoint to log proposed ACKs, including whether the proposal is used to update a pending ACK or is discarded in favour of an earlier, higher priority ACK. Whilst we're at it, get rid of the rxrpc_acks() function and access the name array directly. We do, however, need to validate the ACK reason number given to trace_rxrpc_rx_ack() to make sure we don't overrun the array. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Add a tracepoint to log injected Rx packet loss	David Howells
	Add a tracepoint to log received packets that get discarded due to Rx packet loss. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Add data Tx tracepoint and adjust Tx ACK tracepoint	David Howells
	Add a tracepoint to log transmission of DATA packets (including loss injection). Adjust the ACK transmission tracepoint to include the packet serial number and to line this up with the DATA transmission display. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Add a tracepoint for the call timer	David Howells
	Add a tracepoint to log call timer initiation, setting and expiry. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Don't call the tx_ack tracepoint if don't generate an ACK	David Howells
	rxrpc_send_call_packet() is invoking the tx_ack tracepoint before it checks whether there's an ACK to transmit (another thread may jump in and transmit it). Fix this by only invoking the tracepoint if we get a valid ACK to transmit. Further, only allocate a serial number if we're going to actually transmit something. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Pass the last Tx packet marker in the annotation buffer	David Howells
	When the last packet of data to be transmitted on a call is queued, tx_top is set and then the RXRPC_CALL_TX_LAST flag is set. Unfortunately, this leaves a race in the ACK processing side of things because the flag affects the interpretation of tx_top and also allows us to start receiving reply data before we've finished transmitting. To fix this, make the following changes: (1) rxrpc_queue_packet() now sets a marker in the annotation buffer instead of setting the RXRPC_CALL_TX_LAST flag. (2) rxrpc_rotate_tx_window() detects the marker and sets the flag in the same context as the routines that use it. (3) rxrpc_end_tx_phase() is simplified to just shift the call state. The Tx window must have been rotated before calling to discard the last packet. (4) rxrpc_receiving_reply() is added to handle the arrival of the first DATA packet of a reply to a client call (which is an implicit ACK of the Tx phase). (5) The last part of rxrpc_input_ack() is reordered to perform Tx rotation, then soft-ACK application and then to end the phase if we've rotated the last packet. In the event of a terminal ACK, the soft-ACK application will be skipped as nAcks should be 0. (6) rxrpc_input_ackall() now has to rotate as well as ending the phase. In addition: (7) Alter the transmit tracepoint to log the rotation of the last packet. (8) Remove the no-longer relevant queue_reqack tracepoint note. The ACK-REQUESTED packet header flag is now set as needed when we actually transmit the packet and may vary by retransmission. Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-23	rxrpc: Fix call timer	David Howells
	Fix the call timer in the following ways: (1) If call->resend_at or call->ack_at are before or equal to the current time, then ignore that timeout. (2) If call->expire_at is before or equal to the current time, then don't set the timer at all (possibly we should queue the call). (3) Don't skip modifying the timer if timer_pending() is true. This indicates that the timer is working, not that it has expired and is running/waiting to run its expiry handler. Also call rxrpc_set_timer() to start the call timer going rather than calling add_timer(). Signed-off-by: David Howells <dhowells@redhat.com>
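Rules (1) and (2) above amount to choosing the earliest timeout that is still in the future, and setting no timer at all once the call has expired. A minimal sketch of that selection follows, assuming plain integer timestamps and a hypothetical `demo_call` struct. The field names mirror the commit message, but the function is illustrative and is not rxrpc_set_timer().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of per-call timeouts, in arbitrary time units. */
struct demo_call {
	uint64_t expire_at;
	uint64_t resend_at;
	uint64_t ack_at;
};

/*
 * Pick the time the call timer should fire at.
 * Returns false when no timer should be set because the call has
 * already expired; otherwise stores the chosen time in *when.
 */
bool demo_pick_timeout(const struct demo_call *call, uint64_t now,
		       uint64_t *when)
{
	uint64_t t = call->expire_at;

	/* Rule (2): an expiry at or before now means no timer at all. */
	if (call->expire_at <= now)
		return false;

	/* Rule (1): resend/ack timeouts at or before now are ignored. */
	if (call->resend_at > now && call->resend_at < t)
		t = call->resend_at;
	if (call->ack_at > now && call->ack_at < t)
		t = call->ack_at;

	*when = t;
	return true;
}
```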
2016-09-23	rxrpc: Fix accidental cancellation of scheduled resend by ACK parser	David Howells
	When rxrpc_input_soft_acks() is parsing the soft-ACKs from an ACK packet, it updates the Tx packet annotations in the annotation buffer. If a soft-ACK is an ACK, then we overwrite unack'd, nak'd or to-be-retransmitted states and that is fine; but if the soft-ACK is an NACK, we overwrite the to-be-retransmitted with a nak - which isn't. Instead, we need to let any scheduled retransmission stand if the packet was NAK'd. Note that we don't reissue a resend if the annotation is in the to-be-retransmitted state because someone else must've scheduled the resend already. Signed-off-by: David Howells <dhowells@redhat.com>
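The annotation rule described above can be written as a small transition function. This is a sketch with hypothetical names: the enum values and `demo_apply_soft_ack()` are not the rxrpc identifiers, and only the rule itself (an ACK overwrites any state, a NAK leaves a scheduled retransmission alone) comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical per-packet annotation states. */
enum demo_annotation {
	DEMO_UNACKED,
	DEMO_ACKED,
	DEMO_NAKED,
	DEMO_RETRANS,	/* retransmission already scheduled */
};

/* Apply one soft-ACK entry to a packet's current annotation. */
enum demo_annotation demo_apply_soft_ack(enum demo_annotation cur, bool is_ack)
{
	/* An ACK may overwrite any earlier state. */
	if (is_ack)
		return DEMO_ACKED;

	/* A NAK must not cancel a retransmission that is already scheduled. */
	if (cur == DEMO_RETRANS)
		return DEMO_RETRANS;

	return DEMO_NAKED;
}
```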