linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2018-10-19	Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-4.20/block	Jens Axboe
	Pull NVMe updates from Christoph: "The second batch of updates for Linux 4.20: - lot of fixes for issues found by static type checkers from Bart - two small fixes from Keith - fabrics cleanups in preparation of the TCP transport from Sagi - more cleanups from Chaitanya" * 'nvme-4.20' of git://git.infradead.org/nvme: nvme-fabrics: move controller options matching to fabrics nvme-rdma: always have a valid trsvcid nvme-pci: remove duplicate check nvme-pci: fix hot removal during error handling nvmet-fcloop: suppress a compiler warning nvme-core: make implicit seed truncation explicit nvmet-fc: fix kernel-doc headers nvme-fc: rework the request initialization code nvme-fc: introduce struct nvme_fcp_op_w_sgl nvme-fc: fix kernel-doc headers nvmet: avoid integer overflow in the discard code nvmet-rdma: declare local symbols static nvmet: use strlcpy() instead of strcpy() nvme-pci: fix nvme_suspend_queue() kernel-doc header nvme-core: rework a NQN copying operation nvme-core: declare local symbols static nvmet-rdma: check for timeout in nvme_rdma_wait_for_cm() nvmet: use strcmp() instead of strncmp() for subsystem lookup nvmet: remove unreachable code nvme: update node paths after adding new path
2018-10-19	genirq: Fix race on spurious interrupt detection	Lukas Wunner
	Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs") made detection of spurious interrupts work for threaded handlers by: a) incrementing a counter every time the thread returns IRQ_HANDLED, and b) checking whether that counter has increased every time the thread is woken. However for oneshot interrupts, the commit unmasks the interrupt before incrementing the counter. If another interrupt occurs right after unmasking but before the counter is incremented, that interrupt is incorrectly considered spurious: time \| irq_thread() \| irq_thread_fn() \| action->thread_fn() \| irq_finalize_oneshot() \| unmask_threaded_irq() /* interrupt is unmasked / \| \| / interrupt fires, incorrectly deemed spurious / \| \| atomic_inc(&desc->threads_handled); / counter is incremented */ v This is observed with a hi3110 CAN controller receiving data at high volume (from a separate machine sending with "cangen -g 0 -i -x"): The controller signals a huge number of interrupts (hundreds of millions per day) and every second there are about a dozen which are deemed spurious. In theory with high CPU load and the presence of higher priority tasks, the number of incorrectly detected spurious interrupts might increase beyond the 99,900 threshold and cause disablement of the interrupt. In practice it just increments the spurious interrupt count. But that can cause people to waste time investigating it over and over. Fix it by moving the accounting before the invocation of irq_finalize_oneshot(). [ tglx: Folded change log update ] Fixes: 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mathias Duckeck <m.duckeck@kunbus.de> Cc: Akshay Bhat <akshay.bhat@timesys.com> Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com> Cc: stable@vger.kernel.org # v3.16+ Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de
2018-10-19	arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work	James Morse
	enable_smccc_arch_workaround_1() passes NULL as the hyp_vecs start and end if the HVC conduit is in use, and ARM_SMCCC_ARCH_WORKAROUND_1 is detected. If the guest kernel happened to be built with KVM_INDIRECT_VECTORS, we go on to allocate a slot, memcpy() the empty workaround in and do the appropriate cache maintenance. This works as we always tell memcpy() the range is 0, so it never accesses the NULL src pointer, but we still do the cache maintenance. If hyp_vecs_start is NULL we know we're a guest, just update the fn like the !KVM_INDIRECT_VECTORS version. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-19	regulator: lochnagar: Use a consisent comment style for SPDX header	Charles Keepax
	Update the rest of the comment at the start of the file to also use C++ style comments to match the required style of the SPDX header. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	Merge tag 'kvmarm-for-v4.20' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for 4.20 - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups
2018-10-19	x86/intel_rdt: Prevent pseudo-locking from using stale pointers	Jithu Joseph
	When the last CPU in an rdt_domain goes offline, its rdt_domain struct gets freed. Current pseudo-locking code is unaware of this scenario and tries to dereference the freed structure in a few places. Add checks to prevent pseudo-locking code from doing this. While further work is needed to seamlessly restore resource groups (not just pseudo-locking) to their configuration when the domain is brought back online, the immediate issue of invalid pointers is addressed here. Fixes: f4e80d67a5274 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information") Fixes: 443810fe61605 ("x86/intel_rdt: Create debugfs files for pseudo-locking testing") Fixes: 746e08590b864 ("x86/intel_rdt: Create character device exposing pseudo-locked region") Fixes: 33dc3e410a0d9 ("x86/intel_rdt: Make CPU information accessible for pseudo-locked regions") Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/231f742dbb7b00a31cc104416860e27dba6b072d.1539384145.git.reinette.chatre@intel.com
2018-10-19	fs: group frequently accessed fields of struct super_block together	Amir Goldstein
	Kernel test robot reported [1] a 6% performance regression in a concurrent unlink(2) workload on commit 60f7ed8c7c4d ("fsnotify: send path type events to group with super block marks"). The performance test was run with no fsnotify marks at all on the data set, so the only extra instructions added by the offending commit are tests of the super_block fields s_fsnotify_{marks,mask} and these tests happen on almost every single inode access. When adding those fields to the super_block struct, we did not give much thought of placing them on a hot cache lines (we just placed them at the end of the struct). Re-organize struct super_block to try and keep some frequently accessed fields on the same cache line. Move the frequently accessed fields s_fsnotify_{marks,mask} near the frequently accessed fields s_fs_info,s_time_gran, while filling a 64bit alignment hole after s_time_gran. Move the seldom accessed fields s_id,s_uuid,s_max_links,s_mode near the seldom accessed fields s_vfs_rename_mutex,s_subtype. Rong Chen confirmed that this patch solved the reported problem. [1] https://lkml.org/lkml/2018/9/30/206 Reported-by: kernel test robot <rong.a.chen@intel.com> Tested-by: kernel test robot <rong.a.chen@intel.com> Fixes: 1e6cb72399 ("fsnotify: add super block object type") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-10-19	spi: omap2-mcspi: Add slave mode support	Vignesh R
	Add support to use McSPI controller as SPI slave. In slave mode, DMA TX completion does not mean entire data has been shifted out as data might still be stuck in FIFO waiting for master to clock the bus. Therefore, add an IRQ handler for slave mode to know when entire data in FIFO has been shifted out. Signed-off-by: Vignesh R <vigneshr@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	spi: omap2-mcspi: Set FIFO DMA trigger level to word length	Vignesh R
	McSPI has 32 byte FIFO in Transmit-Receive mode. Current code tries to configuration FIFO watermark level for DMA trigger to be GCD of transfer length and max FIFO size which would mean trigger level may be set to 32 for transmit-receive mode if length is aligned. This does not work in case of SPI slave mode where FIFO always needs to have data ready whenever master starts the clock. With DMA trigger size of 32 there will be a small window during slave TX where DMA is still putting data into FIFO but master would have started clock for next byte, resulting in shifting out of stale data. Similarly, on Slave RX side there may be RX FIFO overflow Fix this by setting FIFO watermark for DMA trigger to word length. This means DMA is triggered as soon as FIFO has space for word length bytes and DMA would make sure FIFO is almost always full therefore improving FIFO occupancy in both master and slave mode. Signed-off-by: Vignesh R <vigneshr@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	spi: omap2-mcspi: Switch to readl_poll_timeout()	Vignesh R
	Use standard readl_poll_timeout() macro for polling on status bits. Signed-off-by: Vignesh R <vigneshr@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	spi: spi-mem: add stm32 qspi controller	Ludovic Barre
	The qspi controller is a specialized communication interface targeting single, dual or quad SPI Flash memories (NOR/NAND). It can operate in any of the following modes: -indirect mode: all the operations are performed using the quadspi registers -read memory-mapped mode: the external Flash memory is mapped to the microcontroller address space and is seen by the system as if it was an internal memory tested on: -NOR: mx66l51235l -NAND: MT29F2G01ABAGD Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	dt-bindings: spi: add stm32 qspi controller	Ludovic Barre
	This patch adds the documentation of device tree bindings for the STM32 QSPI controller. It is a specialized communication interface targeting single, dual or quad SPI Flash memories (NOR/NAND). Signed-off-by: Ludovic Barre <ludovic.barre@st.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	Merge remote-tracking branch 'asoc/for-4.19' into asoc-4.20	Mark Brown

2018-10-19	ASoC: wm_adsp: Log addresses as 8 digits in wm_adsp_buffer_populate	Richard Fitzgerald
	Increase the address value width in the debug log from 4 digits to 8 digits to allow for DSP cores with larger memory address ranges. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: wm_adsp: Rename memory fields in wm_adsp_buffer	Richard Fitzgerald
	The wm_adsp_buffer struct is the control header of a circular buffer used to transfer data from the firmware over the control interface to an ALSA compressed stream. The original names of the fields pointing to the data buffer were based on ADSP2V2 memory layout where they correspond to {XM, XM, YM}. But this circular buffer could be used on other types of DSP core that have different memory region types. Also the names and description of the size fields were not very clear. The field names and descriptions have been changed to be generic and not imply any particular memory types. This patch updates the wm_adsp driver to the new field names. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	regulator: bd718x7: Remove struct bd718xx_pmic	Axel Lin
	All the fields in struct bd718xx_pmic are not really necessary. Remove struct bd718xx_pmic to simplify the code. Signed-off-by: Axel Lin <axel.lin@ingics.com> Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	nvme-fabrics: move controller options matching to fabrics	Sagi Grimberg
	IP transports will most likely use the same controller options matching when detecting a duplicate connect. Move it to fabrics. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-10-19	regmap: use less #ifdef for LOG_DEVICE	Ben Dooks
	Move the checking of the LOG_DEVICE into a function to reduce the number of #ifdefs and ensure more of the code gets compiled/checked, and make it easier to change this for internal debugging purposes (such as checking >1 device). Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	nvme-rdma: always have a valid trsvcid	Sagi Grimberg
	If not passed, we set the default trsvcid. We can rely on having trsvcid and can simplify the controller matching logic. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-10-19	ASoC: cs42l51: add mclk support	Olivier Moysan
	Add MCLK dapm to allow configuration of cirrus CS42l51 codec as a master clock consumer. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: stm32: sai: set sai as mclk clock provider	Olivier Moysan
	Add master clock generation support in STM32 SAI. The master clock provided by SAI can be used to feed a codec. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: dt-bindings: add mclk support to cs42l51	Olivier Moysan
	Add clocks properties to cs42l51 Cirrus codec, to support master clock provider. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: dt-bindings: add mclk provider support to stm32 sai	Olivier Moysan
	add mclk provider support to stm32 sai Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: soc-core: fix trivial checkpatch issues	Marcel Ziswiler
	Fix a few trivial aka cosmetic only checkpatch issues like long lines, wrong indentations, spurious blanks and newlines, missing newlines, multi-line comments etc. Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: dapm: Add support for hw_free on CODEC to CODEC links	Charles Keepax
	Currently, on power down for a CODEC to CODEC DAI link we only call digital_mute and shutdown. Provide a little more flexibility for drivers by adding a call to hw_free as well. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	media: rename soc_camera I2C drivers	Mauro Carvalho Chehab
	Those drivers are part of the legacy SoC camera framework. They're being converted to not use it, but sometimes we're keeping both legacy any new driver. This time, for example, we have two drivers on media with the same name: ov772x. That's bad. So, in order to prevent that to happen, let's prepend the SoC legacy drivers with soc_. No functional changes. Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-10-19	Revert "netfilter: xt_quota: fix the behavior of xt_quota module"	Pablo Neira Ayuso
	This reverts commit e9837e55b0200da544a095a1fca36efd7fd3ba30. When talking to Maze and Chenbo, we agreed to keep this back by now due to problems in the ruleset listing path with 32-bit arches. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	netfilter: nfnetlink_log: remove empty nfnetlink_log.h header file	Taehee Yoo
	/include/net/netfilter/nfnetlink_log.h file is empty. so that it can be removed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	netfilter: remove two unused variables.	Weongyo Jeong
	nft_dup_netdev_ingress_ops and nft_fwd_netdev_ingress_ops variables are no longer used at the code. Signed-off-by: Weongyo Jeong <weongyo.linux@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	regmap: Add regmap_noinc_write API	Ben Whitten
	The regmap API had a noinc_read function added for instances where devices supported returning data from an internal FIFO in a single read. This commit adds the noinc_write variant to allow writing to a non incrementing register, this is used in devices such as the sx1301 for loading firmware. Signed-off-by: Ben Whitten <ben.whitten@lairdtech.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	ASoC: Intel: kbl_da7219_max98927: minor white space clean up	Dan Carpenter
	I just added a couple missing tabs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-10-19	netfilter: nf_flow_table: do not remove offload when other netns's interface ↵	Taehee Yoo
	is down When interface is down, offload cleanup function(nf_flow_table_do_cleanup) is called and that checks whether interface index of offload and index of link down interface is same. but only interface index checking is not enough because flowtable is not pernet list. So that, if other netns's interface that has index is same with offload is down, that offload will be removed. This patch adds netns checking code to the offload cleanup routine. Fixes: 59c466dd68e7 ("netfilter: nf_flow_table: add a new flow state for tearing down offloading") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	netfilter: nf_flow_table: remove unnecessary parameter of ↵	Taehee Yoo
	nf_flow_table_cleanup() parameter net of nf_flow_table_cleanup() is not used. So that it can be removed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit ↵	Taehee Yoo
	routine When device is unregistered, flowtable flush routine is called by notifier_call(nf_tables_flowtable_event). and exit callback of nftables pernet_operation(nf_tables_exit_net) also has flowtable flush routine. but when network namespace is destroyed, both notifier_call and pernet_operation are called. hence flowtable flush routine in pernet_operation is unnecessary. test commands: %ip netns add vm1 %ip netns exec vm1 nft add table ip filter %ip netns exec vm1 nft add flowtable ip filter w \ { hook ingress priority 0\; devices = { lo }\; } %ip netns del vm1 splat looks like: [ 265.187019] WARNING: CPU: 0 PID: 87 at net/netfilter/core.c:309 nf_hook_entry_head+0xc7/0xf0 [ 265.187112] Modules linked in: nf_flow_table_ipv4 nf_flow_table nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip_tables x_tables [ 265.187390] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc3+ #5 [ 265.187453] Workqueue: netns cleanup_net [ 265.187514] RIP: 0010:nf_hook_entry_head+0xc7/0xf0 [ 265.187546] Code: 8d 81 68 03 00 00 5b c3 89 d0 83 fa 04 48 8d 84 c7 e8 11 00 00 76 81 0f 0b 31 c0 e9 78 ff ff ff 0f 0b 48 83 c4 08 31 c0 5b c3 <0f> 0b 31 c0 e9 65 ff ff ff 0f 0b 31 c0 e9 5c ff ff ff 48 89 0c 24 [ 265.187573] RSP: 0018:ffff88011546f098 EFLAGS: 00010246 [ 265.187624] RAX: ffffffff8d90e135 RBX: 1ffff10022a8de1c RCX: 0000000000000000 [ 265.187645] RDX: 0000000000000000 RSI: 0000000000000005 RDI: ffff880116298040 [ 265.187645] RBP: ffff88010ea4c1a8 R08: 0000000000000000 R09: 0000000000000000 [ 265.187645] R10: ffff88011546f1d8 R11: ffffed0022c532c1 R12: ffff88010ea4c1d0 [ 265.187645] R13: 0000000000000005 R14: dffffc0000000000 R15: ffff88010ea4c1c4 [ 265.187645] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000 [ 265.187645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 265.187645] CR2: 00007fdfb8d00000 CR3: 0000000057a16000 CR4: 00000000001006f0 [ 265.187645] Call Trace: [ 265.187645] __nf_unregister_net_hook+0xca/0x5d0 [ 265.187645] ? nf_hook_entries_free.part.3+0x80/0x80 [ 265.187645] ? save_trace+0x300/0x300 [ 265.187645] nf_unregister_net_hooks+0x2e/0x40 [ 265.187645] nf_tables_exit_net+0x479/0x1340 [nf_tables] [ 265.187645] ? find_held_lock+0x39/0x1c0 [ 265.187645] ? nf_tables_abort+0x30/0x30 [nf_tables] [ 265.187645] ? inet_frag_destroy_rcu+0xd0/0xd0 [ 265.187645] ? trace_hardirqs_on+0x93/0x210 [ 265.187645] ? __bpf_trace_preemptirq_template+0x10/0x10 [ 265.187645] ? inet_frag_destroy_rcu+0xd0/0xd0 [ 265.187645] ? inet_frag_destroy_rcu+0xd0/0xd0 [ 265.187645] ? __mutex_unlock_slowpath+0x17f/0x740 [ 265.187645] ? wait_for_completion+0x710/0x710 [ 265.187645] ? bucket_table_free+0xb2/0x1f0 [ 265.187645] ? nested_table_free+0x130/0x130 [ 265.187645] ? __lock_is_held+0xb4/0x140 [ 265.187645] ops_exit_list.isra.10+0x94/0x140 [ 265.187645] cleanup_net+0x45b/0x900 [ ... ] This WARNING means that hook unregisteration is failed because all flowtables hooks are already unregistered by notifier_call. Network namespace exit routine guarantees that all devices will be unregistered first. then, other exit callbacks of pernet_operations are called. so that removing flowtable flush routine in exit callback of pernet_operation(nf_tables_exit_net) doesn't make flowtable leak. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19	Btrfs: fix use-after-free during inode eviction	Filipe Manana
	At inode.c:evict_inode_truncate_pages(), when we iterate over the inode's extent states, we access an extent state record's "state" field after we unlocked the inode's io tree lock. This can lead to a use-after-free issue because after we unlock the io tree that extent state record might have been freed due to being merged into another adjacent extent state record (a previous inflight bio for a read operation finished in the meanwhile which unlocked a range in the io tree and cause a merge of extent state records, as explained in the comment before the while loop added in commit 6ca0709756710 ("Btrfs: fix hang during inode eviction due to concurrent readahead")). Fix this by keeping a copy of the extent state's flags in a local variable and using it after unlocking the io tree. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201189 Fixes: b9d0b38928e2 ("btrfs: Add handler for invalidate page") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: move the dio_sem higher up the callchain	Josef Bacik
	We're getting a lockdep splat because we take the dio_sem under the log_mutex. What we really need is to protect fsync() from logging an extent map for an extent we never waited on higher up, so just guard the whole thing with dio_sem. ====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411 Not tainted ------------------------------------------------------ aio-dio-invalid/30928 is trying to acquire lock: 0000000092621cfd (&mm->mmap_sem){++++}, at: get_user_pages_unlocked+0x5a/0x1e0 but task is already holding lock: 00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (&ei->dio_sem){++++}: lock_acquire+0xbd/0x220 down_write+0x51/0xb0 btrfs_log_changed_extents+0x80/0xa40 btrfs_log_inode+0xbaf/0x1000 btrfs_log_inode_parent+0x26f/0xa80 btrfs_log_dentry_safe+0x50/0x70 btrfs_sync_file+0x357/0x540 do_fsync+0x38/0x60 __ia32_sys_fdatasync+0x12/0x20 do_fast_syscall_32+0x9a/0x2f0 entry_SYSENTER_compat+0x84/0x96 -> #4 (&ei->log_mutex){+.+.}: lock_acquire+0xbd/0x220 __mutex_lock+0x86/0xa10 btrfs_record_unlink_dir+0x2a/0xa0 btrfs_unlink+0x5a/0xc0 vfs_unlink+0xb1/0x1a0 do_unlinkat+0x264/0x2b0 do_fast_syscall_32+0x9a/0x2f0 entry_SYSENTER_compat+0x84/0x96 -> #3 (sb_internal#2){.+.+}: lock_acquire+0xbd/0x220 __sb_start_write+0x14d/0x230 start_transaction+0x3e6/0x590 btrfs_evict_inode+0x475/0x640 evict+0xbf/0x1b0 btrfs_run_delayed_iputs+0x6c/0x90 cleaner_kthread+0x124/0x1a0 kthread+0x106/0x140 ret_from_fork+0x3a/0x50 -> #2 (&fs_info->cleaner_delayed_iput_mutex){+.+.}: lock_acquire+0xbd/0x220 __mutex_lock+0x86/0xa10 btrfs_alloc_data_chunk_ondemand+0x197/0x530 btrfs_check_data_free_space+0x4c/0x90 btrfs_delalloc_reserve_space+0x20/0x60 btrfs_page_mkwrite+0x87/0x520 do_page_mkwrite+0x31/0xa0 __handle_mm_fault+0x799/0xb00 handle_mm_fault+0x7c/0xe0 __do_page_fault+0x1d3/0x4a0 async_page_fault+0x1e/0x30 -> #1 (sb_pagefaults){.+.+}: lock_acquire+0xbd/0x220 __sb_start_write+0x14d/0x230 btrfs_page_mkwrite+0x6a/0x520 do_page_mkwrite+0x31/0xa0 __handle_mm_fault+0x799/0xb00 handle_mm_fault+0x7c/0xe0 __do_page_fault+0x1d3/0x4a0 async_page_fault+0x1e/0x30 -> #0 (&mm->mmap_sem){++++}: __lock_acquire+0x42e/0x7a0 lock_acquire+0xbd/0x220 down_read+0x48/0xb0 get_user_pages_unlocked+0x5a/0x1e0 get_user_pages_fast+0xa4/0x150 iov_iter_get_pages+0xc3/0x340 do_direct_IO+0xf93/0x1d70 __blockdev_direct_IO+0x32d/0x1c20 btrfs_direct_IO+0x227/0x400 generic_file_direct_write+0xcf/0x180 btrfs_file_write_iter+0x308/0x58c aio_write+0xf8/0x1d0 io_submit_one+0x3a9/0x620 __ia32_compat_sys_io_submit+0xb2/0x270 do_int80_syscall_32+0x5b/0x1a0 entry_INT80_compat+0x88/0xa0 other info that might help us debug this: Chain exists of: &mm->mmap_sem --> &ei->log_mutex --> &ei->dio_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->dio_sem); lock(&ei->log_mutex); lock(&ei->dio_sem); lock(&mm->mmap_sem); * DEADLOCK * 1 lock held by aio-dio-invalid/30928: #0: 00000000cefe6b35 (&ei->dio_sem){++++}, at: btrfs_direct_IO+0x3be/0x400 stack backtrace: CPU: 0 PID: 30928 Comm: aio-dio-invalid Not tainted 4.18.0-rc4-xfstests-00025-g5de5edbaf1d4 #411 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 Call Trace: dump_stack+0x7c/0xbb print_circular_bug.isra.37+0x297/0x2a4 check_prev_add.constprop.45+0x781/0x7a0 ? __lock_acquire+0x42e/0x7a0 validate_chain.isra.41+0x7f0/0xb00 __lock_acquire+0x42e/0x7a0 lock_acquire+0xbd/0x220 ? get_user_pages_unlocked+0x5a/0x1e0 down_read+0x48/0xb0 ? get_user_pages_unlocked+0x5a/0x1e0 get_user_pages_unlocked+0x5a/0x1e0 get_user_pages_fast+0xa4/0x150 iov_iter_get_pages+0xc3/0x340 do_direct_IO+0xf93/0x1d70 ? __alloc_workqueue_key+0x358/0x490 ? __blockdev_direct_IO+0x14b/0x1c20 __blockdev_direct_IO+0x32d/0x1c20 ? btrfs_run_delalloc_work+0x40/0x40 ? can_nocow_extent+0x490/0x490 ? kvm_clock_read+0x1f/0x30 ? can_nocow_extent+0x490/0x490 ? btrfs_run_delalloc_work+0x40/0x40 btrfs_direct_IO+0x227/0x400 ? btrfs_run_delalloc_work+0x40/0x40 generic_file_direct_write+0xcf/0x180 btrfs_file_write_iter+0x308/0x58c aio_write+0xf8/0x1d0 ? kvm_clock_read+0x1f/0x30 ? __might_fault+0x3e/0x90 io_submit_one+0x3a9/0x620 ? io_submit_one+0xe5/0x620 __ia32_compat_sys_io_submit+0xb2/0x270 do_int80_syscall_32+0x5b/0x1a0 entry_INT80_compat+0x88/0xa0 CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: don't run delayed_iputs in commit	Josef Bacik
	This could result in a really bad case where we do something like evict evict_refill_and_join btrfs_commit_transaction btrfs_run_delayed_iputs evict evict_refill_and_join btrfs_commit_transaction ... forever We have plenty of other places where we run delayed iputs that are much safer, let those do the work. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: fix insert_reserved error handling	Josef Bacik
	We were not handling the reserved byte accounting properly for data references. Metadata was fine, if it errored out the error paths would free the bytes_reserved count and pin the extent, but it even missed one of the error cases. So instead move this handling up into run_one_delayed_ref so we are sure that both cases are properly cleaned up in case of a transaction abort. CC: stable@vger.kernel.org # 4.18+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: only free reserved extent if we didn't insert it	Josef Bacik
	When we insert the file extent once the ordered extent completes we free the reserved extent reservation as it'll have been migrated to the bytes_used counter. However if we error out after this step we'll still clear the reserved extent reservation, resulting in a negative accounting of the reserved bytes for the block group and space info. Fix this by only doing the free if we didn't successfully insert a file extent for this extent. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: don't use ctl->free_space for max_extent_size	Josef Bacik
	max_extent_size is supposed to be the largest contiguous range for the space info, and ctl->free_space is the total free space in the block group. We need to keep track of these separately and _only_ use the max_free_space if we don't have a max_extent_size, as that means our original request was too large to search any of the block groups for and therefore wouldn't have a max_extent_size set. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: set max_extent_size properly	Josef Bacik
	We can't use entry->bytes if our entry is a bitmap entry, we need to use entry->max_extent_size in that case. Fix up all the logic to make this consistent. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	btrfs: reset max_extent_size properly	Josef Bacik
	If we use up our block group before allocating a new one we'll easily get a max_extent_size that's set really really low, which will result in a lot of fragmentation. We need to make sure we're resetting the max_extent_size when we add a new chunk or add new space. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	MAINTAINERS: update my email address for btrfs	Josef Bacik
	My work email is completely useless, switch it to my personal address so I get emails on a account I actually pay attention to. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-19	KVM: arm64: Safety check PSTATE when entering guest and handle IL	Christoffer Dall
	This commit adds a paranoid check when entering the guest to make sure we don't attempt running guest code in an equally or more privilged mode than the hypervisor. We also catch other accidental programming of the SPSR_EL2 which results in an illegal exception return and report this safely back to the user. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-19	drm/sun4i: Fix an ulong overflow in the dotclock driver	Boris Brezillon
	The calculated ideal rate can easily overflow an unsigned long, thus making the best div selection buggy as soon as no ideal match is found before the overflow occurs. Fixes: 4731a72df273 ("drm/sun4i: request exact rates to our parents") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181018100250.12565-1-boris.brezillon@bootlin.com
2018-10-19	KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips	Paul Mackerras
	This disables the use of the streamlined entry path for radix guests on early POWER9 chips that need the workaround added in commit a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with KVM", 2017-07-24), because the streamlined entry path does not include that workaround. This also means that we can't do nested HV-KVM on those chips. Since the chips that need that workaround are the same ones that can't run both radix and HPT guests at the same time on different threads of a core, we use the existing 'no_mixing_hpt_and_radix' variable that identifies those chips to identify when we can't use the new guest entry path, and when we can't do nested virtualization. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-10-19	Merge tag 'nand/for-4.20' of git://git.infradead.org/linux-mtd into mtd/next	Boris Brezillon
	NAND core changes: - Two batchs of cleanups of the NAND API, including: * Deprecating a lot of interfaces (now replaced by ->exec_op()). * Moving code in separate drivers (JEDEC, ONFI), in private files (internals), in platform drivers, etc. * Functions/structures reordering. * Exclusive use of the nand_chip structure instead of the MTD one all across the subsystem. - Addition of the nand_wait_readrdy/rdy_op() helpers. Raw NAND controllers drivers changes: - Various coccinelle patches. - Marvell: * Use regmap_update_bits() for syscon access. * More documentation. * BCH failure path rework. * More layouts to be supported. * IRQ handler complete() condition fixed. - Fsl_ifc: * SRAM initialization fixed for newer controller versions. - Denali: * Fix licenses mismatch and use a SPDX tag. * Set SPARE_AREA_SKIP_BYTES register to 8 if unset. - Qualcomm: * Do not include dma-direct.h. - Docg4: * Removed. - Ams-delta: * Use of a GPIO lookup table * Internal machinery changes. Raw NAND chip drivers changes: - Toshiba: * Add support for Toshiba memory BENAND * Pass a single nand_chip object to the status helper. - ESMT: * New driver to retrieve the ECC requirements from the 5th ID byte.
2018-10-19	Merge tag 'spi-nor/for-4.20' of git://git.infradead.org/linux-mtd into mtd/next	Boris Brezillon
	Core changes: * Support non-uniform erase size * Support controllers with limited TX fifo size Driver changes: * m25p80: Re-issue a WREN command after each write access * cadence: Pass a proper dir value to dma_[un]map_single() * fsl-qspi: Check fsl_qspi_get_seqid() return val make sure 4B addressing opcodes are properly handled * intel-spi: Add a new PCI entry for Ice Lake
2018-10-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Greg Kroah-Hartman
	David writes: "Networking 1) Fix gro_cells leak in xfrm layer, from Li RongQing. 2) BPF selftests change RLIMIT_MEMLOCK blindly, don't do that. From Eric Dumazet. 3) AF_XDP calls synchronize_net() under RCU lock, fix from Björn Töpel. 4) Out of bounds packet access in _decode_session6(), from Alexei Starovoitov. 5) Several ethtool bugs, where we copy a struct into the kernel twice and our validations of the values in the first copy can be invalidated by the second copy due to asynchronous updates to the memory by the user. From Wenwen Wang. 6) Missing netlink attribute validation in cls_api, from Davide Caratti. 7) LLC SAP sockets neet to be SOCK_RCU FREE, from Cong Wang. 8) rxrpc operates on wrong kvec, from Yue Haibing. 9) A regression was introduced by the disassosciation of route neighbour references in rt6_probe(), causing probe for neighbourless routes to not be properly rate limited. Fix from Sabrina Dubroca. 10) Unsafe RCU locking in tipc, from Tung Nguyen. 11) Use after free in inet6_mc_check(), from Eric Dumazet. 12) PMTU from icmp packets should update the SCTP transport pathmtu, from Xin Long. 13) Missing peer put on error in rxrpc, from David Howells. 14) Fix pedit in nfp driver, from Pieter Jansen van Vuuren. 15) Fix overflowing shift statement in qla3xxx driver, from Nathan Chancellor. 16) Fix Spectre v1 in ptp code, from Gustavo A. R. Silva. 17) udp6_unicast_rcv_skb() interprets udpv6_queue_rcv_skb() return value in an inverted manner, fix from Paolo Abeni. 18) Fix missed unresolved entries in ipmr dumps, from Nikolay Aleksandrov. 19) Fix NAPI handling under high load, we can completely miss events when NAPI has to loop more than one time in a cycle. From Heiner Kallweit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) ip6_tunnel: Fix encapsulation layout tipc: fix info leak from kernel tipc_event net: socket: fix a missing-check bug net: sched: Fix for duplicate class dump r8169: fix NAPI handling under high load net: ipmr: fix unresolved entry dumps net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion() sctp: fix the data size calculation in sctp_data_size virtio_net: avoid using netif_tx_disable() for serializing tx routine udp6: fix encap return code for resubmitting mlxsw: core: Fix use-after-free when flashing firmware during init sctp: not free the new asoc when sctp_wait_for_connect returns err sctp: fix race on sctp_id2asoc r8169: re-enable MSI-X on RTL8168g net: bpfilter: use get_pid_task instead of pid_task ptp: fix Spectre v1 vulnerability net: qla3xxx: Remove overflowing shift statement geneve, vxlan: Don't set exceptions if skb->len < mtu geneve, vxlan: Don't check skb_dst() twice sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead ...
2018-10-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc	Greg Kroah-Hartman
	David writes: "Sparc fixes: The main bit here is fixing how fallback system calls are handled in the sparc vDSO. Unfortunately, I fat fingered the commit and some perf debugging hacks slipped into the vDSO fix, which I revert in the very next commit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Revert unintended perf changes. sparc: vDSO: Silence an uninitialized variable warning sparc: Fix syscall fallback bugs in VDSO.