git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2019-11-18	btrfs: move btrfs_set_path_blocking to other locking functions	David Sterba
	The function belongs to the family of locking functions, so move it there. The 'noinline' keyword is dropped as it's now an exported function that does not need it. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: make btrfs_assert_tree_locked static inline	David Sterba
	The function btrfs_assert_tree_locked is used outside of the locking code so it is exported, however we can make it static inine as it's fairly trivial. This is the only locking assertion used in release builds, inlining improves the text size by 174 bytes and reduces stack consumption in the callers. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: make locking assertion helpers static inline	David Sterba
	I've noticed that none of the btrfs_assert_lock debugging helpers is inlined, despite they're short and mostly a value update. Making them inline shaves 67 from the text size, reduces stack consumption and perhaps also slightly improves the performance due to avoiding unnecessary calls. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: get rid of pointless wtag variable in async-thread.c	Omar Sandoval
	Commit ac0c7cf8be00 ("btrfs: fix crash when tracepoint arguments are freed by wq callbacks") added a void pointer, wtag, which is passed into trace_btrfs_all_work_done() instead of the freed work item. This is silly for a few reasons: 1. The freed work item still has the same address. 2. work is still in scope after it's freed, so assigning wtag doesn't stop anyone from using it. 3. The tracepoint has always taken a void * argument, so assigning wtag doesn't actually make things any more type-safe. (Note that the original bug in commit bc074524e123 ("btrfs: prefix fsid to all trace events") was that the void * was implicitly casted when it was passed to btrfs_work_owner() in the trace point itself). Instead, let's add some clearer warnings as comments. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: get rid of unique workqueue helper functions	Omar Sandoval
	Commit 9e0af2376434 ("Btrfs: fix task hang under heavy compressed write") worked around the issue that a recycled work item could get a false dependency on the original work item due to how the workqueue code guarantees non-reentrancy. It did so by giving different work functions to different types of work. However, the fixes in the previous few patches are more complete, as they prevent a work item from being recycled at all (except for a tiny window that the kernel workqueue code handles for us). This obsoletes the previous fix, so we don't need the unique helpers for correctness. The only other reason to keep them would be so they show up in stack traces, but they always seem to be optimized to a tail call, so they don't show up anyways. So, let's just get rid of the extra indirection. While we're here, rename normal_work_helper() to the more informative btrfs_work_helper(). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: don't prematurely free work in scrub_missing_raid56_worker()	Omar Sandoval
	Currently, scrub_missing_raid56_worker() puts and potentially frees sblock (which embeds the work item) and then submits a bio through scrub_wr_submit(). This is another potential instance of the bug in "btrfs: don't prematurely free work in run_ordered_work()". Fix it by dropping the reference after we submit the bio. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: don't prematurely free work in reada_start_machine_worker()	Omar Sandoval
	Currently, reada_start_machine_worker() frees the reada_machine_work and then calls __reada_start_machine() to do readahead. This is another potential instance of the bug in "btrfs: don't prematurely free work in run_ordered_work()". There _might_ already be a deadlock here: reada_start_machine_worker() can depend on itself through stacked filesystems (__read_start_machine() -> reada_start_machine_dev() -> reada_tree_block_flagged() -> read_extent_buffer_pages() -> submit_one_bio() -> btree_submit_bio_hook() -> btrfs_map_bio() -> submit_stripe_bio() -> submit_bio() onto a loop device can trigger readahead on the lower filesystem). Either way, let's fix it by freeing the work at the end. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: don't prematurely free work in end_workqueue_fn()	Omar Sandoval
	Currently, end_workqueue_fn() frees the end_io_wq entry (which embeds the work item) and then calls bio_endio(). This is another potential instance of the bug in "btrfs: don't prematurely free work in run_ordered_work()". In particular, the endio call may depend on other work items. For example, btrfs_end_dio_bio() can call btrfs_subio_endio_read() -> __btrfs_correct_data_nocsum() -> dio_read_error() -> submit_dio_repair_bio(), which submits a bio that is also completed through a end_workqueue_fn() work item. However, __btrfs_correct_data_nocsum() waits for the newly submitted bio to complete, thus it depends on another work item. This example currently usually works because we use different workqueue helper functions for BTRFS_WQ_ENDIO_DATA and BTRFS_WQ_ENDIO_DIO_REPAIR. However, it may deadlock with stacked filesystems and is fragile overall. The proper fix is to free the work item at the very end of the work function, so let's do that. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: don't prematurely free work in run_ordered_work()	Omar Sandoval
	We hit the following very strange deadlock on a system with Btrfs on a loop device backed by another Btrfs filesystem: 1. The top (loop device) filesystem queues an async_cow work item from cow_file_range_async(). We'll call this work X. 2. Worker thread A starts work X (normal_work_helper()). 3. Worker thread A executes the ordered work for the top filesystem (run_ordered_work()). 4. Worker thread A finishes the ordered work for work X and frees X (work->ordered_free()). 5. Worker thread A executes another ordered work and gets blocked on I/O to the bottom filesystem (still in run_ordered_work()). 6. Meanwhile, the bottom filesystem allocates and queues an async_cow work item which happens to be the recently-freed X. 7. The workqueue code sees that X is already being executed by worker thread A, so it schedules X to be executed _after_ worker thread A finishes (see the find_worker_executing_work() call in process_one_work()). Now, the top filesystem is waiting for I/O on the bottom filesystem, but the bottom filesystem is waiting for the top filesystem to finish, so we deadlock. This happens because we are breaking the workqueue assumption that a work item cannot be recycled while it still depends on other work. Fix it by waiting to free the work item until we are done with all of the related ordered work. P.S.: One might ask why the workqueue code doesn't try to detect a recycled work item. It actually does try by checking whether the work item has the same work function (find_worker_executing_work()), but in our case the function is the same. This is the only key that the workqueue code has available to compare, short of adding an additional, layer-violating "custom key". Considering that we're the only ones that have ever hit this, we should just play by the rules. Unfortunately, we haven't been able to create a minimal reproducer other than our full container setup using a compress-force=zstd filesystem on top of another compress-force=zstd filesystem. Suggested-by: Tejun Heo <tj@kernel.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: get rid of unnecessary memset() of work item	Omar Sandoval
	Commit fc97fab0ea59 ("btrfs: Replace fs_info->qgroup_rescan_worker workqueue with btrfs_workqueue.") converted qgroup_rescan_work to be initialized with btrfs_init_work(), but it left behind an unnecessary memset(). Get rid of the memset(). Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: move the failrec tree stuff into extent-io-tree.h	Josef Bacik
	This needs to be cleaned up in the future, but for now it belongs to the extent-io-tree stuff since it uses the internal tree search code. Needed to export get_state_failrec and set_state_failrec as well since we're not going to move the actual IO part of the failrec stuff out at this point. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: export find_delalloc_range	Josef Bacik
	This utilizes internal stuff to the extent_io_tree, so we need to export it before we move it. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: move extent_io_tree defs to their own header	Josef Bacik
	extent_io.c/h are huge, encompassing a bunch of different things. The extent_io_tree code can live on its own, so separate this out. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: separate out the extent io init function	Josef Bacik
	We are moving extent_io_tree into it's on file, so separate out the extent_state init stuff from extent_io_tree_init(). Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: separate out the extent leak code	Josef Bacik
	We check both extent buffer and extent state leaks in the same function, separate these two functions out so we can move them around. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: ctree: Remove stray comment of setting up path lock	Qu Wenruo
	The following comment shows up in btrfs_search_slot() with out much sense: /* * setup the path here so we can release it under lock * contention with the cow code / if (cow) { / code touching path->lock[] is far away from here / } This comment hasn't been cleaned up after the relevant code has been removed. The original code is introduced in commit 65b51a009e29 ("btrfs_search_slot: reduce lock contention by cowing in two stages"): + + / + * setup the path here so we can release it under lock + * contention with the cow code + */ + p->nodes[level] = b; + if (!p->skip_locking) + p->locks[level] = 1; + But in current code, we have different timing for modifying path lock, so just remove the comment. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: ctree: Reduce one indent level for btrfs_search_old_slot()	Qu Wenruo
	Similar to btrfs_search_slot() done in previous patch, make a shortcut for the level 0 case and allow to reduce indentation for the remaining case. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: ctree: Reduce one indent level for btrfs_search_slot()	Qu Wenruo
	In btrfs_search_slot(), we something like: if (level != 0) { /* Do search inside tree nodes/ } else { / Do search inside tree leaves / goto done; } This caused extra indent for tree node search code. Change it to something like: if (level == 0) { / Do search inside tree leaves / goto done' } / Do search inside tree nodes */ So we have more space to maneuver our code, this is especially useful as the tree nodes search code is more complex than the leaves search code. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: tree-checker: Add check for INODE_REF	Qu Wenruo
	For INODE_REF we will check: - Objectid (ino) against previous key To detect missing INODE_ITEM. - No overflow/padding in the data payload Much like DIR_ITEM, but with less members to check. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: tree-checker: Try to detect missing INODE_ITEM	Qu Wenruo
	For the following items, key->objectid is inode number: - DIR_ITEM - DIR_INDEX - XATTR_ITEM - EXTENT_DATA - INODE_REF So in the subvolume tree, such items must have its previous item share the same objectid, e.g.: (257 INODE_ITEM 0) (257 DIR_INDEX xxx) (257 DIR_ITEM xxx) (258 INODE_ITEM 0) (258 INODE_REF 0) (258 XATTR_ITEM 0) (258 EXTENT_DATA 0) But if we have the following sequence, then there is definitely something wrong, normally some INODE_ITEM is missing, like: (257 INODE_ITEM 0) (257 DIR_INDEX xxx) (257 DIR_ITEM xxx) (258 XATTR_ITEM 0) <<< objecitd suddenly changed to 258 (258 EXTENT_DATA 0) So just by checking the previous key for above inode based key types, we can detect a missing inode item. For INODE_REF key type, the check will be added along with INODE_REF checker. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	Btrfs: make btrfs_wait_extents() static	Filipe Manana
	It's not used ouside of transaction.c Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: Add assert to catch nested transaction commit	Nikolay Borisov
	A recent patch to btrfs showed that there was at least 1 case where a nested transaction was committed. Nested transaction in this case means a code which has a transaction handle calls some function which in turn obtains a copy of the same transaction handle. In such cases the correct thing to do is for the lower callee to call btrfs_end_transaction which contains appropriate checks so as to not commit the transaction which will result in stale trans handler for the caller. To catch such cases add an assert in btrfs_commit_transaction ensuring btrfs_trans_handle::use_count is always 1. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	btrfs: simplify inode locking for RWF_NOWAIT	Goldwyn Rodrigues
	This is similar to 942491c9e6d6 ("xfs: fix AIM7 regression"). Apparently our current rwsem code doesn't like doing the trylock, then lock for real scheme. This causes extra contention on the lock and can be measured eg. by AIM7 benchmark. So change our read/write methods to just do the trylock for the RWF_NOWAIT case. Fixes: edf064e7c6fe ("btrfs: nowait aio support") Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18	USB: documentation: flags on usb-storage versus UAS	Oliver Neukum
	Document which flags work storage, UAS or both Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114112758.32747-4-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	USB: uas: heed CAPACITY_HEURISTICS	Oliver Neukum
	There is no need to ignore this flag. We should be as close to storage in that regard as makes sense, so honor flags whose cost is tiny. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114112758.32747-3-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	USB: uas: honor flag to avoid CAPACITY16	Oliver Neukum
	Copy the support over from usb-storage to get feature parity Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114112758.32747-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	usb: host: xhci-tegra: Correct phy enable sequence	Nagarjuna Kristam
	XUSB phy needs to be enabled before un-powergating the power partitions. However in the current sequence, it happens opposite. Correct the phy enable and powergating partition sequence to avoid any boot hangs. Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Jui Chang Kuo <jckuo@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/1572859470-7823-1-git-send-email-nkristam@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	Merge tag 'usb-ci-v5.5-rc1' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-next Peter writes: - Improve the Vbus handler - Improve the HSIC handler for i.mx serial SoC - Some other tiny changes * tag 'usb-ci-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb: usb: chipidea: imx: pinctrl for HSIC is optional usb: chipidea: imx: refine the error handling for hsic usb: chipidea: imx: change hsic power regulator as optional usb: chipidea: imx: check data->usbmisc_data against NULL before access usb: chipidea: core: change vbus-regulator as optional usb: chipidea: imx: enable vbus and id wakeup only for OTG events usb: chipidea: udc: protect usb interrupt enable usb: chipidea: udc: add new API ci_hdrc_gadget_connect
2019-11-18	powerpc: cleanup hw_irq.h	Christophe Leroy
	SET_MSR_EE() is just use in this file and doesn't provide any added value compared to mtmsr(). Drop it. Add a wrtee() inline function to use wrtee/wrteei insn. Replace #ifdefs by IS_ENABLED() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a28a20514d5f6df9629c1a117b667e48c4272736.1567068137.git.christophe.leroy@c-s.fr
2019-11-18	powerpc: permanently include 8xx registers in reg.h	Christophe Leroy
	Most 8xx registers have specific names, so just include reg_8xx.h all the time in reg.h in order to have them defined even when CONFIG_PPC_8xx is not selected. This will avoid the need for #ifdefs in C code. Guard SPRN_ICTRL in an #ifdef CONFIG_PPC_8xx as this register has same name but different meaning and different spr number as another register in the mpc7450. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dd82934ad91aab607d0eb7e626c14e6ac0d654eb.1567068137.git.christophe.leroy@c-s.fr
2019-11-18	powerpc/reg: use ASM_FTR_IFSET() instead of opencoding fixup.	Christophe Leroy
	mftb() includes a feature fixup for CELL ppc. Use ASM_FTR_IFSET() macro instead of opencoding the setup of the fixup sections. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ac19713826fa55e9e7bfe3100c5a7b1712ab9526.1566999711.git.christophe.leroy@c-s.fr
2019-11-18	powerpc/32: Don't populate page tables for block mapped pages except on the 8xx.	Christophe Leroy
	Commit d2f15e0979ee ("powerpc/32: always populate page tables for Abatron BDI.") wrongly sets page tables for any PPC32 for using BDI, and does't update them after init (remove RX on init section, set text and rodata read-only) Only the 8xx requires page tables to be populated for using the BDI. They also need to be populated in order to see the mappings in /sys/kernel/debug/kernel_page_tables On BOOK3S_32, pages that are not mapped by page tables are mapped by BATs. The BDI knows BATs and they can be viewed in /sys/kernel/debug/powerpc/block_address_translation Only set pagetables for RAM and IMMR on the 8xx and properly update them at the end of init. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c8610942203e0d93fcb02ad20c57edd3adb4c9d3.1566554029.git.christophe.leroy@c-s.fr
2019-11-18	powerpc/mm: Show if a bad page fault on data is read or write.	Christophe Leroy
	DSISR (or ESR on some CPUs) has a bit to tell if the fault is due to a read or a write. Display it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4f88d7e6fda53b5f80a71040ab400242f6c8cb93.1566400889.git.christophe.leroy@c-s.fr
2019-11-18	powerpc/mm: drop #ifdef CONFIG_MMU in is_ioremap_addr()	Christophe Leroy
	powerpc always selects CONFIG_MMU and CONFIG_MMU is not checked anywhere else in powerpc code. Drop the #ifdef and the alternative part of is_ioremap_addr() Fixes: 9bd3bb6703d8 ("mm/nvdimm: add is_ioremap_addr and use that to check ioremap address") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/de395e444fb8dd7a6365c3314d78e15ebb3d7d1b.1566382245.git.christophe.leroy@c-s.fr
2019-11-18	powerpc: Refactor BUG/WARN macros	Christophe Leroy
	BUG(), WARN() and friends are using a similar inline assembly to implement various traps with various flags. Lets refactor via a new BUG_ENTRY() macro. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c19a82b37677ace0eebb0dc8c2120373c29c8dd1.1566219503.git.christophe.leroy@c-s.fr
2019-11-18	Merge branch 'next' of ↵	Michael Ellerman
	https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Merge changes from Scott: Includes a couple of device tree fixes, a spelling fix, and leftover code cleanup.
2019-11-18	usb-serial: cp201x: support Mark-10 digital force gauge	Greg Kroah-Hartman
	Add support for the Mark-10 digital force gauge device to the cp201x driver. Based on a report and a larger patch from Joel Jennings Reported-by: Joel Jennings <joel.jennings@makeitlabs.com> Cc: stable <stable@vger.kernel.org> Acked-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191118092119.GA153852@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	platform/x86: touchscreen_dmi: Add info for the ezpad 6 m4 tablet	Hans de Goede
	Add touchscreen info for the Jumper EZpad 6 m4 tablet. Reported-and-tested-by: youling257 <youling257@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-11-18	Merge branch 'bpf-array-mmap'	Daniel Borkmann
	Andrii Nakryiko says: ==================== This patch set adds ability to memory-map BPF array maps (single- and multi-element). The primary use case is memory-mapping BPF array maps, created to back global data variables, created by libbpf implicitly. This allows for much better usability, along with avoiding syscalls to read or update data completely. Due to memory-mapping requirements, BPF array map that is supposed to be memory-mapped, has to be created with special BPF_F_MMAPABLE attribute, which triggers slightly different memory allocation strategy internally. See patch 1 for details. Libbpf is extended to detect kernel support for this flag, and if supported, will specify it for all global data maps automatically. Patch #1 refactors bpf_map_inc() and converts bpf_map's refcnt to atomic64_t to make refcounting never fail. Patch #2 does similar refactoring for bpf_prog_add()/bpf_prog_inc(). v5->v6: - add back uref counting (Daniel); v4->v5: - change bpf_prog's refcnt to atomic64_t (Daniel); v3->v4: - add mmap's open() callback to fix refcounting (Johannes); - switch to remap_vmalloc_pages() instead of custom fault handler (Johannes); - converted bpf_map's refcnt/usercnt into atomic64_t; - provide default bpf_map_default_vmops handling open/close properly; v2->v3: - change allocation strategy to avoid extra pointer dereference (Jakub); v1->v2: - fix map lookup code generation for BPF_F_MMAPABLE case; - prevent BPF_F_MMAPABLE flag for all but plain array map type; - centralize ref-counting in generic bpf_map_mmap(); - don't use uref counting (Alexei); - use vfree() directly; - print flags with %x (Song); - extend tests to verify bpf_map_{lookup,update}_elem() logic as well. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-11-18	selftests/bpf: Add BPF_TYPE_MAP_ARRAY mmap() tests	Andrii Nakryiko
	Add selftests validating mmap()-ing BPF array maps: both single-element and multi-element ones. Check that plain bpf_map_update_elem() and bpf_map_lookup_elem() work correctly with memory-mapped array. Also convert CO-RE relocation tests to use memory-mapped views of global data. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191117172806.2195367-6-andriin@fb.com
2019-11-18	libbpf: Make global data internal arrays mmap()-able, if possible	Andrii Nakryiko
	Add detection of BPF_F_MMAPABLE flag support for arrays and add it as an extra flag to internal global data maps, if supported by kernel. This allows users to memory-map global data and use it without BPF map operations, greatly simplifying user experience. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20191117172806.2195367-5-andriin@fb.com
2019-11-18	bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY	Andrii Nakryiko
	Add ability to memory-map contents of BPF array map. This is extremely useful for working with BPF global data from userspace programs. It allows to avoid typical bpf_map_{lookup,update}_elem operations, improving both performance and usability. There had to be special considerations for map freezing, to avoid having writable memory view into a frozen map. To solve this issue, map freezing and mmap-ing is happening under mutex now: - if map is already frozen, no writable mapping is allowed; - if map has writable memory mappings active (accounted in map->writecnt), map freezing will keep failing with -EBUSY; - once number of writable memory mappings drops to zero, map freezing can be performed again. Only non-per-CPU plain arrays are supported right now. Maps with spinlocks can't be memory mapped either. For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc() to be mmap()'able. We also need to make sure that array data memory is page-sized and page-aligned, so we over-allocate memory in such a way that struct bpf_array is at the end of a single page of memory with array->value being aligned with the start of the second page. On deallocation we need to accomodate this memory arrangement to free vmalloc()'ed memory correctly. One important consideration regarding how memory-mapping subsystem functions. Memory-mapping subsystem provides few optional callbacks, among them open() and close(). close() is called for each memory region that is unmapped, so that users can decrease their reference counters and free up resources, if necessary. open() is almost symmetrical: it's called for each memory region that is being mapped, except the very first one. So bpf_map_mmap does initial refcnt bump, while open() will do any extra ones after that. Thus number of close() calls is equal to number of open() calls plus one more. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
2019-11-18	bpf: Convert bpf_prog refcnt to atomic64_t	Andrii Nakryiko
	Similarly to bpf_map's refcnt/usercnt, convert bpf_prog's refcnt to atomic64 and remove artificial 32k limit. This allows to make bpf_prog's refcounting non-failing, simplifying logic of users of bpf_prog_add/bpf_prog_inc. Validated compilation by running allyesconfig kernel build. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191117172806.2195367-3-andriin@fb.com
2019-11-18	bpf: Switch bpf_map ref counter to atomic64_t so bpf_map_inc() never fails	Andrii Nakryiko
	92117d8443bc ("bpf: fix refcnt overflow") turned refcounting of bpf_map into potentially failing operation, when refcount reaches BPF_MAX_REFCNT limit (32k). Due to using 32-bit counter, it's possible in practice to overflow refcounter and make it wrap around to 0, causing erroneous map free, while there are still references to it, causing use-after-free problems. But having a failing refcounting operations are problematic in some cases. One example is mmap() interface. After establishing initial memory-mapping, user is allowed to arbitrarily map/remap/unmap parts of mapped memory, arbitrarily splitting it into multiple non-contiguous regions. All this happening without any control from the users of mmap subsystem. Rather mmap subsystem sends notifications to original creator of memory mapping through open/close callbacks, which are optionally specified during initial memory mapping creation. These callbacks are used to maintain accurate refcount for bpf_map (see next patch in this series). The problem is that open() callback is not supposed to fail, because memory-mapped resource is set up and properly referenced. This is posing a problem for using memory-mapping with BPF maps. One solution to this is to maintain separate refcount for just memory-mappings and do single bpf_map_inc/bpf_map_put when it goes from/to zero, respectively. There are similar use cases in current work on tcp-bpf, necessitating extra counter as well. This seems like a rather unfortunate and ugly solution that doesn't scale well to various new use cases. Another approach to solve this is to use non-failing refcount_t type, which uses 32-bit counter internally, but, once reaching overflow state at UINT_MAX, stays there. This utlimately causes memory leak, but prevents use after free. But given refcounting is not the most performance-critical operation with BPF maps (it's not used from running BPF program code), we can also just switch to 64-bit counter that can't overflow in practice, potentially disadvantaging 32-bit platforms a tiny bit. This simplifies semantics and allows above described scenarios to not worry about failing refcount increment operation. In terms of struct bpf_map size, we are still good and use the same amount of space: BEFORE (3 cache lines, 8 bytes of padding at the end): struct bpf_map { const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / int spin_lock_off; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / u32 btf_key_type_id; / 56 4 / u32 btf_value_type_id; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / struct btf btf; /* 64 8 / struct bpf_map_memory memory; / 72 16 / bool unpriv_array; / 88 1 / bool frozen; / 89 1 / / XXX 38 bytes hole, try to pack / / --- cacheline 2 boundary (128 bytes) --- / atomic_t refcnt __attribute__((__aligned__(64))); / 128 4 / atomic_t usercnt; / 132 4 / struct work_struct work; / 136 32 / char name[16]; / 168 16 / / size: 192, cachelines: 3, members: 21 / / sum members: 146, holes: 1, sum holes: 38 / / padding: 8 / / forced alignments: 2, forced holes: 1, sum forced holes: 38 / } __attribute__((__aligned__(64))); AFTER (same 3 cache lines, no extra padding now): struct bpf_map { const struct bpf_map_ops ops __attribute__((__aligned__(64))); /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / int spin_lock_off; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / u32 btf_key_type_id; / 56 4 / u32 btf_value_type_id; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / struct btf btf; /* 64 8 / struct bpf_map_memory memory; / 72 16 / bool unpriv_array; / 88 1 / bool frozen; / 89 1 / / XXX 38 bytes hole, try to pack / / --- cacheline 2 boundary (128 bytes) --- / atomic64_t refcnt __attribute__((__aligned__(64))); / 128 8 / atomic64_t usercnt; / 136 8 / struct work_struct work; / 144 32 / char name[16]; / 176 16 / / size: 192, cachelines: 3, members: 21 / / sum members: 154, holes: 1, sum holes: 38 / / forced alignments: 2, forced holes: 1, sum forced holes: 38 */ } __attribute__((__aligned__(64))); This patch, while modifying all users of bpf_map_inc, also cleans up its interface to match bpf_map_put with separate operations for bpf_map_inc and bpf_map_inc_with_uref (to match bpf_map_put and bpf_map_put_with_uref, respectively). Also, given there are no users of bpf_map_inc_not_zero specifying uref=true, remove uref flag and default to uref=false internally. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191117172806.2195367-2-andriin@fb.com
2019-11-18	x86: Remove unused asm/rio.h	Thomas Gleixner
	The removed calgary IOMMU driver was the only user of this header file. Reported-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de>
2019-11-18	usb: chipidea: imx: pinctrl for HSIC is optional	Peter Chen
	For imx chipidea controllers, if they use mxs PHY, they need pinctrl for HSIC. Otherwise, it doesn't need pinctrl and usbmisc control. Like imx7d and imx8mm. Reported-by: André Draszik <git@andred.net> Signed-off-by: Peter Chen <peter.chen@nxp.com>
2019-11-18	HID: rmi: Check that the RMI_STARTED bit is set before unregistering the RMI ↵	Andrew Duggan
	transport device In the event that the RMI device is unreachable, the calls to rmi_set_mode() or rmi_set_page() will fail before registering the RMI transport device. When the device is removed, rmi_remove() will call rmi_unregister_transport_device() which will attempt to access the rmi_dev pointer which was not set. This patch adds a check of the RMI_STARTED bit before calling rmi_unregister_transport_device(). The RMI_STARTED bit is only set after rmi_register_transport_device() completes successfully. The kernel oops was reported in this message: https://www.spinics.net/lists/linux-input/msg58433.html [jkosina@suse.cz: reworded changelog as agreed with Andrew] Signed-off-by: Andrew Duggan <aduggan@synaptics.com> Reported-by: Federico Cerutti <federico@ceres-c.it> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2019-11-18	m68k/atari: Convert Falcon IDE drivers to platform drivers	Michael Schmitz
	Autoloading of Falcon IDE driver modules requires converting these drivers to platform drivers. Add platform device for Falcon IDE interface in Atari platform setup code. Use this in the pata_falcon driver in place of the simple platform device set up on the fly. Convert falconide driver to use the same platform device that is used by pata_falcon also. (With the introduction of a platform device for the Atari Falcon IDE interface, the old Falcon IDE driver no longer loads (resource already claimed by the platform device)). Tested (as built-in driver) on my Atari Falcon. Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://lore.kernel.org/r/1573008449-8226-1-git-send-email-schmitzmic@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-11-18	HID: quirks: remove hid-led devices from hid_have_special_driver	Heiner Kallweit
	Since e04a0442d33b ("HID: core: remove the absolute need of hid_have_special_driver[]") it's no longer needed to list these LED devices in hid_have_special_driver[]. This allows libraries needing access to the hidraw device to work properly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2019-11-18	mmc: core: Fix size overflow for mmc partitions	Bradley Bolen
	With large eMMC cards, it is possible to create general purpose partitions that are bigger than 4GB. The size member of the mmc_part struct is only an unsigned int which overflows for gp partitions larger than 4GB. Change this to a u64 to handle the overflow. Signed-off-by: Bradley Bolen <bradleybolen@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>