linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2020-01-23	net/sonic: Fix receive buffer handling	Finn Thain
	The SONIC can sometimes advance its rx buffer pointer (RRP register) without advancing its rx descriptor pointer (CRDA register). As a result the index of the current rx descriptor may not equal that of the current rx buffer. The driver mistakenly assumes that they are always equal. This assumption leads to incorrect packet lengths and possible packet duplication. Avoid this by calling a new function to locate the buffer corresponding to a given descriptor. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net/sonic: Fix interface error stats collection	Finn Thain
	The tx_aborted_errors statistic should count packets flagged with EXD, EXC, FU, or BCM bits because those bits denote an aborted transmission. That corresponds to the bitmask 0x0446, not 0x0642. Use macros for these constants to avoid mistakes. Better to leave out FIFO Underruns (FU) as there's a separate counter for that purpose. Don't lump all these errors in with the general tx_errors counter as that's used for tx timeout events. On the rx side, don't count RDE and RBAE interrupts as dropped packets. These interrupts don't indicate a lost packet, just a lack of resources. When a lack of resources results in a lost packet, this gets reported in the rx_missed_errors counter (along with RFO events). Don't double-count rx_frame_errors and rx_crc_errors. Don't use the general rx_errors counter for events that already have special counters. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net/sonic: Use MMIO accessors	Finn Thain
	The driver accesses descriptor memory which is simultaneously accessed by the chip, so the compiler must not be allowed to re-order CPU accesses. sonic_buf_get() used 'volatile' to prevent that. sonic_buf_put() should have done so too but was overlooked. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net/sonic: Clear interrupt flags immediately	Finn Thain
	The chip can change a packet's descriptor status flags at any time. However, an active interrupt flag gets cleared rather late. This allows a race condition that could theoretically lose an interrupt. Fix this by clearing asserted interrupt flags immediately. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net/sonic: Add mutual exclusion for accessing shared state	Finn Thain
	The netif_stop_queue() call in sonic_send_packet() races with the netif_wake_queue() call in sonic_interrupt(). This causes issues like "NETDEV WATCHDOG: eth0 (macsonic): transmit queue 0 timed out". Fix this by disabling interrupts when accessing tx_skb[] and next_tx. Update a comment to clarify the synchronization properties. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	Merge branch 'Add-PHY-IDs-for-DP83825-6'	David S. Miller
	Dan Murphy says: ==================== Add PHY IDs for DP83825/6 Adding new PHY IDs for the DP83825 and DP83826 TI Ethernet PHYs to the DP83822 PHY driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net: phy: DP83822: Add support for additional DP83825 devices	Dan Murphy
	Add PHY IDs for the DP83825CS, DP83825CM and the DP83825S devices to the DP83822 driver. Signed-off-by: Dan Murphy <dmurphy@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	phy: dp83826: Add phy IDs for DP83826N and 826NC	Dan Murphy
	Add phy IDs to the DP83822 phy driver for the DP83826N and the DP83826NC devices. The register map and features are the same for basic enablement. Signed-off-by: Dan Murphy <dmurphy@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net: fsl/fman: rename IF_MODE_XGMII to IF_MODE_10G	Madalin Bucur
	As the only 10G PHY interface type defined at the moment the code was developed was XGMII, although the PHY interface mode used was not XGMII, XGMII was used in the code to denote 10G. This patch renames the 10G interface mode to remove the ambiguity. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	Merge branch 'net-fsl-fman-address-erratum-A011043'	David S. Miller
	Madalin Bucur says: ==================== net: fsl/fman: address erratum A011043 This addresses a HW erratum on some QorIQ DPAA devices. MDIO reads to internal PCS registers may result in having the MDIO_CFG[MDIO_RD_ER] bit set, even when there is no error and read data (MDIO_DATA[MDIO_DATA]) is correct. Software may get false read error when reading internal PCS registers through MDIO. As a workaround, all internal MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit. When the issue was present, one could see such errors during boot: mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	net/fsl: treat fsl,erratum-a011043	Madalin Bucur
	When fsl,erratum-a011043 is set, adjust for erratum A011043: MDIO reads to internal PCS registers may result in having the MDIO_CFG[MDIO_RD_ER] bit set, even when there is no error and read data (MDIO_DATA[MDIO_DATA]) is correct. Software may get false read error when reading internal PCS registers through MDIO. As a workaround, all internal MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	powerpc/fsl/dts: add fsl,erratum-a011043	Madalin Bucur
	Add fsl,erratum-a011043 to internal MDIO buses. Software may get false read error when reading internal PCS registers through MDIO. As a workaround, all internal MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	dt-bindings: net: add fsl,erratum-a011043	Madalin Bucur
	Add an entry for erratum A011043: the MDIO_CFG[MDIO_RD_ER] bit may be falsely set when reading internal PCS registers. MDIO reads to internal PCS registers may result in having the MDIO_CFG[MDIO_RD_ER] bit set, even when there is no error and read data (MDIO_DATA[MDIO_DATA]) is correct. Software may get false read error when reading internal PCS registers through MDIO. As a workaround, all internal MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	qlcnic: Fix CPU soft lockup while collecting firmware dump	Manish Chopra
	Driver while collecting firmware dump takes longer time to collect/process some of the firmware dump entries/memories. Bigger capture masks makes it worse as it results in larger amount of data being collected and results in CPU soft lockup. Place cond_resched() in some of the driver flows that are expectedly time consuming to relinquish the CPU to avoid CPU soft lockup panic. Signed-off-by: Shahed Shaikh <shshaikh@marvell.com> Tested-by: Yonggen Xu <Yonggen.Xu@dell.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23	Merge tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-dax	Linus Torvalds
	Pull XArray fixes from Matthew Wilcox: "Primarily bugfixes, mostly around handling index wrap-around correctly. A couple of doc fixes and adding missing APIs. I had an oops live on stage at linux.conf.au this year, and it turned out to be a bug in xas_find() which I can't prove isn't triggerable in the current codebase. Then in looking for the bug, I spotted two more bugs. The bots have had a few days to chew on this with no problems reported, and it passes the test-suite (which now has more tests to make sure these problems don't come back)" * tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-dax: XArray: Add xa_for_each_range XArray: Fix xas_find returning too many entries XArray: Fix xa_find_after with multi-index entries XArray: Fix infinite loop with entry at ULONG_MAX XArray: Add wrappers for nested spinlocks XArray: Improve documentation of search marks XArray: Fix xas_pause at ULONG_MAX
2020-01-23	Merge tag 'trace-v5.5-rc6-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various tracing fixes: - Fix a function comparison warning for a xen trace event macro - Fix a double perf_event linking to a trace_uprobe_filter for multiple events - Fix suspicious RCU warnings in trace event code for using list_for_each_entry_rcu() when the "_rcu" portion wasn't needed. - Fix a bug in the histogram code when using the same variable - Fix a NULL pointer dereference when tracefs lockdown enabled and calling trace_set_default_clock() - A fix to a bug found with the double perf_event linking patch" * tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/uprobe: Fix to make trace_uprobe_filter alignment safe tracing: Do not set trace clock if tracefs lockdown is in effect tracing: Fix histogram code when expression has same var as value tracing: trigger: Replace unneeded RCU-list traversals tracing/uprobe: Fix double perf_event linking on multiprobe uprobe tracing: xen: Ordered comparison of function pointers
2020-01-23	Merge tag 'ceph-for-5.5-rc8' of https://github.com/ceph/ceph-client	Linus Torvalds
	Pull ceph fix from Ilya Dryomov: "A fix for a potential use-after-free from Jeff, marked for stable" * tag 'ceph-for-5.5-rc8' of https://github.com/ceph/ceph-client: ceph: hold extra reference to r_parent over life of request
2020-01-23	Merge tag 'pm-5.5-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Prevent the kernel from crashing during resume from hibernation if free pages contain leftover data from the restore kernel and init_on_free is set (Alexander Potapenko)" * tag 'pm-5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: fix crashes with init_on_free=1
2020-01-23	Merge tag 'pci-v5.5-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Mark ATS as broken on AMD Navi14 GPU rev 0xc5 (Alex Deucher)" * tag 'pci-v5.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken
2020-01-23	char: hpet: Use flexible-array member	Gustavo A. R. Silva
	Old code in the kernel uses 1-byte and 0-byte arrays to indicate the presence of a "variable length array": struct something { int length; u8 data[1]; }; struct something instance; instance = kmalloc(sizeof(instance) + size, GFP_KERNEL); instance->length = size; memcpy(instance->data, source, size); There is also 0-byte arrays. Both cases pose confusion for things like sizeof(), CONFIG_FORTIFY_SOURCE, etc.[1] Instead, the preferred mechanism to declare variable-length types such as the one above is a flexible array member[2] which need to be the last member of a structure and empty-sized: struct something { int stuff; u8 data[]; }; Also, by making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being unadvertenly introduced[3] to the codebase from now on. [1] https://github.com/KSPP/linux/issues/21 [2] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200120235326.GA29231@embeddedor.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	partitions/ldm: fix spelling mistake "to" -> "too"	Colin Ian King
	There is a spelling mistake in a ldm_error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	x86/mpx: remove MPX from arch/x86	Dave Hansen
	From: Dave Hansen <dave.hansen@linux.intel.com> MPX is being removed from the kernel due to a lack of support in the toolchain going forward (gcc). This removes all the remaining (dead at this point) MPX handling code remaining in the tree. The only remaining code is the XSAVE support for MPX state which is currently needd for KVM to handle VMs which might use MPX. Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2020-01-23	mm: remove arch_bprm_mm_init() hook	Dave Hansen
	From: Dave Hansen <dave.hansen@linux.intel.com> MPX is being removed from the kernel due to a lack of support in the toolchain going forward (gcc). arch_bprm_mm_init() is used at execve() time. The only non-stub implementation is on x86 for MPX. Remove the hook entirely from all architectures and generic code. Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2020-01-23	x86/mpx: remove bounds exception code	Dave Hansen
	From: Dave Hansen <dave.hansen@linux.intel.com> MPX is being removed from the kernel due to a lack of support in the toolchain going forward (gcc). Remove the other user-visible ABI: signal handling. This code should basically have been inactive after the prctl()s were removed, but there may be some small ABI remnants from this code. Remove it. Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2020-01-23	x86/mpx: remove build infrastructure	Dave Hansen
	From: Dave Hansen <dave.hansen@linux.intel.com> MPX is being removed from the kernel due to a lack of support in the toolchain going forward (gcc). Remove the Kconfig option and the Makefile line. This makes arch/x86/mm/mpx.c and anything under an #ifdef for X86_INTEL_MPX dead code. Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2020-01-23	x86/alternatives: add missing insn.h include	Dave Hansen
	From: Dave Hansen <dave.hansen@linux.intel.com> While testing my MPX removal series, Borislav noted compilation failure with an allnoconfig build. Turned out to be a missing include of insn.h in alternative.c. With MPX, it got it implicitly from: asm/mmu_context.h -> asm/mpx.h -> asm/insn.h Fixes: c3d6324f841b ("x86/alternatives: Teach text_poke_bp() to emulate instructions") Reported-by: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2020-01-23	bcache: reap from tail of c->btree_cache in bch_mca_scan()	Coly Li
	When shrink btree node cache from c->btree_cache in bch_mca_scan(), no matter the selected node is reaped or not, it will be rotated from the head to the tail of c->btree_cache list. But in bcache journal code, when flushing the btree nodes with oldest journal entry, btree nodes are iterated and slected from the tail of c->btree_cache list in btree_flush_write(). The list_rotate_left() in bch_mca_scan() will make btree_flush_write() iterate more nodes in c->btree_list in reverse order. This patch just reaps the selected btree node cache, and not move it from the head to the tail of c->btree_cache list. Then bch_mca_scan() will not mess up c->btree_cache list to btree_flush_write(). Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: reap c->btree_cache_freeable from the tail in bch_mca_scan()	Coly Li
	In order to skip the most recently freed btree node cahce, currently in bch_mca_scan() the first 3 caches in c->btree_cache_freeable list are skipped when shrinking bcache node caches in bch_mca_scan(). The related code in bch_mca_scan() is, 737 list_for_each_entry_safe(b, t, &c->btree_cache_freeable, list) { 738 if (nr <= 0) 739 goto out; 740 741 if (++i > 3 && 742 !mca_reap(b, 0, false)) { lines free cache memory 746 } 747 nr--; 748 } The problem is, if virtual memory code calls bch_mca_scan() and the calculated 'nr' is 1 or 2, then in the above loop, nothing will be shunk. In such case, if slub/slab manager calls bch_mca_scan() for many times with small scan number, it does not help to shrink cache memory and just wasts CPU cycles. This patch just selects btree node caches from tail of the c->btree_cache_freeable list, then the newly freed host cache can still be allocated by mca_alloc(), and at least 1 node can be shunk. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: remove member accessed from struct btree	Coly Li
	The member 'accessed' of struct btree is used in bch_mca_scan() when shrinking btree node caches. The original idea is, if b->accessed is set, clean it and look at next btree node cache from c->btree_cache list, and only shrink the caches whose b->accessed is cleaned. Then only cold btree node cache will be shrunk. But when I/O pressure is high, it is very probably that b->accessed of a btree node cache will be set again in bch_btree_node_get() before bch_mca_scan() selects it again. Then there is no chance for bch_mca_scan() to shrink enough memory back to slub or slab system. This patch removes member accessed from struct btree, then once a btree node ache is selected, it will be immediately shunk. By this change, bch_mca_scan() may release btree node cahce more efficiently. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: print written and keys in trace_bcache_btree_write	Guoju Fang
	It's useful to dump written block and keys on btree write, this patch add them into trace_bcache_btree_write. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: avoid unnecessary btree nodes flushing in btree_flush_write()	Coly Li
	the commit 91be66e1318f ("bcache: performance improvement for btree_flush_write()") was an effort to flushing btree node with oldest btree node faster in following methods, - Only iterate dirty btree nodes in c->btree_cache, avoid scanning a lot of clean btree nodes. - Take c->btree_cache as a LRU-like list, aggressively flushing all dirty nodes from tail of c->btree_cache util the btree node with oldest journal entry is flushed. This is to reduce the time of holding c->bucket_lock. Guoju Fang and Shuang Li reported that they observe unexptected extra write I/Os on cache device after applying the above patch. Guoju Fang provideed more detailed diagnose information that the aggressive btree nodes flushing may cause 10x more btree nodes to flush in his workload. He points out when system memory is large enough to hold all btree nodes in memory, c->btree_cache is not a LRU-like list any more. Then the btree node with oldest journal entry is very probably not- close to the tail of c->btree_cache list. In such situation much more dirty btree nodes will be aggressively flushed before the target node is flushed. When slow SATA SSD is used as cache device, such over- aggressive flushing behavior will cause performance regression. After spending a lot of time on debug and diagnose, I find the real condition is more complicated, aggressive flushing dirty btree nodes from tail of c->btree_cache list is not a good solution. - When all btree nodes are cached in memory, c->btree_cache is not a LRU-like list, the btree nodes with oldest journal entry won't be close to the tail of the list. - There can be hundreds dirty btree nodes reference the oldest journal entry, before flushing all the nodes the oldest journal entry cannot be reclaimed. When the above two conditions mixed together, a simply flushing from tail of c->btree_cache list is really NOT a good idea. Fortunately there is still chance to make btree_flush_write() work better. Here is how this patch avoids unnecessary btree nodes flushing, - Only acquire c->journal.lock when getting oldest journal entry of fifo c->journal.pin. In rested locations check the journal entries locklessly, so their values can be changed on other cores in parallel. - In loop list_for_each_entry_safe_reverse(), checking latest front point of fifo c->journal.pin. If it is different from the original point which we get with locking c->journal.lock, it means the oldest journal entry is reclaim on other cores. At this moment, all selected dirty nodes recorded in array btree_nodes[] are all flushed and clean on other CPU cores, it is unncessary to iterate c->btree_cache any longer. Just quit the list_for_each_entry_safe_reverse() loop and the following for-loop will skip all the selected clean nodes. - Find a proper time to quit the list_for_each_entry_safe_reverse() loop. Check the refcount value of orignial fifo front point, if the value is larger than selected node number of btree_nodes[], it means more matching btree nodes should be scanned. Otherwise it means no more matching btee nodes in rest of c->btree_cache list, the loop can be quit. If the original oldest journal entry is reclaimed and fifo front point is updated, the refcount of original fifo front point will be 0, then the loop will be quit too. - Not hold c->bucket_lock too long time. c->bucket_lock is also required for space allocation for cached data, hold it for too long time will block regular I/O requests. When iterating list c->btree_cache, even there are a lot of maching btree nodes, in order to not holding c->bucket_lock for too long time, only BTREE_FLUSH_NR nodes are selected and to flush in following for-loop. With this patch, only btree nodes referencing oldest journal entry are flushed to cache device, no aggressive flushing for unnecessary btree node any more. And in order to avoid blocking regluar I/O requests, each time when btree_flush_write() called, at most only BTREE_FLUSH_NR btree nodes are selected to flush, even there are more maching btree nodes in list c->btree_cache. At last, one more thing to explain: Why it is safe to read front point of c->journal.pin without holding c->journal.lock inside the list_for_each_entry_safe_reverse() loop ? Here is my answer: When reading the front point of fifo c->journal.pin, we don't need to know the exact value of front point, we just want to check whether the value is different from the original front point (which is accurate value because we get it while c->jouranl.lock is held). For such purpose, it works as expected without holding c->journal.lock. Even the front point is changed on other CPU core and not updated to local core, and current iterating btree node has identical journal entry local as original fetched fifo front point, it is still safe. Because after holding mutex b->write_lock (with memory barrier) this btree node can be found as clean and skipped, the loop will quite latter when iterate on next node of list c->btree_cache. Fixes: 91be66e1318f ("bcache: performance improvement for btree_flush_write()") Reported-by: Guoju Fang <fangguoju@gmail.com> Reported-by: Shuang Li <psymon@bonuscloud.io> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: add code comments for state->pool in __btree_sort()	Coly Li
	To explain the pages allocated from mempool state->pool can be swapped in __btree_sort(), because state->pool is a page pool, which allocates pages by alloc_pages() indeed. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	lib: crc64: include <linux/crc64.h> for 'crc64_be'	Ben Dooks (Codethink)
	The crc64_be() is declared in <linux/crc64.h> so include this where the symbol is defined to avoid the following warning: lib/crc64.c:43:12: warning: symbol 'crc64_be' was not declared. Should it be static? Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: use read_cache_page_gfp to read the superblock	Christoph Hellwig
	Avoid a pointless dependency on buffer heads in bcache by simply open coding reading a single page. Also add a SB_OFFSET define for the byte offset of the superblock instead of using magic numbers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: store a pointer to the on-disk sb in the cache and cached_dev structures	Christoph Hellwig
	This allows to properly build the superblock bio including the offset in the page using the normal bio helpers. This fixes writing the superblock for page sizes larger than 4k where the sb write bio would need an offset in the bio_vec. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: return a pointer to the on-disk sb from read_super	Christoph Hellwig
	Returning the properly typed actual data structure insteaf of the containing struct page will save the callers some work going forward. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: transfer the sb_page reference to register_{bdev,cache}	Christoph Hellwig
	Avoid an extra reference count roundtrip by transferring the sb_page ownership to the lower level register helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: fix use-after-free in register_bcache()	Coly Li
	The patch "bcache: rework error unwinding in register_bcache" introduces a use-after-free regression in register_bcache(). Here are current code, 2510 out_free_path: 2511 kfree(path); 2512 out_module_put: 2513 module_put(THIS_MODULE); 2514 out: 2515 pr_info("error %s: %s", path, err); 2516 return ret; If some error happens and the above code path is executed, at line 2511 path is released, but referenced at line 2515. Then KASAN reports a use- after-free error message. This patch changes line 2515 in the following way to fix the problem, 2515 pr_info("error %s: %s", path?path:"", err); Signed-off-by: Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: properly initialize 'path' and 'err' in register_bcache()	Coly Li
	Patch "bcache: rework error unwinding in register_bcache" from Christoph Hellwig changes the local variables 'path' and 'err' in undefined initial state. If the code in register_bcache() jumps to label 'out:' or 'out_module_put:' by goto, these two variables might be reference with undefined value by the following line, out_module_put: module_put(THIS_MODULE); out: pr_info("error %s: %s", path, err); return ret; Therefore this patch initializes these two local variables properly in register_bcache() to avoid such issue. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: rework error unwinding in register_bcache	Christoph Hellwig
	Split the successful and error return path, and use one goto label for each resource to unwind. This also fixes some small errors like leaking the module reference count in the reboot case (which seems entirely harmless) or printing the wrong warning messages for early failures. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: use a separate data structure for the on-disk super block	Christoph Hellwig
	Split out an on-disk version struct cache_sb with the proper endianness annotations. This fixes a fair chunk of sparse warnings, but there are some left due to the way the checksum is defined. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	bcache: cached_dev_free needs to put the sb page	Liang Chen
	Same as cache device, the buffer page needs to be put while freeing cached_dev. Otherwise a page would be leaked every time a cached_dev is stopped. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-23	tty: n_hdlc: Use flexible-array member and struct_size() helper	Gustavo A. R. Silva
	Old code in the kernel uses 1-byte and 0-byte arrays to indicate the presence of a "variable length array": struct something { int length; u8 data[1]; }; struct something instance; instance = kmalloc(sizeof(instance) + size, GFP_KERNEL); instance->length = size; memcpy(instance->data, source, size); There is also 0-byte arrays. Both cases pose confusion for things like sizeof(), CONFIG_FORTIFY_SOURCE, etc.[1] Instead, the preferred mechanism to declare variable-length types such as the one above is a flexible array member[2] which need to be the last member of a structure and empty-sized: struct something { int stuff; u8 data[]; }; Also, by making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertenly introduced[3] to the codebase from now on. Lastly, make use of the struct_size() helper to safely calculate the allocation size for instances of struct n_hdlc_buf and avoid any potential type mistakes[4][5]. [1] https://github.com/KSPP/linux/issues/21 [2] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") [4] https://lore.kernel.org/lkml/60e14fb7-8596-e21c-f4be-546ce39e7bdb@embeddedor.com/ [5] commit 553d66cb1e86 ("iommu/vt-d: Use struct_size() helper") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20200121172138.GA3162@embeddedor Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	usb: phy: phy-gpio-vbus-usb: Convert to GPIO descriptors	Linus Walleij
	Instead of using the legacy GPIO API and keeping track on polarity inversion semantics in the driver, switch to use GPIO descriptors for this driver and change all consumers in the process. This makes it possible to retire platform data completely: the only remaining platform data member was "wakeup" which was intended to make the vbus interrupt wakeup capable, but was not set by any users and thus remained unused. VBUS was not waking any devices up. Leave a comment about it so later developers using the platform can consider setting it to always enabled so plugging in USB wakes up the platform. Cc: Daniel Mack <daniel@zonque.org> Cc: Haojian Zhuang <haojian.zhuang@gmail.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Acked-by: Felipe Balbi <balbi@kernel.org> Acked-by: Sylwester Nawrocki <snawrocki@kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20200123155013.93249-1-linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	staging: comedi: drivers: fix spelling mistake "to" -> "too"	Colin Ian King
	There is a spelling mistake in a deb_dbg message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200123010344.2834618-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	staging: exfat: remove fs_func struct.	Tetsuhiro Kohada
	Remove 'fs_func struct' and change indirect calls to direct calls. The following issues are described in exfat's TODO. > Create helper function for exfat_set_entry_time () and > exfat_set_entry_type () because it's sort of ugly to be calling the same functionn directly and other code calling through the fs_func struc ponters ... The fs_func struct was used for switching the helper functions of fat16/fat32/exfat. Now, it has lost the role of switching, just making the code less readable. Signed-off-by: Tetsuhiro Kohada <Kohada.Tetsuhiro@dc.MitsubishiElectric.co.jp> Link: https://lore.kernel.org/r/20200123102445.123033-1-Kohada.Tetsuhiro@dc.MitsubishiElectric.co.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	staging: wilc1000: avoid mutex unlock without lock in wilc_wlan_handle_txq()	Ajay Singh
	In wilc_wlan_handle_txq(), mutex unlock was called without acquiring it. Also error code for full VMM condition was incorrect as discussed in [1]. Now used a proper code to indicate VMM is full, for which transfer to VMM is required again. 'wilc_wlan_handle_txq()' should be called again if the VMM space was full earlier or otherwise based on 'txq_event' signal. 1. https://lore.kernel.org/driverdev-devel/20191113183322.a54mh2w6dulklgsd@kili.mountain/ Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Link: https://lore.kernel.org/r/20200123182129.4053-2-ajay.kathat@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	staging: wilc1000: return zero on success and non-zero on function failure	Ajay Singh
	Some of the HIF layer API's return zero for failure and non-zero for success condition. Now, modified the functions to return zero for success and non-zero for failure as its recommended approach suggested in [1]. 1. https://lore.kernel.org/driverdev-devel/20191113183322.a54mh2w6dulklgsd@kili.mountain/ Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Link: https://lore.kernel.org/r/20200123182129.4053-1-ajay.kathat@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	readdir: make user_access_begin() use the real access range	Linus Torvalds
	In commit 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()") I changed filldir to not do individual __put_user() accesses, but instead use unsafe_put_user() surrounded by the proper user_access_begin/end() pair. That make them enormously faster on modern x86, where the STAC/CLAC games make individual user accesses fairly heavy-weight. However, the user_access_begin() range was not really the exact right one, since filldir() has the unfortunate problem that it needs to not only fill out the new directory entry, it also needs to fix up the previous one to contain the proper file offset. It's unfortunate, but the "d_off" field in "struct dirent" is _not_ the file offset of the directory entry itself - it's the offset of the next one. So we end up backfilling the offset in the previous entry as we walk along. But since x86 didn't really care about the exact range, and used to be the only architecture that did anything fancy in user_access_begin() to begin with, the filldir[64]() changes did something lazy, and even commented on it: /* * Note! This range-checks 'previous' (which may be NULL). * The real range was checked in getdents / if (!user_access_begin(dirent, sizeof(dirent))) goto efault; and it all worked fine. But now 32-bit ppc is starting to also implement user_access_begin(), and the fact that we faked the range to only be the (possibly not even valid) previous directory entry becomes a problem, because ppc32 will actually be using the range that is passed in for more than just "check that it's user space". This is a complete rewrite of Christophe's original patch. By saving off the record length of the previous entry instead of a pointer to it in the filldir data structures, we can simplify the range check and the writing of the previous entry d_off field. No need for any conditionals in the user accesses themselves, although we retain the conditional EINTR checking for the "was this the first directory entry" signal handling latency logic. Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()") Link: https://lore.kernel.org/lkml/a02d3426f93f7eb04960a4d9140902d278cab0bb.1579697910.git.christophe.leroy@c-s.fr/ Link: https://lore.kernel.org/lkml/408c90c4068b00ea8f1c41cca45b84ec23d4946b.1579783936.git.christophe.leroy@c-s.fr/ Reported-and-tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-23	readdir: be more conservative with directory entry names	Linus Torvalds
	Commit 8a23eb804ca4 ("Make filldir[64]() verify the directory entry filename is valid") added some minimal validity checks on the directory entries passed to filldir[64](). But they really were pretty minimal. This fleshes out at least the name length check: we used to disallow zero-length names, but really, negative lengths or oevr-long names aren't ok either. Both could happen if there is some filesystem corruption going on. Now, most filesystems tend to use just an "unsigned char" or similar for the length of a directory entry name, so even with a corrupt filesystem you should never see anything odd like that. But since we then use the name length to create the directory entry record length, let's make sure it actually is half-way sensible. Note how POSIX states that the size of a path component is limited by NAME_MAX, but we actually use PATH_MAX for the check here. That's because while NAME_MAX is generally the correct maximum name length (it's 255, for the same old "name length is usually just a byte on disk"), there's nothing in the VFS layer that really cares. So the real limitation at a VFS layer is the total pathname length you can pass as a filename: PATH_MAX. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>