linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-11-03	Update MIPS email addresses	Paul Burton
	MIPS will soon not be a part of Imagination Technologies, and as such many @imgtec.com email addresses will no longer be valid. This patch updates the addresses for those who: - Have 10 or more patches in mainline authored using an @imgtec.com email address, or any patches dated within the past year. - Are still with Imagination but leaving as part of the MIPS business unit, as determined from an internal email address list. - Haven't already updated their email address (ie. JamesH) or expressed a desire to be excluded (ie. Maciej). - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt & myself. New addresses are of the form firstname.lastname@mips.com, and all verified against an internal email address list. An entry is added to .mailmap for each person such that get_maintainer.pl will report the new addresses rather than @imgtec.com addresses which will soon be dead. Instances of the affected addresses throughout the tree are then mechanically replaced with the new @mips.com address. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com> Acked-by: Dengcheng Zhu <dengcheng.zhu@mips.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Matt Redfearn <matt.redfearn@mips.com> Acked-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: trivial@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo	Rafael J. Wysocki
	Commit 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes made after the commit it has reverted. To address this, make the code in question use arch_freq_get_on_cpu() which also is used by cpufreq for reporting the current frequency of CPUs and since that function doesn't really depend on cpufreq in any way, drop the CONFIG_CPU_FREQ dependency for the object file containing it. Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and return cached values right away if it is called very often over a short time (to prevent user space from triggering IPI storms through it). Fixes: 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") Cc: stable@kernel.org # 4.13 - together with 890da9cf0983 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	mm, swap: fix race between swap count continuation operations	Huang Ying
	One page may store a set of entries of the sis->swap_map (swap_info_struct->swap_map) in multiple swap clusters. If some of the entries has sis->swap_map[offset] > SWAP_MAP_MAX, multiple pages will be used to store the set of entries of the sis->swap_map. And the pages are linked with page->lru. This is called swap count continuation. To access the pages which store the set of entries of the sis->swap_map simultaneously, previously, sis->lock is used. But to improve the scalability of __swap_duplicate(), swap cluster lock may be used in swap_count_continued() now. This may race with add_swap_count_continuation() which operates on a nearby swap cluster, in which the sis->swap_map entries are stored in the same page. The race can cause wrong swap count in practice, thus cause unfreeable swap entries or software lockup, etc. To fix the race, a new spin lock called cont_lock is added to struct swap_info_struct to protect the swap count continuation page list. This is a lock at the swap device level, so the scalability isn't very well. But it is still much better than the original sis->lock, because it is only acquired/released when swap count continuation is used. Which is considered rare in practice. If it turns out that the scalability becomes an issue for some workloads, we can split the lock into some more fine grained locks. Link: http://lkml.kernel.org/r/20171017081320.28133-1-ying.huang@intel.com Fixes: 235b62176712 ("mm/swap: add cluster lock") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	mm/huge_memory.c: deposit page table when copying a PMD migration entry	Zi Yan
	We need to deposit pre-allocated PTE page table when a PMD migration entry is copied in copy_huge_pmd(). Otherwise, we will leak the pre-allocated page and cause a NULL pointer dereference later in zap_huge_pmd(). The missing counters during PMD migration entry copy process are added as well. The bug report is here: https://lkml.org/lkml/2017/10/29/214 Link: http://lkml.kernel.org/r/20171030144636.4836-1-zi.yan@sent.com Fixes: 84c3fc4e9c563 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	initramfs: fix initramfs rebuilds w/ compression after disabling	Florian Fainelli
	This is a follow-up to commit 57ddfdaa9a72 ("initramfs: fix disabling of initramfs (and its compression)"). This particular commit fixed the use case where we build the kernel with an initramfs with no compression, and then we build the kernel with no initramfs. Now this still left us with the same case as described here: http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com not working with initramfs compression. This can be seen by the following steps/timestamps: https://www.spinics.net/lists/kernel/msg2598153.html .initramfs_data.cpio.gz.cmd is correct: cmd_usr/initramfs_data.cpio.gz := /bin/bash ./scripts/gen_initramfs_list.sh -o usr/initramfs_data.cpio.gz -u 1000 -g 1000 /home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev and was generated the first time we did generate the gzip initramfs, so the command has not changed, nor its arguments, so we just don't call it, no initramfs cpio is re-generated as a consequence. The fix for this problem is just to properly keep track of the .initramfs_cpio_data.d file by suffixing it with the compression extension. This takes care of properly tracking dependencies such that the initramfs get (re)generated any time files are added/deleted etc. Link: http://lkml.kernel.org/r/20170930033936.6722-1-f.fainelli@gmail.com Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded initramfs compression algorithm") Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: "Francisco Blas Izquierdo Riera (klondike)" <klondike@xiscosoft.net> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	fs/hugetlbfs/inode.c: fix hwpoison reserve accounting	Mike Kravetz
	Calling madvise(MADV_HWPOISON) on a hugetlbfs page will result in bad (negative) reserved huge page counts. This may not happen immediately, but may happen later when the underlying file is removed or filesystem unmounted. For example: AnonHugePages: 0 kB ShmemHugePages: 0 kB HugePages_Total: 1 HugePages_Free: 0 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Hugepagesize: 2048 kB In routine hugetlbfs_error_remove_page(), hugetlb_fix_reserve_counts is called after remove_huge_page. hugetlb_fix_reserve_counts is designed to only be called/used only if a failure is returned from hugetlb_unreserve_pages. Therefore, call hugetlb_unreserve_pages as required and only call hugetlb_fix_reserve_counts in the unlikely event that hugetlb_unreserve_pages returns an error. Link: http://lkml.kernel.org/r/20171019230007.17043-2-mike.kravetz@oracle.com Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	ocfs2: fstrim: Fix start offset of first cluster group during fstrim	Ashish Samant
	The first cluster group descriptor is not stored at the start of the group but at an offset from the start. We need to take this into account while doing fstrim on the first cluster group. Otherwise we will wrongly start fstrim a few blocks after the desired start block and the range can cross over into the next cluster group and zero out the group descriptor there. This can cause filesytem corruption that cannot be fixed by fsck. Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.com Signed-off-by: Ashish Samant <ashish.samant@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	mm, /proc/pid/pagemap: fix soft dirty marking for PMD migration entry	Huang Ying
	When the pagetable is walked in the implementation of /proc/<pid>/pagemap, pmd_soft_dirty() is used for both the PMD huge page map and the PMD migration entries. That is wrong, pmd_swp_soft_dirty() should be used for the PMD migration entries instead because the different page table entry flag is used. As a result, /proc/pid/pagemap may report incorrect soft dirty information for PMD migration entries. Link: http://lkml.kernel.org/r/20171017081818.31795-1-ying.huang@intel.com Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Hugh Dickins <hughd@google.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Daniel Colascione <dancol@google.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size	Andrea Arcangeli
	This oops: kernel BUG at fs/hugetlbfs/inode.c:484! RIP: remove_inode_hugepages+0x3d0/0x410 Call Trace: hugetlbfs_setattr+0xd9/0x130 notify_change+0x292/0x410 do_truncate+0x65/0xa0 do_sys_ftruncate.constprop.3+0x11a/0x180 SyS_ftruncate+0xe/0x10 tracesys+0xd9/0xde was caused by the lack of i_size check in hugetlb_mcopy_atomic_pte. mmap() can still succeed beyond the end of the i_size after vmtruncate zapped vmas in those ranges, but the faults must not succeed, and that includes UFFDIO_COPY. We could differentiate the retval to userland to represent a SIGBUS like a page fault would do (vs SIGSEGV), but it doesn't seem very useful and we'd need to pick a random retval as there's no meaningful syscall retval that would differentiate from SIGSEGV and SIGBUS, there's just -EFAULT. Link: http://lkml.kernel.org/r/20171016223914.2421-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03	Merge tag 'perf-core-for-mingo-4.15-20171103' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Beautify the 'kcmp' and 'prctl' syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo) - Implement a way to print formatted output to per-event files in 'perf script' to facilitate generate flamegraphs, elliminating the need to write scripts to do that separation (yuzhoujian, Arnaldo Carvalho de Melo) Make 'perf stat --per-thread' update shadow stats to show metrics (Jiri Olsa) - Fix double mapping al->addr in callchain processing for children without self period (Namhyung Kim) - Fix memory leak in addr2inlines() when libbfd is not used (Namhyung Kim) - Show correct function name for srcline of callchains when libbfd is not used (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-03	crypto: ccm - preserve the IV buffer	Romain Izard
	The IV buffer used during CCM operations is used twice, during both the hashing step and the ciphering step. When using a hardware accelerator that updates the contents of the IV buffer at the end of ciphering operations, the value will be modified. In the decryption case, the subsequent setup of the hashing algorithm will interpret the updated IV instead of the original value, which can lead to out-of-bounds writes. Reuse the idata buffer, only used in the hashing step, to preserve the IV's value during the ciphering step in the decryption case. Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-03	crypto: x86/sha1-mb - fix panic due to unaligned access	Andrey Ryabinin
	struct sha1_ctx_mgr allocated in sha1_mb_mod_init() via kzalloc() and later passed in sha1_mb_flusher_mgr_flush_avx2() function where instructions vmovdqa used to access the struct. vmovdqa requires 16-bytes aligned argument, but nothing guarantees that struct sha1_ctx_mgr will have that alignment. Unaligned vmovdqa will generate GP fault. Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment requirements. Fixes: 2249cbb53ead ("crypto: sha-mb - SHA1 multibuffer submit and flush routines for AVX2") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-03	crypto: x86/sha256-mb - fix panic due to unaligned access	Andrey Ryabinin
	struct sha256_ctx_mgr allocated in sha256_mb_mod_init() via kzalloc() and later passed in sha256_mb_flusher_mgr_flush_avx2() function where instructions vmovdqa used to access the struct. vmovdqa requires 16-bytes aligned argument, but nothing guarantees that struct sha256_ctx_mgr will have that alignment. Unaligned vmovdqa will generate GP fault. Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment requirements. Fixes: a377c6b1876e ("crypto: sha256-mb - submit/flush routines for AVX2") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: <stable@vger.kernel.org> Acked-by: Tim Chen Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-03	xfrm: Fix stack-out-of-bounds read in xfrm_state_find.	Steffen Klassert
	When we do tunnel or beet mode, we pass saddr and daddr from the template to xfrm_state_find(), this is ok. On transport mode, we pass the addresses from the flowi, assuming that the IP addresses (and address family) don't change during transformation. This assumption is wrong in the IPv4 mapped IPv6 case, packet is IPv4 and template is IPv6. Fix this by using the addresses from the template unconditionally. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-11-03	spi: spi-fsl-dspi: enabling Coldfire mcf5441x dspi	Angelo Dureghello
	Signed-off-by: Angelo Dureghello <angelo@sysam.it> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03	regmap: Add a config option for hwspinlock	Mark Brown
	Unlike other lock types hwspinlocks are optional and can be built modular so we can't use them unconditionally in regmap so add a config option that drivers that want to use hwspinlocks with regmap can select which will ensure that hwspinlock is built in. Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03	Merge branch 'linus' into perf/urgent, to pick up dependent commits	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-03	spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers	Lucas Stach
	On systems where some controllers get a dynamic ID assigned and some have a fixed number from DT, the current implemention might run into an IDR collision if the dynamic controllers gets probed first and get an IDR number, which is later requested by the controller with the fixed numbering. When this happens the fixed controller will fail to register with the SPI core. Fix this by skipping all known alias numbers when assigning the dynamic IDs. Fixes: 9b61e302210e (spi: Pick spi bus number from Linux idr or spi alias) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03	xfrm: do unconditional template resolution before pcpu cache check	Florian Westphal
	Stephen Smalley says: Since 4.14-rc1, the selinux-testsuite has been encountering sporadic failures during testing of labeled IPSEC. git bisect pointed to commit ec30d ("xfrm: add xdst pcpu cache"). The xdst pcpu cache is only checking that the policies are the same, but does not validate that the policy, state, and flow match with respect to security context labeling. As a result, the wrong SA could be used and the receiver could end up performing permission checking and providing SO_PEERSEC or SCM_SECURITY values for the wrong security context. This fix makes it so that we always do the template resolution, and then checks that the found states match those in the pcpu bundle. This has the disadvantage of doing a bit more work (lookup in state hash table) if we can reuse the xdst entry (we only avoid xdst alloc/free) but we don't add a lot of extra work in case we can't reuse. xfrm_pol_dead() check is removed, reasoning is that xfrm_tmpl_resolve does all needed checks. Cc: Paul Moore <paul@paul-moore.com> Fixes: ec30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache") Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Tested-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-11-03	s390/qdio: sanitize put_indicator	Sebastian Ott
	qdio maintains an array of struct indicator_t. put_indicator takes a pointer to a member of a struct indicator_t within that array, calculates the index, and uses the array and the index to get the struct indicator_t. Simply use the pointer directly. Although the pointer happens to point to the first member of that struct use the container_of macro. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-03	s390/qdio: use atomic_cmpxchg	Sebastian Ott
	qdio uses atomic_read to find an unused indicator and atomic_set to flag it as used. This could lead to multiple users getting the same indicator. Use atomic_cmpxchg instead. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-03	Documentation: Add Tim Bird to list of enforcement statement endorsers	Bird, Timothy
	Add my name to the list. Signed-off-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-03	net: systemport: Correct IPG length settings	Florian Fainelli
	Due to a documentation mistake, the IPG length was set to 0x12 while it should have been 12 (decimal). This would affect short packet (64B typically) performance since the IPG was bigger than necessary. Fixes: 44a4524c54af ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	tcp: do not mangle skb->cb[] in tcp_make_synack()	Eric Dumazet
	Christoph Paasch sent a patch to address the following issue : tcp_make_synack() is leaving some TCP private info in skb->cb[], then send the packet by other means than tcp_transmit_skb() tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse IPv4/IPV6 stacks, but we have no such cleanup for SYNACK. tcp_make_synack() should not use tcp_init_nondata_skb() : tcp_init_nondata_skb() really should be limited to skbs put in write/rtx queues (the ones that are only sent via tcp_transmit_skb()) This patch fixes the issue and should even save few cpu cycles ;) Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	fib: fib_dump_info can no longer use __in_dev_get_rtnl	Florian Westphal
	syzbot reported yet another regression added with DOIT_UNLOCKED. When nexthop is marked as dead, fib_dump_info uses __in_dev_get_rtnl(): ./include/linux/inetdevice.h:230 suspicious rcu_dereference_protected() usage! rcu_scheduler_active = 2, debug_locks = 1 1 lock held by syz-executor2/23859: #0: (rcu_read_lock){....}, at: [<ffffffff840283f0>] inet_rtm_getroute+0xaa0/0x2d70 net/ipv4/route.c:2738 [..] lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4665 __in_dev_get_rtnl include/linux/inetdevice.h:230 [inline] fib_dump_info+0x1136/0x13d0 net/ipv4/fib_semantics.c:1377 inet_rtm_getroute+0xf97/0x2d70 net/ipv4/route.c:2785 .. This isn't safe anymore, callers either hold RTNL mutex or rcu read lock, so these spots must use rcu_dereference_rtnl() or plain rcu_derefence() (plus unconditional rcu read lock). This does the latter. Fixes: 394f51abb3d04f ("ipv4: route: set ipv4 RTM_GETROUTE to not use rtnl") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	stmmac: use of_property_read_u32 instead of read_u8	Bhadram Varka
	Numbers in DT are stored in “cells” which are 32-bits in size. of_property_read_u8 does not work properly because of endianness problem. This causes it to always return 0 with little-endian architectures. Fix it by using of_property_read_u32() OF API. Signed-off-by: Bhadram Varka <vbhadram@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	openrisc: fix possible deadlock scenario during timer sync	Stafford Horne
	OpenRISC borrows its timer sync logic from MIPS, Matt helped to review the OpenRISC implementation and noted that we may suffer the same deadlock case that MIPS has faced. The case being: "the MIPS timer synchronization code contained the possibility of deadlock. If you mark a CPU online before it goes into the synchronize loop, then the boot CPU can schedule a different thread and send IPIs to all "online" CPUs. It gets stuck waiting for the secondary to ack it's IPI, since this secondary CPU has not enabled IRQs yet, and is stuck waiting for the master to synchronise with it. The system then deadlocks." Fix this by moving set_cpu_online() to after timer sync. Reported-by: Matt Redfearn <matt.redfearn@mips.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: pass endianness info to sparse	Luc Van Oostenryck
	openrisc is big-endian only but sparse assumes the same endianness as the building machine. This is problematic for code which expect __BYTE_ORDER__ being correctly predefined by the compiler which sparse can then pre-process differently from what gcc would, depending on the building machine endianness. Fix this by letting sparse know about the architecture endianness. To: Jonas Bonn <jonas@southpole.se> To: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> To: Stafford Horne <shorne@gmail.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: add tick timer multi-core sync logic	Stafford Horne
	In case timers are not in sync when cpus start (i.e. hot plug / offset resets) we need to synchronize the secondary cpus internal timer with the main cpu. This is needed as in OpenRISC SMP there is only one clocksource registered which reads from the same ttcr register on each cpu. This synchronization routine heavily borrows from mips implementation that does something similar. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: enable LOCKDEP_SUPPORT and irqflags tracing	Stafford Horne
	Lockdep is needed for proving the spinlocks and rwlocks work fine on our platform. It also requires calling the trace_hardirqs_off() and trace_hardirqs_on() pair of routines when entering and exiting an interrupt. For OpenRISC the interrupt stack frame does not support frame pointers, so to call trace_hardirqs_on() and trace_hardirqs_off() here the macro's build up a stack frame each time. There is one necessary small change in _sys_call_handler to move interrupt enabling later so they can get re-enabled during syscall restart. This was done to fix lockdep warnings that are now possible due to this patch. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: support framepointers and STACKTRACE_SUPPORT	Stafford Horne
	For lockdep support a reliable stack trace mechanism is needed. This patch adds support in OpenRISC for the stacktrace framework, implemented by a simple unwinder api. The unwinder api supports both framepointer and basic stack tracing. The unwinder is now used to replace the stack_dump() implementation as well. The new traces are inline with other architectures trace format: Call trace: [<c0004448>] show_stack+0x3c/0x58 [<c031c940>] dump_stack+0xa8/0xe4 [<c0008104>] __cpu_up+0x64/0x130 [<c000d268>] bringup_cpu+0x3c/0x178 [<c000d038>] cpuhp_invoke_callback+0xa8/0x1fc [<c000d680>] cpuhp_up_callbacks+0x44/0x14c [<c000e400>] cpu_up+0x14c/0x1bc [<c041da60>] smp_init+0x104/0x15c [<c033843c>] ? kernel_init+0x0/0x140 [<c0415e04>] kernel_init_freeable+0xbc/0x25c [<c033843c>] ? kernel_init+0x0/0x140 [<c0338458>] kernel_init+0x1c/0x140 [<c003a174>] ? schedule_tail+0x18/0xa0 [<c0006b80>] ret_from_fork+0x1c/0x9c Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: add simple_smp dts and defconfig for simulators	Stefan Kristiansson
	Simple enough to be compatible with simulation environments, such as verilated systems, QEMU and other targets supporting OpenRISC SMP. This also supports our base FPGA SoC's if the cpu frequency is upped to 50Mhz. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: Added defconfig] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: add cacheflush support to fix icache aliasing	Jan Henrik Weinstock
	On OpenRISC the icache does not snoop data stores. This can cause aliasing as reported by Jan. This patch fixes the issue to ensure icache is properly synchronized when code is written to memory. It supports both SMP and UP flushing. This supports dcache flush as well for architectures that do not support write-through caches; most OpenRISC implementations do implement write-through cache however. Dcache flushes are done only on a single core as OpenRISC dcaches all support snooping of bus stores. Signed-off-by: Jan Henrik Weinstock <jan.weinstock@ice.rwth-aachen.de> [shorne@gmail.com: Squashed patches and wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: sleep instead of spin on secondary wait	Stafford Horne
	Currently we do a spin on secondary cpus when waiting to boot. This theoretically causes issues with power consumption and does cause issues with qemu cycle burning (it starves cpu 0 from actually being able to boot.) This change puts each secondary cpu to sleep if they have a power management unit, then signals them to wake via IPI when its time to boot. If the cpus have no power management unit they will loop as before. Note: The wakeup IPI requires a special interrupt handler as on secondary cpu's the interrupt infrastructure is not yet established. This interrupt handler is set and reset by updating SPR_EVBAR. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: fix initial preempt state for secondary cpu tasks	Stafford Horne
	During SMP testing we were getting the below warning after booting the secondary cpu: [ 0.060000] BUG: scheduling while atomic: swapper/1/0/0x00000000 This change follows similar patterns from other architectures to start the schduler with preempt disabled. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: initial SMP support	Stefan Kristiansson
	This patch introduces the SMP support for the OpenRISC architecture. The SMP architecture requires cores which have multi-core features which have been introduced a few years back including: - New SPRS SPR_COREID SPR_NUMCORES - Shadow SPRs - Atomic Instructions - Cache Coherency - A wired in IPI controller This patch adds all of the SMP specific changes to core infrastructure, it looks big but it needs to go all together as its hard to split this one up. Boot loader spinning of second cpu is not supported yet, it's assumed that Linux is booted straight after cpu reset. The bulk of these changes are trivial changes to refactor to use per cpu data structures throughout. The addition of the smp.c and changes in time.c are the changes. Some specific notes: MM changes ---------- The reason why this is created as an array, and not with DEFINE_PER_CPU is that doing it this way, we'll save a load in the tlb-miss handler (the load from __per_cpu_offset). TLB Flush --------- The SMP implementation of flush_tlb_* works by sending out a function-call IPI to all the non-local cpus by using the generic on_each_cpu() function. Currently, all flush_tlb_* functions will result in a flush_tlb_all(), which has always been the behaviour in the UP case. CPU INFO -------- This creates a per cpu cpuinfo struct and fills it out accordingly for each activated cpu. show_cpuinfo is also updated to reflect new version information in later versions of the spec. SMP API ------- This imitates the arm64 implementation by having a smp_cross_call callback that can be set by set_smp_cross_call to initiate an IPI and a handle_IPI function that is expected to be called from an IPI irqchip driver. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: added cpu stop, checkpatch fixes, wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	irqchip: add initial support for ompic	Stafford Horne
	IPI driver for the Open Multi-Processor Interrupt Controller (ompic) as described in the Multi-core support section of the OpenRISC 1.2 architecture specification: https://github.com/openrisc/doc/raw/master/openrisc-arch-1.2-rev0.pdf Each OpenRISC core contains a full interrupt controller which is used in the SMP architecture for interrupt balancing. This IPI device, the ompic, is the only external device required for enabling SMP on OpenRISC. Pending ops are stored in a memory bit mask which can allow multiple pending operations to be set and serviced at a time. This is mostly borrowed from the alpha IPI implementation. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: converted ops to bitmask, wrote commit message] Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	dt-bindings: add openrisc to vendor prefixes list	Stafford Horne
	Add OpenRISC.io to vendor prefixes. This is reserved for softcores developed by the OpenRISC community. The OpenRISC community has separated from OpenCores.org requiring a new prefix. Reviewed-by: Andreas Färber <afaerber@suse.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: use qspinlocks and qrwlocks	Stafford Horne
	Enable OpenRISC to use qspinlocks and qrwlocks for upcoming SMP support. Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: add 1 and 2 byte cmpxchg support	Stafford Horne
	OpenRISC only supports hardware instructions that perform 4 byte atomic operations. For enabling qrwlocks for upcoming SMP support 1 and 2 byte implementations are needed. To do this we leverage the 4 byte atomic operations and shift/mask the 1 and 2 byte areas as needed. This heavily borrows ideas and routines from sh and mips, which do something similar. Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	openrisc: use shadow registers to save regs on exception	Stefan Kristiansson
	Previously, the area between 0x0-0x100 have been used as a "scratch" memory area to temporarily store regs during exception entry. In a multi-core environment, this will not work. This change is to use shadow registers for nested context. Currently only the "critical" temp load/stores are covered, the EMERGENCY_PRINT ones are left as is (when they are used, it's game over anyway), they need to be handled as well in the future. Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	dt-bindings: openrisc: Add OpenRISC platform SoC	Stafford Horne
	Add devicetree binding documentation for the OpenRISC platform opencores,or1ksim. This is the main OpenRISC reference platform supporting multiple FPGA SoC's. This format is based on some of the mips binding docs as we have similar requirements. Also, update maintainers so openrisc related binding changes are visible to the openrisc team. Acked-by: Rob Herring <robh@kernel.org> Suggested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-11-03	Merge branch 'net-sched-use-after-free'	David S. Miller
	Cong Wang says: ==================== net_sched: fix a use-after-free for tc actions This patchset fixes a use-after-free reported by Lucas and closes potential races too. Please see each patch for details. ==================== Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	net_sched: hold netns refcnt for each action	Cong Wang
	TC actions have been destroyed asynchronously for a long time, previously in a RCU callback and now in a workqueue. If we don't hold a refcnt for its netns, we could use the per netns data structure, struct tcf_idrinfo, after it has been freed by netns workqueue. Hold refcnt to ensure netns destroy happens after all actions are gone. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Tested-by: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	net_sched: acquire RTNL in tc_action_net_exit()	Cong Wang
	I forgot to acquire RTNL in tc_action_net_exit() which leads that action ops->cleanup() is not always called with RTNL. This usually is not a big deal because this function is called after all netns refcnt are gone, but given RTNL protects more than just actions, add it for safety and consistency. Also add an assertion to catch other potential bugs. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Tested-by: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03	Merge tag 'timers-conversion-next3' of ↵	Thomas Gleixner
	https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into timers/core Pull the 3rd batch of timer conversions from Kees Cook: - various per-architecture conversions - several driver conversions not picked up by a specific maintainer - other Acked/Reviewed conversions to go through tip
2017-11-02	drivers/sgi-xp: Convert timers to use timer_setup()	Kees Cook
	In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Cliff Whickman <cpw@sgi.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Robin Holt <robinmholt@gmail.com>
2017-11-02	drivers/pcmcia: Convert timers to use timer_setup()	Kees Cook
	In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: David Howells <dhowells@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-pcmcia@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> # for soc_common.c
2017-11-02	drivers/memstick: Convert timers to use timer_setup()	Kees Cook
	In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-02	drivers/macintosh: Convert timers to use timer_setup()	Kees Cook
	In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook <keescook@chromium.org>