linux.git - Linus' kernel tree

Age	Commit message	Author
2017-07-05	sched/fair: Fix load_balance() affinity redo path	Jeffrey Hugo
	If load_balance() fails to migrate any tasks because all tasks were affined, load_balance() removes the source CPU from consideration and attempts to redo and balance among the new subset of CPUs. There is a bug in this code path where the algorithm considers all active CPUs in the system (minus the source that was just masked out). This is not valid for two reasons: some active CPUs may not be in the current scheduling domain and one of the active CPUs is dst_cpu. These CPUs should not be considered, as we cannot pull load from them. Instead of failing out of load_balance(), we may end up redoing the search with no valid CPUs and incorrectly concluding the domain is balanced. Additionally, if the group_imbalance flag was just set, it may also be incorrectly unset, thus the flag will not be seen by other CPUs in future load_balance() runs as that algorithm intends. Fix the check by removing CPUs not in the current domain and the dst_cpu from consideration, thus limiting the evaluation to valid remaining CPUs from which load might be migrated. Co-authored-by: Austin Christ <austinwc@codeaurora.org> Co-authored-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Austin Christ <austinwc@codeaurora.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Timur Tabi <timur@codeaurora.org> Link: http://lkml.kernel.org/r/1496863138-11322-2-git-send-email-jhugo@codeaurora.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
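The mask handling described in this commit can be sketched in plain C. This is only an illustration of the idea, not the kernel code: `cpu_mask_t`, `candidate_cpus()` and `should_redo()` are hypothetical names, and a 64-bit word stands in for the kernel's `struct cpumask`.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per CPU (CPU numbers 0..63); a stand-in for struct cpumask. */
typedef uint64_t cpu_mask_t;

/*
 * Build the set of CPUs that load may still be pulled from:
 * active CPUs, restricted to the current domain, minus dst_cpu.
 */
static cpu_mask_t candidate_cpus(cpu_mask_t active, cpu_mask_t domain_span,
                                 unsigned int dst_cpu)
{
    return active & domain_span & ~((cpu_mask_t)1 << dst_cpu);
}

/*
 * Called after every task on src_cpu turned out to be pinned: drop
 * src_cpu from the candidates and report whether a redo is worthwhile,
 * i.e. whether any valid source CPU remains.
 */
static bool should_redo(cpu_mask_t *candidates, unsigned int src_cpu)
{
    *candidates &= ~((cpu_mask_t)1 << src_cpu);
    return *candidates != 0;
}
```

With eight active CPUs, a domain spanning CPUs 0-3 and dst_cpu 0, the candidates start as CPUs 1-3; once all three have been masked out, `should_redo()` returns false instead of searching CPUs outside the domain.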
2017-07-05	MAINTAINERS: Add Frederic Weisbecker as nohz/dyntics maintainer	Ingo Molnar
	Frederic has been improving and maintaining the nohz/dynticks kernel features for years, so make his de facto maintainership official. Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	ftrace: Test for NULL iter->tr in regex for stack_trace_filter changes	Steven Rostedt (VMware)
	When writing into stack_trace_filter, iter->tr is not set and is NULL. Check if it is NULL before dereferencing it in ftrace_regex_release(). Fixes: 8c08f0d5c6fb ("ftrace: Have cached module filters be an active filter") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05	Merge commit '0f17976568b3f72e676450af0c0db6f8752253d6' into trace/ftrace/core	Steven Rostedt (VMware)
	Need to get the changes from 0f17976568b3 ("ftrace: Fix regression with module command in stack_trace_filter") as it is required to fix some other changes with stack_trace_filter and the new development code. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05	Merge branch 'dt/property-move' into dt/next	Rob Herring

2017-07-05	Merge branch 'topic/of-graph-base' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into dt/property-move	Rob Herring
	OF graph changes for ALSA conflict with the move of graph functions into property.c.
2017-07-05	GFS2: constify attribute_group structures.	Arvind Yadav
	attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 5259 1344 8 6611 19d3 fs/gfs2/sys.o File size After adding 'const': text data bss dec hex filename 5371 1216 8 6595 19c3 fs/gfs2/sys.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05	gfs2: gfs2_create_inode: Keep glock across iput	Andreas Gruenbacher
	On failure, keep the inode glock across the final iput of the new inode so that gfs2_evict_inode doesn't have to re-acquire the glock. That way, gfs2_evict_inode won't need to revalidate the block type. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05	gfs2: Clean up glock work enqueuing	Andreas Gruenbacher
	This patch adds a standardized queueing mechanism for glock work with spin_lock protection to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05	gfs2: Protect gl->gl_object by spin lock	Andreas Gruenbacher
	Put all remaining accesses to gl->gl_object under the gl->gl_lockref.lock spinlock to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05	gfs2: Get rid of flush_delayed_work in gfs2_evict_inode	Andreas Gruenbacher
	So far, gfs2_evict_inode clears gl->gl_object and then flushes the glock work queue to make sure that inode glops which dereference gl->gl_object have finished running before the inode is destroyed. However, flushing the work queue may do more work than needed, and in particular, it may call into DLM, which we want to avoid here. Use a bit lock (GIF_GLOP_PENDING) to synchronize between the inode glops and gfs2_evict_inode instead to get rid of the flushing. In addition, flush the work queues of existing glocks before reusing them for new inodes to get those glocks into a known state: the glock state engine currently doesn't handle glock re-appropriation correctly. (We may be able to fix the glock state engine instead later.) Based on a patch by Steven Whitehouse <swhiteho@redhat.com>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05	locking/rwsem-spinlock: Fix EINTR branch in __down_write_common()	Kirill Tkhai
	If a writer could have been woken up, the above branch if (sem->count == 0) break; would have moved us to taking the sem. So, it's not the time to wake a writer now, and only readers are allowed now. Thus, 0 must be passed to __rwsem_do_wake(). Next, __rwsem_do_wake() wakes readers unconditionally. But we mustn't do that if the sem is owned by a writer at the moment. Otherwise, writer and reader own the sem at the same time, which leads to memory corruption in callers. rwsem-xadd.c does not need that, as: 1) the similar check is made lockless there, 2) in __rwsem_mark_wake::try_reader_grant we test that the sem is not owned by a writer. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Niklas Cassel <niklas.cassel@axis.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 17fcbd590d0c "locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y" Link: http://lkml.kernel.org/r/149762063282.19811.9129615532201147826.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	Merge branches 'fixes' and 'misc' into for-linus	Russell King

2017-07-05	Merge branch 'phy-dp83867-workaround-incorrect-RX_CTRL-pin-strap'	David S. Miller
	Sekhar Nori says: ==================== net: phy: dp83867: workaround incorrect RX_CTRL pin strap This patch series adds a workaround for the incorrect RX_CTRL pin strap setting that can be found on some TI boards. This is required to be compliant with the PHY data manual specification. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap	Murali Karicheri
	The data manual for DP83867IR/CR, SNLS484E[1], revised March 2017, advises that strapping RX_DV/RX_CTRL pin in mode 1 and 2 is not supported (see note below Table 5 (4-Level Strap Pins)). There are some boards which have the pin strapped this way and need the software workaround suggested by the data manual. Bit[7] of Configuration Register 4 (address 0x0031) must be cleared to 0. This ensures proper operation of the PHY. Implement driver support for the device-tree property meant to advertise the wrong strapping. [1] http://www.ti.com/lit/ds/snls484e/snls484e.pdf Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> [nsekhar@ti.com: rebase to mainline, code simplification] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
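The register fix-up this commit describes amounts to clearing one bit. A minimal sketch, assuming the register value has already been read from the PHY: the macro and function names below are illustrative, and the `strap_quirk` flag stands in for whatever device-tree property the driver reads.

```c
#include <stdint.h>

#define CFG4_ADDR  0x0031u     /* Configuration Register 4, per the commit text */
#define CFG4_BIT7  (1u << 7)   /* the bit that must be cleared to 0 */

/*
 * Return the value to write back to Configuration Register 4.
 * When the board advertises the unsupported strap, bit 7 is cleared;
 * otherwise the register value is left untouched.
 */
static uint16_t cfg4_fixup(uint16_t cfg4, int strap_quirk)
{
    if (strap_quirk)
        cfg4 &= (uint16_t)~CFG4_BIT7;
    return cfg4;
}
```

A real driver would wrap this in a read-modify-write of the extended register at `CFG4_ADDR`. The sketch keeps only the bit manipulation so it can be checked in isolation.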
2017-07-05	dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap	Murali Karicheri
	The data manual for DP83867IR/CR, SNLS484E[1], revised March 2017, advises that strapping RX_DV/RX_CTRL pin in mode 1 and 2 is not supported (see note below Table 5 (4-Level Strap Pins)). It further advises that if a board has this pin strapped in mode 1 or mode 2, then to ensure proper operation of the PHY, a software workaround must be implemented. Since it is not possible to detect in software if the RX_DV/RX_CTRL pin is incorrectly strapped, add a device-tree property for the board to advertise this and allow corrective action in software. [1] http://www.ti.com/lit/ds/snls484e/snls484e.pdf Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> [nsekhar@ti.com: rebase to mainline, split documentation into separate patch] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	Merge branch 'cxgb4-ptp'	David S. Miller
	Atul Gupta says: ==================== cxgb4: Add PTP Hardware Clock (PHC) support V4: Splitting the patch again V3: Releasing lock in the exit paths V2: Splitting the patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	cxgb4: Support for get_ts_info ethtool method	Atul Gupta
	Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	cxgb4: Add PTP Hardware Clock (PHC) support	Atul Gupta
	Add PTP IEEE-1588 support and make it accessible via PHC subsystem. The functionality is enabled for T5/T6 adapters. Driver interfaces with Firmware to program and adjust the clock offset. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	cxgb4: time stamping interface for PTP	Atul Gupta
	Supports hardware and software time stamping via the Linux SO_TIMESTAMPING socket option. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	Merge branch 'nfp-port-enumeration-change-and-FW-ABI-adjustment'	David S. Miller
	Jakub Kicinski says: ==================== nfp: port enumeration change and FW ABI adjustment This set changes the way ports are numbered internally to avoid MAC address changes and invalid link information when breakout is configured. Second patch gets rid of old way of looking up MAC addresses in device information which caused all this confusion. Patch 3 is a small adjustment to the new FW ABI version we introduced in this release cycle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	nfp: default to chained metadata prepend format	Jakub Kicinski
	ABI 4.x introduced the chained metadata format and made it the only one possible. There are cases, however, where the old format is preferred - mostly to make interoperation with VFs using ABI 3.x easier for the datapath. In ABI 5.x we allowed for more flexibility by selecting the metadata format based on capabilities. The default was left to non-chained. In case of fallback traffic, there is no capability telling the driver there may be chained metadata. With a very stripped-down FW the default old metadata format would be selected making the driver drop all fallback traffic. This patch changes the default selection in the driver. It should not hurt with old firmwares, because if they don't advertise RSS they will not produce metadata anyway. New firmwares advertising ABI 5.x, however, can depend on the driver defaulting to chained format. Fixes: f9380629fafc ("nfp: advertise support for NFD ABI 0.5") Suggested-by: Michael Rapson <michael.rapson@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	nfp: remove legacy MAC address lookup	Jakub Kicinski
	The legacy MAC address lookup doesn't work well with breakout cables. We are probably better off picking random addresses than the wrong ones in the theoretical scenario where management FW didn't tell us what the port config is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	nfp: improve order of interfaces in breakout mode	Jakub Kicinski
	For historical reasons we enumerate the vNICs in order. This means that if user configures breakout on a multiport card, the first interface of the second port will have its MAC address changed. What's worse, when moved from static information (HWInfo) to using management FW (NSP), more features started depending on the port ids. Right now in case of breakout first subport of the second port and second subport of the first port will have their link info swapped. Revise the ordering scheme so that first subport maintains its address. Side effect of this change is that we will use base lane ids in devlink (i.e. 40G ports will be 4 ids apart), e.g.: pci/0000:04:00.0/0: type eth netdev p6p1 pci/0000:04:00.0/4: type eth netdev p6p2 Note that behaviour of phys_port_id is not changed since there is a separate id number for the subport there. Fixes: ec8b1fbe682d ("nfp: support port splitting via devlink") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	net: macb: remove extraneous return when MACB_EXT_DESC is defined	Colin Ian King
	When macro MACB_EXT_DESC is defined we end up with two identical return statements and just one is sufficient. Remove the extra return. Detected by CoverityScan, CID#1449361 ("Structurally dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table	Chen Yu
	Add the real e820_table_firmware[] that will not be modified by the kernel or the EFI boot stub under any circumstance. In addition to that modify the code so that e820_table_firmware[] is exposed via sysfs to represent the real firmware memory layout, rather than exposing the e820_table_kexec[] table. This fixes a hibernation bug/warning, which uses e820_table_kexec[] to check RAM layout consistency across hibernation/resume: The suspend kernel: [ 0.000000] e820: update [mem 0x76671018-0x76679457] usable ==> usable The resume kernel: [ 0.000000] e820: update [mem 0x7666f018-0x76677457] usable ==> usable ... [ 15.752088] PM: Using 3 thread(s) for decompression. [ 15.752088] PM: Loading and decompressing image data (471870 pages)... [ 15.764971] Hibernate inconsistent memory map detected! [ 15.770833] PM: Image mismatch: architecture specific data Actually it is safe to restore these pages because E820_TYPE_RAM and E820_TYPE_RESERVED_KERN are treated the same during hibernation, so the original e820 table provided by the bootloader is used for hibernation MD5 fingerprint checking. The side effect is that this newly introduced variable might increase the kernel size at compile time. Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	x86/boot/e820: Rename the e820_table_firmware to e820_table_kexec	Chen Yu
	Currently the e820_table_firmware[] table is mainly used by the kexec, and it is not what it's supposed to be - despite its name it might be modified by the kernel. So change its name to e820_table_kexec[]. In the next patch we will introduce the real e820_table_firmware[] table. No functional change. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	x86/boot/e820: Avoid overwriting e820_table_firmware	Chen Yu
	The following commit in 2013: 77ea8c948953 ("x86: Reserve setup_data ranges late after parsing memmap cmdline") has fixed the issue of losing setup_data information by deferring the e820_reserve_setup_data() call until the early params have been parsed. But this also introduced a new problem that, during early params parsing, the kexec kernel might fake a mptable and saves it into the e820_table_firmware[] table (without saving the mptable to the e820_table[]), however the subsequent invoking of e820_reserve_setup_data() will overwrite the e820_table_firmware[] according to the e820_table[], thus the fake mptable information is lost. Fix this issue by updating the e820_table_firmware[] according to the setup_data information, but without overwriting it. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case	Colin Ian King
	There appears to be a missing break in the TCP_BPF_SNDCWND_CLAMP case. Currently the non-error path where val is greater than zero falls through to the default case that sets the error return to -EINVAL. Add in the missing break. Detected by CoverityScan, CID#1449376 ("Missing break in switch") Fixes: 13bf96411ad2 ("bpf: Adds support for setting sndcwnd clamp") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
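The fall-through this commit fixes is easy to reproduce outside the kernel. The sketch below is a stand-alone illustration, not the BPF code: `OPT_SNDCWND_CLAMP` and `set_opt()` are hypothetical names chosen to mirror the shape of the switch statement.

```c
#include <errno.h>

enum { OPT_SNDCWND_CLAMP = 1 };   /* hypothetical option number */

/*
 * Apply a socket-style option. A positive val is stored in *clamp and
 * 0 is returned; anything else yields -EINVAL.
 */
static int set_opt(int optname, int val, int *clamp)
{
    int ret = 0;

    switch (optname) {
    case OPT_SNDCWND_CLAMP:
        if (val <= 0)
            ret = -EINVAL;
        else
            *clamp = val;
        break;  /* without this, the success path falls into default */
    default:
        ret = -EINVAL;
    }
    return ret;
}
```

Removing the `break` makes every call with this option return `-EINVAL`, even when the clamp value was stored, which is the behaviour the commit message describes.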
2017-07-05	bpf: fix return in load_bpf_file	Lawrence Brakmo
	The function load_bpf_file() ignores the return value of load_and_attach(), so even if load_and_attach() returns an error, load_bpf_file() will return 0. Now, load_bpf_file() can call load_and_attach() multiple times and some can succeed and some could fail. I think the correct behavior is to return error on the first failed load_and_attach(). v2: Added missing SOB Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
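The "return on the first failed call" behaviour can be sketched as a small loop. Everything here is hypothetical scaffolding rather than the samples/bpf code: `load_all()` plays the role of the outer loader, and the two callbacks exist only to exercise it.

```c
/* Counts how many times a callback ran; used only for demonstration. */
static int calls;

/* Hypothetical loader that always succeeds. */
static int always_ok(int idx)
{
    (void)idx;
    calls++;
    return 0;
}

/* Hypothetical loader that fails on the third item (index 2). */
static int fails_at_two(int idx)
{
    calls++;
    return idx == 2 ? -1 : 0;
}

/*
 * Run load_one() for each index and stop at the first error, returning
 * that error to the caller instead of discarding it.
 */
static int load_all(int (*load_one)(int idx), int count)
{
    for (int i = 0; i < count; i++) {
        int err = load_one(i);

        if (err)
            return err;
    }
    return 0;
}
```

With `fails_at_two` and five items, the loop stops after three calls and the caller sees the error rather than an unconditional 0.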
2017-07-05	mpls: fix rtm policy in mpls_getroute	Roopa Prabhu
	fix rtm policy name typo in mpls_getroute and also remove export of rtm_ipv4_policy Fixes: 397fc9e5cefe ("mpls: route get support") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	sched/cputime: Accumulate vtime on top of nsec clocksource	Wanpeng Li
	Currently the cputime source used by vtime is jiffies. When we cross a context boundary and jiffies have changed since the last snapshot, the pending cputime is accounted to the switching out context. This system works ok if the ticks are not aligned across CPUs. If they instead are aligned (ie: all fire at the same time) and the CPUs run in userspace, the jiffies change is only observed on tick exit and therefore the user cputime is accounted as system cputime. This is because the CPU that maintains timekeeping fires its tick at the same time as the others. It updates jiffies in the middle of the tick and the other CPUs see that update on IRQ exit:
	    CPU 0 (timekeeper)       CPU 1
	    -------------------      -------------
	               jiffies = N
	    ...                      run in userspace for a jiffy
	    tick entry               tick entry (sees jiffies = N)
	    set jiffies = N + 1
	    tick exit                tick exit (sees jiffies = N + 1)
	                             account 1 jiffy as stime
	Fix this by using a nanosecond clock source instead of jiffies. The cputime is then accumulated and flushed every time the pending delta reaches a jiffy in order to mitigate the accounting overhead. [ fweisbec: changelog, rebase on struct vtime, field renames, add delta on cputime readers, keep idle vtime as-is (low overhead accounting), harmonize clock sources. ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Luiz Capitulino <lcapitulino@redhat.com> Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-6-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
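The accumulate-then-flush scheme in this commit can be sketched as follows. This is a simplified model under stated assumptions, not the scheduler code: `struct vtime_acc` and `vtime_boundary()` are hypothetical names, and the jiffy length assumes HZ=250.

```c
#include <stdint.h>

#define NSEC_PER_JIFFY 4000000ull   /* assumes HZ=250; illustrative only */

struct vtime_acc {
    uint64_t last_ns;   /* clock reading at the previous context boundary */
    uint64_t pending;   /* nanoseconds accumulated but not yet accounted */
};

/*
 * Record the time elapsed since the last boundary. Returns the amount
 * to account now: zero until at least one jiffy is pending, then
 * everything accumulated so far.
 */
static uint64_t vtime_boundary(struct vtime_acc *v, uint64_t now_ns)
{
    uint64_t flushed = 0;

    v->pending += now_ns - v->last_ns;
    v->last_ns = now_ns;
    if (v->pending >= NSEC_PER_JIFFY) {
        flushed = v->pending;
        v->pending = 0;
    }
    return flushed;
}
```

Because the delta comes from a nanosecond clock rather than from the jiffies counter, the amount charged to a context no longer depends on when another CPU happens to update jiffies. The threshold only limits how often the more expensive accounting path runs.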
2017-07-05	sched/cputime: Move the vtime task fields to their own struct	Frederic Weisbecker
	We are about to add vtime accumulation fields to the task struct. Let's avoid more bloatification and gather vtime information to their own struct. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-5-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	sched/cputime: Rename vtime fields	Frederic Weisbecker
	The current "snapshot" based naming on vtime fields suggests we record some past event but that's a low level picture of their actual purpose which comes out blurry. The real point of these fields is to run a basic state machine that tracks down cputime entry while switching between contexts. So lets reflect that with more meaningful names. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	sched/cputime: Always set tsk->vtime_snap_whence after accounting vtime	Frederic Weisbecker
	Even though it doesn't have functional consequences, setting the task's new context state after we actually accounted the pending vtime from the old context state makes more sense from a review perspective. vtime_user_exit() is the only function that doesn't follow that rule and that can bug the reviewer for a little while until he realizes there is no reason for this special case. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	vtime, sched/cputime: Remove vtime_account_user()	Frederic Weisbecker
	It's an unnecessary function between vtime_user_exit() and account_user_time(). Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Herbert Xu
	Merge the crypto tree to pull in fixes for the next merge window.
2017-07-05	Merge tag 'perf-urgent-for-mingo-4.12-20170704' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent	Ingo Molnar
	Pull perf/urgent fixes from Arnaldo Carvalho de Melo: User visible changes: - Fix max attr.precise_ip probing to make perf use the best cycles:p available in the processor for non root users (Arnaldo Carvalho de Melo) - Fix processing of MMAP events for 32-bit binaries on 64-bit systems when unwind support is not fully integrated, fixing DSO and symbol resolution (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05	x86/mm/pat: Don't report PAT on CPUs that don't support it	Mikulas Patocka
	The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Fixes: 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bernhard Held <berny156@gmx.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com
2017-07-05	Update my email address	Cornelia Huck
	Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/syscalls: Fix out of bounds arguments access	Jiri Olsa
	Zorro reported the following crash with syscall tracing enabled (CONFIG_FTRACE_SYSCALLS): Unable to handle kernel pointer dereference at virtual ... Oops: 0011 [#1] SMP DEBUG_PAGEALLOC SNIP Call Trace: ([<000000000024d79c>] ftrace_syscall_enter+0xec/0x1d8) [<00000000001099c6>] do_syscall_trace_enter+0x236/0x2f8 [<0000000000730f1c>] sysc_tracesys+0x1a/0x32 [<000003fffcf946a2>] 0x3fffcf946a2 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000000000022dd44>] rb_event_data+0x34/0x40 ---[ end trace 8c795f86b1b3f7b9 ]--- The crash happens in the syscall_get_arguments() function for syscalls with zero arguments, which tries to access the first argument (args[0]) in the event entry even though it is not allocated. Bail out if there are no arguments. Cc: stable@vger.kernel.org Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
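The early bail-out can be illustrated with a simplified model. This is not the s390 implementation: `struct fake_regs` and `get_arguments()` are hypothetical, and the sketch only assumes that the first argument is stored separately from the rest and that at most six arguments are requested.

```c
#include <stddef.h>

/* Hypothetical register file: the first argument lives in its own slot. */
struct fake_regs {
    unsigned long orig_arg0;
    unsigned long gprs[6];
};

/*
 * Copy the first n syscall arguments (n <= 6) into args. With n == 0
 * the caller has allocated no argument slots, so nothing may be
 * written, not even args[0].
 */
static void get_arguments(const struct fake_regs *regs, unsigned int n,
                          unsigned long *args)
{
    if (n == 0)
        return;

    for (unsigned int k = n - 1; k > 0; k--)
        args[k] = regs->gprs[k];
    args[0] = regs->orig_arg0;
}
```

Without the `n == 0` check, the unconditional store to `args[0]` would write past the event entry for zero-argument syscalls, which matches the out-of-bounds access described in the commit.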
2017-07-05	s390/vfio_ccw: remove unused variable	Sebastian Ott
	Fix this set but not used warning: drivers/s390/cio/vfio_ccw_drv.c: In function 'vfio_ccw_sch_io_todo': drivers/s390/cio/vfio_ccw_drv.c:72:21: warning: variable 'sch' set but not used [-Wunused-but-set-variable] struct subchannel *sch; ^ Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/dasd: remove unneeded code	Sebastian Ott
	Fix these set but not used warnings: drivers/s390/block/dasd.c:3933:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] drivers/s390/block/dasd_alias.c:757:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] In addition to that remove the test if an unsigned is < 0: drivers/s390/block/dasd_devmap.c:153:11: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/crash: Remove unused KEXEC_NOTE_BYTES	Michael Holzheu
	After commit 692f66f26a4c19 ("crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE") the KEXEC_NOTE_BYTES macro is not used anymore and for s390 we create the ELF header in the new kernel anyway. Therefore remove the macro. Reported-by: Xunlei Pang <xpang@redhat.com> Reviewed-by: Mikhail Zaslonko <zaslonko@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/zcrypt: Fix missing newlines at some debug feature messages.	Harald Freudenberger
	On some debug feature invocations the newline was missing. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/dasd: Make raw I/O usable without prefix support	Jan Höppner
	The Prefix CCW is not mandatory and raw I/O can also be issued without it. Check whether the Prefix CCW is supported and if not use the combination of Define Extent and Locate Record Extended instead. While at it, sort the variable declarations, replace the gotos with early exits, and remove an error check at the end which is irrelevant. Also, remove the XRC check as it is not relevant for raw I/O. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/dasd: Rename dasd_raw_build_cp()	Jan Höppner
	Rename dasd_raw_build_cp() to dasd_eckd_build_cp_raw() to fit the scheme. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390/dasd: Refactor prefix_LRE() and related functions	Jan Höppner
	We already have define_extent() that prepares necessary data for the Define Extent CCW. The exact same thing is done in prefix_LRE(). Remove the duplicate code and move commands that were only used in combination with the Prefix command to define_extent(). One of these commands needs the blocksize to be specified. Add the blksize parameter to define_extent() to account for that. In addition, the check_XRC() function can be made more generic. Do this and remove the Prefix-specific check_XRC_on_prefix() function. Furthermore, prefix_LRE() uses fill_LRE_data() to prepare Locate Record Extended data. Rename the function to fit the scheme better and make it usable outside of the Prefix context by adding the corresponding CCW command. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	s390: fix up for "blk-mq: switch ->queue_rq return value to blk_status_t"	Stephen Rothwell
	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05	fs: generic_block_bmap(): initialize all of the fields in the temp bh	Alexander Potapenko
	KMSAN (KernelMemorySanitizer, a new error detection tool) reports the use of uninitialized memory in ext4_update_bh_state(): ================================================================== BUG: KMSAN: use of unitialized memory CPU: 3 PID: 1 Comm: swapper/0 Tainted: G B 4.8.0-rc6+ #597 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000282 ffff88003cc96f68 ffffffff81f30856 0000003000000008 ffff88003cc96f78 0000000000000096 ffffffff8169742a ffff88003cc96ff8 ffffffff812fc1fc 0000000000000008 ffff88003a1980e8 0000000100000000 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81f30856>] dump_stack+0xa6/0xc0 lib/dump_stack.c:51 [<ffffffff812fc1fc>] kmsan_report+0x1ec/0x300 mm/kmsan/kmsan.c:? [<ffffffff812fc33b>] __msan_warning+0x2b/0x40 ??:? [< inline >] ext4_update_bh_state fs/ext4/inode.c:727 [<ffffffff8169742a>] _ext4_get_block+0x6ca/0x8a0 fs/ext4/inode.c:759 [<ffffffff81696d4c>] ext4_get_block+0x8c/0xa0 fs/ext4/inode.c:769 [<ffffffff814a2d36>] generic_block_bmap+0x246/0x2b0 fs/buffer.c:2991 [<ffffffff816ca30e>] ext4_bmap+0x5ee/0x660 fs/ext4/inode.c:3177 ... origin description: ----tmp@generic_block_bmap ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) The local |tmp| is created in generic_block_bmap() and then passed into ext4_bmap() => ext4_get_block() => _ext4_get_block() => ext4_update_bh_state(). Along the way tmp.b_page is never initialized before ext4_update_bh_state() checks its value. [ Use the approach suggested by Kees Cook of initializing the whole bh structure.] Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
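Initializing the whole temporary structure, rather than individual fields, can be sketched with a C designated initializer. `struct tmp_bh` and `make_tmp_bh()` are hypothetical stand-ins for the buffer head and its setup code; the actual patch may differ in detail.

```c
#include <stddef.h>

/* Hypothetical stand-in for the temporary buffer head. */
struct tmp_bh {
    unsigned long state;
    void *page;
    size_t size;
};

/*
 * Build a temporary buffer head for a block of the given size.
 * With a designated initializer, every member that is not named
 * explicitly is zero-initialized, so no field is left indeterminate.
 */
static struct tmp_bh make_tmp_bh(size_t size)
{
    struct tmp_bh tmp = { .size = size };

    return tmp;
}
```

Compared with assigning `state` and `size` one by one after declaring the struct, this form guarantees that a field added later, or one the author forgot such as `page`, still starts out as zero or NULL.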