summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-07-05sched/fair: Fix load_balance() affinity redo pathJeffrey Hugo
If load_balance() fails to migrate any tasks because all tasks were affined, load_balance() removes the source CPU from consideration and attempts to redo and balance among the new subset of CPUs. There is a bug in this code path where the algorithm considers all active CPUs in the system (minus the source that was just masked out). This is not valid for two reasons: some active CPUs may not be in the current scheduling domain and one of the active CPUs is dst_cpu. These CPUs should not be considered, as we cannot pull load from them. Instead of failing out of load_balance(), we may end up redoing the search with no valid CPUs and incorrectly concluding the domain is balanced. Additionally, if the group_imbalance flag was just set, it may also be incorrectly unset, thus the flag will not be seen by other CPUs in future load_balance() runs as that algorithm intends. Fix the check by removing CPUs not in the current domain and the dst_cpu from considertation, thus limiting the evaluation to valid remaining CPUs from which load might be migrated. Co-authored-by: Austin Christ <austinwc@codeaurora.org> Co-authored-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Austin Christ <austinwc@codeaurora.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Timur Tabi <timur@codeaurora.org> Link: http://lkml.kernel.org/r/1496863138-11322-2-git-send-email-jhugo@codeaurora.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05MAINTAINERS: Add Frederic Weisbecker as nohz/dyntics maintainerIngo Molnar
Frederic has been improving and maintaining the nohz/dynticks kernel features for years, so make his de facto maintainership official. Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05ftrace: Test for NULL iter->tr in regex for stack_trace_filter changesSteven Rostedt (VMware)
As writing into stack_trace_filter, the iter-tr is not set and is NULL. Check if it is NULL before dereferencing it in ftrace_regex_release(). Fixes: 8c08f0d5c6fb ("ftrace: Have cached module filters be an active filter") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05Merge commit '0f17976568b3f72e676450af0c0db6f8752253d6' into trace/ftrace/coreSteven Rostedt (VMware)
Need to get the changes from 0f17976568b3 ("ftrace: Fix regression with module command in stack_trace_filter") as it is required to fix some other changes with stack_trace_filter and the new development code. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05Merge branch 'dt/property-move' into dt/nextRob Herring
2017-07-05Merge branch 'topic/of-graph-base' of ↵Rob Herring
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into dt/property-move OF graph changes for ALSA conflict with the move of graph functions into property.c.
2017-07-05GFS2: constify attribute_group structures.Arvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 5259 1344 8 6611 19d3 fs/gfs2/sys.o File size After adding 'const': text data bss dec hex filename 5371 1216 8 6595 19c3 fs/gfs2/sys.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05gfs2: gfs2_create_inode: Keep glock across iputAndreas Gruenbacher
On failure, keep the inode glock across the final iput of the new inode so that gfs2_evict_inode doesn't have to re-acquire the glock. That way, gfs2_evict_inode won't need to revalidate the block type. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05gfs2: Clean up glock work enqueuingAndreas Gruenbacher
This patch adds a standardized queueing mechanism for glock work with spin_lock protection to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05gfs2: Protect gl->gl_object by spin lockAndreas Gruenbacher
Put all remaining accesses to gl->gl_object under the gl->gl_lockref.lock spinlock to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05gfs2: Get rid of flush_delayed_work in gfs2_evict_inodeAndreas Gruenbacher
So far, gfs2_evict_inode clears gl->gl_object and then flushes the glock work queue to make sure that inode glops which dereference gl->gl_object have finished running before the inode is destroyed. However, flushing the work queue may do more work than needed, and in particular, it may call into DLM, which we want to avoid here. Use a bit lock (GIF_GLOP_PENDING) to synchronize between the inode glops and gfs2_evict_inode instead to get rid of the flushing. In addition, flush the work queues of existing glocks before reusing them for new inodes to get those glocks into a known state: the glock state engine currently doesn't handle glock re-appropriation correctly. (We may be able to fix the glock state engine instead later.) Based on a patch by Steven Whitehouse <swhiteho@redhat.com>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05locking/rwsem-spinlock: Fix EINTR branch in __down_write_common()Kirill Tkhai
If a writer could been woken up, the above branch if (sem->count == 0) break; would have moved us to taking the sem. So, it's not the time to wake a writer now, and only readers are allowed now. Thus, 0 must be passed to __rwsem_do_wake(). Next, __rwsem_do_wake() wakes readers unconditionally. But we mustn't do that if the sem is owned by writer in the moment. Otherwise, writer and reader own the sem the same time, which leads to memory corruption in callers. rwsem-xadd.c does not need that, as: 1) the similar check is made lockless there, 2) in __rwsem_mark_wake::try_reader_grant we test, that sem is not owned by writer. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Niklas Cassel <niklas.cassel@axis.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 17fcbd590d0c "locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y" Link: http://lkml.kernel.org/r/149762063282.19811.9129615532201147826.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05Merge branches 'fixes' and 'misc' into for-linusRussell King
2017-07-05Merge branch 'phy-dp83867-workaround-incorrect-RX_CTRL-pin-strap'David S. Miller
Sekhar Nori says: ==================== net: phy: dp83867: workaround incorrect RX_CTRL pin strap This patch series adds workaround for incorrect RX_CTRL pin strap setting that can be found on some TI boards. This is required to be complaint to PHY datamanual specification. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05net: phy: dp83867: add workaround for incorrect RX_CTRL pin strapMurali Karicheri
The data manual for DP83867IR/CR, SNLS484E[1], revised march 2017, advises that strapping RX_DV/RX_CTRL pin in mode 1 and 2 is not supported (see note below Table 5 (4-Level Strap Pins)). There are some boards which have the pin strapped this way and need software workaround suggested by the data manual. Bit[7] of Configuration Register 4 (address 0x0031) must be cleared to 0. This ensures proper operation of the PHY. Implement driver support for device-tree property meant to advertise the wrong strapping. [1] http://www.ti.com/lit/ds/snls484e/snls484e.pdf Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> [nsekhar@ti.com: rebase to mainline, code simplification] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strapMurali Karicheri
The data manual for DP83867IR/CR, SNLS484E[1], revised march 2017, advises that strapping RX_DV/RX_CTRL pin in mode 1 and 2 is not supported (see note below Table 5 (4-Level Strap Pins)). It further advises that if a board has this pin strapped in mode 1 and mode 2, then to ensure proper operation of the PHY, a software workaround must be implemented. Since it is not possible to detect in software if RX_DV/RX_CTRL pin is incorrectly strapped, add a device-tree property for the board to advertise this and allow corrective action in software. [1] http://www.ti.com/lit/ds/snls484e/snls484e.pdf Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> [nsekhar@ti.com: rebase to mainline, split documentation into separate patch] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05Merge branch 'cxgb4-ptp'David S. Miller
Atul Gupta says: ==================== cxgb4: Add PTP Hardware Clock (PHC) support V4: Splitting the patch again V3: Releasing lock in the exit paths V2: Splitting the patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05cxgb4: Support for get_ts_info ethtool methodAtul Gupta
Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05cxgb4: Add PTP Hardware Clock (PHC) supportAtul Gupta
Add PTP IEEE-1588 support and make it accessible via PHC subsystem. The functionality is enabled for T5/T6 adapters. Driver interfaces with Firmware to program and adjust the clock offset. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05cxgb4: time stamping interface for PTPAtul Gupta
Supports hardware and software time stamping via the Linux SO_TIMESTAMPING socket option. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05Merge branch 'nfp-port-enumeration-change-and-FW-ABI-adjustment'David S. Miller
Jakub Kicinski says: ==================== nfp: port enumeration change and FW ABI adjustment This set changes the way ports are numbered internally to avoid MAC address changes and invalid link information when breakout is configured. Second patch gets rid of old way of looking up MAC addresses in device information which caused all this confusion. Patch 3 is a small adjustment to the new FW ABI version we introduced in this release cycle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05nfp: default to chained metadata prepend formatJakub Kicinski
ABI 4.x introduced the chained metadata format and made it the only one possible. There are cases, however, where the old format is preferred - mostly to make interoperation with VFs using ABI 3.x easier for the datapath. In ABI 5.x we allowed for more flexibility by selecting the metadata format based on capabilities. The default was left to non-chained. In case of fallback traffic, there is no capability telling the driver there may be chained metadata. With a very stripped- -down FW the default old metadata format would be selected making the driver drop all fallback traffic. This patch changes the default selection in the driver. It should not hurt with old firmwares, because if they don't advertise RSS they will not produce metadata anyway. New firmwares advertising ABI 5.x, however, can depend on the driver defaulting to chained format. Fixes: f9380629fafc ("nfp: advertise support for NFD ABI 0.5") Suggested-by: Michael Rapson <michael.rapson@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05nfp: remove legacy MAC address lookupJakub Kicinski
The legacy MAC address lookup doesn't work well with breakout cables. We are probably better off picking random addresses than the wrong ones in the theoretical scenario where management FW didn't tell us what the port config is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05nfp: improve order of interfaces in breakout modeJakub Kicinski
For historical reasons we enumerate the vNICs in order. This means that if user configures breakout on a multiport card, the first interface of the second port will have its MAC address changed. What's worse, when moved from static information (HWInfo) to using management FW (NSP), more features started depending on the port ids. Right now in case of breakout first subport of the second port and second subport of the first port will have their link info swapped. Revise the ordering scheme so that first subport maintains its address. Side effect of this change is that we will use base lane ids in devlink (i.e. 40G ports will be 4 ids apart), e.g.: pci/0000:04:00.0/0: type eth netdev p6p1 pci/0000:04:00.0/4: type eth netdev p6p2 Note that behaviour of phys_port_id is not changed since there is a separate id number for the subport there. Fixes: ec8b1fbe682d ("nfp: support port splitting via devlink") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05net: macb: remove extraneous return when MACB_EXT_DESC is definedColin Ian King
When macro MACB_EXT_DESC is defined we end up with two identical return statements and just one is sufficient. Remove the extra return. Detected by CoverityScan, CID#1449361 ("Structurally dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] tableChen Yu
Add the real e820_tabel_firmware[] that will not be modified by the kernel or the EFI boot stub under any circumstance. In addition to that modify the code so that e820_table_firmwarep[] is exposed via sysfs to represent the real firmware memory layout, rather than exposing the e820_table_kexec[] table. This fixes a hibernation bug/warning, which uses e820_table_kexec[] to check RAM layout consistency across hibernation/resume: The suspend kernel: [ 0.000000] e820: update [mem 0x76671018-0x76679457] usable ==> usable The resume kernel: [ 0.000000] e820: update [mem 0x7666f018-0x76677457] usable ==> usable ... [ 15.752088] PM: Using 3 thread(s) for decompression. [ 15.752088] PM: Loading and decompressing image data (471870 pages)... [ 15.764971] Hibernate inconsistent memory map detected! [ 15.770833] PM: Image mismatch: architecture specific data Actually it is safe to restore these pages because E820_TYPE_RAM and E820_TYPE_RESERVED_KERN are treated the same during hibernation, so the original e820 table provided by the bootloader is used for hibernation MD5 fingerprint checking. The side effect is that, this newly introduced variable might increase the kernel size at compile time. Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05x86/boot/e820: Rename the e820_table_firmware to e820_table_kexecChen Yu
Currently the e820_table_firmware[] table is mainly used by the kexec, and it is not what it's supposed to be - despite its name it might be modified by the kernel. So change its name to e820_table_kexec[]. In the next patch we will introduce the real e820_table_firmware[] table. No functional change. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05x86/boot/e820: Avoid overwriting e820_table_firmwareChen Yu
The following commit in 2013: 77ea8c948953 ("x86: Reserve setup_data ranges late after parsing memmap cmdline") has fixed the issue of losing setup_data information by deferring the e820_reserve_setup_data() call until the early params have been parsed. But this also introduced a new problem that, during early params parsing, the kexec kernel might fake a mptable and saves it into the e820_table_firmware[] table (without saving the mptable to the e820_table[]), however the subsequent invoking of e820_reserve_setup_data() will overwrite the e820_table_firmware[] according to the e820_table[], thus the fake mptable information is lost. Fix this issue by updating the e820_table_firmware[] according to the setup_data information, but without overwriting it. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xunlei Pang <xlpang@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP caseColin Ian King
There appears to be a missing break in the TCP_BPF_SNDCWND_CLAMP case. Currently the non-error path where val is greater than zero falls through to the default case that sets the error return to -EINVAL. Add in the missing break. Detected by CoverityScan, CID#1449376 ("Missing break in switch") Fixes: 13bf96411ad2 ("bpf: Adds support for setting sndcwnd clamp") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05bpf: fix return in load_bpf_fileLawrence Brakmo
The function load_bpf_file ignores the return value of load_and_attach(), so even if load_and_attach() returns an error, load_bpf_file() will return 0. Now, load_bpf_file() can call load_and_attach() multiple times and some can succeed and some could fail. I think the correct behavor is to return error on the first failed load_and_attach(). v2: Added missing SOB Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05mpls: fix rtm policy in mpls_getrouteRoopa Prabhu
fix rtm policy name typo in mpls_getroute and also remove export of rtm_ipv4_policy Fixes: 397fc9e5cefe ("mpls: route get support") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05sched/cputime: Accumulate vtime on top of nsec clocksourceWanpeng Li
Currently the cputime source used by vtime is jiffies. When we cross a context boundary and jiffies have changed since the last snapshot, the pending cputime is accounted to the switching out context. This system works ok if the ticks are not aligned across CPUs. If they instead are aligned (ie: all fire at the same time) and the CPUs run in userspace, the jiffies change is only observed on tick exit and therefore the user cputime is accounted as system cputime. This is because the CPU that maintains timekeeping fires its tick at the same time as the others. It updates jiffies in the middle of the tick and the other CPUs see that update on IRQ exit: CPU 0 (timekeeper) CPU 1 ------------------- ------------- jiffies = N ... run in userspace for a jiffy tick entry tick entry (sees jiffies = N) set jiffies = N + 1 tick exit tick exit (sees jiffies = N + 1) account 1 jiffy as stime Fix this with using a nanosec clock source instead of jiffies. The cputime is then accumulated and flushed everytime the pending delta reaches a jiffy in order to mitigate the accounting overhead. [ fweisbec: changelog, rebase on struct vtime, field renames, add delta on cputime readers, keep idle vtime as-is (low overhead accounting), harmonize clock sources. ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Luiz Capitulino <lcapitulino@redhat.com> Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-6-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05sched/cputime: Move the vtime task fields to their own structFrederic Weisbecker
We are about to add vtime accumulation fields to the task struct. Let's avoid more bloatification and gather vtime information to their own struct. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-5-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05sched/cputime: Rename vtime fieldsFrederic Weisbecker
The current "snapshot" based naming on vtime fields suggests we record some past event but that's a low level picture of their actual purpose which comes out blurry. The real point of these fields is to run a basic state machine that tracks down cputime entry while switching between contexts. So lets reflect that with more meaningful names. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05sched/cputime: Always set tsk->vtime_snap_whence after accounting vtimeFrederic Weisbecker
Even though it doesn't have functional consequences, setting the task's new context state after we actually accounted the pending vtime from the old context state makes more sense from a review perspective. vtime_user_exit() is the only function that doesn't follow that rule and that can bug the reviewer for a little while until he realizes there is no reason for this special case. Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05vtime, sched/cputime: Remove vtime_account_user()Frederic Weisbecker
It's an unnecessary function between vtime_user_exit() and account_user_time(). Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1498756511-11714-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Herbert Xu
Merge the crypto tree to pull in fixes for the next merge window.
2017-07-05Merge tag 'perf-urgent-for-mingo-4.12-20170704' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: User visible changes: - Fix max attr.precise_ip probing to make perf use the best cycles:p available in the processor for non root users (Arnaldo Carvalho de Melo) - Fix processing of MMAP events for 32-bit binaries on 64-bit systems when unwind support is not fully integrated, fixing DSO and symbol resolution (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05x86/mm/pat: Don't report PAT on CPUs that don't support itMikulas Patocka
The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Fixes: 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bernhard Held <berny156@gmx.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com
2017-07-05Update my email addressCornelia Huck
Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/syscalls: Fix out of bounds arguments accessJiri Olsa
Zorro reported following crash while having enabled syscall tracing (CONFIG_FTRACE_SYSCALLS): Unable to handle kernel pointer dereference at virtual ... Oops: 0011 [#1] SMP DEBUG_PAGEALLOC SNIP Call Trace: ([<000000000024d79c>] ftrace_syscall_enter+0xec/0x1d8) [<00000000001099c6>] do_syscall_trace_enter+0x236/0x2f8 [<0000000000730f1c>] sysc_tracesys+0x1a/0x32 [<000003fffcf946a2>] 0x3fffcf946a2 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000000000022dd44>] rb_event_data+0x34/0x40 ---[ end trace 8c795f86b1b3f7b9 ]--- The crash happens in syscall_get_arguments function for syscalls with zero arguments, that will try to access first argument (args[0]) in event entry, but it's not allocated. Bail out of there are no arguments. Cc: stable@vger.kernel.org Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/vfio_ccw: remove unused variableSebastian Ott
Fix this set but not used warning: drivers/s390/cio/vfio_ccw_drv.c: In function 'vfio_ccw_sch_io_todo': drivers/s390/cio/vfio_ccw_drv.c:72:21: warning: variable 'sch' set but not used [-Wunused-but-set-variable] struct subchannel *sch; ^ Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/dasd: remove unneeded codeSebastian Ott
Fix these set but not used warnings: drivers/s390/block/dasd.c:3933:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] drivers/s390/block/dasd_alias.c:757:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] In addition to that remove the test if an unsigned is < 0: drivers/s390/block/dasd_devmap.c:153:11: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/crash: Remove unused KEXEC_NOTE_BYTESMichael Holzheu
After commmit 692f66f26a4c19 ("crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE") the KEXEC_NOTE_BYTES macro is not used anymore and for s390 we create the ELF header in the new kernel anyway. Therefore remove the macro. Reported-by: Xunlei Pang <xpang@redhat.com> Reviewed-by: Mikhail Zaslonko <zaslonko@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/zcrypt: Fix missing newlines at some debug feature messages.Harald Freudenberger
On some debug feature invocations the newline was missing. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/dasd: Make raw I/O usable without prefix supportJan Höppner
The Prefix CCW is not mandatory and raw I/O can also be issued without it. Check whether the Prefix CCW is supported and if not use the combination of Define Extent and Locate Record Extended instead. While at it, sort the variable declarations, replace the gotos with early exits, and remove an error check at the end which is irrelevant. Also, remove the XRC check as it is not relevant for raw I/O. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/dasd: Rename dasd_raw_build_cp()Jan Höppner
Rename dasd_raw_build_cp() to dasd_eckd_build_cp_raw() to fit the scheme. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390/dasd: Refactor prefix_LRE() and related functionsJan Höppner
We already have define_extent() that prepares necessary data for the Define Extent CCW. The exact same thing is done in prefix_LRE(). Remove the duplicate code and move commands that were only used in combination with the Prefix command to define_extent(). One of these commands needs the blocksize to be specified. Add the blksize parameter to define_extent() to account for that. In addition, the check_XRC() function can be made more generic. Do this and remove the Prefix-specific check_XRC_on_prefix() function. Furthermore, prefix_LRE() uses fill_LRE_data() to prepare Locate Record Extended data. Rename the function to fit the scheme better and make it usable outside of the Prefix context by adding the corresponding CCW command. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05s390: fix up for "blk-mq: switch ->queue_rq return value to blk_status_t"Stephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-05fs: generic_block_bmap(): initialize all of the fields in the temp bhAlexander Potapenko
KMSAN (KernelMemorySanitizer, a new error detection tool) reports the use of uninitialized memory in ext4_update_bh_state(): ================================================================== BUG: KMSAN: use of unitialized memory CPU: 3 PID: 1 Comm: swapper/0 Tainted: G B 4.8.0-rc6+ #597 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000282 ffff88003cc96f68 ffffffff81f30856 0000003000000008 ffff88003cc96f78 0000000000000096 ffffffff8169742a ffff88003cc96ff8 ffffffff812fc1fc 0000000000000008 ffff88003a1980e8 0000000100000000 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81f30856>] dump_stack+0xa6/0xc0 lib/dump_stack.c:51 [<ffffffff812fc1fc>] kmsan_report+0x1ec/0x300 mm/kmsan/kmsan.c:? [<ffffffff812fc33b>] __msan_warning+0x2b/0x40 ??:? [< inline >] ext4_update_bh_state fs/ext4/inode.c:727 [<ffffffff8169742a>] _ext4_get_block+0x6ca/0x8a0 fs/ext4/inode.c:759 [<ffffffff81696d4c>] ext4_get_block+0x8c/0xa0 fs/ext4/inode.c:769 [<ffffffff814a2d36>] generic_block_bmap+0x246/0x2b0 fs/buffer.c:2991 [<ffffffff816ca30e>] ext4_bmap+0x5ee/0x660 fs/ext4/inode.c:3177 ... origin description: ----tmp@generic_block_bmap ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) The local |tmp| is created in generic_block_bmap() and then passed into ext4_bmap() => ext4_get_block() => _ext4_get_block() => ext4_update_bh_state(). Along the way tmp.b_page is never initialized before ext4_update_bh_state() checks its value. [ Use the approach suggested by Kees Cook of initializing the whole bh structure.] Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>