summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2015-04-15Merge branch 'exec_domain_rip_v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc Pull exec domain removal from Richard Weinberger: "This series removes execution domain support from Linux. The idea behind exec domains was to support different ABIs. The feature was never complete nor stable. Let's rip it out and make the kernel signal handling code less complicated" * 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits) arm64: Removed unused variable sparc: Fix execution domain removal Remove rest of exec domains. arch: Remove exec_domain from remaining archs arc: Remove signal translation and exec_domain xtensa: Remove signal translation and exec_domain xtensa: Autogenerate offsets in struct thread_info x86: Remove signal translation and exec_domain unicore32: Remove signal translation and exec_domain um: Remove signal translation and exec_domain tile: Remove signal translation and exec_domain sparc: Remove signal translation and exec_domain sh: Remove signal translation and exec_domain s390: Remove signal translation and exec_domain mn10300: Remove signal translation and exec_domain microblaze: Remove signal translation and exec_domain m68k: Remove signal translation and exec_domain m32r: Remove signal translation and exec_domain m32r: Autogenerate offsets in struct thread_info frv: Remove signal translation and exec_domain ...
2015-04-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...
2015-04-14x86: expose number of page table levels on Kconfig levelKirill A. Shutemov
We would want to use number of page table level to define mm_struct. Let's expose it as CONFIG_PGTABLE_LEVELS. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14watchdog: introduce the hardlockup_detector_disable() functionUlrich Obergfell
Have kvm_guest_init() use hardlockup_detector_disable() instead of watchdog_enable_hardlockup_detector(false). Remove the watchdog_hardlockup_detector_is_enabled() and the watchdog_enable_hardlockup_detector() function which are no longer needed. Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Core kernel changes: - One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively. (Right now it's limited to root-only, but in the future we might allow unprivileged use as well.) (Alexei Starovoitov) - Another non-trivial feature is per event clockid support: this allows, amongst other things, the selection of different clock sources for event timestamps traced via perf. This feature is sought by people who'd like to merge perf generated events with external events that were measured with different clocks: - cluster wide profiling - for system wide tracing with user-space events, - JIT profiling events etc. Matching perf tooling support is added as well, available via the -k, --clockid <clockid> parameter to perf record et al. (Peter Zijlstra) Hardware enablement kernel changes: - x86 Intel Processor Trace (PT) support: which is a hardware tracer on steroids, available on Broadwell CPUs. The hardware trace stream is directly output into the user-space ring-buffer, using the 'AUX' data format extension that was added to the perf core to support hardware constraints such as the necessity to have the tracing buffer physically contiguous. This patch-set was developed for two years and this is the result. A simple way to make use of this is to use BTS tracing, the PT driver emulates BTS output - available via the 'intel_bts' PMU. More explicit PT specific tooling support is in the works as well - will probably be ready by 4.2. (Alexander Shishkin, Peter Zijlstra) - x86 Intel Cache QoS Monitoring (CQM) support: this is a hardware feature of Intel Xeon CPUs that allows the measurement and allocation/partitioning of caches to individual workloads. These kernel changes expose the measurement side as a new PMU driver, which exposes various QoS related PMU events. (The partitioning change is work in progress and is planned to be merged as a cgroup extension.) (Matt Fleming, Peter Zijlstra; CPU feature detection by Peter P Waskiewicz Jr) - x86 Intel Haswell LBR call stack support: this is a new Haswell feature that allows the hardware recording of call chains, plus tooling support. To activate this feature you have to enable it via the new 'lbr' call-graph recording option: perf record --call-graph lbr perf report or: perf top --call-graph lbr This hardware feature is a lot faster than stack walk or dwarf based unwinding, but has some limitations: - It reuses the current LBR facility, so LBR call stack and branch record can not be enabled at the same time. - It is only available for user-space callchains. (Yan, Zheng) - x86 Intel Broadwell CPU support and various event constraints and event table fixes for earlier models. (Andi Kleen) - x86 Intel HT CPUs event scheduling workarounds. This is a complex CPU bug affecting the SNB,IVB,HSW families that results in counter value corruption. The mitigation code is automatically enabled and is transparent. (Maria Dimakopoulou, Stephane Eranian) The perf tooling side had a ton of changes in this cycle as well, so I'm only able to list the user visible changes here, in addition to the tooling changes outlined above: User visible changes affecting all tools: - Improve support of compressed kernel modules (Jiri Olsa) - Save DSO loading errno to better report errors (Arnaldo Carvalho de Melo) - Bash completion for subcommands (Yunlong Song) - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa) - Support missing -f to override perf.data file ownership. (Yunlong Song) - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo) User visible changes in individual tools: 'perf data': New tool for converting perf.data to other formats, initially for the CTF (Common Trace Format) from LTTng (Jiri Olsa, Sebastian Siewior) 'perf diff': Add --kallsyms option (David Ahern) 'perf list': Allow listing events with 'tracepoint' prefix (Yunlong Song) Sort the output of the command (Yunlong Song) 'perf kmem': Respect -i option (Jiri Olsa) Print big numbers using thousands' group (Namhyung Kim) Allow -v option (Namhyung Kim) Fix alignment of slab result table (Namhyung Kim) 'perf probe': Support multiple probes on different binaries on the same command line (Masami Hiramatsu) Support unnamed union/structure members data collection. (Masami Hiramatsu) Check kprobes blacklist when adding new events. (Masami Hiramatsu) 'perf record': Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra) Support recording running/enabled time (Andi Kleen) 'perf sched': Improve the performance of 'perf sched replay' on high CPU core count machines (Yunlong Song) 'perf report' and 'perf top': Allow annotating entries in callchains in the hists browser (Arnaldo Carvalho de Melo) Indicate which callchain entries are annotated in the TUI hists browser (Arnaldo Carvalho de Melo) Add pid/tid filtering to 'report' and 'script' commands (David Ahern) Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT events (Arnaldo Carvalho de Melo) 'perf stat': Report unsupported events properly (Suzuki K. Poulose) Output running time and run/enabled ratio in CSV mode (Andi Kleen) 'perf trace': Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo) Only insert blank duration bracket when tracing syscalls (Arnaldo Carvalho de Melo) Filter out the trace pid when no threads are specified (Arnaldo Carvalho de Melo) Dump stack on segfaults (Arnaldo Carvalho de Melo) No need to explicitely enable evsels for workload started from perf, let it be enabled via perf_event_attr.enable_on_exec, removing some events that take place in the 'perf trace' before a workload is really started by it. (Arnaldo Carvalho de Melo) Allow mixing with tracepoints and suppressing plain syscalls. (Arnaldo Carvalho de Melo) There's also been a ton of infrastructure work done, such as the split-out of perf's build system into tools/build/ and other changes - see the shortlog and changelog for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (358 commits) perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init() perf evlist: Fix type for references to data_head/tail perf probe: Check the orphaned -x option perf probe: Support multiple probes on different binaries perf buildid-list: Fix segfault when show DSOs with hits perf tools: Fix cross-endian analysis perf tools: Fix error path to do closedir() when synthesizing threads perf tools: Fix synthesizing fork_event.ppid for non-main thread perf tools: Add 'I' event modifier for exclude_idle bit perf report: Don't call map__kmap if map is NULL. perf tests: Fix attr tests perf probe: Fix ARM 32 building error perf tools: Merge all perf_event_attr print functions perf record: Add clockid parameter perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10 perf sched replay: Support using -f to override perf.data file ownership perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task perf sched replay: Fix the segmentation fault problem caused by pr_err in threads perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations ...
2015-04-14Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull NOHZ changes from Ingo Molnar: "This tree adds full dynticks support to KVM guests (support the disabling of the timer tick on the guest). The main missing piece was the recognition of guest execution as RCU extended quiescent state and related changes" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest context_tracking: Export context_tracking_user_enter/exit context_tracking: Run vtime_user_enter/exit only when state == CONTEXT_USER context_tracking: Add stub context_tracking_is_enabled context_tracking: Generalize context tracking APIs to support user and guest context_tracking: Rename context symbols to prepare for transition state ppc: Remove unused cpp symbols in kvm headers
2015-04-14Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "The main changes in this cycle were: - changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - add in-kernel API to enable and disable expediting of normal RCU grace periods. - improve RCU's handling of (hotplug-) outgoing CPUs. - NO_HZ_FULL_SYSIDLE fixes. - tiny-RCU updates to make it more tiny. - documentation updates. - miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well cpu: Defer smpboot kthread unparking until CPU known to scheduler rcu: Associate quiescent-state reports with grace period rcu: Yet another fix for preemption and CPU hotplug rcu: Add diagnostics to grace-period cleanup rcutorture: Default to grace-period-initialization delays rcu: Handle outgoing CPUs on exit from idle loop cpu: Make CPU-offline idle-loop transition point more precise rcu: Eliminate ->onoff_mutex from rcu_node structure rcu: Process offlining and onlining only at grace-period start rcu: Move rcu_report_unblock_qs_rnp() to common code rcu: Rework preemptible expedited bitmask handling rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs rcutorture: Enable slow grace-period initializations rcu: Provide diagnostic option to slow down grace-period initialization rcu: Detect stalls caused by failure to propagate up rcu_node tree rcu: Eliminate empty HOTPLUG_CPU ifdef rcu: Simplify sync_rcu_preempt_exp_init() rcu: Put all orphan-callback-related code under same comment rcu: Consolidate offline-CPU callback initialization ...
2015-04-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "Usual trivial tree updates. Nothing outstanding -- mostly printk() and comment fixes and unused identifier removals" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: goldfish: goldfish_tty_probe() is not using 'i' any more powerpc: Fix comment in smu.h qla2xxx: Fix printks in ql_log message lib: correct link to the original source for div64_u64 si2168, tda10071, m88ds3103: Fix firmware wording usb: storage: Fix printk in isd200_log_config() qla2xxx: Fix printk in qla25xx_setup_mode init/main: fix reset_device comment ipwireless: missing assignment goldfish: remove unreachable line of code coredump: Fix do_coredump() comment stacktrace.h: remove duplicate declaration task_struct smpboot.h: Remove unused function prototype treewide: Fix typo in printk messages treewide: Fix typo in printk messages mod_devicetable: fix comment for match_flags
2015-04-13Merge branch 'x86-ras-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS changes from Ingo Molnar: "The main changes in this cycle were: - Simplify the CMCI storm logic on Intel CPUs after yet another report about a race in the code (Borislav Petkov) - Enable the MCE threshold irq on AMD CPUs by default (Aravind Gopalakrishnan) - Add AMD-specific MCE-severity grading function. Further error recovery actions will be based on its output (Aravind Gopalakrishnan) - Documentation updates (Borislav Petkov) - ... assorted fixes and cleanups" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/severity: Fix warning about indented braces x86/mce: Define mce_severity function pointer x86/mce: Add an AMD severities-grading function x86/mce: Reindent __mcheck_cpu_apply_quirks() properly x86/mce: Use safe MSR accesses for AMD quirk x86/MCE/AMD: Enable thresholding interrupts by default if supported x86/MCE: Make mce_panic() fatal machine check msg in the same pattern x86/MCE/intel: Cleanup CMCI storm logic Documentation/acpi/einj: Correct and streamline text x86/MCE/AMD: Drop bogus const modifier from AMD's bank4_names()
2015-04-13Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "The main changes in this cycle were: - reduce the x86/32 PAE per task PGD allocation overhead from 4K to 0.032k (Fenghua Yu) - early_ioremap/memunmap() usage cleanups (Juergen Gross) - gbpages support cleanups (Luis R Rodriguez) - improve AMD Bulldozer (family 0x15) ASLR I$ aliasing workaround to increase randomization by 3 bits (per bootup) (Hector Marco-Gisbert) - misc fixlets" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Improve AMD Bulldozer ASLR workaround x86/mm/pat: Initialize __cachemode2pte_tbl[] and __pte2cachemode_tbl[] in a bit more readable fashion init.h: Clean up the __setup()/early_param() macros x86/mm: Simplify probe_page_size_mask() x86/mm: Further simplify 1 GB kernel linear mappings handling x86/mm: Use early_param_on_off() for direct_gbpages init.h: Add early_param_on_off() x86/mm: Simplify enabling direct_gbpages x86/mm: Use IS_ENABLED() for direct_gbpages x86/mm: Unexport set_memory_ro() and set_memory_rw() x86/mm, efi: Use early_ioremap() in arch/x86/platform/efi/efi-bgrt.c x86/mm: Use early_memunmap() instead of early_iounmap() x86/mm/pat: Ensure different messages in STRICT_DEVMEM and PAT cases x86/mm: Reduce PAE-mode per task pgd allocation overhead from 4K to 32 bytes
2015-04-13Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode changes from Ingo Molnar: "Microcode driver updates: mostly cleanups but also some fixes (Borislav Petkov)" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/amd: Drop the pci_ids.h dependency x86/microcode/intel: Fix printing of microcode blobs in show_saved_mc() x86/microcode/intel: Check scan_microcode()'s retval x86/microcode/intel: Sanitize microcode_pointer() x86/microcode/intel: Move mc arg last in get_matching_{microcode|sig} x86/microcode/intel: Simplify generic_load_microcode_early() x86/microcode: Consolidate family,model, ... code x86/microcode/intel: Rename update_match_revision() x86/microcode/intel: Sanitize _save_mc() x86/microcode/intel: Make _save_mc() return the updated saved count x86/microcode/intel: Simplify load_ucode_intel_bsp() x86/microcode/intel: Get rid of last arg to load_ucode_intel_bsp() x86/microcode/intel: Do the mc_saved_src NULL check first x86/microcode/intel: Check if microcode was found before applying x86/microcode/intel: Fix out of bounds memory access to the extended header
2015-04-13Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu changes from Ingo Molnar: "Various x86 FPU handling cleanups, refactorings and fixes (Borislav Petkov, Oleg Nesterov, Rik van Riel)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fpu: Kill eager_fpu_init_bp() x86/fpu: Don't allocate fpu->state for swapper/0 x86/fpu: Rename drop_init_fpu() to fpu_reset_state() x86/fpu: Fold __drop_fpu() into its sole user x86/fpu: Don't abuse drop_init_fpu() in flush_thread() x86/fpu: Use restore_init_xstate() instead of math_state_restore() on kthread exec x86/fpu: Introduce restore_init_xstate() x86/fpu: Document user_fpu_begin() x86/fpu: Factor out memset(xstate, 0) in fpu_finit() paths x86/fpu: Change xstateregs_get()/set() to use ->xsave.i387 rather than ->fxsave x86/fpu: Don't abuse FPU in kernel threads if use_eager_fpu() x86/fpu: Always allow FPU in interrupt if use_eager_fpu() x86/fpu: __kernel_fpu_begin() should clear fpu_owner_task even if use_eager_fpu() x86/fpu: Also check fpu_lazy_restore() when use_eager_fpu() x86/fpu: Use task_disable_lazy_fpu_restore() helper x86/fpu: Use an explicit if/else in switch_fpu_prepare() x86/fpu: Introduce task_disable_lazy_fpu_restore() helper x86/fpu: Move lazy restore functions up a few lines x86/fpu: Change math_error() to use unlazy_fpu(), kill (now) unused save_init_fpu() x86/fpu: Don't do __thread_fpu_end() if use_eager_fpu() ...
2015-04-13Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 debug changes from Ingo Molnar: "Stack printing fixlets" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kernel: Use kstack_end() in dumpstack_64.c x86/kernel: Fix output of show_stack_log_lvl()
2015-04-13Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cacheinfo sysfs changes from Ingo Molnar: "This tree converts the x86 cacheinfo sysfs code to use the generic code in drivers/base/cacheinfo.c. It's not intended to change the sysfs ABI: 'This patch neither alters any existing sysfs entries nor their formating, however since the generic cacheinfo has switched to use the device attributes instead of the traditional raw kobjects, a directory named 'power' along with its standard attributes are added similar to any other device'" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/cacheinfo: Fix cache_get_priv_group() for Intel processors x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure
2015-04-13Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Various cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/iommu: Fix header comments regarding standard and _FINISH macros x86/earlyprintk: Put CONFIG_PCI-only functions under the #ifdef x86: Fix up obsolete __cpu_set() function usage
2015-04-13Merge branch 'x86-build-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build changes from Ingo Molnar: "Small cleanups and fixes" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Cleanup KEXEC_VERIFY_SIG Kconfig help text x86/build/defconfig: Enable USB_EHCI_TT_NEWSCHED=y x86/build: Fix mkcapflags.sh bash-ism x86/Kconfig: Simplify X86_UP_APIC handling x86/Kconfig: Simplify X86_IO_APIC dependencies x86/Kconfig: Avoid issuing pointless turned off entries to .config
2015-04-13Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot changes from Ingo Molnar: "A number of cleanups" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Standardize strcmp() x86/boot/64: Remove pointless early_printk() message x86/boot/video: Move the 'video_segment' variable to video.c
2015-04-13Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm changes from Ingo Molnar: "There were lots of changes in this development cycle: - over 100 separate cleanups, restructuring changes, speedups and fixes in the x86 system call, irq, trap and other entry code, part of a heroic effort to deobfuscate a decade old spaghetti asm code and its C code dependencies (Denys Vlasenko, Andy Lutomirski) - alternatives code fixes and enhancements (Borislav Petkov) - simplifications and cleanups to the compat code (Brian Gerst) - signal handling fixes and new x86 testcases (Andy Lutomirski) - various other fixes and cleanups By their nature many of these changes are risky - we tried to test them well on many different x86 systems (there are no known regressions), and they are split up finely to help bisection - but there's still a fair bit of residual risk left so caveat emptor" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (148 commits) perf/x86/64: Report regs_user->ax too in get_regs_user() perf/x86/64: Simplify regs_user->abi setting code in get_regs_user() perf/x86/64: Do report user_regs->cx while we are in syscall, in get_regs_user() perf/x86/64: Do not guess user_regs->cs, ss, sp in get_regs_user() x86/asm/entry/32: Tidy up JNZ instructions after TESTs x86/asm/entry/64: Reduce padding in execve stubs x86/asm/entry/64: Remove GET_THREAD_INFO() in ret_from_fork x86/asm/entry/64: Simplify jumps in ret_from_fork x86/asm/entry/64: Remove a redundant jump x86/asm/entry/64: Optimize [v]fork/clone stubs x86/asm/entry: Zero EXTRA_REGS for stub32_execve() too x86/asm/entry/64: Move stub_x32_execvecloser() to stub_execveat() x86/asm/entry/64: Use common code for rt_sigreturn() epilogue x86/asm/entry/64: Add forgotten CFI annotation x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layout x86/asm/entry/64: Move opportunistic sysret code to syscall code path x86, selftests: Add sigreturn selftest x86/alternatives: Guard NOPs optimization x86/asm/entry: Clear EXTRA_REGS for all executable formats x86/signal: Remove pax argument from restore_sigcontext ...
2015-04-13Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic changes from Ingo Molnar: "Changes: - SGI UV APIC driver updates - dead code removal" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/uv: Update the UV APIC HUB check x86/apic/uv: Update the UV APIC driver check x86/apic/uv: Update the APIC UV OEM check x86/apic: Remove verify_local_APIC()
2015-04-13Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Ingo Molnar: "The main changes in this cycle were: - clockevents state machine cleanups and enhancements (Viresh Kumar) - clockevents broadcast notifier horror to state machine conversion and related cleanups (Thomas Gleixner, Rafael J Wysocki) - clocksource and timekeeping core updates (John Stultz) - clocksource driver updates and fixes (Ben Dooks, Dmitry Osipenko, Hans de Goede, Laurent Pinchart, Maxime Ripard, Xunlei Pang) - y2038 fixes (Xunlei Pang, John Stultz) - NMI-safe ktime_get_raw_fast() and general refactoring of the clock code, in preparation to perf's per event clock ID support (Peter Zijlstra) - generic sched/clock fixes, optimizations and cleanups (Daniel Thompson) - clockevents cpu_down() race fix (Preeti U Murthy)" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) timers/PM: Drop unnecessary braces from tick_freeze() timers/PM: Fix up tick_unfreeze() timekeeping: Get rid of stale comment clockevents: Cleanup dead cpu explicitely clockevents: Make tick handover explicit clockevents: Remove broadcast oneshot control leftovers sched/idle: Use explicit broadcast oneshot control function ARM: Tegra: Use explicit broadcast oneshot control function ARM: OMAP: Use explicit broadcast oneshot control function intel_idle: Use explicit broadcast oneshot control function ACPI/idle: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast oneshot control function x86/amd/idle, clockevents: Use explicit broadcast oneshot control functions clockevents: Provide explicit broadcast oneshot control functions clockevents: Remove the broadcast control leftovers ARM: OMAP: Use explicit broadcast control function intel_idle: Use explicit broadcast control function cpuidle: Use explicit broadcast control function ACPI/processor: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast control function ...
2015-04-13Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Major changes: - Reworked CPU capacity code, for better SMP load balancing on systems with assymetric CPUs. (Vincent Guittot, Morten Rasmussen) - Reworked RT task SMP balancing to be push based instead of pull based, to reduce latencies on large CPU count systems. (Steven Rostedt) - SCHED_DEADLINE support updates and fixes. (Juri Lelli) - SCHED_DEADLINE task migration support during CPU hotplug. (Wanpeng Li) - x86 mwait-idle optimizations and fixes. (Mike Galbraith, Len Brown) - sched/numa improvements. (Rik van Riel) - various cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) sched/core: Drop debugging leftover trace_printk call sched/deadline: Support DL task migration during CPU hotplug sched/core: Check for available DL bandwidth in cpuset_cpu_inactive() sched/deadline: Always enqueue on previous rq when dl_task_timer() fires sched/core: Remove unused argument from init_[rt|dl]_rq() sched/deadline: Fix rt runtime corruption when dl fails its global constraints sched/deadline: Avoid a superfluous check sched: Improve load balancing in the presence of idle CPUs sched: Optimize freq invariant accounting sched: Move CFS tasks to CPUs with higher capacity sched: Add SD_PREFER_SIBLING for SMT level sched: Remove unused struct sched_group_capacity::capacity_orig sched: Replace capacity_factor by usage sched: Calculate CPU's usage statistic and put it into struct sg_lb_stats::group_usage sched: Add struct rq::cpu_capacity_orig sched: Make scale_rt invariant with frequency sched: Make sched entity usage tracking scale-invariant sched: Remove frequency scaling from cpu_capacity sched: Track group sched_entity usage contributions sched: Add sched_avg::utilization_avg_contrib ...
2015-04-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.1 The most interesting bit here is irqfd/ioeventfd support for ARM and ARM64. Summary: ARM/ARM64: fixes for live migration, irqfd and ioeventfd support (enabling vhost, too), page aging s390: interrupt handling rework, allowing to inject all local interrupts via new ioctl and to get/set the full local irq state for migration and introspection. New ioctls to access memory by virtual address, and to get/set the guest storage keys. SIMD support. MIPS: FPU and MIPS SIMD Architecture (MSA) support. Includes some patches from Ralf Baechle's MIPS tree. x86: bugfixes (notably for pvclock, the others are small) and cleanups. Another small latency improvement for the TSC deadline timer" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits) KVM: use slowpath for cross page cached accesses kvm: mmu: lazy collapse small sptes into large sptes KVM: x86: Clear CR2 on VCPU reset KVM: x86: DR0-DR3 are not clear on reset KVM: x86: BSP in MSR_IA32_APICBASE is writable KVM: x86: simplify kvm_apic_map KVM: x86: avoid logical_map when it is invalid KVM: x86: fix mixed APIC mode broadcast KVM: x86: use MDA for interrupt matching kvm/ppc/mpic: drop unused IRQ_testbit KVM: nVMX: remove unnecessary double caching of MAXPHYADDR KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu KVM: vmx: pass error code with internal error #2 x86: vdso: fix pvclock races with task migration KVM: remove kvm_read_hva and kvm_read_hva_atomic KVM: x86: optimize delivery of TSC deadline timer interrupt KVM: x86: extract blocking logic from __vcpu_run kvm: x86: fix x86 eflags fixed bit KVM: s390: migrate vcpu interrupt state ...
2015-04-12x86: Remove signal translation and exec_domainRichard Weinberger
As execution domain support is gone we can remove signal translation from the signal code and remove exec_domain from thread_info. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-12perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init()Ingo Molnar
Dan Carpenter pointed out that the control flow in pt_pmu_hw_init() is a bit messy: for example the kfree(de_attrs) is entirely superfluous. Another problem is the inconsistent mixing of label based and direct return error handling. Add modern, label based error handling instead and clean up the code a bit as well. Note that we'll still do a kfree(NULL) in the normal case - this does not matter as this is an init path and kfree() returns early if it sees a NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150409090805.GG17605@mwanda Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-11perf/x86/64: Report regs_user->ax too in get_regs_user()Denys Vlasenko
I don't see why we report e.g. orix_ax, which is not always meaningful, but don't report ax, which is meaningful. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-11perf/x86/64: Simplify regs_user->abi setting code in get_regs_user()Denys Vlasenko
user_64bit_mode(regs) basically checks regs->cs to point to a 64-bit segment. This check used to be unreliable here because regs->cs was not always correct in syscalls. Now regs->cs is always correct: in syscalls, in interrupts, in exceptions. No need to emply heuristics here. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-11perf/x86/64: Do report user_regs->cx while we are in syscall, in get_regs_user()Denys Vlasenko
Yes, it is true that cx contains return address. It's not clear why we trash it. Stop doing that. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-11perf/x86/64: Do not guess user_regs->cs, ss, sp in get_regs_user()Denys Vlasenko
After recent changes to syscall entry points, user_regs->{cs,ss,sp} are always correct. (They used to be undefined while in syscalls). We can report them reliably, without guessing. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-11x86/asm/entry/32: Tidy up JNZ instructions after TESTsDenys Vlasenko
After TESTs, use logically correct JNZ mnemonic instead of JNE. This doesn't change code: md5: c3005b39a11fe582b7df7908561ad4ee entry_32.o.before.asm c3005b39a11fe582b7df7908561ad4ee entry_32.o.after.asm Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428689620-21881-1-git-send-email-dvlasenk@redhat.com [ Added object file comparison. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-10x86/apic/uv: Update the UV APIC HUB checkMike Travis
Update the check for UV2000/3000. Note when the HUB is not recognized. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Hedi Berriche <hedi@sgi.com> Acked-by: Dimitri Sivanich <sivanich@sgi.com> Link: http://lkml.kernel.org/r/20150409182629.267239403@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-10x86/apic/uv: Update the UV APIC driver checkMike Travis
Fix a bug in the OEM check function that determines if the system is a UV system and the BIOS is compatible with the kernel's UV apic driver. This prevents some possibly obscure panics and guards the system against being started on SGI hardware that does not have the required kernel support. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Hedi Berriche <hedi@sgi.com> Acked-by: Dimitri Sivanich <sivanich@sgi.com> Link: http://lkml.kernel.org/r/20150409182629.112998930@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-10x86/apic/uv: Update the APIC UV OEM checkMike Travis
Optimize the first "SGI" OEM check to return faster if the system is not an SGI or UV system. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Hedi Berriche <hedi@sgi.com> Acked-by: Dimitri Sivanich <sivanich@sgi.com> Link: http://lkml.kernel.org/r/20150409182628.952357922@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Reduce padding in execve stubsDenys Vlasenko
execve stubs are 7 bytes only. Padding them to 16 bytes is a waste. text data bss dec hex filename 12594 0 0 12594 3132 entry_64.o.before 12530 0 0 12530 30f2 entry_64.o Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-8-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Remove GET_THREAD_INFO() in ret_from_forkDenys Vlasenko
It used to be used to check for _TIF_IA32, but the check has been removed. Remove GET_THREAD_INFO() too. Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-7-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Simplify jumps in ret_from_forkDenys Vlasenko
Replace test jz 1f jmp label 1: with test jnz label Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-6-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Remove a redundant jumpDenys Vlasenko
Jumping to the very next instruction is not very useful: jmp label label: Removing the jump. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Optimize [v]fork/clone stubsDenys Vlasenko
Replace "call func; ret" with "jmp func". Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-4-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry: Zero EXTRA_REGS for stub32_execve() tooDenys Vlasenko
The change which affected how execve clears EXTRA_REGS missed 32-bit execve syscalls. Fix this by using 64-bit execve stub epilogue for them too. Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Move stub_x32_execvecloser() to stub_execveat()Denys Vlasenko
This is a preparatory patch for moving stub32_execve[at]() to this file. It makes sense to have all execve stubs in one place, so that they can reuse code. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09x86/asm/entry/64: Use common code for rt_sigreturn() epilogueDenys Vlasenko
Similarly to stub_execve, we can reuse the epilogue in stub_rt_sigreturn() and stub_x32_rt_sigreturn(). Add a comment explaining why we can't eliminage SAVE_EXTRA_REGS here. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-08x86/asm/entry/64: Add forgotten CFI annotationDenys Vlasenko
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428424967-14460-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-08x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layoutDenys Vlasenko
Interrupt entry points are handled with the following code, each 32-byte code block contains seven entry points: ... [push][jump 22] // 4 bytes [push][jump 18] // 4 bytes [push][jump 14] // 4 bytes [push][jump 10] // 4 bytes [push][jump 6] // 4 bytes [push][jump 2] // 4 bytes [push][jump common_interrupt][padding] // 8 bytes [push][jump] [push][jump] [push][jump] [push][jump] [push][jump] [push][jump] [push][jump common_interrupt][padding] [padding_2] common_interrupt: And there is a table which holds pointers to every entry point, IOW: to every push. In cold cache, two jumps are still costlier than one, even though we get the benefit of them residing in the same cacheline. This change replaces short jumps with near ones to 'common_interrupt', and pads every push+jump pair to 8 bytes. This way, each interrupt takes only one jump. This change replaces ".p2align CONFIG_X86_L1_CACHE_SHIFT" before dispatch table with ".align 8" - we do not need anything stronger than that. The table of entry addresses (the interrupt[] array) is no longer necessary, the address of entries can be easily calculated as (irq_entries_start + i*8). text data bss dec hex filename 12546 0 0 12546 3102 entry_64.o.before 11626 0 0 11626 2d6a entry_64.o The size decrease is because 1656 bytes of .init.rodata are gone. That's initdata, though. The resident size does go up a bit. Run-tested (32 and 64 bits). Acked-and-Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428090553-7283-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-08x86/asm/entry/64: Move opportunistic sysret code to syscall code pathDenys Vlasenko
This change does two things: Copy-pastes "retint_swapgs:" code into syscall handling code, the copy is under "syscall_return:" label. The code is unchanged apart from some label renames. Removes "opportunistic sysret" code from "retint_swapgs:" code block, since now it won't be reached by syscall return. This in fact removes most of the code in question. text data bss dec hex filename 12530 0 0 12530 30f2 entry_64.o.before 12562 0 0 12562 3112 entry_64.o Run-tested. Acked-and-Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427993219-7291-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-08Merge tag 'v4.0-rc7' into x86/asm, to resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-06x86/alternatives: Guard NOPs optimizationBorislav Petkov
Take a look at the first instruction byte before optimizing the NOP - there might be something else there already, like the ALTERNATIVE_2() in rdtsc_barrier() which NOPs out on AMD even though we just patched in an MFENCE. This happens because the alternatives sees X86_FEATURE_MFENCE_RDTSC, AMD CPUs set it, we patch in the MFENCE and right afterwards it sees X86_FEATURE_LFENCE_RDTSC which AMD CPUs don't set and we blindly optimize the NOP. Checking whether at least the first byte is 0x90 prevents that. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1428181662-18020-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-06x86/asm/entry: Clear EXTRA_REGS for all executable formatsDenys Vlasenko
On failure, sys_execve() does not clobber EXTRA_REGS, so we can just return to userpsace without saving/restoring them. On success, ELF_PLAT_INIT() in sys_execve() clears all these registers. On other executable formats: - binfmt_flat.c has similar FLAT_PLAT_INIT, but x86 (and everyone else except sh) doesn't define it. - binfmt_elf_fdpic.c has ELF_FDPIC_PLAT_INIT, but x86 (and most others) doesn't define it. - There are no such hooks in binfmt_aout.c et al. We inherit EXTRA_REGS from the prior executable. This inconsistency was not intended. This change removes SAVE/RESTORE_EXTRA_REGS in stub_execve, removes register clearing in ELF_PLAT_INIT(), and instead simply clears them on success in stub_execve. Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428173719-7637-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-06x86/signal: Remove pax argument from restore_sigcontextBrian Gerst
The 'pax' argument is unnecesary. Instead, store the RAX value directly in regs. This pattern goes all the way back to 2.1.106pre1, when restore_sigcontext() was changed to return an error code instead of EAX directly: https://git.kernel.org/cgit/linux/kernel/git/history/history.git/diff/arch/i386/kernel/signal.c?id=9a8f8b7ca3f319bd668298d447bdf32730e51174 In 2007 sigaltstack syscall support was added, where the return value of restore_sigcontext() was changed to carry the memory-copying failure code. But instead of putting 'ax' into regs->ax directly, it was carried in via a pointer and then returned, where the generic syscall return code copied it to regs->ax. So there was never any deeper reason for this suboptimal pattern, it was simply never noticed after being introduced. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1428152303-17154-1-git-send-email-brgerst@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-04x86/alternatives: Fix ALTERNATIVE_2 padding generation properlyBorislav Petkov
Quentin caught a corner case with the generation of instruction padding in the ALTERNATIVE_2 macro: if len(orig_insn) < len(alt1) < len(alt2), then not enough padding gets added and that is not good(tm) as we could overwrite the beginning of the next instruction. Luckily, at the time of this writing, we don't have ALTERNATIVE_2() invocations which have that problem and even if we did, a simple fix would be to prepend the instructions with enough prefixes so that that corner case doesn't happen. However, best it would be if we fixed it properly. See below for a simple, abstracted example of what we're doing. So what we ended up doing is, we compute the max(len(alt1), len(alt2)) - len(orig_insn) and feed that value to the .skip gas directive. The max() cannot have conditionals due to gas limitations, thus the fancy integer math. With this patch, all ALTERNATIVE_2 sites get padded correctly; generating obscure test cases pass too: #define alt_max_short(a, b) ((a) ^ (((a) ^ (b)) & -(-((a) < (b))))) #define gen_skip(orig, alt1, alt2, marker) \ .skip -((alt_max_short(alt1, alt2) - (orig)) > 0) * \ (alt_max_short(alt1, alt2) - (orig)),marker .pushsection .text, "ax" .globl main main: gen_skip(1, 2, 4, 0x09) gen_skip(4, 1, 2, 0x10) ... .popsection Thanks to Quentin for catching it and double-checking the fix! Reported-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150404133443.GE21152@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: a SYSRET single-stepping fix, a dmi-scan robustization fix, a reboot quirk and a kgdb fixlet" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kgdb/x86: Fix reporting of 'si' in kgdb on x86_64 x86/asm/entry/64: Disable opportunistic SYSRET if regs->flags has TF set x86/reboot: Add ASRock Q1900DC-ITX mainboard reboot quirk MAINTAINERS: Change the x86 microcode loader maintainer firmware: dmi_scan: Prevent dmi_num integer overflow
2015-04-03x86/asm/entry/64: Use a define for an invalid segment selectorBorislav Petkov
... instead of a naked number, for better readability. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428054130-25847-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>