summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)Author
2017-04-14x86/smp: Reduce code duplicationDou Liyang
The CONFIG_X86_32_SMP and CONFIG_X86_64_SMP sections in smp.h contain duplicate defines. Merge them and only put the difference into an #ifdeff'ed section. [ tglx: Massaged changelog ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: jaswinder@infradead.org Link: http://lkml.kernel.org/r/1491734806-15413-1-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/cpu: Keep model defines sorted by model numberAndy Shevchenko
For better maintenance keep it sorted by numeric model ID. Add new lines to seperate model groups. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20170316155045.50389-1-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt/mba: Add schemata file support for MBAVikas Shivappa
Add support to update the MBA bandwidth values for the domains via the schemata file. - Verify that the bandwidth value is valid - Round to the next control step depending on the bandwidth granularity of the hardware - Convert the bandwidth to delay values and write the delay values to the corresponding domain PQOS_MSRs. [ tglx: Massaged changelog ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-9-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt: Make schemata file parsers resource specificVikas Shivappa
The schemata files are the user space interface to update resource controls. The parser is hardwired to support only cache resources, which do not fit the requirements of memory resources. Add a function pointer for a parser to the struct rdt_resource and switch the cache parsing over. [ tglx: Massaged changelog ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-8-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt/mba: Add info directory files for Memory Bandwidth AllocationVikas Shivappa
The files in the info directory for MBA are as follows: num_closids The maximum number of CLOSids available for MBA min_bandwidth The minimum memory bandwidth percentage value bandwidth_gran The granularity of the bandwidth control in percent for the particular CPU SKU. Intermediate values entered are rounded off to the previous control step available. Available bandwidth control steps are minimum_bandwidth + N * bandwidth_gran. delay_linear When set, the OS writes a linear percentage based value to the control MSRs ranging from minimum_bandwidth to 100 percent. This value is informational and has no influence on the values written to the schemata files. The values written to the schemata are always bandwidth percentage that is requested. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-7-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt: Make information files resource specificVikas Shivappa
Cache allocation and memory bandwidth allocation require different information files in the resctrl/info directory, but the current implementation does not allow to have files per resource. Add the necessary fields to the resource struct and assign the files dynamically depending on the resource type. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-6-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)Vikas Shivappa
The MBA feature details like minimum bandwidth supported, bandwidth granularity etc are obtained via executing CPUID with EAX=10H ,ECX=3. Setup and initialize the MBA specific extensions to data structures like global list of RDT resources, RDT resource structure and RDT domain structure. [ tglx: Split out the seperate structure and the CBM related parts ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-5-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt/mba: Memory bandwith allocation feature detectVikas Shivappa
Detect MBA feature if CPUID.(EAX=10H, ECX=0):EBX.L2[bit 3] = 1. Add supporting data structures to detect feature details which is done in later patch using CPUID with EAX=10H, ECX= 3. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-4-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14x86/intel_rdt: Add resource specific msr update functionThomas Gleixner
Updating of Cache and Memory bandwidth QOS MSRs is different. Add a function pointer to struct rdt_resource and convert the cache part over. Based on Vikas all in one patch^Wmess. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com
2017-04-14x86/intel_rdt: Move CBM specific data into a structThomas Gleixner
Memory bandwidth allocation requires different information than cache allocation. To avoid a lump of data in struct rdt_resource, move all cache related information into a seperate structure and add that to struct rdt_resource. Sanitize the data types while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com
2017-04-14x86/intel_rdt: Cleanup namespace to support multiple resource typesVikas Shivappa
Lot of data structures and functions are named after cache specific resources(named after cbm, cache etc). In many cases other non cache resources may need to share the same data structures/functions. Generalize such naming to prepare to add more resources like memory bandwidth. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Link: http://lkml.kernel.org/r/1491611637-20417-3-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14x86/unwind: Silence entry-related warningsJosh Poimboeuf
A few people have reported unwinder warnings like the following: WARNING: kernel stack frame pointer at ffffc90000fe7ff0 in rsync:1157 has bad value (null) unwind stack type:0 next_sp: (null) mask:2 graph_idx:0 ffffc90000fe7f98: ffffc90000fe7ff0 (0xffffc90000fe7ff0) ffffc90000fe7fa0: ffffffffb7000f56 (trace_hardirqs_off_thunk+0x1a/0x1c) ffffc90000fe7fa8: 0000000000000246 (0x246) ffffc90000fe7fb0: 0000000000000000 ... ffffc90000fe7fc0: 00007ffe3af639bc (0x7ffe3af639bc) ffffc90000fe7fc8: 0000000000000006 (0x6) ffffc90000fe7fd0: 00007f80af433fc5 (0x7f80af433fc5) ffffc90000fe7fd8: 00007ffe3af638e0 (0x7ffe3af638e0) ffffc90000fe7fe0: 00007ffe3af638e0 (0x7ffe3af638e0) ffffc90000fe7fe8: 00007ffe3af63970 (0x7ffe3af63970) ffffc90000fe7ff0: 0000000000000000 ... ffffc90000fe7ff8: ffffffffb7b74b9a (entry_SYSCALL_64_after_swapgs+0x17/0x4f) This warning can happen when unwinding a code path where an interrupt occurred in x86 entry code before it set up the first stack frame. Silently ignore any warnings for this case. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: c32c47c68a0a ("x86/unwind: Warn on bad frame pointer") Link: http://lkml.kernel.org/r/dbd6838826466a60dc23a52098185bc973ce2f1e.1492020577.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14x86/unwind: Read stack return address in update_stack_state()Josh Poimboeuf
Instead of reading the return address when unwind_get_return_address() is called, read it from update_stack_state() and store it in the unwind state. This enables the next patch to check the return address from unwind_next_frame() so it can detect an entry code frame. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/af0c5e4560c49c0343dca486ea26c4fa92bc4e35.1492020577.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-13platform/x86: intel_scu_ipc: Introduce intel_scu_ipc_raw_command()Andy Shevchenko
A new call to SCU intel_scu_ipc_raw_command() writes SPTR and DPTR registers before sending a command. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-04-12Merge branch 'for-4.11/libnvdimm' into for-4.12/daxDan Williams
2017-04-12x86, pmem: fix broken __copy_user_nocache cache-bypass assumptionsDan Williams
Before we rework the "pmem api" to stop abusing __copy_user_nocache() for memcpy_to_pmem() we need to fix cases where we may strand dirty data in the cpu cache. The problem occurs when copy_from_iter_pmem() is used for arbitrary data transfers from userspace. There is no guarantee that these transfers, performed by dax_iomap_actor(), will have aligned destinations or aligned transfer lengths. Backstop the usage __copy_user_nocache() with explicit cache management in these unaligned cases. Yes, copy_from_iter_pmem() is now too big for an inline, but addressing that is saved for a later patch that moves the entirety of the "pmem api" into the pmem driver directly. Fixes: 5de490daec8b ("pmem: add copy_from_iter_pmem() and clear_pmem()") Cc: <stable@vger.kernel.org> Cc: <x86@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-04-12KVM: x86: new irqchip mode KVM_IRQCHIP_INIT_IN_PROGRESSDavid Hildenbrand
Let's add a new mode and set it while we create the irqchip via KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP. This mode will be used later to test if adding routes (in kvm_set_routing_entry()) is already allowed. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12kprobes/x86: Make boostable flag booleanMasami Hiramatsu
Make arch_specific_insn.boostable to boolean, since it has only 2 states, boostable or not. So it is better to use boolean from the viewpoint of code readability. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076368566.22469.6322906866458231844.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Merge branch 'x86/urgent' into x86/cpu, to resolve conflictIngo Molnar
Conflicts: arch/x86/kernel/cpu/intel_rdt_schemata.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Merge branch 'x86/boot' into x86/mm, to avoid conflictIngo Molnar
There's a conflict between ongoing level-5 paging support and the E820 rewrite. Since the E820 rewrite is essentially ready, merge it into x86/mm to reduce tree conflicts. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Merge branch 'WIP.x86/boot' into x86/boot, to pick up ready branchIngo Molnar
The E820 rework in WIP.x86/boot has gone through a couple of weeks of exposure in -tip, merge it in a wider fashion. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Backmerge tag 'v4.11-rc6' into drm-nextDave Airlie
Linux 4.11-rc6 drm-misc needs 4.11-rc5, may as well fix conflicts with rc6.
2017-04-10x86/intel_rdt: Add cpus_list rdtgroup fileJiri Olsa
The resource control filesystem provides only a bitmask based cpus file for assigning CPUs to a resource group. That's cumbersome with large cpumasks and non-intuitive when modifying the file from the command line. Range based cpu lists are commonly used along with bitmask based cpu files in various subsystems throughout the kernel. Add 'cpus_list' file which is CPU range based. # cd /sys/fs/resctrl/ # echo 1-10 > krava/cpus_list # cat krava/cpus_list 1-10 # cat krava/cpus 0007fe # cat cpus fffff9 # cat cpus_list 0,3-23 [ tglx: Massaged changelog and replaced "bitmask lists" by "CPU ranges" ] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Shaohua Li <shli@fb.com> Link: http://lkml.kernel.org/r/20170410145232.GF25354@krava Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-10x86/intel_rdt: Cleanup kernel-docThomas Gleixner
The kernel-doc is inconsistently formatted. Fix it up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
2017-04-10x86/vdso: Plug race between mapping and ELF header setupThomas Gleixner
The vsyscall32 sysctl can racy against a concurrent fork when it switches from disabled to enabled: arch_setup_additional_pages() if (vdso32_enabled) --> No mapping sysctl.vsysscall32() --> vdso32_enabled = true create_elf_tables() ARCH_DLINFO_IA32 if (vdso32_enabled) { --> Add VDSO entry with NULL pointer Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for the newly forked process or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-07kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET publicPaolo Bonzini
Its value has never changed; we might as well make it part of the ABI instead of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO). Because PPC does not always make MMIO available, the code has to be made dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-07KVM: nVMX: support RDRAND and RDSEED exitingPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-07KVM: VMX: add missing exit reasonsPaolo Bonzini
In order to simplify adding exit reasons in the future, the array of exit reason names is now also sorted by exit reason code. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-07kvm: nVMX: support EPT accessed/dirty bitsPaolo Bonzini
Now use bit 6 of EPTP to optionally enable A/D bits for EPTP. Another thing to change is that, when EPT accessed and dirty bits are not in use, VMX treats accesses to guest paging structures as data reads. When they are in use (bit 6 of EPTP is set), they are treated as writes and the corresponding EPT dirty bit is set. The MMU didn't know this detail, so this patch adds it. We also have to fix up the exit qualification. It may be wrong because KVM sets bit 6 but the guest might not. L1 emulates EPT A/D bits using write permissions, so in principle it may be possible for EPT A/D bits to be used by L1 even though not available in hardware. The problem is that guest page-table walks will be treated as reads rather than writes, so they would not cause an EPT violation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Fixed typo in walk_addr_generic() comment and changed bit clear + conditional-set pattern in handle_ept_violation() to conditional-clear] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-06Merge commit 'b4fb8f66f1ae2e167d06c12d018025a8d4d3ba7e' into uaccess.ia64Al Viro
backmerge of mainline ia64 fix
2017-04-06Merge commit 'fc69910f329d' into uaccess.mipsAl Viro
backmerge of a build fix from mainline
2017-04-05x86/intel_rdt: Update schemata read to show data in tabular formatVikas Shivappa
The schemata file displays data from different resources on all domains. Its cumbersome to read since they are not tabular and data/names could be of different widths. Make the schemata file to display data in a tabular format thereby making it nice and simple to read. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: vikas.shivappa@intel.com Cc: h.peter.anvin@intel.com Link: http://lkml.kernel.org/r/1491255857-17213-4-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-05x86/intel_rdt: Implement "update" mode when writing schemata fileTony Luck
The schemata file can have multiple lines and it is cumbersome to update all lines. Remove code that requires that the user provides values for every resource (in the right order). If the user provides values for just a few resources, update them and leave the rest unchanged. Side benefit: we now check which values were updated and only send IPIs to cpus that actually have updates. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: ravi.v.shankar@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: vikas.shivappa@intel.com Cc: h.peter.anvin@intel.com Link: http://lkml.kernel.org/r/1491255857-17213-3-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-05crypto: glue_helper - remove the le128_gf128mul_x_ble functionOndrej Mosnáček
The le128_gf128mul_x_ble function in glue_helper.h is now obsolete and can be replaced with the gf128mul_x_ble function from gf128mul.h. Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Reviewd-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-04usercopy: Move enum for arch_within_stack_frames()Sahara
This patch moves the arch_within_stack_frames() return value enum up in the header files so that per-architecture implementations can reuse the same return values. Signed-off-by: Sahara <keun-o.park@darkmatter.ae> Signed-off-by: James Morse <james.morse@arm.com> [kees: adjusted naming and commit log] Signed-off-by: Kees Cook <keescook@chromium.org>
2017-04-04x86/espfix: Add support for 5-level pagingKirill A. Shutemov
We don't need extra virtual address space for ESPFIX, so it stays within one PUD page table for both 4- and 5-level paging. Redefining ESPFIX_BASE_ADDR using P4D_SHIFT instead of PGDIR_SHIFT would make it stay in the same place regarding of paging mode. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-8-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-04x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=yKirill A. Shutemov
Extends pagetable headers to support the new paging mode. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-6-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-04x86/paravirt: Add 5-level support to the paravirt codeKirill A. Shutemov
Add operations to allocate/release p4ds. Xen requires more work. We will need to come back to it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-5-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-04x86/mm: Define virtual memory map for 5-level pagingKirill A. Shutemov
The first part of memory map (up to %esp fixup) simply scales existing map for 4-level paging by factor of 9 -- number of bits addressed by the additional page table level. The rest of the map is unchanged. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-04x86/boot: Detect 5-level paging supportKirill A. Shutemov
In this initial implementation we force-require 5-level paging support from the hardware, when compiled with CONFIG_X86_5LEVEL=y. (The kernel will panic during boot on CPUs that don't support 5-level paging.) We will implement boot-time switch between 4- and 5-level paging later. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-03Merge tag 'v4.11-rc5' into x86/mm, to refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "This update provides: - prevent KASLR from randomizing EFI regions - restrict the usage of -maccumulate-outgoing-args and document when and why it is required. - make the Global Physical Address calculation for UV4 systems work correctly. - address a copy->paste->forgot-edit problem in the MCE exception table entries. - assign a name to AMD MCA bank 3, so the sysfs file registration works. - add a missing include in the boot code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Include missing header file x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs x86/build: Mostly disable '-maccumulate-outgoing-args' x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization x86/mce: Fix copy/paste error in exception table entries x86/platform/uv: Fix calculation of Global Physical Address
2017-04-02Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "This update provides: - make the scheduler clock switch to unstable mode smooth so the timestamps stay at microseconds granularity instead of switching to tick granularity. - unbreak perf test tsc by taking the new offset into account which was added in order to proveide better sched clock continuity - switching sched clock to unstable mode runs all clock related computations which affect the sched clock output itself from a work queue. In case of preemption sched clock uses half updated data and provides wrong timestamps. Keep the math in the protected context and delegate only the static key switch to workqueue context. - remove a duplicate header include" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/headers: Remove duplicate #include <linux/sched/debug.h> line sched/clock: Fix broken stable to unstable transfer sched/clock, x86/perf: Fix "perf test tsc" sched/clock: Fix clear_sched_clock_stable() preempt wobbly
2017-04-02Merge branch 'parisc-4.11-3' of ↵Al Viro
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux into uaccess.parisc
2017-03-30debug: Add _ONCE() logic to report_bug()Peter Zijlstra
Josh suggested moving the _ONCE logic inside the trap handler, using a bit in the bug_entry::flags field, avoiding the need for the extra variable. Sadly this only works for WARN_ON_ONCE(), since the others have printk() statements prior to triggering the trap. Still, this saves a fair amount of text and some data: text data filename 10682460 4530992 defconfig-build/vmlinux.orig 10665111 4530096 defconfig-build/vmlinux.patched Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30locking/atomic: Fix atomic_try_cmpxchg() semanticsPeter Zijlstra
Dmitry noted that the new atomic_try_cmpxchg() primitive is broken when the old pointer doesn't point to the local stack. He writes: "Consider a classical lock-free stack push: node->next = atomic_read(&head); do { } while (!atomic_try_cmpxchg(&head, &node->next, node)); This code is broken with the current implementation, the problem is with unconditional update of *__po. In case of success it writes the same value back into *__po, but in case of cmpxchg success we might have lose ownership of some memory locations and potentially over what __po has pointed to. The same holds for the re-read of *__po. " He also points out that this makes it surprisingly different from the similar C/C++ atomic operation. After investigating the code-gen differences caused by this patch; and a number of alternatives (Linus dislikes this interface lots), we arrived at these results (size x86_64-defconfig/vmlinux): GCC-6.3.0: 10735757 cmpxchg 10726413 try_cmpxchg 10730509 try_cmpxchg + patch 10730445 try_cmpxchg-linus GCC-7 (20170327): 10709514 cmpxchg 10704266 try_cmpxchg 10704266 try_cmpxchg + patch 10704394 try_cmpxchg-linus From this we see that the patch has the advantage of better code-gen on GCC-7 and keeps the interface roughly consistent with the C language variant. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: a9ebf306f52c ("locking/atomic: Introduce atomic_try_cmpxchg()") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/debug: Define BUG() again for !CONFIG_BUGArnd Bergmann
The latest change to the BUG() macro inadvertently reverted the earlier commit: b06dd879f5db ("x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG") ... that sanitized the behavior with CONFIG_BUG=n. I noticed this as some warnings have appeared again that were previously fixed as a side effect of that patch: kernel/seccomp.c: In function '__seccomp_filter': kernel/seccomp.c:670:1: error: no return statement in function returning non-void [-Werror=return-type] ... This combines the two patches and uses the ud2 macro to define BUG() in case of CONFIG_BUG=n. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 9a93848fe787 ("x86/debug: Implement __WARN() using UD0") Link: http://lkml.kernel.org/r/20170329211646.2707365-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30Merge branch 'x86/cpu' into x86/mm, before applying dependent patchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-29x86: switch to RAW_COPY_USERAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>