summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2017-09-11x86/cpu: Remove unused and undefined __generic_processor_info() declarationDou Liyang
The following revert: 2b85b3d22920 ("x86/acpi: Restore the order of CPU IDs") ... got rid of __generic_processor_info(), but forgot to remove its declaration in mpspec.h. Remove the declaration and update the comments as well. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1505101403-29100-1-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-07x86/mm: Make the SME mask a u64Borislav Petkov
The SME encryption mask is for masking 64-bit pagetable entries. It being an unsigned long works fine on X86_64 but on 32-bit builds in truncates bits leading to Xen guests crashing very early. And regardless, the whole SME mask handling shouldnt've leaked into 32-bit because SME is X86_64-only feature. So, first make the mask u64. And then, add trivial 32-bit versions of the __sme_* macros so that nothing happens there. Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tom Lendacky <Thomas.Lendacky@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas <Thomas.Lendacky@amd.com> Fixes: 21729f81ce8a ("x86/mm: Provide general kernel support for memory encryption") Link: http://lkml.kernel.org/r/20170907093837.76zojtkgebwtqc74@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-06x86/mm: Document how CR4.PCIDE restore worksAndy Lutomirski
While debugging a problem, I thought that using cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be helpful. It turns out to be counterproductive. Add a comment documenting how this works. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06x86/mm: Reinitialize TLB state on hotplug and resumeAndy Lutomirski
When Linux brings a CPU down and back up, it switches to init_mm and then loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect of masking off the ASID bits in CR3. This can result in some confusion in the TLB handling code. If we bring a CPU down and back up with any ASID other than 0, we end up with the wrong ASID active on the CPU after resume. This could cause our internal state to become corrupt, although major corruption is unlikely because init_mm doesn't have any user pages. More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion in the next context switch. The result of *that* is a failure to resume from suspend with probability 1 - 1/6^(cpus-1). Fix it by reinitializing cpu_tlbstate on resume and CPU bringup. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Jiri Kosina <jikos@kernel.org> Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 4.14: API: - Defer scompress scratch buffer allocation to first use. - Add __crypto_xor that takes separte src and dst operands. - Add ahash multiple registration interface. - Revamped aead/skcipher algif code to fix async IO properly. Drivers: - Add non-SIMD fallback code path on ARM for SVE. - Add AMD Security Processor framework for ccp. - Add support for RSA in ccp. - Add XTS-AES-256 support for CCP version 5. - Add support for PRNG in sun4i-ss. - Add support for DPAA2 in caam. - Add ARTPEC crypto support. - Add Freescale RNGC hwrng support. - Add Microchip / Atmel ECC driver. - Add support for STM32 HASH module" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: af_alg - get_page upon reassignment to TX SGL crypto: cavium/nitrox - Fix an error handling path in 'nitrox_probe()' crypto: inside-secure - fix an error handling path in safexcel_probe() crypto: rockchip - Don't dequeue the request when device is busy crypto: cavium - add release_firmware to all return case crypto: sahara - constify platform_device_id MAINTAINERS: Add ARTPEC crypto maintainer crypto: axis - add ARTPEC-6/7 crypto accelerator driver crypto: hash - add crypto_(un)register_ahashes() dt-bindings: crypto: add ARTPEC crypto crypto: algif_aead - fix comment regarding memory layout crypto: ccp - use dma_mapping_error to check map error lib/mpi: fix build with clang crypto: sahara - Remove leftover from previous used spinlock crypto: sahara - Fix dma unmap direction crypto: af_alg - consolidation of duplicate code crypto: caam - Remove unused dentry members crypto: ccp - select CONFIG_CRYPTO_RSA crypto: ccp - avoid uninitialized variable warning crypto: serpent - improve __serpent_setkey with UBSAN ...
2017-09-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Support ipv6 checksum offload in sunvnet driver, from Shannon Nelson. 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric Dumazet. 3) Allow generic XDP to work on virtual devices, from John Fastabend. 4) Add bpf device maps and XDP_REDIRECT, which can be used to build arbitrary switching frameworks using XDP. From John Fastabend. 5) Remove UFO offloads from the tree, gave us little other than bugs. 6) Remove the IPSEC flow cache, from Florian Westphal. 7) Support ipv6 route offload in mlxsw driver. 8) Support VF representors in bnxt_en, from Sathya Perla. 9) Add support for forward error correction modes to ethtool, from Vidya Sagar Ravipati. 10) Add time filter for packet scheduler action dumping, from Jamal Hadi Salim. 11) Extend the zerocopy sendmsg() used by virtio and tap to regular sockets via MSG_ZEROCOPY. From Willem de Bruijn. 12) Significantly rework value tracking in the BPF verifier, from Edward Cree. 13) Add new jump instructions to eBPF, from Daniel Borkmann. 14) Rework rtnetlink plumbing so that operations can be run without taking the RTNL semaphore. From Florian Westphal. 15) Support XDP in tap driver, from Jason Wang. 16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal. 17) Add Huawei hinic ethernet driver. 18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan Delalande. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits) i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq i40e: avoid NVM acquire deadlock during NVM update drivers: net: xgene: Remove return statement from void function drivers: net: xgene: Configure tx/rx delay for ACPI drivers: net: xgene: Read tx/rx delay for ACPI rocker: fix kcalloc parameter order rds: Fix non-atomic operation on shared flag variable net: sched: don't use GFP_KERNEL under spin lock vhost_net: correctly check tx avail during rx busy polling net: mdio-mux: add mdio_mux parameter to mdio_mux_init() rxrpc: Make service connection lookup always check for retry net: stmmac: Delete dead code for MDIO registration gianfar: Fix Tx flow control deactivation cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6 cxgb4: Fix pause frame count in t4_get_port_stats cxgb4: fix memory leak tun: rename generic_xdp to skb_xdp tun: reserve extra headroom only when XDP is set net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping net: dsa: bcm_sf2: Advertise number of egress queues ...
2017-09-05Merge tag 'acpi-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These include a usual ACPICA code update (this time to upstream revision 20170728), a fix for a boot crash on some systems with Thunderbolt devices connected at boot time, a rework of the handling of PCI bridges when setting up device wakeup, new support for Apple device properties, support for DMA configurations reported via ACPI on ARM64, APEI-related updates, ACPI EC driver updates and assorted minor modifications in several places. Specifics: - Update the ACPICA code in the kernel to upstream revision 20170728 including: * Alias operator handling update (Bob Moore). * Deferred resolution of reference package elements (Bob Moore). * Support for the _DMA method in walk resources (Bob Moore). * Tables handling update and support for deferred table verification (Lv Zheng). * Update of SMMU models for IORT (Robin Murphy). * Compiler and disassembler updates (Alex James, Erik Schmauss, Ganapatrao Kulkarni, James Morse). * Tools updates (Erik Schmauss, Lv Zheng). * Assorted minor fixes and cleanups (Bob Moore, Kees Cook, Lv Zheng, Shao Ming). - Rework the initialization of non-wakeup GPEs with method handlers in order to address a boot crash on some systems with Thunderbolt devices connected at boot time where we miss an early hotplug event due to a delay in GPE enabling (Rafael Wysocki). - Rework the handling of PCI bridges when setting up ACPI-based device wakeup in order to avoid disabling wakeup for bridges prematurely (Rafael Wysocki). - Consolidate Apple DMI checks throughout the tree, add support for Apple device properties to the device properties framework and use these properties for the handling of I2C and SPI devices on Apple systems (Lukas Wunner). - Add support for _DMA to the ACPI-based device properties lookup code and make it possible to use the information from there to configure DMA regions on ARM64 systems (Lorenzo Pieralisi). - Fix several issues in the APEI code, add support for exporting the BERT error region over sysfs and update APEI MAINTAINERS entry with reviewers information (Borislav Petkov, Dongjiu Geng, Loc Ho, Punit Agrawal, Tony Luck, Yazen Ghannam). - Fix a potential initialization ordering issue in the ACPI EC driver and clean it up somewhat (Lv Zheng). - Update the ACPI SPCR driver to extend the existing XGENE 8250 workaround in it to a new platform (m400) and to work around an Xgene UART clock issue (Graeme Gregory). - Add a new utility function to the ACPI core to support using ACPI OEM ID / OEM Table ID / Revision for system identification in blacklisting or similar and switch over the existing code already using this information to this new interface (Toshi Kani). - Fix an xpower PMIC issue related to GPADC reads that always return 0 without extra pin manipulations (Hans de Goede). - Add statements to print debug messages in a couple of places in the ACPI core for easier diagnostics (Rafael Wysocki). - Clean up the ACPI processor driver slightly (Colin Ian King, Hanjun Guo). - Clean up the ACPI x86 boot code somewhat (Andy Shevchenko). - Add a quirk for Dell OptiPlex 9020M to the ACPI backlight driver (Alex Hung). - Assorted fixes, cleanups and updates related to ACPI (Amitoj Kaur Chawla, Bhumika Goyal, Frank Rowand, Jean Delvare, Punit Agrawal, Ronald Tschalär, Sumeet Pawnikar)" * tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (75 commits) ACPI / APEI: Suppress message if HEST not present intel_pstate: convert to use acpi_match_platform_list() ACPI / blacklist: add acpi_match_platform_list() ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources ACPI: make device_attribute const ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region ACPI: APEI: fix the wrong iteration of generic error status block ACPI / processor: make function acpi_processor_check_duplicates() static ACPI / EC: Clean up EC GPE mask flag ACPI: EC: Fix possible issues related to EC initialization order ACPI / PM: Add debug statements to acpi_pm_notify_handler() ACPI: Add debug statements to acpi_global_event_handler() ACPI / scan: Enable GPEs before scanning the namespace ACPICA: Make it possible to enable runtime GPEs earlier ACPICA: Dispatch active GPEs at init time ACPI: SPCR: work around clock issue on xgene UART ACPI: SPCR: extend XGENE 8250 workaround to m400 ACPI / LPSS: Don't abort ACPI scan on missing mem resource mailbox: pcc: Drop uninformative output during boot ACPI/IORT: Add IORT named component memory address limits ...
2017-09-05Merge tag 'char-misc-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver update for 4.14-rc1. Lots of different stuff in here, it's been an active development cycle for some reason. Highlights are: - updated binder driver, this brings binder up to date with what shipped in the Android O release, plus some more changes that happened since then that are in the Android development trees. - coresight updates and fixes - mux driver file renames to be a bit "nicer" - intel_th driver updates - normal set of hyper-v updates and changes - small fpga subsystem and driver updates - lots of const code changes all over the driver trees - extcon driver updates - fmc driver subsystem upadates - w1 subsystem minor reworks and new features and drivers added - spmi driver updates Plus a smattering of other minor driver updates and fixes. All of these have been in linux-next with no reported issues for a while" * tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits) ANDROID: binder: don't queue async transactions to thread. ANDROID: binder: don't enqueue death notifications to thread todo. ANDROID: binder: Don't BUG_ON(!spin_is_locked()). ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl ANDROID: binder: push new transactions to waiting threads. ANDROID: binder: remove proc waitqueue android: binder: Add page usage in binder stats android: binder: fixup crash introduced by moving buffer hdr drivers: w1: add hwmon temp support for w1_therm drivers: w1: refactor w1_slave_show to make the temp reading functionality separate drivers: w1: add hwmon support structures eeprom: idt_89hpesx: Support both ACPI and OF probing mcb: Fix an error handling path in 'chameleon_parse_cells()' MCB: add support for SC31 to mcb-lpc mux: make device_type const char: virtio: constify attribute_group structures. Documentation/ABI: document the nvmem sysfs files lkdtm: fix spelling mistake: "incremeted" -> "incremented" perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file nvmem: include linux/err.h from header ...
2017-09-04Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "This update provides: - Cleanup of the IDT management including the removal of the extra tracing IDT. A first step to cleanup the vector management code. - The removal of the paravirt op adjust_exception_frame. This is a XEN specific issue, but merged through this branch to avoid nasty merge collisions - Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has disabled the TSC DEADLINE timer in CPUID. - Adjust a debug message in the ioapic code to print out the information correctly" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) x86/idt: Fix the X86_TRAP_BP gate x86/xen: Get rid of paravirt op adjust_exception_frame x86/eisa: Add missing include x86/idt: Remove superfluous ALIGNment x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature x86/idt: Remove the tracing IDT leftovers x86/idt: Hide set_intr_gate() x86/idt: Simplify alloc_intr_gate() x86/idt: Deinline setup functions x86/idt: Remove unused functions/inlines x86/idt: Move interrupt gate initialization to IDT code x86/idt: Move APIC gate initialization to tables x86/idt: Move regular trap init to tables x86/idt: Move IST stack based traps to table init x86/idt: Move debug stack init to table based x86/idt: Switch early trap init to IDT tables x86/idt: Prepare for table based init x86/idt: Move early IDT setup out of 32-bit asm x86/idt: Move early IDT handler setup to IDT code x86/idt: Consolidate IDT invalidation ...
2017-09-04Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache quality monitoring update from Thomas Gleixner: "This update provides a complete rewrite of the Cache Quality Monitoring (CQM) facility. The existing CQM support was duct taped into perf with a lot of issues and the attempts to fix those turned out to be incomplete and horrible. After lengthy discussions it was decided to integrate the CQM support into the Resource Director Technology (RDT) facility, which is the obvious choise as in hardware CQM is part of RDT. This allowed to add Memory Bandwidth Monitoring support on top. As a result the mechanisms for allocating cache/memory bandwidth and the corresponding monitoring mechanisms are integrated into a single management facility with a consistent user interface" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/intel_rdt: Turn off most RDT features on Skylake x86/intel_rdt: Add command line options for resource director technology x86/intel_rdt: Move special case code for Haswell to a quirk function x86/intel_rdt: Remove redundant ternary operator on return x86/intel_rdt/cqm: Improve limbo list processing x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug x86/intel_rdt: Modify the intel_pqr_state for better performance x86/intel_rdt/cqm: Clear the default RMID during hotcpu x86/intel_rdt: Show bitmask of shareable resource with other executing units x86/intel_rdt/mbm: Handle counter overflow x86/intel_rdt/mbm: Add mbm counter initialization x86/intel_rdt/mbm: Basic counting of MBM events (total and local) x86/intel_rdt/cqm: Add CPU hotplug support x86/intel_rdt/cqm: Add sched_in support x86/intel_rdt: Introduce rdt_enable_key for scheduling x86/intel_rdt/cqm: Add mount,umount support x86/intel_rdt/cqm: Add rmdir support x86/intel_rdt: Separate the ctrl bits from rmdir x86/intel_rdt/cqm: Add mon_data x86/intel_rdt: Prepare for RDT monitor data support ...
2017-09-04Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "PCID support, 5-level paging support, Secure Memory Encryption support The main changes in this cycle are support for three new, complex hardware features of x86 CPUs: - Add 5-level paging support, which is a new hardware feature on upcoming Intel CPUs allowing up to 128 PB of virtual address space and 4 PB of physical RAM space - a 512-fold increase over the old limits. (Supercomputers of the future forecasting hurricanes on an ever warming planet can certainly make good use of more RAM.) Many of the necessary changes went upstream in previous cycles, v4.14 is the first kernel that can enable 5-level paging. This feature is activated via CONFIG_X86_5LEVEL=y - disabled by default. (By Kirill A. Shutemov) - Add 'encrypted memory' support, which is a new hardware feature on upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system RAM to be encrypted and decrypted (mostly) transparently by the CPU, with a little help from the kernel to transition to/from encrypted RAM. Such RAM should be more secure against various attacks like RAM access via the memory bus and should make the radio signature of memory bus traffic harder to intercept (and decrypt) as well. This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled by default. (By Tom Lendacky) - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a hardware feature that attaches an address space tag to TLB entries and thus allows to skip TLB flushing in many cases, even if we switch mm's. (By Andy Lutomirski) All three of these features were in the works for a long time, and it's coincidence of the three independent development paths that they are all enabled in v4.14 at once" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits) x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y) x86/mm: Use pr_cont() in dump_pagetable() x86/mm: Fix SME encryption stack ptr handling kvm/x86: Avoid clearing the C-bit in rsvd_bits() x86/CPU: Align CR3 defines x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages acpi, x86/mm: Remove encryption mask from ACPI page protection type x86/mm, kexec: Fix memory corruption with SME on successive kexecs x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y x86/mm: Allow userspace have mappings above 47-bit x86/mm: Prepare to expose larger address space to userspace x86/mpx: Do not allow MPX if we have mappings above 47-bit x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit() x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD x86/mm/dump_pagetables: Fix printout of p4d level x86/mm/dump_pagetables: Generalize address normalization x86/boot: Fix memremap() related build failure ...
2017-09-04Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Add 'cross-release' support to lockdep, which allows APIs like completions, where it's not the 'owner' who releases the lock, to be tracked. It's all activated automatically under CONFIG_PROVE_LOCKING=y. - Clean up (restructure) the x86 atomics op implementation to be more readable, in preparation of KASAN annotations. (Dmitry Vyukov) - Fix static keys (Paolo Bonzini) - Add killable versions of down_read() et al (Kirill Tkhai) - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini) - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra) - Remove smp_mb__before_spinlock() and convert its usages, introduce smp_mb__after_spinlock() (Peter Zijlstra) * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) locking/lockdep/selftests: Fix mixed read-write ABBA tests sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK() acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures smp: Avoid using two cache lines for struct call_single_data locking/lockdep: Untangle xhlock history save/restore from task independence locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being futex: Remove duplicated code and fix undefined behaviour Documentation/locking/atomic: Finish the document... locking/lockdep: Fix workqueue crossrelease annotation workqueue/lockdep: 'Fix' flush_work() annotation locking/lockdep/selftests: Add mixed read-write ABBA tests mm, locking/barriers: Clarify tlb_flush_pending() barriers locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive locking/lockdep: Explicitly initialize wq_barrier::done::map locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING locking/refcounts, x86/asm: Implement fast refcount overflow protection locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease ...
2017-09-04Merge branch 'x86-syscall-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull syscall updates from Ingo Molnar: "Improve the security of set_fs(): we now check the address limit on a number of key platforms (x86, arm, arm64) before returning to user-space - without adding overhead to the typical system call fast path" * 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/syscalls: Check address limit on user-mode return arm/syscalls: Check address limit on user-mode return x86/syscalls: Check address limit on user-mode return
2017-09-04Merge branch 'x86-spinlocks-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 spinlock update from Ingo Molnar: "Convert an NMI lock to raw" [ Clarification: it's not that the lock itself is NMI-safe, it's about NMI registration called from RT contexts - Linus ] * 'x86-spinlocks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Use raw lock
2017-09-04Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loading updates from Ingo Molnar: "Update documentation, improve robustness and fix a memory leak" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/intel: Improve microcode patches saving flow x86/microcode: Document the three loading methods x86/microcode/AMD: Free unneeded patch before exit from update_cache()
2017-09-04Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 debug updates from Ingo Molnar: "Various fixes to the NUMA emulation code" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numa_emulation: Recalculate numa_nodes_parsed from emulated nodes x86/numa_emulation: Assign physnode_mask directly from numa_nodes_parsed x86/numa_emulation: Refine the calculation of max_emu_nid and dfl_phys_nid
2017-09-04Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Ingo Molnar: "AMD F17h related updates" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/amd: Hide unused legacy_fixup_core_id() function x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask x86/cpu/amd: Limit cpu_core_id fixup to families older than F17h
2017-09-04Merge branch 'x86-build-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: "More Clang support related fixes" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Use cc-option to validate stack alignment parameter x86/build: Fix stack alignment for CLang x86/build: Drop unused mflags-y
2017-09-04Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The main changes are KASL related fixes and cleanups: in particular we now exclude certain physical memory ranges as KASLR randomization targets that have proven to be unreliable (early-)RAM on some firmware versions" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* and EFI_LOADER_* from KASLR's choice x86/boot/KASLR: Prefer mirrored memory regions for the kernel physical address efi: Introduce efi_early_memdesc_ptr to get pointer to memmap descriptor x86/boot/KASLR: Rename process_e820_entry() into process_mem_region() x86/boot/KASLR: Switch to pass struct mem_vector to process_e820_entry() x86/boot/KASLR: Wrap e820 entries walking code into new function process_e820_entries()
2017-09-04Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: - Introduce the ORC unwinder, which can be enabled via CONFIG_ORC_UNWINDER=y. The ORC unwinder is a lightweight, Linux kernel specific debuginfo implementation, which aims to be DWARF done right for unwinding. Objtool is used to generate the ORC unwinder tables during build, so the data format is flexible and kernel internal: there's no dependency on debuginfo created by an external toolchain. The ORC unwinder is almost two orders of magnitude faster than the (out of tree) DWARF unwinder - which is important for perf call graph profiling. It is also significantly simpler and is coded defensively: there has not been a single ORC related kernel crash so far, even with early versions. (knock on wood!) But the main advantage is that enabling the ORC unwinder allows CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel measurably: With frame pointers disabled, GCC does not have to add frame pointer instrumentation code to every function in the kernel. The kernel's .text size decreases by about 3.2%, resulting in better cache utilization and fewer instructions executed, resulting in a broad kernel-wide speedup. Average speedup of system calls should be roughly in the 1-3% range - measurements by Mel Gorman [1] have shown a speedup of 5-10% for some function execution intense workloads. The main cost of the unwinder is that the unwinder data has to be stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel config - which is a modest cost on modern x86 systems. Given how young the ORC unwinder code is it's not enabled by default - but given the performance advantages the plan is to eventually make it the default unwinder on x86. See Documentation/x86/orc-unwinder.txt for more details. - Remove lguest support: its intended role was that of a temporary proof of concept for virtualization, plus its removal will enable the reduction (removal) of the paravirt API as well, so Rusty agreed to its removal. (Juergen Gross) - Clean up and fix FSGS related functionality (Andy Lutomirski) - Clean up IO access APIs (Andy Shevchenko) - Enhance the symbol namespace (Jiri Slaby) * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) objtool: Handle GCC stack pointer adjustment bug x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone() x86/fpu/math-emu: Add ENDPROC to functions x86/boot/64: Extract efi_pe_entry() from startup_64() x86/boot/32: Extract efi_pe_entry() from startup_32() x86/lguest: Remove lguest support x86/paravirt/xen: Remove xen_patch() objtool: Fix objtool fallthrough detection with function padding x86/xen/64: Fix the reported SS and CS in SYSCALL objtool: Track DRAP separately from callee-saved registers objtool: Fix validate_branch() return codes x86: Clarify/fix no-op barriers for text_poke_bp() x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs selftests/x86/fsgsbase: Test selectors 1, 2, and 3 x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common x86/asm: Fix UNWIND_HINT_REGS macro for older binutils x86/asm/32: Fix regs_get_register() on segment registers x86/xen/64: Rearrange the SYSCALL entries x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads ...
2017-09-04Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - fix affine wakeups (Peter Zijlstra) - improve CPU onlining (and general bootup) scalability on systems with ridiculous number (thousands) of CPUs (Peter Zijlstra) - sched/numa updates (Rik van Riel) - sched/deadline updates (Byungchul Park) - sched/cpufreq enhancements and related cleanups (Viresh Kumar) - sched/debug enhancements (Xie XiuQi) - various fixes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) sched/debug: Optimize sched_domain sysctl generation sched/topology: Avoid pointless rebuild sched/topology, cpuset: Avoid spurious/wrong domain rebuilds sched/topology: Improve comments sched/topology: Fix memory leak in __sdt_alloc() sched/completion: Document that reinit_completion() must be called after complete_all() sched/autogroup: Fix error reporting printk text in autogroup_create() sched/fair: Fix wake_affine() for !NUMA_BALANCING sched/debug: Intruduce task_state_to_char() helper function sched/debug: Show task state in /proc/sched_debug sched/debug: Use task_pid_nr_ns in /proc/$pid/sched sched/core: Remove unnecessary initialization init_idle_bootup_task() sched/deadline: Change return value of cpudl_find() sched/deadline: Make find_later_rq() choose a closer CPU in topology sched/numa: Scale scan period with tasks in group and shared/private sched/numa: Slow down scan rate if shared faults dominate sched/pelt: Fix false running accounting sched: Mark pick_next_task_dl() and build_sched_domain() as static sched/cpupri: Don't re-initialize 'struct cpupri' sched/deadline: Don't re-initialize 'struct cpudl' ...
2017-09-04Merge branch 'ras-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fix from Ingo Molnar: "A single change fixing SMCA bank initialization on systems that don't have CPU0 enabled" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/AMD: Allow any CPU to initialize the smca_banks array
2017-09-04Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Add branch type profiling/tracing support. (Jin Yao) - Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of physical memory addresses, where the PMU supports it. (Kan Liang) - Export some PMU capability details in the new /sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi Kleen) - Aux data fixes and updates (Will Deacon) - kprobes fixes and updates (Masami Hiramatsu) - AMD uncore PMU driver fixes and updates (Janakarajan Natarajan) On the tooling side, here's a (limited!) list of highlights - there were many other changes that I could not list, see the shortlog and git history for details: UI improvements: - Implement a visual marker for fused x86 instructions in the annotate TUI browser, available now in 'perf report', more work needed to have it available as well in 'perf top' (Jin Yao) Further explanation from one of Jin's patches: │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne 29 │ │↓ jmp 43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) That means the cmpl+je is a fused instruction pair and they should be considered together. - Record the branch type and then show statistics and info about in callchain entries (Jin Yao) Example from one of Jin's patches: # perf record -g -j any,save_type # perf report --branch-history --stdio --no-children 38.50% div.c:45 [.] main div | ---main div.c:42 (RET CROSS_2M cycles:2) compute_flag div.c:28 (cycles:2) compute_flag div.c:27 (RET CROSS_2M cycles:1) rand rand.c:28 (cycles:1) rand rand.c:28 (RET CROSS_2M cycles:1) __random random.c:298 (cycles:1) __random random.c:297 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (RET CROSS_2M cycles:9) namespaces support: - Add initial support for namespaces, using setns to access files in namespaces, grabbing their build-ids, etc. (Krister Johansen) perf trace enhancements: - Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace' (Arnaldo Carvalho de Melo) - Add initial 'clone' syscall args beautifier in 'perf trace' (Arnaldo Carvalho de Melo) - Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace' (Arnaldo Carvalho de Melo) - Beautifiers for the 'cmd' arg of several ioctl types, including: sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de Melo) - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data' CTF conversion, allowing CTF trace visualization tools to show callchains and to resolve symbols (Geneviève Bastien) - Beautify the fcntl syscall, which is an interesting one in the sense that infrastructure had to be put in place to change the formatters of some arguments according to the value in a previous one, i.e. cmd dictates how arg and the syscall return will be formatted. (Arnaldo Carvalho de Melo perf stat enhancements: - Use group read for event groups in 'perf stat', reducing overhead when groups are defined in the event specification, i.e. when using {} to enclose a list of events, asking them to be read at the same time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa) pipe mode improvements: - Process tracing data in 'perf annotate' pipe mode (David Carrillo-Cisneros) - Add header record types to pipe-mode, now this command: $ perf record -o - -e cycles sleep 1 | perf report --stdio --header Will show the same as in non-pipe mode, i.e. involving a perf.data file (David Carrillo-Cisneros) Vendor specific hardware event support updates/enhancements: - Update POWER9 vendor events tables (Sukadev Bhattiprolu) - Add POWER9 PMU events Sukadev (Bhattiprolu) - Support additional POWER8+ PVR in PMU mapfile (Shriya) - Add Skylake server uncore JSON vendor events (Andi Kleen) - Support exporting Intel PT data to sqlite3 with python perf scripts, this is in addition to the postgresql support that was already there (Adrian Hunter)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits) perf symbols: Fix plt entry calculation for ARM and AARCH64 perf probe: Fix kprobe blacklist checking condition perf/x86: Fix caps/ for !Intel perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR perf/core, pt, bts: Get rid of itrace_started perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments tools headers: Sync cpu features kernel ABI headers with tooling headers perf tools: Pass full path of FEATURES_DUMP perf tools: Robustify detection of clang binary tools lib: Allow external definition of CC, AR and LD perf tools: Allow external definition of flex and bison binary names tools build tests: Don't hardcode gcc name perf report: Group stat values on global event id perf values: Zero value buffers perf values: Fix allocation check perf values: Fix thread index bug perf report: Add dump_read function perf record: Set read_format for inherit_stat perf c2c: Fix remote HITM detection for Skylake perf tools: Fix static build with newer toolchains ...
2017-09-04Merge branch 'linus' into locking/core, to fix up conflictsIngo Molnar
Conflicts: mm/page_alloc.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-03Merge tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "This is the main drm pull request for 4.14 merge window. I'm sending this early, as my continuing journey into fatherhood is occurring really soon now, I'm going to be mostly useless for the next couple of weeks, though I may be able to read email, I doubt I'll be doing much patch applications or git sending. If anything urgent pops up I've asked Daniel/Jani/Alex/Sean to try and direct stuff towards you. Outside drm changes: Some rcar-du updates that touch the V4L tree, all acks should be in place. It adds one export to the radix tree code for new i915 use case. There are some minor AGP cleanups (don't see that too often). Changes to the vbox driver in staging to avoid breaking compilation. Summary: core: - Atomic helper fixes - Atomic UAPI fixes - Add YCBCR 4:2:0 support - Drop set_busid hook - Refactor fb_helper locking - Remove a bunch of internal APIs - Add a bunch of better default handlers - Format modifier/blob plane property added - More internal header refactoring - Make more internal API names consistent - Enhanced syncobj APIs (wait/signal/reset/create signalled) bridge: - Add Synopsys Designware MIPI DSI host bridge driver tiny: - Add Pervasive Displays RePaper displays - Add support for LEGO MINDSTORMS EV3 LCD i915: - Lots of GEN10/CNL support patches - drm syncobj support - Skylake+ watermark refactoring - GVT vGPU 48-bit ppgtt support - GVT performance improvements - NOA change ioctl - CCS (color compression) scanout support - GPU reset improvements amdgpu: - Initial hugepage support - BO migration logic rework - Vega10 improvements - Powerplay fixes - Stop reprogramming the MC - Fixes for ACP audio on stoney - SR-IOV fixes/improvements - Command submission overhead improvements amdkfd: - Non-dGPU upstreaming patches - Scratch VA ioctl - Image tiling modes - Update PM4 headers for new firmware - Drop all BUG_ONs. nouveau: - GP108 modesetting support. - Disable MSI on big endian. vmwgfx: - Add fence fd support. msm: - Runtime PM improvements exynos: - NV12MT support - Refactor KMS drivers imx-drm: - Lock scanout channel to improve memory bw - Cleanups etnaviv: - GEM object population fixes tegra: - Prep work for Tegra186 support - PRIME mmap support sunxi: - HDMI support improvements - HDMI CEC support omapdrm: - HDMI hotplug IRQ support - Big driver cleanup - OMAP5 DSI support rcar-du: - vblank fixes - VSP1 updates arcgpu: - Minor fixes stm: - Add STM32 DSI controller driver dw_hdmi: - Add support for Rockchip RK3399 - HDMI CEC support atmel-hlcdc: - Add 8-bit color support vc4: - Atomic fixes - New ioctl to attach a label to a buffer object - HDMI CEC support - Allow userspace to dictate rendering order on submit ioctl" * tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux: (1074 commits) drm/syncobj: Add a signal ioctl (v3) drm/syncobj: Add a reset ioctl (v3) drm/syncobj: Add a syncobj_array_find helper drm/syncobj: Allow wait for submit and signal behavior (v5) drm/syncobj: Add a CREATE_SIGNALED flag drm/syncobj: Add a callback mechanism for replace_fence (v3) drm/syncobj: add sync obj wait interface. (v8) i915: Use drm_syncobj_fence_get drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2) drm/syncobj: Rename fence_get to find_fence drm: kirin: Add mode_valid logic to avoid mode clocks we can't generate drm/vmwgfx: Bump the version for fence FD support drm/vmwgfx: Add export fence to file descriptor support drm/vmwgfx: Add support for imported Fence File Descriptor drm/vmwgfx: Prepare to support fence fd drm/vmwgfx: Fix incorrect command header offset at restart drm/vmwgfx: Support the NOP_ERROR command drm/vmwgfx: Restart command buffers after errors drm/vmwgfx: Move irq bottom half processing to threads drm/vmwgfx: Don't use drm_irq_[un]install ...
2017-09-03Merge branches 'acpi-x86', 'acpi-soc', 'acpi-pmic' and 'acpi-apple'Rafael J. Wysocki
* acpi-x86: ACPI / boot: Add number of legacy IRQs to debug output ACPI / boot: Correct address space of __acpi_map_table() ACPI / boot: Don't define unused variables * acpi-soc: ACPI / LPSS: Don't abort ACPI scan on missing mem resource * acpi-pmic: ACPI / PMIC: xpower: Do pinswitch magic when reading GPADC * acpi-apple: spi: Use Apple device properties in absence of ACPI resources ACPI / scan: Recognize Apple SPI and I2C slaves ACPI / property: Support Apple _DSM properties ACPI / property: Don't evaluate objects for devices w/o handle treewide: Consolidate Apple DMI checks
2017-09-03Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: - Expand the space for uncompressing as the LZ4 worst case does not fit into the currently reserved space - Validate boot parameters more strictly to prevent out of bound access in the decompressor/boot code - Fix off by one errors in get_segment_base() * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Prevent faulty bootparams.screeninfo from causing harm x86/boot: Provide more slack space during decompression x86/ldt: Fix off by one in get_segment_base()
2017-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01x86/idt: Fix the X86_TRAP_BP gateIngo Molnar
Andrei Vagin reported a CRIU regression and bisected it back to: 90f6225fba0c ("x86/idt: Move IST stack based traps to table init") This table init conversion loses the system-gate property of X86_TRAP_BP and erroneously moves it from DPL3 to DPL0. Fix it. Reported-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: dvlasenk@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: peterz@infradead.org Cc: brgerst@gmail.com Cc: rostedt@goodmis.org Cc: bp@alien8.de Cc: luto@kernel.org Cc: jpoimboe@redhat.com Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: torvalds@linux-foundation.org Cc: tip-bot for Jacob Shin <tipbot@zytor.com> Link: http://lkml.kernel.org/r/20170901082630.xvyi5bwk6etmppqc@gmail.com
2017-08-31KVM: update to new mmu_notifier semantic v2Jérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Changed since v1 (Linus Torvalds) - remove now useless kvm_arch_mmu_notifier_invalidate_page() Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Mike Galbraith <efault@gmx.de> Tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31x86/xen: Get rid of paravirt op adjust_exception_frameJuergen Gross
When running as Xen pv-guest the exception frame on the stack contains %r11 and %rcx additional to the other data pushed by the processor. Instead of having a paravirt op being called for each exception type prepend the Xen specific code to each exception entry. When running as Xen pv-guest just use the exception entry with prepended instructions, otherwise use the entry without the Xen specific code. [ tglx: Merged through tip to avoid ugly merge conflict ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Cc: luto@amacapital.net Link: http://lkml.kernel.org/r/20170831174249.26853-1-jg@pfupf.net
2017-08-31x86/eisa: Add missing includeThomas Gleixner
The seperation of the EISA init missed to include linux/io.h which breaks the build with some special configurations. Reported-by: Ingo Molnar <mingo@kernel.org> Fixes: f7eaf6e00fd5 ("x86/boot: Move EISA setup to a separate file") Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-08-31x86: bpf_jit: small optimization in emit_bpf_tail_call()Eric Dumazet
Saves 4 bytes replacing following instructions : lea rax, [rsi + rdx * 8 + offsetof(...)] mov rax, qword ptr [rax] cmp rax, 0 by : mov rax, [rsi + rdx * 8 + offsetof(...)] test rax, rax Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-31x86/idt: Remove superfluous ALIGNmentJiri Slaby
Commit 87e81786b13b ("x86/idt: Move early IDT setup out of 32-bit asm") switched early_ignore_irq to use ENTRY. ENTRY aligns the code, so there is no need for one more ALIGN right before the function. And add one \n after the function to separate it from the data. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: http://lkml.kernel.org/r/20170831121653.28917-1-jslaby@suse.cz
2017-08-31x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* ↵Naoya Horiguchi
and EFI_LOADER_* from KASLR's choice There's a potential bug in how we select the KASLR kernel address n the early boot code. The KASLR boot code currently chooses the kernel image's physical memory location from E820_TYPE_RAM regions by walking over all e820 entries. E820_TYPE_RAM includes EFI_BOOT_SERVICES_CODE and EFI_BOOT_SERVICES_DATA as well, so those regions can end up hosting the kernel image. According to the UEFI spec, all memory regions marked as EfiBootServicesCode and EfiBootServicesData are available as free memory after the first call to ExitBootServices(). I.e. so such regions should be usable for the kernel, per spec. In real life however, we have workarounds for broken x86 firmware, where we keep such regions reserved until SetVirtualAddressMap() is done. See the following code in should_map_region(): static bool should_map_region(efi_memory_desc_t *md) { ... /* * Map boot services regions as a workaround for buggy * firmware that accesses them even when they shouldn't. * * See efi_{reserve,free}_boot_services(). */ if (md->type =3D=3D EFI_BOOT_SERVICES_CODE || md->type =3D=3D EFI_BOOT_SERVICES_DATA) return false; This workaround suppressed a boot crash, but potential issues still remain because no one prevents the regions from overlapping with kernel image by KASLR. So let's make sure that EFI_BOOT_SERVICES_{CODE|DATA} regions are never chosen as kernel memory for the workaround to work fine. Furthermore, EFI_LOADER_{CODE|DATA} regions are also excluded because they can be used after ExitBootServices() as defined in EFI spec. As a result, we choose kernel address only from EFI_CONVENTIONAL_MEMORY which is the only memory type we know to be safely free. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Baoquan He <bhe@redhat.com> Cc: Junichi Nomura <j-nomura@ce.jp.nec.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: fanc.fnst@cn.fujitsu.com Cc: izumi.taku@jp.fujitsu.com Link: http://lkml.kernel.org/r/20170828074444.GC23181@hori1.linux.bs1.fc.nec.co.jp [ Rewrote/fixed/clarified the changelog and the in code comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-31x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs ↵Hans de Goede
without the feature When booting 4.13 on a VirtualBox VM on a Skylake host the following error shows up in the logs: [ 0.000000] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0xb2 (or later) This is caused by apic_check_deadline_errata() only checking CPU model and not the X86_FEATURE_TSC_DEADLINE_TIMER flag (which VirtualBox does NOT export to the guest), combined with VirtualBox not exporting the micro-code version to the guest. This commit adds a check for X86_FEATURE_TSC_DEADLINE_TIMER to apic_check_deadline_errata(), silencing this error on VirtualBox VMs. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frank Mehnert <frank.mehnert@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Thayer <michael.thayer@oracle.com> Cc: Michal Necasek <michal.necasek@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: bd9240a18e ("x86/apic: Add TSC_DEADLINE quirk due to errata") Link: http://lkml.kernel.org/r/20170830105811.27539-1-hdegoede@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-31x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)Vitaly Kuznetsov
There's a subtle bug in how some of the paravirt guest code handles page table freeing on x86: On x86 software page table walkers depend on the fact that remote TLB flush does an IPI: walk is performed lockless but with interrupts disabled and in case the page table is freed the freeing CPU will get blocked as remote TLB flush is required. On other architectures which don't require an IPI to do remote TLB flush we have an RCU-based mechanism (see include/asm-generic/tlb.h for more details). In virtualized environments we may want to override the ->flush_tlb_others callback in pv_mmu_ops and use a hypercall asking the hypervisor to do a remote TLB flush for us. This breaks the assumption about IPIs. Xen PV has been doing this for years and the upcoming remote TLB flush for Hyper-V will do it too. This is not safe, as software page table walkers may step on an already freed page. Fix the bug by enabling the RCU-based page table freeing mechanism, CONFIG_HAVE_RCU_TABLE_FREE=y. Testing with kernbench and mmap/munmap microbenchmarks, and neither showed any noticeable performance impact. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: KY Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170828082251.5562-1-vkuznets@redhat.com [ Rewrote/fixed/clarified the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-31x86/mm: Use pr_cont() in dump_pagetable()Jan Beulich
The lack of newlines in preceding format strings is a clear indication that these were meant to be continuations of one another, and indeed output ends up quite a bit more compact (and readable) that way. Switch other plain printk()-s in the function instances to pr_info(), as requested. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/59A7D72B0200007800175E4E@prv-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-31x86/idt: Remove the tracing IDT leftoversThomas Gleixner
Stephen reported a merge conflict with the XEN tree. That also shows that the IDT cleanup forgot to remove the now unused trace_{trap} defines. Remove them. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com>
2017-08-30Merge branch 'for-linus-4.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fix from Richard Weinberger: "This contains a single fix for a regression which was introduced while the merge window" * 'for-linus-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Fix check for _xstate for older hosts
2017-08-29perf/x86: Fix caps/ for !IntelPeter Zijlstra
Move the 'max_precise' capability into generic x86 code where it belongs. This fixes a sysfs splat on !Intel systems where we fail to set x86_pmu_caps_group.atts. Reported-and-tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Fixes: 22688d1c20f5 ("x86/perf: Export some PMU attributes in caps/ directory") Link: http://lkml.kernel.org/r/20170828104650.2u3rsim4jafyjzv2@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29perf/core, x86: Add PERF_SAMPLE_PHYS_ADDRKan Liang
For understanding how the workload maps to memory channels and hardware behavior, it's very important to collect address maps with physical addresses. For example, 3D XPoint access can only be found by filtering the physical address. Add a new sample type for physical address. perf already has a facility to collect data virtual address. This patch introduces a function to convert the virtual address to physical address. The function is quite generic and can be extended to any architecture as long as a virtual address is provided. - For kernel direct mapping addresses, virt_to_phys is used to convert the virtual addresses to physical address. - For user virtual addresses, __get_user_pages_fast is used to walk the pages tables for user physical address. - This does not work for vmalloc addresses right now. These are not resolved, but code to do that could be added. The new sample type requires collecting the virtual address. The virtual address will not be output unless SAMPLE_ADDR is applied. For security, the physical address can only be exposed to root or privileged user. Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: mpe@ellerman.id.au Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29perf/core, pt, bts: Get rid of itrace_startedAlexander Shishkin
I just noticed that hw.itrace_started and hw.config are aliased to the same location. Now, the PT driver happens to use both, which works out fine by sheer luck: - STORE(hw.itrace_start) is ordered before STORE(hw.config), in the program order, although there are no compiler barriers to ensure that, - to the perf_log_itrace_start() hw.itrace_start looks set at the same time as when it is intended to be set because both stores happen in the same path, - hw.config is never reset to zero in the PT driver. Now, the use of hw.config by the PT driver makes more sense (it being a HW PMU) than messing around with itrace_started, which is an awkward API to begin with. This patch replaces hw.itrace_started with an attach_state bit and an API call for the PMU drivers to use to communicate the condition. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170330153956.25994-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/boot: Prevent faulty bootparams.screeninfo from causing harmJan H. Schönherr
If a zero for the number of lines manages to slip through, scroll() may underflow some offset calculations, causing accesses outside the video memory. Make the check in __putstr() more pessimistic to prevent that. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1503858223-14983-1-git-send-email-jschoenh@amazon.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/boot: Provide more slack space during decompressionJan H. Schönherr
The current slack space is not enough for LZ4, which has a worst case overhead of 0.4% for data that cannot be further compressed. With an LZ4 compressed kernel with an embedded initrd, the output is likely to overwrite the input. Increase the slack space to avoid that. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1503842124-29718-1-git-send-email-jschoenh@amazon.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()Jiri Slaby
ALIGN+GLOBAL is effectively what ENTRY() does, so use ENTRY() which is dedicated for exactly this purpose -- global functions. Note that stub32_clone() is a C-like leaf function -- it has a standard call frame -- it only switches one argument and continues by jumping into C. Since each ENTRY() should be balanced by some END*() marker, we add a corresponding ENDPROC() to stub32_clone() too. Besides that, x86's custom GLOBAL macro is going to die very soon. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170824080624.7768-2-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/fpu/math-emu: Add ENDPROC to functionsJiri Slaby
Functions in math-emu are annotated as ENTRY() symbols, but their ends are not annotated at all. But these are standard functions called from C, with proper stack register update etc. Omitting the ends means: * the annotations are not paired and we cannot deal with such functions e.g. in objtool * the symbols are not marked as functions in the object file * there are no sizes of the functions in the object file So fix this by adding ENDPROC() to each such case in math-emu. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170824080624.7768-1-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/boot/64: Extract efi_pe_entry() from startup_64()Jiri Slaby
Similarly to the 32-bit code, efi_pe_entry body() is somehow squashed into startup_64(). In the old days, we forced startup_64() to start at offset 0x200 and efi_pe_entry() to start at 0x210. But this requirement was removed long time ago, in: 99f857db8857 ("x86, build: Dynamically find entry points in compressed startup code") The way it is now makes the code less readable and illogical. Given we can now safely extract the inlined efi_pe_entry() body from startup_64() into a separate function, we do so. We also annotate the function appropriatelly by ENTRY+ENDPROC. ABI offsets are preserved: 0000000000000000 T startup_32 0000000000000200 T startup_64 0000000000000390 T efi64_stub_entry On the top-level, it looked like: .org 0x200 ENTRY(startup_64) #ifdef CONFIG_EFI_STUB ; start of inlined jmp preferred_addr GLOBAL(efi_pe_entry) ... ; a lot of assembly (efi_pe_entry) leaq preferred_addr(%rax), %rax jmp *%rax preferred_addr: #endif ; end of inlined ... ; a lot of assembly (startup_64) ENDPROC(startup_64) And it is now converted into: .org 0x200 ENTRY(startup_64) ... ; a lot of assembly (startup_64) ENDPROC(startup_64) #ifdef CONFIG_EFI_STUB ENTRY(efi_pe_entry) ... ; a lot of assembly (efi_pe_entry) leaq startup_64(%rax), %rax jmp *%rax ENDPROC(efi_pe_entry) #endif Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: ard.biesheuvel@linaro.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170824073327.4129-2-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/boot/32: Extract efi_pe_entry() from startup_32()Jiri Slaby
The efi_pe_entry() body is somehow squashed into startup_32(). In the old days, we forced startup_32() to start at offset 0x00 and efi_pe_entry() to start at 0x10. But this requirement was removed long time ago, in: 99f857db8857 ("x86, build: Dynamically find entry points in compressed startup code") The way it is now makes the code less readable and illogical. Given we can now safely extract the inlined efi_pe_entry() body from startup_32() into a separate function, we do so and we separate it to two functions as they are marked already: efi_pe_entry() + efi32_stub_entry(). We also annotate the functions appropriatelly by ENTRY+ENDPROC. ABI offset is preserved: 0000 128 FUNC GLOBAL DEFAULT 6 startup_32 0080 60 FUNC GLOBAL DEFAULT 6 efi_pe_entry 00bc 68 FUNC GLOBAL DEFAULT 6 efi32_stub_entry On the top-level, it looked like this: ENTRY(startup_32) #ifdef CONFIG_EFI_STUB ; start of inlined jmp preferred_addr ENTRY(efi_pe_entry) ... ; a lot of assembly (efi_pe_entry) ENTRY(efi32_stub_entry) ... ; a lot of assembly (efi32_stub_entry) leal preferred_addr(%eax), %eax jmp *%eax preferred_addr: #endif ; end of inlined ... ; a lot of assembly (startup_32) ENDPROC(startup_32) And it is now converted into: ENTRY(startup_32) ... ; a lot of assembly (startup_32) ENDPROC(startup_32) #ifdef CONFIG_EFI_STUB ENTRY(efi_pe_entry) ... ; a lot of assembly (efi_pe_entry) ENDPROC(efi_pe_entry) ENTRY(efi32_stub_entry) ... ; a lot of assembly (efi32_stub_entry) leal startup_32(%eax), %eax jmp *%eax ENDPROC(efi32_stub_entry) #endif Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: ard.biesheuvel@linaro.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170824073327.4129-1-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>