summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-09-21x86/numa: Online memory-less nodes at boot timeTang Chen
For now, x86 does not support memory-less node. A node without memory will not be onlined, and the cpus on it will be mapped to the other online nodes with memory in init_cpu_to_node(). The reason of doing this is to ensure each cpu has mapped to a node with memory, so that it will be able to allocate local memory for that cpu. But we don't have to do it in this way. In this series of patches, we are going to construct cpu <-> node mapping for all possible cpus at boot time, which is a persistent mapping. It means that the cpu will be mapped to the node which it belongs to, and will never be changed. If a node has only cpus but no memory, the cpus on it will be mapped to a memory-less node. And the memory-less node should be onlined. Allocate pgdats for all memory-less nodes and online them at boot time. Then build zonelists for these nodes. As a result, when cpus on these memory-less nodes try to allocate memory from local node, it will automatically fall back to the proper zones in the zonelists. Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: mika.j.penttila@gmail.com Cc: len.brown@intel.com Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: yasu.isimatu@gmail.com Cc: linux-mm@kvack.org Cc: linux-acpi@vger.kernel.org Cc: isimatu.yasuaki@jp.fujitsu.com Cc: gongzhaogang@inspur.com Cc: tj@kernel.org Cc: izumi.taku@jp.fujitsu.com Cc: cl@linux.com Cc: chen.tang@easystack.cn Cc: akpm@linux-foundation.org Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1472114120-3281-2-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21nfit: fail DSMs that return non-zero status by defaultDan Williams
For the DSMs where the kernel knows the format of the output buffer and originates those DSMs from within the kernel, return -EIO for any non-zero status. If the BIOS is indicating a status that we do not know how to handle, fail the DSM. Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21libnvdimm: fix devm_nvdimm_memremap() error pathDan Williams
The internal alloc_nvdimm_map() helper might fail, particularly if the memory region is already busy. Report request_mem_region() failures and check for the failure. Reported-by: Ryan Chen <ryan.chan105@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21i2c-eg20t: fix race between i2c init and interrupt enableYadi.hu
the eg20t driver call request_irq() function before the pch_base_address, base address of i2c controller's register, is assigned an effective value. there is one possible scenario that an interrupt which isn't inside eg20t arrives immediately after request_irq() is executed when i2c controller shares an interrupt number with others. since the interrupt handler pch_i2c_handler() has already active as shared action, it will be called and read its own register to determine if this interrupt is from itself. At that moment, since base address of i2c registers is not remapped in kernel space yet,so the INT handler will access an illegal address and then a error occurs. Signed-off-by: Yadi.hu <yadi.hu@windriver.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2016-09-21perf hists: Use bigger buffer for stdio headersJiri Olsa
With node column on big CPUs servers we can run out of stdio header space quite soon. Enlarging header buffer. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1474290610-23241-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-21perf evsel: Remove superfluous initialization of weightJiri Olsa
Removing superfluous initialization of weight, it's already set to 0 via memset. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1474290610-23241-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-21MIPS: vDSO: Fix Malta EVA mapping to vDSO page structsJames Hogan
The page structures associated with the vDSO pages in the kernel image are calculated using virt_to_page(), which uses __pa() under the hood to find the pfn associated with the virtual address. The vDSO data pointers however point to kernel symbols, so __pa_symbol() should really be used instead. Since there is no equivalent to virt_to_page() which uses __pa_symbol(), fix init_vdso_image() to work directly with pfns, calculated with __phys_to_pfn(__pa_symbol(...)). This issue broke the Malta Enhanced Virtual Addressing (EVA) configuration which has a non-default implementation of __pa_symbol(). This is because it uses a physical alias so that the kernel executes from KSeg0 (VA 0x80000000 -> PA 0x00000000), while RAM is provided to the kernel in the KUSeg range (VA 0x00000000 -> PA 0x80000000) which uses the same underlying RAM. Since there are no page structures associated with the low physical address region, some arbitrary kernel memory would be interpreted as a page structure for the vDSO pages and badness ensues. Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x- Patchwork: https://patchwork.linux-mips.org/patch/14229/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-21x86/e820: Use much less memory for e820/e820_saved, save up to 120kDenys Vlasenko
The maximum size of e820 map array for EFI systems is defined as E820_X_MAX (E820MAX + 3 * MAX_NUMNODES). In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved are 6404 bytes each. With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30 e820 areas at most. After previous patch, e820 and e820_saved are pointers to e280 maps. Change them to initially point to maps which are __initdata. At the very end of kernel init, just before __init[data] sections are freed in free_initmem(), allocate smaller blocks, copy maps there, and change pointers. The late switch makes sure that all functions which can be used to change e820 maps are no longer accessible (they are all __init functions). Run-tested. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160918182125.21000-1-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21x86/e820: Prepare e280 code for switch to dynamic storageDenys Vlasenko
This patch turns e820 and e820_saved into pointers to e820 tables, of the same size as before. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160917213927.1787-2-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21x86/e820: Mark some static functions __initDenys Vlasenko
They are all called only from other __init functions in e820.c Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160917213927.1787-1-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21Merge branch 'linus' into x86/boot, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21ARM/dts: Add EXTI controller node to stm32f429Alexandre TORGUE
Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: arnd@arndb.de Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: bruherrera@gmail.com Cc: Linus Walleij <linus.walleij@linaro.org> Cc: linux-gpio@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: lee.jones@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474387259-18926-5-git-send-email-alexandre.torgue@st.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21ARM/STM32: Select external interrupts controllerAlexandre TORGUE
Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: arnd@arndb.de Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: bruherrera@gmail.com Cc: Linus Walleij <linus.walleij@linaro.org> Cc: linux-gpio@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: lee.jones@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474387259-18926-4-git-send-email-alexandre.torgue@st.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21drivers/irqchip: Add STM32 external interrupts supportAlexandre TORGUE
The STM32 external interrupt controller consists of edge detectors that generate interrupts requests or wake-up events. Each line can be independently configured as interrupt or wake-up source, and triggers either on rising, falling or both edges. Each line can also be masked independently. Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: arnd@arndb.de Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: bruherrera@gmail.com Cc: Linus Walleij <linus.walleij@linaro.org> Cc: linux-gpio@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: lee.jones@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474387259-18926-3-git-send-email-alexandre.torgue@st.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21Documentation/dt-bindings: Document STM32 EXTI controller bindingsAlexandre TORGUE
Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: arnd@arndb.de Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: bruherrera@gmail.com Cc: Linus Walleij <linus.walleij@linaro.org> Cc: linux-gpio@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: lee.jones@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474387259-18926-2-git-send-email-alexandre.torgue@st.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21net: can: ifi: Configure transmitter delayMarek Vasut
Configure the transmitter delay register at +0x1c to correctly handle the CAN FD bitrate switch (BRS). This moves the SSP (secondary sample point) to a proper offset, so that the TDC mechanism works and won't generate error frames on the CAN link. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-09-21vti6: fix input pathNicolas Dichtel
Since commit 1625f4529957, vti6 is broken, all input packets are dropped (LINUX_MIB_XFRMINNOSTATES is incremented). XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in xfrm6_rcv_spi(). A new function xfrm6_rcv_tnl() that enables to pass a value to xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function is used in several handlers). CC: Alexey Kodanev <alexey.kodanev@oracle.com> Fixes: 1625f4529957 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2016-09-21Merge branch 'smp/for-block' into smp/hotplugThomas Gleixner
Bring in the block hotplug states for consistency.
2016-09-21ipmr, ip6mr: return lastuse relative to nowNikolay Aleksandrov
When I introduced the lastuse member I made a subtle error because it was returned as an absolute value but that is meaningless to user-space as it doesn't allow to see how old exactly an entry is. Let's make it similar to how the bridge returns such values and make it relative to "now" (jiffies). This allows us to show the actual age of the entries and is much more useful (e.g. user-space daemons can age out entries, iproute2 can display the lastuse properly). Fixes: 43b9e1274060 ("net: ipmr/ip6mr: add support for keeping an entry age") Reported-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21Merge branch 'r8152-phy-fixes'David S. Miller
Hayes Wang says: ==================== r8152: correct the flow of PHY First, to enable the PHY as early as possible. Some settings may fail if the PHY is power down. Move the other PHY settings to hw_phy_cfg() to make sure the order is correct. Finally, disable ALDPS and EEE before updating the PHY for RTL8153. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21r8152: disable ALDPS and EEE before setting PHYhayeswang
Disable ALDPS and EEE to avoid the possible failure when setting the PHY. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21r8152: remove r8153_enable_eeehayeswang
Remove r8153_enable_eee(). Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21r8152: move PHY settings to hw_phy_cfghayeswang
Move the PHY relative settings together to hw_phy_cfg(). Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21r8152: move enabling PHYhayeswang
Move enabling PHY to init(), otherwise some other settings may fail. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21r8152: move some functionshayeswang
Move the following functions forward. r8152_mmd_indirect() r8152_mmd_read() r8152_mmd_write() r8152_eee_en() r8152b_enable_eee() r8153_eee_en() r8153_enable_eee() r8152b_enable_fc() r8153_aldps_en() Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21cxgb4/cxgb4vf: Allocate more queues for 25G and 100G adapterHariprasad Shenai
We were missing check for 25G and 100G while checking port speed, which lead to less number of queues getting allocated for 25G & 100G adapters and leading to low throughput. Adding the missing check for both NIC and vNIC driver. Also fixes port advertisement for 25G and 100G in ethtool output. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignmentRussell Currey
Commit 5958d19a143e checks for prefetchable m64 BARs by comparing the addresses instead of using resource flags. This broke SR-IOV as the m64 check in pnv_pci_ioda_fixup_iov_resources() fails. The condition in pnv_pci_window_alignment() also changed to checking only IORESOURCE_MEM_64 instead of both IORESOURCE_MEM_64 and IORESOURCE_PREFETCH. Revert these cases to the previous behaviour, adding a new helper function to do so. This is named pnv_pci_is_m64_flags() to make it clear this function is only looking at resource flags and should not be relied on for non-SRIOV resources. Fixes: 5958d19a143e ("Fix incorrect PE reservation attempt on some 64-bit BARs") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Russell Currey <ruscur@russell.cc> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20Merge tag 'linux-can-fixes-for-4.8-20160919' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-09-19 this is a pull request of one patch for the upcoming linux-4.8 release. The patch by Fabio Estevam fixes the pm handling in the flexcan driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20Merge tag 'usercopy-v4.8-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull usercopy hardening fix from Kees Cook: "Expand the arm64 vmalloc check to include skipping the module space too" * tag 'usercopy-v4.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mm: usercopy: Check for module addresses
2016-09-20fix fault_in_multipages_...() on architectures with no-op access_ok()Al Viro
Switching iov_iter fault-in to multipages variants has exposed an old bug in underlying fault_in_multipages_...(); they break if the range passed to them wraps around. Normally access_ok() done by callers will prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into such a range and they should not point to any valid objects). However, on architectures where userland and kernel live in different MMU contexts (e.g. s390) access_ok() is a no-op and on those a range with a wraparound can reach fault_in_multipages_...(). Since any wraparound means EFAULT there, the fix is trivial - turn those while (uaddr <= end) ... into if (unlikely(uaddr > end)) return -EFAULT; do ... while (uaddr <= end); Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Cc: stable@vger.kernel.org # v3.5+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20mm: usercopy: Check for module addressesLaura Abbott
While running a compile on arm64, I hit a memory exposure usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes) ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:75! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_security ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle iptable_security iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys CPU: 0 PID: 19744 Comm: updatedb Tainted: G W 4.8.0-rc3-threadinfo+ #1 Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016 task: fffffe03df944c00 task.stack: fffffe00d128c000 PC is at __check_object_size+0x70/0x3f0 LR is at __check_object_size+0x70/0x3f0 ... [<fffffc00082b4280>] __check_object_size+0x70/0x3f0 [<fffffc00082cdc30>] filldir64+0x158/0x1a0 [<fffffc0000f327e8>] __fat_readdir+0x4a0/0x558 [fat] [<fffffc0000f328d4>] fat_readdir+0x34/0x40 [fat] [<fffffc00082cd8f8>] iterate_dir+0x190/0x1e0 [<fffffc00082cde58>] SyS_getdents64+0x88/0x120 [<fffffc0008082c70>] el0_svc_naked+0x24/0x28 fffffc0000f3b1a8 is a module address. Modules may have compiled in strings which could get copied to userspace. In this instance, it looks like "." which matches with a size of 1 byte. Extend the is_vmalloc_addr check to be is_vmalloc_or_module_addr to cover all possible cases. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-20x86/dumpstack: Fix show_stack() task pointer regressionJosh Poimboeuf
With the following commit: e18bcccd1a4e ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder") The task pointer argument to show_stack_log_lvl() in show_stack() was inadvertently changed to 'current'. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: byungchul.park@lge.com Cc: fweisbec@gmail.com Cc: keescook@chromium.org Cc: linux-tip-commits@vger.kernel.org Cc: luto@amacapital.net Cc: nilayvaish@gmail.com Cc: rostedt@goodmis.org Cc: tip-bot for Josh Poimboeuf <tipbot@zytor.com> Fixes: e18bcccd1a4e ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder") Link: http://lkml.kernel.org/r/20160920155340.yhewlx7vmgmov5fb@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20Merge tag 'perf-core-for-mingo-20160920' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Support event group view with hierarchy mode in 'perf top' and 'perf report' (Namhyung Kim) e.g.: $ perf record -e '{cycles,instructions}' make $ perf report --hierarchy --stdio ... # Overhead Command / Shared Object / Symbol # ...................... .................................. ... 25.74% 27.18% sh 19.96% 24.14% libc-2.24.so 9.55% 14.64% [.] __strcmp_sse2 1.54% 0.00% [.] __tfind 1.07% 1.13% [.] _int_malloc 0.95% 0.00% [.] __strchr_sse2 0.89% 1.39% [.] __tsearch 0.76% 0.00% [.] strlen - Fix the dwarf regs table for x86_64, adding a missing % to the "%di" register, noticed with a failing 'perf test bpf' (Arnaldo Carvalho de Melo) - Fix handling of mmap parameters in the 'perf trace' beautifier in architectures that don't have the same mappings as x86_64 (Wang Nan) - Handle hugetbl mappings in older systems running new kernels (Wang Nan) - Resolve 'call' operands in 'annotate', that when using /proc/kcore were appearing just as hexadecimal addresses, to function names (Arnaldo Carvalho de Melo) - Fix width computation for srcline sort entry (Jiri Olsa) - Do not ignore call instruction with indirect target in 'annotate' (Ravi Bangoria) - Handle MADV_FREE in the madvise 'trace' beautifier (Wang Nan) - Fix build of 'perf trace' mman beautifier in !x86_64 (Wang Nan) Infrastructure changes: - Add infrastructure for PMU specific configuration, allowing to pass config variables directly to the kernel PMU driver, prefixing those variables with a '@', part of a larger series to support Coresight (Mathieu Poirier) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20clocksource/mips-gic-timer: Stop checking cpu_has_counterPaul Burton
The cpu_has_counter macro indicates whether the current CPU has a working coprocessor 0 count & compare registers, and has no bearing on the GIC. Stop checking it. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Link: http://lkml.kernel.org/r/20160913165644.627-2-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20clocksource/mips-gic-timer: Print an error if IRQ setup failsPaul Burton
We've checked for errors from setup_irq_percpu since commit f95ac8558b88 ("CLOCKSOURCE: mips-gic: Add missing error returns checks") but didn't print an error message in the failure case. This makes it very easy to overlook the GIC timer clock event driver not being registered, since we'll generally just use a different clock event driver if that happens. Print an error if IRQ setup fails in order to make such problems harder to miss (ie. not completely silent). Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Link: http://lkml.kernel.org/r/20160913165644.627-1-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20irqchip/mips-gic: Use for_each_set_bit to iterate over local IRQsPaul Burton
The MIPS GIC driver has previously iterated over bits set in a bitmap representing pending local IRQs by calling find_first_bit, clearing that bit then calling find_first_bit again until all bits are clear. If multiple interrupts are pending then this is wasteful, as find_first_bit will have to loop over the whole bitmap from the start. Use the for_each_set_bit macro which performs exactly what we need here instead. It will use find_next_bit and thus only scan over the relevant part of the bitmap, and it makes the intent of the code clearer. This makes the same change for local interrupts that commit cae750bae4e4 ("irqchip: mips-gic: Use for_each_set_bit to iterate over IRQs") made for shared interrupts. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-mips@linux-mips.org Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/20160913165427.31686-1-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20Merge branch 'irq/urgent' into irq/coreThomas Gleixner
Merge urgent fixes so pending patches for 4.9 can be applied.
2016-09-20irqchip/mips-gic: Fix local interruptsPaul Burton
Since the device hierarchy domain was added by commit c98c1822ee13 ("irqchip/mips-gic: Add device hierarchy domain"), GIC local interrupts have been broken. Users attempting to setup a per-cpu local IRQ, for example the GIC timer clock events code in drivers/clocksource/mips-gic-timer.c, the setup_percpu_irq function would refuse with -EINVAL because the GIC irqchip driver never called irq_set_percpu_devid so the IRQ_PER_CPU_DEVID flag was never set for the IRQ. This happens because irq_set_percpu_devid was being called from the gic_irq_domain_map function which is no longer called. Doing only that runs into further problems because gic_dev_domain_alloc set the struct irq_chip for all interrupts, local or shared, to gic_level_irq_controller despite that only being suitable for shared interrupts. The typical outcome of this is that gic_level_irq_controller callback functions are called for local interrupts, and then hwirq number calculations overflow & the driver ends up attempting to access some invalid register with an address calculated from an invalid hwirq number. Best case scenario is that this then leads to a bus error. This is fixed by abstracting the setup of the hwirq & chip to a new function gic_setup_dev_chip which is used by both the root GIC IRQ domain & the device domain. Finally, decoding local interrupts failed because gic_dev_domain_alloc only called irq_domain_alloc_irqs_parent for shared interrupts. Local ones were therefore never associated with hwirqs in the root GIC IRQ domain and the virq in gic_handle_local_int would always be 0. This is fixed by calling irq_domain_alloc_irqs_parent unconditionally & having gic_irq_domain_alloc handle both local & shared interrupts, which is easy due to the aforementioned abstraction of chip setup into gic_setup_dev_chip. This fixes use of the MIPS GIC timer for clock events, which has been broken since c98c1822ee13 ("irqchip/mips-gic: Add device hierarchy domain") but hadn't been noticed due to a silent fallback to the MIPS coprocessor 0 count/compare clock events device. Fixes: c98c1822ee13 ("irqchip/mips-gic: Add device hierarchy domain") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Jason Cooper <jason@lakedaemon.net> Cc: Qais Yousef <qsyousef@gmail.com> Cc: stable@vger.kernel.org Cc: Marc Zyngier <marc.zyngier@arm.com> Link: http://lkml.kernel.org/r/20160913165335.31389-1-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20fs/proc/kcore.c: Add bounce buffer for ktext dataJiri Olsa
We hit hardened usercopy feature check for kernel text access by reading kcore file: usercopy: kernel memory exposure attempt detected from ffffffff8179a01f (<kernel text>) (4065 bytes) kernel BUG at mm/usercopy.c:75! Bypassing this check for kcore by adding bounce buffer for ktext data. Reported-by: Steve Best <sbest@redhat.com> Fixes: f5509cc18daa ("mm: Hardened usercopy") Suggested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20fs/proc/kcore.c: Make bounce buffer global for readJiri Olsa
Next patch adds bounce buffer for ktext area, so it's convenient to have single bounce buffer for both vmalloc/module and ktext cases. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20perf symbols: Do not open device filesJiri Olsa
The dso__read_binary_type_filename gets the dso's file name to open. We need to check it for regular file before trying to open it, otherwise we might get stuck with device file. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20160920161245.GA8995@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf hists: Factor out hists__reset_column_width()Namhyung Kim
The stdio and tui has same code to reset hpp format column width. Factor it out as a new function. Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160920053025.13989-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf ui/tui: Reset output width for hierarchyNamhyung Kim
When --hierarchy option is used, each entry has its own hpp_list to show the result. But it missed to update width of each column. Before: - 46.29% 48.12% netctl-auto + 31.44% 29.25% [kernel.vmlinux] + 8.52% 11.55% libc-2.22.so + 5.19% 6.91% bash + 10.75% 11.83% wpa_cli + 8.25% 2.23% swapper + 6.45% 5.40% tr + 4.81% 8.09% awk + 4.15% 2.85% firefox + 3.86% 2.53% sh After: - 46.29% 48.12% netctl-auto + 31.44% 29.25% [kernel.vmlinux] + 8.52% 11.55% libc-2.22.so + 5.19% 6.91% bash + 10.75% 11.83% wpa_cli + 8.25% 2.23% swapper + 6.45% 5.40% tr + 4.81% 8.09% awk + 4.15% 2.85% firefox + 3.86% 2.53% sh Committer note: Full testing instructions: 1) Record with an event group: $ perf record -e '{cycles,instructions}' make -j4 2) Use report in hierarchy mode, to get a few expanded trees on the same screen, use --percent-limit: $ perf report --hierarchy --percent-limit 0.5 Samples: 103K of event 'anon group { cycles:u, instructions:u }', Event count (approx.): 57317631725 Overhead Command / Shared Object / Symbol ◆ - 58.89% 55.12% cc1 ▒ - 50.26% 48.10% cc1 ▒ 3.61% 5.13% [.] _cpp_lex_token ▒ 2.58% 0.78% [.] ht_lookup_with_hash ▒ 1.31% 1.30% [.] ggc_internal_alloc ▒ 1.08% 2.25% [.] get_combined_adhoc_loc ▒ 1.01% 1.95% [.] ira_init ▒ 0.96% 1.78% [.] linemap_position_for_column ▒ 0.65% 1.01% [.] cpp_get_token_with_location ▒ - 7.52% 6.58% libc-2.23.so ▒ 1.70% 1.78% [.] _int_malloc ▒ 0.69% 0.75% [.] _int_free ▒ 0.67% 0.42% [.] malloc_consolidate ▒ - 0.58% 0.42% ld-2.23.so ▒ no entry >= 0.50% ▒ - 0.52% 0.03% [kernel.vmlinux] ▒ no entry >= 0.50% ▒ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 1b2dbbf41a0f ("perf hists: Use own hpp_list for hierarchy mode") Link: http://lkml.kernel.org/r/20160920053025.13989-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf annotate: Resolve 'call' operands to function namesArnaldo Carvalho de Melo
Before this patch the '_raw_spin_lock_irqsave' and 'update_rq_clock' operands were appearing just as hexadecimal numbers: update_blocked_averages /proc/kcore │ push %r12 │ push %rbx │ and $0xfffffffffffffff0,%rsp │ sub $0x40,%rsp │ add -0x662cac00(,%rdi,8),%rax │ mov %rax,%rbx │ mov %rax,%rdi │ mov %rax,0x38(%rsp) │ → callq _raw_spin_lock_irqsave │ mov %rbx,%rdi │ mov %rax,0x30(%rsp) │ → callq update_rq_clock │ mov 0x8d0(%rbx),%rax │ lea 0x8d0(%rbx),%r11 To check that all is right one can always use the 'o' hotkey and see the original objdump -dS output, that for this case is: update_blocked_averages /proc/kcore │ffffffff990d5489: push %r12 │ffffffff990d548b: push %rbx │ffffffff990d548c: and $0xfffffffffffffff0,%rsp │ffffffff990d5490: sub $0x40,%rsp │ffffffff990d5494: add -0x662cac00(,%rdi,8),%rax │ffffffff990d549c: mov %rax,%rbx │ffffffff990d549f: mov %rax,%rdi │ffffffff990d54a2: mov %rax,0x38(%rsp) │ffffffff990d54a7: → callq 0xffffffff997eb7a0 │ffffffff990d54ac: mov %rbx,%rdi │ffffffff990d54af: mov %rax,0x30(%rsp) │ffffffff990d54b4: → callq 0xffffffff990c7720 │ffffffff990d54b9: mov 0x8d0(%rbx),%rax │ffffffff990d54c0: lea 0x8d0(%rbx),%r11 Use the 'h' hotkey to see a list of available hotkeys. More work needed to cover operands for other instructions, such as 'mov', that can resolve variable names, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xqgtw9mzmzcjgwkis9kiiv1p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf annotate: Pass the symbol's map/dso to the instruction parsersArnaldo Carvalho de Melo
So that things like: → callq 0xffffffff993e3230 found while disassembling /proc/kcore can be beautified by later patches, that will resolve that address to a function, looking it up in /proc/kallsyms. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p76myuke4j7gplg54amaklxk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf annotate: Do not ignore call instruction with indirect targetRavi Bangoria
Do not ignore call instruction with indirect target when its already identified as a call. This is an extension of commit e8ea1561952b ("perf annotate: Use raw form for register indirect call instructions") to generalize annotation for all instructions with indirect calls. This is needed for certain powerpc call instructions that use address in a register (such as bctrl, btarl, ...). Apart from that, when kcore is used to disassemble function, all call instructions were ignored. This patch will fix it as a side effect by not ignoring them. For example, Before (with kcore): mov %r13,%rdi callq 0xffffffff811a7e70 ^ jmpq 64 mov %gs:0x7ef41a6e(%rip),%al After (with kcore): mov %r13,%rdi > callq 0xffffffff811a7e70 ^ jmpq 64 mov %gs:0x7ef41a6e(%rip),%al Suggested-by: Michael Ellerman <mpe@ellerman.id.au> [Suggested about 'bctrl' instruction] Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1471611578-11255-5-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf hists: Fix width computation for srcline sort entryJiri Olsa
Adding header size to width computation for srcline sort entry, because it's possible to get empty data with ':0' which set width of 2 which is lower than width needed to display column header. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1474290610-23241-62-git-send-email-jolsa@kernel.org [ Added declaration to sort.h ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20cpufreq: Fix up conversion to hotplug state machineSebastian Andrzej Siewior
The function cpufreq_register_driver() returns zero on success and since commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine") erroneously a positive number. Due to the "if (x) assume_error" construct all callers assumed an error and as a consequence the cpu freq kworker crashes with a NULL pointer dereference. Reset the return value back to zero in the success case. Fixes: 27622b061eb4 ("cpufreq: Convert to hotplug state machine") Reported-by: Borislav Petkov <bp@alien8.de> Reported-and-tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: peterz@infradead.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/20160920145628.lp2bmq72ip3oiash@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20Merge tag 'efi-next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core Pull EFI fix from Matt Fleming: * Fix a boot crash reported by Mike Galbraith and Mike Krinkin. The new EFI memory map reservation code didn't align reservations to EFI_PAGE_SIZE boundaries causing bogus regions to be inserted into the global EFI memory map (Matt Fleming) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20Merge branch 'efi/urgent' into efi/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>