summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-12-15Merge tag 'batadv-net-for-davem-20171215' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - Initialize the fragment headers, by Sven Eckelmann - Fix a NULL check in BATMAN V, by Sven Eckelmann - Fix kernel doc for the time_setup() change, by Sven Eckelmann - Use the right lock in BATMAN IV OGM Update, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: dsa: bcm_sf2: Update compatible string for 7278B0Florian Fainelli
Update the compatible string and Device Tree binding document for 7278B0. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15Merge branch 'hnx3-vf'David S. Miller
Salil Mehta says: ==================== Hisilicon Network Subsystem 3 VF Ethernet Driver This patch-set contains the support of the HNS3 (Hisilicon Network Subsystem 3) Virtual Function Ethernet driver for hip08 family of SoCs. The Physical Function driver is already part of the Linux mainline. This VF driver has its Hardware Compatibility Layer and has commom/unified ENET layer/client/ethtool code with the PF driver. It also has support of mailbox to communicate with the HNS3 PF driver. The basic architecture of VF driver is derivative of the PF driver. Just like PF driver, this driver is also PCI Express based. This driver is the ongoing development work and HNS3 VF Ethernet driver would be incrementally enhanced with more new features. High Level Architecture: [ Ethtool ] | [ Ethernet Client ] ... [ RoCE Client ] | | [ HNAE Device ] |________ | | | --------------------------------------------- | | [ HNAE3 Framework (Register/unregister) ] | | --------------------------------------------- | | | [ VF HCLGE Layer ] | | | | | | | | | | | [ VF Mailbox (To PF via IMP) ] | | | | [ IMP command Interface ] [ IMP command Interface ] | | | | (A B O V E R U N S O N G U E S T S Y S T E M) ------------------------------------------------------------- Q E M U / V F I O / K V M (on Host System) ------------------------------------------------------------- HIP08 H A R D W A R E (limited to VF by SMMU) [ IMP/Mgmt Processor (hardware common to system/cmd based) ] Fig 1. HNS3 Virtual Function Driver [ dcbnl ] [ Ethtool ] | | [ Ethernet Client ] [ ODP/UIO Client ] . . .[ RoCE Client ] |_____________________| | | _________| [ HNAE Device ] | | | | | --------------------------------------------- | | [ HNAE3 Framework (Register/unregister) ] | | --------------------------------------------- | | | [ HCLGE Layer ] | ________________|_________________ | | | | | [ DCB ] | | | | | | | [ Scheduler/Shaper ] [ MDIO ] [ PF Mailbox ] | | | | | |________________|_________________| | | | [ IMP command Interface ] [ IMP command Interface ] ---------------------------------------------------------------- HIP08 H A R D W A R E [ IMP/Mgmt Processor (hardware common to system/cmd based) ] Fig 2. Existing HNS3 PF Driver (added with mailbox) Change Log Summary: Patch V4: Addressed SPDX related comment by Philippe Ombredanne Patch V3: Addressed SPDX change requested by Philippe Ombredanne Patch V2: 1. Addressed some comments by David Miller. 2. Addressed some internal comments on various patches Patch V1: Initial Submit ==================== Acked-by: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add mailbox interrupt handling to PF driverSalil Mehta
All PF mailbox events are conveyed through a common interrupt (vector 0). This interrupt vector is shared by reset and mailbox. This patch adds the handling of mailbox interrupt event and its deferred processing in context to a separate mailbox task. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Change PF to add ring-vect binding & resetQ to mailboxSalil Mehta
This patch is required to support ring-vector binding and reset of TQPs requested by the VF driver to the PF driver. Mailbox handler is added with corresponding VF commands/messages to handle the request. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add mailbox support to PF driverSalil Mehta
Command queue provides the provision of Mailbox command which can be used for communication between PF and VF. PF handles messages from various VFs for fetching various information like, queue, vlan, link status related etc. It also handles the request from various VFs to perform certain privileged operations. This patch adds the support of a message handler for handling such various command requests from VF. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoCSalil Mehta
Most of the NAPI handling interface, skb buffer management, management of the RX/TX descriptors, ethool interface etc. has quite a bit of code which is common to VF and PF driver. This patch makes the exisitng PF's HNS3 ENET driver as the common ENET driver for both Virtual & Physical Function. This will help in reduction of redundancy and better management of code. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add HNS3 VF driver to kernel build frameworkSalil Mehta
This patch introduces the new Makefiles and updates existing Makefiles required to build the HNS3 Virtual Function driver. This also updates the Kconfig for introduction of new menuconfig entries related to VF driver. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) SupportSalil Mehta
This patch adds the support of hardware compatibiltiy layer to the HNS3 VF Driver. This layer implements various {set|get} operations over MAC address for a virtual port, RSS related configuration, fetches the link status info from PF, does various VLAN related configuration over the virtual port, queries the statistics from the hardware etc. This layer can directly interact with hardware through the IMP(Integrated Mangement Processor) interface or can use mailbox to interact with the PF driver. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add mailbox support to VF driverSalil Mehta
This patch adds the support of the mailbox to the VF driver. The mailbox shall be used as an interface to communicate with the PF driver for various purposes like {set|get} MAC related operations, reset, link status etc. The mailbox supports both synchronous and asynchronous command send to PF driver. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interfaceSalil Mehta
This patch adds support of command interface for communication with the IMP(Integrated Management Processor) for HNS3 Virtual Function Driver. Each VF has support of CQP(Command Queue Pair) ring interface. Each CQP consis of send queue CSQ and receive queue CRQ. There are various commands a VF may support, like to query frimware version, TQP management, statistics, interrupt related, mailbox etc. This also contains code to initialize the command queue, manage the command queue descriptors and Rx/Tx protocol with the command processor in the form of various commands/results and acknowledgements. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15mlxsw: spectrum: Disable MAC learning for ovs portYuval Mintz
Learning is currently enabled for ports which are OVS slaves - even though OVS doesn't need this indication. Since we're not associating a fid with the port, HW would continuously notify driver of learned [& aged] MACs which would be logged as errors. Fixes: 2b94e58df58c ("mlxsw: spectrum: Allow ports to work under OVS master") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15Merge branch 'dsa-MT7530-vlan'David S. Miller
Sean Wang says: ==================== add VLAN support to DSA MT7530 Changes sicne v2: update to the latest code base from net-next and fix up all building errors with -Werror. Changes since v1: - fix up the typo - prefer ordering declarations longest to shortest - update that vlan_prepare callback should not change any state - use lower case letter for function naming The patchset extends DSA MT7530 to VLAN support through filling required callbacks in patch 1 and merging the special tag with VLAN tag in patch 2 for allowing that the hardware can handle these packets with VID from the CPU port. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: dsa: mediatek: update MAINTAINERS entry with MediaTek switch driverSean Wang
I work for MediaTek and maintain SoC targeting to home gateway and also will keep extending and testing the function from MediaTek switch. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: dsa: mediatek: combine MediaTek tag with VLAN tagSean Wang
In order to let MT7530 switch can recognize well those egress packets having both special tag and VLAN tag, the information about the special tag should be carried on the existing VLAN tag. On the other hand, it's unnecessary for extra handling for ingress packets when VLAN tag is present since it is able to put the VLAN tag after the special tag and then follow the existing way to parse. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15net: dsa: mediatek: add VLAN support for MT7530Sean Wang
MT7530 can treat each port as either VLAN-unaware port or VLAN-aware port through the implementation of port matrix mode or port security mode on the ingress port, respectively. On one hand, Each port has been acting as the VLAN-unaware one whenever the device is created in the initial or certain port joins or leaves into/from the bridge at the runtime. On the other hand, the patch just filling the required callbacks for VLAN operations is achieved via extending the port to be into port security mode when the port is configured as VLAN-aware port. Which mode can make the port be able to recognize VID from incoming packets and look up VLAN table to validate and judge which port it should be going to. And the range for VID from 1 to 4094 is valid for the hardware. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15sched/rt: Do not pull from current CPU if only one CPU to pullSteven Rostedt
Daniel Wagner reported a crash on the BeagleBone Black SoC. This is a single CPU architecture, and does not have a functional arch_send_call_function_single_ipi() implementation which can crash the kernel if that is called. As it only has one CPU, it shouldn't be called, but if the kernel is compiled for SMP, the push/pull RT scheduling logic now calls it for irq_work if the one CPU is overloaded, it can use that function to call itself and crash the kernel. Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system only has a single CPU. But SCHED_FEAT is a constant if sched debugging is turned off. Another fix can also be used, and this should also help with normal SMP machines. That is, do not initiate the pull code if there's only one RT overloaded CPU, and that CPU happens to be the current CPU that is scheduling in a lower priority task. Even on a system with many CPUs, if there's many RT tasks waiting to run on a single CPU, and that CPU schedules in another RT task of lower priority, it will initiate the PULL logic in case there's a higher priority RT task on another CPU that is waiting to run. But if there is no other CPU with waiting RT tasks, it will initiate the RT pull logic on itself (as it still has RT tasks waiting to run). This is a wasted effort. Not only does this help with SMP code where the current CPU is the only one with RT overloaded tasks, it should also solve the issue that Daniel encountered, because it will prevent the PULL logic from executing, as there's only one CPU on the system, and the check added here will cause it to exit the RT pull code. Reported-by: Daniel Wagner <wagi@monom.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Cc: stable@vger.kernel.org Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15Merge branch 'bpf-nfp-jit-adjust-head-support'Daniel Borkmann
Jakub Kicinski says: ==================== This small set adds support for bpf_xdp_adjust_head() to the offload. Since we have access to unmodified BPF bytecode translating calls is pretty trivial. First part of the series adds handling of BPF capabilities included in the FW in TLV format. The last two patches add adjust head support in the nfp verifier and jit, and a small optimization in case we can guarantee the constant adjustment will always meet adjustment constaints. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15nfp: bpf: optimize the adjust_head calls in trivial casesJakub Kicinski
If the program is simple and has only one adjust head call with constant parameters, we can check that the call will always succeed at translation time. We need to track the location of the call and make sure parameters are always the same. We also have to check the parameters against datapath constraints and ETH_HLEN. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15nfp: bpf: add basic support for adjust head callJakub Kicinski
Support bpf_xdp_adjust_head(). We need to check whether the packet offset after adjustment is within datapath's limits. We also check if the frame is at least ETH_HLEN long (similar to the kernel implementation). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15nfp: bpf: prepare for call supportJakub Kicinski
Add skeleton of verifier checks and translation handler for call instructions. Make sure jump target resolution will not treat them as jumps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15nfp: bpf: prepare for parsing BPF FW capabilitiesJakub Kicinski
BPF FW creates a run time symbol called bpf_capabilities which contains TLV-formatted capability information. Allocate app private structure to store parsed capabilities and add a skeleton of parsing logic. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15nfp: add nfp_cpp_area_size() accessorJakub Kicinski
Allow users outside of core reading area sizes. This was not needed previously because whatever entity created the area would usually know what size it asked for. The nfp_rtsym_map() helper, however, will allocate the area based on the size of an RT-symbol with given name. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-15tools/headers: Synchronize kernel <-> tooling headersIngo Molnar
Two kernel headers got modified recently, which are used by tooling as well: tools/include/uapi/linux/kvm.h arch/x86/include/asm/cpufeatures.h None of those changes have an effect on tooling, so do a plain copy. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15objtool: Resync objtool's instruction decoder source code copy with the ↵Ingo Molnar
kernel's latest version This fixes the following warning: warning: objtool: x86 instruction decoder differs from kernel Note that there are cleanups queued up for v4.16 that will make this warning more informative and will make the syncing easier as well. Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15x86/decoder: Fix and update the opcodes mapRandy Dunlap
Update x86-opcode-map.txt based on the October 2017 Intel SDM publication. Fix INVPID to INVVPID. Add UD0 and UD1 instruction opcodes. Also sync the objtool and perf tooling copies of this file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15x86/power: Make restore_processor_context() saneAndy Lutomirski
My previous attempt to fix a couple of bugs in __restore_processor_context(): 5b06bbcfc2c6 ("x86/power: Fix some ordering bugs in __restore_processor_context()") ... introduced yet another bug, breaking suspend-resume. Rather than trying to come up with a minimal fix, let's try to clean it up for real. This patch fixes quite a few things: - The old code saved a nonsensical subset of segment registers. The only registers that need to be saved are those that contain userspace state or those that can't be trivially restored without percpu access working. (On x86_32, we can restore percpu access by writing __KERNEL_PERCPU to %fs. On x86_64, it's easier to save and restore the kernel's GSBASE.) With this patch, we restore hardcoded values to the kernel state where applicable and explicitly restore the user state after fixing all the descriptor tables. - We used to use an unholy mix of inline asm and C helpers for segment register access. Let's get rid of the inline asm. This fixes the reported s2ram hangs and make the code all around more logical. Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reported-by: Pavel Machek <pavel@ucw.cz> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Fixes: 5b06bbcfc2c6 ("x86/power: Fix some ordering bugs in __restore_processor_context()") Link: http://lkml.kernel.org/r/398ee68e5c0f766425a7b746becfc810840770ff.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15x86/power/32: Move SYSENTER MSR restoration to fix_processor_context()Andy Lutomirski
x86_64 restores system call MSRs in fix_processor_context(), and x86_32 restored them along with segment registers. The 64-bit variant makes more sense, so move the 32-bit code to match the 64-bit code. No side effects are expected to runtime behavior. Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Link: http://lkml.kernel.org/r/65158f8d7ee64dd6bbc6c1c83b3b34aaa854e3ae.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15x86/power/64: Use struct desc_ptr for the IDT in struct saved_contextAndy Lutomirski
x86_64's saved_context nonsensically used separate idt_limit and idt_base fields and then cast &idt_limit to struct desc_ptr *. This was correct (with -fno-strict-aliasing), but it's confusing, served no purpose, and required #ifdeffery. Simplify this by using struct desc_ptr directly. No change in functionality. Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Link: http://lkml.kernel.org/r/967909ce38d341b01d45eff53e278e2728a3a93a.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-14Merge tag 'pm-4.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This fixes an issue in two recent commits that may cause pm_runtime_enable() to be called for too many times for some devices during the "thaw" transition belonging to hibernation" * tag 'pm-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume()
2017-12-14Merge tag 'trace-v4.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various fix-ups: - comment fixes - build fix - better memory alloction (don't use NR_CPUS) - configuration fix - build warning fix - enhanced callback parameter (to simplify users of trace hooks) - give up on stack tracing when RCU isn't watching (it's a lost cause)" * tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Have stack trace not record if RCU is not watching tracing: Pass export pointer as argument to ->write() ring-buffer: Remove unused function __rb_data_page_index() tracing: make PREEMPTIRQ_EVENTS depend on TRACING tracing: Allocate mask_str buffer dynamically tracing: always define trace_{irq,preempt}_{enable_disable} tracing: Fix code comments in trace.c
2017-12-14tracing: Have stack trace not record if RCU is not watchingSteven Rostedt (VMware)
The stack tracer records a stack dump whenever it sees a stack usage that is more than what it ever saw before. This can happen at any function that is being traced. If it happens when the CPU is going idle (or other strange locations), RCU may not be watching, and in this case, the recording of the stack trace will trigger a warning. There's been lots of efforts to make hacks to allow stack tracing to proceed even if RCU is not watching, but this only causes more issues to appear. Simply do not trace a stack if RCU is not watching. It probably isn't a bad stack anyway. Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-14Merge tag 'pci-v4.15-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - add a pci_get_domain_bus_and_slot() stub for the CONFIG_PCI=n case to avoid build breakage in the v4.16 merge window if a pci_get_bus_and_slot() -> pci_get_domain_bus_and_slot() patch gets merged before the PCI tree (Randy Dunlap) - fix an AMD boot regression in the 64bit BAR support added in v4.15 (Christian König) - fix an R-Car use-after-free that causes a crash if no PCIe card is present (Geert Uytterhoeven) * tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Fix use-after-free in probe error path x86/PCI: Only enable a 64bit BAR on single-socket AMD Family 15h x86/PCI: Fix infinite loop in search for 64bit BAR placement PCI: Add pci_get_domain_bus_and_slot() stub
2017-12-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: arch: define weak abort() mm, oom_reaper: fix memory corruption kernel: make groups_sort calling a responsibility group_info allocators mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' tools/slabinfo-gnuplot: force to use bash shell kcov: fix comparison callback signature mm/slab.c: do not hash pointers when debugging slab mm/page_alloc.c: avoid excessive IRQ disabled times in free_unref_page_list() mm/memory.c: mark wp_huge_pmd() inline to prevent build failure scripts/faddr2line: fix CROSS_COMPILE unset error Documentation/vm/zswap.txt: update with same-value filled page feature exec: avoid gcc-8 warning for get_task_comm autofs: fix careless error in recent commit string.h: workaround for increased stack usage mm/kmemleak.c: make cond_resched() rate-limiting more efficient lib/rbtree,drm/mm: add rbtree_replace_node_cached() include/linux/idr.h: add #include <linux/bug.h>
2017-12-14arch: define weak abort()Sudip Mukherjee
gcc toggle -fisolate-erroneous-paths-dereference (default at -O2 onwards) isolates faulty code paths such as null pointer access, divide by zero etc. If gcc port doesnt implement __builtin_trap, an abort() is generated which causes kernel link error. In this case, gcc is generating abort due to 'divide by zero' in lib/mpi/mpih-div.c. Currently 'frv' and 'arc' are failing. Previously other arch was also broken like m32r was fixed by commit d22e3d69ee1a ("m32r: fix build failure"). Let's define this weak function which is common for all arch and fix the problem permanently. We can even remove the arch specific 'abort' after this is done. Link: http://lkml.kernel.org/r/1513118956-8718-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm, oom_reaper: fix memory corruptionMichal Hocko
David Rientjes has reported the following memory corruption while the oom reaper tries to unmap the victims address space BUG: Bad page map in process oom_reaper pte:6353826300000000 pmd:00000000 addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping: (null) index:7f50cab1d file: (null) fault: (null) mmap: (null) readpage: (null) CPU: 2 PID: 1001 Comm: oom_reaper Call Trace: unmap_page_range+0x1068/0x1130 __oom_reap_task_mm+0xd5/0x16b oom_reaper+0xff/0x14c kthread+0xc1/0xe0 Tetsuo Handa has noticed that the synchronization inside exit_mmap is insufficient. We only synchronize with the oom reaper if tsk_is_oom_victim which is not true if the final __mmput is called from a different context than the oom victim exit path. This can trivially happen from context of any task which has grabbed mm reference (e.g. to read /proc/<pid>/ file which requires mm etc.). The race would look like this oom_reaper oom_victim task mmget_not_zero do_exit mmput __oom_reap_task_mm mmput __mmput exit_mmap remove_vma unmap_page_range Fix this issue by providing a new mm_is_oom_victim() helper which operates on the mm struct rather than a task. Any context which operates on a remote mm struct should use this helper in place of tsk_is_oom_victim. The flag is set in mark_oom_victim and never cleared so it is stable in the exit_mmap path. Debugged by Tetsuo Handa. Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Andrea Argangeli <andrea@kernel.org> Cc: <stable@vger.kernel.org> [4.14] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14kernel: make groups_sort calling a responsibility group_info allocatorsThiago Rafael Becker
In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'Christophe JAILLET
A semaphore is acquired before this check, so we must release it before leaving. Link: http://lkml.kernel.org/r/20171211211009.4971-1-christophe.jaillet@wanadoo.fr Fixes: b7f0554a56f2 ("mm: fail get_vaddr_frames() for filesystem-dax mappings") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Sterba <dsterba@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14tools/slabinfo-gnuplot: force to use bash shellLiu, Changcheng
On some linux distributions, the default link of sh is dash which deoesn't support split array like "${var//,/ }" It's better to force to use bash shell directly. Link: http://lkml.kernel.org/r/20171208093751.GA175471@sofia Signed-off-by: Liu Changcheng <changcheng.liu@intel.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14kcov: fix comparison callback signatureDmitry Vyukov
Fix a silly copy-paste bug. We truncated u32 args to u16. Link: http://lkml.kernel.org/r/20171207101134.107168-1-dvyukov@google.com Fixes: ded97d2c2b2c ("kcov: support comparison operands collection") Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: syzkaller@googlegroups.com Cc: Alexander Potapenko <glider@google.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm/slab.c: do not hash pointers when debugging slabGeert Uytterhoeven
If CONFIG_DEBUG_SLAB/CONFIG_DEBUG_SLAB_LEAK are enabled, the slab code prints extra debug information when e.g. corruption is detected. This includes pointers, which are not very useful when hashed. Fix this by using %px to print unhashed pointers instead where it makes sense, and by removing the printing of a last user pointer referring to code. [geert+renesas@glider.be: v2] Link: http://lkml.kernel.org/r/1513179267-2509-1-git-send-email-geert+renesas@glider.be Link: http://lkml.kernel.org/r/1512641861-5113-1-git-send-email-geert+renesas@glider.be Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Tobin C . Harding" <me@tobin.cc> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm/page_alloc.c: avoid excessive IRQ disabled times in free_unref_page_list()Lucas Stach
Since commit 9cca35d42eb6 ("mm, page_alloc: enable/disable IRQs once when freeing a list of pages") we see excessive IRQ disabled times of up to 25ms on an embedded ARM system (tracing overhead included). This is due to graphics buffers being freed back to the system via release_pages(). Graphics buffers can be huge, so it's not hard to hit cases where the list of pages to free has 2048 entries. Disabling IRQs while freeing all those pages is clearly not a good idea. Introduce a batch limit, which allows IRQ servicing once every few pages. The batch count is the same as used in other parts of the MM subsystem when dealing with IRQ disabled regions. Link: http://lkml.kernel.org/r/20171207170314.4419-1-l.stach@pengutronix.de Fixes: 9cca35d42eb6 ("mm, page_alloc: enable/disable IRQs once when freeing a list of pages") Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm/memory.c: mark wp_huge_pmd() inline to prevent build failureGeert Uytterhoeven
With gcc 4.1.2: mm/memory.o: In function `wp_huge_pmd': memory.c:(.text+0x9b4): undefined reference to `do_huge_pmd_wp_page' Interestingly, wp_huge_pmd() is emitted in the assembler output, but never called. Apparently replacing the call to pmd_write() in __handle_mm_fault() by a call to the more complex pmd_access_permitted() reduced the ability of the compiler to remove unused code. Fix this by marking wp_huge_pmd() inline, like was done in commit 91a90140f998 ("mm/memory.c: mark create_huge_pmd() inline to prevent build failure") for a similar problem. [akpm@linux-foundation.org: add comment] Link: http://lkml.kernel.org/r/1512335500-10889-1-git-send-email-geert@linux-m68k.org Fixes: c7da82b894e9eef6 ("mm: replace pmd_write with pmd_access_permitted in fault + gup paths") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14scripts/faddr2line: fix CROSS_COMPILE unset errorLiu, Changcheng
faddr2line hit var unbound error when CROSS_COMPILE isn't set since nounset option is set in bash script. Link: http://lkml.kernel.org/r/20171206013022.GA83929@sofia Fixes: 95a879825419 ("scripts/faddr2line: extend usage on generic arch") Signed-off-by: Liu Changcheng <changcheng.liu@intel.com> Reported-by: Richard Weinberger <richard.weinberger@gmail.com> Reviewed-by: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: NeilBrown <neilb@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14Documentation/vm/zswap.txt: update with same-value filled page featureSrividya Desireddy
Update zswap document with details on same-value filled pages identification feature. The usage of zswap.same_filled_pages_enabled module parameter is explained. Link: http://lkml.kernel.org/r/20171206114852epcms5p6973b02a9f455d5d3c765eafda0fe2631@epcms5p6 Signed-off-by: Srividya Desireddy <srividya.dr@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14exec: avoid gcc-8 warning for get_task_commArnd Bergmann
gcc-8 warns about using strncpy() with the source size as the limit: fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess] This is indeed slightly suspicious, as it protects us from source arguments without NUL-termination, but does not guarantee that the destination is terminated. This keeps the strncpy() to ensure we have properly padded target buffer, but ensures that we use the correct length, by passing the actual length of the destination buffer as well as adding a build-time check to ensure it is exactly TASK_COMM_LEN. There are only 23 callsites which I all reviewed to ensure this is currently the case. We could get away with doing only the check or passing the right length, but it doesn't hurt to do both. Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Aleksa Sarai <asarai@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14autofs: fix careless error in recent commitNeilBrown
Commit ecc0c469f277 ("autofs: don't fail mount for transient error") was meant to replace an 'if' with a 'switch', but instead added the 'switch' leaving the case in place. Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name Fixes: ecc0c469f277 ("autofs: don't fail mount for transient error") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: NeilBrown <neilb@suse.com> Cc: Ian Kent <raven@themaw.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14string.h: workaround for increased stack usageArnd Bergmann
The hardened strlen() function causes rather large stack usage in at least one file in the kernel, in particular when CONFIG_KASAN is enabled: drivers/media/usb/em28xx/em28xx-dvb.c: In function 'em28xx_dvb_init': drivers/media/usb/em28xx/em28xx-dvb.c:2062:1: error: the frame size of 3256 bytes is larger than 204 bytes [-Werror=frame-larger-than=] Analyzing this problem led to the discovery that gcc fails to merge the stack slots for the i2c_board_info[] structures after we strlcpy() into them, due to the 'noreturn' attribute on the source string length check. I reported this as a gcc bug, but it is unlikely to get fixed for gcc-8, since it is relatively easy to work around, and it gets triggered rarely. An earlier workaround I did added an empty inline assembly statement before the call to fortify_panic(), which works surprisingly well, but is really ugly and unintuitive. This is a new approach to the same problem, this time addressing it by not calling the 'extern __real_strnlen()' function for string constants where __builtin_strlen() is a compile-time constant and therefore known to be safe. We do this by checking if the last character in the string is a compile-time constant '\0'. If it is, we can assume that strlen() of the string is also constant. As a side-effect, this should also improve the object code output for any other call of strlen() on a string constant. [akpm@linux-foundation.org: add comment] Link: http://lkml.kernel.org/r/20171205215143.3085755-1-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Link: https://patchwork.kernel.org/patch/9980413/ Link: https://patchwork.kernel.org/patch/9974047/ Fixes: 6974f0c4555 ("include/linux/string.h: add the option of fortified string.h functions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Martin Wilck <mwilck@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14mm/kmemleak.c: make cond_resched() rate-limiting more efficientAndrew Morton
Commit bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()") tries to rate-limit the frequency of cond_resched() calls, but does it in a way which might incur an expensive division operation in the inner loop. Simplify this. Fixes: bde5f6bc68db5 ("kmemleak: add scheduling point to kmemleak_scan()") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14lib/rbtree,drm/mm: add rbtree_replace_node_cached()Chris Wilson
Add a variant of rbtree_replace_node() that maintains the leftmost cache of struct rbtree_root_cached when replacing nodes within the rbtree. As drm_mm is the only rb_replace_node() being used on an interval tree, the mistake looks fairly self-contained. Furthermore the only user of drm_mm_replace_node() is its testsuite... Testcase: igt/drm_mm/replace Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>