linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-02-17	nospec: Kill array_index_nospec_mask_check()	Dan Williams
	There are multiple problems with the dynamic sanity checking in array_index_nospec_mask_check(): * It causes unnecessary overhead in the 32-bit case since integer sized @index values will no longer cause the check to be compiled away like in the 64-bit case. * In the 32-bit case it may trigger with user controllable input when the expectation is that should only trigger during development of new kernel enabling. * The macro reuses the input parameter in multiple locations which is broken if someone passes an expression like 'index++' to array_index_nospec(). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/151881604278.17395.6605847763178076520.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16	workqueue: Allow retrieval of current task's work struct	Lukas Wunner
	Introduce a helper to retrieve the current task's work struct if it is a workqueue worker. This allows us to fix a long-standing deadlock in several DRM drivers wherein the ->runtime_suspend callback waits for a specific worker to finish and that worker in turn calls a function which waits for runtime suspend to finish. That function is invoked from multiple call sites and waiting for runtime suspend to finish is the correct thing to do except if it's executing in the context of the worker. Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/2d8f603074131eb87e588d2b803a71765bd3a2fd.1518338788.git.lukas@wunner.de
2018-02-16	skbuff: export mm_[un]account_pinned_pages for other modules	Sowmini Varadhan
	RDS would like to use the helper functions for managing pinned pages added by Commit a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16	skbuff: Fix comment mis-spelling.	David S. Miller
	'peform' --> 'perform' Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16	Merge tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping	Linus Torvalds
	Pull dma-mapping fixes from Christoph Hellwig: "A few dma-mapping fixes for the fallout from the changes in rc1" * tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping: powerpc/macio: set a proper dma_coherent_mask dma-mapping: fix a comment typo dma-direct: comment the dma_direct_free calling convention dma-direct: mark as is_phys ia64: fix build failure with CONFIG_SWIOTLB
2018-02-16	regulator: da9211: Pass descriptors instead of GPIO numbers	Linus Walleij
	This augments the DA9211 regulator driver to fetch its GPIO descriptors directly from the device tree using the newly exported devm_get_gpiod_from_child(). Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16	regulator: da9055: Pass descriptor instead of GPIO number	Linus Walleij
	When setting up a fixed regulator on the DA9055, pass a descriptor instead of a global GPIO number. This facility is not used in the kernel so we can easily just say that this should be a descriptor if/when put to use. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16	regulator: core: Support passing an initialized GPIO enable descriptor	Linus Walleij
	We are currently passing a GPIO number from the global GPIO numberspace into the regulator core for handling enable GPIOs. This is not good since it ties into the global GPIO numberspace and uses gpio_to_desc() to overcome this. Start supporting passing an already initialized GPIO descriptor to the core instead: leaf drivers pick their descriptors, associated directly with the device node (or from ACPI or from a board descriptor table) and use that directly without any roundtrip over the global GPIO numberspace. This looks messy since it adds a bunch of extra code in the core, but at the end of the patch series we will delete the handling of the GPIO number and only deal with descriptors so things end up neat. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-16	PCI: Remove pci_get_bus_and_slot() function	Sinan Kaya
	pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as where a PCI device is present. This restricts the device drivers to be reused for other domain numbers. Now that all users of pci_get_bus_and_slot() switched to pci_get_domain_bus_and_slot(), it is now safe to remove this function. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-02-16	sched/wait: add wait_event_idle() functions.	NeilBrown
	The new TASK_IDLE state (TASK_UNINTERRUPTIBLE \| __TASK_NOLOAD) is not much used. One way to make it easier to use is to add wait_event*() family functions that make use of it. This patch adds: wait_event_idle() wait_event_idle_timeout() wait_event_idle_exclusive() wait_event_idle_exclusive_timeout() This set was chosen because lustre needs them before it can discard its own l_wait_event() macro. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Simmons <jsimmons@infradead.org> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Patrick Farrell <paf@cray.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16	HID: core: Fix size as type u32	Aaron Ma
	When size is negative, calling memset will make segment fault. Declare the size as type u32 to keep memset safe. size in struct hid_report is unsigned, fix return type of hid_report_len to u32. Cc: stable@vger.kernel.org Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-02-16	cpumask: Make for_each_cpu_wrap() available on UP as well	Michael Kelley
	for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Michael Kelley <mhkelley@outlook.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16	mtd: nand: Add core infrastructure to deal with NAND devices	Boris Brezillon
	Add an intermediate layer to abstract NAND device interface so that some logic can be shared between SPI NANDs, parallel/raw NANDs, OneNANDs, ... Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-02-16	mtd: nand: Get rid of comments giving the file path inside the file itself	Boris Brezillon
	Some files add a comment giving the path of the file inside the Linux tree, which is pretty useless since the reader had to find the file to open it. Getting rid of these comments will also allow us to easily move these files around when needed. Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-02-16	headers: Drop two #included headers from <linux/interrupt.h>	Randy Dunlap
	It seems that <linux/interrupt.h> does not need <linux/linkage.h> nor <linux/preempt.h>. 8 kernels builds are successful without these 2 headers (allmodconfig, allyesconfig, allnoconfig, and tinyconfig on both i386 and x86_64). <linux/interrupt.h> is #included 3875 times in 4.16-rc1, so this reduces #include processing of these 2 files by a total of 7750 times. Since I only tested x86 builds, this needs to be tested on other $ARCHes as well. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/b24b9ec8-4970-65f5-759a-911d4ba2fcf0@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15	Merge tag 'acpi-4.16-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a system resume regression from the 4.13 cycle, clean up device table handling in the ACPI core, update sysfs ABI documentation of a couple of drivers and add an expected switch fall-through marker to the SPCR table parsing code. Specifics: - Revert a problematic EC driver change from the 4.13 cycle that introduced a system resume regression on Thinkpad X240 (Rafael Wysocki). - Clean up device tables handling in the ACPI core and the related part of the device properties framework (Andy Shevchenko). - Update the sysfs ABI documentatio of the dock and the INT3407 special device drivers (Aishwarya Pant). - Add an expected switch fall-through marker to the SPCR table parsing code (Gustavo Silva)" * tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: dock: document sysfs interface ACPI / DPTF: Document dptf_power sysfs atttributes device property: Constify device_get_match_data() ACPI / bus: Rename acpi_get_match_data() to acpi_device_get_match_data() ACPI / bus: Remove checks in acpi_get_match_data() ACPI / bus: Do not traverse through non-existed device table ACPI: SPCR: Mark expected switch fall-through in acpi_parse_spcr ACPI / EC: Restore polling during noirq suspend/resume phases
2018-02-15	Merge tag 'pm-4.16-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a recently introduced build issue related to cpuidle and two bugs in the PM core, update cpuidle documentation and clean up memory allocations in the operating performance points (OPP) framework. Specifics: - Fix a recently introduced build issue related to cpuidle by covering all of the relevant combinations of Kconfig options in its header (Rafael Wysocki). - Add missing invocation of pm_runtime_drop_link() to the !CONFIG_SRCU variant of __device_link_del() (Lukas Wunner). - Fix unbalanced IRQ enable in the wakeup interrupts framework (Tony Lindgren). - Update cpuidle sysfs ABI documentation (Aishwarya Pant). - Use GFP_KERNEL instead of GFP_ATOMIC for allocating memory in dev_pm_opp_init_cpufreq_table() (Jia-Ju Bai)" * tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: cpuidle: Fix cpuidle_poll_state_init() prototype PM / runtime: Update links_count also if !CONFIG_SRCU PM / wakeirq: Fix unbalanced IRQ enable for wakeirq Documentation/ABI: update cpuidle sysfs documentation opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15	net: Make extern and export get_net_ns()	Kirill Tkhai
	This function will be used to obtain net of tun device. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15	Merge branch 'locking-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "This contains two qspinlock fixes and three documentation and comment fixes" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/semaphore: Update the file path in documentation locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit() locking/qspinlock: Ensure node->count is updated before initialising node locking/qspinlock: Ensure node is initialised before updating prev->next Documentation/locking/mutex-design: Update to reflect latest changes
2018-02-15	block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS	Minwoo Im
	Update comment typo _consisitent_ to _consistent_ from following commit. commit 0206319fdfee ("blk-mq: Fix poll_stat for new size-based bucketing.") Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-15	Merge branches 'pm-cpuidle' and 'pm-opp'	Rafael J. Wysocki
	* pm-cpuidle: PM: cpuidle: Fix cpuidle_poll_state_init() prototype Documentation/ABI: update cpuidle sysfs documentation * pm-opp: opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15	IB/mlx5: Implement fragmented completion queue (CQ)	Yonatan Cohen
	The current implementation of create CQ requires contiguous memory, such requirement is problematic once the memory is fragmented or the system is low in memory, it causes for failures in dma_zalloc_coherent(). This patch implements new scheme of fragmented CQ to overcome this issue by introducing new type: 'struct mlx5_frag_buf_ctrl' to allocate fragmented buffers, rather than contiguous ones. Base the Completion Queues (CQs) on this new fragmented buffer. It fixes following crashes: kworker/29:0: page allocation failure: order:6, mode:0x80d0 CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0 Workqueue: ib_cm cm_work_handler [ib_cm] Call Trace: [<>] dump_stack+0x19/0x1b [<>] warn_alloc_failed+0x110/0x180 [<>] __alloc_pages_slowpath+0x6b7/0x725 [<>] __alloc_pages_nodemask+0x405/0x420 [<>] dma_generic_alloc_coherent+0x8f/0x140 [<>] x86_swiotlb_alloc_coherent+0x21/0x50 [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core] [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core] [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core] [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core] [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib] [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib] Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-15	net/mlx5: Remove redundant EQ API exports	Saeed Mahameed
	EQ structure and API is private to mlx5_core driver only, external drivers should not have access or the means to manipulate EQ objects. Remove redundant exports and move API functions out of the linux/mlx5 include directory into the driver's mlx5_core.h private include file. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15	net/mlx5: Move CQ completion and event forwarding logic to eq.c	Saeed Mahameed
	Since CQ tree is now per EQ, CQ completion and event forwarding became specific implementation of EQ logic, this patch moves that logic to eq.c and makes those functions static. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15	net/mlx5: CQ hold/put API	Saeed Mahameed
	Now as the CQ table is per EQ, add an API to hold/put CQ to be used from eq.c in downstream patch. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15	net/mlx5: CQ Database per EQ	Saeed Mahameed
	Before this patch the driver had one CQ database protected via one spinlock, this spinlock is meant to synchronize between CQ adding/removing and CQ IRQ interrupt handling. On a system with large number of CPUs and on a work load that requires lots of interrupts, this global spinlock becomes a very nasty hotspot and introduces a contention between the active cores, which will significantly hurt performance and becomes a bottleneck that prevents seamless cpu scaling. To solve this we simply move the CQ database and its spinlock to be per EQ (IRQ), thus per core. Tested with: system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores netperf command: ./super_netperf 200 -P 0 -t TCP_RR -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M WITHOUT THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.32 0.00 36.15 0.09 0.00 34.02 0.00 0.00 0.00 25.41 Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271 Overhead Command Shared Object Symbol + 14.28% swapper [kernel.vmlinux] [k] intel_idle + 12.25% swapper [kernel.vmlinux] [k] queued_spin_lock_slowpath + 10.29% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath + 1.32% netserver [kernel.vmlinux] [k] mlx5e_xmit WITH THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.27 0.00 34.31 0.01 0.00 18.71 0.00 0.00 0.00 42.69 Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483 Overhead Command Shared Object Symbol + 23.33% swapper [kernel.vmlinux] [k] intel_idle + 1.69% netserver [kernel.vmlinux] [k] mlx5e_xmit Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-14	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes all across the map: - /proc/kcore vsyscall related fixes - LTO fix - build warning fix - CPU hotplug fix - Kconfig NR_CPUS cleanups - cpu_has() cleanups/robustification - .gitignore fix - memory-failure unmapping fix - UV platform fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages x86/error_inject: Make just_return_func() globally visible x86/platform/UV: Fix GAM Range Table entries less than 1GB x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page x86/Kconfig: Further simplify the NR_CPUS config x86/Kconfig: Simplify NR_CPUS config x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()" x86/cpufeature: Update _static_cpu_has() to use all named variables x86/cpufeature: Reindent _static_cpu_has()
2018-02-14	Merge branch 'x86-pti-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar: "Here's the latest set of Spectre and PTI related fixes and updates: Spectre: - Add entry code register clearing to reduce the Spectre attack surface - Update the Spectre microcode blacklist - Inline the KVM Spectre helpers to get close to v4.14 performance again. - Fix indirect_branch_prediction_barrier() - Fix/improve Spectre related kernel messages - Fix array_index_nospec_mask() asm constraint - KVM: fix two MSR handling bugs PTI: - Fix a paranoid entry PTI CR3 handling bug - Fix comments objtool: - Fix paranoid_entry() frame pointer warning - Annotate WARN()-related UD2 as reachable - Various fixes - Add Add Peter Zijlstra as objtool co-maintainer Misc: - Various x86 entry code self-test fixes - Improve/simplify entry code stack frame generation and handling after recent heavy-handed PTI and Spectre changes. (There's two more WIP improvements expected here.) - Type fix for cache entries There's also some low risk non-fix changes I've included in this branch to reduce backporting conflicts: - rename a confusing x86_cpu field name - de-obfuscate the naming of single-TLB flushing primitives" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/entry/64: Fix CR3 restore in paranoid_exit() x86/cpu: Change type of x86_cache_size variable to unsigned int x86/spectre: Fix an error message x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping selftests/x86/mpx: Fix incorrect bounds with old _sigfault x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user\|kernel]() x86/speculation: Add <asm/msr-index.h> dependency nospec: Move array_index_nospec() parameter checking into separate macro x86/speculation: Fix up array_index_nospec_mask() asm constraint x86/debug: Use UD2 for WARN() x86/debug, objtool: Annotate WARN()-related UD2 as reachable objtool: Fix segfault in ignore_unreachable_insn() selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory selftests/x86/pkeys: Remove unused functions selftests/x86: Clean up and document sscanf() usage selftests/x86: Fix vDSO selftest segfault for vsyscall=none x86/entry/64: Remove the unused 'icebp' macro ...
2018-02-15	nospec: Move array_index_nospec() parameter checking into separate macro	Will Deacon
	For architectures providing their own implementation of array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to complain about out-of-range parameters using WARN_ON() results in a mess of mutually-dependent include files. Rather than unpick the dependencies, simply have the core code in nospec.h perform the checking for us. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-14	Merge branch '40GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-02-14 This patch series enables the new mqprio hardware offload mechanism creating traffic classes on VFs for XL710 devices. The parameters needed to configure these traffic classes/queue channels are provides by the user via the tc tool. A maximum of four traffic classes can be created on each VF. This patch series also enables application of cloud filters to each of these traffic classes. The cloud filters are applied using the tc-flower classifier. Example: 1. tc qdisc add dev vf0 root mqprio num_tc 4 map 0 0 0 0 1 2 2 3\ queues 2@0 2@2 1@4 1@5 hw 1 mode channel 2. tc qdisc add dev vf0 ingress 3. ethtool -K vf0 hw-tc-offload on 4. ip link set eth0 vf 0 spoofchk off 5. tc filter add dev vf0 protocol ip parent ffff: prio 1 flower dst_ip\ 192.168.3.5/32 ip_proto udp dst_port 25 skip_sw hw_tc 2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14	net: ptp: Add stub for ptp_classify_raw()	Andrew Lunn
	When NET_PTP_CLASSIFY is disabled, a stub function is required in order that the drivers compile. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14	ARM: OMAP2+: Cleanup omap_mcbsp_dev_attr and other legacy data	Suman Anna
	The omap_mcbsp_dev_attr data was used to supply instance-specific data for legacy non-DT devices. The legacy McBSP device support including the usage of the hwmod class revision data has been dropped in commit 48f6693790aa ("ARM: OMAP2+: Remove unused legacy code for McBSP") and this data is therefore no longer needed. So, cleanup the structure and all the associated data in various hwmod data files. Cc: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14	ARM: OMAP2+: Cleanup omap2_spi_dev_attr and other legacy data	Suman Anna
	The omap2_spi_dev_attr data was used to supply instance-specific data for legacy non-DT devices. The SPI legacy device support including the usage of the hwmod class revision data has been dropped in commit 6f3ab009a178 ("ARM: OMAP2+: Remove unused legacy code for device init") and this data is therefore no longer needed. So, cleanup the structure and all the associated data in various hwmod data files. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14	ARM: OMAP2+: Cleanup omap_gpio_dev_attr usage	Suman Anna
	The omap_gpio_dev_attr data was used to supply instance-specific data for legacy non-DT devices. The GPIO legacy device support has been cleaned up in commit 14944934f8ac ("ARM: OMAP2+: Remove legacy gpio code") a while ago and this data is therefore no longer needed. So, cleanup the structure and all the associated data in various hwmod data files. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-02-14	virtchnl: Add filter data structures	Harshitha Ramamurthy
	This patch adds infrastructure to send virtchnl messages to the PF to configure filters on the VF. The patch adds a struct called virtchnl_filter which contains information about the fields in the user-specified tc filter. Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14	virtchnl: Add a macro to check the size of a union	Harshitha Ramamurthy
	This patch adds a macro to check if the size of a union is correct. It throws a divide by zero error if the union is not of the correct size. Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14	virtchnl: Add virtchl structures to support queue channels	Harshitha Ramamurthy
	This patch defines new structs in support of the virtchannel message that the VF sends to the PF to create a queue channel specified by the user via tc tool. Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-02-14	net: Make atalk_ptr depend on ATALK or IRDA	David Ahern
	Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14	net: Make ax25_ptr depend on CONFIG_AX25	David Ahern
	Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14	net: Make dn_ptr depend on CONFIG_DECNET	David Ahern
	Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14	spi: spi-gpio: Rewrite to use GPIO descriptors	Linus Walleij
	This converts the bit-banged GPIO SPI driver to looking up and using GPIO descriptors to get a handle on GPIO lines for SCK, MOSI, MISO and all CS lines. All existing board files are converted in one go to keep it all consistent. With these conversions I rarely find any interrim steps that makes any sense. Device tree probing and GPIO handling should work like before also after this patch. For board files, we stop using controller data to pass the GPIO line for chip select, instead we pass this as a GPIO descriptor lookup like everything else. In some s3c24xx machines the names of the SPI devices were set to "spi-gpio" rather than "spi_gpio" which can never have worked, I fixed it working (I guess) as part of this patch set. Sometimes I wonder how this code got upstream in the first place, it obviously is not tested. mach-s3c64xx/mach-smartq.c has the same problem and additionally defines the same GPIO line for MOSI and MISO which is not going to be accepted by gpiolib. As the lines were number 1,2,2 I assumed it was a typo and use lines 1,2,3. A comment gives awat that line 0 is chip select though no actual SPI device is provided for the LCD supposed to be on this bit-banged SPI bus. I left it intact instead of just deleting the bus though. Kill off board file code that try to initialize the SPI lines to the same values that they will later be set by the spi_gpio driver anyways. Given the huge number of weird things in these board files I do not think this code is very tested or put in with much afterthought anyways. In order to assert that we do not get performance regressions on this crucial bing-banged driver, a ran a script like this dumping the Ilitek ILI9322 regmap 10000 times (it has no caching obviously) on an otherwise idle system in two iterations before and after the patches: #!/bin/sh for run in `seq 10000` do cat /debug/regmap/spi0.0/registers > /dev/null done Before the patch: time test.sh real 3m 41.03s user 0m 29.41s sys 3m 7.22s time test.sh real 3m 44.24s user 0m 32.31s sys 3m 7.60s After the patch: time test.sh real 3m 41.32s user 0m 28.92s sys 3m 8.08s time test.sh real 3m 39.92s user 0m 30.20s sys 3m 5.56s So any performance differences seems to be in the error margin. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-02-14	ACPI/IORT: Add msi address regions reservation helper	Shameer Kolothum
	On some platforms msi parent address regions have to be excluded from normal IOVA allocation in that they are detected and decoded in a HW specific way by system components and so they cannot be considered normal IOVA address space. Add a helper function that retrieves ITS address regions - the msi parent - through IORT device <-> ITS mappings and reserves it so that these regions will not be translated by IOMMU and will be excluded from IOVA allocations. The function checks for the smmu model number and only applies the msi reservation if the platform requires it. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> [For the ITS part] Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-14	x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable	Kirill A. Shutemov
	For boot-time switching between 4- and 5-level paging we need to be able to fold p4d page table level at runtime. It requires variable PGDIR_SHIFT and PTRS_PER_P4D. The change doesn't affect the kernel image size much: text data bss dec hex filename 8628091 4734304 1368064 14730459 e0c4db vmlinux.before 8628393 4734340 1368064 14730797 e0c62d vmlinux.after Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214111656.88514-7-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13	net: avoid including xdp.h in filter.h	Jesper Dangaard Brouer
	If is sufficient with a forward declaration of struct xdp_rxq_info in linux/filter.h, which avoids including net/xdp.h. This was originally suggested by John Fastabend during the review phase, but wasn't included in the final patchset revision. Thus, this followup. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-13	iommu: Do not return error code for APIs with size_t return type	Suravee Suthikulpanit
	Currently, iommu_unmap, iommu_unmap_fast and iommu_map_sg return size_t. However, some of the return values are error codes (< 0), which can be misinterpreted as large size. Therefore, returning size 0 instead to signify failure to map/unmap. Cc: Joerg Roedel <joro@8bytes.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13	iommu/vt-d: Clean/document fault status flags	Dmitry Safonov
	So one could decode them without opening the specification. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-02-13	net: Introduce net_sem for protection of pernet_list	Kirill Tkhai
	Currently, the mutex is mostly used to protect pernet operations list. It orders setup_net() and cleanup_net() with parallel {un,}register_pernet_operations() calls, so ->exit{,batch} methods of the same pernet operations are executed for a dying net, as were used to call ->init methods, even after the net namespace is unlinked from net_namespace_list in cleanup_net(). But there are several problems with scalability. The first one is that more than one net can't be created or destroyed at the same moment on the node. For big machines with many cpus running many containers it's very sensitive. The second one is that it's need to synchronize_rcu() after net is removed from net_namespace_list(): Destroy net_ns: cleanup_net() mutex_lock(&net_mutex) list_del_rcu(&net->list) synchronize_rcu() <--- Sleep there for ages list_for_each_entry_reverse(ops, &pernet_list, list) ops_exit_list(ops, &net_exit_list) list_for_each_entry_reverse(ops, &pernet_list, list) ops_free_list(ops, &net_exit_list) mutex_unlock(&net_mutex) This primitive is not fast, especially on the systems with many processors and/or when preemptible RCU is enabled in config. So, all the time, while cleanup_net() is waiting for RCU grace period, creation of new net namespaces is not possible, the tasks, who makes it, are sleeping on the same mutex: Create net_ns: copy_net_ns() mutex_lock_killable(&net_mutex) <--- Sleep there for ages I observed 20-30 seconds hangs of "unshare -n" on ordinary 8-cpu laptop with preemptible RCU enabled after CRIU tests round is finished. The solution is to convert net_mutex to the rw_semaphore and add fine grain locks to really small number of pernet_operations, what really need them. Then, pernet_operations::init/::exit methods, modifying the net-related data, will require down_read() locking only, while down_write() will be used for changing pernet_list (i.e., when modules are being loaded and unloaded). This gives signify performance increase, after all patch set is applied, like you may see here: %for i in {1..10000}; do unshare -n bash -c exit; done before real 1m40,377s user 0m9,672s sys 0m19,928s after real 0m17,007s user 0m5,311s sys 0m11,779 (5.8 times faster) This patch starts replacing net_mutex to net_sem. It adds rw_semaphore, describes the variables it protects, and makes to use, where appropriate. net_mutex is still present, and next patches will kick it out step-by-step. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13	x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages	Tony Luck
	In the following commit: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") ... we added code to memory_failure() to unmap the page from the kernel 1:1 virtual address space to avoid speculative access to the page logging additional errors. But memory_failure() may not always succeed in taking the page offline, especially if the page belongs to the kernel. This can happen if there are too many corrected errors on a page and either mcelog(8) or drivers/ras/cec.c asks to take a page offline. Since we remove the 1:1 mapping early in memory_failure(), we can end up with the page unmapped, but still in use. On the next access the kernel crashes :-( There are also various debug paths that call memory_failure() to simulate occurrence of an error. Since there is no actual error in memory, we don't need to map out the page for those cases. Revert most of the previous attempt and keep the solution local to arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when: 1) there is a real error 2) memory_failure() succeeds. All of this only applies to 64-bit systems. 32-bit kernel doesn't map all of memory into kernel space. It isn't worth adding the code to unmap the piece that is mapped because nobody would run a 32-bit kernel on a machine that has recoverable machine checks. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert (Persistent Memory) <elliott@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Cc: stable@vger.kernel.org #v4.14 Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13	locking/semaphore: Update the file path in documentation	Tycho Andersen
	While reading this header I noticed that the locking stuff has moved to kernel/locking/*, so update the path in semaphore.h to point to that. Signed-off-by: Tycho Andersen <tycho@tycho.ws> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180201114119.1090-1-tycho@tycho.ws Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13	vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page	Jia Zhang
	Commit: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") ... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y. However, accessing the vsyscall user page will cause an SMAP fault. Replace memcpy() with copy_from_user() to fix this bug works, but adding a common way to handle this sort of user page may be useful for future. Currently, only vsyscall page requires KCORE_USER. Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com Signed-off-by: Ingo Molnar <mingo@kernel.org>