linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-08-10	powerpc/fadump: handle crash memory ranges array index overflow	Hari Bathini
	Crash memory ranges is an array of memory ranges of the crashing kernel to be exported as a dump via /proc/vmcore file. The size of the array is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since commit 142b45a72e22 ("memblock: Add array resizing support"). On large memory systems with a few DLPAR operations, the memblock memory regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such systems, registering fadump results in crash or other system failures like below: task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000 NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180 REGS: c00000000b73b570 TRAP: 0300 Tainted: G L X (4.4.140+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22004484 XER: 20000000 CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0 ... NIP [c000000000047df4] smp_send_reschedule+0x24/0x80 LR [c0000000000f9e58] resched_curr+0x138/0x160 Call Trace: resched_curr+0x138/0x160 (unreliable) check_preempt_curr+0xc8/0xf0 ttwu_do_wakeup+0x38/0x150 try_to_wake_up+0x224/0x4d0 __wake_up_common+0x94/0x100 ep_poll_callback+0xac/0x1c0 __wake_up_common+0x94/0x100 __wake_up_sync_key+0x70/0xa0 sock_def_readable+0x58/0xa0 unix_stream_sendmsg+0x2dc/0x4c0 sock_sendmsg+0x68/0xa0 ___sys_sendmsg+0x2cc/0x2e0 __sys_sendmsg+0x5c/0xc0 SyS_socketcall+0x36c/0x3f0 system_call+0x3c/0x100 as array index overflow is not checked for while setting up crash memory ranges causing memory corruption. To resolve this issue, dynamically allocate memory for crash memory ranges and resize it incrementally, in units of pagesize, on hitting array size limit. Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Just use PAGE_SIZE directly, fixup variable placement] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10	powerpc/cpm1: fix compilation error with CONFIG_PPC_EARLY_DEBUG_CPM	Christophe Leroy
	commit e8cb7a55eb8dc ("powerpc: remove superflous inclusions of asm/fixmap.h") removed inclusion of asm/fixmap.h from files not including objects from that file. However, asm/mmu-8xx.h includes call to __fix_to_virt(). The proper way would be to include asm/fixmap.h in asm/mmu-8xx.h but it creates an inclusion loop. So we have to leave asm/fixmap.h in sysdep/cpm_common.c for CONFIG_PPC_EARLY_DEBUG_CPM CC arch/powerpc/sysdev/cpm_common.o In file included from ./arch/powerpc/include/asm/mmu.h:340:0, from ./arch/powerpc/include/asm/reg_8xx.h:8, from ./arch/powerpc/include/asm/reg.h:29, from ./arch/powerpc/include/asm/processor.h:13, from ./arch/powerpc/include/asm/thread_info.h:28, from ./include/linux/thread_info.h:38, from ./arch/powerpc/include/asm/ptrace.h:159, from ./arch/powerpc/include/asm/hw_irq.h:12, from ./arch/powerpc/include/asm/irqflags.h:12, from ./include/linux/irqflags.h:16, from ./include/asm-generic/cmpxchg-local.h:6, from ./arch/powerpc/include/asm/cmpxchg.h:537, from ./arch/powerpc/include/asm/atomic.h:11, from ./include/linux/atomic.h:5, from ./include/linux/mutex.h:18, from ./include/linux/kernfs.h:13, from ./include/linux/sysfs.h:16, from ./include/linux/kobject.h:20, from ./include/linux/device.h:16, from ./include/linux/node.h:18, from ./include/linux/cpu.h:17, from ./include/linux/of_device.h:5, from arch/powerpc/sysdev/cpm_common.c:21: arch/powerpc/sysdev/cpm_common.c: In function ‘udbg_init_cpm’: ./arch/powerpc/include/asm/mmu-8xx.h:218:25: error: implicit declaration of function ‘__fix_to_virt’ [-Werror=implicit-function-declaration] #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: error: ‘FIX_IMMR_BASE’ undeclared (first use in this function) #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ ./arch/powerpc/include/asm/mmu-8xx.h:218:39: note: each undeclared identifier is reported only once for each function it appears in #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE)) ^ arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’ VIRT_IMMR_BASE); ^ cc1: all warnings being treated as errors make[1]: *** [arch/powerpc/sysdev/cpm_common.o] Error 1 Fixes: e8cb7a55eb8dc ("powerpc: remove superflous inclusions of asm/fixmap.h") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10	powerpc: Fix size calculation using resource_size()	Dan Carpenter
	The problem is the the calculation should be "end - start + 1" but the plus one is missing in this calculation. Fixes: 8626816e905e ("powerpc: add support for MPIC message register API") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10	Documentation: Update documentation on ppc-memtrace	Rashmica Gupta
	Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10	powerpc/powernv: Allow memory that has been hot-removed to be hot-added	Rashmica Gupta
	This patch allows the memory removed by memtrace to be readded to the kernel. So now you don't have to reboot your system to add the memory back to the kernel or to have a different amount of memory removed. Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10	spi: davinci: fix a NULL pointer dereference	Bartosz Golaszewski
	On non-OF systems spi->controlled_data may be NULL. This causes a NULL pointer derefence on dm365-evm. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2018-08-10	x86/microcode: Allow late microcode loading with SMT disabled	Josh Poimboeuf
	The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2018-08-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2018-08-10 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix cpumap and devmap on teardown as they're under RCU context and won't have same assumption as running under NAPI protection, from Jesper. 2) Fix various sockmap bugs in bpf_tcp_sendmsg() code, e.g. we had a bug where socket error was not propagated correctly, from Daniel. 3) Fix incompatible libbpf header license for BTF code and match it before it gets officially released with the rest of libbpf which is LGPL-2.1, from Martin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	smb3: enumerating snapshots was leaving part of the data off end	Steve French
	When enumerating snapshots, the last few bytes of the final snapshot could be left off since we were miscalculating the length returned (leaving off the sizeof struct SRV_SNAPSHOT_ARRAY) See MS-SMB2 section 2.2.32.2. In addition fixup the length used to allow smaller buffer to be passed in, in order to allow returning the size of the whole snapshot array more easily. Sample userspace output with a kernel patched with this (mounted to a Windows volume with two snapshots). Before this patch, the second snapshot would be missing a few bytes at the end. ~/cifs-2.6# ~/enum-snapshots /mnt/file press enter to issue the ioctl to retrieve snapshot information ... size of snapshot array = 102 Num snapshots: 2 Num returned: 2 Array Size: 102 Snapshot 0:@GMT-2018.06.30-19.34.17 Snapshot 1:@GMT-2018.06.30-19.33.37 CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-08-09	cifs: update smb2_queryfs() to use compounding	Ronnie Sahlberg
	Change smb2_queryfs() to use a Create/QueryInfo/Close compound request. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara <palcantara@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-08-09	cifs: update receive_encrypted_standard to handle compounded responses	Ronnie Sahlberg
	Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara <palcantara@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-08-10	Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-next More fixes for 4.19: - Fixes for scheduler - Fix for SR-IOV - Fixes for display Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180809200052.2777-1-alexander.deucher@amd.com
2018-08-10	Merge tag 'imx-drm-fixes-2018-08-03' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-next drm/imx: ipu-v3 plane offset and IPU id fixes - Fix U/V plane offsets for odd vertical offsets. Due to wrong operator order, the y offset was not rounded down properly for vertically chroma subsampled planar formats. - Fix IPU id number for boards that don't have an OF alias for their single IPU in the device tree. This is necessary to support imx-media on i.MX51 and i.MX53 SoCs. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1533552680.4204.14.camel@pengutronix.de
2018-08-10	Merge tag 'imx-drm-next-2018-08-03' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-next drm/imx: use suspend/resume helpers, add ipu-v3 V4L2 XRGB32/XBGR32 support - Convert imx_drm_suspend/resume to use the drm_mode_config_helper_suspend/ resume functions. - Add support for V4L2_PIX_FMT_XRGB32/XBGR32, corresponding to DRM_FORMAT_XRGB8888/BGRX8888, respectively. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1533552701.4204.15.camel@pengutronix.de
2018-08-09	PCI: Add ACS Redirect disable quirk for Intel Sunrise Point	Logan Gunthorpe
	Intel Sunrise Point PCH hardware has an implementation of the ACS bits that does not comply with the PCIe standard. Add a device-specific quirk, pci_quirk_disable_intel_spt_pch_acs_redir() to disable ACS Redirection on this system. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: changelog, split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-09	PCI: Add device-specific ACS Redirect disable infrastructure	Logan Gunthorpe
	Intel Sunrise Point (SPT) PCH hardware has an implementation of the ACS bits that does not comply with the PCIe standard. To deal with this we need device-specific quirks to disable ACS redirection. Add a new pci_dev_specific_disable_acs_redir() quirk and a new .disable_acs_redir() function pointer for use by non-compliant devices. No functional change intended. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: split to separate patch, move pci_dev_specific_disable_acs_redir() declarations to drivers/pci/pci.h] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-09	PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE	Logan Gunthorpe
	Convert the search for device-specific ACS enable quirks from searching a NULL-terminated array to iterating through the array, which is always fixed-size anyway. No functional change intended. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: changelog, split to separate patch for reviewability] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-09	PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support	Logan Gunthorpe
	To support peer-to-peer traffic on a segment of the PCI hierarchy, we must disable the ACS redirect bits for select PCI bridges. The bridges must be selected before the devices are discovered by the kernel and the IOMMU groups created. Therefore, add a kernel command line parameter to specify devices which must have their ACS bits disabled. The new parameter takes a list of devices separated by a semicolon. Each device specified will have its ACS redirect bits disabled. This is similar to the existing 'resource_alignment' parameter. The ACS Request P2P Request Redirect, P2P Completion Redirect and P2P Egress Control bits are disabled, which is sufficient to always allow passing P2P traffic uninterrupted. The bits are set after the kernel (optionally) enables the ACS bits itself. It is also done regardless of whether the kernel or platform firmware sets the bits. If the user tries to disable the ACS redirect for a device without the ACS capability, print a warning to dmesg. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: reorder to add the generic code first and move the device-specific quirk to subsequent patches] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>
2018-08-09	make sure that __dentry_kill() always invalidates d_seq, unhashed or not	Al Viro
	RCU pathwalk relies upon the assumption that anything that changes ->d_inode of a dentry will invalidate its ->d_seq. That's almost true - the one exception is that the final dput() of already unhashed dentry does not touch ->d_seq at all. Unhashing does, though, so for anything we'd found by RCU dcache lookup we are fine. Unfortunately, we can start with an unhashed dentry or jump into it. We could try and be careful in the (few) places where that could happen. Or we could just make the final dput() invalidate the damn thing, unhashed or not. The latter is much simpler and easier to backport, so let's do it that way. Reported-by: "Dae R. Jeong" <threeearcat@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-09	fix __legitimize_mnt()/mntput() race	Al Viro
	__legitimize_mnt() has two problems - one is that in case of success the check of mount_lock is not ordered wrt preceding increment of refcount, making it possible to have successful __legitimize_mnt() on one CPU just before the otherwise final mntpu() on another, with __legitimize_mnt() not seeing mntput() taking the lock and mntput() not seeing the increment done by __legitimize_mnt(). Solved by a pair of barriers. Another is that failure of __legitimize_mnt() on the second read_seqretry() leaves us with reference that'll need to be dropped by caller; however, if that races with final mntput() we can end up with caller dropping rcu_read_lock() and doing mntput() to release that reference - with the first mntput() having freed the damn thing just as rcu_read_lock() had been dropped. Solution: in "do mntput() yourself" failure case grab mount_lock, check if MNT_DOOMED has been set by racing final mntput() that has missed our increment and if it has - undo the increment and treat that as "failure, caller doesn't need to drop anything" case. It's not easy to hit - the final mntput() has to come right after the first read_seqretry() in __legitimize_mnt() and manage to miss the increment done by __legitimize_mnt() before the second read_seqretry() in there. The things that are almost impossible to hit on bare hardware are not impossible on SMP KVM, though... Reported-by: Oleg Nesterov <oleg@redhat.com> Fixes: 48a066e72d97 ("RCU'd vsfmounts") Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-09	IB/uverbs: Fix reading of 32 bit flags	Jason Gunthorpe
	This is missing a zeroing of the high bits of flags, and is also not correct for big endian machines. Properly zero extend the 32 bit flags into the 64 bit stack variable. Reported-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Fixes: bccd06223f21 ("IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
2018-08-09	MIPS: Remove remnants of UASM_ISA	Paul Burton
	Commit 33679a50370d ("MIPS: uasm: Remove needless ISA abstraction") removed use of the MIPS_ISA preprocessor macro, but left a couple of unused definitions of it behind. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-09	cxgb4: update 1.20.8.0 as the latest firmware supported	Ganesh Goudar
	Change t4fw_version.h to update latest firmware version number to 1.20.8.0. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	net: allow to call netif_reset_xps_queues() under cpus_read_lock	Andrei Vagin
	The definition of static_key_slow_inc() has cpus_read_lock in place. In the virtio_net driver, XPS queues are initialized after setting the queue:cpu affinity in virtnet_set_affinity() which is already protected within cpus_read_lock. Lockdep prints a warning when we are trying to acquire cpus_read_lock when it is already held. This patch adds an ability to call __netif_set_xps_queue under cpus_read_lock(). Acked-by: Jason Wang <jasowang@redhat.com> ============================================ WARNING: possible recursive locking detected 4.18.0-rc3-next-20180703+ #1 Not tainted -------------------------------------------- swapper/0/1 is trying to acquire lock: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20 but task is already holding lock: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by swapper/0/1: #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110 #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0 #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60 v2: move cpus_read_lock() out of __netif_set_xps_queue() Cc: "Nambiar, Amritha" <amritha.nambiar@intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	PCI: Allow specifying devices using a base bus and path of devfns	Logan Gunthorpe
	When specifying PCI devices on the kernel command line using a bus/device/function address, bus numbers can change when adding or replacing a device, changing motherboard firmware, or applying kernel parameters like "pci=assign-buses". When bus numbers change, it's likely the command line tweak will be applied to the wrong device. Therefore, it is useful to be able to specify devices with a base bus number and the path of devfns needed to get to it, similar to the "device scope" structure in the Intel VT-d spec, Section 8.3.1. Thus, we add an option to specify devices in the following format: [<domain>:]<bus>:<device>.<func>[/<device>.<func>]* The path can be any segment within the PCI hierarchy of any length and determined through the use of 'lspci -t'. When specified this way, it is less likely that a renumbered bus will result in a valid device specification and the tweak won't be applied to the wrong device. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>
2018-08-09	PCI: Make specifying PCI devices in kernel parameters reusable	Logan Gunthorpe
	Separate out the code to match a PCI device with a string (typically originating from a kernel parameter) from the pci_specified_resource_alignment() function into its own helper function. While we are at it, this change fixes the kernel style of the function (fixing a number of long lines and extra parentheses). Additionally, make the analogous change to the kernel parameter documentation: Separate the description of how to specify a PCI device into its own section at the head of the "pci=" parameter. This patch should have no functional alterations. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>
2018-08-09	fix mntput/mntput race	Al Viro
	mntput_no_expire() does the calculation of total refcount under mount_lock; unfortunately, the decrement (as well as all increments) are done outside of it, leading to false positives in the "are we dropping the last reference" test. Consider the following situation: * mnt is a lazy-umounted mount, kept alive by two opened files. One of those files gets closed. Total refcount of mnt is 2. On CPU 42 mntput(mnt) (called from __fput()) drops one reference, decrementing component * After it has looked at component #0, the process on CPU 0 does mntget(), incrementing component #0, gets preempted and gets to run again - on CPU 69. There it does mntput(), which drops the reference (component #69) and proceeds to spin on mount_lock. * On CPU 42 our first mntput() finishes counting. It observes the decrement of component #69, but not the increment of component #0. As the result, the total it gets is not 1 as it should've been - it's 0. At which point we decide that vfsmount needs to be killed and proceed to free it and shut the filesystem down. However, there's still another opened file on that filesystem, with reference to (now freed) vfsmount, etc. and we are screwed. It's not a wide race, but it can be reproduced with artificial slowdown of the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups. Fix consists of moving the refcount decrement under mount_lock; the tricky part is that we want (and can) keep the fast case (i.e. mount that still has non-NULL ->mnt_ns) entirely out of mount_lock. All places that zero mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu() before that mntput(). IOW, if mntput() observes (under rcu_read_lock()) a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to be dropped. Reported-by: Jann Horn <jannh@google.com> Tested-by: Jann Horn <jannh@google.com> Fixes: 48a066e72d97 ("RCU'd vsfmounts") Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-08-09	PCI: Hide ACS quirk declarations inside PCI core	Bjorn Helgaas
	Move declarations for these functions: pci_dev_specific_acs_enabled() pci_dev_specific_enable_acs() from include/linux/pci.h to drivers/pci/pci.h because nothing outside the PCI core needs to use them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-08-09	net: phy: sftp: print debug message with text, not numbers	Andrew Lunn
	Convert the state numbers, device state, etc from numbers to strings when printing debug messages. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	ethernet/qlogic: remove unused array msi_tgt_status	Colin Ian King
	Array msi_tgt_status is defined but never used, hence it is redundant and can be removed. Cleans up clang warning: warning: 'msi_tgt_status' defined but not used [-Wunused-const-variable=] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	net: dsa: rtl8366rb: Support port 4 (WAN)	Linus Walleij
	The totally undocumented IO mode needs to be set to enumerator 0 to enable port 4 also known as WAN in most configurations, for ordinary traffic. The 3 bits in the register come up as 010 after reset, but need to be set to 000. The Realtek source code contains a name for these bits, but no explanation of what the 8 different IO modes may be. Set it to zero for the time being and drop a comment so people know what is going on if they run into trouble. This "mode zero" works fine with the D-Link DIR-685 with RTL8366RB. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	mlxsw: spectrum_flower: use PTR_ERR_OR_ZERO()	YueHaibing
	Fix ptr_ret.cocci warnings: drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c:543:1-3: WARNING: PTR_ERR_OR_ZERO can be used Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	net: sched: fix block->refcnt decrement	Jiri Pirko
	Currently the refcnt is never decremented in case the value is not 1. Fix it by adding decrement in case the refcnt is not 1. Reported-by: Vlad Buslov <vladbu@mellanox.com> Fixes: f71e0ca4db18 ("net: sched: Avoid implicit chain 0 creation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	decnet: fix using plain integer as NULL warning	YueHaibing
	Fixes the following sparse warning: net/decnet/dn_route.c:407:30: warning: Using plain integer as NULL pointer net/decnet/dn_route.c:1923:22: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	net: skbuff.h: fix using plain integer as NULL warning	YueHaibing
	Fixes the following sparse warning: ./include/linux/skbuff.h:2365:58: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	be2net: Use Kconfig flag to support for enabling/disabling adapters	Petr Oros
	Add flags to enable/disable supported chips in be2net. With disable support are removed coresponding PCI IDs and also codepaths with [BE2\|BE3\|BEx\|lancer\|skyhawk]_chip checks. Disable chip will reduce module size by: BE2 ~2kb BE3 ~3kb Lancer ~10kb Skyhawk ~9kb When enable skyhawk only it will reduce module size by ~20kb New help style in Kconfig Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	net: ipv6_gre: Fix GRO to work on IPv6 over GRE tap	Maria Pasechnik
	IPv6 GRO over GRE tap is not working while GRO is not set over the native interface. gro_list_prepare function updates the same_flow variable of existing sessions to 1 if their mac headers match the one of the incoming packet. same_flow is used to filter out non-matching sessions and keep potential ones for aggregation. The number of bytes to compare should be the number of bytes in the mac headers. In gro_list_prepare this number is set to be skb->dev->hard_header_len. For GRE interfaces this hard_header_len should be as it is set in the initialization process (when GRE is created), it should not be overridden. But currently it is being overridden by the value that is actually supposed to represent the needed_headroom. Therefore, the number of bytes compared in order to decide whether the the mac headers are the same is greater than the length of the headers. As it's documented in netdevice.h, hard_header_len is the maximum hardware header length, and needed_headroom is the extra headroom the hardware may need. hard_header_len is basically all the bytes received by the physical till layer 3 header of the packet received by the interface. For example, if the interface is a GRE tap then the needed_headroom should be the total length of the following headers: IP header of the physical, GRE header, mac header of GRE. It is often used to calculate the MTU of the created interface. This patch removes the override of the hard_header_len, and assigns the calculated value to needed_headroom. This way, the comparison in gro_list_prepare is really of the mac headers, and if the packets have the same mac headers the same_flow will be set to 1. Performance testing: 45% higher bandwidth. Measuring bandwidth of single-stream IPv4 TCP traffic over IPv6 GRE tap while GRO is not set on the native. NIC: ConnectX-4LX Before (GRO not working) : 7.2 Gbits/sec After (GRO working): 10.5 Gbits/sec Signed-off-by: Maria Pasechnik <mariap@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	Merge branch 'qed-Enhancements'	David S. Miller
	Manish Chopra says: ==================== qed*: Enhancements This patch series adds following support in drivers - 1. Egress mqprio offload. 2. Add destination IP based flow profile. 3. Ingress flower offload (for drop action). Please consider applying this series to "net-next". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	qede: Ingress tc flower offload (drop action) support.	Manish Chopra
	The main motive of this patch is to lay down driver's tc offload infrastructure in place. With these changes tc can offload various supported flow profiles (4 tuples, src-ip, dst-ip, l4 port) for the drop action. Dropped flows statistic is a global counter for all the offloaded flows for drop action and is populated in ethtool statistics as common "gft_filter_drop". Examples - tc qdisc add dev p4p1 ingress tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp dst_ip 192.168.40.200 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto udp src_ip 192.168.40.100 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp src_ip 192.168.40.100 dst_ip 192.168.40.200 \ src_port 453 dst_port 876 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp dst_port 98 action drop Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	qede: Add destination ip based flow profile.	Manish Chopra
	This patch adds support for dropping and redirecting the flows based on destination IP in the packet. This also moves the profile mode settings in their own functions which can be used through tc flows in successive patch. For example - ethtool -N p5p1 flow-type tcp4 dst-ip 192.168.40.100 action -1 ethtool -N p5p1 flow-type udp4 dst-ip 192.168.50.100 action 1 ethtool -N p5p1 flow-type tcp4 dst-ip 192.168.60.100 action 0x100000000 Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	qed/qede: Multi CoS support.	Manish Chopra
	This patch adds support for tc mqprio offload, using this different traffic classes on the adapter can be utilized based on configured priority to tc map. For example - tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 This will cause SKBs with priority 0,1,2,3 to transmit over tc 0,1,2,3 hardware queues respectively. Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	Merge branch 's390-qeth-next'	David S. Miller
	Julian Wiedmann says: ==================== s390/qeth: updates 2018-08-09 one more set of patches for net-next. This is all sorts of cleanups and more refactoring on the way to using netdev_priv. Please apply. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: use true and false for boolean values	Gustavo A. R. Silva
	Return statements in functions returning bool should use true or false instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: don't restrict qeth_card to DMA memory	Julian Wiedmann
	Allocating the main qeth_card struct with GFP_DMA blocks us from moving it into netdev_priv(). But the only reason why we need DMA memory is the ccw1 structs embedded into each ccw channel. So extract those into separate allocations, like we already do for the cmd buffers. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: clean up card initialization	Julian Wiedmann
	The qeth_card struct is kzalloc-ed, so remove all the redundant 0-initializations. While at it, split up what's left of qeth_determine_card_type(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: do basic setup for data channel	Julian Wiedmann
	The data channel currently doesn't need a setup operation, because we don't use pre-allocated cmd buffers for its IO. But subsequent changes will introduce further setup that also applies to the data channel. This refactors things a bit, so that the new stuff can then be automatically applied to all channels. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: use qeth_setup_ccw() to set up all CCWs	Julian Wiedmann
	Re-work the helper a little bit, so that it can be used for all CCWs that qeth issues. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: reduce hard-coded access to ccw channels	Julian Wiedmann
	Where possible use accessor macros and local pointers to access the ccw channels. This makes it less likely to miss a spot. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	s390/qeth: extract helper for MPC protocol type	Julian Wiedmann
	Just a little code deduplication. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	null_blk: add lock drop/acquire annotation	Jens Axboe
	sparse complains: drivers/block/null_blk_main.c:816:24: sparse: context imbalance in 'null_insert_page' - unexpected unlock Fix it by adding the necessary annotations to the function. Signed-off-by: Jens Axboe <axboe@kernel.dk>