git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2019-05-29	Merge tag 'linux-kselftest-5.2-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: - Alexandre Belloni's fixes to rtc regressions introduced in kselftest Makefile test run output refactoring work from Kees Cook. - ftrace test checkbashisms fixes from Masami Hiramatsu * tag 'linux-kselftest-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: rtc: rtctest: specify timeouts selftests/harness: Allow test to configure timeout selftests/ftrace: Add checkbashisms meta-testcase selftests/ftrace: Make a script checkbashisms clean
2019-05-29	s390/crypto: fix possible sleep during spinlock aquired	Harald Freudenberger
	This patch fixes a complain about possible sleep during spinlock aquired "BUG: sleeping function called from invalid context at include/crypto/algapi.h:426" for the ctr(aes) and ctr(des) s390 specific ciphers. Instead of using a spinlock this patch introduces a mutex which is save to be held in sleeping context. Please note a deadlock is not possible as mutex_trylock() is used. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reported-by: Julian Wiedmann <jwi@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-05-29	MIPS: pistachio: Build uImage.gz by default	Paul Burton
	The pistachio platform uses the U-Boot bootloader & generally boots a kernel in the uImage format. As such it's useful to build one when building the kernel, but to do so currently requires the user to manually specify a uImage target on the make command line. Make uImage.gz the pistachio platform's default build target, so that the default is to build a kernel image that we can actually boot on a board such as the MIPS Creator Ci40. Marked for stable backport as far as v4.1 where pistachio support was introduced. This is primarily useful for CI systems such as kernelci.org which will benefit from us building a suitable image which can then be booted as part of automated testing, extending our test coverage to the affected stable branches. Signed-off-by: Paul Burton <paul.burton@mips.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> URL: https://groups.io/g/kernelci/message/388 Cc: stable@vger.kernel.org # v4.1+ Cc: linux-mips@vger.kernel.org
2019-05-29	MIPS: Make virt_addr_valid() return bool	Paul Burton
	virt_addr_valid() really returns a boolean value, but currently uses an integer to represent it. Switch to the bool type to make it clearer that we really are returning a true or false value. Signed-off-by: Paul Burton <paul.burton@mips.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Cc: linux-mips@vger.kernel.org
2019-05-29	MIPS: Bounds check virt_addr_valid	Paul Burton
	The virt_addr_valid() function is meant to return true iff virt_to_page() will return a valid struct page reference. This is true iff the address provided is found within the unmapped address range between PAGE_OFFSET & MAP_BASE, but we don't currently check for that condition. Instead we simply mask the address to obtain what will be a physical address if the virtual address is indeed in the desired range, shift it to form a PFN & then call pfn_valid(). This can incorrectly return true if called with a virtual address which, after masking, happens to form a physical address corresponding to a valid PFN. For example we may vmalloc an address in the kernel mapped region starting a MAP_BASE & obtain the virtual address: addr = 0xc000000000002000 When masked by virt_to_phys(), which uses __pa() & in turn CPHYSADDR(), we obtain the following (bogus) physical address: addr = 0x2000 In a common system with PHYS_OFFSET=0 this will correspond to a valid struct page which should really be accessed by virtual address PAGE_OFFSET+0x2000, causing virt_addr_valid() to incorrectly return 1 indicating that the original address corresponds to a struct page. This is equivalent to the ARM64 change made in commit ca219452c6b8 ("arm64: Correctly bounds check virt_addr_valid"). This fixes fallout when hardened usercopy is enabled caused by the related commit 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check") which removed a check for the vmalloc range that was present from the introduction of the hardened usercopy feature. Signed-off-by: Paul Burton <paul.burton@mips.com> References: ca219452c6b8 ("arm64: Correctly bounds check virt_addr_valid") References: 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check") Reported-by: Julien Cristau <jcristau@debian.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: YunQiang Su <ysu@wavecomp.com> URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=929366 Cc: stable@vger.kernel.org # v4.12+ Cc: linux-mips@vger.kernel.org Cc: Yunqiang Su <ysu@wavecomp.com>
2019-05-29	CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM	Roberto Bergantinos Corpas
	In cifs_read_allocate_pages, in case of ENOMEM, we go through whole rdata->pages array but we have failed the allocation before nr_pages, therefore we may end up calling put_page with NULL pointer, causing oops Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-05-29	Merge tag 'trace-v5.2-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "This fixes a memory leak from the error path in the event filter logic" * tag 'trace-v5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Avoid memory leak in predicate_parse()
2019-05-29	RDMA/efa: Remove MAYEXEC flag check from mmap flow	Gal Pressman
	MAYEXEC test was mistakenly added, remove it. Checking MAYEXEC in the driver prevents it from working with userspace that uses things like EXEC STACK. (ie some Fortran and other runtimes) Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Reported-by: Jason Gunthorpe <jgg@ziepe.ca> Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	mlx5: avoid 64-bit division	Michal Kubecek
	Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type") breaks i386 build by introducing three 64-bit divisions. As the divisor is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace the division with bit operations. Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	IB/hfi1: Validate page aligned for a given virtual address	Kamenee Arumugam
	User applications can register memory regions for TID buffers that are not aligned on page boundaries. Hfi1 is expected to pin those pages in memory and cache the pages with mmu_rb. The rb tree will fail to insert pages that are not aligned correctly. Validate whether a given virtual address is page aligned before pinning. Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body") Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value	Mike Marciniszyn
	The command 'ibv_devinfo -v' reports 0 for max_mr. Fix by assigning the query values after the mr lkey_table has been built rather than early on in the driver. Fixes: 7b1e2099adc8 ("IB/rdmavt: Move memory registration into rdmavt") Reviewed-by: Josh Collier <josh.d.collier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	IB/hfi1: Insure freeze_work work_struct is canceled on shutdown	Mike Marciniszyn
	By code inspection, the freeze_work is never canceled. Fix by adding a cancel_work_sync in the shutdown path to insure it is no longer running. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	IB/rdmavt: Fix alloc_qpn() WARN_ON()	Mike Marciniszyn
	The qpn allocation logic has a WARN_ON() that intends to detect the use of an index that will introduce bits in the lower order bits of the QOS bits in the QPN. Unfortunately, it has the following bugs: - it misfires when wrapping QPN allocation for non-QOS - it doesn't correctly detect low order QOS bits (despite the comment) The WARN_ON() should not be applied to non-QOS (qos_shift == 1). Additionally, it SHOULD test the qpn bits per the table below: 2 data VLs: [qp7, qp6, qp5, qp4, qp3, qp2, qp1] ^ [ 0, 0, 0, 0, 0, 0, sc0], qp bit 1 always 0* 3-4 data VLs: [qp7, qp6, qp5, qp4, qp3, qp2, qp1] ^ [ 0, 0, 0, 0, 0, sc1, sc0], qp bits [21] always 0 5-8 data VLs: [qp7, qp6, qp5, qp4, qp3, qp2, qp1] ^ [ 0, 0, 0, 0, sc2, sc1, sc0] qp bits [321] always 0 Fix by qualifying the warning for qos_shift > 1 and producing the correct mask to insure the above bits are zero without generating a superfluous warning. Fixes: 501edc42446e ("IB/rdmavt: Correct warning during QPN allocation") Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-29	drm/amdgpu: reserve stollen vram for raven series	Flora Cui
	to avoid screen corruption during modprobe. Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-29	arm64: use the correct function type for __arm64_sys_ni_syscall	Sami Tolvanen
	Calling sys_ni_syscall through a syscall_fn_t pointer trips indirect call Control-Flow Integrity checking due to a function type mismatch. Use SYSCALL_DEFINE0 for __arm64_sys_ni_syscall instead and remove the now unnecessary casts. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29	arm64: use the correct function type in SYSCALL_DEFINE0	Sami Tolvanen
	Although a syscall defined using SYSCALL_DEFINE0 doesn't accept parameters, use the correct function type to avoid indirect call type mismatches with Control-Flow Integrity checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29	arm64: fix syscall_fn_t type	Sami Tolvanen
	Syscall wrappers in <asm/syscall_wrapper.h> use const struct pt_regs * as the argument type. Use const in syscall_fn_t as well to fix indirect call type mismatches with Control-Flow Integrity checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29	block: don't protect generic_make_request_checks with blk_queue_enter	Ming Lei
	Now a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller") has been reverted, and blkcg_exit_queue() won't be called in blk_cleanup_queue() any more. So don't need to protect generic_make_request_checks() with blk_queue_enter(), then the total mess can be cleaned. 37f9579f4c31 ("blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash") is reverted. Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-29	block: move blk_exit_queue into __blk_release_queue	Ming Lei
	Commit 498f6650aec8 ("block: Fix a race between the cgroup code and request queue initialization") moves what blk_exit_queue does into blk_cleanup_queue() for fixing issue caused by changing back queue lock. However, after legacy request IO path is killed, driver queue lock won't be used at all, and there isn't story for changing back queue lock. Then the issue addressed by Commit 498f6650aec8 doesn't exist any more. So move move blk_exit_queue into __blk_release_queue. This patch basically reverts the following two commits: 498f6650aec8 block: Fix a race between the cgroup code and request queue initialization 24ecc3585348 block: Ensure that a request queue is dissociated from the cgroup controller Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-29	ovl: detect overlapping layers	Amir Goldstein
	Overlapping overlay layers are not supported and can cause unexpected behavior, but overlayfs does not currently check or warn about these configurations. User is not supposed to specify the same directory for upper and lower dirs or for different lower layers and user is not supposed to specify directories that are descendants of each other for overlay layers, but that is exactly what this zysbot repro did: https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000 Moving layer root directories into other layers while overlayfs is mounted could also result in unexpected behavior. This commit places "traps" in the overlay inode hash table. Those traps are dummy overlay inodes that are hashed by the layers root inodes. On mount, the hash table trap entries are used to verify that overlay layers are not overlapping. While at it, we also verify that overlay layers are not overlapping with directories "in-use" by other overlay instances as upperdir/workdir. On lookup, the trap entries are used to verify that overlay layers root inodes have not been moved into other layers after mount. Some examples: $ ./run --ov --samefs -s ... ( mkdir -p base/upper/0/u base/upper/0/w base/lower lower upper mnt mount -o bind base/lower lower mount -o bind base/upper upper mount -t overlay none mnt ... -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w) $ umount mnt $ mount -t overlay none mnt ... -o lowerdir=base,upperdir=upper/0/u,workdir=upper/0/w [ 94.434900] overlayfs: overlapping upperdir path mount: mount overlay on mnt failed: Too many levels of symbolic links $ mount -t overlay none mnt ... -o lowerdir=upper/0/u,upperdir=upper/0/u,workdir=upper/0/w [ 151.350132] overlayfs: conflicting lowerdir path mount: none is already mounted or mnt busy $ mount -t overlay none mnt ... -o lowerdir=lower:lower/a,upperdir=upper/0/u,workdir=upper/0/w [ 201.205045] overlayfs: overlapping lowerdir path mount: mount overlay on mnt failed: Too many levels of symbolic links $ mount -t overlay none mnt ... -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w $ mv base/upper/0/ base/lower/ $ find mnt/0 mnt/0 mnt/0/w find: 'mnt/0/w/work': Too many levels of symbolic links find: 'mnt/0/u': Too many levels of symbolic links Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-05-29	drm/i915/icl: Add WaDisableBankHangMode	Tvrtko Ursulin
	Disable GPU hang by default on unrecoverable ECC cache errors. v2: * Rebase. v3: * Use intel_uncore_read. (Chris) Fixes: cc38cae7c4e9 ("drm/i915/icl: Introduce initial Icelake Workarounds") Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190520110442.403-2-tvrtko.ursulin@linux.intel.com (cherry picked from commit cbe3e1d103793705204b29c6952faed537c41fe1) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-05-29	ALSA: fireface: Use ULL suffixes for 64-bit constants	Geert Uytterhoeven
	With gcc 4.1: sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_switch_fetching_mode’: sound/firewire/fireface/ff-protocol-latter.c:97: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_begin_session’: sound/firewire/fireface/ff-protocol-latter.c:170: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:197: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:205: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_finish_session’: sound/firewire/fireface/ff-protocol-latter.c:214: warning: integer constant is too large for ‘long’ type Fix this by adding the missing "ULL" suffixes. Add the same suffix to the last constant, to maintain consistency. Fixes: fd1cc9de64c2ca6c ("ALSA: fireface: add support for Fireface UCX") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-05-29	signal/arm64: Use force_sig not force_sig_fault for SIGKILL	Eric W. Biederman
	I don't think this is userspace visible but SIGKILL does not have any si_codes that use the fault member of the siginfo union. Correct this the simple way and call force_sig instead of force_sig_fault when the signal is SIGKILL. The two know places where synchronous SIGKILL are generated are do_bad_area and fpsimd_save. The call paths to force_sig_fault are: do_bad_area arm64_force_sig_fault force_sig_fault force_signal_inject arm64_notify_die arm64_force_sig_fault force_sig_fault Which means correcting this in arm64_force_sig_fault is enough to ensure the arm64 code is not misusing the generic code, which could lead to maintenance problems later. Cc: stable@vger.kernel.org Cc: Dave Martin <Dave.Martin@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29	ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops	Hui Wang
	We met another Acer Aspire laptop which has the problem on the headset-mic, the Pin 0x19 is not set the corret configuration for a mic and the pin presence can't be detected too after plugging a headset. Kailang suggested that we should set the coeff to enable the mic and apply the ALC269_FIXUP_LIFEBOOK_EXTMIC. After doing that, both headset-mic presence and headset-mic work well. The existing ALC255_FIXUP_ACER_MIC_NO_PRESENCE set the headset-mic jack to be a phantom jack. Now since the jack can support presence unsol event, let us imporve it to set the jack to be a normal jack. https://bugs.launchpad.net/bugs/1821269 Fixes: 5824ce8de7b1c ("ALSA: hda/realtek - Add support for Acer Aspire E5-475 headset mic") Cc: Chris Chiu <chiu@endlessm.com> CC: Daniel Drake <drake@endlessm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Kailang Yang <kailang@realtek.com> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-05-29	KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier	Cédric Le Goater
	When a vCPU is connected to the KVM device, it is done using its vCPU identifier in the guest. Fix the enforced limit on the vCPU identifier by taking into account the SMT mode. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting	Cédric Le Goater
	When a CPU is hot-unplugged, the EQ is deconfigured using a zero size and a zero address. In this case, there is no need to check the flag and queue size validity. Move the checks after the queue reset code section to fix CPU hot-unplug. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released	Cédric Le Goater
	Improve the release of the XIVE KVM device by clearing the file address_space, which is used to unmap the interrupt ESB pages when a device is passed-through. Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu	Paul Mackerras
	Currently the HV KVM code takes the kvm->lock around calls to kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call kvm_for_each_vcpu() internally). However, that leads to a lock order inversion problem, because these are called in contexts where the vcpu mutex is held, but the vcpu mutexes nest within kvm->lock according to Documentation/virtual/kvm/locking.txt. Hence there is a possibility of deadlock. To fix this, we simply don't take the kvm->lock mutex around these calls. This is safe because the implementations of kvm_for_each_vcpu() and kvm_get_vcpu_by_id() have been designed to be able to be called locklessly. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list	Paul Mackerras
	Currently the Book 3S KVM code uses kvm->lock to synchronize access to the kvm->arch.rtas_tokens list. Because this list is scanned inside kvmppc_rtas_hcall(), which is called with the vcpu mutex held, taking kvm->lock cause a lock inversion problem, which could lead to a deadlock. To fix this, we add a new mutex, kvm->arch.rtas_token_lock, which nests inside the vcpu mutexes, and use that instead of kvm->lock when accessing the rtas token list. This removes the lockdep_assert_held() in kvmppc_rtas_tokens_free(). At this point we don't hold the new mutex, but that is OK because kvmppc_rtas_tokens_free() is only called when the whole VM is being destroyed, and at that point nothing can be looking up a token in the list. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup	Paul Mackerras
	Currently the HV KVM code uses kvm->lock in conjunction with a flag, kvm->arch.mmu_ready, to synchronize MMU setup and hold off vcpu execution until the MMU-related data structures are ready. However, this means that kvm->lock is being taken inside vcpu->mutex, which is contrary to Documentation/virtual/kvm/locking.txt and results in lockdep warnings. To fix this, we add a new mutex, kvm->arch.mmu_setup_lock, which nests inside the vcpu mutexes, and is taken in the places where kvm->lock was taken that are related to MMU setup. Additionally we take the new mutex in the vcpu creation code at the point where we are creating a new vcore, in order to provide mutual exclusion with kvmppc_update_lpcr() and ensure that an update to kvm->arch.lpcr doesn't get missed, which could otherwise lead to a stale vcore->lpcr value. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29	KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions	Paul Mackerras
	Currently, kvmppc_xive_release() and kvmppc_xive_native_release() clear kvm->arch.mmu_ready and call kick_all_cpus_sync() as a way of ensuring that no vcpus are executing in the guest. However, future patches will change the mutex associated with kvm->arch.mmu_ready to a new mutex that nests inside the vcpu mutexes, making it difficult to continue to use this method. In fact, taking the vcpu mutex for a vcpu excludes execution of that vcpu, and we already take the vcpu mutex around the call to kvmppc_xive_[native_]cleanup_vcpu(). Once the cleanup function is done and we release the vcpu mutex, the vcpu can execute once again, but because we have cleared vcpu->arch.xive_vcpu, vcpu->arch.irq_type, vcpu->arch.xive_esc_vaddr and vcpu->arch.xive_esc_raddr, that vcpu will not be going into XIVE code any more. Thus, once we have cleaned up all of the vcpus, we are safe to clean up the rest of the XIVE state, and we don't need to use kvm->arch.mmu_ready to hold off vcpu execution. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-28	Revert "drivers: thermal: tsens: Add new operation to check if a sensor is ↵	Eduardo Valentin
	enabled" This reverts commit 3e6a8fb3308419129c7a52de6eb42feef5a919a0. Cc: Andy Gross <agross@kernel.org> Cc: David Brown <david.brown@linaro.org> Cc: Amit Kucheria <amit.kucheria@linaro.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Suggested-by: Amit Kucheria <amit.kucheria@linaro.org> Reported-by: Andy Gross <andygro@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-28	net/mlx5e: Disable rxhash when CQE compress is enabled	Saeed Mahameed
	When CQE compression is enabled (Multi-host systems), compressed CQEs might arrive to the driver rx, compressed CQEs don't have a valid hash offload and the driver already reports a hash value of 0 and invalid hash type on the skb for compressed CQEs, but this is not good enough. On a congested PCIe, where CQE compression will kick in aggressively, gro will deliver lots of out of order packets due to the invalid hash and this might cause a serious performance drop. The only valid solution, is to disable rxhash offload at all when CQE compression is favorable (Multi-host systems). Fixes: 7219ab34f184 ("net/mlx5e: CQE compression") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	net/mlx5e: restrict the real_dev of vlan device is the same as uplink device	wenxu
	When register indr block for vlan device, it should check the real_dev of vlan device is same as uplink device. Or it will set offload rule to mlx5e which will never hit. Fixes: 35a605db168c ("net/mlx5e: Offload TC e-switch rules with ingress VLAN device") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	net/mlx5: Allocate root ns memory using kzalloc to match kfree	Parav Pandit
	root ns is yet another fs core node which is freed using kfree() by tree_put_node(). Rest of the other fs core objects are also allocated using kmalloc variants. However, root ns memory is allocated using kvzalloc(). Hence allocate root ns memory using kzalloc(). Fixes: 2530236303d9e ("net/mlx5_core: Flow steering tree initialization") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	net/mlx5: Avoid double free in fs init error unwinding path	Parav Pandit
	In below code flow, for ingress acl table root ns memory leads to double free. mlx5_init_fs init_ingress_acls_root_ns() init_ingress_acl_root_ns kfree(steering->esw_ingress_root_ns); /* steering->esw_ingress_root_ns is not marked NULL / mlx5_cleanup_fs cleanup_ingress_acls_root_ns steering->esw_ingress_root_ns non NULL check passes. kfree(steering->esw_ingress_root_ns); / double free */ Similar issue exist for other tables. Hence zero out the pointers to not process the table again. Fixes: 9b93ab981e3bf ("net/mlx5: Separate ingress/egress namespaces for each vport") Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	net/mlx5: Avoid double free of root ns in the error flow path	Parav Pandit
	When root ns setup for rdma, sniffer tx and sniffer rx fails, such root ns cleanup is done by the error unwinding path of mlx5_cleanup_fs(). Below call graph shows an example for sniffer_rx_root_ns. mlx5_init_fs() init_sniffer_rx_root_ns() cleanup_root_ns(steering->sniffer_rx_root_ns); mlx5_cleanup_fs() cleanup_root_ns(steering->sniffer_rx_root_ns); /* double free of sniffer_rx_root_ns */ Hence, use the existing cleanup_fs to cleanup. Fixes: d83eb50e29de3 ("net/mlx5: Add support in RDMA RX steering") Fixes: 87d22483ce68e ("net/mlx5: Add sniffer namespaces") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	net/mlx5: Fix error handling in mlx5_load()	Saeed Mahameed
	In case mlx5_core_set_hca_defaults fails, it should jump to mlx5_cleanup_fs, fix that. Fixes: c85023e153e3 ("IB/mlx5: Add raw ethernet local loopback support") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28	Documentation: net-sysfs: Remove duplicate PHY device documentation	Florian Fainelli
	Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same duplication information. There is not currently any MDIO bus specific attribute, but there are PHY device (struct phy_device) specific attributes. Use the more precise description from sysfs-bus-mdio and carry that over to sysfs-class-net-phydev. Fixes: 86f22d04dfb5 ("net: sysfs: Document PHY device sysfs attributes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28	llc: fix skb leak in llc_build_and_send_ui_pkt()	Eric Dumazet
	If llc_mac_hdr_init() returns an error, we must drop the skb since no llc_build_and_send_ui_pkt() caller will take care of this. BUG: memory leak unreferenced object 0xffff8881202b6800 (size 2048): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline] [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline] [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669 [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline] [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608 [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662 [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950 [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173 [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430 [<000000008bdec225>] sock_create net/socket.c:1481 [inline] [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523 [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline] [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline] [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 BUG: memory leak unreferenced object 0xffff88811d750d00 (size 224): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff ...$.....h+ .... backtrace: [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline] [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline] [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline] [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28	selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu	Stefano Brivio
	In the pmtu_vti6_link_change_mtu test, both local and remote addresses for the vti6 tunnel are assigned to the same address given to the dummy interface that we use as encapsulating device with a known MTU. This works as long as the dummy interface is actually selected, via rt6_lookup(), as encapsulating device. But if the remote address of the tunnel is a local address too, the loopback interface could also be selected, and there's nothing wrong with it. This is what some older -stable kernels do (3.18.z, at least), and nothing prevents us from subtly changing FIB implementation to revert back to that behaviour in the future. Define an IPv6 prefix instead, and use two separate addresses as local and remote for vti6, so that the encapsulating device can't be a loopback interface. Reported-by: Xiumei Mu <xmu@redhat.com> Fixes: 1fad59ea1c34 ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28	dfs_cache: fix a wrong use of kfree in flush_cache_ent()	Gen Zhang
	In flush_cache_ent(), 'ce->ce_path' is allocated by kstrdup_const(). It should be freed by kfree_const(), rather than kfree(). Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-28	fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free	Murphy Zhou
	The 2nd buffer could be NULL even if iov_len is not zero. This can trigger a panic when handling symlinks. It's easy to reproduce with LTP fs_racer scripts[1] which are randomly craete/delete/link files and dirs. Fix this panic by checking if the 2nd buffer is padding before kfree, like what we do in SMB2_open_free. [1] https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/fs/racer Fixes: 2c87d6a94d16 ("cifs: Allocate memory for all iovs in smb2_ioctl") Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2019-05-28	cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case	Colin Ian King
	Currently in the case where SMB2_ioctl returns the -EOPNOTSUPP error there is a memory leak of pneg_inbuf. Fix this by returning via the out_free_inbuf exit path that will perform the relevant kfree. Addresses-Coverity: ("Resource leak") Fixes: 969ae8e8d4ee ("cifs: Accept validate negotiate if server return NT_STATUS_NOT_SUPPORTED") CC: Stable <stable@vger.kernel.org> # v5.1+ Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-28	xenbus: Avoid deadlock during suspend due to open transactions	Ross Lagerwall
	During a suspend/resume, the xenwatch thread waits for all outstanding xenstore requests and transactions to complete. This does not work correctly for transactions started by userspace because it waits for them to complete after freezing userspace threads which means the transactions have no way of completing, resulting in a deadlock. This is trivial to reproduce by running this script and then suspending the VM: import pyxs, time c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus") c.connect() c.transaction() time.sleep(3600) Even if this deadlock were resolved, misbehaving userspace should not prevent a VM from being migrated. So, instead of waiting for these transactions to complete before suspending, store the current generation id for each transaction when it is started. The global generation id is incremented during resume. If the caller commits the transaction and the generation id does not match the current generation id, return EAGAIN so that they try again. If the transaction was instead discarded, return OK since no changes were made anyway. This only affects users of the xenbus file interface. In-kernel users of xenbus are assumed to be well-behaved and complete all transactions before freezing. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2019-05-28	xen/pvcalls: Remove set but not used variable	YueHaibing
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/xen/pvcalls-front.c: In function pvcalls_front_sendmsg: drivers/xen/pvcalls-front.c:543:25: warning: variable bedata set but not used [-Wunused-but-set-variable] drivers/xen/pvcalls-front.c: In function pvcalls_front_recvmsg: drivers/xen/pvcalls-front.c:638:25: warning: variable bedata set but not used [-Wunused-but-set-variable] They are never used since introduction. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2019-05-28	Merge tag 'perf-urgent-for-mingo-5.2-20190528' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes: BPF: Jiri Olsa: - Fixup determination of end of kernel map, to avoid having BPF programs, that are after the kernel headers and just before module texts mixed up in the kernel map. tools UAPI header copies: Arnaldo Carvalho de Melo: - Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls. - Sync cpufeatures.h, sched.h, fs.h, drm.h, i915_drm.h and kvm.h headers. Namespaces: Namhyung Kim: - Add missing byte swap ops for namespace events when processing records from perf.data files that could have been recorded in a arch with a different endianness. - Fix access to the thread namespaces list by using the namespaces_lock. perf data: Shawn Landden: - Fix 'strncat may truncate' build failure with recent gcc. s/390 Thomas Richter: - Fix s390 missing module symbol and warning for non-root users in 'perf record'. arm64: Vitaly Chikunov: - Fix mksyscalltbl when system kernel headers are ahead of the kernel. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-28	tracing: Avoid memory leak in predicate_parse()	Tomas Bortoli
	In case of errors, predicate_parse() goes to the out_free label to free memory and to return an error code. However, predicate_parse() does not free the predicates of the temporary prog_stack array, thence leaking them. Link: http://lkml.kernel.org/r/20190528154338.29976-1-tomasbortoli@gmail.com Cc: stable@vger.kernel.org Fixes: 80765597bc587 ("tracing: Rewrite filter logic to be simpler and faster") Reported-by: syzbot+6b8e0fb820e570c59e19@syzkaller.appspotmail.com Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> [ Added protection around freeing prog_stack[i].pred ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-28	habanalabs: fix bug in checking huge page optimization	Oded Gabbay
	This patch fix a bug in the mmu code that checks whether we can use huge page mappings for host pages. The code is supposed to enable huge page mappings only if ALL DMA addresses are aligned to 2MB AND the number of pages in each DMA chunk is a modulo of the number of pages in 2MB. However, the code ignored the first requirement for the first DMA chunk. This patch fix that issue by making sure the requirement of address alignment is validated against all DMA chunks. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-28	mmc: sdhci: Fix SDIO IRQ thread deadlock	Adrian Hunter
	Since commit c07a48c26519 ("mmc: sdhci: Remove finish_tasklet"), the IRQ thread might be used to complete requests, but the IRQ thread is also used to process SDIO card interrupts. This can cause a deadlock when the SDIO processing tries to access the card since that would also require the IRQ thread. Change SDHCI to use sdio_signal_irq() to schedule a work item instead. That also requires implementing the ->ack_sdio_irq() mmc host op. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: c07a48c26519 ("mmc: sdhci: Remove finish_tasklet") Reported-by: Brian Masney <masneyb@onstation.org> Tested-by: Brian Masney <masneyb@onstation.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>