summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-02-07Merge tag 'for-5.0/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Both of these fixes address issues in changes merged for 5.0-rc4: - Fix DM core's missing memory barrier before waitqueue_active() calls. - Fix DM core's clone_bio() to work when cloning a subset of a bio with an integrity payload; bio_integrity_trim() wasn't getting called due to bio_trim()'s early return" * tag 'for-5.0/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: don't use bio_trim() afterall dm: add memory barrier before waitqueue_active
2019-02-07Merge tag 'irqchip-5.0-3' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip updates from Marc Zyngier: - Another GICv3 ITS fix for devices sharing the same DevID - Don't return invalid data on exhaustion of the GICv3 LPI pool - Fix a GICv3 field decoding bug leading to memory over-allocation - Init GICv4 at boot time instead of lazy init - Fix interrupt masking on PJ4
2019-02-07blktrace: Show requests without sectorJan Kara
Currently, blktrace will not show requests that don't have any data as rq->__sector is initialized to -1 which is out of device range and thus discarded by act_log_check(). This is most notably the case for cache flush requests sent to the device. Fix the problem by making blk_rq_trace_sector() return 0 for requests without initialized sector. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-07mips: cm: reprime error causeVladimir Kondratiev
Accordingly to the documentation ---cut--- The GCR_ERROR_CAUSE.ERR_TYPE field and the GCR_ERROR_MULT.ERR_TYPE fields can be cleared by either a reset or by writing the current value of GCR_ERROR_CAUSE.ERR_TYPE to the GCR_ERROR_CAUSE.ERR_TYPE register. ---cut--- Do exactly this. Original value of cm_error may be safely written back; it clears error cause and keeps other bits untouched. Fixes: 3885c2b463f6 ("MIPS: CM: Add support for reporting CM cache errors") Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@linux.intel.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v4.3+
2019-02-07mips: loongson64: remove unreachable(), fix loongson_poweroff().Yifeng Li
On my Yeeloong 8089, I noticed the machine fails to shutdown properly, and often, the function mach_prepare_reboot() is unexpectedly executed, thus the machine reboots instead. A wait loop is needed to ensure the system is in a well-defined state before going down. In commit 997e93d4df16 ("MIPS: Hang more efficiently on halt/powerdown/restart"), a general superset of the wait loop for all platforms is already provided, so we don't need to implement our own. This commit simply removes the unreachable() compiler marco after mach_prepare_reboot(), thus allowing the execution of machine_hang(). My test shows that the machine is now able to shutdown successfully. Please note that there are two different bugs preventing the machine from shutting down, another work-in-progress commit is needed to fix a lockup in cpufreq / i8259 driver, please read Reference, this commit does not fix that bug. Reference: https://lkml.org/lkml/2019/2/5/908 Signed-off-by: Yifeng Li <tomli@tomli.me> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: Huacai Chen <chenhc@lemote.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: stable@vger.kernel.org # v4.17+
2019-02-07ALSA: usb-audio: Fix implicit fb endpoint setup by quirkManuel Reinhardt
The commit a60945fd08e4 ("ALSA: usb-audio: move implicit fb quirks to separate function") introduced an error in the handling of quirks for implicit feedback endpoints. This commit fixes this. If a quirk successfully sets up an implicit feedback endpoint, usb-audio no longer tries to find the implicit fb endpoint itself. Fixes: a60945fd08e4 ("ALSA: usb-audio: move implicit fb quirks to separate function") Signed-off-by: Manuel Reinhardt <manuel.rhdt@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-07Merge branch 'ipv6-fixes'David S. Miller
Hangbin Liu says: ==================== fix two kernel panics when disabled IPv6 on boot up When disabled IPv6 on boot up, since there is no ipv6 route tables, we should not call rt6_lookup. Fix them by checking if we have inet6_dev pointer on netdevice. v2: Fix idev reference leak, declarations and code mixing as Stefano, Eric pointed. Since we only want to check if idev exists and not reference it, use __in6_dev_get() insteand of in6_dev_get(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()Hangbin Liu
If we disabled IPv6 from the kernel command line (ipv6.disable=1), we should not call ip6_err_gen_icmpv6_unreach(). This: ip link add sit1 type sit local 192.0.2.1 remote 192.0.2.2 ttl 1 ip link set sit1 up ip addr add 198.51.100.1/24 dev sit1 ping 198.51.100.2 if IPv6 is disabled at boot time, will crash the kernel. v2: there's no need to use in6_dev_get(), use __in6_dev_get() instead, as we only need to check that idev exists and we are under rcu_read_lock() (from netif_receive_skb_internal()). Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error") Cc: Oussama Ghorbel <ghorbel@pivasoftware.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07geneve: should not call rt6_lookup() when ipv6 was disabledHangbin Liu
When we add a new GENEVE device with IPv6 remote, checking only for IS_ENABLED(CONFIG_IPV6) is not enough as we may disable IPv6 in the kernel command line (ipv6.disable=1), and calling rt6_lookup() would cause a NULL pointer dereference. v2: - don't mix declarations and code (reported by Stefano Brivio, Eric Dumazet) - there's no need to use in6_dev_get() as we only need to check that idev exists (reported by David Ahern). This is under RTNL, so we can simply use __in6_dev_get() instead (Stefano, Eric). Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: c40e89fd358e9 ("geneve: configure MTU based on a lower device") Cc: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07KVM: nVMX: unconditionally cancel preemption timer in free_nested ↵Peter Shier
(CVE-2019-7221) Bugzilla: 1671904 There are multiple code paths where an hrtimer may have been started to emulate an L1 VMX preemption timer that can result in a call to free_nested without an intervening L2 exit where the hrtimer is normally cancelled. Unconditionally cancel in free_nested to cover all cases. Embargoed until Feb 7th 2019. Signed-off-by: Peter Shier <pshier@google.com> Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Message-Id: <20181011184646.154065-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)Paolo Bonzini
Bugzilla: 1671930 Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with memory operand, INVEPT, INVVPID) can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Embargoed until Feb 7th 2019. Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)Jann Horn
kvm_ioctl_create_device() does the following: 1. creates a device that holds a reference to the VM object (with a borrowed reference, the VM's refcount has not been bumped yet) 2. initializes the device 3. transfers the reference to the device to the caller's file descriptor table 4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real reference The ownership transfer in step 3 must not happen before the reference to the VM becomes a proper, non-borrowed reference, which only happens in step 4. After step 3, an attacker can close the file descriptor and drop the borrowed reference, which can cause the refcount of the kvm object to drop to zero. This means that we need to grab a reference for the device before anon_inode_getfd(), otherwise the VM can disappear from under us. Fixes: 852b6d57dc7f ("kvm: add device control API") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07ALSA: hda - Add quirk for HP EliteBook 840 G5Jurica Vukadin
This enables mute LED support and fixes switching jacks when the laptop is docked. Signed-off-by: Jurica Vukadin <jurica.vukadin@rt-rk.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-07mt76x0u: fix suspend/resumeStanislaw Gruszka
We need to reset MCU and do other initializations on resume otherwise MT7610U device will fail to initialize, what cause system hung due to USB requests timeouts. Patch fixes 4.19 -> 4.20 regression. Cc: stable@vger.kernel.org # 4.20+ Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-07ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplugRussell King
If we have a kernel configured for periodic timer interrupts, and we have cpuidle enabled, then we end up with CPU1 losing timer interupts after a hotplug. This can manifest itself in RCU stall warnings, or userspace becoming unresponsive. The problem is that the kernel initially wants to use the TWD timer for interrupts, but the TWD loses context when we enter the C3 cpuidle state. Nothing reprograms the TWD after idle. We have solved this in the past by switching to broadcast timer ticks, and cpuidle44xx switches to that mode at boot time. However, there is nothing to switch from periodic mode local timers after a hotplug operation. We call tick_broadcast_enter() in omap_enter_idle_coupled(), which one would expect would take care of the issue, but internally this only deals with one-shot local timers - tick_broadcast_enable() on the other hand only deals with periodic local timers. So, we need to call both. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> [tony@atomide.com: just standardized the subject line] Signed-off-by: Tony Lindgren <tony@atomide.com>
2019-02-07signal: Better detection of synchronous signalsEric W. Biederman
Recently syzkaller was able to create unkillablle processes by creating a timer that is delivered as a thread local signal on SIGHUP, and receiving SIGHUP SA_NODEFERER. Ultimately causing a loop failing to deliver SIGHUP but always trying. When the stack overflows delivery of SIGHUP fails and force_sigsegv is called. Unfortunately because SIGSEGV is numerically higher than SIGHUP next_signal tries again to deliver a SIGHUP. From a quality of implementation standpoint attempting to deliver the timer SIGHUP signal is wrong. We should attempt to deliver the synchronous SIGSEGV signal we just forced. We can make that happening in a fairly straight forward manner by instead of just looking at the signal number we also look at the si_code. In particular for exceptions (aka synchronous signals) the si_code is always greater than 0. That still has the potential to pick up a number of asynchronous signals as in a few cases the same si_codes that are used for synchronous signals are also used for asynchronous signals, and SI_KERNEL is also included in the list of possible si_codes. Still the heuristic is much better and timer signals are definitely excluded. Which is enough to prevent all known ways for someone sending a process signals fast enough to cause unexpected and arguably incorrect behavior. Cc: stable@vger.kernel.org Fixes: a27341cd5fcb ("Prioritize synchronous signals over 'normal' signals") Tested-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-02-07signal: Always notice exiting tasksEric W. Biederman
Recently syzkaller was able to create unkillablle processes by creating a timer that is delivered as a thread local signal on SIGHUP, and receiving SIGHUP SA_NODEFERER. Ultimately causing a loop failing to deliver SIGHUP but always trying. Upon examination it turns out part of the problem is actually most of the solution. Since 2.5 signal delivery has found all fatal signals, marked the signal group for death, and queued SIGKILL in every threads thread queue relying on signal->group_exit_code to preserve the information of which was the actual fatal signal. The conversion of all fatal signals to SIGKILL results in the synchronous signal heuristic in next_signal kicking in and preferring SIGHUP to SIGKILL. Which is especially problematic as all fatal signals have already been transformed into SIGKILL. Instead of dequeueing signals and depending upon SIGKILL to be the first signal dequeued, first test if the signal group has already been marked for death. This guarantees that nothing in the signal queue can prevent a process that needs to exit from exiting. Cc: stable@vger.kernel.org Tested-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-02-07ASoC: samsung: Prevent clk_get_rate() calls in atomic contextSylwester Nawrocki
This patch moves clk_get_rate() call from trigger() to hw_params() callback to avoid calling sleeping clk API from atomic context and prevent deadlock as indicated below. Before this change clk_get_rate() was being called with same spinlock held as the one passed to the clk API when registering clocks exposed by the I2S driver. [ 82.109780] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 [ 82.117009] in_atomic(): 1, irqs_disabled(): 128, pid: 1554, name: speaker-test [ 82.124235] 3 locks held by speaker-test/1554: [ 82.128653] #0: cc8c5328 (snd_pcm_link_rwlock){...-}, at: snd_pcm_stream_lock_irq+0x20/0x38 [ 82.137058] #1: ec9eda17 (&(&substream->self_group.lock)->rlock){..-.}, at: snd_pcm_ioctl+0x900/0x1268 [ 82.146417] #2: 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4 [ 82.154650] irq event stamp: 8144 [ 82.157949] hardirqs last enabled at (8143): [<c0a0f574>] _raw_read_unlock_irq+0x24/0x5c [ 82.166089] hardirqs last disabled at (8144): [<c0a0f6a8>] _raw_read_lock_irq+0x18/0x58 [ 82.174063] softirqs last enabled at (8004): [<c01024e4>] __do_softirq+0x3a4/0x66c [ 82.181688] softirqs last disabled at (7997): [<c012d730>] irq_exit+0x140/0x168 [ 82.188964] Preemption disabled at: [ 82.188967] [<00000000>] (null) [ 82.195728] CPU: 6 PID: 1554 Comm: speaker-test Not tainted 5.0.0-rc5-00192-ga6e6caca8f03 #191 [ 82.204302] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 82.210376] [<c0111a54>] (unwind_backtrace) from [<c010d8f4>] (show_stack+0x10/0x14) [ 82.218084] [<c010d8f4>] (show_stack) from [<c09ef004>] (dump_stack+0x90/0xc8) [ 82.225278] [<c09ef004>] (dump_stack) from [<c0152980>] (___might_sleep+0x22c/0x2c8) [ 82.232990] [<c0152980>] (___might_sleep) from [<c0a0a2e4>] (__mutex_lock+0x28/0xa3c) [ 82.240788] [<c0a0a2e4>] (__mutex_lock) from [<c0a0ad80>] (mutex_lock_nested+0x1c/0x24) [ 82.248763] [<c0a0ad80>] (mutex_lock_nested) from [<c04923dc>] (clk_prepare_lock+0x78/0xec) [ 82.257079] [<c04923dc>] (clk_prepare_lock) from [<c049538c>] (clk_core_get_rate+0xc/0x5c) [ 82.265309] [<c049538c>] (clk_core_get_rate) from [<c0766b18>] (i2s_trigger+0x490/0x6d4) [ 82.273369] [<c0766b18>] (i2s_trigger) from [<c074fec4>] (soc_pcm_trigger+0x100/0x140) [ 82.281254] [<c074fec4>] (soc_pcm_trigger) from [<c07378a0>] (snd_pcm_do_start+0x2c/0x30) [ 82.289400] [<c07378a0>] (snd_pcm_do_start) from [<c07376cc>] (snd_pcm_action_single+0x38/0x78) [ 82.298065] [<c07376cc>] (snd_pcm_action_single) from [<c073a450>] (snd_pcm_ioctl+0x910/0x1268) [ 82.306734] [<c073a450>] (snd_pcm_ioctl) from [<c0292344>] (do_vfs_ioctl+0x90/0x9ec) [ 82.314443] [<c0292344>] (do_vfs_ioctl) from [<c0292cd4>] (ksys_ioctl+0x34/0x60) [ 82.321808] [<c0292cd4>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28) [ 82.329431] Exception stack(0xeb875fa8 to 0xeb875ff0) [ 82.334459] 5fa0: 00033c18 b6e31000 00000004 00004142 00033d80 00033d80 [ 82.342605] 5fc0: 00033c18 b6e31000 00008000 00000036 00008000 00000000 beea38a8 00008000 [ 82.350748] 5fe0: b6e3142c beea384c b6da9a30 b6c9212c [ 82.355789] [ 82.357245] ====================================================== [ 82.363397] WARNING: possible circular locking dependency detected [ 82.369551] 5.0.0-rc5-00192-ga6e6caca8f03 #191 Tainted: G W [ 82.376395] ------------------------------------------------------ [ 82.382548] speaker-test/1554 is trying to acquire lock: [ 82.387834] 6d2007f4 (prepare_lock){+.+.}, at: clk_prepare_lock+0x78/0xec [ 82.394593] [ 82.394593] but task is already holding lock: [ 82.400398] 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4 [ 82.408197] [ 82.408197] which lock already depends on the new lock. [ 82.416343] [ 82.416343] the existing dependency chain (in reverse order) is: [ 82.423795] [ 82.423795] -> #1 (&(&pri_dai->spinlock)->rlock){..-.}: [ 82.430472] clk_mux_set_parent+0x34/0xb8 [ 82.434975] clk_core_set_parent_nolock+0x1c4/0x52c [ 82.440347] clk_set_parent+0x38/0x6c [ 82.444509] of_clk_set_defaults+0xc8/0x308 [ 82.449186] of_clk_add_provider+0x84/0xd0 [ 82.453779] samsung_i2s_probe+0x408/0x5f8 [ 82.458376] platform_drv_probe+0x48/0x98 [ 82.462879] really_probe+0x224/0x3f4 [ 82.467037] driver_probe_device+0x70/0x1c4 [ 82.471716] bus_for_each_drv+0x44/0x8c [ 82.476049] __device_attach+0xa0/0x138 [ 82.480382] bus_probe_device+0x88/0x90 [ 82.484715] deferred_probe_work_func+0x6c/0xbc [ 82.489741] process_one_work+0x200/0x740 [ 82.494246] worker_thread+0x2c/0x4c8 [ 82.498408] kthread+0x128/0x164 [ 82.502131] ret_from_fork+0x14/0x20 [ 82.506204] (null) [ 82.508976] [ 82.508976] -> #0 (prepare_lock){+.+.}: [ 82.514264] __mutex_lock+0x60/0xa3c [ 82.518336] mutex_lock_nested+0x1c/0x24 [ 82.522756] clk_prepare_lock+0x78/0xec [ 82.527088] clk_core_get_rate+0xc/0x5c [ 82.531421] i2s_trigger+0x490/0x6d4 [ 82.535494] soc_pcm_trigger+0x100/0x140 [ 82.539913] snd_pcm_do_start+0x2c/0x30 [ 82.544246] snd_pcm_action_single+0x38/0x78 [ 82.549012] snd_pcm_ioctl+0x910/0x1268 [ 82.553345] do_vfs_ioctl+0x90/0x9ec [ 82.557417] ksys_ioctl+0x34/0x60 [ 82.561229] ret_fast_syscall+0x0/0x28 [ 82.565477] 0xbeea384c [ 82.568421] [ 82.568421] other info that might help us debug this: [ 82.568421] [ 82.576394] Possible unsafe locking scenario: [ 82.576394] [ 82.582285] CPU0 CPU1 [ 82.586792] ---- ---- [ 82.591297] lock(&(&pri_dai->spinlock)->rlock); [ 82.595977] lock(prepare_lock); [ 82.601782] lock(&(&pri_dai->spinlock)->rlock); [ 82.608975] lock(prepare_lock); [ 82.612268] [ 82.612268] *** DEADLOCK *** Fixes: 647d04f8e07a ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined") Reported-by: Krzysztof Kozłowski <krzk@kernel.org> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-02-07ASoC: rsnd: ssiu: correct shift bit for ssiu9Jiada Wang
Currently "0xf << 36" is used to clear SSIU-9 internal buffer state, which overflows 32-bit value according to user reference manual, it is always bit4 ~ bit7 of SSI_SYS_STATUS[1,3,5,7] registers indicate SSIU-9's buffer state, so "0xf << 4" should be used. This patch fix incorrect shifting issue in SSIU-9 case Fixes: commit b7169ddea2f2 ("ASoC: rsnd: remove RSND_REG_ from rsnd_reg") Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-02-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fix from Jiri Kosina: "A fix for a bug in hid-debug that can lock up the kernel in infinite loop (CVE-2019-3819), from Vladis Dronov" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: debug: fix the ring buffer implementation
2019-02-07KVM: arm64: Forbid kprobing of the VHE world-switch codeJames Morse
On systems with VHE the kernel and KVM's world-switch code run at the same exception level. Code that is only used on a VHE system does not need to be annotated as __hyp_text as it can reside anywhere in the kernel text. __hyp_text was also used to prevent kprobes from patching breakpoint instructions into this region, as this code runs at a different exception level. While this is no longer true with VHE, KVM still switches VBAR_EL1, meaning a kprobe's breakpoint executed in the world-switch code will cause a hyp-panic. echo "p:weasel sysreg_save_guest_state_vhe" > /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/weasel/enable lkvm run -k /boot/Image --console serial -p "console=ttyS0 earlycon=uart,mmio,0x3f8" # lkvm run -k /boot/Image -m 384 -c 3 --name guest-1474 Info: Placing fdt at 0x8fe00000 - 0x8fffffff Info: virtio-mmio.devices=0x200@0x10000:36 Info: virtio-mmio.devices=0x200@0x10200:37 Info: virtio-mmio.devices=0x200@0x10400:38 [ 614.178186] Kernel panic - not syncing: HYP panic: [ 614.178186] PS:404003c9 PC:ffff0000100d70e0 ESR:f2000004 [ 614.178186] FAR:0000000080080000 HPFAR:0000000000800800 PAR:1d00007edbadc0de [ 614.178186] VCPU:00000000f8de32f1 [ 614.178383] CPU: 2 PID: 1482 Comm: kvm-vcpu-0 Not tainted 5.0.0-rc2 #10799 [ 614.178446] Call trace: [ 614.178480] dump_backtrace+0x0/0x148 [ 614.178567] show_stack+0x24/0x30 [ 614.178658] dump_stack+0x90/0xb4 [ 614.178710] panic+0x13c/0x2d8 [ 614.178793] hyp_panic+0xac/0xd8 [ 614.178880] kvm_vcpu_run_vhe+0x9c/0xe0 [ 614.178958] kvm_arch_vcpu_ioctl_run+0x454/0x798 [ 614.179038] kvm_vcpu_ioctl+0x360/0x898 [ 614.179087] do_vfs_ioctl+0xc4/0x858 [ 614.179174] ksys_ioctl+0x84/0xb8 [ 614.179261] __arm64_sys_ioctl+0x28/0x38 [ 614.179348] el0_svc_common+0x94/0x108 [ 614.179401] el0_svc_handler+0x38/0x78 [ 614.179487] el0_svc+0x8/0xc [ 614.179558] SMP: stopping secondary CPUs [ 614.179661] Kernel Offset: disabled [ 614.179695] CPU features: 0x003,2a80aa38 [ 614.179758] Memory Limit: none [ 614.179858] ---[ end Kernel panic - not syncing: HYP panic: [ 614.179858] PS:404003c9 PC:ffff0000100d70e0 ESR:f2000004 [ 614.179858] FAR:0000000080080000 HPFAR:0000000000800800 PAR:1d00007edbadc0de [ 614.179858] VCPU:00000000f8de32f1 ]--- Annotate the VHE world-switch functions that aren't marked __hyp_text using NOKPROBE_SYMBOL(). Signed-off-by: James Morse <james.morse@arm.com> Fixes: 3f5c90b890ac ("KVM: arm64: Introduce VHE-specific kvm_vcpu_run") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07KVM: arm64: Relax the restriction on using stage2 PUD huge mappingSuzuki K Poulose
We restrict mapping the PUD huge pages in stage2 to only when the stage2 has 4 level page table, leaving the feature unused with the default IPA size. But we could use it even with a 3 level page table, i.e, when the PUD level is folded into PGD, just like the stage1. Relax the condition to allow using the PUD huge page mappings at stage2 when it is possible. Cc: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07arm: KVM: Add missing kvm_stage2_has_pmd() helperMarc Zyngier
Fixup 32bit by providing the now required helper. Cc: Suzuki Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07KVM: arm/arm64: vgic: Always initialize the group of private IRQsChristoffer Dall
We currently initialize the group of private IRQs during kvm_vgic_vcpu_init, and the value of the group depends on the GIC model we are emulating. However, CPUs created before creating (and initializing) the VGIC might end up with the wrong group if the VGIC is created as GICv3 later. Since we have no enforced ordering of creating the VGIC and creating VCPUs, we can end up with part the VCPUs being properly intialized and the remaining incorrectly initialized. That also means that we have no single place to do the per-cpu data structure initialization which depends on knowing the emulated GIC model (which is only the group field). This patch removes the incorrect comment from kvm_vgic_vcpu_init and initializes the group of all previously created VCPUs's private interrupts in vgic_init in addition to the existing initialization in kvm_vgic_vcpu_init. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07arm/arm64: KVM: Don't panic on failure to properly reset system registersMarc Zyngier
Failing to properly reset system registers is pretty bad. But not quite as bad as bringing the whole machine down... So warn loudly, but slightly more gracefully. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com>
2019-02-07arm/arm64: KVM: Allow a VCPU to fully reset itselfMarc Zyngier
The current kvm_psci_vcpu_on implementation will directly try to manipulate the state of the VCPU to reset it. However, since this is not done on the thread that runs the VCPU, we can end up in a strangely corrupted state when the source and target VCPUs are running at the same time. Fix this by factoring out all reset logic from the PSCI implementation and forwarding the required information along with a request to the target VCPU. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2019-02-07KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loadedChristoffer Dall
We have two ways to reset a vcpu: - either through VCPU_INIT - or through a PSCI_ON call The first one is easy to reason about. The second one is implemented in a more bizarre way, as it is the vcpu that handles PSCI_ON that resets the vcpu that is being powered-on. As we need to turn the logic around and have the target vcpu to reset itself, we must take some preliminary steps. Resetting the VCPU state modifies the system register state in memory, but this may interact with vcpu_load/vcpu_put if running with preemption disabled, which in turn may lead to corrupted system register state. Address this by disabling preemption and doing put/load if required around the reset logic. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07ACPI: Set debug output flags independent of ACPICAErik Schmauss
There was a divergence between Linux and ACPICA on the definition of ACPI_DEBUG_DEFAULT. This divergence was solved by taking ACPICA's definition in 4c1379d7bb42. After resolving the divergence, it was clear that Linux users wanted to use their old set of debug flags. This change fixes the divergence by setting these debug flags during acpi_early_init() rather than during global variable initialization in acpixf.h (owned by ACPICA). Fixes: 4c1379d7bb42 ("ACPICA: Debug output: Add option to display method/object evaluation") Reported-by: Michael J Ruhl <michael.j.ruhl@intel.com> Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-07Revert "s390/pci: remove bit_lock usage in interrupt handler"Sebastian Ott
This reverts commit 9594ca6b87d9f11e9f14ac31581e0e5d79a8e839. With the handle_simple_irq irq_flow_handler it must be ensured to not call generic_handle_irq with the same IRQ number on 2 CPUs at the same time (interrupts are floating on s390). Contrary to my initial investigation the irq_desc's lock usage in handle_simple_irq does not ensure this. Thus re-introduce the bit- lock usage in s390's pci handler. Reported-by: Ursula Braun <ubraun@linux.ibm.com> Reported-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07Merge tag 'sound-5.0-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of a few small fixes. The most significant one is the fix for the possible race at loading HD-audio drivers. This has been present for long time and surfaced only in a rare occasion, but finally spotted out" * tag 'sound-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/ca0132 - Fix build error without CONFIG_PCI ALSA: compress: Fix stop handling on compressed capture streams ALSA: usb-audio: Add support for new T+A USB DAC ALSA: hda - Serialize codec registrations ALSA: hda/realtek - Use a common helper for hp pin reference ALSA: hda/realtek - Fix lose hp_pins for disable auto mute ALSA: hda/realtek - Headset microphone support for System76 darp5
2019-02-07Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "A small fix for a uapi header, and a fix for VDPA for non-x86 guests" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: drop internal struct from UAPI virtio: support VIRTIO_F_ORDER_PLATFORM
2019-02-07Merge tag 'trace-v5.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This has two fixes for uprobe code. - Cut and paste fix to have uprobe printks say "uprobe" and not "kprobe" - Add terminating '\0' byte when copying function arguments" * tag 'trace-v5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/uprobes: Fix output for multiple string arguments tracing: uprobes: Fix typo in pr_fmt string
2019-02-07Merge tag 'fuse-fixes-5.0-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "A fix for a CUSE regression introduced in v4.20, as well as fixes for a couple of old bugs" * tag 'fuse-fixes-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: decrement NR_WRITEBACK_TEMP on the right page fuse: call pipe_buf_release() under pipe lock cuse: fix ioctl fuse: handle zero sized retrieve correctly
2019-02-07Merge tag 'pinctrl-v5.0-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Mediatek Kconfig fix - Sunxi regulator, IRQ banks and pin base fixup - Intel Cherryview Strago DMI workaround - Potential regmap problem on mcp23s08 * tag 'pinctrl-v5.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sunxi: Correct number of IRQ banks on H6 main pin controller pinctrl: mcp23s08: spi: Fix regmap allocation for mcp23s18 pinctrl: cherryview: fix Strago DMI workaround pinctrl: sunxi: Consider pin_base when calculating regulator array index pinctrl: sunxi: Fix and simplify pin bank regulator handling pinctrl: mediatek: fix Kconfig build errors for moore core
2019-02-06net: Don't default Cavium PTP driver to 'y'Bjorn Helgaas
8c56df372bc1 ("net: add support for Cavium PTP coprocessor") added the Cavium PTP coprocessor driver and enabled it by default. Remove the "default y" because the driver only applies to Cavium ThunderX processors. Fixes: 8c56df372bc1 ("net: add support for Cavium PTP coprocessor") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: broadcom: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop ↵Yang Wei
profiles dev_consume_skb_irq() should be called in sbdma_tx_process() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: via-velocity: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop ↵Yang Wei
profiles dev_consume_skb_irq() should be called in velocity_free_tx_buf() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: tehuti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei
dev_consume_skb_irq() should be called in bdx_tx_cleanup() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: sun: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei
dev_consume_skb_irq() should be called when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: fsl_ucc_hdlc: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop ↵Yang Wei
profiles dev_consume_skb_irq() should be called in hdlc_tx_done() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: fec_mpc52xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop ↵Yang Wei
profiles dev_consume_skb_irq() should be called in mpc52xx_fec_tx_interrupt() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: smsc: epic100: replace dev_kfree_skb_irq by dev_consume_skb_irq for ↵Yang Wei
drop profiles dev_consume_skb_irq() should be called in epic_tx() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: dscc4: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei
dev_consume_skb_irq() should be called in dscc4_tx_irq() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: tulip: de2104x: replace dev_kfree_skb_irq by dev_consume_skb_irq for ↵Yang Wei
drop profiles dev_consume_skb_irq() should be called in de_tx() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: defxx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profilesYang Wei
dev_consume_skb_irq() should be called in dfx_xmt_done() when skb xmit done. It makes drop profiles(dropwatch, perf) more friendly. Signed-off-by: Yang Wei <yang.wei9@zte.com.cn> Reviewed-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net/mlx5e: Don't overwrite pedit action when multiple pedit usedTonghao Zhang
In some case, we may use multiple pedit actions to modify packets. The command shown as below: the last pedit action is effective. $ tc filter add dev netdev_rep parent ffff: protocol ip prio 1 \ flower skip_sw ip_proto icmp dst_ip 3.3.3.3 \ action pedit ex munge ip dst set 192.168.1.100 pipe \ action pedit ex munge eth src set 00:00:00:00:00:01 pipe \ action pedit ex munge eth dst set 00:00:00:00:00:02 pipe \ action csum ip pipe \ action tunnel_key set src_ip 1.1.1.100 dst_ip 1.1.1.200 dst_port 4789 id 100 \ action mirred egress redirect dev vxlan0 To fix it, we add max_mod_hdr_actions to mlx5e_tc_flow_parse_attr struction, max_mod_hdr_actions will store the max pedit action number we support and num_mod_hdr_actions indicates how many pedit action we used, and store all pedit action to mod_hdr_actions. Fixes: d79b6df6b10a ("net/mlx5e: Add parsing of TC pedit actions to HW format") Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net/mlx5e: Update hw flows when encap source mac changedTonghao Zhang
When we offload tc filters to hardware, hardware flows can be updated when mac of encap destination ip is changed. But we ignore one case, that the mac of local encap ip can be changed too, so we should also update them. To fix it, add route_dev in mlx5e_encap_entry struct to save the local encap netdevice, and when mac changed, kernel will flush all the neighbour on the netdevice and send NETEVENT_NEIGH_UPDATE event. The mlx5 driver will delete the flows and add them when neighbour available again. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Cc: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'qed-Bug-fixes'David S. Miller
Manish Chopra says: ==================== qed*: Bug fixes. This series contains general qed/qede fixes. Please consider applying this to "net" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06qed*: Advance drivers version to 8.37.0.20Manish Chopra
Version update for qed/qede modules. Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06qed: Change verbosity for coalescing message.Rahul Verma
Fix unnecessary logging of message in an expected default case where coalescing value read (via ethtool -c) migh not be valid unless they are configured explicitly in the hardware using ethtool -C. Signed-off-by: Rahul Verma <rverma@marvell.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>