summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-05-19Merge tag 'hyperv-fixes-signed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fix from Wei Liu: "One patch from Vitaly to fix reenlightenment notifications" * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Properly suspend/resume reenlightenment notifications
2020-05-19Merge tag 'iommu-fixes-v5.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "All related to the AMD IOMMU driver: - ACPI table parser fix to correctly read the UID of ACPI devices - ACPI UID device matching fix - Fix deferred device attachment to a domain in kdump kernels when the IOMMU driver uses the dma-iommu DMA-API implementation" * tag 'iommu-fixes-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix deferred domain attachment iommu/amd: Fix get_acpihid_device_id() iommu/amd: Fix over-read of ACPI UID from IVRS table
2020-05-19vsprintf: don't obfuscate NULL and error pointersIlya Dryomov
I don't see what security concern is addressed by obfuscating NULL and IS_ERR() error pointers, printed with %p/%pK. Given the number of sites where %p is used (over 10000) and the fact that NULL pointers aren't uncommon, it probably wouldn't take long for an attacker to find the hash that corresponds to 0. Although harder, the same goes for most common error values, such as -1, -2, -11, -14, etc. The NULL part actually fixes a regression: NULL pointers weren't obfuscated until commit 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers") which went into 5.2. I'm tacking the IS_ERR() part on here because error pointers won't leak kernel addresses and printing them as pointers shouldn't be any different from e.g. %d with PTR_ERR_OR_ZERO(). Obfuscating them just makes debugging based on existing pr_debug and friends excruciating. Note that the "always print 0's for %pK when kptr_restrict == 2" behaviour which goes way back is left as is. Example output with the patch applied: ptr error-ptr NULL %p: 0000000001f8cc5b fffffffffffffff2 0000000000000000 %pK, kptr = 0: 0000000001f8cc5b fffffffffffffff2 0000000000000000 %px: ffff888048c04020 fffffffffffffff2 0000000000000000 %pK, kptr = 1: ffff888048c04020 fffffffffffffff2 0000000000000000 %pK, kptr = 2: 0000000000000000 0000000000000000 0000000000000000 Fixes: 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-19locking/lockdep: Replace zero-length array with flexible-arrayGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200507185804.GA15036@embeddedor
2020-05-19sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq listVincent Guittot
Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair() are quite close and follow the same sequence for enqueuing an entity in the cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as enqueue_task_fair(). This fixes a problem already faced with the latter and add an optimization in the last for_each_sched_entity loop. Fixes: fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) Reported-by Tao Zhou <zohooouoto@zoho.com.cn> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Ben Segall <bsegall@google.com> Link: https://lkml.kernel.org/r/20200513135528.4742-1-vincent.guittot@linaro.org
2020-05-19sched/debug: Fix requested task uclamp values shown in procfsPavankumar Kondeti
The intention of commit 96e74ebf8d59 ("sched/debug: Add task uclamp values to SCHED_DEBUG procfs") was to print requested and effective task uclamp values. The requested values printed are read from p->uclamp, which holds the last effective values. Fix this by printing the values from p->uclamp_req. Fixes: 96e74ebf8d59 ("sched/debug: Add task uclamp values to SCHED_DEBUG procfs") Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/1589115401-26391-1-git-send-email-pkondeti@codeaurora.org
2020-05-19sched/fair: Fix enqueue_task_fair() warning some morePhil Auld
sched/fair: Fix enqueue_task_fair warning some more The recent patch, fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) did not fully resolve the issues with the rq->tmp_alone_branch != &rq->leaf_cfs_rq_list warning in enqueue_task_fair. There is a case where the first for_each_sched_entity loop exits due to on_rq, having incompletely updated the list. In this case the second for_each_sched_entity loop can further modify se. The later code to fix up the list management fails to do what is needed because se does not point to the sched_entity which broke out of the first loop. The list is not fixed up because the throttled parent was already added back to the list by a task enqueue in a parallel child hierarchy. Address this by calling list_add_leaf_cfs_rq if there are throttled parents while doing the second for_each_sched_entity loop. Fixes: fe61468b2cb ("sched/fair: Fix enqueue_task_fair warning") Suggested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20200512135222.GC2201@lorien.usersys.redhat.com
2020-05-19x86/mmiotrace: Use cpumask_available() for cpumask_var_t variablesNathan Chancellor
When building with Clang + -Wtautological-compare and CONFIG_CPUMASK_OFFSTACK unset: arch/x86/mm/mmio-mod.c:375:6: warning: comparison of array 'downed_cpus' equal to a null pointer is always false [-Wtautological-pointer-compare] if (downed_cpus == NULL && ^~~~~~~~~~~ ~~~~ arch/x86/mm/mmio-mod.c:405:6: warning: comparison of array 'downed_cpus' equal to a null pointer is always false [-Wtautological-pointer-compare] if (downed_cpus == NULL || cpumask_weight(downed_cpus) == 0) ^~~~~~~~~~~ ~~~~ 2 warnings generated. Commit f7e30f01a9e2 ("cpumask: Add helper cpumask_available()") added cpumask_available() to fix warnings of this nature. Use that here so that clang does not warn regardless of CONFIG_CPUMASK_OFFSTACK's value. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://github.com/ClangBuiltLinux/linux/issues/982 Link: https://lkml.kernel.org/r/20200408205323.44490-1-natechancellor@gmail.com
2020-05-19dmaengine: tegra210-adma: Fix an error handling path in 'tegra_adma_probe()'Christophe JAILLET
Commit b53611fb1ce9 ("dmaengine: tegra210-adma: Fix crash during probe") has moved some code in the probe function and reordered the error handling path accordingly. However, a goto has been missed. Fix it and goto the right label if 'dma_async_device_register()' fails, so that all resources are released. Fixes: b53611fb1ce9 ("dmaengine: tegra210-adma: Fix crash during probe") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200516214205.276266-1-christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-05-19fscrypt: add support for IV_INO_LBLK_32 policiesEric Biggers
The eMMC inline crypto standard will only specify 32 DUN bits (a.k.a. IV bits), unlike UFS's 64. IV_INO_LBLK_64 is therefore not applicable, but an encryption format which uses one key per policy and permits the moving of encrypted file contents (as f2fs's garbage collector requires) is still desirable. To support such hardware, add a new encryption format IV_INO_LBLK_32 that makes the best use of the 32 bits: the IV is set to 'SipHash-2-4(inode_number) + file_logical_block_number mod 2^32', where the SipHash key is derived from the fscrypt master key. We hash only the inode number and not also the block number, because we need to maintain contiguity of DUNs to merge bios. Unlike with IV_INO_LBLK_64, with this format IV reuse is possible; this is unavoidable given the size of the DUN. This means this format should only be used where the requirements of the first paragraph apply. However, the hash spreads out the IVs in the whole usable range, and the use of a keyed hash makes it difficult for an attacker to determine which files use which IVs. Besides the above differences, this flag works like IV_INO_LBLK_64 in that on ext4 it is only allowed if the stable_inodes feature has been enabled to prevent inode numbers and the filesystem UUID from changing. Link: https://lore.kernel.org/r/20200515204141.251098-1-ebiggers@kernel.org Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Paul Crowley <paulcrowley@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-19driver core: Fix SYNC_STATE_ONLY device link implementationSaravana Kannan
When SYNC_STATE_ONLY support was added in commit 05ef983e0d65 ("driver core: Add device link support for SYNC_STATE_ONLY flag"), device_link_add() incorrectly skipped adding the new SYNC_STATE_ONLY device link to the supplier's and consumer's "device link" list. This causes multiple issues: - The device link is lost forever from driver core if the caller didn't keep track of it (caller typically isn't expected to). This is a memory leak. - The device link is also never visible to any other code path after device_link_add() returns. If we fix the "device link" list handling, that exposes a bunch of issues. 1. The device link "status" state management code rightfully doesn't handle the case where a DL_FLAG_MANAGED device link exists between a supplier and consumer, but the consumer manages to probe successfully before the supplier. The addition of DL_FLAG_SYNC_STATE_ONLY links break this assumption. This causes device_links_driver_bound() to throw a warning when this happens. Since DL_FLAG_SYNC_STATE_ONLY device links are mainly used for creating proxy device links for child device dependencies and aren't useful once the consumer device probes successfully, this patch just deletes DL_FLAG_SYNC_STATE_ONLY device links once its consumer device probes. This way, we avoid the warning, free up some memory and avoid complicating the device links "status" state management code. 2. Creating a DL_FLAG_STATELESS device link between two devices that already have a DL_FLAG_SYNC_STATE_ONLY device link will result in the DL_FLAG_STATELESS flag not getting set correctly. This patch also fixes this. Lastly, this patch also fixes minor whitespace issues. Cc: stable@vger.kernel.org Fixes: 05ef983e0d65 ("driver core: Add device link support for SYNC_STATE_ONLY flag") Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200519063000.128819-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19kprobes: Prevent probes in .noinstr.text sectionThomas Gleixner
Instrumentation is forbidden in the .noinstr.text section. Make kprobes respect this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200505134100.179862032@linutronix.de
2020-05-19Merge tag 'noinstr-lds-2020-05-19' into core/kprobesThomas Gleixner
Get the noinstr section and markers to base the kprobe changes on.
2020-05-19rcu: Provide __rcu_is_watching()Thomas Gleixner
Same as rcu_is_watching() but without the preempt_disable/enable() pair inside the function. It is merked noinstr so it ends up in the non-instrumentable text section. This is useful for non-preemptible code especially in the low level entry section. Using rcu_is_watching() there results in a call to the preempt_schedule_notrace() thunk which triggers noinstr section warnings in objtool. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200512213810.518709291@linutronix.de
2020-05-19rcu: Provide rcu_irq_exit_preempt()Thomas Gleixner
Interrupts and exceptions invoke rcu_irq_enter() on entry and need to invoke rcu_irq_exit() before they either return to the interrupted code or invoke the scheduler due to preemption. The general assumption is that RCU idle code has to have preemption disabled so that a return from interrupt cannot schedule. So the return from interrupt code invokes rcu_irq_exit() and preempt_schedule_irq(). If there is any imbalance in the rcu_irq/nmi* invocations or RCU idle code had preemption enabled then this goes unnoticed until the CPU goes idle or some other RCU check is executed. Provide rcu_irq_exit_preempt() which can be invoked from the interrupt/exception return code in case that preemption is enabled. It invokes rcu_irq_exit() and contains a few sanity checks in case that CONFIG_PROVE_RCU is enabled to catch such issues directly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134904.364456424@linutronix.de
2020-05-19rcu: Make RCU IRQ enter/exit functions rely on in_nmi()Paul E. McKenney
The rcu_nmi_enter_common() and rcu_nmi_exit_common() functions take an "irq" parameter that indicates whether these functions have been invoked from an irq handler (irq==true) or an NMI handler (irq==false). However, recent changes have applied notrace to a few critical functions such that rcu_nmi_enter_common() and rcu_nmi_exit_common() many now rely on in_nmi(). Note that in_nmi() works no differently than before, but rather that tracing is now prohibited in code regions where in_nmi() would incorrectly report NMI state. Therefore remove the "irq" parameter and inline rcu_nmi_enter_common() and rcu_nmi_exit_common() into rcu_nmi_enter() and rcu_nmi_exit(), respectively. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.617130349@linutronix.de
2020-05-19rcu/tree: Mark the idle relevant functions noinstrThomas Gleixner
These functions are invoked from context tracking and other places in the low level entry code. Move them into the .noinstr.text section to exclude them from instrumentation. Mark the places which are safe to invoke traceable functions with instrumentation_begin/end() so objtool won't complain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lkml.kernel.org/r/20200505134100.575356107@linutronix.de
2020-05-19x86: Replace ist_enter() with nmi_enter()Peter Zijlstra
A few exceptions (like #DB and #BP) can happen at any location in the code, this then means that tracers should treat events from these exceptions as NMI-like. The interrupted context could be holding locks with interrupts disabled for instance. Similarly, #MC is an actual NMI-like exception. All of them use ist_enter() which only concerns itself with RCU, but does not do any of the other setup that NMIs need. This means things like: printk() raw_spin_lock_irq(&logbuf_lock); <#DB/#BP/#MC> printk() raw_spin_lock_irq(&logbuf_lock); are entirely possible (well, not really since printk tries hard to play nice, but the concept stands). So replace ist_enter() with nmi_enter(). Also observe that any nmi_enter() caller must be both notrace and NOKPROBE, or in the noinstr text section. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.525508608@linutronix.de
2020-05-19x86/mce: Send #MC singal from task workPeter Zijlstra
Convert #MC over to using task_work_add(); it will run the same code slightly later, on the return to user path of the same exception. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134100.957390899@linutronix.de
2020-05-19x86/entry: Get rid of ist_begin/end_non_atomic()Thomas Gleixner
This is completely overengineered and definitely not an interface which should be made available to anything else than this particular MCE case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134059.462640294@linutronix.de
2020-05-19sched,rcu,tracing: Avoid tracing before in_nmi() is correctPeter Zijlstra
If a tracer is invoked before in_nmi() becomes true, the tracer can no longer detect it is called from NMI context and behave correctly. Therefore change nmi_{enter,exit}() to use __preempt_count_{add,sub}() as the normal preempt_count_{add,sub}() have a (desired) function trace entry. This fixes a potential issue with the current code; when the function-tracer has stack-tracing enabled __trace_stack() will malfunction when it hits the preempt_count_add() function entry from NMI context. Suggested-by: Steven Rostedt (VMware) <rosted@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.434193525@linutronix.de
2020-05-19sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exceptionPeter Zijlstra
SuperH is the last remaining user of arch_ftrace_nmi_{enter,exit}(), remove it from the generic code and into the SuperH code. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200505134101.248881738@linutronix.de
2020-05-19lockdep: Always inline lockdep_{off,on}()Peter Zijlstra
These functions are called {early,late} in nmi_{enter,exit} and should not be traced or probed. They are also puny, so 'inline' them. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.048523500@linutronix.de
2020-05-19hardirq/nmi: Allow nested nmi_enter()Peter Zijlstra
Since there are already a number of sites (ARM64, PowerPC) that effectively nest nmi_enter(), make the primitive support this before adding even more. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: https://lkml.kernel.org/r/20200505134100.864179229@linutronix.de
2020-05-19arm64: Prepare arch_nmi_enter() for recursionFrederic Weisbecker
When using nmi_enter() recursively, arch_nmi_enter() must also be recursion safe. In particular, it must be ensured that HCR_TGE is always set while in NMI context when in HYP mode, and be restored to it's former state when done. The current code fails this when interleaved wrong. Notably it overwrites the original hcr state on nesting. Introduce a nesting counter to make sure to store the original value. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lkml.kernel.org/r/20200505134100.771491291@linutronix.de
2020-05-19printk: Disallow instrumenting print_nmi_enter()Peter Zijlstra
It happens early in nmi_enter(), no tracing, probing or other funnies allowed. Specifically as nmi_enter() will be used in do_debug(), which would cause recursive exceptions when kprobed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.139720912@linutronix.de
2020-05-19printk: Prepare for nested printk_nmi_enter()Petr Mladek
There is plenty of space in the printk_context variable. Reserve one byte there for the NMI context to be on the safe side. It should never overflow. The BUG_ON(in_nmi() == NMI_MASK) in nmi_enter() will trigger much earlier. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134100.681374113@linutronix.de
2020-05-19Merge tag 'noinstr-lds-2020-05-19' into core/rcuThomas Gleixner
Get the noinstr section and annotation markers to base the RCU parts on.
2020-05-19vmlinux.lds.h: Create section for protection against instrumentationThomas Gleixner
Some code pathes, especially the low level entry code, must be protected against instrumentation for various reasons: - Low level entry code can be a fragile beast, especially on x86. - With NO_HZ_FULL RCU state needs to be established before using it. Having a dedicated section for such code allows to validate with tooling that no unsafe functions are invoked. Add the .noinstr.text section and the noinstr attribute to mark functions. noinstr implies notrace. Kprobes will gain a section check later. Provide also a set of markers: instrumentation_begin()/end() These are used to mark code inside a noinstr function which calls into regular instrumentable text section as safe. The instrumentation markers are only active when CONFIG_DEBUG_ENTRY is enabled as the end marker emits a NOP to prevent the compiler from merging the annotation points. This means the objtool verification requires a kernel compiled with this option. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134100.075416272@linutronix.de
2020-05-19spi: ti_qspi: fix unit addressKangmin Park
Fix unit address to match the first address specified in the reg property of the node in example. Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com> Link: https://lore.kernel.org/r/1589873541-5587-1-git-send-email-l4stpr0gr4m@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-19iommu: Fix deferred domain attachmentJoerg Roedel
The IOMMU core code has support for deferring the attachment of a domain to a device. This is needed in kdump kernels where the new domain must not be attached to a device before the device driver takes it over. When the AMD IOMMU driver got converted to use the dma-iommu implementation, the deferred attaching got lost. The code in dma-iommu.c has support for deferred attaching, but it calls into iommu_attach_device() to actually do it. But iommu_attach_device() will check if the device should be deferred in it code-path and do nothing, breaking deferred attachment. Move the is_deferred_attach() check out of the attach_device path and into iommu_group_add_device() to make deferred attaching work from the dma-iommu code. Fixes: 795bbbb9b6f8 ("iommu/dma-iommu: Handle deferred devices") Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Cc: Jerry Snitselaar <jsnitsel@redhat.com> Cc: Tom Murphy <murphyt7@tcd.ie> Cc: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200519130340.14564-1-joro@8bytes.org
2020-05-19ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hookFredrik Strupe
call_undef_hook() in traps.c applies the same instr_mask for both 16-bit and 32-bit thumb instructions. If instr_mask then is only 16 bits wide (0xffff as opposed to 0xffffffff), the first half-word of 32-bit thumb instructions will be masked out. This makes the function match 32-bit thumb instructions where the second half-word is equal to instr_val, regardless of the first half-word. The result in this case is that all undefined 32-bit thumb instructions with the second half-word equal to 0xde01 (udf #1) work as breakpoints and will raise a SIGTRAP instead of a SIGILL, instead of just the one intended 16-bit instruction. An example of such an instruction is 0xeaa0de01, which is unallocated according to Arm ARM and should raise a SIGILL, but instead raises a SIGTRAP. This patch fixes the issue by setting all the bits in instr_mask, which will still match the intended 16-bit thumb instruction (where the upper half is always 0), but not any 32-bit thumb instructions. Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Fredrik Strupe <fredrik@strupe.net> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-05-19drm/etnaviv: Fix a leak in submit_pin_objects()Dan Carpenter
If the mapping address is wrong then we have to release the reference to it before returning -EINVAL. Fixes: 088880ddc0b2 ("drm/etnaviv: implement softpin") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2020-05-19drm/etnaviv: fix perfmon domain interationChristian Gmeiner
The GC860 has one GPU device which has a 2d and 3d core. In this case we want to expose perfmon information for both cores. The driver has one array which contains all possible perfmon domains with some meta data - doms_meta. Here we can see that for the GC860 two elements of that array are relevant: doms_3d: is at index 0 in the doms_meta array with 8 perfmon domains doms_2d: is at index 1 in the doms_meta array with 1 perfmon domain The userspace driver wants to get a list of all perfmon domains and their perfmon signals. This is done by iterating over all domains and their signals. If the userspace driver wants to access the domain with id 8 the kernel driver fails and returns invalid data from doms_3d with and invalid offset. This results in: Unable to handle kernel paging request at virtual address 00000000 On such a device it is not possible to use the userspace driver at all. The fix for this off-by-one error is quite simple. Reported-by: Paul Cercueil <paul@crapouillou.net> Tested-by: Paul Cercueil <paul@crapouillou.net> Fixes: ed1dd899baa3 ("drm/etnaviv: rework perfmon query infrastructure") Cc: stable@vger.kernel.org Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2020-05-19mtd:rawnand: brcmnand: Fix PM resume crashKamal Dasu
This change fixes crash observed on PM resume. This bug was introduced in the change made for flash-edu support. Fixes: a5d53ad26a8b ("mtd: rawnand: brcmnand: Add support for flash-edu for dma transfers") Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-05-19ALSA: hda/realtek - Add more fixup entries for Clevo machinesPeiSen Hou
A few known Clevo machines (PC50, PC70, X170) with ALC1220 codec need the existing quirk for pins for PB51 and co. Signed-off-by: PeiSen Hou <pshou@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200519065012.13119-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-18fscrypt: make test_dummy_encryption use v2 by defaultEric Biggers
Since v1 encryption policies are deprecated, make test_dummy_encryption test v2 policies by default. Note that this causes ext4/023 and ext4/028 to start failing due to known bugs in those tests (see previous commit). Link: https://lore.kernel.org/r/20200512233251.118314-5-ebiggers@kernel.org Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-18fscrypt: support test_dummy_encryption=v2Eric Biggers
v1 encryption policies are deprecated in favor of v2, and some new features (e.g. encryption+casefolding) are only being added for v2. Therefore, the "test_dummy_encryption" mount option (which is used for encryption I/O testing with xfstests) needs to support v2 policies. To do this, extend its syntax to be "test_dummy_encryption=v1" or "test_dummy_encryption=v2". The existing "test_dummy_encryption" (no argument) also continues to be accepted, to specify the default setting -- currently v1, but the next patch changes it to v2. To cleanly support both v1 and v2 while also making it easy to support specifying other encryption settings in the future (say, accepting "$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a pointer to the dummy fscrypt_context rather than using mount flags. To avoid concurrency issues, don't allow test_dummy_encryption to be set or changed during a remount. (The former restriction is new, but xfstests doesn't run into it, so no one should notice.) Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'. On ext4, there are two regressions, both of which are test bugs: ext4/023 and ext4/028 fail because they set an xattr and expect it to be stored inline, but the increase in size of the fscrypt_context from 24 to 40 bytes causes this xattr to be spilled into an external block. Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-18net sched: fix reporting the first-time use timestampRoman Mashak
When a new action is installed, firstuse field of 'tcf_t' is explicitly set to 0. Value of zero means "new action, not yet used"; as a packet hits the action, 'firstuse' is stamped with the current jiffies value. tcf_tm_dump() should return 0 for firstuse if action has not yet been hit. Fixes: 48d8ee1694dd ("net sched actions: aggregate dumping of actions timeinfo") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18mtd: Fix mtd not registered due to nvmem name collisionRicardo Ribalda Delgado
When the nvmem framework is enabled, a nvmem device is created per mtd device/partition. It is not uncommon that a device can have multiple mtd devices with partitions that have the same name. Eg, when there DT overlay is allowed and the same device with mtd is attached twice. Under that circumstances, the mtd fails to register due to a name duplication on the nvmem framework. With this patch we use the mtdX name instead of the partition name, which is unique. [ 8.948991] sysfs: cannot create duplicate filename '/bus/nvmem/devices/Production Data' [ 8.948992] CPU: 7 PID: 246 Comm: systemd-udevd Not tainted 5.5.0-qtec-standard #13 [ 8.948993] Hardware name: AMD Dibbler/Dibbler, BIOS 05.22.04.0019 10/26/2019 [ 8.948994] Call Trace: [ 8.948996] dump_stack+0x50/0x70 [ 8.948998] sysfs_warn_dup.cold+0x17/0x2d [ 8.949000] sysfs_do_create_link_sd.isra.0+0xc2/0xd0 [ 8.949002] bus_add_device+0x74/0x140 [ 8.949004] device_add+0x34b/0x850 [ 8.949006] nvmem_register.part.0+0x1bf/0x640 ... [ 8.948926] mtd mtd8: Failed to register NVMEM device Fixes: c4dfa25ab307 ("mtd: add support for reading MTD devices via the nvmem API") Signed-off-by: Ricardo Ribalda Delgado <ribalda@kernel.org> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-05-18mtd: spinand: Propagate ECC information to the MTD structureMiquel Raynal
This is done by default in the raw NAND core (nand_base.c) but was missing in the SPI-NAND core. Without these two lines the ecc_strength and ecc_step_size values are not exported to the user through sysfs. Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-05-18Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fixes from Mimi Zohar: "A couple of miscellaneous bug fixes for the integrity subsystem: IMA: - Properly modify the open flags in order to calculate the file hash. - On systems requiring the IMA policy to be signed, the policy is loaded differently. Don't differentiate between "enforce" and either "log" or "fix" modes how the policy is loaded. EVM: - Two patches to fix an EVM race condition, normally the result of attempting to load an unsupported hash algorithm. - Use the lockless RCU version for walking an append only list" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: evm: Fix a small race in init_desc() evm: Fix RCU list related warnings ima: Fix return value of ima_write_policy() evm: Check also if *tfm is an error pointer in init_desc() ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash()
2020-05-18ALSA: iec1712: Initialize STDSP24 properly when using the model=staudio optionScott Bahling
The ST Audio ADCIII is an STDSP24 card plus extension box. With commit e8a91ae18bdc ("ALSA: ice1712: Add support for STAudio ADCIII") we enabled the ADCIII ports using the model=staudio option but forgot this part to ensure the STDSP24 card is initialized properly. Fixes: e8a91ae18bdc ("ALSA: ice1712: Add support for STAudio ADCIII") Signed-off-by: Scott Bahling <sbahling@suse.com> Cc: <stable@vger.kernel.org> BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1048934 Link: https://lore.kernel.org/r/20200518175728.28766-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-18Merge tag 'for-5.7-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fixes from Namjae Jeon: - Fix potential memory leak in exfat_find - Set exfat's splice_write to iter_file_splice_write to fix a splice failure on direct-opened files * tag 'for-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix possible memory leak in exfat_find() exfat: use iter_file_splice_write
2020-05-18afs: Don't unlock fetched data pages until the op completes successfullyDavid Howells
Don't call req->page_done() on each page as we finish filling it with the data coming from the network. Whilst this might speed up the application a bit, it's a problem if there's a network failure and the operation has to be reissued. If this happens, an oops occurs because afs_readpages_page_done() clears the pointer to each page it unlocks and when a retry happens, the pointers to the pages it wants to fill are now NULL (and the pages have been unlocked anyway). Instead, wait till the operation completes successfully and only then release all the pages after clearing any terminal gap (the server can give us less data than we requested as we're allowed to ask for more than is available). KASAN produces a bug like the following, and even without KASAN, it can oops and panic. BUG: KASAN: wild-memory-access in _copy_to_iter+0x323/0x5f4 Write of size 1404 at addr 0005088000000000 by task md5sum/5235 CPU: 0 PID: 5235 Comm: md5sum Not tainted 5.7.0-rc3-fscache+ #250 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: memcpy+0x39/0x58 _copy_to_iter+0x323/0x5f4 __skb_datagram_iter+0x89/0x2a6 skb_copy_datagram_iter+0x129/0x135 rxrpc_recvmsg_data.isra.0+0x615/0xd42 rxrpc_kernel_recv_data+0x1e9/0x3ae afs_extract_data+0x139/0x33a yfs_deliver_fs_fetch_data64+0x47a/0x91b afs_deliver_to_call+0x304/0x709 afs_wait_for_call_to_complete+0x1cc/0x4ad yfs_fs_fetch_data+0x279/0x288 afs_fetch_data+0x1e1/0x38d afs_readpages+0x593/0x72e read_pages+0xf5/0x21e __do_page_cache_readahead+0x128/0x23f ondemand_readahead+0x36e/0x37f generic_file_buffered_read+0x234/0x680 new_sync_read+0x109/0x17e vfs_read+0xe6/0x138 ksys_read+0xd8/0x14d do_syscall_64+0x6e/0x8a entry_SYSCALL_64_after_hwframe+0x49/0xb3 Fixes: 196ee9cd2d04 ("afs: Make afs_fs_fetch_data() take a list of pages") Fixes: 30062bd13e36 ("afs: Implement YFS support in the fs client") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-18io_uring: cancel work if task_work_add() failsJens Axboe
We currently move it to the io_wqe_manager for execution, but we cannot safely do so as we may lack some of the state to execute it out of context. As we cancel work anyway when the ring/task exits, just mark this request as canceled and io_async_task_func() will do the right thing. Fixes: aa96bf8a9ee3 ("io_uring: use io-wq manager as backup task if task is exiting") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-18mmc: sdhci: Fix SDHCI_QUIRK_BROKEN_CQEAdrian Hunter
Previous to commit 511ce378e16f07 ("mmc: Add MMC host software queue support"), removing MMC_CAP2_CQE was enough to disable command queuing, but now the cqe_ops must also be NULL otherwise ->cqe_enable() will be called. Fix SDHCI_QUIRK_BROKEN_CQE to do that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 511ce378e16f07 ("mmc: Add MMC host software queue support") Link: https://lore.kernel.org/r/20200518120939.1399-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-18ACPI: EC: PM: Avoid flushing EC work when EC GPE is inactiveRafael J. Wysocki
Flushing the EC work while suspended to idle when the EC GPE status is not set causes some EC wakeup events (notably power button and lid ones) to be missed after a series of spurious wakeups on the Dell XPS13 9360 in my office. If that happens, the machine cannot be woken up from suspend-to-idle by the power button or lid status change and it needs to be woken up in some other way (eg. by a key press). Flushing the EC work only after successful dispatching the EC GPE, which means that its status has been set, avoids the issue, so change the code in question accordingly. Fixes: 7b301750f7f8 ("ACPI: EC: PM: Avoid premature returns from acpi_s2idle_wake()") Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chris Chiu <chiu@endlessm.com>
2020-05-18Merge tag 'v5.7-rc6' into objtool/core, to pick up fixes and resolve ↵Ingo Molnar
semantic conflict Resolve structural conflict between: 59566b0b622e: ("x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up") which introduced a new reference to 'ftrace_epilogue', and: 0298739b7983: ("x86,ftrace: Fix ftrace_regs_caller() unwind") Which renamed it to 'ftrace_caller_end'. Rename the new usage site in the merge commit. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-05-18esp4: improve xfrm4_beet_gso_segment() to be more readableXin Long
This patch is to improve the code to make xfrm4_beet_gso_segment() more readable, and keep consistent with xfrm6_beet_gso_segment(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>