summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-06-03perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaroundAndi Kleen
Now that we have topology_max_smt_threads() use it to detect the HT workarounds for older CPUs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add topdown events to Intel AtomAndi Kleen
Add topdown event declarations to Silvermont / Airmont. These cores do not support the full Top Down metrics, but an useful subset (FrontendBound, Retiring, Backend Bound/Bad Speculation). The perf stat tool automatically handles the missing events and combines the available metrics. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add topdown events to Intel CoreAndi Kleen
Add declarations for the events needed for topdown to the Intel big core CPUs starting with Sandy Bridge. We need to report different values if HyperThreading is on or off. The only thing this patch does is to export some events in sysfs. topdown level 1 uses a set of abstracted metrics which are generic to out of order CPU cores (although some CPUs may not implement all of them): topdown-total-slots Available slots in the pipeline topdown-slots-issued Slots issued into the pipeline topdown-slots-retired Slots successfully retired topdown-fetch-bubbles Pipeline gaps in the frontend topdown-recovery-bubbles Pipeline gaps during recovery from misspeculation A slot is a single operation in the CPU pipe line. These metrics then allow to compute four useful metrics: FrontendBound, BackendBound, Retiring, BadSpeculation. The formulas to compute the metrics are generic, they only change based on the availability on the abstracted input values. The kernel declares the events supported by the current CPU and their scaling factors (such as the pipeline width) and perf stat then computes the formulas based on the available metrics. This is similar how existing perf metrics, such as TSC metrics or IPC, are implemented. This abstracts all CPU pipe line specific knowledge in the kernel driver, but still avoids the need for larger scale perf interface changes. For HyperThreading the any bit is needed to get accurate values when both threads are executing. This implies that the events can only be collected as root or with perf_event_paranoid=-1 for now. The basic scheme is based on the following paper: Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google) Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86: Support sysfs files depending on SMT statusAndi Kleen
Add a way to show different sysfs events attributes depending on HyperThreading is on or off. This is difficult to determine early at boot, so we just do it dynamically when the sysfs attribute is read. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03x86/topology: Add topology_max_smt_threads()Andi Kleen
For SMT specific workarounds it is useful to know if SMT is active on any online CPU in the system. This currently requires a loop over all online CPUs. Add a global variable that is updated with the maximum number of smt threads on any CPU on online/offline, and use it for topology_max_smt_threads() The single call is easier to use than a loop. Not exported to user space because user space already can use the existing sibling interfaces to find this out. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03tools/perf: Handle -EOPNOTSUPP for sampling eventsVineet Gupta
This allows (with a previous change to the perf error return ABI) for calling out in userspace the exact reason for perf record failing when PMU doesn't support overflow interrupts. Note that this needs to be put ahead of existing precise_ip check as that gets hit otherwise for the sampling fail case as well. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <acme@redhat.com> Cc: <linux-snps-arc@lists.infradead.org> Cc: <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: http://lkml.kernel.org/r/1462786660-2900-2-git-send-email-vgupta@synopsys.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/abi: Change the errno for sampling event not supported in hardwareVineet Gupta
Change the return code for sampling event not supported from -ENOTSUPP to -EOPNOTSUPP. This allows userspace to identify this case specifically, instead of printing the catch-all error message it did previously. Technically this is an ABI change, but we think we can get away with it. Old behavior: ------- | # perf record ls | Error: | The sys_perf_event_open() syscall returned with 524 (Unknown error 524) | for event (cycles:ppp). | /bin/dmesg may provide additional information. | No CONFIG_PERF_EVENTS=y kernel support configured? New behavior: ------- | # perf record ls | Error: | PMU Hardware doesn't support sampling/overflow-interrupts. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <acme@redhat.com> Cc: <linux-snps-arc@lists.infradead.org> Cc: <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: http://lkml.kernel.org/r/1462786660-2900-3-git-send-email-vgupta@synopsys.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel/uncore: Locate specific box by checking full device infoKan Liang
Some platforms, e.g. Knights Landing, use a common PCI device ID for multiple instances of an uncore PMU device type. So it is impossible to locate the specific instances only by PCI device ID. The current code specially handles Knights Landing by arbitrarily pointing an instance to an unused uncore box. However, we still have no idea which uncore device is mapped to which box. Furthermore, there could be more platforms which use a common PCI device ID for uncore devices. We have to specially handle them one by one. This patch records full device information (slot, func, and device ID) in id_table[]. So the probe function can point the instance to a specific uncore box by checking the full device information. Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: tglx@linutronix.de Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@suse.de Cc: harish.chegondi@intel.com Cc: hubert.chrzaniuk@intel.com Cc: lawrence.f.meadows@intel.com Link: http://lkml.kernel.org/r/1463379504-39003-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Change offcore response masks for Knights LandingLukasz Odzioba
Due to change in register definition we need to update OCR mask: MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14 MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38 Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: kan.liang@intel.com Cc: lukasz.anaczkowski@intel.com Cc: zheng.z.yan@intel.com Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel: Add 'static' keyword to locally used arraysLukasz Odzioba
Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[], nhm_events_attrs[] and intel_skl_event_constraints arrays[], because they are only used locally. Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: kan.liang@intel.com Cc: lukasz.anaczkowski@intel.com Cc: zheng.z.yan@intel.com Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/core: Fix implicitly enable dynamic interrupt throttleKan Liang
This patch fixes an issue which was introduced by commit: 91a612eea9a3 ("perf/core: Fix dynamic interrupt throttle") ... which commit unconditionally sets the perf_sample_allowed_ns value to !0. But that could trigger a bug in the following corner case: The user can disable the dynamic interrupt throttle mechanism by setting perf_cpu_time_max_percent to 0. Then they change perf_event_max_sample_rate. For this case, the mechanism will be enabled implicitly, because perf_sample_allowed_ns becomes !0 - which is not what we want. This patch only updates perf_sample_allowed_ns when the dynamic interrupt throttle mechanism is enabled. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/1462260366-3160-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/core: Rename the perf_event_aux*() APIs to perf_event_sb*(), to ↵Peter Zijlstra
separate them from AUX ring-buffer records There are now two different things called AUX in perf, the infrastructure to deliver the mmap/comm/task records and the AUX part in the mmap buffer (with associated AUX_RECORD). Since the former is internal, rename it to side-band to reduce the confusion factor. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/core: Optimize side-band event deliveryKan Liang
The perf_event_aux() function iterates all PMUs and all events in their respective per-CPU contexts to find the events to deliver side-band records to. For example, the brk test case in lkp triggers many mmap() operations, which, if we're also running perf, results in many perf_event_aux() invocations. If we enable uncore PMU support (even when uncore events are not used), dozens of uncore PMUs will be iterated, which can significantly decrease brk_test's throughput. For example, the brk throughput: without uncore PMUs: 2647573 ops_per_sec with uncore PMUs: 1768444 ops_per_sec ... a 33% reduction. To get at the per-CPU events that need side-band records, this patch puts these events on a per-CPU list, this avoids iterating the PMUs and any events that do not need side-band records. Per task events are unchanged to avoid extra overhead on the context switch paths. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Huang, Ying <ying.huang@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86/intel/uncore: Remove SBOX support for Broadwell serverKan Liang
There was a report that on certain Broadwell-EP systems writing any bit of the SBOX PMU initialization MSR would #GP at boot. This did not happen on all systems. My test systems booted fine. Considering both DE and EP may have such issues, this patch removes SBOX support for all Broadwell platforms for now. Reported-and-tested-by: Mark van Dijk <mark@voidzero.net> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1464347540-5763-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03locking/ww_mutex: Report recursive ww_mutex locking earlyChris Wilson
Recursive locking for ww_mutexes was originally conceived as an exception. However, it is heavily used by the DRM atomic modesetting code. Currently, the recursive deadlock is checked after we have queued up for a busy-spin and as we never release the lock, we spin until kicked, whereupon the deadlock is discovered and reported. A simple solution for the now common problem is to move the recursive deadlock discovery to the first action when taking the ww_mutex. Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1464293297-19777-1-git-send-email-chris@chris-wilson.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03locking/seqcount: Re-fix raw_read_seqcount_latch()Peter Zijlstra
Commit 50755bc1c305 ("seqlock: fix raw_read_seqcount_latch()") broke raw_read_seqcount_latch(). If you look at the comment that was modified; the thing that changes is the seq count, not the latch pointer. * void latch_modify(struct latch_struct *latch, ...) * { * smp_wmb(); <- Ensure that the last data[1] update is visible * latch->seq++; * smp_wmb(); <- Ensure that the seqcount update is visible * * modify(latch->data[0], ...); * * smp_wmb(); <- Ensure that the data[0] update is visible * latch->seq++; * smp_wmb(); <- Ensure that the seqcount update is visible * * modify(latch->data[1], ...); * } * * The query will have a form like: * * struct entry *latch_query(struct latch_struct *latch, ...) * { * struct entry *entry; * unsigned seq, idx; * * do { * seq = lockless_dereference(latch->seq); So here we have: seq = READ_ONCE(latch->seq); smp_read_barrier_depends(); Which is exactly what we want; the new code: seq = ({ p = READ_ONCE(latch); smp_read_barrier_depends(); p })->seq; is just wrong; because it looses the volatile read on seq, which can now be torn or worse 'optimized'. And the read_depend barrier is also placed wrong, we want it after the load of seq, to match the above data[] up-to-date wmb()s. Such that when we dereference latch->data[] below, we're guaranteed to observe the right data. * * idx = seq & 0x01; * entry = data_query(latch->data[idx], ...); * * smp_rmb(); * } while (seq != latch->seq); * * return entry; * } So yes, not passing a pointer is not pretty, but the code was correct, and isn't anymore now. Change to explicit READ_ONCE()+smp_read_barrier_depends() to avoid confusion and allow strict lockless_dereference() checking. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 50755bc1c305 ("seqlock: fix raw_read_seqcount_latch()") Link: http://lkml.kernel.org/r/20160527111117.GL3192@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03phy: ti-pipe3: Program the DPLL even if it was already lockedRoger Quadros
If bootloader has set a wrong DPLL then we must trash those values and re-program it anyways. This fixes USB3 devices not being enumerated on beagle-x15 if usb was started in u-boot. We don't re-program SATA DPLL if it is locked as it was causing SATA failures if device was hotpluged after boot. Reported-by: Robert Nelson <robertcnelson@gmail.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2016-06-03KEYS: Add placeholder for KDF usage with DHStephan Mueller
The values computed during Diffie-Hellman key exchange are often used in combination with key derivation functions to create cryptographic keys. Add a placeholder for a later implementation to configure a key derivation function that will transform the Diffie-Hellman result returned by the KEYCTL_DH_COMPUTE command. [This patch was stripped down from a patch produced by Mat Martineau that had a bug in the compat code - so for the moment Stephan's patch simply requires that the placeholder argument must be NULL] Original-signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-03drm/amdkfd: print once about mem_banks truncationOded Gabbay
This print can really spam the kernel log in case we are truncating mem_banks, so just print this info once. It should also not be classified as warning. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-06-03drm/amdkfd: destroy dbgmgr in notifier releaseOded Gabbay
amdkfd need to destroy the debug manager in case amdkfd's notifier function is called before the unbind function, because in that case, the unbind function will exit without destroying debug manager. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: Stable <stable@vger.kernel.org>
2016-06-03drm/amdkfd: unbind only existing processesOded Gabbay
When unbinding a process from a device (initiated by amd_iommu_v2), the driver needs to make sure that process still exists in the process table. There is a possibility that amdkfd's own notifier handler - kfd_process_notifier_release() - was called before the unbind function and it already removed the process from the process table. v2: Because there can be only one process with the specified pasid, and because *p can't be NULL inside the hash_for_each_rcu macro, it is more reasonable to just put the whole code inside the if statement that compares the pasid value. That way, when we exit hash_for_each_rcu, we simply exit the function as well. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: Stable <stable@vger.kernel.org>
2016-06-03drm/omap: fix unused variable warning.Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-03Merge tag 'omapdrm-4.7-fixes' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-fixes omapdrm fixes for 4.7 * multiple compile break fixes for missing includes, bad kconfig dependencies. * remove regulator API misuse causing deprecation warnings * OMAP5 HDMI fixes for DDC and AVI infoframe * OMAP4 HDMI fix for CEC * tag 'omapdrm-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: drm/omap: include gpio/consumer.h where needed drm/omap: include linux/seq_file.h where needed Revert "drm/omap: no need to select OMAP2_DSS" drm/omap: Remove regulator API abuse OMAPDSS: HDMI5: Change DDC timings OMAPDSS: HDMI5: Fix AVI infoframe drm/omap: fix OMAP4 hdmi_core_powerdown_disable() drm/omap: Fix missing includes drm/omapdrm: include pinctrl/consumer.h where needed
2016-06-02rds: fix an infoleak in rds_inc_info_copyKangjie Lu
The last field "flags" of object "minfo" is not initialized. Copying this object out may leak kernel stack data. Assign 0 to it to avoid leak. Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-02tipc: fix an infoleak in tipc_nl_compat_link_dumpKangjie Lu
link_info.str is a char array of size 60. Memory after the NULL byte is not initialized. Sending the whole object out can cause a leak. Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03Merge tag 'imx-drm-next-2016-06-01' of ↵Dave Airlie
git://git.pengutronix.de/git/pza/linux into drm-fixes imx-drm updates - add support for reading LVDS panel EDID over DDC - enable UYVY/VYUY support - add support for pixel clock polarity configuration - honor the native-mode DT property for LVDS - various fixes and cleanups * tag 'imx-drm-next-2016-06-01' of git://git.pengutronix.de/git/pza/linux: drm/imx: plane: Don't set plane->crtc in ipu_plane_update() drm/imx: ipuv3-plane: Constify ipu_plane_funcs drm/imx: imx-ldb: honor 'native-mode' property when selecting video mode from DT drm/imx: parallel-display: remove dead code drm/imx: use bus_flags for pixel clock polarity drm/imx: ipuv3-plane: enable UYVY and VYUY formats drm/imx: parallel-display: use of_graph_get_endpoint_by_regs helper drm/imx: imx-ldb: use of_graph_get_endpoint_by_regs helper dt-bindings: imx: ldb: Add ddc-i2c-bus property drm/imx: imx-ldb: Add DDC support
2016-06-03Merge tag 'drm-atmel-hlcdc-fixes/for-4.7-rc2' of ↵Dave Airlie
github.com:bbrezillon/linux-at91 into drm-fixes Two trivial bugfixes for the atmel-hlcdc driver. The first one is making use of __drm_atomic_helper_crtc_destroy_state() instead of duplicating its logic in atmel_hlcdc_crtc_reset() and risking memory leaks if other objects are added to the common CRTC state. The second one is fixing a possible NULL pointer dereference. * tag 'drm-atmel-hlcdc-fixes/for-4.7-rc2' of github.com:bbrezillon/linux-at91: drm: atmel-hlcdc: fix a NULL check drm: atmel-hlcdc: fix atmel_hlcdc_crtc_reset() implementation
2016-06-03Merge branch 'for-upstream/hdlcd' of git://linux-arm.org/linux-ld into drm-fixesDave Airlie
"I have accumulated some cleanup patches for HDLCD, partly triggered by Daniel Vetter's work on non-blocking atomic operations, that I would like to integrate into v4.7. My first patch is important for the newly enabled hibernate option for AArch64 on Juno, the others are fixing behaviour in HDLCD and adding a debugfs entry to help track the underlying framebuffer usage. I'm also taking one of Daniel's patches from his non-blocking series to help with the integration of his patches later." * 'for-upstream/hdlcd' of git://linux-arm.org/linux-ld: drm: hdlcd: Add information about the underlying framebuffers in debugfs drm: hdlcd: Cleanup the atomic plane operations drm/hdlcd: Fix up crtc_state->event handling drm: hdlcd: Revamp runtime power management
2016-06-02Possible problem with e6afc8ac ("udp: remove headers from UDP packets before ↵Eric Dumazet
queueing") Paul Moore tracked a regression caused by a recent commit, which mistakenly assumed that sk_filter() could be avoided if socket had no current BPF filter. The intent was to avoid udp_lib_checksum_complete() overhead. But sk_filter() also checks skb_pfmemalloc() and security_sock_rcv_skb(), so better call it. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Moore <paul@paul-moore.com> Tested-by: Paul Moore <paul@paul-moore.com> Tested-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: samanthakumar <samanthakumar@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-02MAINTAINERS: DeviceTree maintainer updatesRob Herring
Grant stepped down as kernel DT maintainer and his linaro.org email will be bouncing soon, so remove him now. Pawel, Ian and Kumar either said they don't want to remain maintainers or didn't reply, so removing them as binding maintainers. Update the DT git tree to mine. Grant's has not been active for a while now. I'm actively using patchwork for binding review tracking, so add its URL. Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-06-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "ARM: - two fixes for 4.6 vgic [Christoffer] (cc stable) - six fixes for 4.7 vgic [Marc] x86: - six fixes from syzkaller reports [Paolo] (two of them cc stable) - allow OS X to boot [Dmitry] - don't trust compilers [Nadav]" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR KVM: Handle MSR_IA32_PERF_CTL KVM: x86: avoid write-tearing of TDP KVM: arm/arm64: vgic-new: Removel harmful BUG_ON arm64: KVM: vgic-v3: Relax synchronization when SRE==1 arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1 arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value KVM: arm/arm64: vgic-v3: Always resample level interrupts KVM: arm/arm64: vgic-v2: Always resample level interrupts KVM: arm/arm64: vgic-v3: Clear all dirty LRs KVM: arm/arm64: vgic-v2: Clear all dirty LRs
2016-06-02cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLECatalin Marinas
The cpuidle_devices per-CPU variable is only defined when CPU_IDLE is enabled. Commit c8cc7d4de7a4 ("sched/idle: Reorganize the idle loop") removed the #ifdef CONFIG_CPU_IDLE around cpuidle_idle_call() with the compiler optimising away __this_cpu_read(cpuidle_devices). However, with CONFIG_UBSAN && !CONFIG_CPU_IDLE, this optimisation no longer happens and the kernel fails to link since cpuidle_devices is not defined. This patch introduces an accessor function for the current CPU cpuidle device (returning NULL when !CONFIG_CPU_IDLE) and uses it in cpuidle_idle_call(). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: 4.5+ <stable@vger.kernel.org> # 4.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-02stmmac: do not sleep in atomic context for mdio_resetVincent Palatin
stmmac_mdio_reset() has been updated to use msleep rather udelay (as some PHY requires a one second delay there). It called from stmmac_resume() within the spin_lock_irqsave block atomic context triggering 'scheduling while atomic'. The stmmac_priv lock usage is not fully documented, but it seems to protect the access to the MAC registers / DMA structures rather than the MDIO bus or the PHY (which have separate locking), so we can push the spin_lock after the stmmac_mdio_reset call. Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-02blk-mq: really fix plug list flushing for nomerge queuesOmar Sandoval
Commit 0809e3ac6231 ("block: fix plug list flushing for nomerge queues") updated blk_mq_make_request() to set request_count even when blk_queue_nomerges() returns true. However, blk_mq_make_request() only does limited plugging and doesn't use request_count; blk_sq_make_request() is the one that should have been fixed. Do that and get rid of the unnecessary work in the mq version. Fixes: 0809e3ac6231 ("block: fix plug list flushing for nomerge queues") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-02Btrfs: self-tests: Support non-4k page sizeFeifei Xu
self-tests code assumes 4k as the sectorsize and nodesize. This commit fix hardcoded 4K. Enables the self-tests code to be executed on non-4k page sized systems (e.g. ppc64). Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-02Btrfs: Fix integer overflow when calculating bytes_per_bitmapFeifei Xu
On ppc64, bytes_per_bitmap will be (65536*8*65536). Hence append UL to fix integer overflow. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-02Btrfs: test_check_exists: Fix infinite loop when searching for free space ↵Feifei Xu
entries On a ppc64 machine using 64K as the block size, assume that the RB tree at btrfs_free_space_ctl->free_space_offset contains following two entries: 1. A bitmap entry having an offset value of 0 and having the bits corresponding to the address range [128M+512K, 128M+768K] set. 2. An extent entry corresponding to the address range [128M-256K, 128M-128K] In such a scenario, test_check_exists() invoked for checking the existence of address range [128M+768K, 256M] can lead to an infinite loop as explained below: - Checking for the extent entry fails. - Checking for a bitmap entry results in the free space info in range [128M+512K, 128M+768K] beng returned. - rb_prev(info) returns NULL because the bitmap entry starting from offset 0 comes first in the RB tree. - current_node = bitmap node. - while (current_node) tmp = rb_next(bitmap_node);/*tmp is extent based free space entry*/ Since extent based free space entry's last address is smaller than the address being searched for (i.e. 128M+768K) we incorrectly again obtain the extent node as the "next right node" of the RB tree and thus end up looping infinitely. This patch fixes the issue by checking the "tmp" variable which point to the most recently searched free space node. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-06-02irqchip/irq-pic32-evic: Fix bug with external interrupts.Joshua Henderson
The wrong external interrupt bits are being set, offset by 1. Signed-off-by: Joshua Henderson <digitalpeer@digitalpeer.com> Signed-off-by: Purna Chandra Mandal <purna.mandal@microchip.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144Ganapatrao Kulkarni
The erratum fixes the hang of ITS SYNC command by avoiding inter node io and collections/cpu mapping on thunderx dual-socket platform. This fix is only applicable for Cavium's ThunderX dual-socket platform. Reviewed-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02irqchip/gic-v3: Fix quiescence check in gic_enable_redistAndrew Jones
Make sure the two sides of the bitwise operation are bool. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02irqchip/gic-v3: Fix copy+paste mistakes in definesAndrew Jones
ICC_SGI1R_AFFINITY_{2,3}_MASK are unused, which is good because they were defined with the wrong shifts. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding maskMarc Zyngier
The INTID mask is wrong, and is made a signed value, which has nteresting effects in the KVM emulation. Let's sanitize it. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02drm: hdlcd: Add information about the underlying framebuffers in debugfsLiviu Dudau
drm_fb_cma code has a nice helper function to display in the debugfs information about the underlying framebuffers used by HDLCD: $ cat /sys/kernel/debug/dri/0/fb fb: 1920x1200@XR24 0: offset=0 pitch=7680, obj: 0 ( 2) 001011ba 0x00000000fc300000 ffffff800a27c000 9338880 fb: 1920x1200@XR24 0: offset=0 pitch=7680, obj: 0 ( 2) 001008ca 0x00000000fba00000 ffffff8009987000 9338880 fb: 1920x1200@XR24 0: offset=0 pitch=7680, obj: 0 ( 1) 00100000 0x00000000fb100000 ffffff8008fdc000 9216000 Add the entry in HDLCD's debugfs node. Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-06-02drm: hdlcd: Cleanup the atomic plane operationsLiviu Dudau
Harden the plane_check() code to drop attempts at scaling because that is not supported. Make hdlcd_plane_atomic_update() set the pitch and line length registers that correctly reflect the plane's values. And make hdlcd_crtc_mode_set_nofb() a helper function for hdlcd_crtc_enable() rather than an exposed hook. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-06-02drm/hdlcd: Fix up crtc_state->event handlingDaniel Vetter
event_list just reimplemented what drm_crtc_arm_vblank_event does. And we also need to send out drm events when shutting down a pipe. With this it's possible to use the new nonblocking commit support in the helpers. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-06-02drm: hdlcd: Revamp runtime power managementLiviu Dudau
Because the HDLCD driver acts as a component master it can end up enabling the runtime PM functionality before the encoders are initialised. This can cause crashes if the component slave never probes (missing module) or if the PM operations kick in before the probe finishes. Move the enabling of the runtime PM after the component master has finished collecting the slave components and use the DRM atomic helpers to suspend and resume the device. Tested-by: Robin Murphy <Robin.Murphy@arm.com> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-06-02KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGSPaolo Bonzini
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40 RSP <ffff88005836bd50> Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUIDPaolo Bonzini
This causes an ugly dmesg splat. Beautified syzkaller testcase: #include <unistd.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <fcntl.h> #include <linux/kvm.h> long r[8]; int main() { struct kvm_irq_routing ir = { 0 }; r[2] = open("/dev/kvm", O_RDWR); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_SET_GSI_ROUTING, &ir); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsiPaolo Bonzini
Found by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000120 IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] PGD 6f80b067 PUD b6535067 PMD 0 Oops: 0000 [#1] SMP CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 [...] Call Trace: [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm] [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm] [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a1062>] tracesys_phase2+0x84/0x89 Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85 RIP [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] RSP <ffff8800926cbca8> CR2: 0000000000000120 Testcase: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[26]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", 0); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); struct kvm_irqfd ifd; ifd.fd = syscall(SYS_eventfd2, 5, 0); ifd.gsi = 3; ifd.flags = 2; ifd.resamplefd = ifd.fd; r[25] = ioctl(r[3], KVM_IRQFD, &ifd); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: fail KVM_SET_VCPU_EVENTS with invalid exception numberPaolo Bonzini
This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return EINVAL. It causes a WARN from exception_type: WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]() CPU: 3 PID: 16732 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e 0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2 ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001 Call Trace: [<ffffffff813b542e>] dump_stack+0x63/0x85 [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm] [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm] [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71 ---[ end trace b1a0391266848f50 ]--- Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/kvm.h> long r[31]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0); struct kvm_vcpu_events ve = { .exception.injected = 1, .exception.nr = 0xd4 }; r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve); r[30] = ioctl(r[7], KVM_RUN, 0); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>