Perf doesn't take the left period into account when auto-reload is
enabled with fixed period sampling mode in context switch.
Here is the MSR trace of the following perf command (simplified from
an ftrace log):
#perf record -e cycles:p -c 2000000 -- ./triad_loop
//The MSR trace of task schedule out
//perf disable all counters, disable PEBS, disable GP counter 0,
//read GP counter 0, and re-enable all counters.
//The counter 0 stops at 0xfffffff82840
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0
write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 0
write_msr: MSR_P6_EVNTSEL0(186), value 40003003c
rdpmc: 0, value fffffff82840
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff
//The MSR trace of the same task schedule in again
//perf disable all counters, enable and set GP counter 0,
//enable PEBS, and re-enable all counters.
//0xffffffe17b80 (-2000000) is written to GP counter 0.
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0
write_msr: MSR_IA32_PMC0(4c1), value ffffffe17b80
write_msr: MSR_P6_EVNTSEL0(186), value 40043003c
write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 1
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff
When the same task schedules in again, the counter should start from
where it previously left off. However, it starts from the fixed period
-2000000 again.
A special variant of intel_pmu_save_and_restart() is used for
auto-reload, which doesn't update the hwc->period_left.
When the monitored task schedules in again, perf doesn't know the left
period. The fixed period is used, which is inaccurate.
With auto-reload, the counter always holds a negative value, so the
left period is simply -value. Update the period_left in
intel_pmu_save_and_restart_reload().
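For illustration (this is not the kernel code), the left period can be
recovered from the raw counter value by sign-extending it from the
counter width; the raw value below is taken from the trace above and a
48-bit counter is assumed:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          const int cntval_bits = 48;             /* assumed counter width */
          const int shift = 64 - cntval_bits;
          uint64_t raw = 0xfffffff82840ULL;       /* rdpmc value at sched-out */
          /* sign-extend the 48-bit counter value to 64 bits */
          int64_t value = (int64_t)(raw << shift) >> shift;

          printf("counter value: %lld, left period: %lld\n",
                 (long long)value, (long long)-value);
          return 0;
  }

This prints a counter value of -513984, i.e. 513984 events were still
outstanding from the 2000000 period when the task scheduled out.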
With the patch:
//The MSR trace of task schedule out
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0
write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 0
write_msr: MSR_P6_EVNTSEL0(186), value 40003003c
rdpmc: 0, value ffffffe25cbc
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff
//The MSR trace of the same task schedule in again
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0
write_msr: MSR_IA32_PMC0(4c1), value ffffffe25cbc
write_msr: MSR_P6_EVNTSEL0(186), value 40043003c
write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 1
write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff
Fixes: d31fc13fdcb2 ("perf/x86/intel: Fix event update for auto-reload")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200121190125.3389-1-kan.liang@linux.intel.com
|
|
Commit 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
claimed L2 misses were unsupported because they could not be found in
its referenced documentation, whose link has since moved [1].
That old documentation listed PMCx064 unit mask bit 3 as:
"LsRdBlkC: LS Read Block C S L X Change to X Miss."
and bit 0 as:
"IcFillMiss: IC Fill Miss"
We now have new public documentation [2] with improved descriptions
that clearly indicate what events those unit mask bits represent:
Bit 3 now clearly states:
"LsRdBlkC: Data Cache Req Miss in L2 (all types)"
and bit 0 is:
"IcFillMiss: Instruction Cache Req Miss in L2."
So we can now add support for L2 misses in perf's genericised events as
PMCx064 with both of the above unit mask bits set.
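For illustration, the raw event code for those two unit mask bits
would be composed as below, assuming perf's usual
((umask << 8) | event) encoding for AMD events:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          uint16_t event = 0x64;                  /* PMCx064 */
          uint8_t umask = (1 << 3) | (1 << 0);    /* LsRdBlkC | IcFillMiss */

          printf("raw event code: 0x%04x\n",
                 (unsigned int)((umask << 8) | event));   /* 0x0964 */
          return 0;
  }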
[1] The commit's original documentation reference, "Processor Programming
Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors",
originally available here:
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
is now available here:
https://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Models_00h-0Fh.pdf
[2] "Processor Programming Reference (PPR) for Family 17h Model 31h,
Revision B0 Processors", available here:
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
Reported-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200121171232.28839-1-kim.phillips@amd.com
|
|
Tremont is Intel's successor to Goldmont Plus. The SMI_COUNT MSR is
also supported.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1580236279-35492-3-git-send-email-kan.liang@linux.intel.com
|
|
Tremont is Intel's successor to Goldmont Plus. From the perspective of
Intel cstate residency counters, nothing has changed compared with
Goldmont Plus and Goldmont.
Share glm_cstates with Goldmont Plus and Goldmont.
Update the comments for Tremont.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1580236279-35492-2-git-send-email-kan.liang@linux.intel.com
|
|
Elkhart Lake also uses a Tremont CPU. From the perspective of the Intel
PMU, nothing has changed compared with Jacobsville.
Share the perf code with Jacobsville.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1580236279-35492-1-git-send-email-kan.liang@linux.intel.com
|
|
We are missing this annotation in percpu_down_write(). Correct
this.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20200108013305.7732-1-dave@stgolabs.net
|
|
Now that __percpu_up_read() is only ever used from percpu_up_read(),
merge them; it's a small function.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20200131151540.212415454@infradead.org
|
|
Remove the now unused RWSEM_OWNER_UNKNOWN hack. This hack breaks
PREEMPT_RT, and getting rid of it was the entire motivation for
rewriting the percpu rwsem.
The biggest problem is that it is fundamentally incompatible with any
form of Priority Inheritance: any exclusively held lock must have a
distinct owner.
Requested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200204092228.GP14946@hirez.programming.kicks-ass.net
|
|
The filesystem freezer uses percpu-rwsem in a way that is effectively
write_non_owner() and achieves this with a few horrible hacks that
rely on the rwsem (!percpu) implementation.
When PREEMPT_RT replaces the rwsem implementation with a PI aware
variant this comes apart.
Remove the embedded rwsem and implement it using a waitqueue and an
atomic_t.
- make readers_block an atomic, and use it, with the waitqueue
for a blocking test-and-set write-side.
- have the read-side wait for the 'lock' state to clear.
Have the waiters use FIFO queueing and mark them (reader/writer) with
a new WQ_FLAG. Use a custom wake_function to wake either a single
writer or all readers up to the next writer.
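A minimal userspace sketch of the write-side scheme (illustrative
names; a pthread condition variable stands in for the kernel
waitqueue, and this is not the kernel implementation):

  #include <pthread.h>
  #include <stdatomic.h>

  struct pcpu_rwsem_sketch {
          atomic_int block;               /* the atomic_t 'lock' state */
          pthread_mutex_t wq_lock;
          pthread_cond_t wq;              /* stands in for the waitqueue */
  };

  static void writer_lock(struct pcpu_rwsem_sketch *s)
  {
          int expected = 0;

          pthread_mutex_lock(&s->wq_lock);
          /* blocking test-and-set: wait until we flip block from 0 to 1 */
          while (!atomic_compare_exchange_strong(&s->block, &expected, 1)) {
                  expected = 0;
                  pthread_cond_wait(&s->wq, &s->wq_lock);
          }
          pthread_mutex_unlock(&s->wq_lock);
          /* ... then wait for the per-CPU reader counts to drain ... */
  }

  static void writer_unlock(struct pcpu_rwsem_sketch *s)
  {
          atomic_store(&s->block, 0);     /* clear the 'lock' state */
          pthread_mutex_lock(&s->wq_lock);
          pthread_cond_broadcast(&s->wq); /* waiters recheck the flag */
          pthread_mutex_unlock(&s->wq_lock);
  }

The read-side slowpath would likewise wait for block to return to 0;
the FIFO ordering and the single-writer/all-readers wake selection are
what the custom wake_function provides in the kernel.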
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200204092403.GB14879@hirez.programming.kicks-ass.net
|
|
In preparation for removing the embedded rwsem and building a custom
lock, extract the read-trylock primitive.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200131151540.098485539@infradead.org
|
|
As preparation to rework __percpu_down_read() move the
__this_cpu_inc() into it.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200131151540.041600199@infradead.org
|
|
Use bool where possible.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200131151539.984626569@infradead.org
|
|
As preparation for replacing the embedded rwsem, give percpu-rwsem its
own lockdep_map.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200131151539.927625541@infradead.org
|
|
Once a lock class is zapped, all the lock chains that include the zapped
class are essentially useless. The lock_chain structure itself can be
reused, but not the corresponding chain_hlocks[] entries. Over time,
we will run out of chain_hlocks entries while there are still plenty
of other lockdep array entries available.
To fix this imbalance, we have to make chain_hlocks entries reusable
just like the others. Because the freed chain_hlocks entries come in
blocks of various lengths, a simple bitmap like the one used in the
other reusable lockdep arrays isn't applicable. Instead, the
chain_hlocks entries are put into bucketed lists (MAX_CHAIN_BUCKETS)
of chain blocks. Bucket 0 is the variable-size bucket, which houses
chain blocks of size larger than MAX_CHAIN_BUCKETS sorted in
decreasing size order. Initially, the whole array is one chain block
(the primordial chain block) in bucket 0.
The minimum size of a chain block is 2 chain_hlocks entries, which is
also the minimum allocation size. In other words, an allocation
request for one chain_hlocks entry will cause a 2-entry block to be
returned, and hence 1 entry will be wasted.
Allocation requests for chain_hlocks entries are fulfilled first by
looking for a chain block of matching size. If none is found, the
first chain block from bucket[0] (the largest one) is split. That can
fragment the hlock entries and reduce allocation efficiency if a chain
block of size > MAX_CHAIN_BUCKETS is ever zapped and put back after
the primordial chain block. So MAX_CHAIN_BUCKETS must be large enough
that this seldom happens.
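The bucket-selection rule described above can be illustrated with a
small standalone program (names mirror the description, not the kernel
source):

  #include <stdio.h>

  #define MAX_CHAIN_BUCKETS 16

  /* blocks of size 2..MAX_CHAIN_BUCKETS live in fixed-size buckets;
   * anything larger goes into the variable-size bucket 0 */
  static int size_to_bucket(int size)
  {
          return (size > MAX_CHAIN_BUCKETS) ? 0 : size - 1;
  }

  int main(void)
  {
          int requests[] = { 1, 2, 5, 16, 17, 100 };
          int n = sizeof(requests) / sizeof(requests[0]);

          for (int i = 0; i < n; i++) {
                  int size = requests[i] < 2 ? 2 : requests[i]; /* min block is 2 */
                  printf("request %3d -> block size %3d, bucket %d\n",
                         requests[i], size, size_to_bucket(size));
          }
          return 0;
  }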
By reusing the chain_hlocks entries, we are able to handle workloads
that add and zap a lot of lock classes without the risk of running out
of chain_hlocks entries, as long as the total number of outstanding
lock classes at any time remains within a reasonable limit.
Two new tracking counters, nr_free_chain_hlocks and
nr_large_chain_blocks, are added to track the total number of
chain_hlocks entries in the free bucketed lists and the number of
large chain blocks in bucket[0] respectively. nr_free_chain_hlocks
replaces nr_chain_hlocks. The nr_large_chain_blocks counter lets us
see whether we should increase the number of buckets
(MAX_CHAIN_BUCKETS) available so as to avoid the fragmentation problem
in bucket[0].
An internal nfsd test that ran for more than an hour and kept on
loading and unloading kernel modules could cause the following message
to be displayed.
[ 4318.443670] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
The patched kernel was able to complete the test with a lot of free
chain_hlocks entries to spare:
# cat /proc/lockdep_stats
:
dependency chains: 18867 [max: 65536]
dependency chain hlocks: 74926 [max: 327680]
dependency chain hlocks lost: 0
:
zapped classes: 1541
zapped lock chains: 56765
large chain blocks: 1
By changing MAX_CHAIN_BUCKETS to 3 and adding a counter for the size
of the largest chain block, the system still worked and we got the
following lockdep_stats data:
dependency chains: 18601 [max: 65536]
dependency chain hlocks used: 73133 [max: 327680]
dependency chain hlocks lost: 0
:
zapped classes: 1541
zapped lock chains: 56702
large chain blocks: 45165
large chain block size: 20165
By running the test again, I was indeed able to cause chain_hlocks
entries to get lost:
dependency chain hlocks used: 74806 [max: 327680]
dependency chain hlocks lost: 575
:
large chain blocks: 48737
large chain block size: 7
Due to the fragmentation, it is possible that the
"MAX_LOCKDEP_CHAIN_HLOCKS too low!" error can happen even if a lot of
chain_hlocks entries appear to be free.
Fortunately, a MAX_CHAIN_BUCKETS value of 16 should be big enough that
few variable sized chain blocks, other than the initial one, should
ever be present in bucket 0.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-7-longman@redhat.com
|
|
Add a new counter, nr_zapped_lock_chains, to track the number of lock
chains that have been removed.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-6-longman@redhat.com
|
|
If a lock chain contains a class that is zapped, the whole lock chain is
likely to be invalid. If the zapped class is at the end of the chain,
the partial chain without the zapped class should have been stored
already as the current code will store all its predecessor chains. If
the zapped class is somewhere in the middle, there is no guarantee that
the partial chain will actually happen. It may just clutter up the hash
and make searching slower. I would rather store the chain only when it
actually happens.
So just dump the corresponding chain_hlocks entries for now. A later
patch will try to reuse the freed chain_hlocks entries.
This patch also changes the type of nr_chain_hlocks to unsigned integer
to be consistent with the other counters.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-5-longman@redhat.com
|
|
The whole point of the lockdep dynamic key patch is to allow unused
locks to be removed from the lockdep data buffers so that existing
buffer space can be reused. However, there is no way to find out how
many unused locks are zapped and so we don't know if the zapping process
is working properly.
Add a new nr_zapped_classes counter to track that and show it in
/proc/lockdep_stats.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-4-longman@redhat.com
|
|
Currently, the irq_context field of the lock chains displayed in
/proc/lockdep_chains is just a number. It is likely that many people
may not know what a non-zero number means. To make the information more
useful, print the actual irq names ("softirq" and "hardirq") instead.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-3-longman@redhat.com
|
|
There are currently three counters to track the IRQ context of a lock
chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
They are incremented when a new lock chain is added, but they are
not decremented when a lock chain is removed. That causes some of the
statistic counts reported by /proc/lockdep_stats to be incorrect.
Fix that by decrementing the right counter when a lock chain is removed.
Since inc_chains() no longer accesses hardirq_context and softirq_context
directly, it is moved out from the CONFIG_TRACE_IRQFLAGS conditional
compilation block.
Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206152408.24165-2-longman@redhat.com
|
|
Fix kernel-doc warning in kernel/sched/fair.c, caused by a recent
function parameter removal:
../kernel/sched/fair.c:3526: warning: Excess function parameter 'flags' description in 'attach_entity_load_avg'
Fixes: a4f9a0e51bbf ("sched/fair: Remove redundant call to cpufreq_update_util()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/cbe964e4-6879-fd08-41c9-ef1917414af4@infradead.org
|
|
This patch fixes the following sparse warnings in sched/core.c
and sched/membarrier.c:
kernel/sched/core.c:2372:27: error: incompatible types in comparison expression
kernel/sched/core.c:4061:17: error: incompatible types in comparison expression
kernel/sched/core.c:6067:9: error: incompatible types in comparison expression
kernel/sched/membarrier.c:108:21: error: incompatible types in comparison expression
kernel/sched/membarrier.c:177:21: error: incompatible types in comparison expression
kernel/sched/membarrier.c:243:21: error: incompatible types in comparison expression
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200201125803.20245-1-madhuparnabhowmik10@gmail.com
|
|
Issuing write() with a count parameter of 0 on any file under
/proc/pressure/ will cause an OOB write because of the access to
buf[buf_size-1] when NUL-termination is performed. Fix this by
checking that buf_size is non-zero.
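A userspace model of the bug class and the fix (the psi handler's
exact shape is assumed; identifiers are illustrative):

  #include <stdio.h>
  #include <string.h>

  static int parse_write(const char *user_buf, size_t nbytes)
  {
          char buf[32];
          size_t buf_size;

          if (!nbytes)                    /* the fix: reject empty writes */
                  return -1;

          buf_size = nbytes < sizeof(buf) ? nbytes : sizeof(buf);
          memcpy(buf, user_buf, buf_size);
          buf[buf_size - 1] = '\0';       /* would underflow if buf_size == 0 */
          printf("parsed: %s\n", buf);
          return 0;
  }

  int main(void)
  {
          parse_write("some 500000 1000000", 20);
          parse_write("", 0);             /* rejected instead of writing OOB */
          return 0;
  }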
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20200203212216.7076-1-surenb@google.com
|
|
Now that patch.o is unconditionally selected for ftrace, it can also
get compiled for !MMU kernels. These (obviously) lack
{set,clear}_fixmap() support.
Also remove the superfluous __acquire/__release nonsense.
Fixes: 42e51f187f86 ("arm/ftrace: Use __patch_text()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The __patch_text() function already applies __opcode_to_mem_*(), so
when __opcode_to_mem_*() is not the identity (BE*), it is applied
twice, wrecking the instruction.
Fixes: 42e51f187f86 ("arm/ftrace: Use __patch_text()")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
|
|
The Jabra Evolve 65 headset claims to support rates lower than 48kHz,
but playback actually works only at 48kHz. This patch applies a
workaround to enforce 48kHz, as the LINE6 devices already do. The
workaround is factored into a unified helper function,
set_fixed_rate(), which is now called from both places.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206149
Link: https://lore.kernel.org/r/20200211111419.5895-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
A followup patch will need a helper function with the 'clashing entries
refer to the identical tuple in both directions' resolution logic.
This patch will add another resolve_clash helper where loser_ct must
not be added to the dying list because it will be inserted into the
table.
Therefore this also moves the stat counters and dying-list insertion
of the losing ct.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
... so it can be re-used from clash resolution in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
ctinfo is what's taken from the skb, i.e.
ct = nf_ct_get(skb, &ctinfo).
We do not pass 'ct' and instead re-fetch it from the skb.
Just do the same for both netns and ctinfo.
Also add a comment on what clash resolution is supposed to do.
While at it, one indent level can be removed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Assert that the i915_vma is now idle only if no other pins are
present. If another user has the i915_vma pinned, they may submit
more work to the i915_vma skipping the vm->mutex used to serialise the
unbind. We need to wait again, if we want to continue and unbind this
vma.
However, if we own the i915_vma (we hold the vm->mutex for the unbind
and the pin_count is 0), we can assert that the vma remains idle as we
unbind.
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/530
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-1-chris@chris-wilson.co.uk
(cherry picked from commit 60e94557fff1f5514c7fc4da7ddc2c7a13ffff26)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Multiplying two u32 numbers to generate a u64 in C requires a bit of
forewarning for the compiler.
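For example (the kernel also provides mul_u32_u32() for this; the
snippet below just demonstrates the pitfall):

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          uint32_t a = 100000, b = 100000;
          uint64_t wrong = a * b;           /* multiply done in 32 bits, truncates */
          uint64_t right = (uint64_t)a * b; /* widened first: full 64-bit product */

          printf("wrong: %llu\n", (unsigned long long)wrong);  /* 1410065408 */
          printf("right: %llu\n", (unsigned long long)right);  /* 10000000000 */
          return 0;
  }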
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123125934.1401755-1-chris@chris-wilson.co.uk
(cherry picked from commit 0f8f8a64300092852b9361cd835395ee71e6a7d6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
For a simulated preemption reset, we don't populate the request and so
do not fill in the guilty context name.
[ 79.991294] i915 0000:00:02.0: GPU HANG: ecode 9:1:e757fefe, in [0]
Just don't mention the empty string in the logs!
Fixes: 742379c0c400 ("drm/i915: Start chopping up the GPU error capture")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200121132107.267709-1-chris@chris-wilson.co.uk
(cherry picked from commit 29baf3ae8daa4c673de58106ff41c7236dff57f4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Currently we create a new mmap_offset for every call to
mmap_offset_ioctl. This exposes us to an abusive client that may
simply create new mmap_offsets ad infinitum, which will exhaust physical
memory and the virtual address space. In addition to the exhaustion, a
very long linear list of mmap_offsets causes other clients using the
object to incur long list walks -- these long lists can also be
generated by simply having many clients generate their own mmap_offset.
However, we can simply use the drm_vma_node itself to manage the file
association (allow/revoke) dropping our need to keep an mmo per-file.
Then if we keep a small rbtree of per-type mmap_offsets, we can lookup
duplicate requests quickly.
Fixes: cc662126b413 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200120104924.4000706-3-chris@chris-wilson.co.uk
(cherry picked from commit 7865559872074a9ab169c87915504661d630addf)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
We need to allow concurrent intel_context_unpin, which means avoiding
doing destructive operations like intel_ring_reset(). This was already
fixed for intel_ring_unpin() in commit 0725d9a31869 ("drm/i915/gt: Make
intel_ring_unpin() safe for concurrent pint"), but I overlooked that
execlists_context_unpin() also made the same mistake.
Reported-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 841350223816 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
References: 0725d9a31869 ("drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200115175829.2761329-1-chris@chris-wilson.co.uk
(cherry picked from commit f3c0efc9fe7a4e61544034f525348a3aa86ac5aa)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The entire asm/archrandom.h header is generically included via
linux/archrandom.h only when CONFIG_ARCH_RANDOM is already set, so the
stub definitions of __arm64_rndr() and __early_cpu_has_rndr() are only
visible to KASLR if it explicitly includes the arch-internal header.
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
During device memory memset, the driver allocates and uses a CB
(command buffer). To reuse existing code, it keeps a pointer to the CB
in two variables, user_cb and patched_cb. Therefore, the driver must
not "put" both user_cb and patched_cb, as that would cause an
underflow of the CB's refcnt.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
During hard reset we must not write to the device.
Hence avoid halting CoreSight during user context close if it is done
during hard reset.
In addition, we must not re-enable clock gating afterwards as it was
deliberately disabled in the beginning of the hard reset flow.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
The driver must halt the engines before doing hard-reset, otherwise the
device can go into undefined state. There is a place where the driver
didn't do that and this patch fixes it.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
It is theoretically possible for the ACPI EC GPE to be set after the
s2idle_ops->wake() called from s2idle_loop() has returned and before
the subsequent pm_wakeup_pending() check is carried out. If that
happens, the resulting wakeup event will cause the system to resume
even though it may be a spurious one.
To avoid that race, first make the ->wake() callback in struct
platform_s2idle_ops return a bool value indicating whether or not to
let the system resume, and rearrange s2idle_loop() to use that value
instead of the direct pm_wakeup_pending() call if ->wake() is present.
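A compilable sketch of the reworked control flow (callbacks stubbed;
names mirror the suspend code but this is not the kernel source):

  #include <stdbool.h>
  #include <stdio.h>

  struct platform_s2idle_ops {
          bool (*wake)(void);
  };

  static bool acpi_s2idle_wake_stub(void) { return true; }  /* stub */
  static bool pm_wakeup_pending(void) { return false; }     /* stub */

  static const struct platform_s2idle_ops *s2idle_ops =
          &(struct platform_s2idle_ops){ .wake = acpi_s2idle_wake_stub };

  int main(void)
  {
          for (;;) {
                  /* ... enter s2idle, woken by an interrupt ... */
                  if (s2idle_ops && s2idle_ops->wake) {
                          if (s2idle_ops->wake())
                                  break;  /* platform confirms a real wakeup */
                  } else if (pm_wakeup_pending()) {
                          break;
                  }
                  /* spurious event: go back to sleep */
          }
          printf("resuming\n");
          return 0;
  }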
Next, rework acpi_s2idle_wake() to process EC events and check
pm_wakeup_pending() before re-arming the SCI for system wakeup
to prevent it from triggering prematurely and add comments to
that function to explain the rationale for the new code flow.
Fixes: 56b991849009 ("PM: sleep: Simplify suspend-to-idle control flow")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
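In kernel terms the change might look roughly like this (identifiers
are illustrative, not the exact patch):

  /* a dedicated ordered workqueue: EC events are processed sequentially */
  static struct workqueue_struct *ec_wq;

  static int acpi_ec_init_workqueue(void)
  {
          ec_wq = alloc_ordered_workqueue("kacpi_ec", 0);
          return ec_wq ? 0 : -ENOMEM;
  }

  static void __acpi_ec_flush_work(void)
  {
          /* unlike flush_scheduled_work() on system_wq, drain_workqueue()
           * also waits for work items that the flushed work re-queues */
          drain_workqueue(ec_wq);
  }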
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It seems that we shouldn't include the include/linux/unaligned/
implementation headers directly. Just include asm/unaligned.h instead
so that we don't run into redefinition errors like the one below.
In file included from drivers/platform/chrome/wilco_ec/properties.c:8:0:
include/linux/unaligned/le_memmove.h:7:19: error: redefinition of 'get_unaligned_le16'
static inline u16 get_unaligned_le16(const void *p)
^~~~~~~~~~~~~~~~~~
In file included from arch/ia64/include/asm/unaligned.h:5:0,
from arch/ia64/include/asm/io.h:23,
from arch/ia64/include/asm/smp.h:21,
from include/linux/smp.h:68,
from include/linux/percpu.h:7,
from include/linux/arch_topology.h:9,
from include/linux/topology.h:30,
from include/linux/gfp.h:9,
from include/linux/xarray.h:14,
from include/linux/radix-tree.h:18,
from include/linux/idr.h:15,
from include/linux/kernfs.h:13,
from include/linux/sysfs.h:16,
from include/linux/kobject.h:20,
from include/linux/device.h:16,
from include/linux/platform_data/wilco-ec.h:11,
from drivers/platform/chrome/wilco_ec/properties.c:6:
include/linux/unaligned/le_struct.h:7:19: note: previous definition of 'get_unaligned_le16' was here
static inline u16 get_unaligned_le16(const void *p)
^~~~~~~~~~~~~~~~~~
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 60fb8a8e93ca ("platform/chrome: wilco_ec: Allow wilco to be compiled in COMPILE_TEST")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Currently the string formatting mixes up the offsets of ret and len.
Re-work the code to use just len: remove ret, and use scnprintf
instead of snprintf with len accumulating the position where required.
Remove the negative-return check, since scnprintf never returns a
failure (negative) size. Also break overly long lines to clean up
checkpatch warnings.
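A userspace model of why the accumulation is safe with
scnprintf-style clamping (my_scnprintf approximates the kernel's
scnprintf; names are illustrative):

  #include <stdio.h>

  static size_t my_scnprintf(char *buf, size_t size, const char *s)
  {
          int n;

          if (!size)
                  return 0;
          n = snprintf(buf, size, "%s", s);
          if (n < 0)
                  return 0;
          /* clamp to what was actually written, unlike snprintf's return */
          return (size_t)n < size ? (size_t)n : size - 1;
  }

  int main(void)
  {
          char str[16];
          size_t len = 0;

          len += my_scnprintf(str + len, sizeof(str) - len, "LINESTATE=");
          len += my_scnprintf(str + len, sizeof(str) - len, "0x12345678");
          printf("%zu: %s\n", len, str);  /* len stays within the buffer */
          return 0;
  }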
Addresses-Coverity: ("Unused value")
Fixes: 1381a5113caf ("usb: dwc3: debug: purge usage of strcat")
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
Symptom: an application opens /dev/ttyGS0 and starts sending (writing)
to it while either the USB cable is not connected, or nobody listens
on the other side of the cable. If the driver's circular buffer
overflows before the connection is established, no data will be
written to the USB layer until/unless /dev/ttyGS0 is closed and
re-opened by the application (which, in addition, has no means of
being notified that the connection has been established).
Fix: on open and/or connect, kick Tx to flush the circular buffer data
to the USB layer.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
ffs_aio_cancel() can be called from both interrupt and thread context. Make
sure that the current IRQ state is saved and restored by using
spin_{un,}lock_irq{save,restore}().
Otherwise undefined behavior might occur.
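The pattern in question, sketched (the epfile/io_data shapes are
assumed, not the exact diff):

  /* callable from both thread and IRQ context: preserve, don't assume,
   * the interrupt state across the critical section */
  unsigned long flags;
  int value;

  spin_lock_irqsave(&epfile->ffs->eps_lock, flags);
  if (io_data && io_data->ep && io_data->req)
          value = usb_ep_dequeue(io_data->ep, io_data->req);
  else
          value = -EINVAL;
  spin_unlock_irqrestore(&epfile->ffs->eps_lock, flags);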
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
SET/CLEAR_FEATURE for Remote Wakeup allowance was not handled
correctly, and GET_STATUS handling provided incorrect data in the DATA
stage. The issue was seen when the gadget's dr_mode was set to "otg"
and it was connected to MacOS.
Both are fixed and tested using USBCV Ch.9 tests.
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Fixes: fa389a6d7726 ("usb: dwc2: gadget: Add remote_wakeup_allowed flag")
Tested-by: Jack Mitchell <ml@embed.me.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
Move ISOC request length checking from the dwc2_hsotg_start_req()
function to dwc2_hsotg_ep_queue().
Fixes: 4fca54aa58293 ("usb: gadget: s3c-hsotg: add multi count support")
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
USB 3.x SuperSpeed peripherals can draw up to 900mA of VBUS power
when in configured state. However, if a configuration wanting to
take advantage of this is added with MaxPower greater than 500
(currently possible if using a ConfigFS gadget) the composite
driver fails to accommodate this for a couple of reasons:
- usb_gadget_vbus_draw() when called from set_config() and
composite_resume() will be passed the MaxPower value without
regard for the current connection speed, resulting in a
violation for USB 2.0 since the max is 500mA.
- the bMaxPower of the configuration descriptor would be
incorrectly encoded, again if the connection speed is only
at USB 2.0 or below, likely wrapping around U8_MAX since
the 2mA multiplier corresponds to a maximum of 510mA.
Fix these by adding checks against the current gadget->speed
when the c->MaxPower value is used (set_config() and
composite_resume()) and appropriately limit based on whether
it is currently at a low-/full-/high- or super-speed connection.
Because 900 is not divisible by 8, with the round-up division
currently used in encode_bMaxPower() a MaxPower of 900mA will
result in an encoded value of 0x71. When a host stack (including
Linux and Windows) enumerates this on a single port root hub, it
reads this value back and decodes (multiplies by 8) to get 904mA
which is strictly greater than 900mA that is typically budgeted
for that port, causing it to reject the configuration. Instead,
we should be using the round-down behavior of normal integral
division so that 900 / 8 -> 0x70 or 896mA to stay within range.
And we might as well change it for the high/full/low case as well
for consistency.
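The arithmetic, worked through (the 8mA units apply to SuperSpeed; the
2mA case is analogous):

  #include <stdio.h>

  int main(void)
  {
          unsigned int ma = 900;
          unsigned int up = (ma + 7) / 8;   /* round-up: 0x71, decodes to 904 mA */
          unsigned int down = ma / 8;       /* round-down: 0x70, decodes to 896 mA */

          printf("round-up:   0x%02x -> %u mA\n", up, up * 8);
          printf("round-down: 0x%02x -> %u mA\n", down, down * 8);
          return 0;
  }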
N.B. USB 3.2 Gen N x 2 allows for up to 1500mA, but there doesn't
seem to be any peripheral controller supported by Linux that does
two-lane operation, so for now keeping the clamp at 900 should be
fine.
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
SuperSpeedPlus peripherals must report their bMaxPower of the
configuration descriptor in units of 8mA as per the USB 3.2
specification. The current switch statement in encode_bMaxPower()
only checks for USB_SPEED_SUPER but not USB_SPEED_SUPER_PLUS so
the latter falls back to USB 2.0 encoding which uses 2mA units.
Replace the switch with a simple if/else.
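A sketch of the intended shape (illustrative, not the exact patch):

  static u8 encode_bMaxPower_sketch(enum usb_device_speed speed,
                                    unsigned int mA)
  {
          /* >= covers both USB_SPEED_SUPER and USB_SPEED_SUPER_PLUS */
          if (speed >= USB_SPEED_SUPER)
                  return mA / 8;    /* 8 mA units */
          return mA / 2;            /* 2 mA units */
  }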
Fixes: eae5820b852f ("usb: gadget: composite: Write SuperSpeedPlus config descriptors")
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
Prior to commit eb9fecb9e69b ("usb: gadget: f_uac2: split out audio
core") the maximum packet size was calculated only from the high-speed
descriptor but now we use the largest of the full-speed and high-speed
descriptors.
This is correct, but the full-speed value is likely to be higher than
that for high-speed and this leads to submitting requests for OUT
transfers (received by the gadget) which are larger than the endpoint's
maximum packet size. These are rightly rejected by the gadget core.
config_ep_by_speed() already sets up the correct maximum packet size for
the enumerated speed in the usb_ep structure, so we can simply use this
instead of the overall value that has been used to allocate buffers for
requests.
Note that the minimum period for ALSA is still set from the largest
value, and this is unavoidable because it's possible to open the audio
device before the gadget has been enumerated.
Tested-by: Pavel Hofman <pavel.hofman@ivitera.com>
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
The current code in dwc3_gadget_ep_reclaim_completed_trb() checks for
the IOC/LST bit in event->status and returns if it is set. This logic
doesn't work if multiple TRBs are queued per request and the IOC/LST
bit is set only on the last TRB of that request.
Consider an example where a queued request has multiple queued
TRBs and IOC/LST bit is set only for the last TRB. In this case,
the core generates XferComplete/XferInProgress events only for
the last TRB (since IOC/LST are set only for the last TRB). As
per the logic in dwc3_gadget_ep_reclaim_completed_trb()
event->status is checked for IOC/LST bit and returns on the
first TRB. This leaves the remaining TRBs left unhandled.
Similarly, if the gadget function enqueues an unaligned request
with sglist already in it, it should fail the same way, since we
will append another TRB to something that already uses more than
one TRB.
To avoid this, this patch changes the code to check for the IOC/LST
bits in trb->ctrl instead.
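Roughly, the per-TRB decision becomes (illustrative shape, not the
exact diff):

  /* the reclaim decision is based on the TRB's own control bits, so
   * intermediate TRBs of a multi-TRB request are still processed */
  if (trb->ctrl & (DWC3_TRB_CTRL_IOC | DWC3_TRB_CTRL_LST))
          return 1;   /* last TRB of this request */
  return 0;           /* continue reclaiming the remaining TRBs */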
At a practical level, this patch resolves USB transfer stalls seen
with adb on dwc3 based HiKey960 after functionfs gadget added
scatter-gather support around v4.20.
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Yang Fei <fei.yang@intel.com>
Cc: Thinh Nguyen <thinhn@synopsys.com>
Cc: Tejas Joglekar <tejas.joglekar@synopsys.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Cc: Jack Pham <jackp@codeaurora.org>
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linux USB List <linux-usb@vger.kernel.org>
Cc: stable <stable@vger.kernel.org>
Tested-by: Tejas Joglekar <tejas.joglekar@synopsys.com>
Reviewed-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com>
[jstultz: forward ported to mainline, reworded commit log, reworked
to only check trb->ctrl as suggested by Felipe]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
|
|
Move the checking, buffer reserve and buffer commit code in
synth_event_trace_start/end() into inline functions
__synth_event_trace_start/end() so they can also be used by
synth_event_trace() and synth_event_trace_array(), and then have all
those functions use them.
Also, change synth_event_trace_state.enabled to disabled so it only
needs to be set if the event is disabled, which is not normally the
case.
Link: http://lkml.kernel.org/r/b1f3108d0f450e58192955a300e31d0405ab4149.1581374549.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|