Age | Commit message (Collapse) | Author |
|
threads
The pr_err in self_open_counters() prints error message to stderr.
Unlike stdout, stderr uses memory buffer on the stack of each calling
process.
The pr_err in self_open_counters() works in a thread called thread_func
created in function create_tasks, which concurrently creates
sched->nr_tasks threads.
If the error happens and pr_err prints the error message in each of
these threads, the stack size of the perf process (default is 8192
kbytes) will quickly run out and the segmentation fault will happen
then.
To solve this problem, pr_err with self_open_counters() should be moved
from newly created threads to the old main thread of the perf process.
Then the pr_err can work in a stable situation without the strange
segmentation fault problem.
Example:
Test environment: x86_64 with 160 cores
Before this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
Segmentation fault
After this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
...
As shown above, the result continues without any segmentation fault.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-6-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
the different pid_max configurations
Although the memory of pid_to_task can be allocated via calloc according
to the value of /proc/sys/kernel/pid_max, it cannot handle the case when
pid_max is changed after 'perf sched record' has created its perf.data.
If the new pid_max configured in 'perf sched replay' is smaller than the
old pid_max configured in 'perf sched record', then it will cause the
assertion failure problem.
To solve this problem, we realloc the memory of pid_to_task stepwise
once the passed-in pid parameter in register_pid is larger than the
current pid_max.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
$ perf sched record ls
$ echo 5000 > /proc/sys/kernel/pid_max
$ cat /proc/sys/kernel/pid_max
5000
Before this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55356 nsecs
the run test took 1000011 nsecs
the sleep test took 1060940 nsecs
perf: builtin-sched.c:337: register_pid: Assertion `!(pid >= (unsigned
long)pid_max)' failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55611 nsecs
the run test took 1000026 nsecs
the sleep test took 1060486 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-5-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
the unexpected change of pid_max
The current memory allocation of struct task_desc *pid_to_task[MAX_PID]
is in a permanent and preset way, and it has two problems:
Problem 1: If the pid_max, which is the max number of pids in the
system, is much smaller than MAX_PID (1024*1000), then it causes a waste
of stack memory. This may happen in the case where the number of cpu
cores is much smaller than 1000.
Problem 2: If the pid_max is changed from the default value to a value
larger than MAX_PID, then it will cause assertion failure problem. The
maximum value of pid_max can be set to pid_max_max (see pidmap_init
defined in kernel/pid.c), which equals to PID_MAX_LIMIT. In x86_64,
PID_MAX_LIMIT is 4*1024*1024 (defined in include/linux/threads.h). This
value is much larger than MAX_PID, and will take up 32768 Kbytes
(4*1024*1024*8/1024) for memory allocation of pid_to_task, which is much
larger than the default 8192 Kbytes of the stack size of calling
process.
Due to these two problems, we use calloc to allocate the memory of
pid_to_task dynamically.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
$ echo 1025000 > /proc/sys/kernel/pid_max
$ cat /proc/sys/kernel/pid_max
1025000
Run some applications until the pid of some process is greater than
the value of MAX_PID (1024*1000).
Before this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55480 nsecs
the run test took 1000008 nsecs
the sleep test took 1063151 nsecs
perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 1024000)'
failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55435 nsecs
the run test took 1000004 nsecs
the sleep test took 1059312 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-4-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Current MAX_PID is only 65536, which will cause assertion failure problem
when CPU cores are more than 64 in x86_64.
This is because the pid_max value in x86_64 is at least
PIDS_PER_CPU_DEFAULT * num_possible_cpus() (see function pidmap_init
defined in kernel/pid.c), where PIDS_PER_CPU_DEFAULT is 1024 (defined in
include/linux/threads.h).
Thus for MAX_PID = 65536, the correspoinding CPU cores are
65536/1024=64. This is obviously not enough at all for x86_64, and will
cause an assertion failure problem due to BUG_ON(pid >= MAX_PID) in the
codes.
We increase MAX_PID value from 65536 to 1024*1000, which can be used in
x86_64 with 1000 cores.
This number is finally decided according to the limitation of stack size
of calling process.
Use 'ulimit -a', the result shows the stack size of any process is 8192
Kbytes, which is defined in include/uapi/linux/resource.h (#define
_STK_LIM (8*1024*1024)).
Thus we choose a large enough value for MAX_PID, and make it satisfy to
the limitation of the stack size, i.e., making the perf process take up
a memory space just smaller than 8192 Kbytes.
We have calculated and tested that 1024*1000 is OK for MAX_PID.
This means perf sched replay can now be used with at most 1000 cores in
x86_64 without any assertion failure problem.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
Before this patch:
$ perf sched replay
run measurement overhead: 240 nsecs
sleep measurement overhead: 55379 nsecs
the run test took 1000004 nsecs
the sleep test took 1059424 nsecs
perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 65536)'
failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55397 nsecs
the run test took 999920 nsecs
the sleep test took 1053313 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
correct meaning
There is no struct task_task at all, thus it is a typo error in the old
commits, now fix it to what it should be in order to avoid unnecessary
misunderstanding.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-2-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently the perf kmem does not respect -i option.
Initializing the file.path properly after options get parsed.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-2-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently it ignores operator priority and just sets processed args as a
right operand. But it could result in priority inversion in case that
the right operand is also a operator arg and its priority is lower.
For example, following print format is from new kmem events.
"page=%p", REC->pfn != -1UL ? (((struct page *)(0xffffea0000000000UL)) + (REC->pfn)) : ((void *)0)
But this was treated as below:
REC->pfn != ((null - 1UL) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
In this case, the right arg was '?' operator which has lower priority.
But it just sets the whole arg so making the output confusing - page was
always 0 or 1 since that's the result of logical operation.
With this patch, it can handle it properly like following:
((REC->pfn != (null - 1UL)) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-10-git-send-email-namhyung@kernel.org
[ Replaced 'swap' with 'rotate' in a comment as requested by Steve and agreed by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch add checks in places where map__kmap is used to get kmaps
from struct kmap.
Error messages are added at map__kmap to warn invalid accessing of kmap
(for the case of !map->dso->kernel, kmap(map) does not exists at all).
Also, introduces map__kmaps() to warn uninitialized kmaps.
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1428394966-131044-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_evlist__mmap_consume() uses perf_mmap__empty() to judge whether
perf_mmap is empty and can be released. But the result is inverted so
fix it.
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1428399071-7141-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Tidy up error reporting and move rpm reference retrieval out of the for
loop for improved readability.
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Modeling the individual RPM resources as platform devices consumes at
least 12-15kb of RAM, just to hold the platform_device structs. Rework
this to instead have one device per pmic exposed by the RPM.
With this representation we can more accurately define the input pins on
the pmic and have the supply description match the data sheet.
Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Refactor out all custom property parsing code from the probe function
into a function suitable for regulator_desc->of_parse_cb usage.
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver itself should not flag regulators as being DRMS compatible,
this should come from board or dt files.
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
pdata is assigned to &pdata_of, however, pdata_of becomes dead (when it
goes out of scope) so pdata effectively becomes a dead pointer to the
out of scope object. This is detected by static analysis:
[drivers/regulator/max8660.c:411]: (error) Dead pointer usage.
Pointer 'pdata' is dead if it has been assigned '&pdata_of' at line 404.
Move declaration of pdata_of so it is always in scope.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Setting the transfer length in the TRANSACTION register after the
CONTROL register is programmed causes intermittent timeout issues in
SPFI transfers when using the SPI framework to control the CS GPIO
lines. To avoid this issue, set transfer length before programming
the CONTROL register.
Signed-off-by: Sifan Naeem <sifan.naeem@imgtec.com>
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Everybody expects the error field in the struct mmc_command|data to be
and int but it's actually an unsigned int. Let's convert it into an int
to meet the expectations.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Also check MMC OF properties. The controller supports MMC too.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
rts5250 chip failed handle 64 bit ADMA for address below 4G.
Add 64 BIT quirks to disable this feature.
Signed-off-by: Micky Ching <micky_ching@realsil.com.cn>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Pin sense will active when power pin is wake up.
Power pin will not wake up immediately during resume state.
Add some delay to wait for power pin activated.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Dirty logging tracks sptes in 4k granularity, meaning that large sptes
have to be split. If live migration is successful, the guest in the
source machine will be destroyed and large sptes will be created in the
destination. However, the guest continues to run in the source machine
(for example if live migration fails), small sptes will remain around
and cause bad performance.
This patch introduce lazy collapsing of small sptes into large sptes.
The rmap will be scanned in ioctl context when dirty logging is stopped,
dropping those sptes which can be collapsed into a single large-page spte.
Later page faults will create the large-page sptes.
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Message-Id: <1428046825-6905-1-git-send-email-wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
CR2 is not cleared as it should after reset. See Intel SDM table named "IA-32
Processor States Following Power-up, Reset, or INIT".
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427933438-12782-5-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
DR0-DR3 are not cleared as they should during reset and when they are set from
userspace. It appears to be caused by c77fb5fe6f03 ("KVM: x86: Allow the guest
to run with dirty debug registers").
Force their reload on these situations.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427933438-12782-4-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
After reset, the CPU can change the BSP, which will be used upon INIT. Reset
should return the BSP which QEMU asked for, and therefore handled accordingly.
To quote: "If the MP protocol has completed and a BSP is chosen, subsequent
INITs (either to a specific processor or system wide) do not cause the MP
protocol to be repeated."
[Intel SDM 8.4.2: MP Initialization Protocol Requirements and Restrictions]
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427933438-12782-3-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
recalculate_apic_map() uses two passes over all VCPUs. This is a relic
from time when we selected a global mode in the first pass and set up
the optimized table in the second pass (to have a consistent mode).
Recent changes made mixed mode unoptimized and we can do it in one pass.
Format of logical MDA is a function of the mode, so we encode it in
apic_logical_id() and drop obsoleted variables from the struct.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1423766494-26150-5-git-send-email-rkrcmar@redhat.com>
[Add lid_bits temporary in apic_logical_id. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We want to support mixed modes and the easiest solution is to avoid
optimizing those weird and unlikely scenarios.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1423766494-26150-4-git-send-email-rkrcmar@redhat.com>
[Add comment above KVM_APIC_MODE_* defines. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Broadcast allowed only one global APIC mode, but mixed modes are
theoretically possible. x2APIC IPI doesn't mean 0xff as broadcast,
the rest does.
x2APIC broadcasts are accepted by xAPIC. If we take SDM to be logical,
even addreses beginning with 0xff should be accepted, but real hardware
disagrees. This patch aims for simple code by considering most of real
behavior as undefined.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1423766494-26150-3-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In mixed modes, we musn't deliver xAPIC IPIs like x2APIC and vice versa.
Instead of preserving the information in apic_send_ipi(), we regain it
by converting all destinations into correct MDA in the slow path.
This allows easier reasoning about subsequent matching.
Our kvm_apic_broadcast() had an interesting design decision: it didn't
consider IOxAPIC 0xff as broadcast in x2APIC mode ...
everything worked because IOxAPIC can't set that in physical mode and
logical mode considered it as a message for first 8 VCPUs.
This patch interprets IOxAPIC 0xff as x2APIC broadcast.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1423766494-26150-2-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop unused static procedure which doesn't have callers within its
translation unit. It had been already removed independently in QEMU[1]
from the OpenPIC implementation borrowed from the kernel.
[1] https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg01812.html
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1424768706-23150-3-git-send-email-asolokha@kb.kras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
After speed-up of cpuid_maxphyaddr() it can be called frequently:
instead of heavyweight enumeration of CPUID entries it returns a cached
pre-computed value. It is also inlined now. So caching its result became
unnecessary and can be removed.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Message-Id: <20150329205644.GA1258@gnote>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On each VM-entry CPU should check the following VMCS fields for zero bits
beyond physical address width:
- APIC-access address
- virtual-APIC address
- posted-interrupt descriptor address
This patch adds these checks required by Intel SDM.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Message-Id: <20150329205627.GA1244@gnote>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
cpuid_maxphyaddr(), which performs lot of memory accesses is called
extensively across KVM, especially in nVMX code.
This patch adds a cached value of maxphyaddr to vcpu.arch to reduce the
pressure onto CPU cache and simplify the code of cpuid_maxphyaddr()
callers. The cached value is initialized in kvm_arch_vcpu_init() and
reloaded every time CPUID is updated by usermode. It is obvious that
these reloads occur infrequently.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Message-Id: <20150329205612.GA1223@gnote>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Exposing the on-stack error code with internal error is cheap and
potentially useful.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1428001865-32280-1-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If we were migrated right after __getcpu, but before reading the
migration_count, we wouldn't notice that we read TSC of a different
VCPU, nor that KVM's bug made pvti invalid, as only migration_count
on source VCPU is increased.
Change vdso instead of updating migration_count on destination.
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Fixes: 0a4e6be9ca17 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The corresponding write functions just use __copy_to_user. Do the
same on the read side.
This reverts what's left of commit 86ab8cffb498 (KVM: introduce
gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic, 2012-08-21)
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1427976500-28533-1-git-send-email-pbonzini@redhat.com>
|
|
The newly-added tracepoint shows the following results on
the tscdeadline_latency test:
qemu-kvm-8387 [002] 6425.558974: kvm_vcpu_wakeup: poll time 10407 ns
qemu-kvm-8387 [002] 6425.558984: kvm_vcpu_wakeup: poll time 0 ns
qemu-kvm-8387 [002] 6425.561242: kvm_vcpu_wakeup: poll time 10477 ns
qemu-kvm-8387 [002] 6425.561251: kvm_vcpu_wakeup: poll time 0 ns
and so on. This is because we need to go through kvm_vcpu_block again
after the timer IRQ is injected. Avoid it by polling once before
entering kvm_vcpu_block.
On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
from the latency of the TSC deadline timer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename the old __vcpu_run to vcpu_run, and extract part of it to a new
function vcpu_block.
The next patch will add a new condition in vcpu_block, avoid extra
indentation.
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Guest can't be booted w/ ept=0, there is a message dumped as below:
If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.
EAX=00000011 EBX=f000d2f6 ECX=00006cac EDX=000f8956
ESI=bffbdf62 EDI=00000000 EBP=00006c68 ESP=00006c68
EIP=0000d187 EFL=00000004 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =e000 000e0000 ffffffff 00809300 DPL=0 DS16 [-WA]
CS =f000 000f0000 ffffffff 00809b00 DPL=0 CS16 [-RA]
SS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
DS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
FS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 000f6a80 00000037
IDT= 000f6abe 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=01 1e b8 6a 2e 0f 01 16 74 6a 0f 20 c0 66 83 c8 01 0f 22 c0 <66> ea 8f d1 0f 00 08 00 b8 10 00 00 00 8e d8 8e c0 8e d0 8e e0 8e e8 89 c8 ff e2 89 c1 b8X
X86 eflags bit 1 is fixed set, which means that 1 << 1 is set instead of 1,
this patch fix it.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Message-Id: <1428473294-6633-1-git-send-email-wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The change to only export WEXT symbols when required could break
the build if CONFIG_CFG80211_WEXT was explicitly disabled while
a driver like orinoco selected it.
Fix this by hiding the symbol when it's required so it can't be
disabled in that case.
Fixes: 2afe38d15cee ("cfg80211-wext: export symbols only when needed")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1428424967-14460-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Interrupt entry points are handled with the following code,
each 32-byte code block contains seven entry points:
...
[push][jump 22] // 4 bytes
[push][jump 18] // 4 bytes
[push][jump 14] // 4 bytes
[push][jump 10] // 4 bytes
[push][jump 6] // 4 bytes
[push][jump 2] // 4 bytes
[push][jump common_interrupt][padding] // 8 bytes
[push][jump]
[push][jump]
[push][jump]
[push][jump]
[push][jump]
[push][jump]
[push][jump common_interrupt][padding]
[padding_2]
common_interrupt:
And there is a table which holds pointers to every entry point,
IOW: to every push.
In cold cache, two jumps are still costlier than one, even
though we get the benefit of them residing in the same
cacheline.
This change replaces short jumps with near ones to
'common_interrupt', and pads every push+jump pair to 8 bytes. This
way, each interrupt takes only one jump.
This change replaces ".p2align CONFIG_X86_L1_CACHE_SHIFT" before
dispatch table with ".align 8" - we do not need anything
stronger than that.
The table of entry addresses (the interrupt[] array) is no
longer necessary, the address of entries can be easily
calculated as (irq_entries_start + i*8).
text data bss dec hex filename
12546 0 0 12546 3102 entry_64.o.before
11626 0 0 11626 2d6a entry_64.o
The size decrease is because 1656 bytes of .init.rodata are
gone. That's initdata, though. The resident size does go up a
bit.
Run-tested (32 and 64 bits).
Acked-and-Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1428090553-7283-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This change does two things:
Copy-pastes "retint_swapgs:" code into syscall handling code,
the copy is under "syscall_return:" label. The code is unchanged
apart from some label renames.
Removes "opportunistic sysret" code from "retint_swapgs:" code
block, since now it won't be reached by syscall return. This in
fact removes most of the code in question.
text data bss dec hex filename
12530 0 0 12530 30f2 entry_64.o.before
12562 0 0 12562 3112 entry_64.o
Run-tested.
Acked-and-Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1427993219-7291-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Conflicts:
arch/x86/kernel/entry_64.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since the required clock to access registers is gated off in low power mode,
add ci->in_lpm check before try to dump registers value.
Signed-off-by: Li Jun <jun.li@freescale.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
|
|
This is my sigreturn test, added mostly unchanged from its old
home. It exercises the sigreturn(2) syscall, specifically
focusing on its interactions with various IRET corner cases.
It tests for correct behavior in several areas that were
historically dangerously buggy. For example, it exercises espfix
on kernels of both bitnesses under various conditions, and it
contains testcases for several now-fixed bugs in IRET error
handling.
If you run it on older kernels without the fixes, your system will
crash. It probably won't eat your data in the process.
There is no released kernel on which the sigreturn_64 test will
pass, but it passes on tip:x86/asm.
I plan to switch to lib.mk for Linux 4.2.
I'm not using the ksft_ helpers at all yet. I can do that later.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuah.kh@samsung.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/89d10b76b92c7202d8123654dc8d36701c017b3d.1428386971.git.luto@kernel.org
[ Fixed empty format string GCC build warning in trivial_32bit_program.c ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
By pass pullup DP in OTG fsm mode when do gadget connect, to let it handled
by OTG state machine.
This patch can fix the problem you found with my HNP polling patchset after
below 3 patches introduced:
467a78c usb: chipidea: udc: apply new usb_udc_vbus_handler interface
628ef0d usb: udc: add usb_udc_vbus_handler
dfea9c9 usb: udc: store usb_udc pointer in struct usb_gadget
Problem:
- Connect USB cable and MicroAB cable between two boards
- Boot up two boards
- load g_mass_storage at B-device side, the enumeration will success,
and A will see a usb mass-storage device
- load g_mass_storage at A-device side, the problem has occurred, the
connection will be lost at the beginning, then connect again.
This patch is based on
commit eff933c1d3a2e046492b3dfc86db813856553a29
(chipidea: pci: make it depends on NOP_USB_XCEIV)
on branch peter-usb-dev of
git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb.git
Signed-off-by: Li Jun <jun.li@freescale.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
|
|
See https://lkml.org/lkml/2014/10/29/643 and commit f56141e3e2d9
("all arches, signal: move restart_block to struct task_struct")
Signed-off-by: Ley Foon Tan <lftan@altera.com>
|
|
Regression in commit 2caa80e72b57c6216aec6f6a11fcfb4fec46daa0
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sun Feb 22 11:38:36 2015 +0100
drm: Fix deadlock due to getconnector locking changes
If the drm_connector_find() call returns NULL, we should no longer
call drm_modeset_unlock() to avoid locking imbalance.
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Simon reported the md io stats accounting issue:
"
I'm seeing "iostat -x -k 1" print this after a RAID1 rebuild on 4.0-rc5.
It's not abnormal other than it's 3-disk, with one being SSD (sdc) and
the other two being write-mostly:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 345.00 0.00 0.00 0.00 0.00 100.00
md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 58779.00 0.00 0.00 0.00 0.00 100.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 12.00 0.00 0.00 0.00 0.00 100.00
"
The cause is commit "18c0b223cf9901727ef3b02da6711ac930b4e5d4" uses the
generic_start_io_acct to account the disk stats rather than the open code,
but it also introduced the increase to .in_flight[rw] which is needless to
md. So we re-use the open code here to fix it.
Reported-by: Simon Kirby <sim@hostway.ca>
Cc: <stable@vger.kernel.org> 3.19
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
"The domainname can be specified as either a DNS host name, a
dotted-decimal IPv4 address, or a bracketed IPv6 address as specified
in [RFC2732]."
See https://bugzilla.redhat.com/show_bug.cgi?id=1206868
Reported-by: Kyle Brantley <kyle@averageurl.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|