Age | Commit message (Collapse) | Author |
|
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SET_SYSTEM_SLEEP_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
omap_hsmmc_x callbacks under CONFIG_PM_SLEEP to avoid build warnings.
Signed-off-by: Russ Dill <Russ.Dill@ti.com>
[Grygorii.Strashko@linaro.org: rebased on top of K4.0]
Signed-off-by: Grygorii Strashko <grygorii.strashko@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Mobile phones (some) have no card detect pin, but can detect if the
cover is removed. The purpose is the same; detect if card is being
added/removed, but the details differ.
When the cover is removed, it does not mean the card is gone. But it
might, since it is accessible now. It's like a warning. All the driver
does is to limit write access to the card, see protect_card flag.
In contrast, card detect notifies us after the fact, e.g.
card is gone, card is inserted. We can't take precautions, but we can
rely on those events, -- the card is really gone, or do scan the card.
To summarize there is not much code sharing between cover and card
detect, it only increases confusion. By splitting, both will be
simplified in a followup patch.
Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The indirection via omap_hsmmc_get_ro and omap_hsmmc_get_wp is
redundant. Also dropped setting gpio_wp to EINVAL since platform date
is read-only
Untested: no device with ro pin was available, but change is fairly
simple
Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
addon to: 09108968b7b72b6083a3bfc8f8259a74ed57255e
mmc: omap_hsmmc: remove prepare/complete system suspend support
Signed-off-by: Andreas Fenkart <afenkart@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The reset control for the sunxi mmc controller is optional. Some
newer platforms (sun6i, sun8i, sun9i) have it, while older ones
(sun4i, sun5i, sun7i) don't.
Use the properly stubbed _optional version so the driver does not
fail to compile when RESET_CONTROLLER=n.
This patch also adds a check for deferred probing on the reset
control.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Cc: <stable@vger.kernel.org> # 3.16+
Acked-by: David Lanzendörfer <david.lanzendoerfer@o2s.ch>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In these drivers, the driver specific .remove function just a simple
wrapper of function sdhci_pltfm_unregister(). So remove these wrappers
and just set .remove to sdhci_pltfm_unregister().
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
So we can avoid to sprinkle the clk_disable_unprepare() in many
drivers.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Actually we can use the "clk" in the struct sdhci_pltfm_host. Also
change the "external clock" to "core clock" and kill two redundant
private functions in this driver as suggested by Ray Jui.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Actually we can use the "clk" in the struct sdhci_pltfm_host.
With this change we can also kill the private function for get
max clock in this driver.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Simplify the error and remove path.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
There is only one "clk" member in this driver specific private struct.
Actually we can use the "clk" member in the struct sdhci_pltfm_host,
and then kill this struct completely.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The function clk_disable_unprepare() already take care of either error
or null cases.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
For the original tuning code, delay value is set to SD Bus Clock Delay
Register (SD_CLK_DELAY_SETTING) as (val | (Val << 7) | (val << 16)),
which means CLK_DELAY_IN1, CLK_DELAY_IN2 and CLK_DELAY_OUT are the
same and with 128 steps. This is doubtful. In CSR design specification
documents CS-304575-DR-3H, this issue is clarified, the delay[13:0] in
SD_CLK_DELAY_SETTING is simplied to the concatenation of {CLK_DELAY_IN2,
CLK_DELAY_IN1}.
Besides, for CMD19 tuning, no need to set CLK_DELAY_OUT([22,16]
of SD_CLK_DELAY_SETTING).
Signed-off-by: weijun yang <york.yang@csr.com>
Signed-off-by: Barry Song <baohua.song@csr.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
It's unlikely that this is really needed on any single-slot systems
where we disable card detects until the end of probe, but it still
seems safer to check to make sure that a slot has been initted before
we try to dereference it to find the SDIO interrupt mask.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We really don't want to get a card detect interrupt during probe time
since it can confuse things. Let's disable the card detect interrupt
until we're in a really good place: the end of probe. Let's also
simply avoid enabling the card detect interrupt if it's not used.
It appears that (at least on rk3288) when vqmmc is turned on it can
cause a bogus "card detect" interrupt. That meant that we were
getting a predictable card detect interrupt while we were in
mmc_add_host(). On the version of the kernel I'm working with at
least (3.14), this is not a great time to get a card detect interrupt
since I think that we don't grab all the needed locks in
mmc_add_host() and children. I put stack dumps in dw_mci_setup_bus()
and found that I could see two distinct stack crawls that looked like:
Caller one:
* dw_mci_setup_bus
* dw_mci_set_ios
* mmc_power_up
* mmc_start_host
* mmc_add_host
Caller two:
* dw_mci_setup_bus
* dw_mci_set_ios
* mmc_set_chip_select
* mmc_go_idle
* mmc_rescan
* process_one_work
* worker_thread
* kthread
Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We've seen problems on some WiFi modules where we seem to send a CMD53
(which requires the data lines) while the module is asserting busy.
We shouldn't do that.
The Designware Databook says that before issuing a new data transfer
command we should check for busy, so that's what we'll do.
We'll leverage the existing dw_mmc knowledge about whether it should
wait for the previous command to finish to know whether we should
check for busy before sending the command. This means we won't end up
incorrectly waiting for things like CMD52 (SDIO) or CMD13 (SD) which
don't use the data line.
Note that this also has the advantage of making sure that we don't
change the clock while the card is busy, too.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We should give dw_mmc a good reset after we apply power. On some
boards vqmmc may actually be connected to the IP block in the SoC so
it's good to reset after power comes in.
Without this we sometimes see failures enumerating cards on rk3288.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
It appears that we can confuse things if we try to turn on the MMC
clock when the power is off. Adjust is so that we turn the clock on
(using dw_mci_setup_bus) after power is all the way on and we turn the
clock off before the power goes off.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The STOP command can terminate a data transfer between a memory card and
mmc controller.
As show in Synopsys DesignWare Cores Mobile Storage Host Databook:
Data timeout and Data end-bit error will terminate further data transfer
by mmc controller. So we should not send abort command to terminate a
data transfer again if we got DRTO and EBE interrupt.
After this patch, all mmc_test cases can pass on RK3288-Pink2 board.
Signed-off-by: Addy Ke <addy.ke@rock-chips.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Tested-by: Doug Anderson <dianders@chromium.org>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
To support HS200 and UHS mode, mmc core will call init_card() to
execute tuning:
- sdio: init_card can be executed at runtime resume.
- sd and mmc: init_card can be executed at resume or runtime resume,
which depends on MMC_CAP_RUNTIME_RESUME capability.
On rk3288 SoC, host will get DRTO interrupt when host send command
to read tuning data. This will spend more than 111ms:
drto_ms = drto_clks * 1000 / bus_hz = 111ms.
And the total tuning time will be more than 400ms.
So we should add MMC_CAP_RUNTIME_RESUME capability to execute tuning
at runtime resume. Only if we do so, can we pass resume test.
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Addy Ke <addy.ke@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Implements HS400 mode support for exynos host driver.
This also include some updates as new mode is added.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
[Alim: addressed review comments]
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When ioremap_wc() or ioremap_cached() are used without first including
asm/pgtable.h, the _PAGE_CACHEABLE or _PAGE_WR_COMBINE definitions
aren't found, resulting in build errors like the following (in
next-20150323 due to "lib: devres: add a helper function for
ioremap_wc"):
lib/devres.c: In function ‘devm_ioremap_wc’:
lib/devres.c:91: error: ‘_PAGE_WR_COMBINE’ undeclared
We can't easily include asm/pgtable.h in asm/io.h due to dependency
problems, so split out the _PAGE_* definitions from asm/pgtable.h into a
separate asm/pgtable-bits.h header (as a couple of other architectures
already do), and include that in io.h instead.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: Abhilash Kesavan <a.kesavan@samsung.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make the code which sets up the pmd depend on PT_NLEVELS == 3, not on
CONFIG_64BIT. The reason is, that a 64bit kernel with a page size
greater than 4k doesn't need the pmd and thus has PT_NLEVELS = 2.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The patch dc6c9a35b66b520cf67e05d8ca60ebecad3b0479 that counts pmds
allocated for a process introduced a bug on 64-bit PA-RISC kernels.
The PA-RISC architecture preallocates one pmd with each pgd. This
preallocated pmd can never be freed - pmd_free does nothing when it is
called with this pmd. When the kernel attempts to free this preallocated
pmd, it decreases the count of allocated pmds. The result is that the
counter underflows and this error is reported.
This patch fixes the bug by artifically incrementing the counter in
pmd_free when the kernel tries to free the preallocated pmd.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
This allows us to remove some unnecessary ifdefs. There should
be no change to the generated code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f7e00f0d668e253abf0bd8bf36491ac47bd761ff.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
It has no callers anymore.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a594afd6a0bddb1311bd7c92a15201c87fbb8681.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
user_mode_vm() and user_mode() are now the same. Change all callers
of user_mode_vm() to user_mode().
The next patch will remove the definition of user_mode_vm.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/43b1f57f3df70df5a08b0925897c660725015554.1426728647.git.luto@kernel.org
[ Merged to a more recent kernel. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
user_mode() is now identical to user_mode_vm(). Subsequent patches
will change all callers of user_mode_vm() to user_mode() and then
delete user_mode_vm().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0dd03eacb5f0a2b5ba0240de25347a31b493c289.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
A few of the user_mode() checks in traps.c are immediately after
explicit checks for vm86 mode. Change them to user_mode_ignore_vm86().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0b324d5b75c3402be07f8d3c6245ed7f4995029e.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There's no point in checking the VM bit on 64-bit, and, since
we're explicitly checking it, we can use user_mode_ignore_vm86()
after the check.
While we're at it, rearrange the #ifdef slightly to make the code
flow a bit clearer.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dc1457a734feccd03a19bb3538a7648582f57cdd.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
user_mode() is dangerous and user_mode_vm() has a confusing name.
Add user_mode_ignore_vm86() (equivalent to current user_mode()).
We'll change the small number of legitimate users of user_mode()
to user_mode_ignore_vm86().
Inspired by grsec, although this works rather differently.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/202c56ca63823c338af8e2e54948dbe222da6343.1426728647.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Conflicts:
arch/x86/kernel/entry_64.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The digicolor_set_gc() routine is only called from __init annotated
digicolor_of_init(). Annotate digicolor_set_gc() with __init as well to save a
few bytes at run time.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lkml.kernel.org/r/a3b57ecdbe0b07f55c20c07ff98f1f694275722d.1427009985.git.baruch@tkos.co.il
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
When debugging the latencies on a 40 core box, where we hit 300 to
500 microsecond latencies, I found there was a huge contention on the
runqueue locks.
Investigating it further, running ftrace, I found that it was due to
the pulling of RT tasks.
The test that was run was the following:
cyclictest --numa -p95 -m -d0 -i100
This created a thread on each CPU, that would set its wakeup in iterations
of 100 microseconds. The -d0 means that all the threads had the same
interval (100us). Each thread sleeps for 100us and wakes up and measures
its latencies.
cyclictest is maintained at:
git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git
What happened was another RT task would be scheduled on one of the CPUs
that was running our test, when the other CPU tests went to sleep and
scheduled idle. This caused the "pull" operation to execute on all
these CPUs. Each one of these saw the RT task that was overloaded on
the CPU of the test that was still running, and each one tried
to grab that task in a thundering herd way.
To grab the task, each thread would do a double rq lock grab, grabbing
its own lock as well as the rq of the overloaded CPU. As the sched
domains on this box was rather flat for its size, I saw up to 12 CPUs
block on this lock at once. This caused a ripple affect with the
rq locks especially since the taking was done via a double rq lock, which
means that several of the CPUs had their own rq locks held while trying
to take this rq lock. As these locks were blocked, any wakeups or load
balanceing on these CPUs would also block on these locks, and the wait
time escalated.
I've tried various methods to lessen the load, but things like an
atomic counter to only let one CPU grab the task wont work, because
the task may have a limited affinity, and we may pick the wrong
CPU to take that lock and do the pull, to only find out that the
CPU we picked isn't in the task's affinity.
Instead of doing the PULL, I now have the CPUs that want the pull to
send over an IPI to the overloaded CPU, and let that CPU pick what
CPU to push the task to. No more need to grab the rq lock, and the
push/pull algorithm still works fine.
With this patch, the latency dropped to just 150us over a 20 hour run.
Without the patch, the huge latencies would trigger in seconds.
I've created a new sched feature called RT_PUSH_IPI, which is enabled
by default.
When RT_PUSH_IPI is not enabled, the old method of grabbing the rq locks
and having the pulling CPU do the work is implemented. When RT_PUSH_IPI
is enabled, the IPI is sent to the overloaded CPU to do a push.
To enabled or disable this at run time:
# mount -t debugfs nodev /sys/kernel/debug
# echo RT_PUSH_IPI > /sys/kernel/debug/sched_features
or
# echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched_features
Update: This original patch would send an IPI to all CPUs in the RT overload
list. But that could theoretically cause the reverse issue. That is, there
could be lots of overloaded RT queues and one CPU lowers its priority. It would
then send an IPI to all the overloaded RT queues and they could then all try
to grab the rq lock of the CPU lowering its priority, and then we have the
same problem.
The latest design sends out only one IPI to the first overloaded CPU. It tries to
push any tasks that it can, and then looks for the next overloaded CPU that can
push to the source CPU. The IPIs stop when all overloaded CPUs that have pushable
tasks that have priorities greater than the source CPU are covered. In case the
source CPU lowers its priority again, a flag is set to tell the IPI traversal to
restart with the first RT overloaded CPU after the source CPU.
Parts-suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Joern Engel <joern@purestorage.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150318144946.2f3cc982@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When CONFIG_IRQ_WORK is not defined (difficult to do, as it also
requires CONFIG_PRINTK not to be defined), we get a build failure:
kernel/built-in.o: In function `flush_smp_call_function_queue':
kernel/smp.c:263: undefined reference to `irq_work_run'
kernel/smp.c:263: undefined reference to `irq_work_run'
Makefile:933: recipe for target 'vmlinux' failed
Simplest thing to do is to make irq_work_run() a nop when not set.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150319101851.4d224d9b@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
applying new patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The hrtimer mode of broadcast queues hrtimers in the idle entry
path so as to wakeup cpus in deep idle states. The associated
call graph is :
cpuidle_idle_call()
|____ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, ....))
|_____tick_broadcast_set_event()
|____clockevents_program_event()
|____bc_set_next()
The hrtimer_{start/cancel} functions call into tracing which uses RCU.
But it is not legal to call into RCU in cpuidle because it is one of the
quiescent states. Hence protect this region with RCU_NONIDLE which informs
RCU that the cpu is momentarily non-idle.
As an aside it is helpful to point out that the clock event device that is
programmed here is not a per-cpu clock device; it is a
pseudo clock device, used by the broadcast framework alone.
The per-cpu clock device programming never goes through bc_set_next().
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Cc: mpe@ellerman.id.au
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/r/20150318104705.17763.56668.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Module unload calls lockdep_free_key_range(), which removes entries
from the data structures. Most of the lockdep code OTOH assumes the
data structures are append only; in specific see the comments in
add_lock_to_list() and look_up_lock_class().
Clearly this has only worked by accident; make it work proper. The
actual scenario to make it go boom would involve the memory freed by
the module unlock being re-allocated and re-used for a lock inside of
a rcu-sched grace period. This is a very unlikely scenario, still
better plug the hole.
Use RCU list iteration in all places and ammend the comments.
Change lockdep_free_key_range() to issue a sync_sched() between
removal from the lists and returning -- which results in the memory
being freed. Further ensure the callers are placed correctly and
comment the requirements.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Tsyvarev <tsyvarev@ispras.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When non-realtime tasks get priority-inheritance boosted to a realtime
scheduling class, RLIMIT_RTTIME starts to apply to them. However, the
counter used for checking this (the same one used for SCHED_RR
timeslices) was not getting reset. This meant that tasks running with a
non-realtime scheduling class which are repeatedly boosted to a realtime
one, but never block while they are running realtime, eventually hit the
timeout without ever running for a time over the limit. This patch
resets the realtime timeslice counter when un-PI-boosting from an RT to
a non-RT scheduling class.
I have some test code with two threads and a shared PTHREAD_PRIO_INHERIT
mutex which induces priority boosting and spins while boosted that gets
killed by a SIGXCPU on non-fixed kernels but doesn't with this patch
applied. It happens much faster with a CONFIG_PREEMPT_RT kernel, and
does happen eventually with PREEMPT_VOLUNTARY kernels.
Signed-off-by: Brian Silverman <brian@peloton-tech.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: austin@peloton-tech.com
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1424305436-6716-1-git-send-email-brian@peloton-tech.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Vince reported a watchdog lockup like:
[<ffffffff8115e114>] perf_tp_event+0xc4/0x210
[<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
[<ffffffff810b7f10>] lock_release+0x130/0x260
[<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
[<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
[<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
[<ffffffff811f71ce>] send_sigio+0xae/0x100
[<ffffffff811f72b7>] kill_fasync+0x97/0xf0
[<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
[<ffffffff8115d103>] perf_pending_event+0x33/0x60
[<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
[<ffffffff8114e448>] irq_work_run+0x18/0x40
[<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
[<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
Which is caused by an irq_work generating new irq_work and therefore
not allowing forward progress.
This happens because processing the perf irq_work triggers another
perf event (tracepoint stuff) which in turn generates an irq_work ad
infinitum.
Avoid this by raising the recursion counter in the irq_work -- which
effectively disables all software events (including tracepoints) from
actually triggering again.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The external IRQ controller has a functional clock, which is used for
power management. Document it.
Fix a typo in the r8a73a4 SoC name while we're at it.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-4-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
This is just enough to let pm_clk_*() enable the functional clock, and
manage it for suspend/resume, if present.
Before, it was assumed enabled by the bootloader or reset state.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-3-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Link: https://lkml.kernel.org/r/1426704961-27322-2-git-send-email-geert+renesas@glider.be
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
strcmp() is always expected to return 0 when arguments are equal,
negative when its first argument @str1 is less than its second argument
@str2 and a positive value otherwise. Previously strcmp("a", "b")
returned 1. Now it gives -1, as it is supposed to.
Until now this bug never triggered, because all uses for strcmp() in the
boot code tested for nonzero:
triton:~/tip> git grep strcmp arch/x86/boot/
arch/x86/boot/boot.h:int strcmp(const char *str1, const char *str2);
arch/x86/boot/edd.c: if (!strcmp(eddarg, "skipmbr") || !strcmp(eddarg, "skip")) {
arch/x86/boot/edd.c: else if (!strcmp(eddarg, "off"))
arch/x86/boot/edd.c: else if (!strcmp(eddarg, "on"))
should in the future strcmp() be used in a comparative way in the boot
code, it might have led to (not so subtle) bugs.
Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426520267-1803-1-git-send-email-arjun024@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The private pointer provided by the cacheinfo code is used to implement
the AMD L3 cache-specific attributes using a pointer to the northbridge
descriptor. It is needed for performing L3-specific operations and for
that we need a couple of PCI devices and other service information, all
contained in the northbridge descriptor.
This results in failure of cacheinfo setup as shown below as
cache_get_priv_group() returns the uninitialised private attributes which
are not valid for Intel processors.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1 at fs/sysfs/group.c:102
internal_create_group+0x151/0x280()
sysfs: (bin_)attrs not set by subsystem for group: index3/
Modules linked in:
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc3+ #1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014
...
Call Trace:
dump_stack
warn_slowpath_common
warn_slowpath_fmt
internal_create_group
sysfs_create_groups
device_add
cpu_device_create
? __kmalloc
cache_add_dev
cacheinfo_sysfs_init
? container_dev_init
do_one_initcall
kernel_init_freeable
? rest_init
kernel_init
ret_from_fork
? rest_init
This patch fixes the issue by checking if the L3 cache indices are
populated correctly (AMD-specific) before initializing the private
attributes.
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Had some strange 3 tabs + 2 chars indentation, probably from me. Fix it.
No code changed:
# arch/x86/kernel/cpu/mcheck/mce.o:
text data bss dec hex filename
21371 5923 264 27558 6ba6 mce.o.before
21371 5923 264 27558 6ba6 mce.o.after
md5:
eb3996c84d15e08ed836f043df2cbb01 mce.o.before.asm
eb3996c84d15e08ed836f043df2cbb01 mce.o.after.asm
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Certain MSRs are only relevant to a kernel in host mode, and kvm had
chosen not to implement these MSRs at all for guests. If a guest kernel
ever tried to access these MSRs, the result was a general protection
fault.
KVM will be separately patched to return 0 when these MSRs are read,
and this patch ensures that MSR accesses are tolerant of exceptions.
Signed-off-by: Jesse Larrew <jesse.larrew@amd.com>
[ Drop {} braces around loop ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Joel Schopp <joel.schopp@amd.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/1426262619-5016-1-git-send-email-jesse.larrew@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Now that eager_fpu_init_bp() does setup_init_fpu_buf() only and
nothing else, we can remove it and move this code into its "caller",
eager_fpu_init().
This avoids the confusing games with "static __refdata void (*boot_func)":
init_xstate_buf can be NULL only during boot, so it is safe to call the
__init-annotated setup_init_fpu_buf() function in eager_fpu_init(), we
just need to mark it as __init_refok.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/20150314151334.GC13029@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|