Age | Commit message (Collapse) | Author |
|
The following commit:
commit 7e348b9012522fa0efd854d20d210d5e57fcedd1
Author: Robert Lee <rob.lee@linaro.org>
Date: Tue Mar 20 15:22:43 2012 -0500
ARM: at91: Consolidate time keeping and irq enable
Enable core cpuidle timekeeping and irq enabling and remove that
handling from this code.
introduced an additional zero to the state1 (suspend) target residency.
With a periodic tick, the cpu never enters the state1 with both 10000 and
100000.
With a tickless system, it enters to state1 much more often with the
initial value, roughly x7 more.
Fix it by setting the value to 10ms again.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[nicola.ferre@atmel.com: add precisions given by Daniel to commit message]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
|
|
Since 4b68520dc0ec96153bc0d87bca5ffba508edfcf
ARM: at91: add AIC5 support
we allocate the at91_extern_irq.
This patch makes it static and stores the non-dt extern irq in the soc
structure. It is then possible to use a at91_get_extern_irq() function
to get the value for outside of the irq driver. It is useful for passing
its value to at91_aic_init().
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
[nicolas.ferre@atmel.com: rework commit message]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
|
|
Compiling without CONFIG_X86_LOCAL_APIC set, apic.c will not be
compiled, and the irq tracepoints will not be created via the
CREATE_TRACE_POINTS macro. When CONFIG_X86_LOCAL_APIC is not set,
we get the following build error:
LD init/built-in.o
arch/x86/built-in.o: In function `trace_x86_platform_ipi_entry':
linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_entry'
arch/x86/built-in.o: In function `trace_x86_platform_ipi_exit':
linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_exit'
arch/x86/built-in.o: In function `trace_irq_work_entry':
linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_entry'
arch/x86/built-in.o: In function `trace_irq_work_exit':
linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_exit'
arch/x86/built-in.o:(__jump_table+0x8): undefined reference to `__tracepoint_x86_platform_ipi_entry'
arch/x86/built-in.o:(__jump_table+0x14): undefined reference to `__tracepoint_x86_platform_ipi_exit'
arch/x86/built-in.o:(__jump_table+0x20): undefined reference to `__tracepoint_irq_work_entry'
arch/x86/built-in.o:(__jump_table+0x2c): undefined reference to `__tracepoint_irq_work_exit'
make[1]: *** [vmlinux] Error 1
make: *** [sub-make] Error 2
As irq.c is always compiled for x86, it is a more appropriate location
to create the irq tracepoints.
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
next/boards
From Michal Simek:
arm: Xilinx Zynq defconfig changes for v3.11
Enable zynq uartps driver and initrd in defconfig.
* tag 'zynq-defconfig-for-3.11' of git://git.xilinx.com/linux-xlnx:
arm: multi_v7_defconfig: Enable initrd/initramfs support
arm: multi_v7_defconfig: Enable Zynq UART driver
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://github.com/broadcom/bcm11351 into next/cleanup
From Christian Daudt:
* 'armsoc/for-3.11/cleanups' of git://github.com/broadcom/bcm11351:
ARM: bcm281xx: Remove init_irq declaration in machine description
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
next/dt
From Christian Daudt:
* 'armsoc/for-3.11/dt' of git://github.com/broadcom/bcm11351:
ARM: dts: bcm281xx: change comment to C89 style
ARM: mmc: bcm281xx SDHCI driver (dt mods)
ARM: dts: bcm281xx: use existing defines for irqs
ARM: dts: bcm281xx: use #include for device tree files
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/dt
From Simon Horman:
Second Round of Renesas ARM-based SoC DT updates for v3.11
* Increased DT coverage for renesas-intc-irqpin
by Guennadi Liakhovetski
* Clean up of address format used in sh73a0 dtsi file
by Guennadi Liakhovetski
* tag 'renesas-dt2-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: irqpin: add a DT property to enable masking on parent
ARM: shmobile: sh73a0: remove "0x" prefix from DT node names
irqchip: renesas-intc-irqpin: DT binding for sense bitfield width
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers
From Simon Horman:
Second Round of Renesas ARM based SoC GPIO R-Car updates for v3.11
Documentation enhancement and code cleanup by Laurent Pinchart.
* tag 'renesas-gpio-rcar2-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
gpio-rcar: Remove #ifdef CONFIG_OF around OF-specific sections
gpio-rcar: Reference core gpio documentation in the DT bindings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers
From Simon Horman:
Second Round of Renesas ARM based SoC pinmux and GPIO update for v3.11
tidyup MMC_D1 pin for r8a7778 SoC
fix two pin numbers and add HSCIF pin groups to r8a7790 SoC
add pinmux data for MMCIF and SDHI interfaces for r8a73a4 SoC
* tag 'renesas-pinmux2-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
sh-pfc: r8a7778: tidyup MMC_D1 pin
pinctrl: r8a7790: fix two pin numbers
sh-pfc: r8a7790: add HSCIF pin groups
pinctrl: r8a73a4: add pinmux data for MMCIF and SDHI interfaces
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/boards
From Simon Horman:
Second Round of Renesas ARM-based SoC board updates for v3.11
* Extended hardware coverage for the Bock-W board
by Goda-san and Morimoto-san
* Correction to Ether device name for the Bock-W board
from Sergei Shtylyov
* tag 'renesas-boards2-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: BOCK-W: change Ether device name
ARM: shmobile: bockw: add MMCIF support
ARM: shmobile: bockw: add SPI FLASH support
ARM: shmobile: bockw: add I2C device support
ARM: shmobile: BOCK-W: add Ether support
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/late
From Kukjin Kim:
based on tags/common-clk-audio
- add support for exynos5420 SoC
* tag 'soc-exynos5420-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: EXYNOS: extend soft-reset support for EXYNOS5420
ARM: EXYNOS: add secondary CPU boot base location for EXYNOS5420
clocksource: exynos_mct: use (request/free)_irq calls for local timer registration
ARM: dts: Add initial device tree support for EXYNOS5420
clk: exynos5420: register clocks using common clock framework
ARM: EXYNOS: use four additional chipid bits to identify EXYNOS family
serial: samsung: select EXYNOS specific driver data if ARCH_EXYNOS is defined
ARM: EXYNOS: Add support for EXYNOS5420 SoC
ARM: dts: list the CPU nodes for EXYNOS5250
ARM: dts: fork out common EXYNOS5 nodes
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/soc
From Simon Horman:
Renesas ARM based SoC boot cleanup for v3.11
Work by Magnus Damm and others to clean up the boot of and move
things closer to supporting multi-arch.
As a side effect of this work it was decided to remove support for
two boards, Bonito and AP4EVB. Those patches are included in this
series as they depend on earlier patches in the series.
* tag 'renesas-cleanup-boot-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: Remove Bonito board support
ARM: shmobile: Remove AP4EVB board support
ARM: shmobile: Remove mach/memory.h
ARM: shmobile: Remove MEMORY_START/SIZE
ARM: shmobile: Enable ARM_PATCH_PHYS_VIRT
ARM: shmobile: Remove old SCU boot code
ARM: shmobile: EMEV2 SMP with SCU boot fn and args
ARM: shmobile: sh73a0 SMP with SCU boot fn and args
ARM: shmobile: r8a7779 SMP with SCU boot fn and args
ARM: shmobile: Add SCU boot function using argument
ARM: shmobile: Add SMP boot function and argument
ARM: shmobile: Rework sh7372 sleep code to use virt_to_phys()
ARM: shmobile: Remove romImage CONFIG_MEMORY_START
ARM: shmobile: Let romImage rely on default ATAGS
ARM: shmobile: uImage load address rework
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/late
From Simon Horman:
Renesas ARM based SoC cleanups for v3.11
__initdata annotations for the r8a7790 SoC by Morimoto-san.
* tag 'renesas-cleanup-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: (158 commits)
ARM: shmobile: r8a7790: add __initdata on resource and device data
Based on 'renesas-pinmux-for-v3.11' and 'renesas-soc-for-v3.11
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Patches from Lee Jones:
This gets rid of mop500_snowball_ethernet_clock_enable() which is no
longer in use. It also straightens out a bug which ensures the SMSC911x's
regulator is turned on at start-up when using Device Tree.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
mop500_snowball_ethernet_clock_enable() provided a means to enable a
clock which was used for the SMSC911x Ethernet device on Snowball. It
was merely a stand-in until the driver was common clk compliant. Now
that it is, this can be removed for both DT and ATAGs booting.
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
When this node was added, the AB8500 GPIO driver was pretty broken.
As a hack, we pretended that NOMADIK GPIO 26 was the correct on/off
pin, as it was unused. It worked because AB8500 GPIO 26 was in an
'always on from boot' state. Now the AB8500 GPIO driver is working,
the default state for all the pins is 'off'. Let's flip back over to
use the correct GPIO which is _actually_ attached to the regulator.
We're also taking the opportunity to straighten out some formatting
misdemeanours, swapping spaces for tabs.
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Here we're adding a node for the AB8500 GPIO device. This will allow
other DT:ed components to obtain GPIOs for use within their drivers.
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
In sound/usb/card.c and sound/usb/misc/ua101.c there are no spaces
between the vendor and the device names, use this style in the other
drivers too.
This also helps keeping consistency when new drivers copies from the
ones already in the mainline tree.
Signed-off-by: Antonio Ospite <ao2@amarulasolutions.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
For USB devices it's not necessary to allocate physically contiguous
buffers.
Signed-off-by: Antonio Ospite <ao2@amarulasolutions.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
For USB devices it's not necessary to allocate physically contiguous
buffers.
Signed-off-by: Antonio Ospite <ao2@amarulasolutions.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The snd_card_used variable is only read but never written, remove it.
Signed-off-by: Antonio Ospite <ao2@amarulasolutions.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
This code is older than git, and I haven't tested it, but if size ==
SIZE_MAX_STATUS then we would write one space past the end of the
rmh->Stat[] array.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
More than 256 entries in ACPI MADT is supported from ACPI 3.0, so the
information in should be Documentation/cpu-hotplug.txt updated.
[rjw: Changelog]
Suggested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The recent modification in the cpuidle framework consolidated the
timer broadcast code across the different drivers by setting a new
flag in the idle state. It tells the cpuidle core code to enter/exit
the broadcast mode for the cpu when entering a deep idle state. The
broadcast timer enter/exit is no longer handled by the back-end
driver.
This change made the local interrupt to be enabled *before* calling
CLOCK_EVENT_NOTIFY_EXIT.
On a tegra114, a four cores system, when the flag has been introduced
in the driver, the following warning appeared:
WARNING: at kernel/time/tick-broadcast.c:578 tick_broadcast_oneshot_control
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-rc3-next-20130529+ #15
[<c00667f8>] (tick_broadcast_oneshot_control+0x1a4/0x1d0) from [<c0065cd0>] (tick_notify+0x240/0x40c)
[<c0065cd0>] (tick_notify+0x240/0x40c) from [<c0044724>] (notifier_call_chain+0x44/0x84)
[<c0044724>] (notifier_call_chain+0x44/0x84) from [<c0044828>] (raw_notifier_call_chain+0x18/0x20)
[<c0044828>] (raw_notifier_call_chain+0x18/0x20) from [<c00650cc>] (clockevents_notify+0x28/0x170)
[<c00650cc>] (clockevents_notify+0x28/0x170) from [<c033f1f0>] (cpuidle_idle_call+0x11c/0x168)
[<c033f1f0>] (cpuidle_idle_call+0x11c/0x168) from [<c000ea94>] (arch_cpu_idle+0x8/0x38)
[<c000ea94>] (arch_cpu_idle+0x8/0x38) from [<c005ea80>] (cpu_startup_entry+0x60/0x134)
[<c005ea80>] (cpu_startup_entry+0x60/0x134) from [<804fe9a4>] (0x804fe9a4)
I don't have the hardware, so I wasn't able to reproduce the warning
but after looking a while at the code, I deduced the following:
1. the CPU2 enters a deep idle state and sets the broadcast timer
2. the timer expires, the tick_handle_oneshot_broadcast function is
called, setting the tick_broadcast_pending_mask and waking up the
idle cpu CPU2
3. the CPU2 exits idle handles the interrupt and then invokes
tick_broadcast_oneshot_control with CLOCK_EVENT_NOTIFY_EXIT which
runs the following code:
[...]
if (dev->next_event.tv64 == KTIME_MAX)
goto out;
if (cpumask_test_and_clear_cpu(cpu,
tick_broadcast_pending_mask))
goto out;
[...]
So if there is no next event scheduled for CPU2, we fulfil the
first condition and jump out without clearing the
tick_broadcast_pending_mask.
4. CPU2 goes to deep idle again and calls
tick_broadcast_oneshot_control with CLOCK_NOTIFY_EVENT_ENTER but
with the tick_broadcast_pending_mask set for CPU2, triggering the
warning.
The issue only surfaced due to the modifications of the cpuidle
framework, which resulted in interrupts being enabled before the call
to the clockevents code. If the call happens before interrupts have
been enabled, the warning cannot trigger, because there is still the
event pending which caused the broadcast timer expiry.
Move the check for the next event below the check for the pending bit,
so the pending bit gets cleared whether an event is scheduled on the
cpu or not.
[ tglx: Massaged changelog ]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reported-and-tested-by: Joseph Lo <josephl@nvidia.com>
Cc: Stephen Warren <swarren@nvidia.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/1371485735-31249-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
* Don't leak random kernel memory to EFI variable NVRAM when attempting
to initiate garbage collection. Also, free the kernel memory when
we're done with it instead of leaking - Ben Hutchings
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
Currently the max8973 regulator driver uses a single static struct of
regulator operations for all chip instances, but can overwrite some of its
members depending on configuration. This will affect all other MAX8973
instances on the system. This patch fixes this bug by allocating a separate
copy of the struct for each chip instance.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski+renesas@gmail.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
|
|
ACPI part of the driver accidentally used sizeof(*ssp) instead of the
correct sizeof(*pdata). This leads to nasty memory corruptions like the one
below:
BUG: unable to handle kernel paging request at 0000000749fd30b8
IP: [<ffffffff813fe8a1>] __list_del_entry+0x31/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted 3.10.0-rc6v3.10-rc6_sdhci_modprobe+ #443
task: ffff8801483a0940 ti: ffff88014839e000 task.ti: ffff88014839e000
RIP: 0010:[<ffffffff813fe8a1>] [<ffffffff813fe8a1>] __list_del_entry+0x31/0xd0
RSP: 0000:ffff88014839fde8 EFLAGS: 00010046
RAX: ffff880149fd30b0 RBX: ffff880149fd3040 RCX: dead000000200200
RDX: 0000000749fd30b0 RSI: ffff880149fd3058 RDI: ffff88014834d640
RBP: ffff88014839fde8 R08: ffff88014834d640 R09: 0000000000000001
R10: ffff8801483a0940 R11: 0000000000000001 R12: ffff880149fd3040
R13: ffffffff810e0b30 R14: ffff8801483a0940 R15: ffff88014834d640
FS: 0000000000000000(0000) GS:ffff880149e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000168 CR3: 0000000001e0b000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88014839fe48 ffffffff810e0baf ffffffff81120abd ffff88014839fe20
ffff8801483a0940 ffff8801483a0940 ffff8801483a0940 ffff8801486b1c90
ffff88014834d640 ffffffff810e0b30 0000000000000000 0000000000000000
Call Trace:
[<ffffffff810e0baf>] worker_thread+0x7f/0x390
[<ffffffff81120abd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff810e0b30>] ? manage_workers.isra.22+0x2b0/0x2b0
[<ffffffff810e6c09>] kthread+0xd9/0xe0
[<ffffffff810f93df>] ? local_clock+0x3f/0x50
[<ffffffff810e6b30>] ? kthread_create_on_node+0x110/0x110
[<ffffffff818c5dec>] ret_from_fork+0x7c/0xb0
[<ffffffff810e6b30>] ? kthread_create_on_node+0x110/0x110
Fix this by using the right structure size in devm_kzalloc().
Reported-by: Jerome Blin <jerome.blin@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org # 3.9+
|
|
1. Check for allocation failure
2. Clear the buffer contents, as they may actually be written to flash
3. Don't leak the buffer
Compile-tested only.
[ Tested successfully on my buggy ASUS machine - Matt ]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into next/soc
From Heiko Stuebner:
Adds basic support for Rockchip Cortex-A9 SoCs.
* tag 'v3.11-rockchip-basics' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm: add basic support for Rockchip RK3066a boards
arm: add debug uarts for rockchip rk29xx and rk3xxx series
arm: Add basic clocks for Rockchip rk3066a SoCs
clocksource: dw_apb_timer_of: use clocksource_of_init
clocksource: dw_apb_timer_of: select DW_APB_TIMER
clocksource: dw_apb_timer_of: add clock-handling
clocksource: dw_apb_timer_of: enable the use the clocksource as sched clock
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
This adds a generic devicetree board file and a dtsi for boards
based on the RK3066a SoCs from Rockchip.
Apart from the generic parts (gic, clocks, pinctrl) the only components
currently supported are the timers, uarts and mmc ports (all DesignWare-
based).
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Olof Johansson <olof@lixom.net>
|
|
Uarts on all recent Rockchip SoCs are Synopsis DesignWare 8250 types.
Only their addresses vary very much.
This patch adds the necessary definitions to use any of the uart ports
for early debug purposes.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
This adds a basic clock setup for rk3066a SoCs. Only the gates are
set up currently, as the mux and dividers should use the upcoming
generic devicetree bindings.
Clocks whose rates need to be known are supplied by fixed-rate
"dummy"-clocks that provide the correct rate. This is uncritical insofar
that the only bootloader currently in existence for Rockchip devices
is the propietary Rockchip one that always setups the clocks in the
necessary way.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Mike Turquette <mturquette@linaro.org>
|
|
Add CONFIG_BLK_DEV_INITRD to the defconfig to support
initramfs and initrd.
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
|
|
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
|
|
Hugepage invalidate involves invalidating multiple hpte entries.
Optimize the operation using H_BULK_REMOVE on lpar platforms.
On native, reduce the number of tlb flush.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We enable only if the we support 16MB page size.
Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We find all the overlapping vma and mark them such that we don't allocate
hugepage in that range. Also we split existing huge page so that the
normal page hash can be invalidated and new page faulted in with new
protection bits.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
With THP we set pmd to none, before we do pte_clear. Hence we can't
walk page table to get the pte lock ptr and verify whether it is locked.
THP do take pte lock before calling pte_clear. So we don't change the locking
rules here. It is that we can't use page table walking to check whether
pte locks are held with THP.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
GCC is very likely to read the pagetables just once and cache them in
the local stack or in a register, but it is can also decide to re-read
the pagetables. The problem is that the pagetable in those places can
change from under gcc.
With THP/hugetlbfs the pmd (and pgd for hugetlbfs giga pages) can
change under gup_fast. The pages won't be freed untill we finish
gup fast because we have irq disabled and we free these pages via
rcu callback.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We need to have irqs disabled to handle all the possible parallel update for
linux page table without holding locks.
Events that we are intersted in while walking page tables are
1) Page fault
2) umap
3) THP split
4) THP collapse
A) local_irq_disabled:
------------------------
1) page fault:
A none to valid transition via page fault is not an issue because we
would either see a none or valid. If it is none, we would error out
the page table walk. We may need to use on stack values when checking for
type of page table elements, because if we do
if (!is_hugepd()) {
if (!pmd_none() {
if (pmd_bad() {
We could take that bad condition because the pmd got converted to a hugepd
after the !is_hugepd check via a hugetlb fault.
The right way would be to check for pmd_none higher up or use on stack value.
2) A valid to none conversion via unmap:
We can safely walk the upper level table, because we don't remove the the
page table entries until rcu grace period. So even if we followed a
wrong pointer we still have the pointer valid till the grace period.
A PTE pointer returned need to be atomically checked for _PAGE_PRESENT and
_PAGE_BUSY. A valid pointer returned could becoming none later. To prevent
pte_clear we take _PAGE_BUSY.
3) THP split:
A valid transparent hugepage is converted to nomal page. Before we split we
do pmd_splitting_flush, which sets the hugepage PTE to _PAGE_SPLITTING
So when walking page table we need to check for pmd_trans_splitting and
handle that. The pte returned should also need to be checked for
_PAGE_SPLITTING before setting _PAGE_BUSY similar to _PAGE_PRESENT. We save
the value of PTE on stack and check for the flag in the local pte value.
If we don't have the value set we can safely operate on the local pte value
and we atomicaly set _PAGE_BUSY.
4) THP collapse:
A normal page gets converted to hugepage. In the collapse path, we
mark the pmd none early (pmdp_clear_flush). With irq disabled, if we
are aleady walking page table we would see the pmd_none and won't continue.
If we see a valid PMD, we should still check for _PAGE_PRESENT before
setting _PAGE_BUSY, to make sure we didn't collapse the PTE to a Huge PTE.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The deposted PTE page in the second half of the PMD table is used to
track the state on hash PTEs. After updating the HPTE, we mark the
coresponding slot in the deposted PTE page valid.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We can find pte that are splitting while walking page tables. Return
None pte in that case.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly
document why we don't need to handle transparent hugepages at callsites.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We will use this in the later patch for handling THP pages
Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We now have pmd entries covering 16MB range and the PMD table double its original size.
We use the second half of the PMD table to deposit the pgtable (PTE page).
The depoisted PTE page is further used to track the HPTE information. The information
include [ secondary group | 3 bit hidx | valid ]. We use one byte per each HPTE entry.
With 16MB hugepage and 64K HPTE we need 256 entries and with 4K HPTE we need
4096 entries. Both will fit in a 4K PTE page. On hugepage invalidate we need to walk
the PTE page and invalidate all valid HPTEs.
This patch implements necessary arch specific functions for THP support and also
hugepage invalidate logic. These PMD related functions are intentionally kept
similar to their PTE counter-part.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
THP code does PTE page allocation along with large page request and deposit them
for later use. This is to ensure that we won't have any failures when we split
hugepages to regular pages.
On powerpc we want to use the deposited PTE page for storing hash pte slot and
secondary bit information for the HPTEs. We use the second half
of the pmd table to save the deposted PTE page.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
If a hash bucket gets full, we "evict" a more/less random entry from it.
When we do that we don't invalidate the TLB (hpte_remove) because we assume
the old translation is still technically "valid". This implies that when
we are invalidating or updating pte, even if HPTE entry is not valid
we should do a tlb invalidate. With hugepages, we need to pass the correct
actual page size value for tlb invalidation.
This change update the patch 0608d692463598c1d6e826d9dd7283381b4f246c
"powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle
transparent hugepages correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch creates debugfs entries (powerpc/PCIxxxx/err_injct) for
injecting EEH errors for testing purpose.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|