Age | Commit message (Collapse) | Author |
|
This patch removes the NUMA PTE bits and associated helpers. As a
side-effect it increases the maximum possible swap space on x86-64.
One potential source of problems is races between the marking of PTEs
PROT_NONE, NUMA hinting faults and migration. It must be guaranteed that
a PTE being protected is not faulted in parallel, seen as a pte_none and
corrupting memory. The base case is safe but transhuge has problems in
the past due to an different migration mechanism and a dependance on page
lock to serialise migrations and warrants a closer look.
task_work hinting update parallel fault
------------------------ --------------
change_pmd_range
change_huge_pmd
__pmd_trans_huge_lock
pmdp_get_and_clear
__handle_mm_fault
pmd_none
do_huge_pmd_anonymous_page
read? pmd_lock blocks until hinting complete, fail !pmd_none test
write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
pmd_modify
set_pmd_at
task_work hinting update parallel migration
------------------------ ------------------
change_pmd_range
change_huge_pmd
__pmd_trans_huge_lock
pmdp_get_and_clear
__handle_mm_fault
do_huge_pmd_numa_page
migrate_misplaced_transhuge_page
pmd_lock waits for updates to complete, recheck pmd_same
pmd_modify
set_pmd_at
Both of those are safe and the case where a transhuge page is inserted
during a protection update is unchanged. The case where two processes try
migrating at the same time is unchanged by this series so should still be
ok. I could not find a case where we are accidentally depending on the
PTE not being cleared and flushed. If one is missed, it'll manifest as
corruption problems that start triggering shortly after this series is
merged and only happen when NUMA balancing is enabled.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ppc64 should not be depending on DSISR_PROTFAULT and it's unexpected if
they are triggered. This patch adds warnings just in case they are being
accidentally depended upon.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert existing users of pte_numa and friends to the new helper. Note
that the kernel is broken after this patch is applied until the other page
table modifiers are also altered. This patch layout is to make review
easier.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is a preparatory patch that introduces protnone helpers for automatic
NUMA balancing.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time with:
- Generic page-table framework for ARM IOMMUs using the LPAE
page-table format, ARM-SMMU and Renesas IPMMU make use of it
already.
- Break out the IO virtual address allocator from the Intel IOMMU so
that it can be used by other DMA-API implementations too. The
first user will be the ARM64 common DMA-API implementation for
IOMMUs
- Device tree support for Renesas IPMMU
- Various fixes and cleanups all over the place"
* tag 'iommu-updates-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits)
iommu/amd: Convert non-returned local variable to boolean when relevant
iommu: Update my email address
iommu/amd: Use wait_event in put_pasid_state_wait
iommu/amd: Fix amd_iommu_free_device()
iommu/arm-smmu: Avoid build warning
iommu/fsl: Various cleanups
iommu/fsl: Use %pa to print phys_addr_t
iommu/omap: Print phys_addr_t using %pa
iommu: Make more drivers depend on COMPILE_TEST
iommu/ipmmu-vmsa: Fix IOMMU lookup when multiple IOMMUs are registered
iommu: Disable on !MMU builds
iommu/fsl: Remove unused fsl_of_pamu_ids[]
iommu/fsl: Fix section mismatch
iommu/ipmmu-vmsa: Use the ARM LPAE page table allocator
iommu: Fix trace_map() to report original iova and original size
iommu/arm-smmu: add support for iova_to_phys through ATS1PR
iopoll: Introduce memory-mapped IO polling macros
iommu/arm-smmu: don't touch the secure STLBIALL register
iommu/arm-smmu: make use of generic LPAE allocator
iommu: io-pgtable-arm: add non-secure quirk
...
|
|
Merge second set of updates from Andrew Morton:
"More of MM"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
vmstat: Reduce time interval to stat update on idle cpu
mm/page_owner.c: remove unnecessary stack_trace field
Documentation/filesystems/proc.txt: describe /proc/<pid>/map_files
mm: incorporate read-only pages into transparent huge pages
vmstat: do not use deferrable delayed work for vmstat_update
mm: more aggressive page stealing for UNMOVABLE allocations
mm: always steal split buddies in fallback allocations
mm: when stealing freepages, also take pages created by splitting buddy page
mincore: apply page table walker on do_mincore()
mm: /proc/pid/clear_refs: avoid split_huge_page()
mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)
mempolicy: apply page table walker on queue_pages_range()
arch/powerpc/mm/subpage-prot.c: use walk->vma and walk_page_vma()
memcg: cleanup preparation for page table walk
numa_maps: remove numa_maps->vma
numa_maps: fix typo in gather_hugetbl_stats
pagemap: use walk->vma instead of calling find_vma()
clear_refs: remove clear_refs_private->vma and introduce clear_refs_test_walk()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:
- Update of all defconfigs
- Addition of a bunch of config options to modernise our defconfigs
- Some PS3 updates from Geoff
- Optimised memcmp for 64 bit from Anton
- Fix for kprobes that allows 'perf probe' to work from Naveen
- Several cxl updates from Ian & Ryan
- Expanded support for the '24x7' PMU from Cody & Sukadev
- Freescale updates from Scott:
"Highlights include 8xx optimizations, some more work on datapath
device tree content, e300 machine check support, t1040 corenet
error reporting, and various cleanups and fixes"
* tag 'powerpc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (102 commits)
cxl: Add missing return statement after handling AFU errror
cxl: Fail AFU initialisation if an invalid configuration record is found
cxl: Export optional AFU configuration record in sysfs
powerpc/mm: Warn on flushing tlb page in kernel context
powerpc/powernv: Add OPAL soft-poweroff routine
powerpc/perf/hv-24x7: Document sysfs event description entries
powerpc/perf/hv-gpci: add the remaining gpci requests
powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated
powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper
perf: add PMU_EVENT_ATTR_STRING() helper
perf: provide sysfs_show for struct perf_pmu_events_attr
powerpc/kernel: Avoid initializing device-tree pointer twice
powerpc: Remove old compile time disabled syscall tracing code
powerpc/kernel: Make syscall_exit a local label
cxl: Fix device_node reference counting
powerpc/mm: bail out early when flushing TLB page
powerpc: defconfigs: add MTD_SPI_NOR (new dependency for M25P80)
perf/powerpc: reset event hw state when adding it to the PMU
powerpc/qe: Use strlcpy()
...
|
|
We don't have to use mm_walk->private to pass vma to the callback function
because of mm_walk->vma. And walk_page_vma() is useful if we walk over a
single vma.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
LKP has triggered a compiler warning after my recent patch "mm: account
pmd page tables to the process":
mm/mmap.c: In function 'exit_mmap':
>> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]
The code:
> 2857 WARN_ON(mm_nr_pmds(mm) >
2858 round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);
In this, on tile, we have FIRST_USER_ADDRESS defined as 0. round_up() has
the same type -- int. PUD_SHIFT.
I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
long. On every arch for consistency.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently we have many duplicates in definitions around
follow_huge_addr(), follow_huge_pmd(), and follow_huge_pud(), so this
patch tries to remove the m. The basic idea is to put the default
implementation for these functions in mm/hugetlb.c as weak symbols
(regardless of CONFIG_ARCH_WANT_GENERAL_HUGETL B), and to implement
arch-specific code only when the arch needs it.
For follow_huge_addr(), only powerpc and ia64 have their own
implementation, and in all other architectures this function just returns
ERR_PTR(-EINVAL). So this patch sets returning ERR_PTR(-EINVAL) as
default.
As for follow_huge_(pmd|pud)(), if (pmd|pud)_huge() is implemented to
always return 0 in your architecture (like in ia64 or sparc,) it's never
called (the callsite is optimized away) no matter how implemented it is.
So in such architectures, we don't need arch-specific implementation.
In some architecture (like mips, s390 and tile,) their current
arch-specific follow_huge_(pmd|pud)() are effectively identical with the
common code, so this patch lets these architecture use the common code.
One exception is metag, where pmd_huge() could return non-zero but it
expects follow_huge_pmd() to always return NULL. This means that we need
arch-specific implementation which returns NULL. This behavior looks
strange to me (because non-zero pmd_huge() implies that the architecture
supports PMD-based hugepage, so follow_huge_pmd() can/should return some
relevant value,) but that's beyond this cleanup patch, so let's keep it.
Justification of non-trivial changes:
- in s390, follow_huge_pmd() checks !MACHINE_HAS_HPAGE at first, and this
patch removes the check. This is OK because we can assume MACHINE_HAS_HPAGE
is true when follow_huge_pmd() can be called (note that pmd_huge() has
the same check and always returns 0 for !MACHINE_HAS_HPAGE.)
- in s390 and mips, we use HPAGE_MASK instead of PMD_MASK as done in common
code. This patch forces these archs use PMD_MASK, but it's OK because
they are identical in both archs.
In s390, both of HPAGE_SHIFT and PMD_SHIFT are 20.
In mips, HPAGE_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT - 3) and
PMD_SHIFT is define as (PAGE_SHIFT + PAGE_SHIFT + PTE_ORDER - 3), but
PTE_ORDER is always 0, so these are identical.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"Enumeration
- Move domain assignment from arm64 to generic code (Lorenzo Pieralisi)
- ARM: Remove artificial dependency on pci_sys_data domain (Lorenzo Pieralisi)
- ARM: Move to generic PCI domains (Lorenzo Pieralisi)
- Generate uppercase hex for modalias var in uevent (Ricardo Ribalda Delgado)
- Add and use generic config accessors on ARM, PowerPC (Rob Herring)
Resource management
- Free resources on failure in of_pci_get_host_bridge_resources() (Lorenzo Pieralisi)
- Fix infinite loop with ROM image of size 0 (Michel Dänzer)
PCI device hotplug
- Handle surprise add even if surprise removal isn't supported (Bjorn Helgaas)
Virtualization
- Mark AMD/ATI VGA devices that don't reset on D3hot->D0 transition (Alex Williamson)
- Add DMA alias quirk for Adaptec 3405 (Alex Williamson)
- Add Wellsburg (X99) to Intel PCH root port ACS quirk (Alex Williamson)
- Add ACS quirk for Emulex NICs (Vasundhara Volam)
MSI
- Fail MSI-X mappings if there's no space assigned to MSI-X BAR (Yijing Wang)
Freescale Layerscape host bridge driver
- Fix platform_no_drv_owner.cocci warnings (Julia Lawall)
NVIDIA Tegra host bridge driver
- Remove unnecessary tegra_pcie_fixup_bridge() (Lucas Stach)
Renesas R-Car host bridge driver
- Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov)
TI Keystone host bridge driver
- Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov)
- Fix misspelling of current function in debug output (Julia Lawall)
Xilinx AXI host bridge driver
- Fix harmless format string warning (Arnd Bergmann)
Miscellaneous
- Use standard parsing functions for ASPM sysfs setters (Chris J Arges)
- Add pci_device_to_OF_node() stub for !CONFIG_OF (Kevin Hao)
- Delete unnecessary NULL pointer checks (Markus Elfring)
- Add and use defines for PCIe Max_Read_Request_Size (Rafał Miłecki)
- Include clk.h instead of clk-private.h (Stephen Boyd)"
* tag 'pci-v3.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
PCI: Add pci_device_to_OF_node() stub for !CONFIG_OF
PCI: xilinx: Convert to use generic config accessors
PCI: xgene: Convert to use generic config accessors
PCI: tegra: Convert to use generic config accessors
PCI: rcar: Convert to use generic config accessors
PCI: generic: Convert to use generic config accessors
powerpc/powermac: Convert PCI to use generic config accessors
powerpc/fsl_pci: Convert PCI to use generic config accessors
ARM: ks8695: Convert PCI to use generic config accessors
ARM: sa1100: Convert PCI to use generic config accessors
ARM: integrator: Convert PCI to use generic config accessors
PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver
ARM: dts: versatile: add PCI controller binding
of/pci: Free resources on failure in of_pci_get_host_bridge_resources()
PCI: versatile: Add DT docs for ARM Versatile PB PCIe driver
PCI: Fail MSI-X mappings if there's no space assigned to MSI-X BAR
r8169: use PCI define for Max_Read_Request_Size
[SCSI] esas2r: use PCI define for Max_Read_Request_Size
tile: use PCI define for Max_Read_Request_Size
rapidio/tsi721: use PCI define for Max_Read_Request_Size
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The main RCU changes in this cycle are:
- Documentation updates.
- Miscellaneous fixes.
- Preemptible-RCU fixes, including fixing an old bug in the
interaction of RCU priority boosting and CPU hotplug.
- SRCU updates.
- RCU CPU stall-warning updates.
- RCU torture-test updates"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
rcu: Initialize tiny RCU stall-warning timeouts at boot
rcu: Fix RCU CPU stall detection in tiny implementation
rcu: Add GP-kthread-starvation checks to CPU stall warnings
rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
rcu: Optionally run grace-period kthreads at real-time priority
ksoftirqd: Use new cond_resched_rcu_qs() function
ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
rcutorture: Add more diagnostics in rcu_barrier() test failure case
torture: Flag console.log file to prevent holdovers from earlier runs
torture: Add "-enable-kvm -soundhw pcspk" to qemu command line
rcutorture: Handle different mpstat versions
rcutorture: Check from beginning to end of grace period
rcu: Remove redundant rcu_batches_completed() declaration
rcutorture: Drop rcu_torture_completed() and friends
rcu: Provide rcu_batches_completed_sched() for TINY_RCU
rcutorture: Use unsigned for Reader Batch computations
rcutorture: Make build-output parsing correctly flag RCU's warnings
rcu: Make _batches_completed() functions return unsigned long
rcutorture: Issue warnings on close calls due to Reader Batch blows
documentation: Fix smp typo in memory-barriers.txt
...
|
|
Kim Phillips reported following build failure.
LD init/built-in.o
mm/built-in.o: In function `free_pages_prepare':
mm/page_alloc.c:770: undefined reference to `.kernel_map_pages'
mm/built-in.o: In function `prep_new_page':
mm/page_alloc.c:933: undefined reference to `.kernel_map_pages'
mm/built-in.o: In function `map_pages':
mm/compaction.c:61: undefined reference to `.kernel_map_pages'
make: *** [vmlinux] Error 1
Reason for this problem is that commit 031bc5743f15
("mm/debug-pagealloc: make debug-pagealloc boottime configurable")
forgot to remove the old declaration of kernel_map_pages() for some
architectures. This patch removes them to fix build failure.
Reported-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
and 'core' into next
Conflicts:
drivers/iommu/Kconfig
drivers/iommu/Makefile
|
|
Function __flush_tlb_page() must only be called for user contexts, so
put in extra hardening to warn on calling it for kernel context.
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Register a notifier for a OPAL message indicating that the machine
should prepare itself for a graceful power off.
OPAL will tell us if the power off is a reboot or shutdown, but for now
we perform the same orderly_poweroff action.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:
"Highlights include 8xx optimizations, some more work on datapath device
tree content, e300 machine check support, t1040 corenet error reporting,
and various cleanups and fixes."
|
|
Currently a PAMU driver patch is very likely to receive some
checkpatch complaints about the code in the context of the
patch. This patch is an attempt to fix most of that and make
the driver more readable
Also fixed a subset of the sparse and coccinelle reported
issues.
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add the remaining gpci requests that contain counters suitable for use
by perf. Omit those that don't contain any counters (but note their
ommision).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This adds (in req-gen/) a framework for defining gpci counter requests.
It uses macro magic similar to ftrace.
Also convert the existing hv-gpci request structures and enum values to
use the new framework (and adjust old users of the structs and enum
values to cope with changes in naming).
In exchange for this macro disaster, we get autogenerated event listing
for GPCI in sysfs, build time field offset checking, and zero
duplication of information about GPCI requests.
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Retrieves and parses the 24x7 catalog on POWER systems that supply it
(right now, only POWER 8). Events are exposed via sysfs in the standard
fashion, and are all parameterized.
$ cd /sys/bus/event_source/devices/hv_24x7/events
$ cat HPM_CS_FROM_L4_LDATA__PHYS_CORE
domain=0x2,offset=0xd58,core=?,lpar=0x0
$ cat HPM_TLBIE__VCPU_HOME_CHIP
domain=0x4,offset=0x358,vcpu=?,lpar=?
where user is required to specify values for the fields with '?' (like
core, vcpu, lpar above), when specifying the event with the perf tool.
Catalog is (at the moment) only parsed on boot. It needs re-parsing
when a some hypervisor events occur. At that point we'll also need to
prevent old events from continuing to function (counter that is passed
in via spare space in the config values?).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Define a lite version of the EVENT_DEFINE_RANGE_FORMAT() that avoids
defining helper functions for the bit-field ranges.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
As commit 50ba08f3 ("of/fdt: Don't clear initial_boot_params
if fdt_check_header() fails") does, the device-tree pointer
"initial_boot_params" is initialized by early_init_dt_verify(),
which is called by early_init_devtree(). So we needn't explicitly
initialize that again in early_init_devtree().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We have code to do syscall tracing which is disabled at compile time by
default. It's not been touched since the dawn of time (ie. v2.6.12).
There are now better ways to do syscall tracing, ie. using the
raw_syscall, or syscall tracepoints.
For the specific case of tracing syscalls at boot on a system that
doesn't get to userspace, you can boot with:
trace_event=syscalls tp_printk=on
Which will trace syscalls from boot, and echo all output to the console.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently when we back trace something that is in a syscall we see
something like this:
[c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
[c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
Although it's entirely correct, seeing syscall_exit at the bottom can be
confusing - we were exiting from a syscall and then called SyS_read() ?
If we instead change syscall_exit to be a local label we get something
more intuitive:
[c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
[c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
ie. we were handling a system call, and it was SyS_read().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:
ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98
We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
MMU_NO_CONTEXT is conditionally defined as 0 or (unsigned int)-1. However,
in __flush_tlb_page() a corresponding variable is only tested for open
coded 0, which can cause NULL pointer dereference if `mm' argument was
legitimately passed as such.
Bail out early in case the first argument is NULL, thus eliminate confusion
between different values of MMU_NO_CONTEXT and avoid disabling and then
re-enabling preemption unnecessarily.
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Convert the powermac PCI driver to use the generic config access functions.
This changes accesses from (in|out)_(8|le16|le32) to readX/writeX variants.
I believe these should be equivalent for PCI config space accesses, but
confirmation would be nice.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: linuxppc-dev@lists.ozlabs.org
|
|
Convert the fsl_pci driver to use the generic config access functions.
This changes accesses from (in|out)_(8|le16|le32) to readX/writeX variants.
I believe these should be equivalent for PCI config space accesses, but
confirmation would be nice.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: linuxppc-dev@lists.ozlabs.org
|
|
These defconfigs contain the CONFIG_M25P80 symbol, which is now
dependent on the MTD_SPI_NOR symbol. Add CONFIG_MTD_SPI_NOR to satisfy
the new dependency.
At the same time, drop the now-nonexistent CONFIG_MTD_CHAR symbol.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
When adding an event to the PMU with PERF_EF_START the STOPPED and UPTODATE
flags need to be cleared in the hw.event status variable because they are
preventing the update of the event count on overflow interrupt.
Signed-off-by: Alexandru-Cezar Sardan <alexandru.sardan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Replace strcpy and strncpy with strlcpy to avoid strings that are too
big, or lack null termination.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
[scottwood@freescale.com: cleaned up commit message]
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
"of_node_put"
The of_node_put() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Fix the GPIO address in the device tree to match the documented location.
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Change-Id: I16e63db731e55a3d60d4e147573c1af8718082d3
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Geoff Thorpe <Geoff.Thorpe@freescale.com>
Signed-off-by: Hai-Ying Wang <Haiying.Wang@freescale.com>
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Geoff Thorpe <Geoff.Thorpe@freescale.com>
Signed-off-by: Hai-Ying Wang <Haiying.Wang@freescale.com>
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Add support for the Artesyn MVME2500 Single Board Computer.
The MVME2500 is a 6U form factor VME64 computer with:
- A single Freescale QorIQ P2010 CPU
- 1 GB of DDR3 onboard memory
- Three Gigabit Ethernets
- Five 16550 compatible UARTS
- One USB 2.0 port, one SHDC socket and one SATA connector
- One PCI/PCI eXpress Mezzanine Card (PMC/XMC) Slot
- MultiProcessor Interrupt Controller (MPIC)
- A DS1375T Real Time Clock (RTC) and 512 KB of Non-Volatile Memory
- Two 64 KB EEPROMs
- U-Boot in 16 SPI Flash
This patch is based on linux-3.18 and has been boot tested.
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Commit 746c9e9f92dd "of/base: Fix PowerPC address parsing hack" limited
the applicability of the workaround whereby a missing ranges is treated
as an empty ranges. This workaround was hiding a bug in the etsec2
device tree nodes, which have children with reg, but did not have
ranges.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Reported-by: Alexander Graf <agraf@suse.de>
|
|
Also, enable Vitesse PHY and fixed PHY support.
Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Signed-off-by: Esben Haabendal <eha@deif.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
All accessed to PGD entries are done via 0(r11).
By using lower part of swapper_pg_dir as load index to r11, we can remove the
ori instruction.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
L1 base address is now aligned so we can insert L1 index into r11 directly and
then preserve r10
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
When pages are not 4K, PGDIR table is allocated with kmalloc(). In order to
optimise TLB handlers, aligned memory is needed. kmalloc() doesn't provide
aligned memory blocks, so lets use a kmem_cache pool instead.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Kernel MMU handling code handles validity of entries via _PMD_PRESENT which
corresponds to V bit in MD_TWC and MI_TWC. When the V bit is not set, MPC8xx
triggers TLBError exception. So we don't have to check that and branch ourself
to TLBError. We can set TLB entries with non present entries, remove all those
tests and let the 8xx handle it. This reduce the number of cycle when the
entries are valid which is the case most of the time, and doesn't significantly
increase the time for handling invalid entries.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Since commit 33fb845a6f01 ("powerpc/8xx: Don't use MD_TWC for walk"), MD_EPN and
MD_TWC are not writen anymore in FixupDAR so saving r3 has become useless.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
For nohash powerpc, when we run out of contexts, contexts are freed by stealing
used contexts in-turn. When a victim has been selected, the associated TLB
entries are freed using _tlbil_pid(). Unfortunatly, on the PPC 8xx, _tlbil_pid()
does a tlbia, hence flushes ALL TLB entries and not only the one linked to the
stolen context. Therefore, as implented today, at each task switch requiring a
new context, all entries are flushed.
This patch modifies the implementation so that when running out of contexts, all
contexts get freed at once, hence dividing the number of calls to tlbia by 16.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
On powerpc 8xx, in TLB entries, 0x400 bit is set to 1 for read-only pages
and is set to 0 for RW pages. So we should use _PAGE_RO instead of _PAGE_RW
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Some powerpc like the 8xx don't have a RW bit in PTE bits but a RO
(Read Only) bit. This patch implements the handling of a _PAGE_RO flag
to be used in place of _PAGE_RW
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[scottwood@freescale.com: fix whitespace]
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
PMCs on PowerPC increases towards 0x80000000 and triggers an overflow
interrupt when the msb is set to collect a sample. Therefore, to setup
for the next sample collection, pmu_start should set the pmc value to
0x80000000 - left instead of left which incorrectly delays the next
overflow interrupt. Same as commit 9a45a9407c69 ("powerpc/perf:
power_pmu_start restores incorrect values, breaking frequency events")
for book3s.
Signed-off-by: Tom Huynh <tom.huynh@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|