Age | Commit message (Collapse) | Author |
|
When PCI Device pass-through is enabled via VFIO, KVM-PPC will
pin pages using get_user_pages_fast(). One of the downsides of
the pinning is that the page could be in CMA region. The CMA
region is used for other allocations like the hash page table.
Ideally we want the pinned pages to be from non CMA region.
This patch (currently only for KVM PPC with VFIO) forcefully
migrates the pages out (huge pages are omitted for the moment).
There are more efficient ways of doing this, but that might
be elaborate and might impact a larger audience beyond just
the kvm ppc implementation.
The magic is in new_iommu_non_cma_page() which allocates the
new page from a non CMA region.
I've tested the patches lightly at my end. The full solution
requires migration of THP pages in the CMA region. That work
will be done incrementally on top of this.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Merged via powerpc tree as that's where the changes are]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This supports PCI surprise hotplug. The design is highlighted as
below:
* The PCI slot's surprise hotplug capability is exposed through
device node property "ibm,slot-surprise-pluggable", meaning
PCI surprise hotplug will be disabled if skiboot doesn't support
it yet.
* The interrupt because of presence or link state change is raised
on surprise hotplug event. One event is allocated and queued to
the PCI slot for workqueue to pick it up and process in serialized
fashion. The code flow for surprise hotplug is same to that for
managed hotplug except: the affected PEs are put into frozen state
to avoid unexpected EEH error reporting in surprise hot remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This removes likely() and unlikely() in pnv_php.c as the code isn't
running in hot path. Those macros to affect CPU's branch stream don't
help a lot for performance. I used them to identify the cases are
likely or unlikely to happen. No logical changes introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This unfreezes PE when it's initialized because the PE might be put
into frozen state in the last hot remove path. It's not harmful to
do so if the PE is already in unfrozen state.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This exports eeh_pe_state_mark(). It will be used to mark the surprise
hot removed PE as isolated to avoid unexpected EEH error reporting in
surprise remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This exports @confirm_error_lock so that eeh_serialize_{lock, unlock}()
can be used to freeze the affected PE in PCI surprise hot remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Function eeh_pe_set_option() is used to apply the requested options
(enable, disable, unfreeze) in EEH virtualization path. The semantics
of this function isn't complete until freezing is supported.
This allows to freeze the indicated PE. The new semantics is going to
be used in PCI surprise hot remove path, to freeze removed PCI devices
(PE) to avoid unexpected EEH error reporting.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
the state machine in OPAL forward. However, we needn't always call
the function under some circumstances like reset deassert.
This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
from opal_pci_reset(). Except the overhead introduced by additional
one unnecessary OPAL call, I didn't run into real issue because of
this.
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Need to provide a dummy smp_fill_in_cpu_possible_map.
Fixes: 9b2f753ec237 ("sparc64: Fix cpu_possible_mask if nr_cpus is set")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit 900f65d361d3 ("tcp: move duplicate code from
tcp_v4_init_sock()/tcp_v6_init_sock()") we no longer need
to export sk_stream_write_space()
From: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use platform_device_register_full() instead of open-coded.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
|
|
Merge fixes from Andrew Morton:
"4 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mem-hotplug: use nodes that contain memory as mask in new_node_page()
scripts/recordmcount.c: account for .softirqentry.text
dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
mm,ksm: fix endless looping in allocating memory when ksm enable
|
|
9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()")
prevents allocating from an empty nodemask, but as David points out, it is
still wrong. As node_online_map may include memoryless nodes, only
allocating from these nodes is meaningless.
This patch uses node_states[N_MEMORY] mask to prevent the above case.
Fixes: 9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()")
Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")
Link: http://lkml.kernel.org/r/1474447117.28370.6.camel@TP420
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into
separate sections") added .softirqentry.text section, but it was not added
to recordmcount. So functions in the section are untracable. Add the
section to scripts/recordmcount.c and scripts/recordmcount.pl.
Fixes: be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections")
Link: http://lkml.kernel.org/r/1474902626-73468-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Steve Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When CONFIG_DMA_API_DEBUG is enabled we need to preserve unmapping address
even if "unmap" is a no-op for our architecutre because we need
debug_dma_unmap_page() to correctly cleanup all of the debug bookkeeping.
Failing to do so results in a false positive warnings about previously
mapped areas never being unmapped.
Link: http://lkml.kernel.org/r/1474387125-3713-1-git-send-email-andrew.smirnov@gmail.com
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Geliang Tang <geliangtang@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I hit the following hung task when runing a OOM LTP test case with 4.1
kernel.
Call trace:
[<ffffffc000086a88>] __switch_to+0x74/0x8c
[<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
[<ffffffc000a1c09c>] schedule+0x3c/0x94
[<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
[<ffffffc000a1e32c>] down_write+0x64/0x80
[<ffffffc00021f794>] __ksm_exit+0x90/0x19c
[<ffffffc0000be650>] mmput+0x118/0x11c
[<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
[<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
[<ffffffc0000d0f34>] get_signal+0x444/0x5e0
[<ffffffc000089fcc>] do_signal+0x1d8/0x450
[<ffffffc00008a35c>] do_notify_resume+0x70/0x78
The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held by ksmd for read which loops in the page
allocator
ksm_do_scan
scan_get_next_rmap_item
down_read
get_next_rmap_item
alloc_rmap_item #ksmd will loop permanently.
There is no way forward because the oom victim cannot release any memory
in 4.1 based kernel. Since 4.6 we have the oom reaper which would solve
this problem because it would release the memory asynchronously.
Nevertheless we can relax alloc_rmap_item requirements and use
__GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan
would just retry later after the lock got dropped.
Such a patch would be also easy to backport to older stable kernels which
do not have oom_reaper.
While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed
by the allocation failure.
Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Suggested-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Resource allocation for VFs is done via the VF BARx registers in the PF's
SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
(see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).
Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
structs describe the space allocated for the VF (this is a piece of the
space described by the VF BARx register in the PF's SR-IOV capability).
It's meaningless to request additional alignment for a VF: the VF BAR
alignment is completely determined by the alignment of the VF BARx in the
PF and the size of the VF BAR.
Ignore the user's alignment requests for VF devices.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Jouni reported that during (repeated) wext_pmf test runs (from the
wpa_supplicant hwsim test suite) the kernel crashes. The reason is
that after the key is set, the wext code still unnecessarily stores
it into the key cache. Despite smatch pointing out an overflow, I
failed to identify the possibility for this in the code and missed
it during development of the earlier patch series.
In order to fix this, simply check that we never store anything but
WEP keys into the cache, adding a comment as to why that's enough.
Also, since the cache is still allocated early even if it won't be
used in many cases, add a comment explaining why - otherwise we'd
have to roll back key settings to the driver in case of allocation
failures, which is far more difficult.
Fixes: 89b706fb28e4 ("cfg80211: reduce connect key caching struct size")
Reported-by: Jouni Malinen <j@w1.fi>
Bisected-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Users may request additional alignment of PCI resources, e.g., to align
BARs on page boundaries so they can be shared with guests via VFIO. This
of course may require reallocation if firmware has already assigned the
BARs with smaller alignments.
If the platform has requested PCI_PROBE_ONLY, we should never change any
PCI BARs, so we can't provide any additional alignment. Also, if a BAR is
marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
firmware depends on the current BAR value, we can't change the alignment.
In these cases, log a message and ignore the user's alignment requests.
[bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Fixes the following sparse warnings:
drivers/watchdog/wdat_wdt.c:210:66: warning: Using plain integer as NULL pointer
drivers/watchdog/wdat_wdt.c:235:66: warning: Using plain integer as NULL pointer
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPI WDAT table is the preferred way to use hardware watchdog over the
native iTCO_wdt. Windows only uses this table for its hardware watchdog
implementation so we should be relatively safe to trust it has been
validated by OEMs.
Prevent iTCO watchdog creation if we detect that there is an ACPI WDAT
table.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPI WDAT table is the preferred way to use hardware watchdog over the
native iTCO_wdt. Windows only uses this table for its hardware watchdog
implementation so we should be relatively safe to trust it has been
validated by OEMs
Prevent iTCO watchdog creation if we detect that there is ACPI WDAT table.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPI WDAT table is the preferred way to use hardware watchdog over the
native iTCO_wdt. Windows only uses this table for its hardware watchdog
implementation so we should be relatively safe to trust it has been
validated by OEMs
Prevent iTCO watchdog creation if we detect that there is ACPI WDAT table.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We can't properly detect all hypervisors and we
need this to properly tear down the hardware.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We can't properly detect all hypervisors and we
need this to properly tear down the hardware.
v2: trivial warning fix
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use a separate one for the copy operation and
log all the interesting parameters.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Obviously missed during the rename.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This fixes a memory leak since binding GTT only on demand.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It's pretty pointless to get the offset first and then initialize it.
Should fix issues with the new GTT manager.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This bug seems to be present for a very long time.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to clear the shadows as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Close a very small window where something can go wrong.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Unreference the shadow BOs in the error path as well and drop the NULL checks.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to access those with the system domain.
Fixes fallout from only allocating GTT space on demand.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Just to cleanup some radeon leftovers.
sed -i "s/rbo/abo/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/rbo/abo/g" drivers/gpu/drm/amd/amdgpu/*.h
v2: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Not used in a while.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only allocate address space when we really need it.
v2: fix a typo, add correct function description,
stop leaking the node in the error case.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise we can get a hotplug interrupt storm when
we turn the panel off if hpd interrupts were enabled
by the bios.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97471
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise we can get a hotplug interrupt storm when
we turn the panel off if hpd interrupts were enabled
by the bios.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97471
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Otherwise we can get a hotplug interrupt storm when
we turn the panel off if hpd interrupts were enabled
by the bios.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97471
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Otherwise we can get a hotplug interrupt storm when
we turn the panel off if hpd interrupts were enabled
by the bios.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=97471
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Used the wrong index to setup the phase shedding mask.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Used the wrong index to setup the phase shedding mask.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the master device has callbacks for _get/put_device()
and this MTD has slaves a get_mtd_device() call on paritions
will never issue the registered callbacks.
Fix this by propagating _get/put_device() down.
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
|
Pull late MTD fixes from Brian Norris:
"Another round of MTD fixes for v4.8
My apologies for sending this so late. I've been fairly absent as a
maintainer this cycle, but I did queue these up weeks ago. In the
meantime, Richard was able to handle some other fixes (thanks!) but
didn't pick these up.
On the bright side, these are very simple changes that should carry
little risk.
Summary:
- Davinci NAND: fix a long-standing bug in how we clear/prep 4-bit ECC
- OMAP NAND: an error-handling fix that made it into v4.8-rc1 caused
error-handling cases in other configurations/code-paths; this fixes
the fix"
* tag 'for-linus-20160928' of git://git.infradead.org/linux-mtd:
mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
mtd: nand: omap2: Don't call dma_release_channel() if dma_request_chan() failed
|
|
I will be starting employment at Versity next week and would like to update
my MAINTAINERS e-mail to reflect that change. My versity e-mail is already
activated so I shouldn't get any bounces on the new one. My ability to help
with Ocfs2 kernel maintenance won't change as a result of the new job.
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|