Age | Commit message (Collapse) | Author |
|
The quotaoff operation has a race with inode allocation that results
in a livelock. An inode allocation that occurs before the quota
status flags are updated acquires the appropriate dquots for the
inode via xfs_qm_vop_dqalloc(). It then inserts the XFS_INEW inode
into the perag radix tree, sometime later attaches the dquots to the
inode and finally clears the XFS_INEW flag. Quotaoff expects to
release the dquots from all inodes in the filesystem via
xfs_qm_dqrele_all_inodes(). This invokes the AG inode iterator,
which skips inodes in the XFS_INEW state because they are not fully
constructed. If the scan occurs after dquots have been attached to
an inode, but before XFS_INEW is cleared, the newly allocated inode
will continue to hold a reference to the applicable dquots. When
quotaoff invokes xfs_qm_dqpurge_all(), the reference count of those
dquot(s) remain elevated and the dqpurge scan spins indefinitely.
To address this problem, update the xfs_qm_dqrele_all_inodes() scan
to wait on inodes marked on the XFS_INEW state. We wait on the
inodes explicitly rather than skip and retry to avoid continuous
retry loops due to a parallel inode allocation workload. Since
quotaoff updates the quota state flags and uses a synchronous
transaction before the dqrele scan, and dquots are attached to
inodes after radix tree insertion iff quota is enabled, one INEW
waiting pass through the AG guarantees that the scan has processed
all inodes that could possibly hold dquot references.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The AG inode iterator currently skips new inodes as such inodes are
inserted into the inode radix tree before they are fully
constructed. Certain contexts require the ability to wait on the
construction of new inodes, however. The fs-wide dquot release from
the quotaoff sequence is an example of this.
Update the AG inode iterator to support the ability to wait on
inodes flagged with XFS_INEW upon request. Create a new
xfs_inode_ag_iterator_flags() interface and support a set of
iteration flags to modify the iteration behavior. When the
XFS_AGITER_INEW_WAIT flag is set, include XFS_INEW flags in the
radix tree inode lookup and wait on them before the callback is
executed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Inodes that are inserted into the perag tree but still under
construction are flagged with the XFS_INEW bit. Most contexts either
skip such inodes when they are encountered or have the ability to
handle them.
The runtime quotaoff sequence introduces a context that must wait
for construction of such inodes to correctly ensure that all dquots
in the fs are released. In anticipation of this, support the ability
to wait on new inodes. Wake the appropriate bit when XFS_INEW is
cleared.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Copy the uuid of the filesystem to struct super_block s_uuid field,
as several other filesystems already do. Copy regardless of the nouuid
mount option, because other filesystems also do not guaranty uniqueness
of the s_uuid field in super_block struct.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Commit 99e6608c9e74 "block: Add badblock management for gendisks"
allowed for drivers like pmem and software-raid to advertise a list of
bad media areas. However, it inadvertently added a 'badblocks' to all
block devices. Lets clean this up by having the 'badblocks' attribute
not be visible when the driver has not populated a 'struct badblocks'
instance in the gendisk.
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Add missing L2 cache events: read/write accesses and misses, as well as
the DTLB refills.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
http://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs
Simon Horman says:
====================
IPVS Fixes for v4.11
I would also like it considered for stable.
* Explicitly forbid ipv6 service/dest creation if ipv6 mod is disabled
to avoid oops caused by IPVS accesing IPv6 routing code in such
circumstances.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The commit 6d684e54690c ("rhashtable: Cap total number of entries
to 2^31") breaks rhashtable users that do not set max_size. This
is because when max_size is zero max_elems is also incorrectly set
to zero instead of 2^31.
This patch fixes it by only lowering max_elems when max_size is not
zero.
Fixes: 6d684e54690c ("rhashtable: Cap total number of entries to 2^31")
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The only difference between ->run_work and ->delay_work, is that
the latter is used to defer running a queue. This is done by
marking the queue stopped, and scheduling ->delay_work to run
sometime in the future. While the queue is stopped, direct runs
or runs through ->run_work will not run the queue.
If we combine the handlers, then we need to handle two things:
1) If a delayed/stopped run is scheduled, then we should not run
the queue before that has been completed.
2) If a queue is delayed/stopped, the handler needs to restart
the queue. Normally a run of a queue with the stopped bit set
would be a no-op.
Case 1 is handled by modifying a currently pending queue run
to the deadline set by the caller of blk_mq_delay_queue().
Subsequent attempts to queue a queue run will find the work
item already pending, and direct runs will see a stopped queue
as before.
Case 2 is handled by adding a new bit, BLK_MQ_S_START_ON_RUN,
that tells the work handler that it should clear a stopped
queue and run the handler.
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This modifies (or adds, if not currently pending) an existing
delayed work item.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
They serve the exact same purpose. Get rid of the non-delayed
work variant, and just run it without delay for the normal case.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
list_for_each_entry() isn't super safe if we're freeing the objects
while we traverse the list. Also don't bother taking the extra
reference, the module refcounting stuff will save us from having anybody
messing with the device while we're trying to unload.
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently, file system's writepages() function must not fail with an
ENOMEM, since if they do, it's possible for buffered data to be lost.
This is because on a data integrity writeback writepages() gets called
but once, and if it returns ENOMEM, if you're lucky the error will get
reflected back to the userspace process calling fsync(). If you
aren't lucky, the user is unmounting the file system, and the dirty
pages will simply be lost.
For this reason, file system code generally will use GFP_NOFS, and in
some cases, will retry the allocation in a loop, on the theory that
"kernel livelocks are temporary; data loss is forever".
Unfortunately, this can indeed cause livelocks, since inside the
writepages() call, the file system is holding various mutexes, and
these mutexes may prevent the OOM killer from killing its targetted
victim if it is also holding on to those mutexes.
A better solution would be to allow writepages() to call the memory
allocator with flags that give greater latitude to the allocator to
fail, and then release its locks and return ENOMEM, and in the case of
background writeback, the writes can be retried at a later time. In
the case of data-integrity writeback retry after waiting a brief
amount of time.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
According to my static checker we should unlock here before the return.
That seems reasonable to me as well.
Fixes" b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Seems like this was forgotten in the bfq-series from Paolo. Let's do it now
so people don't miss out involving Paolo for any future changes or when
reporting bugs.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
mempool_alloc() cannot fail if the gfp flags allow it to
sleep, and both GFP_FS allows for sleeping.
So these tests of the return value from mempool_alloc()
cannot be needed.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
commit 620d8745b35d ("Introduce cifs_copy_file_range()") changes the
behaviour of the cifs ioctl call CIFS_IOC_COPYCHUNK_FILE. In case of
successful writes, it now returns the number of bytes written. This
return value is treated as an error by the xfstest cifs/001. Depending
on the errno set at that time, this may or may not result in the test
failing.
The patch fixes this by setting the return value to 0 in case of
successful writes.
Fixes: commit 620d8745b35d ("Introduce cifs_copy_file_range()")
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
Incorrect return value for shares not using the prefix path means that
we will never match superblocks for these shares.
Fixes: commit c1d8b24d1819 ("Compare prepaths when comparing superblocks")
Cc: stable@vger.kernel.org
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
The ls1046a datasheet specified that the max SD clock frequency
for eSDHC SDR104/HS200 was 167MHz, and the ls1012a datasheet
specified it's 125MHz for ls1012a. So this patch is to add the
limitation.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Have proper request id filled in the SCHED_SCAN_RESULTS and
SCHED_SCAN_STOPPED notifications toward user-space by having the
driver provide it through the api.
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Fix a checkpatch error: CODE_INDENT (code indent should use tabs where
possible).
Signed-off-by: Juan Antonio Pedreira Martos <juanpm1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix checkpatch warning: removed unnecessary initialization of
static variable "heap_id" to 0 in source file "ioc.c".
Signed-off-by: Fabrizio Perria <fabrizio.perria@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The range checking on clk_num is incorrect; fix these so that invalid
clk_num values are detected correctly.
Detected by static analysis with by PVS-Studio
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This fix "overrided", the correct past tense form of "override" is
"overridden".
Signed-off-by: Luis Oliveira <lolivei@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There's no need to check kmap() return value because it won't fail.
If it's highmem mapping, it will receive virtual address
or a new one; if it's lowmem, all kernel pages are already being mapped.
(Thanks to Jan Kara for explanations)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The runtime power management functions are called from the reset handler even
if CONFIG_PM is disabled, leading to a link error:
drivers/staging/built-in.o: In function `atomisp_reset':
(.text+0x4cd1c): undefined reference to `atomisp_runtime_suspend'
drivers/staging/built-in.o: In function `atomisp_reset':
(.text+0x4cd3a): undefined reference to `atomisp_mrfld_power_down'
drivers/staging/built-in.o: In function `atomisp_reset':
(.text+0x4cd58): undefined reference to `atomisp_mrfld_power_up'
drivers/staging/built-in.o: In function `atomisp_reset':
(.text+0x4cd77): undefined reference to `atomisp_runtime_resume'
Removing the #ifdef around the PM functions avoids the problem, and
lets us simplify it further. The __maybe_unused annotation is needed
to ensure the compiler can silently drop the unused callbacks.
Fixes: a49d25364dfb ("staging/atomisp: Add support for the Intel IPU v2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
After the satm kernel was removed, we should no longer add the directory
to the search path. This was found with a 'make W=1' warning:
cc1: error: drivers/staging/media/atomisp/pci/atomisp2/css2400/isp/kernels/satm/: No such file or directory [-Werror=missing-include-dirs]
Fixes: 184f8e0981ef ("atomisp: remove satm kernel")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The extra list contains some which are used and some which are not. At this
point I think we can safely remove those that are simply not used.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We have a layer of un-needed wrapping here that can go. In addition there are
some functions that don't exist and one that isn't used which can also go.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This is just another wrapper layer around hmm_free that servers no purpose
in this driver.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We don't need any of these indirections as we only support one MMU type. Start
by getting rid of the init/clear/free ones. The init ordering check we already
pushed down in a previous patch.
The allocation side is more complicated so leave it for now.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently the code handles this in the abstraction above. We want to remove
that abstraction so begin by pushing down the sanity check. Unfortunately
at this point we can't simply fix the init order.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a list of TODO items for the Ethernet driver
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add trace events in significant places of the data path.
Useful for debuggging.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add custom statistics to be reported via ethtool -S. These include
driver specific per-cpu statistics as well as queue and channel
counters.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for several ethtool operations: show hardware statistics,
get/set link settings, get hash configuration.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Introduce the DPAA2 Ethernet driver, which manages Datapath
Network Interface (DPNI) objects discovered on the MC bus.
In addition to DPNIs, the Ethernet driver uses several other
MC objects to build a network interface abstraction: buffer
pools (DPBPs), I/O Portals (DPIOs) and concentrators (DPCONs).
A more detailed description of the driver can be found in the
associated README file.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add the command build/parse APIs for operating on DPNI objects through
the DPAA2 Management Complex.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a README file describing the driver architecture, components
and I/O interface.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch adds the command building/parsing wrapper functions
for the DPCON object. The binary interface version is v3.2.
A DPCON (DataPath Concentrator) is an aggregator object that
allows ingress frames from multiple hardware queues to be seen
as coming from a single source, from the CPU point of view.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When checking the response verb, the valid bit should be masked out,
since its value flips depending on what Response Register
(RR0 /RR1) it's been read from.
Fixes: 321eecb06bfb ("bus: fsl-mc: dpio: add QBMan portal APIs for DPAA2")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Michal Suchánek noticed a comment in book3s/64/mmu-hash.h about the context ids
we use for the kernel was inconsistent with the code and other comments in the
same file.
It should read 1-4 not 1-5.
While we're touching it, update "address" to "addresses" which makes more sense
as it's referring to more than one address below.
Reported-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This enables VFIO on pseries host in order to allow VFIO in nested guest under
PR KVM or DPDK in a HV guest. This adds support of the VFIO_SPAPR_TCE_IOMMU
type.
This adds exchange() callback to allow TCE updates by the SPAPR TCE IOMMU
driver in VFIO.
This initializes DMA32 window parameters in iommu_table_group as as this does
not implement VFIO_SPAPR_TCE_v2_IOMMU and VFIO_SPAPR_TCE_IOMMU just reuses the
existing DMA32 window.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When the userspace requests a small TCE table (which takes less than
the system page size) and more than 1 TCE level, the existing code
returns a single page size which is a bug as each additional TCE level
requires at least one page and this is what
pnv_pci_ioda2_table_alloc_pages() does. And we end up seeing
WARN_ON(!ret && ((*ptbl)->it_allocated_size != table_size))
in drivers/vfio/vfio_iommu_spapr_tce.c.
This replaces incorrect _ALIGN_UP() (which aligns zero up to zero) with
max_t() to fix the bug.
Besides removing WARN_ON(), there should be no other changes in
behaviour.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
pnv_pci_table_alloc() ignores possible failure from kzalloc_node(),
this adds a check. There are 2 callers of pnv_pci_table_alloc(),
one already checks for tbl!=NULL, this adds WARN_ON() to the other path
which only happens during boot time in IODA1 and not expected to fail.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Move a couple of existing scripts under there. Remove scripts directory:
a script is a tool, a tool is not a script.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently powerpc has to introduce a dependency on its default build
target zImage in order to run a relocation check pass over the linked
vmlinux. This is deficient because the check is not run if the plain
vmlinux target is built, or if one of the other boot targets is built.
Switch to using the kbuild post-link pass, added in commit fbe6e37dab97
("kbuild: add arch specific post-link Makefile") in order to run this
check. In future powerpc will use this to do more complicated operations,
but initially using it for something simple is a good first step.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
An externally triggered system reset (e.g., via QEMU nmi command, or pseries
reset button) can cause system reset interrupts on all CPUs. In case this causes
xmon to be entered, it is undesirable for the primary (first) CPU into xmon to
trigger an NMI IPI to others, because this may cause a nested system reset
interrupt.
So spin for a time waiting for secondaries to join xmon before performing the
NMI IPI, similarly to what the crash dump code does.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Only do it when we come in from system reset, not via sysrq etc.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|