Refactor xfs_lock_two_inodes to take separate locking modes for each
inode. Specifically, this enables us to take a SHARED lock on one inode
and an EXCL lock on the other. The lock class (MMAPLOCK/ILOCK) must be
the same for each inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Before we share blocks between files, we need to break the pnfs leases
on the layout before we start slicing and dicing the block map. The
structure of this function sets us up for the lock contention reduction
in the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Even if we can't use the inobt/finobt cursors to count the number of
inode btree blocks, we are never allowed to clobber the cursor of the
btree being checked, so don't do this. Found by fuzzing level = ones
in xfs/364.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Every so often we blow the ASSERT(type != XFS_IO_COW) in xfs_map_blocks
when running fsstress, as we do in generic/269. The cause of this is
writeback racing with truncate -- writeback doesn't take the iolock, so
truncate can sneak in to decrease i_size and truncate page cache while
writeback is gathering buffer heads to schedule writeout.
If we hit this race on a block that has a CoW mapping, we'll get a valid
imap from the CoW fork but the reduced i_size trims the mapping to zero
length (which makes it invalid), so we call xfs_map_blocks to try again.
This doesn't do much anyway, since any mapping we get out of that will
also be invalid, so we might as well skip the assert and just stop.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Commit 66f364649d870 ("xfs: remove if_rdev") moved storing of rdev
value for special inodes to VFS inodes, but forgot to preserve the
value of i_rdev when recycling a reclaimable xfs_inode.
This was detected by xfstest overlay/017 with the index=on mount option
and xfs as the base fs. The test does a lookup of overlay chardev and blockdev
right after drop caches.
Overlayfs inodes hold a reference on the underlying xfs inodes when the mount
option index=on is configured. If drop caches reclaims xfs inodes before
it reclaims overlayfs inodes, that can sometimes leave a reclaimable xfs
inode, and the test hits that case quite often.
When that happens, the xfs inode cache remains broken (zero i_rdev)
until the next cycle mount or drop caches.
Fixes: 66f364649d870 ("xfs: remove if_rdev")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Move all the inode and quota accounting updates out of xfs_bmap_btalloc
in preparation for fixing some quota accounting problems with copy on
write.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Refactor inode verifier error reporting into a non-libxfs function so
that we aren't encoding the message format in libxfs. This also
changes the kernel dmesg output to resemble buffer verifier errors
more closely.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Fix all the inode number formats to be consistently (0x%llx) in all
trace point definitions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Always zero the di_flags2 field when we free the inode so that we never
end up with an on-disk record for an unallocated inode that also has the
reflink iflag set. This is in keeping with the general principle that
only files can have the reflink iflag set, even though we'll zero out
di_flags2 if we ever reallocate the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Ensure that we've attached all the necessary dquots before performing
reflink operations so that quota accounting is accurate.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Remove the extent size hint and realtime inode code from
xfs_bmapi_reserve_delalloc, since it is never called on an inode
with an extent size hint set or on a realtime inode.
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Now that the buffer's b_fspriv has been split, replace the current
singly linked list of xfs_log_items with the list_head infrastructure.
Also remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers().
There is no need for this argument now that the log items can be walked
through the list_head in the buffer.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style cleanups]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
By splitting the b_fspriv field into two different fields (b_log_item
and b_li_list), it's possible to get rid of an old ABI workaround by
using the new b_log_item field to store the xfs_buf_log_item separately from
the log items attached to the buffer, which will be linked in the new
b_li_list field.
This way, there is no more need to reorder the log items list to place
the buf_log_item at the beginning of the list, which simplifies the
logic to handle buffer IO a bit.
This also opens the possibility to change buffer's log items list into a
proper list_head.
b_log_item field is still defined as a void *, because it is still used
by the log buffers to store xlog_in_core structures, and there is no
need to add an extra field on xfs_buf just for xlog_in_core.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style changes]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Take advantage of the rework of the xfs_buf log items list to get rid of
the typedef for xfs_buf_log_item.
This patch also fixes some indentation alignment issues found along the way.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Uniformize the STMicroelectronics copyright headers and add SPDX identifiers
CC: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Acked-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lkml.kernel.org/r/20171130084500.23439-1-benjamin.gaignard@st.com
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The kmemdup line in the non-patch case was left over from the added kmemdup
line in the patch case.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Do some cleanup of debug messages, making them cleaner and
easier to use when analyzing what's going on.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
|
|
When a packet discontinuity happens, it is not just the payload
that was lost. The headers are lost too. So, the max size is not
184 but 188.
Also, while printing warnings, make a distinction between a
discontinuity indicated by the MPEG-TS and a detected one.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
|
|
The XC2028_I2C_FLUSH only needs to be implemented on a few
devices. Others can safely ignore it.
That prevents filling the dmesg with lots of messages like:
dib0700: stk7700ph_xc3028_callback: unknown command 2, arg 0
Cc: stable@vger.kernel.org
Fixes: 4d37ece757a8 ("[media] tuner/xc2028: Add I2C flush callback")
Reported-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
|
|
Before this patch, when compiled for arm32, the signal strength
was reported as:
Lock (0x1f) Signal= 4294908.66dBm C/N= 12.79dB
because of a 32-bit integer overflow. After it, it is properly
reported as:
Lock (0x1f) Signal= -58.64dBm C/N= 12.79dB
Cc: stable@vger.kernel.org
Fixes: 0f91c9d6bab9 ("[media] TS2020: Calculate tuner gain correctly")
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
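The two readings quoted above are consistent with a value of -58640 (in units of 0.001 dBm) being passed through an unsigned 32-bit quantity, since 2^32 - 58640 = 4294908656. The following is a minimal sketch of that effect only; the helper names are hypothetical and this is not the driver's code.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from the driver. The gain is assumed to be
 * held in units of 0.001 dBm.
 */

/* Routing a negative value through a 32-bit unsigned type loses the sign. */
static int64_t widen_via_u32(int32_t millidbm)
{
    uint32_t wrapped = (uint32_t)millidbm;  /* wraps modulo 2^32 */
    return (int64_t)wrapped;                /* zero-extended, sign is gone */
}

/* Widening the signed value directly preserves it. */
static int64_t widen_signed(int32_t millidbm)
{
    return (int64_t)millidbm;               /* sign-extended */
}
```

With an input of -58640, the first helper yields the value behind the bogus 4294908.66 reading, while the second keeps the -58.64 reading intact.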
|
|
Since i_version is mostly treated as an opaque value, we can exploit that
fact to avoid incrementing it when no one is watching. With that change,
we can avoid incrementing the counter on writes, unless someone has
queried for it since it was last incremented. If the a/c/mtime don't
change, and the i_version hasn't changed, then there's no need to dirty
the inode metadata on a write.
Convert the i_version counter to an atomic64_t, and use the lowest order
bit to hold a flag that will tell whether anyone has queried the value
since it was last incremented.
When we go to maybe increment it, we fetch the value and check the flag
bit. If it's clear then we don't need to do anything if the update
isn't being forced.
If we do need to update, then we increment the counter by 2, and clear
the flag bit, and then use a CAS op to swap it into place. If that
works, we return true. If it doesn't then do it again with the value
that we fetch from the CAS operation.
On the query side, if the flag is already set, then we just shift the
value down by 1 bit and return it. Otherwise, we set the flag in our
on-stack value and again use cmpxchg to swap it into place if it hasn't
changed. If it has, then we use the value from the cmpxchg as the new
"old" value and try again.
This method allows us to avoid incrementing the counter on writes (and
dirtying the metadata) under typical workloads. We only need to increment
if it has been queried since it was last changed.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
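The increment and query paths described above can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: the type and function names (struct iver, iver_maybe_inc, iver_query) are hypothetical, and the kernel uses its own atomic64_t and cmpxchg primitives rather than stdatomic.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the kernel's real helpers differ. */
#define IVER_QUERIED   1ULL  /* lowest bit: value was queried since last bump */
#define IVER_INCREMENT 2ULL  /* the counter lives in the upper 63 bits */

struct iver {
    _Atomic uint64_t raw;
};

/* Returns true if the counter was actually incremented. */
static bool iver_maybe_inc(struct iver *v, bool force)
{
    uint64_t cur = atomic_load(&v->raw);

    for (;;) {
        /* Nobody queried since the last bump and no forced update: skip. */
        if (!force && !(cur & IVER_QUERIED))
            return false;

        /* Clear the flag, then add 2 so only the counter bits change. */
        uint64_t next = (cur & ~IVER_QUERIED) + IVER_INCREMENT;

        /* On failure, cur is reloaded with the value currently stored. */
        if (atomic_compare_exchange_weak(&v->raw, &cur, next))
            return true;
    }
}

/* Returns the counter and marks it as queried. */
static uint64_t iver_query(struct iver *v)
{
    uint64_t cur = atomic_load(&v->raw);

    for (;;) {
        /* Flag already set: nothing to store. */
        if (cur & IVER_QUERIED)
            break;

        /* Set the flag; retry with the refreshed value if we raced. */
        if (atomic_compare_exchange_weak(&v->raw, &cur, cur | IVER_QUERIED))
            break;
    }

    /* Shift the flag bit out to get the counter itself. */
    return cur >> 1;
}
```

In this sketch a write that follows another write without an intervening query returns false from iver_maybe_inc, which is the case where the caller could skip dirtying the inode metadata.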
|
|
At this point, we know that "now" and the file times may differ, and we
suspect that the i_version has been flagged to be bumped. Attempt to
bump the i_version, and only mark the inode dirty if that actually
occurred or if one of the times was updated.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: David Sterba <dsterba@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
|
|
If XFS_ILOG_CORE is already set then go ahead and increment i_version.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
|
|
We only really need to update i_version if someone has queried for it
since we last incremented it. By doing that, we can avoid having to
update the inode if the times haven't changed.
If the times have changed, then we go ahead and forcibly increment the
counter, under the assumption that we'll be going to the storage
anyway, and the increment itself is relatively cheap.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Mostly just making sure we use the "get" wrappers so we know when
it is being fetched for later use.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
For NFS, we just use the "raw" API since the i_version is mostly
managed by the server. The exception there is when the client
holds a write delegation, but we only need to bump it once
there anyway to handle CB_GETATTR.
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: David Sterba <dsterba@suse.com>
|
|
For AFS, it's generally treated as an opaque value, so we use the
*_raw variants of the API here.
Note that AFS has quite a different definition for this counter. AFS
only increments it on changes to the data in regular files
and to the contents of directories. Inode metadata changes do not result
in a version increment.
We'll need to reconcile that somehow if we ever want to present this to
userspace via statx.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
The rationale for taking the i_lock when incrementing this value is
lost in antiquity. The readers of the field don't take it (at least
not universally), so my assumption is that it was only done here to
serialize incrementors.
If that is indeed the case, then we can drop the i_lock from this
codepath and treat it as an atomic64_t for the purposes of
incrementing it. This allows us to use inode_inc_iversion without
any danger of lock inversion.
Note that the read side is not fetched atomically with this change.
The assumption here is that that is not a critical issue since the
i_version is not fully synchronized with anything else anyway.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Add a documentation blob that explains what the i_version field is, how
it is expected to work, and how it is currently implemented by various
filesystems.
We already have inode_inc_iversion. Add several other functions for
manipulating and accessing the i_version counter. For now, the
implementation is trivial and basically works the way that all of the
open-coded i_version accesses work today.
Future patches will convert existing users of i_version to use the new
API, and then convert the backend implementation to do things more
efficiently.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Pull NAND changes from Boris Brezillon:
"
Core changes:
* Fix NAND_CMD_NONE handling in nand_command[_lp]() hooks
* Introduce the ->exec_op() infrastructure
* Rework NAND buffers handling
* Fix ECC requirements for K9F4G08U0D
* Fix nand_do_read_oob() to return the number of bitflips
* Mark K9F1G08U0E as not supporting subpage writes
Driver changes:
* MTK: Rework the driver to support new IP versions
* OMAP OneNAND: Full rework to use new APIs (libgpio, dmaengine) and fix
DT support
* Marvell: Add a new driver to replace the pxa3xx one
"
|
|
Pull spi-nor changes from Cyrille Pitchen:
"
This pull-request contains the following notable changes:
Core changes:
* Add support for new ISSI and Cypress/Spansion memory parts.
* Fix support of Micron memories by checking error bits in the FSR.
* Fix update of block-protection bits by reading back the SR.
* Restore the internal state of the SPI flash memory when removing the
device.
Driver changes:
* Maintenance for Freescale, Intel and Mediatek drivers.
* Add support of the direct access mode for the Cadence QSPI controller.
"
|
|
Sparse is whining about the mixed usage of u32 and __le32 in the driver:
drivers/ntb/test/ntb_perf.c:288:21: warning: cast to restricted __le32
drivers/ntb/test/ntb_perf.c:295:37: warning: incorrect type in argument 4 (different base types)
drivers/ntb/test/ntb_perf.c:295:37: expected unsigned int [unsigned] [usertype] val
drivers/ntb/test/ntb_perf.c:295:37: got restricted __le32 [usertype] <noident>
...
NTB hardware drivers shall accept CPU-endian data and translate it to
the portable format by internal means, so the explicit conversions
are no longer necessary before Scratchpad/Messages API usage.
Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|
|
We accidentally return success if dmaengine_submit() fails. The fix is
to preserve the error code from dma_submit_error().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
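The class of bug described above can be sketched with userspace stand-ins. Everything here is hypothetical (cookie_t, submit, submit_error and both copy functions are stubs written for illustration, not the ntb_perf code); the stubs only assume that a negative cookie signals a failed submission.

```c
#include <errno.h>

typedef int cookie_t;   /* hypothetical stand-in for a DMA cookie */

/* Stub submission: a negative cookie signals failure. */
static cookie_t submit(int fail)
{
    return fail ? -EIO : 1;
}

/* Stub error check: returns the negative cookie on failure, 0 otherwise. */
static int submit_error(cookie_t cookie)
{
    return cookie < 0 ? cookie : 0;
}

/* Buggy shape: the failure is detected but ret is never updated. */
static int copy_buggy(int fail)
{
    int ret = 0;
    cookie_t cookie = submit(fail);

    if (submit_error(cookie))
        goto out;       /* ret is still 0, so the caller sees success */
out:
    return ret;
}

/* Fixed shape: keep the error code returned by the check. */
static int copy_fixed(int fail)
{
    int ret;
    cookie_t cookie = submit(fail);

    ret = submit_error(cookie);
    if (ret)
        goto out;       /* ret now carries the failure to the caller */
out:
    return ret;
}
```

The only difference between the two functions is whether the result of the error check is stored before jumping to the exit label.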
|
|
Fixes the following sparse warnings:
drivers/ntb/hw/mscc/ntb_hw_switchtec.c:1552:6: warning:
symbol 'switchtec_ntb_remove' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|
|
Currently there is a memory leak on buf when the call to ntb_mw_get_align
fails. Add an exit err label and jump to this so that kfree on buf frees
the memory.
Detected by CoverityScan, CID#1464286 ("Resource leak")
Fixes: d637628ce00c ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
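The exit-label pattern mentioned above can be sketched in plain C. The names are hypothetical: query_alignment stands in for a call that may fail (such as ntb_mw_get_align), and release wraps free() and counts calls only so that the cleanup path can be observed. This is an illustration of the pattern, not the ntb_tool code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a call that may fail after buf is allocated. */
static int query_alignment(int should_fail, size_t *align)
{
    if (should_fail)
        return -EINVAL;
    *align = 4096;
    return 0;
}

static int frees;   /* counts releases so the cleanup path is observable */

static void release(void *p)
{
    free(p);
    frees++;
}

static int read_window_info(int should_fail)
{
    size_t align;
    int ret;
    char *buf = malloc(512);

    if (!buf)
        return -ENOMEM;

    ret = query_alignment(should_fail, &align);
    if (ret)
        goto err;       /* an early return here would leak buf */

    /* ... use buf and align ... */

err:
    release(buf);       /* single exit path frees buf on success and failure */
    return ret;
}
```

Routing both the success and the failure path through one label keeps the free in a single place, so later error checks added to the function cannot reintroduce the leak by returning early.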
|
|
On 32-bit architectures, resource_size_t is usually 'unsigned int' or
'unsigned long' but not 'unsigned long long', so we get a warning
about printing the wrong data:
drivers/ntb/test/ntb_perf.c: In function 'perf_setup_peer_mw':
drivers/ntb/test/ntb_perf.c:1390:35: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t {aka unsigned int}' [-Werror=format=]
This changes the format string to the special %pa that is already
used elsewhere in the same file.
Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|
|
Since the Switchtec patch, a new topology has been added to
the NTB API. It is called NTB_TOPO_SWITCH and is dedicated to
PCIe switch chips. Even though the topo field isn't used much within the
IDT driver, let's set it for the sake of unification.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|
|
The ntb_perf driver has also been updated to support the multi-port
interface. The user must now specify which peer port is going
to be used to perform the test.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|
|
There are devices (like IDT PCIe switches) whose outbound MW xlat address
is set up on the peer side. In this case the local side is supposed to allocate
a memory buffer and somehow deliver the xlat DMA address to the peer so that it
can set the outbound MW up. The MW test is altered to support both the
previous Intel/AMD and the new IDT-like devices.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
|