Age | Commit message (Collapse) | Author |
|
esp->dev is a void pointer that points either to a struct device, or a
struct platform_device. As we can easily get from the device to the
platform_device if needed change it to always point to a struct device
and properly type the pointer to avoid errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
esp_sbus_map_command_block is called straight from the probe routine
without any locks held, so we can safely use GFP_KERNEL here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Remove usage of the legacy PCI DMA API. To make this easier we also
store a struct device instead of pci_dev in the dev field of struct
esp.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per Qcom's UFS host controller HW design, the UFS Tx lane1 clock could be
muxed with Tx lane0 clock, hence keep Tx lane1 clock optional by ignoring
it if it is not provided in device tree. This change also performs some
cleanup to lanes per direction checks when enable/disable lane clocks
just for symmetry.
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We get a warning from 'make headers_check' about a newly introduced usage
of integer types in the scsi/scsi_bsg_ufs.h uapi header:
usr/include/scsi/scsi_bsg_ufs.h:18: found __[us]{8,16,32,64} type without #include <linux/types.h>
Aside from the missing linux/types.h inclusion, I also noticed that it
uses the wrong types: 'u32' is not available at all in user space, and
'uint32_t' depends on the inclusion of a standard header that we should
not include from kernel headers.
Change the all to __u32 and similar types here.
I also note the usage of '__be32' and '__be16' that seems unfortunate for
a user space API. I wonder if it would be better to define the interface
in terms of a CPU-endian structure and convert it in kernel space.
Fixes: e77044c5a842 ("scsi: ufs-bsg: Add support for uic commands in ufs_bsg_request()")
Fixes: df032bf27a41 ("scsi: ufs: Add a bsg endpoint that supports UPIUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
dma_alloc_coherent allocates memory that can be used by the cpu and the
device at the same time, calls to pci_dma_sync_* are not required, and in
fact actively harmful on some architectures like arm.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The dma_map_sg / dma_unmap_sg APIs called from scsi_dma_map /
scsi_dma_unmap already transfer memory ownership to the device or cpu
respectively. Adding additional calls to pci_dma_sync_sg_* will in fact
lead to data corruption if we end up using swiotlb for some reason.
Also remove the now pointless megaraid_mbox_sync_scb function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Straight forward conversion, using an internal list to enable the
driver to pull requests at will.
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Convert from the old request_fn style driver to blk-mq.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This pattern is repeated throughout all the blk-mq conversions.
Provide a basic helper to get it done.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/block/null_blk_main.c: In function 'end_cmd':
drivers/block/null_blk_main.c:609:24: warning:
variable 'q' set but not used [-Wunused-but-set-variable]
It not used any more after commit
e50b1e327aeb ("null_blk: remove legacy IO path")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This reverts commit 0b9871a3a8cc7234c285b5d9bf66cc6712cfee7c.
Causes crashes with qemu, interacts badly with commit commit
6d0a70a284be ("vsprintf: print OF node name using full_name")
etc.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a way of creating maps from user space. The command takes
as parameters most of the attributes of the map creation system
call command. After map is created its pinned to bpffs. This makes
it possible to easily and dynamically (without rebuilding programs)
test various corner cases related to map creation.
Map type names are taken from bpftool's array used for printing.
In general these days we try to make use of libbpf type names, but
there are no map type names in libbpf as of today.
As with most features I add the motivation is testing (offloads) :)
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
John Fastabend says:
====================
The first patch adds support for attaching programs to maps. This is
needed to support sock{map|hash} use from bpftool. Currently, I carry
around custom code to do this so doing it using standard bpftool will
be great.
The second patch adds a compat mode to ignore non-zero entries in
the map def. This allows using bpftool with maps that have a extra
fields that the user knows can be ignored. This is needed to work
correctly with maps being loaded by other tools or directly via
syscalls.
v3: add bash completion and doc updates for --mapcompat
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Multiple map definition structures exist and user may have non-zero
fields in their definition that are not recognized by bpftool and
libbpf. The normal behavior is to then fail loading the map. Although
this is a good default behavior users may still want to load the map
for debugging or other reasons. This patch adds a --mapcompat flag
that can be used to override the default behavior and allow loading
the map even when it has additional non-zero fields.
For now the only user is 'bpftool prog' we can switch over other
subcommands as needed. The library exposes an API that consumes
a flags field now but I kept the original API around also in case
users of the API don't want to expose this. The flags field is an
int in case we need more control over how the API call handles
errors/features/etc in the future.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Sock map/hash introduce support for attaching programs to maps. To
date I have been doing this with custom tooling but this is less than
ideal as we shift to using bpftool as the single CLI for our BPF uses.
This patch adds new sub commands 'attach' and 'detach' to the 'prog'
command to attach programs to maps and then detach them.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Joe Stringer says:
====================
This series includes a couple of fixups for the IPv6 socket lookup
helper, to make the API more consistent (always supply all arguments in
network byte-order) and to allow its use when IPv6 is compiled as a
module.
====================
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF")
mistakenly passed the destination port in network byte-order to the IPv6
TCP/UDP socket lookup functions, which meant that BPF writers would need
to either manually swap the byte-order of this field or otherwise IPv6
sockets could not be located via this helper.
Fix the issue by swapping the byte-order appropriately in the helper.
This also makes the API more consistent with the IPv4 version.
Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF")
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This is a more complete fix than d71019b54bff ("net: core: Fix build
with CONFIG_IPV6=m"), so that IPv6 sockets may be looked up if the IPv6
module is loaded (not just if it's compiled in).
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This documentation was inadvertently released under the CC-BY-SA-4.0
license. It was intended to be released under GPL-2.0 or later.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
|
|
The IDA was declared on the stack instead of statically, so lockdep
triggered a warning that it was improperly initialised.
Reported-by: 0day bot
Tested-by: Rong Chen <rong.a.chen@intel.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
|
|
Devices with compatible="pmbus" field have zero initial page count,
and pmbus_clear_faults() being called before the page count auto-
detection does not actually clear faults because it depends on the
page count. Non-cleared faults in its turn may fail the subsequent
page count auto-detection.
This patch fixes this problem by calling pmbus_clear_fault_page()
for currently set page and calling pmbus_clear_faults() after the
page count was detected.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Bazhenov <bazhenov.dn@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Daniel Borkmann says:
====================
This work adds a generic sk_msg layer and converts both sockmap
and later ktls over to make use of it as a common data structure
for application data (similarly as sk_buff for network packets).
With that in place the sk_msg framework spans accross ULP layer
in the kernel and allows for introspection or filtering of L7
data with the help of BPF programs operating on a common input
context.
In a second step, we enable the latter for ktls which was previously
not possible, meaning, ktls and sk_msg verdict programs were
mutually exclusive in the ULP layer which created challenges for
the orchestrator when trying to apply TCP based policy, for
example. Leveraging the prior consolidation we can finally overcome
this limitation.
Note, there's no change in behavior when ktls is not used in
combination with BPF, and also no change in behavior for stand
alone sockmap. The kselftest suites for ktls, sockmap and ktls
with sockmap combined also runs through successfully. For further
details please see individual patches.
Thanks!
v1 -> v2:
- Removed leftover comment spotted by Alexei
- Improved commit messages, rebase
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a MAINTAINERS entry to the skmsg and related files such that
patches, features, bug reports land with the right Cc.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This adds a --ktls option to test_sockmap in order to enable the
combination of ktls and sockmap to run, which makes for another
batch of 648 test cases for both in combination.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This work adds BPF sk_msg verdict program support to kTLS
allowing BPF and kTLS to be combined together. Previously kTLS
and sk_msg verdict programs were mutually exclusive in the
ULP layer which created challenges for the orchestrator when
trying to apply TCP based policy, for example. To resolve this,
leveraging the work from previous patches that consolidates
the use of sk_msg, we can finally enable BPF sk_msg verdict
programs so they continue to run after the kTLS socket is
created. No change in behavior when kTLS is not used in
combination with BPF, the kselftest suite for kTLS also runs
successfully.
Joint work with Daniel.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Instead of re-implementing poll routine use the poll callback to
trigger read from kTLS, we reuse the stream_memory_read callback
which is simpler and achieves the same. This helps to align sockmap
and kTLS so we can more easily embed BPF in kTLS.
Joint work with Daniel.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Convert kTLS over to make use of sk_msg interface for plaintext and
encrypted scattergather data, so it reuses all the sk_msg helpers
and data structure which later on in a second step enables to glue
this to BPF.
This also allows to remove quite a bit of open coded helpers which
are covered by the sk_msg API. Recent changes in kTLs 80ece6a03aaf
("tls: Remove redundant vars from tls record structure") and
4e6d47206c32 ("tls: Add support for inplace records encryption")
changed the data path handling a bit; while we've kept the latter
optimization intact, we had to undo the former change to better
fit the sk_msg model, hence the sg_aead_in and sg_aead_out have
been brought back and are linked into the sk_msg sgs. Now the kTLS
record contains a msg_plaintext and msg_encrypted sk_msg each.
In the original code, the zerocopy_from_iter() has been used out
of TX but also RX path. For the strparser skb-based RX path,
we've left the zerocopy_from_iter() in decrypt_internal() mostly
untouched, meaning it has been moved into tls_setup_from_iter()
with charging logic removed (as not used from RX). Given RX path
is not based on sk_msg objects, we haven't pursued setting up a
dummy sk_msg to call into sk_msg_zerocopy_from_iter(), but it
could be an option to prusue in a later step.
Joint work with John.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a generic sk_msg layer, and convert current sockmap and later
kTLS over to make use of it. While sk_buff handles network packet
representation from netdevice up to socket, sk_msg handles data
representation from application to socket layer.
This means that sk_msg framework spans across ULP users in the
kernel, and enables features such as introspection or filtering
of data with the help of BPF programs that operate on this data
structure.
Latter becomes in particular useful for kTLS where data encryption
is deferred into the kernel, and as such enabling the kernel to
perform L7 introspection and policy based on BPF for TLS connections
where the record is being encrypted after BPF has run and came to
a verdict. In order to get there, first step is to transform open
coding of scatter-gather list handling into a common core framework
that subsystems can use.
The code itself has been split and refactored into three bigger
pieces: i) the generic sk_msg API which deals with managing the
scatter gather ring, providing helpers for walking and mangling,
transferring application data from user space into it, and preparing
it for BPF pre/post-processing, ii) the plain sock map itself
where sockets can be attached to or detached from; these bits
are independent of i) which can now be used also without sock
map, and iii) the integration with plain TCP as one protocol
to be used for processing L7 application data (later this could
e.g. also be extended to other protocols like UDP). The semantics
are the same with the old sock map code and therefore no change
of user facing behavior or APIs. While pursuing this work it
also helped finding a number of bugs in the old sockmap code
that we've fixed already in earlier commits. The test_sockmap
kselftest suite passes through fine as well.
Joint work with John.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In order to prepare sockmap logic to be used in combination with kTLS
we need to detangle it from ULP, and further split it in later commits
into a generic API.
Joint work with John.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Whenever the ULP data on the socket is mangled, enforce that the
caller has the socket lock held as otherwise things may race with
initialization and cleanup callbacks from ulp ops as both would
mangle internal socket state.
Joint work with John.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add ELAN061C to the ACPI table to support Elan touchpad found in Lenovo
IdeaPad 330-15IGM.
Signed-off-by: Mikhail Nikiforov <jackxviichaos@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
All the other functions that deal with the sd_ail_list run the list
from the tail back to the head, iow, in reverse. We should do the
same while writing revokes, otherwise we might miss removing entries
properly from the list when we hit the limit of how many revokes we
can write at one time (based on block size, which determines how
many block pointers will fit in the revoke block).
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Following warnings are generated when compiled with W=1,
drivers/dma/owl-dma.c:170: warning: Function parameter or member 'cyclic'
not described in 'owl_dma_txd'
drivers/dma/owl-dma.c:198: warning: Function parameter or member 'cfg' not
described in 'owl_dma_vchan'
drivers/dma/owl-dma.c:198: warning: Function parameter or member 'drq' not
described in 'owl_dma_vchan'
drivers/dma/owl-dma.c:225: warning: Function parameter or member 'irq' not
described in 'owl_dma'
Fix this by adding comments for relevant struct members to appear in
kernel-doc.
Fixes: d64e1b3f5cce ("dmaengine: owl: Add Slave and Cyclic mode support for
Actions Semi Owl S900 SoC")
Reported-by: Vinod Koul <vinod.koul@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Using bool is more suitable than int here, and add the comment about the
return_bigger.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The avg_delayed_ref_runtime can be referenced from the transaction
handle.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It can be referenced from the transaction handle.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since trans is only used for referring to delayed_refs, there is no need
to pass it instead of delayed_refs to btrfs_delayed_ref_lock().
No functional change.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since trans is only used for referring to delayed_refs, there is no need
to pass it instead of delayed_refs to btrfs_select_ref_head(). No
functional change.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There is no reason to put this check in (!qgroup)'s else branch because
if qgroup is null, it will goto out directly. So move it out to reduce
indentation level. No functional change.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Commit 581c1760415c ("btrfs: Validate child tree block's level and first
key") has made tree block level check mandatory.
So if tree block level doesn't match, we won't get a valid extent
buffer. The extra WARN_ON() check can be removed completely.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
And add one line comment explaining what we're doing for each loop.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Some qgroup trace events like btrfs_qgroup_release_data() and
btrfs_qgroup_free_delayed_ref() can still be triggered even if qgroup is
not enabled.
This is caused by the lack of qgroup status check before calling some
qgroup functions. Thankfully the functions can handle quota disabled
case well and just do nothing for qgroup disabled case.
This patch will do earlier check before triggering related trace events.
And for enabled <-> disabled race case:
1) For enabled->disabled case
Disable will wipe out all qgroups data including reservation and
excl/rfer. Even if we leak some reservation or numbers, it will
still be cleared, so nothing will go wrong.
2) For disabled -> enabled case
Current btrfs_qgroup_release_data() will use extent_io tree to ensure
we won't underflow reservation. And for delayed_ref we use
head->qgroup_reserved to record the reserved space, so in that case
head->qgroup_reserved should be 0 and we won't underflow.
CC: stable@vger.kernel.org # 4.14+
Reported-by: Chris Murphy <lists@colorremedies.com>
Link: https://lore.kernel.org/linux-btrfs/CAJCQCtQau7DtuUUeycCkZ36qjbKuxNzsgqJ7+sJ6W0dK_NLE3w@mail.gmail.com/
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In a scenario like the following:
mkdir /mnt/A # inode 258
mkdir /mnt/B # inode 259
touch /mnt/B/bar # inode 260
sync
mv /mnt/B/bar /mnt/A/bar
mv -T /mnt/A /mnt/B
fsync /mnt/B/bar
<power fail>
After replaying the log we end up with file bar having 2 hard links, both
with the name 'bar' and one in the directory with inode number 258 and the
other in the directory with inode number 259. Also, we end up with the
directory inode 259 still existing and with the directory inode 258 still
named as 'A', instead of 'B'. In this scenario, file 'bar' should only
have one hard link, located at directory inode 258, the directory inode
259 should not exist anymore and the name for directory inode 258 should
be 'B'.
This incorrect behaviour happens because when attempting to log the old
parents of an inode, we skip any parents that no longer exist. Fix this
by forcing a full commit if an old parent no longer exists.
A test case for fstests follows soon.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When replaying a log which contains a tmpfile (which necessarily has a
link count of 0) we end up calling inc_nlink(), at
fs/btrfs/tree-log.c:replay_one_buffer(), which produces a warning like
the following:
[195191.943673] WARNING: CPU: 0 PID: 6924 at fs/inode.c:342 inc_nlink+0x33/0x40
[195191.943723] CPU: 0 PID: 6924 Comm: mount Not tainted 4.19.0-rc6-btrfs-next-38 #1
[195191.943724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
[195191.943726] RIP: 0010:inc_nlink+0x33/0x40
[195191.943728] RSP: 0018:ffffb96e425e3870 EFLAGS: 00010246
[195191.943730] RAX: 0000000000000000 RBX: ffff8c0d1e6af4f0 RCX: 0000000000000006
[195191.943731] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8c0d1e6af4f0
[195191.943731] RBP: 0000000000000097 R08: 0000000000000001 R09: 0000000000000000
[195191.943732] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb96e425e3a60
[195191.943733] R13: ffff8c0d10cff0c8 R14: ffff8c0d0d515348 R15: ffff8c0d78a1b3f8
[195191.943735] FS: 00007f570ee24480(0000) GS:ffff8c0dfb200000(0000) knlGS:0000000000000000
[195191.943736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[195191.943737] CR2: 00005593286277c8 CR3: 00000000bb8f2006 CR4: 00000000003606f0
[195191.943739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[195191.943740] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[195191.943741] Call Trace:
[195191.943778] replay_one_buffer+0x797/0x7d0 [btrfs]
[195191.943802] walk_up_log_tree+0x1c1/0x250 [btrfs]
[195191.943809] ? rcu_read_lock_sched_held+0x3f/0x70
[195191.943825] walk_log_tree+0xae/0x1d0 [btrfs]
[195191.943840] btrfs_recover_log_trees+0x1d7/0x4d0 [btrfs]
[195191.943856] ? replay_dir_deletes+0x280/0x280 [btrfs]
[195191.943870] open_ctree+0x1c3b/0x22a0 [btrfs]
[195191.943887] btrfs_mount_root+0x6b4/0x800 [btrfs]
[195191.943894] ? rcu_read_lock_sched_held+0x3f/0x70
[195191.943899] ? pcpu_alloc+0x55b/0x7c0
[195191.943906] ? mount_fs+0x3b/0x140
[195191.943908] mount_fs+0x3b/0x140
[195191.943912] ? __init_waitqueue_head+0x36/0x50
[195191.943916] vfs_kern_mount+0x62/0x160
[195191.943927] btrfs_mount+0x134/0x890 [btrfs]
[195191.943936] ? rcu_read_lock_sched_held+0x3f/0x70
[195191.943938] ? pcpu_alloc+0x55b/0x7c0
[195191.943943] ? mount_fs+0x3b/0x140
[195191.943952] ? btrfs_remount+0x570/0x570 [btrfs]
[195191.943954] mount_fs+0x3b/0x140
[195191.943956] ? __init_waitqueue_head+0x36/0x50
[195191.943960] vfs_kern_mount+0x62/0x160
[195191.943963] do_mount+0x1f9/0xd40
[195191.943967] ? memdup_user+0x4b/0x70
[195191.943971] ksys_mount+0x7e/0xd0
[195191.943974] __x64_sys_mount+0x21/0x30
[195191.943977] do_syscall_64+0x60/0x1b0
[195191.943980] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[195191.943983] RIP: 0033:0x7f570e4e524a
[195191.943986] RSP: 002b:00007ffd83589478 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[195191.943989] RAX: ffffffffffffffda RBX: 0000563f335b2060 RCX: 00007f570e4e524a
[195191.943990] RDX: 0000563f335b2240 RSI: 0000563f335b2280 RDI: 0000563f335b2260
[195191.943992] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
[195191.943993] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000563f335b2260
[195191.943994] R13: 0000563f335b2240 R14: 0000000000000000 R15: 00000000ffffffff
[195191.944002] irq event stamp: 8688
[195191.944010] hardirqs last enabled at (8687): [<ffffffff9cb004c3>] console_unlock+0x503/0x640
[195191.944012] hardirqs last disabled at (8688): [<ffffffff9ca037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
[195191.944018] softirqs last enabled at (8638): [<ffffffff9cc0a5d1>] __set_page_dirty_nobuffers+0x101/0x150
[195191.944020] softirqs last disabled at (8634): [<ffffffff9cc26bbe>] wb_wakeup_delayed+0x2e/0x60
[195191.944022] ---[ end trace 5d6e873a9a0b811a ]---
This happens because the inode does not have the flag I_LINKABLE set,
which is a runtime only flag, not meant to be persisted, set when the
inode is created through open(2) if the flag O_EXCL is not passed to it.
Except for the warning, there are no other consequences (like corruptions
or metadata inconsistencies).
Since it's pointless to replay a tmpfile as it would be deleted in a
later phase of the log replay procedure (it has a link count of 0), fix
this by not logging tmpfiles and if a tmpfile is found in a log (created
by a kernel without this change), skip the replay of the inode.
A test case for fstests follows soon.
Fixes: 471d557afed1 ("Btrfs: fix loss of prealloc extents past i_size after fsync log replay")
CC: stable@vger.kernel.org # 4.18+
Reported-by: Martin Steigerwald <martin@lichtvoll.de>
Link: https://lore.kernel.org/linux-btrfs/3666619.NTnn27ZJZE@merkaba/
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We don't need it, rsv->size is set once and never changes throughout
its lifetime, so just use that for the reserve size.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
I ran into an issue where there was some reference being held on an
inode that I couldn't track. This assert wasn't triggered, but it at
least rules out we're doing something stupid.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|