Age | Commit message | Author |
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's provided by the extent_buffer.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The merge call was factored out to a separate helper but it's a trivial
one and arguably we can opencode it and cache the value.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The value of page_end is only stored to end, no other use.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
All callers pass a valid pointer so we can drop the redundant checks.
The call to submit_one_bio never happened and can be removed.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN(64K)
as unit, however, scrub_extent() sets blocksize as unit, so rebuild
process may be triggered for every block on the same stripe.
A typical example would be that when we're replacing a disappeared disk,
all reads on the disks get -EIO, every block (size is 4K if blocksize is
4K) would go through these steps:
  scrub_handle_errored_block
    scrub_recheck_block  # re-read pages one by one
    scrub_recheck_block  # rebuild by calling raid56_parity_recover()
                         # page by page
Although with raid56 stripe cache most of reads during rebuild can be
avoided, the parity recover calculation(xor or raid6 algorithms) needs to
be done $(BTRFS_STRIPE_LEN / blocksize) times.
This makes it smarter by doing raid56 scrub/replace on stripe length.
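
As a rough illustration of the cost described above, the following sketch counts how many parity-recovery passes one stripe needs for a given recovery unit. The constant mirrors the 64K BTRFS_STRIPE_LEN mentioned in the text; the helper name is hypothetical and this is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Stripe length used by raid56 writes and rebuilds (64K per the text). */
#define STRIPE_LEN (64u * 1024u)

/*
 * Hypothetical helper: number of parity-recovery passes needed to cover one
 * stripe when recovery is issued in units of unit_size bytes.
 */
static uint32_t recover_calls_per_stripe(uint32_t unit_size)
{
	assert(unit_size > 0 && STRIPE_LEN % unit_size == 0);
	return STRIPE_LEN / unit_size;
}
```

With a 4K blocksize the helper yields 16 passes per stripe, while using the stripe length as the unit yields a single pass.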
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Sort mount options by the primary name, followed by the 'no-'
counterpart if it exists. Group the deprecated and debugging options.
Enum and token definitions are synced.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Btrfs has two mount options for SSD optimizations: ssd and ssd_spread.
Presently there is an option to disable all SSD optimizations, but there
isn't an option to disable just ssd_spread.
This patch adds a mount option nossd_spread that disables ssd_spread
only.
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since userspace transactions have been removed we no longer have use
for this field so delete it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Now that the userspace transaction ioctls have been removed,
TRANS_USERSPACE is no longer used hence we can remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Now that the userspace transaction ioctls have been removed, this member
is no longer used, so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Commit 3558d4f88ec8 ("btrfs: Deprecate userspace transaction ioctls")
marked the beginning of the end of userspace transaction. This commit
finishes the job! There are no known users and ceph does not use the
ioctl anymore.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
are created with quota enabled
When multiple pending snapshots referring to the same source subvolume
are executed, enabled quota will cause root item corruption, where root
items are using old bytenr (no backref in extent tree).
This can be triggered by fstests btrfs/152.
The cause is when source subvolume is still dirty, extra commit
(simplified transaction commit) of qgroup_account_snapshot() can skip
dirty roots not recorded in current transaction, making root item of
source subvolume not updated.
Fix it by forcing recording source subvolume in current transaction
before qgroup sub-transaction commit.
Reported-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When performing an unlock on an extent buffer we'd like to order the
decrement of extent_buffer::blocking_writers with waking up any
waiters. In such situations it's sufficient to use smp_mb__after_atomic
rather than the heavy smp_mb. On architectures where atomic operations
are fully ordered (such as x86 or s390) unconditionally executing
a heavyweight smp_mb instruction causes a severe hit to performance
while bringing no improvements in terms of correctness.
The better thing is to use the appropriate smp_mb__after_atomic routine
which will do the correct thing (invoke a full smp_mb or in the case
of ordered atomics insert a compiler barrier). Put another way,
an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
semantics, to a full smp_mb. This ensures that none of the problems
described in the accompanying comment of waitqueue_active occur.
No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Some functions can filter metadata by the generation. Add a define that
will annotate such arguments.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The called function name is self explanatory.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The current implementation of btrfs_page_exists_in_range() gives the
wrong answer if the workingset code has stored a shadow entry in the
page cache. The filemap_range_has_page() function does not have this
problem, and it's shared code, so use it instead.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When modifying the page mapping of a HW memory region
(via a UMR post), post the new values inlined in WQE,
instead of using a data pointer.
This is a micro-optimization: inline UMR WQEs of different
rings scale better in HW.
In addition, this obsoletes a few control flows and helps
delete ~50 LOC.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Do not busy-wait a pending UMR completion. Under high HW load,
busy-waiting a delayed completion would fully utilize the CPU core
and mistakenly indicate a SW bottleneck.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Gather the process of a UMR WQE post into one function,
in preparation for a downstream patch that inlines
the WQE data.
No functional change here.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In Striding RQ, each WQE serves multiple packets
(hence called Multi-Packet WQE, MPWQE).
The size of a MPWQE is constant (currently 256KB).
Upon a ringparam set operation, we calculate the number of
MPWQEs per RQ. For this, first it is needed to determine the
number of packets that can reside within a single MPWQE.
In this patch we use the actual MTU size instead of ETH_DATA_LEN
for this calculation.
This implies that a change in MTU might require a change
in Striding RQ ring size.
In addition, this obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.
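
The dependency of the ring size on the MTU can be sketched as follows. This is a simplified model under stated assumptions, not the driver's calculation: it ignores stride rounding and headroom, treats the per-packet footprint as a single byte count, and all names are hypothetical. Only the 256KB MPWQE size comes from the text.

```c
#include <assert.h>
#include <stdint.h>

#define MPWQE_SIZE (256u * 1024u)   /* constant MPWQE size from the text */

/* Hypothetical: packets that fit in one MPWQE for a given per-packet footprint. */
static uint32_t packets_per_mpwqe(uint32_t frame_bytes)
{
	assert(frame_bytes > 0 && frame_bytes <= MPWQE_SIZE);
	return MPWQE_SIZE / frame_bytes;
}

/* Hypothetical: MPWQEs needed to hold requested_packets, rounded up. */
static uint32_t mpwqes_for_ring(uint32_t requested_packets, uint32_t frame_bytes)
{
	uint32_t per_wqe = packets_per_mpwqe(frame_bytes);

	return (requested_packets + per_wqe - 1) / per_wqe;
}
```

In this model a larger footprint (for example a jumbo MTU) means fewer packets per MPWQE, so the same requested packet count needs more MPWQEs, which is why an MTU change can require a ring size change.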
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Knowing the MTU is required for RQ creation flow.
By our design, channels creation flow is totally isolated
from priv/netdev, and can be completed with access to
channels params and mdev.
Adding the MTU to the channels params helps preserve that.
In addition, we save it in RQ to make its access faster in
datapath checks.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix spelling mistake in debug message text.
"dettaching" -> "detaching"
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
With ConnectX-4, we expect the force teardown to fail if DC was
enabled, so change the message from an error to a warning.
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
1. This function is not used anywhere in mlx5 driver
2. It has a memcpy statement that makes no sense and produces build
warning with gcc8
drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq':
drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
Fixes: 01949d0109ee ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Instead of looking for the EQ of the CQ, remove that redundant code and
use the eq pointer stored in the cq struct.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Because I will be leaving Samsung soon, for reachability update
my reference e-mail to etezian.org.
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Executing the stmfts_power_on() function takes over 2 seconds, which
significantly slows down the boot and resume processes if the driver is
compiled in. Avoid this delay by forcing this driver to be probed
and suspended/resumed asynchronously.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
The primary interface for the touchpad device in Thinkpad L570 is SMBus,
so ALPS overlooked the PS2 interface firmware setting of the TrackStick
and shipped the device with the TrackStick otp bit disabled.
The address 0xD7 contains device number information, so we can identify
the device by checking this value, but to access it we need to enable
Command mode, and then re-enable the device. Devices shipped in Thinkpad
L570 report either 0x0C or 0x1D as device numbers, if we see them we assume
that the devices are DualPoints.
The same issue exists on Dell Latitude 7370.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196929
Fixes: 646580f793 ("Input: ALPS - fix multi-touch decoding on SS4 plus touchpads")
Signed-off-by: Masaki Ota <masaki.ota@jp.alps.com>
Tested-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Tested-by: Jaak Ristioja <jaak@ristioja.ee>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Prashant Bhole says:
====================
These patches fix sg API usage in sockmap. Previously sockmap didn't
use sg_init_table(), which triggered a BUG_ON in the sg API when
CONFIG_DEBUG_SG is enabled.
v1: added sg_init_table() calls wherever needed.
v2:
- Patch1 adds new helper function in sg api. sg_init_marker()
- Patch2 uses sg_init_marker() and sg_init_table() in appropriate places
Background:
While reviewing v1, John Fastabend raised a valid point about the
unnecessary memset in sg_init_table(): sockmap uses an sg table that is
embedded in a struct, and because the enclosing struct is already zeroed
out, the memset in sg_init_table() is redundant.
So Daniel Borkmann suggested to define another static inline function
in scatterlist.h which only initializes sg_magic. Also this function
will be called from sg_init_table. From this suggestion I defined a
function sg_init_marker() which sets sg_magic and calls sg_mark_end()
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized in
sg_init_table() and verified by the sg API while walking the table. We hit
BUG_ON when the magic check fails.
In functions bpf_tcp_sendpage and bpf_tcp_sendmsg, the struct containing
the scatterlist is already zeroed out. So to avoid extra memset, we
use sg_init_marker() to initialize sg_magic.
Fixed following things:
- In bpf_tcp_sendpage: initialize sg using sg_init_marker
- In bpf_tcp_sendmsg: Replace sg_init_table with sg_init_marker
- In bpf_tcp_push: Replace memset with sg_init_table where consumed
sg entry needs to be re-initialized.
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
sg_init_marker initializes sg_magic in the sg table and calls
sg_mark_end() on the last entry of the table. This can be useful to
avoid the memset in sg_init_table() when the scatterlist is already zeroed
out, for example when the scatterlist is embedded inside another struct and
that container struct is zeroed out.
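
The split described above can be modelled in userspace roughly as follows. This is a sketch, not the kernel implementation: the struct, the magic value, and the `mock_` names are stand-ins invented for illustration, and in the kernel the magic field is only present when CONFIG_DEBUG_SG is set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_SG_MAGIC 0x87654321UL   /* stand-in for the debug magic value */

/* Simplified userspace stand-in for struct scatterlist (assumption). */
struct mock_sg {
	unsigned long magic;   /* checked while walking the table in debug builds */
	int is_last;           /* stand-in for the end-of-table marker */
};

/* Set the magic on every entry and mark the last one; no memset. */
static void mock_sg_init_marker(struct mock_sg *sgl, size_t nents)
{
	assert(sgl != NULL && nents > 0);
	for (size_t i = 0; i < nents; i++)
		sgl[i].magic = MOCK_SG_MAGIC;
	sgl[nents - 1].is_last = 1;
}

/* Full initialisation: zero the table first, then reuse the marker helper. */
static void mock_sg_init_table(struct mock_sg *sgl, size_t nents)
{
	memset(sgl, 0, sizeof(*sgl) * nents);
	mock_sg_init_marker(sgl, nents);
}
```

A caller whose table lives inside an already zeroed container can call the marker helper directly and skip the second pass over the memory.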
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The `__u64 time` field of the blk_io_trace struct refers to
the time in nanoseconds, not in microseconds. It is set in
__blk_add_trace, which does the following:
t->time = ktime_to_ns(ktime_get());
ktime_to_ns converts a ktime_t to nanoseconds, not microseconds.
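
A userspace analogue of the unit question, assuming POSIX clock_gettime() as a stand-in for the kernel clock: the timestamp is a nanosecond count, and a consumer that wants microseconds has to divide explicitly. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userspace analogue (assumption): monotonic time as a nanosecond count. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* A consumer that wants microseconds must convert explicitly. */
static uint64_t ns_to_us(uint64_t ns)
{
	return ns / 1000;
}
```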
Signed-off-by: Souvik Banerjee <souvik1997@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When a new client call is requested, an rxrpc_conn_parameters struct object
is passed in with a bunch of parameters set, such as the local endpoint to
use. A pointer to the target peer record is also placed in there by
rxrpc_get_client_conn() - and this is removed if and only if a new
connection object is allocated. Thus it leaks if a new connection object
isn't allocated.
Fix this by putting any peer object attached to the rxrpc_conn_parameters
object in the function that allocated it.
Fixes: 19ffa01c9c45 ("rxrpc: Use structs to hold connection params and protocol info")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add a tracepoint to track reference counting on the rxrpc_peer struct.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
rxrpc_local objects cannot be disposed of until all the connections that
point to them have been RCU'd as a connection object holds refcount on the
local endpoint it is communicating through. Currently, this can cause an
assertion failure to occur when a network namespace is destroyed as there's
no check that the RCU destructors for the connections have been run before
we start trying to destroy local endpoints.
The kernel reports:
rxrpc: AF_RXRPC: Leaked local 0000000036a41bc1 {5}
------------[ cut here ]------------
kernel BUG at ../net/rxrpc/local_object.c:439!
Fix this by keeping a count of the live connections and waiting for it to
go to zero at the end of rxrpc_destroy_all_connections().
Fixes: dee46364ce6f ("rxrpc: Add RCU destruction for connections and calls")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add a tracepoint to track reference counting on the rxrpc_local struct.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
rxrpc_call structs don't pin sockets or network namespaces, but may attempt
to access both after their refcount reaches 0 so that they can detach
themselves from the network namespace. However, there's no guarantee that
the socket still exists at this point (so sock_net(&call->socket->sk) may
be invalid) and the namespace may have gone away if the call isn't pinning
a peer.
Fix this by (a) carrying a net pointer in the rxrpc_call struct and (b)
waiting for all calls to be destroyed when the network namespace goes away.
This was detected by checker:
net/rxrpc/call_object.c:634:57: warning: incorrect type in argument 1 (different address spaces)
net/rxrpc/call_object.c:634:57: expected struct sock const *sk
net/rxrpc/call_object.c:634:57: got struct sock [noderef] <asn:4>*<noident>
Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix various issues detected by checker.
Errors:
(*) rxrpc_discard_prealloc() should be using rcu_assign_pointer to set
call->socket.
Warnings:
(*) rxrpc_service_connection_reaper() should be passing NULL rather than 0 to
trace_rxrpc_conn() as the where argument.
(*) rxrpc_disconnect_client_call() should get its net pointer via the
call->conn rather than call->sock to avoid a warning about accessing
an RCU pointer without protection.
(*) Proc seq start/stop functions need annotation as they pass locks
between the functions.
False positives:
(*) Checker doesn't correctly handle the seq-retry lock context balance in
rxrpc_find_service_conn_rcu().
(*) Checker thinks execution may proceed past the BUG() in
rxrpc_publish_service_conn().
(*) Variable length array warnings from SKCIPHER_REQUEST_ON_STACK() in
rxkad.c.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The rxrpc_security_methods and rxrpc_security_sem user has been removed
in 648af7fca159 ("rxrpc: Absorb the rxkad security module"). This was
noticed by kbuild test robot for the -RT tree but is also true for !RT.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Commit a158bdd3 ("rxrpc: Fix call timeouts") reworked the time calculation
for the next resend event. For this calculation, "oldest" will be before
"now", so ktime_sub(oldest, now) will yield a negative value. When passed
to nsecs_to_jiffies which expects an unsigned value, the end result will be
a very large value, and a resend event scheduled far into the future. This
could cause calls to stall if some packets were lost.
Fix by ordering the arguments to ktime_sub correctly.
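
The effect of the swapped arguments can be reproduced with plain integers. The sketch below models timestamps as signed nanosecond counts and uses a made-up tick conversion (1 tick = 1 ms); the `model_` names are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model: timestamps as signed nanosecond counts (assumption). */
typedef int64_t model_ktime_t;

static model_ktime_t model_ktime_sub(model_ktime_t lhs, model_ktime_t rhs)
{
	return lhs - rhs;
}

/* Hypothetical stand-in for a conversion that takes an unsigned count. */
static uint64_t model_nsecs_to_ticks(uint64_t nsecs)
{
	return nsecs / 1000000u;   /* assume 1 tick = 1 ms for illustration */
}

/* Correct ordering: later time first, so the difference is non-negative. */
static uint64_t elapsed_ticks(model_ktime_t oldest, model_ktime_t now)
{
	assert(now >= oldest);
	return model_nsecs_to_ticks((uint64_t)model_ktime_sub(now, oldest));
}
```

Passing `model_ktime_sub(oldest, now)` instead produces a negative value that becomes an enormous unsigned count once converted, which corresponds to the far-future resend event described above.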
Fixes: a158bdd3247b ("rxrpc: Fix call timeouts")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
If a call-level abort is received for the previous call to complete on a
connection channel, then that abort is queued for the connection processor
to handle. Unfortunately, the connection processor then assumes without
checking that the abort is connection-level (ie. callNumber is 0) and
distributes it over all active calls on that connection, thereby
incorrectly aborting them.
Fix this by discarding aborts aimed at a completed call.
Further, discard all packets aimed at a call that's complete if there's
currently an active call on a channel, since the DATA packets associated
with the new call automatically terminate the old call.
Fixes: 18bfeba50dfd ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
rxrpc calls have a ring of packets that are awaiting ACK or retransmission
and a parallel ring of annotations that tracks the state of those packets.
If the initial transmission of a packet on the underlying UDP socket fails
then the packet annotation is marked for resend - but the setting of this
mark accidentally erases the last-packet mark also stored in the same
annotation slot. If this happens, a call won't switch out of the Tx phase
when all the packets have been transmitted.
Fix this by retaining the last-packet mark and only altering the packet
state.
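
A minimal sketch of the two update styles, assuming a one-byte annotation slot whose low bits hold the packet state and whose next bit flags the last packet. The bit layout and names are hypothetical and chosen only to illustrate the masking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low bits hold the state, one bit flags the last packet. */
#define ANNO_STATE_MASK 0x03u
#define ANNO_UNACK      0x00u
#define ANNO_RETRANS    0x02u
#define ANNO_LAST       0x04u

/* Buggy update: overwrites the whole slot and loses ANNO_LAST. */
static uint8_t set_state_clobbering(uint8_t anno, uint8_t state)
{
	(void)anno;
	return state;
}

/* Fixed update: replace only the state bits, keep every other flag. */
static uint8_t set_state_preserving(uint8_t anno, uint8_t state)
{
	assert((state & ~ANNO_STATE_MASK) == 0);
	return (uint8_t)((anno & ~ANNO_STATE_MASK) | state);
}
```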
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The rxrpc_reduce_call_timer() function should be passed the 'current time'
in jiffies, not the current ktime time. It's confusing in rxrpc_resend
because that has to deal with both. Pass the correct current time in.
Note that this only affects the trace produced and not the functioning of
the code.
Fixes: a158bdd3247b ("rxrpc: Fix call timeouts")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix the firewall route keepalive part of AF_RXRPC, which currently
functions incorrectly by replying to VERSION REPLY packets from the server
with VERSION REQUEST packets.
Instead, send VERSION REPLY packets to the peers of service connections to
act as keep-alives 20s after the latest packet was transmitted to that
peer.
Also, just discard VERSION REPLY packets rather than replying to them.
Signed-off-by: David Howells <dhowells@redhat.com>
|