summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-31btrfs: remove unused parameters from extent_submit_bio_start_tDavid Sterba
Remove parameters not used by any of the callbacks. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: separate types for submit_bio_start and submit_bio_doneDavid Sterba
The callbacks make use of different parameters that are passed to the other type unnecessarily. This patch adds separate types for each and the unused parameters will be removed. The type extent_submit_bio_hook_t keeps all parameters and can be used where the start/done types are not appropriate. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: kill tree_mod_log_set_root_pointer helperDavid Sterba
A useless wrapper around tree_mod_log_insert_root that hides missing error handling. Move it to the callers. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: kill tree_mod_log_set_node_key helperDavid Sterba
A trivial wrapper that can be simply opencoded and makes the GFP allocation request more visible. The error handling is now moved to the callers. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: kill trivial wrapper tree_mod_log_eb_moveDavid Sterba
The wrapper is effectively an alias for tree_mod_log_insert_move but also hides the missing error handling. To make that more visible, lift the BUG_ON to the callers. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: remove trivial locking wrappers of tree mod logDavid Sterba
The wrappers are trivial and do not bring any extra value on top of the plain locking primitives. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from __tree_mod_log_oldest_rootDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: embed tree_mod_move structure to tree_mod_elemDavid Sterba
The tree_mod_move is not used anywhere and can be embedded as anonymous structure. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop unused fs_info parameter from tree_mod_log_eb_moveDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from tree_mod_log_free_ebDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from tree_mod_log_free_ebDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from tree_mod_log_insert_keyDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from tree_mod_log_insert_moveDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: drop fs_info parameter from tree_mod_log_set_node_keyDavid Sterba
It's provided by the extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: document more parameters of submit_extent_pageDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: cleanup merging conditions in submit_extent_pageDavid Sterba
The merge call was factored out to a separate helper but it's a trivial one and arguably we can opencode it and cache the value. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: remove redundant variable in __do_readpageDavid Sterba
The value of page_end is only stored to end, no other use. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: assume that bio_ret is always valid in submit_extent_pageDavid Sterba
All callers pass a valid pointer so we can drop the redundant checks. The call to submit_one_bio never happend and can be removed. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31Btrfs: scrub: batch rebuild for raid56Liu Bo
In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN(64K) as unit, however, scrub_extent() sets blocksize as unit, so rebuild process may be triggered on every block on a same stripe. A typical example would be that when we're replacing a disappeared disk, all reads on the disks get -EIO, every block (size is 4K if blocksize is 4K) would go thru these, scrub_handle_errored_block scrub_recheck_block # re-read pages one by one scrub_recheck_block # rebuild by calling raid56_parity_recover() page by page Although with raid56 stripe cache most of reads during rebuild can be avoided, the parity recover calculation(xor or raid6 algorithms) needs to be done $(BTRFS_STRIPE_LEN / blocksize) times. This makes it smarter by doing raid56 scrub/replace on stripe length. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: sort and group mount option definitionsDavid Sterba
Sort mount options by the primary name, followed by the 'no-' counterpart if it exists. Group the deprecated and debugging options. Enum and token defintions are synced. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Add nossd_spread mount optionHoward McLauchlan
Btrfs has two mount options for SSD optimizations: ssd and ssd_spread. Presently there is an option to disable all SSD optimizations, but there isn't an option to disable just ssd_spread. This patch adds a mount option nossd_spread that disables ssd_spread only. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Remove btrfs_fs_info::open_ioctl_transNikolay Borisov
Since userspace transaction have been removed we no longer have use for this field so delete it. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Remove code referencing unused TRANS_USERSPACENikolay Borisov
Now that the userspace transaction ioctls have been removed, TRANS_USERSPACE is no longer used hence we can remove it. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Remove btrfs_file_private::transNikolay Borisov
Now that the userspace transaction IOCTL have been removed, this member is no longer used so just remove it Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Remove userspace transaction ioctlsNikolay Borisov
Commit 3558d4f88ec8 ("btrfs: Deprecate userspace transaction ioctls") marked the beginning of the end of userspace transaction. This commit finishes the job! There are no known users and ceph does not use the ioctl anymore. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Sage Weil <sage@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: qgroup: Fix root item corruption when multiple same source snapshots ↵Qu Wenruo
are created with quota enabled When multiple pending snapshots referring to the same source subvolume are executed, enabled quota will cause root item corruption, where root items are using old bytenr (no backref in extent tree). This can be triggered by fstests btrfs/152. The cause is when source subvolume is still dirty, extra commit (simplied transaction commit) of qgroup_account_snapshot() can skip dirty roots not recorded in current transaction, making root item of source subvolume not updated. Fix it by forcing recording source subvolume in current transaction before qgroup sub-transaction commit. Reported-by: Justin Maggard <jmaggard@netgear.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Relax memory barrier in btrfs_tree_unlockNikolay Borisov
When performing an unlock on an extent buffer we'd like to order the decrement of extent_buffer::blocking_writers with waking up any waiters. In such situations it's sufficient to use smp_mb__after_atomic rather than the heavy smp_mb. On architectures where atomic operations are fully ordered (such as x86 or s390) unconditionally executing a heavyweight smp_mb instruction causes a severe hit to performance while bringin no improvements in terms of correctness. The better thing is to use the appropriate smp_mb__after_atomic routine which will do the correct thing (invoke a full smp_mb or in the case of ordered atomics insert a compiler barrier). Put another way, an RMW atomic op + smp_load__after_atomic equals, in terms of semantics, to a full smp_mb. This ensures that none of the problems described in the accompanying comment of waitqueue_active occur. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: add define for oldest generationAnand Jain
Some functions can filter metadata by the generation. Add a define that will annotate such arguments. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: open code trivial helper btrfs_page_exists_in_rangeDavid Sterba
The called function name is self explanatory. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: Use filemap_range_has_page()Matthew Wilcox
The current implementation of btrfs_page_exists_in_range() gives the wrong answer if the workingset code has stored a shadow entry in the page cache. The filemap_range_has_page() function does not have this problem, and it's shared code, so use it instead. eigned-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-30net/mlx5e: Use inline MTTs in UMR WQEsTariq Toukan
When modifying the page mapping of a HW memory region (via a UMR post), post the new values inlined in WQE, instead of using a data pointer. This is a micro-optimization, inline UMR WQEs of different rings scale better in HW. In addition, this obsoletes a few control flows and helps delete ~50 LOC. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: Do not busy-wait for UMR completion in Striding RQTariq Toukan
Do not busy-wait a pending UMR completion. Under high HW load, busy-waiting a delayed completion would fully utilize the CPU core and mistakenly indicate a SW bottleneck. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: Code movements in RX UMR WQE postTariq Toukan
Gets the process of a UMR WQE post in one function, in preparation for a downstream patch that inlines the WQE data. No functional change here. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: Derive Striding RQ size from MTUTariq Toukan
In Striding RQ, each WQE serves multiple packets (hence called Multi-Packet WQE, MPWQE). The size of a MPWQE is constant (currently 256KB). Upon a ringparam set operation, we calculate the number of MPWQEs per RQ. For this, first it is needed to determine the number of packets that can reside within a single MPWQE. In this patch we use the actual MTU size instead of ETH_DATA_LEN for this calculation. This implies that a change in MTU might require a change in Striding RQ ring size. In addition, this obsoletes some WQEs-to-packets translation functions and helps delete ~60 LOC. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: Save MTU in channels paramsTariq Toukan
Knowing the MTU is required for RQ creation flow. By our design, channels creation flow is totally isolated from priv/netdev, and can be completed with access to channels params and mdev. Adding the MTU to the channels params helps preserving that. In addition, we save it in RQ to make its access faster in datapath checks. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: IPoIB, Fix spelling mistakeTalat Batheesh
Fix spelling mistake in debug message text. "dettaching" -> "detaching" Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5: Change teardown with force mode failure message to warningAlaa Hleihel
With ConnectX-4, we expect the force teardown to fail in case that DC was enabled, therefore change the message from error to warning. Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5: Eliminate query xsrq dead codeSaeed Mahameed
1. This function is not used anywhere in mlx5 driver 2. It has a memcpy statement that makes no sense and produces build warning with gcc8 drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq': drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict] Fixes: 01949d0109ee ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30net/mlx5e: Use eq ptr from cqSaeed Mahameed
Instead of looking for the EQ of the CQ, remove that redundant code and use the eq pointer stored in the cq struct. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-30Merge branch 'bpf-sockmap-sg-api-fixes'Daniel Borkmann
Prashant Bhole says: ==================== These patches fix sg api usage in sockmap. Previously sockmap didn't use sg_init_table(), which caused hitting BUG_ON in sg api, when CONFIG_DEBUG_SG is enabled v1: added sg_init_table() calls wherever needed. v2: - Patch1 adds new helper function in sg api. sg_init_marker() - Patch2 sg_init_marker() and sg_init_table() in appropriate places Backgroud: While reviewing v1, John Fastabend raised a valid point about unnecessary memset in sg_init_table() because sockmap uses sg table which embedded in a struct. As enclosing struct is zeroed out, there is unnecessary memset in sg_init_table. So Daniel Borkmann suggested to define another static inline function in scatterlist.h which only initializes sg_magic. Also this function will be called from sg_init_table. From this suggestion I defined a function sg_init_marker() which sets sg_magic and calls sg_mark_end() ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-30bpf: sockmap: initialize sg table entries properlyPrashant Bhole
When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized in sg_init_table() and it is verified in sg api while navigating. We hit BUG_ON when magic check is failed. In functions sg_tcp_sendpage and sg_tcp_sendmsg, the struct containing the scatterlist is already zeroed out. So to avoid extra memset, we use sg_init_marker() to initialize sg_magic. Fixed following things: - In bpf_tcp_sendpage: initialize sg using sg_init_marker - In bpf_tcp_sendmsg: Replace sg_init_table with sg_init_marker - In bpf_tcp_push: Replace memset with sg_init_table where consumed sg entry needs to be re-initialized. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-30lib/scatterlist: add sg_init_marker() helperPrashant Bhole
sg_init_marker initializes sg_magic in the sg table and calls sg_mark_end() on the last entry of the table. This can be useful to avoid memset in sg_init_table() when scatterlist is already zeroed out For example: when scatterlist is embedded inside other struct and that container struct is zeroed out Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-30rxrpc: Fix leak of rxrpc_peer objectsDavid Howells
When a new client call is requested, an rxrpc_conn_parameters struct object is passed in with a bunch of parameters set, such as the local endpoint to use. A pointer to the target peer record is also placed in there by rxrpc_get_client_conn() - and this is removed if and only if a new connection object is allocated. Thus it leaks if a new connection object isn't allocated. Fix this by putting any peer object attached to the rxrpc_conn_parameters object in the function that allocated it. Fixes: 19ffa01c9c45 ("rxrpc: Use structs to hold connection params and protocol info") Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Add a tracepoint to track rxrpc_peer refcountingDavid Howells
Add a tracepoint to track reference counting on the rxrpc_peer struct. Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Fix apparent leak of rxrpc_local objectsDavid Howells
rxrpc_local objects cannot be disposed of until all the connections that point to them have been RCU'd as a connection object holds refcount on the local endpoint it is communicating through. Currently, this can cause an assertion failure to occur when a network namespace is destroyed as there's no check that the RCU destructors for the connections have been run before we start trying to destroy local endpoints. The kernel reports: rxrpc: AF_RXRPC: Leaked local 0000000036a41bc1 {5} ------------[ cut here ]------------ kernel BUG at ../net/rxrpc/local_object.c:439! Fix this by keeping a count of the live connections and waiting for it to go to zero at the end of rxrpc_destroy_all_connections(). Fixes: dee46364ce6f ("rxrpc: Add RCU destruction for connections and calls") Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Add a tracepoint to track rxrpc_local refcountingDavid Howells
Add a tracepoint to track reference counting on the rxrpc_local struct. Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Fix potential call vs socket/net destruction raceDavid Howells
rxrpc_call structs don't pin sockets or network namespaces, but may attempt to access both after their refcount reaches 0 so that they can detach themselves from the network namespace. However, there's no guarantee that the socket still exists at this point (so sock_net(&call->socket->sk) may be invalid) and the namespace may have gone away if the call isn't pinning a peer. Fix this by (a) carrying a net pointer in the rxrpc_call struct and (b) waiting for all calls to be destroyed when the network namespace goes away. This was detected by checker: net/rxrpc/call_object.c:634:57: warning: incorrect type in argument 1 (different address spaces) net/rxrpc/call_object.c:634:57: expected struct sock const *sk net/rxrpc/call_object.c:634:57: got struct sock [noderef] <asn:4>*<noident> Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Fix checker warnings and errorsDavid Howells
Fix various issues detected by checker. Errors: (*) rxrpc_discard_prealloc() should be using rcu_assign_pointer to set call->socket. Warnings: (*) rxrpc_service_connection_reaper() should be passing NULL rather than 0 to trace_rxrpc_conn() as the where argument. (*) rxrpc_disconnect_client_call() should get its net pointer via the call->conn rather than call->sock to avoid a warning about accessing an RCU pointer without protection. (*) Proc seq start/stop functions need annotation as they pass locks between the functions. False positives: (*) Checker doesn't correctly handle of seq-retry lock context balance in rxrpc_find_service_conn_rcu(). (*) Checker thinks execution may proceed past the BUG() in rxrpc_publish_service_conn(). (*) Variable length array warnings from SKCIPHER_REQUEST_ON_STACK() in rxkad.c. Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: remove unused static variablesSebastian Andrzej Siewior
The rxrpc_security_methods and rxrpc_security_sem user has been removed in 648af7fca159 ("rxrpc: Absorb the rxkad security module"). This was noticed by kbuild test robot for the -RT tree but is also true for !RT. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-30rxrpc: Fix resend event time calculationMarc Dionne
Commit a158bdd3 ("rxrpc: Fix call timeouts") reworked the time calculation for the next resend event. For this calculation, "oldest" will be before "now", so ktime_sub(oldest, now) will yield a negative value. When passed to nsecs_to_jiffies which expects an unsigned value, the end result will be a very large value, and a resend event scheduled far into the future. This could cause calls to stall if some packets were lost. Fix by ordering the arguments to ktime_sub correctly. Fixes: a158bdd3247b ("rxrpc: Fix call timeouts") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>