summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-04-02ceph: adding protection for showing cap reservation infoChengguang Xu
Adding spinlock protection during getting cap reservation ralated fields so that the numbers match below BUG_ON condition in the code. BUG_ON(mdsc->caps_total_count != mdsc->caps_use_count + mdsc->caps_reserve_count + mdsc->caps_avail_count); Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: adding missing message types to ceph_msg_type_name()Chengguang Xu
Some of message types are missing in ceph_msg_type_name(), so just adding them for better understanding of output information. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: get the latest osdmap when using an existing clientIlya Dryomov
Currently we request the latest osdmap only if ceph_pg_poolid_by_name() fails with -ENOENT. This is effective with newly created pools, but we also want to avoid attempting to map from pools that were recently deleted and report "pool does not exist" instead. (Such an attempt eventually fails in the OSD client after map check code kicks in, but the error message is confusing.) Request the latest osdmap unconditionally after bumping a ref on an existing client in rbd_client_find(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: move rbd_get_client() below rbd_put_client()Ilya Dryomov
... to avoid a forward declaration in the next commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove redundant declaration of rbd_spec_put()Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02ceph: use seq_show_option for string type optionsChengguang Xu
Using seq_show_option to replace seq_printf for string type options. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: fix misjudgement of maximum monitor numberChengguang Xu
num_mon should allow up to CEPH_MAX_MON in ceph_monmap_decode(). Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: change permission for readonly debugfs entriesChengguang Xu
Remove write permission for debugfs entries which only have readonly function. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02ceph: keep consistent semantic in fscache related option combinationChengguang Xu
When specifying multiple fscache related options, the result isn't always the same as option order, this fix will keep strict consistent meaning by order. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02ceph: add newline to end of debug message formatChengguang Xu
Some of dout format do not include newline in the end, fix for the files which are in fs/ceph and net/ceph directories, and changing printk to dout for printing debug info in super.c Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: allow "fancy" stripingIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jason Dillaman <dillaman@redhat.com>
2018-04-02rbd: introduce OWN_BVECS data typeIlya Dryomov
If the layout is "fancy", we need to be able to rearrange the provided bio_vecs in stripe unit chunks to make it possible for the messenger to read/write directly from/to the provided data buffer, without employing a temporary data buffer for assembling the result. Higher level bio_vec arrays are generally immutable, so this requires copying into a private array. Only the bio_vecs themselves are shuffled around, not the actual data. OWN_BVECS doesn't own any pages. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove rbd_parent_request_{create,destroy}()Ilya Dryomov
rbd_parent_request_create() takes a ref on obj_req for child_img_req. There is no point in doing that because child_img_req is created on behalf of obj_req -- obj_req is the initiator and can't be completed before child_img_req. Open-code the rest of rbd_parent_request_create() and remove it along with rbd_parent_request_destroy(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: get rid of img_req->{offset,length}Ilya Dryomov
These are set, but no longer used. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove rbd_img_request_fill() and helpersIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: switch to common striping frameworkIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: create+truncate for whole-object layered discardsIlya Dryomov
A whole-object layered discard is implemented as a truncate rather than a delete: a dummy object is needed to prevent the CoW machinery from kicking in. However, a truncate on a non-existent object is a no-op. If the object doesn't exist in HEAD, a discard request is effectively ignored, which violates our "discard zeroes data" promise and breaks REQ_OP_WRITE_ZEROES implementation. A non-exclusive create on an existing object is also a no-op, so the fix is to do a compound create+truncate instead. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: move to obj_req->img_extentsIlya Dryomov
In preparation for rbd "fancy" striping, replace obj_req->img_offset with obj_req->img_extents. A single starting offset isn't sufficient because we want only one OSD request per object and will merge adjacent object extents in ceph_file_to_extents(). The final object extent may map into multiple different byte ranges in the image. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: incorporate ceph_object_extentIlya Dryomov
obj_req->object_no -> obj_req->ex.oe_objno obj_req->offset -> obj_req->ex.oe_off obj_req->length -> obj_req->ex.oe_len ... and use ex for linking object requests to image requests. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: move ceph_calc_file_object_mapping() to striper.cIlya Dryomov
ceph_calc_file_object_mapping() has nothing to do with osdmaps. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: striping framework implementationIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: store data_type in img_req instead of obj_reqIlya Dryomov
All object requests are associated with an image request now -- avoid duplicating the same info in each object request. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove obj_req->flags fieldIlya Dryomov
There are no standalone (!IMG_DATA) object requests anymore. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove old request completion codeIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: new request completion codeIlya Dryomov
Do away with partial request completions and all the associated complexity. Individual object requests no longer need to be completed in order -- when the last one becomes ready, we complete the entire higher level request all at once. This also wraps up the conversion to a state machine model and eliminates the recursion described in commit 6d69bb536bac ("rbd: prevent kernel stack blow up on rbd map"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: update rbd_img_request_submit() signatureIlya Dryomov
It should be void now. Also, object requests are unlinked only in image request destructor, which can't run before rbd_img_request_put(), so no need for _safe. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: add img_req->op_type fieldIlya Dryomov
Store op_type in its own field instead of packing it into flags. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: simplify rbd_osd_req_create()Ilya Dryomov
No need to pass rbd_dev and op_type to rbd_osd_req_create(): there are no standalone (!IMG_DATA) object requests anymore and osd_req->r_flags can be set in rbd_osd_req_format_{read,write}(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: remove old request handling codeIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: new request handling codeIlya Dryomov
The notable changes are: - instead of explicitly stat'ing the object to see if it exists before issuing the write, send the write optimistically along with the stat in a single OSD request - zero copyup optimization - all object requests are associated with an image request and have a valid ->img_request pointer; there are no standalone (!IMG_DATA) object requests anymore - code is structured as a state machine (vs a bunch of callbacks with implicit state) Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: handle zero-length data itemsIlya Dryomov
rbd needs this for null copyups -- if copyup data is all zeroes, we want to save some I/O and network bandwidth. See rbd_obj_issue_copyup() in the next commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02rbd: move from raw pages to bvec data descriptorsIlya Dryomov
In preparation for rbd "fancy" striping which requires bio_vec arrays, wire up BVECS data type and kill off PAGES data type. There is nothing wrong with using page vectors for copyup requests -- it's just less iterator boilerplate code to write for the new striping framework. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02libceph: introduce BVECS data typeIlya Dryomov
In preparation for rbd "fancy" striping, introduce ceph_bvec_iter for working with bio_vec array data buffers. The wrappers are trivial, but make it look similar to ceph_bio_iter. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: get rid of img_req->copyup_pagesIlya Dryomov
The initiating object request is the proper owner -- save a bit of space. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02rbd: don't (ab)use obj_req->pages for stat requestsIlya Dryomov
obj_req->pages is for provided data buffers. stat requests are internal and should be NODATA. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02rbd: remove bio cloning helpersIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02libceph, rbd: new bio handling code (aka don't clone bios)Ilya Dryomov
The reason we clone bios is to be able to give each object request (and consequently each ceph_osd_data/ceph_msg_data item) its own pointer to a (list of) bio(s). The messenger then initializes its cursor with cloned bio's ->bi_iter, so it knows where to start reading from/writing to. That's all the cloned bios are used for: to determine each object request's starting position in the provided data buffer. Introduce ceph_bio_iter to do exactly that -- store position within bio list (i.e. pointer to bio) + position within that bio (i.e. bvec_iter). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02rbd: start enums at 1 instead of 0Ilya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: change ceph_calc_file_object_mapping() signatureIlya Dryomov
- make it void - xlen (object extent length) out parameter should be u32 because only a single stripe unit is mapped at a time Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02libceph: eliminate overflows in ceph_calc_file_object_mapping()Ilya Dryomov
bl, stripeno and objsetno should be u64 -- otherwise large enough files get corrupted. How large depends on file layout: - 4M-objects layout (default): any file over 16P - 64K-objects layout (smallest possible object size): any file over 512T Only CephFS is affected, rbd doesn't use ceph_calc_file_object_mapping() yet. Fortunately, CephFS has a max_file_size configurable, the default for which is way below both of the above numbers. Reimplement the logic from scratch with no layout validation -- it's done on the MDS side. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02rbd: set max_segment_size to UINT_MAXIlya Dryomov
Commit 21acdf45f495 ("rbd: set max_segments to USHRT_MAX") removed the limit on max_segments. Remove the limit on max_segment_size as well. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-01ext4: force revalidation of directory pointer after seekdir(2)Theodore Ts'o
A malicious user could force the directory pointer to be in an invalid spot by using seekdir(2). Use the mechanism we already have to notice if the directory has changed since the last time we called ext4_readdir() to force a revalidation of the pointer. Reported-by: syzbot+1236ce66f79263e8a862@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-04-01Merge branch 'net-bgmac-Couple-of-sparse-warnings'David S. Miller
Florian Fainelli says: ==================== net: bgmac: Couple of sparse warnings This patch series fixes a couple of warnings reported by sparse, should not cause any functional problems since bgmac is typically used on LE platforms anyway. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()Florian Fainelli
bgmac_dma_tx_ring_free() assigns the ctl1 word which is a litle endian 32-bit word without using proper accessors, fix this, and because a length cannot be negative, use unsigned int while at it. Fixes: 9cde94506eac ("bgmac: implement scatter/gather support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01net: bgmac: Correctly annotate register spaceFlorian Fainelli
All the members: base, idm_base and nicpm_base should be annotated with __iomem since they are pointers to register space. This fixes a bunch of sparse reported warnings. Fixes: f6a95a24957a ("net: ethernet: bgmac: Add platform device support") Fixes: dd5c5d037f5e ("net: ethernet: bgmac: add NS2 support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01cifs: smbd: disconnect transport on RDMA errorsLong Li
On RDMA errors, transport should disconnect the RDMA CM connection. This will notify the upper layer, and it will attempt transport reconnect. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
2018-04-01cifs: smbd: avoid reconnect lockupLong Li
During transport reconnect, other processes may have registered memory and blocked on transport. This creates a deadlock situation because the transport resources can't be freed, and reconnect is blocked. Fix this by returning to upper layer on timeout. Before returning, transport status is set to reconnecting so other processes will release memory registration resources. Upper layer will retry the reconnect. This is not in fast I/O path so setting the timeout to 5 seconds. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
2018-04-01Don't log confusing message on reconnect by defaultSteve French
Change the following message (which can occur on reconnect) from a warning to an FYI message. It is confusing to users. [58360.523634] CIFS VFS: Free previous auth_key.response = 00000000a91cdc84 By default this message won't show up on reconnect unless the user bumps up the log level to include FYI messages. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-01Don't log expected error on DFS referral requestSteve French
STATUS_FS_DRIVER_REQUIRED is expected when DFS is not turned on on the server. Do not log it on DFS referral response. It clutters the dmesg log unnecessarily at mount time. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2018-04-01fs: cifs: Replace _free_xid call in cifs_root_iget functionPhillip Potter
Modify end of cifs_root_iget function in fs/cifs/inode.c to call free_xid(xid) instead of _free_xid(xid), thereby allowing debug notification of this action when enabled. Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: Steve French <smfrench@gmail.com>