Age | Commit message (Collapse) | Author |
|
It's expected that erofs_workgroup_unfreeze_final() won't
be used in other places. Let's fold it to simplify the code.
Link: https://lore.kernel.org/r/20200729180235.25443-1-hsiangkao@redhat.com
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
|
|
Each ondisk inode should be aligned with inode slot boundary
(32-byte alignment) because of nid calculation formula, so all
compact inodes (32 byte) cannot across page boundary. However,
extended inode is now 64-byte form, which can across page boundary
in principle if the location is specified on purpose, although
it's hard to be generated by mkfs due to the allocation policy
and rarely used by Android use case now mainly for > 4GiB files.
For now, only two fields `i_ctime_nsec` and `i_nlink' couldn't
be read from disk properly and cause out-of-bound memory read
with random value.
Let's fix now.
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
Cc: <stable@vger.kernel.org> # 4.19+
Link: https://lore.kernel.org/r/20200729175801.GA23973@xiangao.remote.csb
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
|
|
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200713130944.34419-1-grandmaster@al2klimov.de
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
|
|
In gfs2_glock_poke, make sure gfs2_holder_uninit is called on the local
glock holder. Without that, we're leaking a glock and a pid reference.
Fixes: 9e8990dea926 ("gfs2: Smarter iopen glock waiting")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Pass a pointer to the existing glock holder from
gfs2_file_{read,write}_iter to gfs2_file_direct_{read,write}
to save some stack space.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Before this patch, three flags were not represented in the glock output.
This patch adds them in:
c - GLF_INODE_CREATING
P - GLF_PENDING_DELETE
x - GLF_FREEING (both f and F are already used)
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
* pm-sleep:
PM: sleep: spread "const char *" correctness
PM: hibernate: fix white space in a few places
freezer: Add unsafe version of freezable_schedule_timeout_interruptible() for NFS
PM: sleep: core: Emit changed uevent on wakeup_sysfs_add/remove
* pm-domains:
PM: domains: Restore comment indentation for generic_pm_domain.child_links
PM: domains: Fix up terminology with parent/child
* powercap:
powercap: Add Power Limit4 support
powercap: idle_inject: Replace play_idle() with play_idle_precise() in comments
powercap: intel_rapl: add support for Sapphire Rapids
* pm-tools:
pm-graph v5.7 - important s2idle fixes
cpupower: Replace HTTP links with HTTPS ones
cpupower: Fix NULL but dereferenced coccicheck errors
cpupower: Fix comparing pointer to 0 coccicheck warns
|
|
The variable mds is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If the ceph_mdsc_init() fails, it will free the mdsc already.
Reported-by: syzbot+b57f46d8d6ea51960b8c@syzkaller.appspotmail.com
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Fix build warnings:
fs/ceph/mdsmap.c: In function ‘ceph_mdsmap_decode’:
fs/ceph/mdsmap.c:192:7: warning: variable ‘info_cv’ set but not used [-Wunused-but-set-variable]
fs/ceph/mdsmap.c:177:7: warning: variable ‘state_seq’ set but not used [-Wunused-but-set-variable]
fs/ceph/mdsmap.c:123:15: warning: variable ‘mdsmap_cv’ set but not used [-Wunused-but-set-variable]
Note that p is increased in ceph_decode_*.
Signed-off-by: Jia Yang <jiayang5@huawei.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Drop duplicated words "down" and "the" in fs/ceph/.
[ idryomov: merge into a single patch ]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Send metric flags to the MDS, indicating what metrics the client
supports. Currently that consists of cap statistics, and read, write and
metadata latencies.
URL: https://tracker.ceph.com/issues/43435
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This will send the caps/read/write/metadata metrics to any available MDS
once per second, which will be the same as the userland client. It will
skip the MDS sessions which don't support the metric collection, as the
MDSs will close socket connections when they get an unknown type
message.
We can disable the metric sending via the disable_send_metrics module
parameter.
[ jlayton: fix up endianness bug in ceph_mdsc_send_metrics() ]
URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If the session is already in closed state, we should skip it.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
[ idryomov: Do the same for the CRUSH paper and replace
ceph.newdream.net with ceph.io. ]
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Remove unnecassary casts in the argument to kfree.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
In aio case, if the completion comes very fast just before the
ceph_read_iter() returns to fs/aio.c, the kiocb will be freed in
the completion callback, then if ceph_read_iter() access again
we will potentially hit the use-after-free bug.
[ jlayton: initialize direct_lock early, and use it everywhere ]
URL: https://tracker.ceph.com/issues/45649
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Make this loop look a bit more sane. Also optimize away the spinlock
release/reacquire if we can't get an inode reference.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Make sure the delayed work stopped before releasing the resources.
cancel_delayed_work_sync() will only guarantee that the work finishes
executing if the work is already in the ->worklist. That means after
the cancel_delayed_work_sync() returns, it will leave the work requeued
if it was rearmed at the end. That can lead to a use after free once the
work struct is freed.
Fix it by flushing the delayed work instead of trying to cancel it, and
ensure that the work doesn't rearm if the mdsc is stopping.
URL: https://tracker.ceph.com/issues/46293
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
...and let the errnos bubble up to the callers.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This will help to reduce using the global mdsc->mutex lock in many
places.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
And remove the unsed mdsc parameter to simplify the code.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
cifs_mount() for DFS mounts is for a long time way too complex to
follow, mostly because it lacks some documentation, does a lot of
operations like resolving DFS roots and links, checking for path
components, perform failover, crap code, etc.
Besides adding some documentation to it, do some cleanup and ensure
that the following is implemented and supported:
* non-DFS mounts
* DFS failover
* DFS root mounts
- tcon and cifs_sb must contain DFS path (NOT including prefix)
- if prefix path, then save it in cifs_sb and it must not be
changed
* DFS link mounts
- tcon and cifs_sb must contain DFS path (including prefix)
- if prefix path, then save it in cifs_sb and it may be changed
* prevent recursion on broken link referrals (MAX_NESTED_LINKS)
* check every path component of the currently resolved
target (including prefix), and chase them accordingly
* make sure that DFS referrals go through newly resolved root
servers
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
For DFS root mounts that contain a prefix path, do not change them
after failover.
E.g., if the user mounts
//srvA/root/dir1
and then lost connection to srvA, it will reconnect to
//srvB/root/dir1
In case of DFS links, which may resolve to different prefix paths
depending on their list of targets, the following must be supported:
- mount //srvA/root/link/bar
- connect to //srvA/share
- set prefix path to "bar"
- lost connection to srvA
- reconnect to next target: //srvB/share/foo
- set new prefix path to "foo/bar"
In cifs_tree_connect(), check the server_type field of the cached DFS
referral to determine whether or not prefix path should be updated.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently if the call dfs_cache_get_tgt_share fails we cannot
fully guarantee that share and prefix are set to NULL and the
next iteration of the loop can end up potentially double freeing
these pointers. Since the semantics of dfs_cache_get_tgt_share
are ambiguous for failure cases with the setting of share and
prefix (currently now and the possibly the future), it seems
prudent to set the pointers to NULL when the objects are
free'd to avoid any double frees.
Addresses-Coverity: ("Double free")
Fixes: 96296c946a2a ("cifs: handle RESP_GET_DFS_REFERRAL.PathConsumed in reconnect")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
|
|
Use PathConsumed field when parsing prefixes of referral paths that
either match a cache entry or are a complete prefix path of an
existing entry.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In case there were no cached DFS referrals in
reconn_setup_dfs_targets(), set cifs_sb to NULL prior to calling
reconn_set_next_dfs_target() so it would not try to access an empty
tgt_list.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This function has nothing to do with *invalidation* but setting up the
next target server from a cached referral.
Rename it to reconn_set_next_dfs_target(). While at it, get rid of
some meaningless checks.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When looking up the DFS cache with a referral path that has more than
two path components, and is a complete prefix of an existing cache
entry, do not request another referral and just return the matched
entry as specified in MS-DFSC 3.2.5.5 Receiving a Root Referral
Request or Link Referral Request.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
They were identical execpt to CIFSTCon() vs. SMB2_tcon().
These are also available via ops->tree_connect().
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Convert cpu_to_be32(be32_to_cpu(E1) + E2) to use be32_add_cpu().
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Drop repeated words in multiple comments.
(be, use, the, See)
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Remove the superfuous break, as there is a 'return' before it.
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
RHBZ 1145308
Some very old server may not support SetPathInfo to adjust the timestamps
of directories. For these servers, try to open the directory and use SetFileInfo.
Minor correction to patch included that was
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Tested-by: Kenneth D'souza <kdsouza@redhat.com>
|
|
If server returns ERRBaduid but does not reset transport connection,
we'll keep sending command with a non-valid UID for the server as long
as transport is healthy, without actually recovering. This have been
observed on the field.
This patch adds ERRBaduid handling so that we set CifsNeedReconnect.
map_and_check_smb_error() can be modified to extend use cases.
Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
Fix build warning by removing unused variable 'server':
fs/cifs/inode.c:1089:26: warning:
variable server set but not used [-Wunused-but-set-variable]
1089 | struct TCP_Server_Info *server;
| ^~~~~~
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
|
When mounting with Kerberos, users have been confused about the
default error returned in scenarios in which either keyutils is
not installed or the user did not properly acquire a krb5 ticket.
Log a warning message in the case that "ENOKEY" is returned
from the get_spnego_key upcall so that users can better understand
why mount failed in those two cases.
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Handling a lease break for the cached root didn't free the
smb2_lease_break_work allocation, resulting in a leak:
unreferenced object 0xffff98383a5af480 (size 128):
comm "cifsd", pid 684, jiffies 4294936606 (age 534.868s)
hex dump (first 32 bytes):
c0 ff ff ff 1f 00 00 00 88 f4 5a 3a 38 98 ff ff ..........Z:8...
88 f4 5a 3a 38 98 ff ff 80 88 d6 8a ff ff ff ff ..Z:8...........
backtrace:
[<0000000068957336>] smb2_is_valid_oplock_break+0x1fa/0x8c0
[<0000000073b70b9e>] cifs_demultiplex_thread+0x73d/0xcc0
[<00000000905fa372>] kthread+0x11c/0x150
[<0000000079378e4e>] ret_from_fork+0x22/0x30
Avoid this leak by only allocating when necessary.
Fixes: a93864d93977 ("cifs: add lease tracking to the cached root fid")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
CC: Stable <stable@vger.kernel.org> # v4.18+
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The log of UAF problem is listed below.
BUG: KASAN: use-after-free in jffs2_rmdir+0xa4/0x1cc [jffs2] at addr c1f165fc
Read of size 4 by task rm/8283
=============================================================================
BUG kmalloc-32 (Tainted: P B O ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in 0xbbbbbbbb age=3054364 cpu=0 pid=0
0xb0bba6ef
jffs2_write_dirent+0x11c/0x9c8 [jffs2]
__slab_alloc.isra.21.constprop.25+0x2c/0x44
__kmalloc+0x1dc/0x370
jffs2_write_dirent+0x11c/0x9c8 [jffs2]
jffs2_do_unlink+0x328/0x5fc [jffs2]
jffs2_rmdir+0x110/0x1cc [jffs2]
vfs_rmdir+0x180/0x268
do_rmdir+0x2cc/0x300
ret_from_syscall+0x0/0x3c
INFO: Freed in 0x205b age=3054364 cpu=0 pid=0
0x2e9173
jffs2_add_fd_to_list+0x138/0x1dc [jffs2]
jffs2_add_fd_to_list+0x138/0x1dc [jffs2]
jffs2_garbage_collect_dirent.isra.3+0x21c/0x288 [jffs2]
jffs2_garbage_collect_live+0x16bc/0x1800 [jffs2]
jffs2_garbage_collect_pass+0x678/0x11d4 [jffs2]
jffs2_garbage_collect_thread+0x1e8/0x3b0 [jffs2]
kthread+0x1a8/0x1b0
ret_from_kernel_thread+0x5c/0x64
Call Trace:
[c17ddd20] [c02452d4] kasan_report.part.0+0x298/0x72c (unreliable)
[c17ddda0] [d2509680] jffs2_rmdir+0xa4/0x1cc [jffs2]
[c17dddd0] [c026da04] vfs_rmdir+0x180/0x268
[c17dde00] [c026f4e4] do_rmdir+0x2cc/0x300
[c17ddf40] [c001a658] ret_from_syscall+0x0/0x3c
The root cause is that we don't get "jffs2_inode_info.sem" before
we scan list "jffs2_inode_info.dents" in function jffs2_rmdir.
This patch add codes to get "jffs2_inode_info.sem" before we scan
"jffs2_inode_info.dents" to slove the UAF problem.
Signed-off-by: Zhe Li <lizhe67@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Thanks for the advice mentioned in the email.
This is my v3 patch for this problem.
Mounting jffs2 on nand flash will get message "failed: I/O error"
with the steps listed below.
1.umount jffs2
2.erase nand flash
3.mount jffs2 on it (this mounting operation will be successful)
4.do chown or chmod to the mount point directory
5.umount jffs2
6.mount jffs2 on nand flash
After step 6, we will get message "mount ... failed: I/O error".
Typical image of this problem is like:
Empty space found from 0x00000000 to 0x008a0000
Inode node at xx, totlen 0x00000044, #ino 1, version 1, isize 0...
The reason for this mounting failure is that at the end of function
jffs2_scan_medium(), jffs2 will check the used_size and some info
of nr_blocks.If conditions are met, it will return -EIO.
The detail is that, in the steps listed above, step 4 will write
jffs2_raw_inode into flash without jffs2_raw_dirent, which will
cause that there are some jffs2_raw_inode but no jffs2_raw_dirent
on flash. This will meet the condition at the end of function
jffs2_scan_medium() and return -EIO if we umount jffs2 and mount it
again.
We notice that jffs2 add the value of c->unchecked_size if we find
an inode node while mounting. And jffs2 will never add the value of
c->unchecked_size in other situations. So this patch add one more
condition about c->unchecked_size of the judgement to fix this problem.
Signed-off-by: Zhe Li <lizhe67@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
There a wrong orphan node deleting in error handling path in
ubifs_jnl_update() and ubifs_jnl_rename(), which may cause
following error msg:
UBIFS error (ubi0:0 pid 1522): ubifs_delete_orphan [ubifs]:
missing orphan ino 65
Fix this by checking whether the node has been operated for
adding to orphan list before being deleted,
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Fixes: 823838a486888cf484e ("ubifs: Add hashes to the tree node cache")
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Drop the repeated word "as" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Instead of creating ubifs file systems with UBIFS_FORMAT_VERSION
by default, add a module parameter ubifs.default_version to allow
the user to specify the desired version. Valid values are 4 to
UBIFS_FORMAT_VERSION (currently 5).
This way, one can for example create a file system with version 4
on kernel 4.19 which can still be mounted rw when downgrading to
kernel 4.9.
Signed-off-by: Martin Kaistra <martin.kaistra@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7248d2396ab85c8652e20c0c39beaba
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nfs_wb_all() calls filemap_write_and_wait(), which uses
filemap_check_errors() to determine the error to return.
filemap_check_errors() only looks at the mapping->flags and will
therefore only return either -ENOSPC or -EIO. To ensure that the
correct error is returned on close(), nfs{,4}_file_flush() should call
filemap_check_wb_err() which looks at the errseq value in
mapping->wb_err without consuming it.
Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with
generic one")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
As recently done with with send/recv, flip the if after
rw_verify_aread() in io_{read,write}() and tabulise left bits left.
This removes mispredicted by a compiler jump on the success/fast path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As soon as we install the file descriptor, we have to assume that it
can get arbitrarily closed. We currently account memory (and note that
we did) after installing the ring fd, which means that it could be a
potential use-after-free condition if the fd is closed right after
being installed, but before we fiddle with the ctx.
In fact, syzbot reported this exact scenario:
BUG: KASAN: use-after-free in io_account_mem fs/io_uring.c:7397 [inline]
BUG: KASAN: use-after-free in io_uring_create fs/io_uring.c:8369 [inline]
BUG: KASAN: use-after-free in io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
Read of size 1 at addr ffff888087a41044 by task syz-executor.5/18145
CPU: 0 PID: 18145 Comm: syz-executor.5 Not tainted 5.8.0-rc7-next-20200729-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x18f/0x20d lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
io_account_mem fs/io_uring.c:7397 [inline]
io_uring_create fs/io_uring.c:8369 [inline]
io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c429
Code: 8d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8f121d0c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
RAX: ffffffffffffffda RBX: 0000000000008540 RCX: 000000000045c429
RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000196
RBP: 000000000078bf38 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000078bf0c
R13: 00007fff86698cff R14: 00007f8f121d19c0 R15: 000000000078bf0c
Move the accounting of the ring used locked memory before we get and
install the ring file descriptor.
Cc: stable@vger.kernel.org
Reported-by: syzbot+9d46305e76057f30c74e@syzkaller.appspotmail.com
Fixes: 309758254ea6 ("io_uring: report pinned memory usage")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a simple helper to set timestamps with a kernel space file name and
switch the early init code over to it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Add a simple helper to stat with a kernel space file name and switch
the early init code over to it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Add a simple helper to mknod with a kernel space file name and switch
the early init code over to it. Remove the now unused ksys_mknod.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|