Age | Commit message (Collapse) | Author |
|
Change writeback path to create just one io_end structure for the
extent to which we submit IO and share it among bios writing that
extent. This prevents needless splitting and joining of unwritten
extents when they cannot be submitted as a single bio.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
|
|
So far ext4_bio_write_page() attached all the pages to ext4_io_end
structure. This makes that structure pretty heavy (1 KB for pointers
+ 16 bytes per page attached to the bio). Also later we would like to
share ext4_io_end structure among several bios in case IO to a single
extent needs to be split among several bios and pointing to pages from
ext4_io_end makes this complex.
We remove page pointers from ext4_io_end and use pointers from bio
itself instead. This isn't as easy when blocksize < pagesize because
then we can have several bios in flight for a single page and we have
to be careful when to call end_page_writeback(). However this is a
known problem already solved by block_write_full_page() /
end_buffer_async_write() so we mimic its behavior here. We mark
buffers going to disk with BH_Async_Write flag and in
ext4_bio_end_io() we check whether there are any buffers with
BH_Async_Write flag left. If there are not, we can call
end_page_writeback().
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
|
|
In parse_strtoul() we're still using deprecated simple_strtoul(). Remove
parse_strtoul() altogether and replace it with kstrtoul()
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
- grab_cache_page_write_begin() may not wait on page's writeback since
(1d1d1a767206). But it is still reasonable to wait on page's writeback
here in order to be on the safe side.
- Fix miss typo: pass 'length' instead of 'end' to __block_write_begin()
https://bugzilla.kernel.org/show_bug.cgi?id=56241
TESTCASE: git://oss.sgi.com/xfs/cmds/xfstests.git
MKFS_OPTIONS="-b1024" ; ./check ext4/304
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Akira Fujita <a-fujita.rs.jp.nec.com>
|
|
With bigalloc feature enabled we do not support indirect addressing at all
so we have to prevent extent addressing to indirect addressing
conversion in this case. The problem has been introduced with the commit
"ext4: support simple conversion of extent-mapped inodes to use i_blocks"
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Otherwise we deadlock if state recovery is initiated while we
sleep.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Move ext4_ind_migrate() into migrate.c file since it makes much more
sense and ext4_ext_migrate() is there as well.
Also fix tiny style problem - add spaces around "=" in "i=0".
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Fixes a regression in cifs_parse_mount_options where a password
which begins with a delimitor is parsed incorrectly as being a blank
password.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
nfs_delegation_claim_locks
The second check was added in commit 65b62a29 but it will never be true.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Pull another nfs fixlet from Trond Myklebust:
"I suddenly noticed that a one-line issue that I _thought_ I had fixed
with the nfs41_walk_client_list patch was apparently still there in
the pull request I sent earlier today. I'm very sorry for not
catching that in time.
- Fix a brain fart in nfs41_walk_client_list"
* tag 'nfs-for-3.9-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Doh! Typo in the fix to nfs41_walk_client_list
|
|
Make sure that we set the status to 0 on success. Missed in testing
because it never appears when doing multiple mounts to _different_
servers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: <stable@vger.kernel.org> # 3.7.x: 7b1f1fd: NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
|
|
Pull NFS client bugfixes from Trond Myklebust:
- fix for memory corruption issues in nfs4[01]_walk_client_list (stable)
- fix for an Oopsable bug in rpc_clone_client (stable)
- another state manager deadlock in the NFSv4 open code
- memory leaks in nfs4_discover_server_trunking and rpc_new_client
* tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix another potential state manager deadlock
SUNRPC: Fix a potential memory leak in rpc_new_client
NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
NFSv4: Fix a memory leak in nfs4_discover_server_trunking
SUNRPC: Remove extra xprt_put()
|
|
This adds the origin indicator to the trace point for glock
demotion, so that it is possible to see where demote requests
have come from.
Note that requests generated from the demote_rq sysfs interface
will show as remote, since they are intended to replicate
exactly the effect of a demote reuqest from a remote node. It
is still possible to tell these apart by looking at the process
which initiated the demote request.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This patch adds a bool indicating whether the demote
request was originated locally or remotely. This is then
used by the iopen ->go_callback() to make 100% sure that
it will only respond to remote callbacks.
Since ->evict_inode() uses GL_NOCACHE when it attempts to
get an exclusive lock on the iopen lock, this may result
in extra scheduling of the workqueue in case that the
exclusive promotion request failed. This patch prevents
that from happening.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
None of these result in any bug, but they makes sparse complain.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
This patch should fix sparse complains about shadow declatations.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Currently in ENOSPC condition when writing into unwritten space, or
punching a hole, we might need to split the extent and grow extent tree.
However since we can not allocate any new metadata blocks we'll have to
zero out unwritten part of extent or punched out part of extent, or in
the worst case return ENOSPC even though use actually does not allocate
any space.
Also in delalloc path we do reserve metadata and data blocks for the
time we're going to write out, however metadata block reservation is
very tricky especially since we expect that logical connectivity implies
physical connectivity, however that might not be the case and hence we
might end up allocating more metadata blocks than previously reserved.
So in future, metadata reservation checks should be removed since we can
not assure that we do not under reserve.
And this is where reserved space comes into the picture. When mounting
the file system we slice off a little bit of the file system space (2%
or 4096 clusters, whichever is smaller) which can be then used for the
cases mentioned above to prevent costly zeroout, or unexpected ENOSPC.
The number of reserved clusters can be set via sysfs, however it can
never be bigger than number of free clusters in the file system.
Note that this patch fixes the failure of xfstest 274 as expected.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
|
The logic here is better expressed with a switch statement.
While we're here, CLOSED stateids (or stateids of an unkown type--which
would indicate a server bug) should probably return nfserr_bad_stateid,
though this behavior shouldn't affect any non-buggy client.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Make sure the client gives us an adequate backchannel.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Negotiation of the 4.1 session forechannel attributes is a mess. Fix:
- Move it all into check_forechannel_attrs instead of spreading
it between that, alloc_session, and init_forechannel_attrs.
- set a minimum "slotsize" so that our drc memory limits apply
even for small maxresponsesize_cached. This also fixes some
bugs when slotsize becomes <= 0.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Pass this struct by reference, not by value, and return an error instead
of a boolean to allow for future additions.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A nasty bug in fs/namespace.c caught by Andrey + a couple of less
serious unpleasantness - ecryptfs misc device playing hopeless games
with try_module_get() and palinfo procfs support being... not quite
correctly done, to be polite."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
mnt: release locks on error path in do_loopback
palinfo fixes
procfs: add proc_remove_subtree()
ecryptfs: close rmmod race
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* serialize the call of ->release() on per-pdeo mutex
* don't remove pdeo from per-pde list until we are through with it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* leave ->proc_fops alone; make ->pde_users negative instead
* trim pde_opener
* move relevant code in fs/proc/inode.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Switch huge if-statement in __proc_file_read() around. This then puts the
single line loop break immediately after the if-statement and allows us to
de-indent the huge comment and make it take fewer lines. The code following
the if-statement then follows naturally from the call to dp->read_proc().
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Kill create_proc_entry() in favour of create_proc_read_entry(), proc_create()
and proc_create_data().
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The only part of proc_dir_entry the code outside of fs/proc
really cares about is PDE(inode)->data. Provide a helper
for that; static inline for now, eventually will be moved
to fs/proc, along with the knowledge of struct proc_dir_entry
layout.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Same as single_open(), but preallocates the buffer of given size.
Doesn't make any sense for sizes up to PAGE_SIZE and doesn't make
sense if output of show() exceeds PAGE_SIZE only rarely - seq_read()
will take care of growing the buffer and redoing show(). If you
_know_ that it will be large, it might make more sense to look into
saner iterator, rather than go with single-shot one. If that's
impossible, single_open_size() might be for you.
Again, don't use that without a good reason; occasionally that's really
the best way to go, but very often there are better solutions.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Just have it pinned in dcache all along and let procfs ->kill_sb()
drop it before kill_anon_super().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
and rename __free_pipe_info() to free_pipe_info()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
not used anymore
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
it's used only as a flag to distinguish normal pipes/FIFOs from the
internal per-task one used by file-to-file splice. And pipe->files
would work just as well for that purpose...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
fs/pipe.c file_operations methods *know* that pipe is not an internal one;
no need to check pipe->inode for those callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
simplify get_pipe_info(), while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
now it can be done - put mutex into pipe_inode_info, use it instead
of ->i_mutex
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* new field - pipe->files; number of struct file over that pipe (all
sharing the same inode, of course); protected by inode->i_lock.
* pipe_release() decrements pipe->files, clears inode->i_pipe when
if the counter has reached 0 (all under ->i_lock) and, in that case,
frees pipe after having done pipe_unlock()
* fifo_open() starts with grabbing ->i_lock, and either bumps pipe->files
if ->i_pipe was non-NULL or allocates a new pipe (dropping and regaining
->i_lock) and rechecks ->i_pipe; if it's still NULL, inserts new pipe
there, otherwise bumps ->i_pipe->files and frees the one we'd allocated.
At that point we know that ->i_pipe is non-NULL and won't go away, so
we can do pipe_lock() on it and proceed as we used to. If we end up
failing, decrement pipe->files and if it reaches 0 clear ->i_pipe and
free the sucker after pipe_unlock().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* use the fact that file_inode(file)->i_pipe doesn't change
while the file is opened - no locks needed to access that.
* switch to pipe_lock/pipe_unlock where it's easy to do
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|