Age | Commit message (Collapse) | Author |
|
If we attempt to preallocate more than 2^32 blocks of space in a
single syscall, the transaction block reservation will overflow
leading to a hangs in the superblock block accounting code. This
is trivially reproduced with xfs_io. Fix the problem by capping the
allocation reservation to the maximum number of blocks a single
xfs_bmapi() call can allocate (2^21 blocks).
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
This fixes an unnecessary BUG().
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently on-disk structure is able to keep only 16bit project quota
id, so disallow 32bit ones. This fixes a problem where parts of
kernel structures holding project quota id are 32bit while parts
(on-disk) are 16bit variables which causes project quota member
files to be inaccessible for some operations (like mv/rm).
Signed-off-by: Arkadiusz Mi?kiewicz <arekm@maven.pl>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
When doing large parallel file creates on a 16p machines, large amounts of
time is being spent in _xfs_buf_find(). A system wide profile with perf top
shows this:
1134740.00 19.3% _xfs_buf_find
733142.00 12.5% __ticket_spin_lock
The problem is that the hash contains 45,000 buffers, and the hash table width
is only 256 buffers. That means we've got around 200 buffers per chain, and
searching it is quite expensive. The hash table size needs to increase.
Secondly, every time we do a lookup, we promote the buffer we find to the head
of the hash chain. This is causing cachelines to be dirtied and causes
invalidation of cachelines across all CPUs that may have walked the hash chain
recently. hence every walk of the hash chain is effectively a cold cache walk.
Remove the promotion to avoid this invalidation.
The results are:
1045043.00 21.2% __ticket_spin_lock
326184.00 6.6% _xfs_buf_find
A 70% drop in the CPU usage when looking up buffers. Unfortunately that does
not result in an increase in performance underthis workload as contention on
the inode_lock soaks up most of the reduction in CPU usage.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
p9_client_walk() can return error values if we run out of space or there
is a problem with the network.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
|
|
When an error happens during validation of read node, the typical situation is that
the LEB we read is unmapped (due to some bug). It is handy to include the mapping
status into the error message.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
The UBIFS bug in the GC list sorting comparison functions inspired
me to write internal debugging check functions which verify that
the list of nodes is sorted properly.
So, this patch implements 2 new debugging functions:
o 'dbg_check_data_nodes_order()' - check order of data nodes list
o 'dbg_check_nondata_nodes_order()' - check order of non-data nodes list
The debugging functions are executed only if general UBIFS debugging checks are
enabled. And they are compiled out if UBIFS debugging is disabled.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
When running the integrity test ('integck' from mtd-utils) on current
UBIFS on 2.6.35, I see that assertions in UBIFS 'list_sort()' comparison
functions trigger sometimes, e.g.:
UBIFS assert failed in data_nodes_cmp at 132 (pid 28311)
My investigation showed that this happens when 'list_sort()' calls the 'cmp()'
function with equivalent arguments. In this case, the 'struct list_head'
parameter, passed to 'cmp()' is bogus, and it does not belong to any element in
the original list.
And this issue seems to be introduced by commit:
commit 835cc0c8477fdbc59e0217891d6f11061b1ac4e2
Author: Don Mullis <don.mullis@gmail.com>
Date: Fri Mar 5 13:43:15 2010 -0800
It is easy to work around the issue by doing:
if (a == b)
return 0;
in UBIFS. It works, but 'lib_sort()' should nevertheless be fixed. Although it
is harmless to have this piece of code in UBIFS.
This patch adds that code to both UBIFS 'cmp()' functions:
'data_nodes_cmp()' and 'nondata_nodes_cmp()'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
When scanning the flash, UBIFS builds a list of flash nodes of type
'struct ubifs_scan_node'. Each scanned node has a 'snod->key' field. This field
is valid for most of the nodes, but invalid for some node type, e.g., truncation
nodes. It is safer to explicitly initialize such keys to something invalid,
rather than leaving them initialized to all zeros, which has key type of
UBIFS_INO_KEY.
This patch introduces new "fake" key type UBIFS_INVALID_KEY and initializes
unused 'snod->key' objects to this type. It also adds debugging assertions in
the TNC code to make sure no one ever tries to look these nodes up in the TNC.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
In the scanning code, in 'ubifs_add_snod()', we write rubbish into
'snod->key', because we assume that on-flash truncation nodes have a key, but
they do not. If the other parts of UBIFS then mistakenly try to look-up
the truncation node key (they should not do this, but may do because of a bug),
we can succeed and corrupt TNC. It looks like we did have such a situation in
'sort_nodes()' in gc.c.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
Improve assertions in gc.c in the comparison functions for 'list_sort()': check
key types _and_ node types.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
In comparison function for 'list_sort()' we use key type to distinguish between
node types. However, we have a bit simper way to detect node type -
'snod->type'. This more logical to use, comparing to decoding key types. Also
allows to get rid of 2 local variables.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
When moving nodes in GC, do not try to look up truncation nodes in TNC,
because they do not exist there. This would be harmless, because the TNC
look-up would fail, if we did not have bug 'ubifs_add_snod()' which reads
garbage into 'snod->key'. But in any case, it is less error prone to
explicitly ignore everything but inode, data, dentry and xentry nodes.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
This patch fixes the following false assertion warning:
UBIFS assert failed in data_nodes_cmp at 130 (pid 15107)
The assertion was wrong because it did not take into account that the
node can be an xentry.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
'ubifs_garbage_collect_leb()' should never return '-ENOSPC', and if it
does, this is an error. Thus, do not treat this error code specially.
'-EAGAIN' is a special error code, but not '-ENOSPC'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
In 'ubifs_garbage_collect()' on error path, we first switch to R/O mode, and
then synchronize write-buffers (to make sure no data are lost). But the GC
write-buffer synchronization will fail, because we are already in R/O mode.
This patch re-orders this and makes sure we first synchronize the write-buffer,
and then switch to R/O mode.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
If load_nilfs() gets an error while doing recovery, it will fail to
free the shadow inode of dat (nilfs->ns_gc_dat).
This fixes the leak issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
fsnotify: drop two useless bools in the fnsotify main loop
fsnotify: fix list walk order
fanotify: Return EPERM when a process is not privileged
fanotify: resize pid and reorder structure
fanotify: drop duplicate pr_debug statement
fanotify: flush outstanding perm requests on group destroy
fsnotify: fix ignored mask handling between inode and vfsmount marks
fanotify: add MAINTAINERS entry
fsnotify: reset used_inode and used_vfsmount on each pass
fanotify: do not dereference inode_mark when it is unset
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
eCryptfs: Fix encrypted file name lookup regression
ecryptfs: properly mark init functions
fs/ecryptfs: Return -ENOMEM on memory allocation failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: fix get_ticket_handler() error handling
ceph: don't BUG on ENOMEM during mds reconnect
ceph: ceph_mdsc_build_path() returns an ERR_PTR
ceph: Fix warnings
ceph: ceph_get_inode() returns an ERR_PTR
ceph: initialize fields on new dentry_infos
ceph: maintain i_head_snapc when any caps are dirty, not just for data
ceph: fix osd request lru adjustment when sending request
ceph: don't improperly set dir complete when holding EXCL cap
mm: exporting account_page_dirty
ceph: direct requests in snapped namespace based on nonsnap parent
ceph: queue cap snap writeback for realm children on snap update
ceph: include dirty xattrs state in snapped caps
ceph: fix xattr cap writeback
ceph: fix multiple mds session shutdown
|
|
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux:
nfsd: fix NULL dereference in nfsd_statfs()
nfsd4: fix downgrade/lock logic
nfsd4: typo fix in find_any_file
nfsd4: bad BUG() in preprocess_stateid_op
|
|
Setting the task state here may cause us to miss the wake up from
kthread_stop(), so we need to recheck kthread_should_stop() or risk
sleeping forever in the following schedule().
Symptom was an indefinite hang on an NFSv4 mount. (NFSv4 may create
multiple mounts in a temporary namespace while traversing the mount
path, and since the temporary namespace is immediately destroyed, it may
end up destroying a mount very soon after it was created, possibly
making this race more likely.)
INFO: task mount.nfs4:4314 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.nfs4 D 0000000000000000 2880 4314 4313 0x00000000
ffff88001ed6da28 0000000000000046 ffff88001ed6dfd8 ffff88001ed6dfd8
ffff88001ed6c000 ffff88001ed6c000 ffff88001ed6c000 ffff88001e5003a0
ffff88001ed6dfd8 ffff88001e5003a8 ffff88001ed6c000 ffff88001ed6dfd8
Call Trace:
[<ffffffff8196090d>] schedule_timeout+0x1cd/0x2e0
[<ffffffff8106a31c>] ? mark_held_locks+0x6c/0xa0
[<ffffffff819639a0>] ? _raw_spin_unlock_irq+0x30/0x60
[<ffffffff8106a5fd>] ? trace_hardirqs_on_caller+0x14d/0x190
[<ffffffff819671fe>] ? sub_preempt_count+0xe/0xd0
[<ffffffff8195fc80>] wait_for_common+0x120/0x190
[<ffffffff81033c70>] ? default_wake_function+0x0/0x20
[<ffffffff8195fdcd>] wait_for_completion+0x1d/0x20
[<ffffffff810595fa>] kthread_stop+0x4a/0x150
[<ffffffff81061a60>] ? thaw_process+0x70/0x80
[<ffffffff810cc68a>] bdi_unregister+0x10a/0x1a0
[<ffffffff81229dc9>] nfs_put_super+0x19/0x20
[<ffffffff810ee8c4>] generic_shutdown_super+0x54/0xe0
[<ffffffff810ee9b6>] kill_anon_super+0x16/0x60
[<ffffffff8122d3b9>] nfs4_kill_super+0x39/0x90
[<ffffffff810eda45>] deactivate_locked_super+0x45/0x60
[<ffffffff810edfb9>] deactivate_super+0x49/0x70
[<ffffffff81108294>] mntput_no_expire+0x84/0xe0
[<ffffffff811084ef>] release_mounts+0x9f/0xc0
[<ffffffff81108575>] put_mnt_ns+0x65/0x80
[<ffffffff8122cc56>] nfs_follow_remote_path+0x1e6/0x420
[<ffffffff8122cfbf>] nfs4_try_mount+0x6f/0xd0
[<ffffffff8122d0c2>] nfs4_get_sb+0xa2/0x360
[<ffffffff810edcb8>] vfs_kern_mount+0x88/0x1f0
[<ffffffff810ede92>] do_kern_mount+0x52/0x130
[<ffffffff81963d9a>] ? _lock_kernel+0x6a/0x170
[<ffffffff81108e9e>] do_mount+0x26e/0x7f0
[<ffffffff81106b3a>] ? copy_mount_options+0xea/0x190
[<ffffffff811094b8>] sys_mount+0x98/0xf0
[<ffffffff810024d8>] system_call_fastpath+0x16/0x1b
1 lock held by mount.nfs4/4314:
#0: (&type->s_umount_key#24){+.+...}, at: [<ffffffff810edfb1>] deactivate_super+0x41/0x70
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
The fsnotify main loop has 2 bools which indicated if we processed the
inode or vfsmount mark in that particular pass through the loop. These
bool can we replaced with the inode_group and vfsmount_group variables
and actually make the code a little easier to understand.
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
Marks were stored on the inode and vfsmonut mark list in order from
highest memory address to lowest memory address. The code to walk those
lists thought they were in order from lowest to highest with
unpredictable results when trying to match up marks from each. It was
possible that extra events would be sent to userspace when inode
marks ignoring events wouldn't get matched with the vfsmount marks.
This problem only affected fanotify when using both vfsmount and inode
marks simultaneously.
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
The appropriate error code when privileged operations are denied is
EPERM, not EACCES.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <paris@paris.rdu.redhat.com>
|
|
Fixes a regression caused by 21edad32205e97dc7ccb81a85234c77e760364c8
When file name encryption was enabled, ecryptfs_lookup() failed to use
the encrypted and encoded version of the upper, plaintext, file name
when performing a lookup in the lower file system. This made it
impossible to lookup existing encrypted file names and any newly created
files would have plaintext file names in the lower file system.
https://bugs.launchpad.net/ecryptfs/+bug/623087
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
|
|
Some ecryptfs init functions are not prefixed by __init and thus not
freed after initialization. This patch saved about 1kB in ecryptfs
module.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
|
|
In this code, 0 is returned on memory allocation failure, even though other
failures return -ENOMEM or other similar values.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression ret;
expression x,e1,e2,e3;
@@
ret = 0
... when != ret = e1
*x = \(kmalloc\|kcalloc\|kzalloc\)(...)
... when != ret = e2
if (x == NULL) { ... when != ret = e3
return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
|
|
The commit ebabe9a9001af0af56c0c2780ca1576246e7a74b
pass a struct path to vfs_statfs
introduced the struct path initialization, and this seems to trigger
an Oops on my machine.
fh_dentry field may be NULL and set later in fh_verify(), thus the
initialization of path must be after fh_verify().
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
|
|
If we already had a RW open for a file, and get a readonly open, we were
piggybacking on the existing RW open. That's inconsistent with the
downgrade logic which blows away the RW open assuming you'll still have
a readonly open.
Also, make sure there is a readonly or writeonly open available for
locking, again to prevent bad behavior in downgrade cases when any RW
open may be lost.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
It's OK for this function to return without setting filp--we do it in
the special-stateid case.
And there's a legitimate case where we can hit this, since we do permit
reads on write-only stateid's.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
On 08/26/2010 01:56 AM, joe hefner wrote:
> On a recent Fedora (13), I am seeing a mount failure message that I can not explain. I have a Windows Server 2003ýa with a share set up for access only for a specific username (say userfoo). If I try to mount it from Linux,ýusing userfoo and the correct password all is well. If I try with a bad password or with some other username (userbar), it fails with "Permission denied" as expected. If I try to mount as username = administrator, and give the correct administrator password, I would also expect "Permission denied", but I see "Cannot allocate memory" instead.
> ýfs/cifs/netmisc.c: Mapping smb error code 5 to POSIX err -13
> ýfs/cifs/cifssmb.c: Send error in QPathInfo = -13
> ýCIFS VFS: cifs_read_super: get root inode failed
Looks like the commit 0b8f18e3 assumed that cifs_get_inode_info() and
friends fail only due to memory allocation error when the inode is NULL
which is not the case if CIFSSMBQPathInfo() fails and returns an error.
Fix this by propagating the actual error code back.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
get_ticket_handler() returns a valid pointer or it returns
ERR_PTR(-ENOMEM) if kzalloc() fails.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
We are in a position to return an error; do that instead.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to
handle NULL returns.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
Just scrubbing some warnings so I can see real problem ones in the build
noise. For 32bit we need to coax gcc politely into believing we really
honestly intend to the casts. Using (u64)(unsigned long) means we cast from
a pointer to a type of the right size and then extend it. This stops the
warning spew.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
Eliminate sparse warning - bad constant expression
cifs: check for NULL session password
missing changes during ntlmv2/ntlmssp auth and sign
[CIFS] Fix ntlmv2 auth with ntlmssp
cifs: correction of unicode header files
cifs: fix NULL pointer dereference in cifs_find_smb_ses
cifs: consolidate error handling in several functions
cifs: clean up error handling in cifs_mknod
|
|
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
We used to use i_head_snapc to keep track of which snapc the current epoch
of dirty data was dirtied under. It is used by queue_cap_snap to set up
the cap_snap. However, since we queue cap snaps for any dirty caps, not
just for dirty file data, we need to keep a valid i_head_snapc anytime
we have dirty|flushing caps. This fixes a NULL pointer deref in
queue_cap_snap when writing back dirty caps without data (e.g.,
snaptest-authwb.sh).
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Eliminiate sparse warning during usage of crypto_shash_* APIs
error: bad constant expression
Allocate memory for shash descriptors once, so that we do not kmalloc/kfree it
for every signature generation (shash descriptor for md5 hash).
From ed7538619817777decc44b5660b52268077b74f3 Mon Sep 17 00:00:00 2001
From: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Date: Tue, 24 Aug 2010 11:47:43 -0500
Subject: [PATCH] eliminate sparse warnings during crypto_shash_* APis usage
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the
page and not disard the pagecache content and return an error from writepage.
We used to do this correctly, but the logic got lost during the recent
reshuffle of the writepage code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Mike Gao <ygao.linux@gmail.com>
Tested-by: Mike Gao <ygao.linux@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
|
Formatting items requires memory allocation when using delayed
logging. Currently that memory allocation is done while holding the
CIL context lock in read mode. This means that if memory allocation
takes some time (e.g. enters reclaim), we cannot push on the CIL
until the allocation(s) required by formatting complete. This can
stall CIL pushes for some time, and once a push is stalled so are
all new transaction commits.
Fix this splitting the item formatting into two steps. The first
step which does the allocation and memcpy() into the allocated
buffer is now done outside the CIL context lock, and only the CIL
insert is done inside the CIL context lock. This avoids the stall
issue.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Delayed logging adds some serialisation to the log force process to
ensure that it does not deference a bad commit context structure
when determining if a CIL push is necessary or not. It does this by
grabing the CIL context lock exclusively, then dropping it before
pushing the CIL if necessary. This causes serialisation of all log
forces and pushes regardless of whether a force is necessary or not.
As a result fsync heavy workloads (like dbench) can be significantly
slower with delayed logging than without.
To avoid this penalty, copy the current sequence from the context to
the CIL structure when they are swapped. This allows us to do
unlocked checks on the current sequence without having to worry
about dereferencing context structures that may have already been
freed. Hence we can remove the CIL context locking in the forcing
code and only call into the push code if the current context matches
the sequence we need to force.
By passing the sequence into the push code, we can check the
sequence again once we have the CIL lock held exclusive and abort if
the sequence has already been pushed. This avoids a lock round-trip
and unnecessary CIL pushes when we have racing push calls.
The result is that the regression in dbench performance goes away -
this change improves dbench performance on a ramdisk from ~2100MB/s
to ~2500MB/s. This compares favourably to not using delayed logging
which retuns ~2500MB/s for the same workload.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
When we need to cover the log, we issue dummy transactions to ensure
the current log tail is on disk. Unfortunately we currently use the
root inode in the dummy transaction, and the act of committing the
transaction dirties the inode at the VFS level.
As a result, the VFS writeback of the dirty inode will prevent the
filesystem from idling long enough for the log covering state
machine to complete. The state machine gets stuck in a loop issuing
new dummy transactions to cover the log and never makes progress.
To avoid this problem, the dummy transactions should not cause
externally visible state changes. To ensure this occurs, make sure
that dummy transactions log an unchanging field in the superblock as
it's state is never propagated outside the filesystem. This allows
the log covering state machine to complete successfully and the
filesystem now correctly enters a fully idle state about 90s after
the last modification was made.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Because of delayed updates to sb_icount field in the super block, it
is possible to allocate over maxicount number of inodes. This
causes the arithmetic to calculate a negative number of free inodes
in user commands like df or stat -f.
Since maxicount is a somewhat arbitrary number, a slight over
allocation is not critical but user commands should be displayed as
0 or greater and never go negative. To do this the value in the
stats buffer f_ffree is capped to never go negative.
[ Modified to use max_t as per Christoph's comment. ]
Signed-off-by: Stu Brodsky <sbrodsky@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
|
During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will
go negative on inodes with more than 1024 dirty pages due to
implementation details of write_cache_pages(). Currently XFS will
abort page clustering in writeback once nr_to_write drops below
zero, and so for data integrity writeback we will do very
inefficient page at a time allocation and IO submission for inodes
with large numbers of dirty pages.
Fix this by only aborting the page clustering code when
wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE.
Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|