Age | Commit message (Collapse) | Author |
|
Currently we unconditionally issue a flush from xfs_free_buftarg, but
since 2.6.29-rc1 this gives a warning in the style of
end_request: I/O error, dev vdb, sector 0
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
The inode can't be locked by anyone else as we just created it a few
lines above and it's not been added to any lookup data structure yet.
So use a trylock that must succeed to get around the lockdep warnings.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
Andras Korn reported an oops on log replay causes by a corrupted
xfs_inode_log_format_t passing a 0 size to kmem_zalloc. This patch handles
to small or too large numbers of log regions gracefully by rejecting the
log replay with a useful error message.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andras Korn <korn-sgi.com@chardonnay.math.bme.hu>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: don't call jbd2_journal_force_commit_nested without journal
ext4: Reorder fs/Makefile so that ext2 root fs's are mounted using ext2
ext4: Remove duplicate call to ext4_commit_super() in ext4_freeze()
|
|
tracing/core
|
|
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/wireless/iwlwifi/iwl-tx.c
net/8021q/vlan_core.c
net/core/dev.c
|
|
Add a new superblock value which tracks the lifetime amount of writes
to the filesystem. This is useful in estimating the amount of wear on
solid state drives (SSD's) caused by writes to the filesystem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
With delayed allocation we should not/cannot discard inode prealloc
space during file close. We would still have dirty pages for which we
haven't allocated blocks yet. With this fix after each get_blocks
request we check whether we have zero reserved blocks and if yes and
we don't have any writers on the file we discard inode prealloc space.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Commit 8e961870bb9804110d5c8211d5d9d500451c4518 removed the FREEZE/THAW
handling in xfs_compat_ioctl but never added any compat handler back, so
now any freeze/thaw request from a 32-bit binary ond 64-bit userspace
will fail.
As these ioctls are 32/64-bit compatible two simple COMPATIBLE_IOCTL
entries in fs/compat_ioctl.c will do the job.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 4ea3ada2955e4519befa98ff55dd62d6dfbd1705 declares d_obtain_alias()
as EXPORT_SYMBOL_GPL where it's supposed to replace d_alloc_anon which was
previously declared as EXPORT_SYMBOL and thus available to any loadable
module.
This patch reverts that.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
* git://git.infradead.org/mtd-2.6:
[MTD] [MAPS] Remove MODULE_DEVICE_TABLE() from ck804rom driver.
[JFFS2] fix mount crash caused by removed nodes
[JFFS2] force the jffs2 GC daemon to behave a bit better
[MTD] [MAPS] blackfin async requires complex mappings
[MTD] [MAPS] blackfin: fix memory leak in error path
[MTD] [MAPS] physmap: fix wrong free and del_mtd_{partition,device}
[MTD] slram: Handle negative devlength correctly
[MTD] map_rom has NULL erase pointer
[MTD] [LPDDR] qinfo_probe depends on lpddr
|
|
Check for IO error in ocfs2_get_sector().
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
This patch set a gap (4 bytes) between xattr entry and
name/value when xattr in bucket. This gap use to seperate
entry and name/value when a bucket is full. It had already
been set when xattr in inode/block.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
For other metadata in ocfs2, metaecc is checked in ocfs2_read_blocks
with io_mutex held. While for xattr bucket, it is calculated by
the whole buckets. So we have to add a spin_lock to prevent multiple
processes calculating metaecc.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Tested-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
In ctime updating of xattr, it use the wrong type of access for
inode, so use ocfs2_journal_access_di instead.
Reported-and-Tested-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
In dlm_assert_master_handler(), if we get an incorrect assert master from a node
that, we reply with EINVAL asking the asserter to die. The problem is that an
assert is sent after so many hoops, it is invariably the node that thinks the
asserter is wrong, is actually wrong. So instead of killing the asserter, this
patch kills the assertee.
This patch papers over a race that is still being addressed.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
The code was using dlm->spinlock instead of dlm->ast_lock to protect the
ast_list. This patch fixes the issue.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
The dentry lock has a different format than other locks. This patch fixes
ocfs2_log_dlm_error() macro to make it print the dentry lock correctly.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay
the dlm_thread from sending the drop ref message if the lockres is being
migrated. The problem is that we make the dlm_thread wait for the migration
to complete. This causes a deadlock as dlm_thread also participates in the
lockres migration process.
A better fix for the original oss bugzilla#1012 is in testing.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
In __ocfs2_mark_extent_written, when we meet with the situation
of c_split_covers_rec, the old solution just replace the extent
record and forget to access and dirty the buffer_head. This will
cause a problem when the unwritten extent is in an extent block.
So access and dirty it.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: try committing transaction before returning ENOSPC
Btrfs: add better -ENOSPC handling
|
|
Newer gcc throw this warning:
fs/bio.c: In function ?bio_alloc_bioset?:
fs/bio.c:305: warning: ?p? may be used uninitialized in this function
since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Running without a journal, I oopsed when I ran out of space,
because we called jbd2_journal_force_commit_nested() from
ext4_should_retry_alloc() without a journal.
This should take care of it, I think.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
In fs/Makefile, ext3 was placed before ext2 so that a root filesystem
that possessed a journal, it would be mounted as ext3 instead of ext2.
This was necessary because a cleanly unmounted ext3 filesystem was
fully backwards compatible with ext2, and could be mounted by ext2 ---
but it was desirable that it be mounted with ext3 so that the
journaling would be enabled.
The ext4 filesystem supports new incompatible features, so there is no
danger of an ext4 filesystem being mistaken for an ext2 filesystem.
At that point, the relative ordering of ext4 with respect to ext2
didn't matter until ext4 gained the ability to mount filesystems
without a journal starting in 2.6.29-rc1. Now that this is the case,
given that ext4 is before ext2, it means that root filesystems that
were using the plain-jane ext2 format are getting mounted using the
ext4 filesystem driver, which is a change in behavior which could be
surprising to users.
It's doubtful that there are that many ext2-only root filesystem users
that would also have ext4 compiled into the kernel, but to adhere to
the principle of least surprise, the correct ordering in fs/Makefile
is ext3, followed by ext2, and finally ext4.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Commit c4be0c1d added error checking to ext4_freeze() when calling
ext4_commit_super(). Unfortunately the patch failed to remove the
original call to ext4_commit_super(), with the net result that when
freezing the filesystem, the superblock gets written twice, the first
time without error checking.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
into tracing/core
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
proc: fix PG_locked reporting in /proc/kpageflags
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix deadlock in ext4_write_begin() and ext4_da_write_begin()
ext4: Add fallback for find_group_flex
|
|
Expr always evaluates to zero.
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
|
|
|
|
|
When renaming a file such that a link to another inode is overwritten,
force any delay allocated blocks that to be allocated so that if the
filesystem is mounted with data=ordered, the data blocks will be
pushed out to disk along with the journal commit. Many application
programs expect this, so we do this to avoid zero length files if the
system crashes unexpectedly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
When closing a file that had been previously truncated, force any
delay allocated blocks that to be allocated so that if the filesystem
is mounted with data=ordered, the data blocks will be pushed out to
disk along with the journal commit. Many application programs expect
this, so we do this to avoid zero length files if the system crashes
unexpectedly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Add an ioctl which forces all of the delay allocated blocks to be
allocated. This also provides a function ext4_alloc_da_blocks() which
will be used by the following commits to force files to be fully
allocated to preserve application-expected ext3 behaviour.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
de_get is called before every proc_get_inode, but corresponding de_put is
called only when dropping last reference to an inode. This might cause
something like
remove_proc_entry: /proc/stats busy, count=14496
to be printed to the syslog.
The fix is to call de_put in case of an already initialized inode in
proc_get_inode.
Signed-off-by: Krzysztof Sachanowicz <analyzer1@gmail.com>
Tested-by: Marcin Pilipczuk <marcin.pilipczuk@gmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The mpage_da_writepages() function is only used in one place, so
inline it to simplify the call stack and make the code easier to
understand.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Struct mpage_da_data and mpage_add_bh_to_extent() use a fake struct
buffer_head which is 104 bytes on an x86_64 system, but only use 24
bytes of the structure. On systems that use a spinlock for atomic_t,
the stack savings will be even greater.
It turns out that using a fake struct buffer_head doesn't even save
that much code, and it makes the code more confusing since it's not
used as a "real" buffer head. So just store pass b_size and b_state
in mpage_add_bh_to_extent(), and store b_size, b_state, and b_block_nr
in the mpage_da_data structure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This parameter was always set to ext4_da_get_block_write().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Make sure we validate extent details only when read from the disk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This patch adds checks to validate the extent entries along with extent
headers, to avoid crashes caused by corrupt filesystems.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly. However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode. This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.
The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.
This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The find_group_flex() inode allocator is now only used if the
filesystem is mounted using the "oldalloc" mount option. It is
replaced with the original Orlov allocator that has been updated for
flex_bg filesystems (it should behave the same way if flex_bg is
disabled). The inode allocator now functions by taking into account
each flex_bg group, instead of each block group, when deciding whether
or not it's time to allocate a new directory into a fresh flex_bg.
The block allocator has also been changed so that the first block
group in each flex_bg is preferred for use for storing directory
blocks. This keeps directory blocks close together, which is good for
speeding up e2fsck since large directories are more likely to look
like this:
debugfs: stat /home/tytso/Maildir/cur
Inode: 1844562 Type: directory Mode: 0700 Flags: 0x81000
Generation: 1132745781 Version: 0x00000000:0000ad71
User: 15806 Group: 15806 Size: 1060864
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 2072
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
Size of extra inode fields: 28
BLOCKS:
(0):7348651, (1-258):7348654-7348911
TOTAL: 259
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Functions ext4_write_begin() and ext4_da_write_begin() call
grab_cache_page_write_begin() without AOP_FLAG_NOFS. Thus it
can happen that page reclaim is triggered in that function
and it recurses back into the filesystem (or some other filesystem).
But this can lead to various problems as a transaction is already
started at that point. Add the necessary flag.
http://bugzilla.kernel.org/show_bug.cgi?id=11688
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This is a workaround for find_group_flex() which badly needs to be
replaced. One of its problems (besides ignoring the Orlov algorithm)
is that it is a bit hyperactive about returning failure under
suspicious circumstances. This can lead to spurious ENOSPC failures
even when there are inodes still available.
Work around this for now by retrying the search using
find_group_other() if find_group_flex() returns -1. If
find_group_other() succeeds when find_group_flex() has failed, log a
warning message.
A better block/inode allocator that will fix this problem for real has
been queued up for the next merge window.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Fix multiuser mounts so server does not invalidate earlier security contexts
[CIFS] improve posix semantics of file create
[CIFS] Fix oops in cifs_strfromUCS_le mounting to servers which do not specify their OS
cifs: posix fill in inode needed by posix open
cifs: properly handle case where CIFSGetSrvInodeNumber fails
cifs: refactor new_inode() calls and inode initialization
[CIFS] Prevent OOPs when mounting with remote prefixpath.
[CIFS] ipv6_addr_equal for address comparison
|
|
At scan time we observed following scenario:
node A inserted
node B inserted
node C inserted -> sets overlapped flag on node B
node A is removed due to CRC failure -> overlapped flag on node B remains
while (tn->overlapped)
tn = tn_prev(tn);
==> crash, when tn_prev(B) is referenced.
When the ultimate node is removed at scan time and the overlapped flag
is set on the penultimate node, then nothing updates the overlapped
flag of that node. The overlapped iterators blindly expect that the
ultimate node does not have the overlapped flag set, which causes the
scan code to crash.
It would be a huge overhead to go through the node chain on node
removal and fix up the overlapped flags, so detecting such a case on
the fly in the overlapped iterators is a simpler and reliable
solution.
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
contexts
When two different users mount the same Windows 2003 Server share using CIFS,
the first session mounted can be invalidated. Some servers invalidate the first
smb session when a second similar user (e.g. two users who get mapped by server to "guest")
authenticates an smb session from the same client.
By making sure that we set the 2nd and subsequent vc numbers to nonzero values,
this ensures that we will not have this problem.
Fixes Samba bug 6004, problem description follows:
How to reproduce:
- configure an "open share" (full permissions to Guest user) on Windows 2003
Server (I couldn't reproduce the problem with Samba server or Windows older
than 2003)
- mount the share twice with different users who will be authenticated as guest.
noacl,noperm,user=john,dir_mode=0700,domain=DOMAIN,rw
noacl,noperm,user=jeff,dir_mode=0700,domain=DOMAIN,rw
Result:
- just the mount point mounted last is accessible:
Signed-off-by: Steve French <sfrench@us.ibm.com>
|