Age | Commit message (Collapse) | Author |
|
When doing an incremental send it's possible that the computed send stream
contains clone operations that will fail on the receiver if the receiver
has compression enabled and the clone operations target a sector sized
extent that starts at a zero file offset, is not compressed on the source
filesystem but ends up being compressed and inlined at the destination
filesystem.
Example scenario:
$ mkfs.btrfs -f /dev/sdb
$ mount -o compress /dev/sdb /mnt
# By doing a direct IO write, the data is not compressed.
$ xfs_io -f -d -c "pwrite -S 0xab 0 4K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
$ xfs_io -c "reflink /mnt/foobar 0 8K 4K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
$ btrfs send -f /tmp/1.snap /mnt/mysnap1
$ btrfs send -f /tmp/2.snap -p /mnt/mysnap1 /mnt/mysnap2
$ umount /mnt
$ mkfs.btrfs -f /dev/sdc
$ mount -o compress /dev/sdc /mnt
$ btrfs receive -f /tmp/1.snap /mnt
$ btrfs receive -f /tmp/2.snap /mnt
ERROR: failed to clone extents to foobar
Operation not supported
The same could be achieved by mounting the source filesystem without
compression and doing a buffered IO write instead of a direct IO one,
and mounting the destination filesystem with compression enabled.
So fix this by issuing regular write operations in the send stream
instead of clone operations when the source offset is zero and the
range has a length matching the sector size.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There is a corner case that slips through the checkers in functions
reading extent buffer, ie.
if (start < eb->len) and (start + len > eb->len),
then
a) map_private_extent_buffer() returns immediately because
it's thinking the range spans across two pages,
b) and the checkers in read_extent_buffer(), WARN_ON(start > eb->len)
and WARN_ON(start + len > eb->start + eb->len), both are OK in this
corner case, but it'd actually try to access the eb->pages out of
bounds because of (start + len > eb->len).
The case is found by switching extent inline ref type from shared data
ref to non-shared data ref, which is a kind of metadata corruption.
It'd use the wrong helper to access the eb,
eg. btrfs_extent_data_ref_root(eb, ref) is used but the %ref passing
here is "struct btrfs_shared_data_ref". And if the extent item
happens to be the first item in the eb, then offset/length will get
over eb->len which ends up an invalid memory access.
This is adding proper checks in order to avoid invalid memory access,
ie. 'general protection fault', before it's too late.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
isofs_read_inode()
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
v2_read_file_info()
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
|
|
For a while now any NFSv2 mount where sec= is specified uses
AUTH_NULL. If sec= is not specified, the mount uses AUTH_UNIX.
Commit e68fd7c8071d ("mount: use sec= that was specified on the
command line") attempted to address a very similar problem with
NFSv3, and should have fixed this too, but it has a bug.
The MNTv1 MNT procedure does not return a list of security flavors,
so our client makes up a list containing just AUTH_NULL. This should
enable nfs_verify_authflavors() to assign the sec= specified flavor,
but instead, it incorrectly sets it to AUTH_NULL.
I expect this would also be a problem for any NFSv3 server whose
MNTv3 MNT procedure returned a security flavor list containing only
AUTH_NULL.
Fixes: e68fd7c8071d ("mount: use sec= that was specified on ... ")
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=310
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
An NFSv4.1 client might close a file after the user who opened it has
logged off. In this case the user's credentials may no longer be
valid, if they are e.g. kerberos credentials that have expired.
NFSv4.1 has a mechanism to allow the client to use machine credentials
to close a file. However due to a short-coming in the RFC, a CLOSE
with those credentials may not be possible if the file in question
isn't exported to the same security flavor - the required PUTFH must
be rejected when this is the case.
Specifically if a server and client support kerberos in general and
have used it to form a machine credential, but the file is only
exported to "sec=sys", a PUTFH with the machine credentials will fail,
so CLOSE is not possible.
As RPC_AUTH_UNIX (used by sec=sys) credentials can never expire, there
is no value in using the machine credential in place of them.
So in that case, just use the users credentials for CLOSE etc, as you would
in NFSv4.0
Signed-off-by: Neil Brown <neilb@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Another pile of small fixes and updates for x86:
- Plug a hole in the SMAP implementation which misses to clear AC on
NMI entry
- Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
parameter works correctly again
- Use the proper accessor in the startup64 code for next_early_pgt to
prevent accessing of invalid addresses and faulting in the early
boot code.
- Prevent CPU hotplug lock recursion in the MTRR code
- Unbreak CPU0 hotplugging
- Rename overly long CPUID bits which got introduced in this cycle
- Two commits which mark data 'const' and restrict the scope of data
and functions to file scope by making them 'static'"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Constify attribute_group structures
x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
x86: Fix norandmaps/ADDR_NO_RANDOMIZE
x86/mtrr: Prevent CPU hotplug lock recursion
x86: Mark various structures and functions as 'static'
x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
x86/smpboot: Unbreak CPU0 hotplug
x86/asm/64: Clear AC on NMI entries
|
|
Now that the mirror allocation has been moved, the parameter can go.
Also remove the redundant symbol export.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
There are a number of callers of nfs_pageio_complete() that want to
continue using the nfs_pageio_descriptor without needing to call
nfs_pageio_init() again. Examples include nfs_pageio_resend() and
nfs_pageio_cond_complete().
The problem is that nfs_pageio_complete() also calls
nfs_pageio_cleanup_mirroring(), which frees up the array of mirrors.
This can lead to writeback errors, in the next call to
nfs_pageio_setup_mirroring().
Fix by simply moving the allocation of the mirrors to
nfs_pageio_setup_mirroring().
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196709
Reported-by: JianhongYin <yin-jianhong@163.com>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Pull xfs fixes from Darrick Wong:
"A handful more bug fixes for you today.
Changes since last time:
- Don't leak resources when mount fails
- Don't accidentally clobber variables when looking for free inodes"
* tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't leak quotacheck dquots when cow recovery
xfs: clear MS_ACTIVE after finishing log recovery
iomap: fix integer truncation issues in the zeroing and dirtying helpers
xfs: fix inobt inode allocation search optimization
|
|
|
|
The buffer passed to btrfs_ioctl_tree_search* functions have to be at least
sizeof(struct btrfs_ioctl_search_header). If this is not the case then the
ioctl should return -EOVERFLOW and set the uarg->buf_size to the minimum
required size. Currently btrfs_ioctl_tree_search_v2 would return an -EOVERFLOW
error with ->buf_size being set to the value passed by user space. Fix this by
removing the size check and relying on search_ioctl, which already includes it
and correctly sets buf_size.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
__btrfs_submit_dio_bio
Currently the code checks whether we should do data checksumming in
btrfs_submit_direct and the boolean result of this check is passed to
btrfs_submit_direct_hook, in turn passing it to __btrfs_submit_dio_bio which
actually consumes it. The last function actually has all the necessary context
to figure out whether to skip the check or not, so let's move the check closer
to where it's being consumed. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When logging an inode in full mode that has an inline compressed extent
that represents a range with a size matching the sector size (currently
the same as the page size), has a trailing hole and the no-holes feature
is enabled, we end up failing an assertion leading to a trace like the
following:
[141812.031528] assertion failed: len == i_size, file: fs/btrfs/tree-log.c, line: 4453
[141812.033069] ------------[ cut here ]------------
[141812.034330] kernel BUG at fs/btrfs/ctree.h:3452!
[141812.035137] invalid opcode: 0000 [#1] PREEMPT SMP
[141812.035932] Modules linked in: btrfs dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_flakey dm_mod dax ppdev evdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 tpm_tis psmouse crypto_simd parport_pc sg pcspkr tpm_tis_core cryptd parport serio_raw glue_helper tpm i2c_piix4 i2c_core button sunrpc loop autofs4 ext4 crc16 jbd2 mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod ata_generic virtio_scsi ata_piix floppy crc32c_intel libata scsi_mod virtio_pci virtio_ring e1000 virtio [last unloaded: btrfs]
[141812.036790] CPU: 3 PID: 845 Comm: fdm-stress Tainted: G B W 4.12.3-btrfs-next-52+ #1
[141812.036790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[141812.036790] task: ffff8801e6694180 task.stack: ffffc90009004000
[141812.036790] RIP: 0010:assfail.constprop.18+0x1c/0x1e [btrfs]
[141812.036790] RSP: 0018:ffffc90009007bc0 EFLAGS: 00010282
[141812.036790] RAX: 0000000000000046 RBX: ffff88017512c008 RCX: 0000000000000001
[141812.036790] RDX: ffff88023fd95201 RSI: ffffffff8182264c RDI: 00000000ffffffff
[141812.036790] RBP: ffffc90009007bc0 R08: 0000000000000001 R09: 0000000000000001
[141812.036790] R10: 0000000000001000 R11: ffffffff82f5a0c9 R12: ffff88014e5947e8
[141812.036790] R13: 00000000000b4000 R14: ffff8801b234d008 R15: 0000000000000000
[141812.036790] FS: 00007fdba6ffd700(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[141812.036790] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[141812.036790] CR2: 00007fdb9c000010 CR3: 000000016efa2000 CR4: 00000000001406e0
[141812.036790] Call Trace:
[141812.036790] btrfs_log_inode+0x9f0/0xd3d [btrfs]
[141812.036790] ? __mutex_lock+0x120/0x3ce
[141812.036790] btrfs_log_inode_parent+0x224/0x685 [btrfs]
[141812.036790] ? lock_acquire+0x16b/0x1af
[141812.036790] btrfs_log_dentry_safe+0x60/0x7b [btrfs]
[141812.036790] btrfs_sync_file+0x32e/0x3f8 [btrfs]
[141812.036790] vfs_fsync_range+0x8a/0x9d
[141812.036790] vfs_fsync+0x1c/0x1e
[141812.036790] do_fsync+0x31/0x4a
[141812.036790] SyS_fdatasync+0x13/0x17
[141812.036790] entry_SYSCALL_64_fastpath+0x18/0xad
[141812.036790] RIP: 0033:0x7fdbac41a47d
[141812.036790] RSP: 002b:00007fdba6ffce30 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[141812.036790] RAX: ffffffffffffffda RBX: ffffffff81092c9f RCX: 00007fdbac41a47d
[141812.036790] RDX: 0000004cf0160a40 RSI: 0000000000000000 RDI: 0000000000000006
[141812.036790] RBP: ffffc90009007f98 R08: 0000000000000000 R09: 0000000000000010
[141812.036790] R10: 00000000000002e8 R11: 0000000000000293 R12: ffffffff8110cd90
[141812.036790] R13: ffffc90009007f78 R14: 0000000000000000 R15: 0000000000000000
[141812.036790] ? time_hardirqs_off+0x9/0x14
[141812.036790] ? trace_hardirqs_off_caller+0x1f/0xa3
[141812.036790] Code: c7 d6 61 6b a0 48 89 e5 e8 ba ef a8 e0 0f 0b 55 89 f1 48 c7 c2 6d 65 6b a0 48 89 fe 48 c7 c7 81 65 6b a0 48 89 e5 e8 9c ef a8 e0 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89
[141812.036790] RIP: assfail.constprop.18+0x1c/0x1e [btrfs] RSP: ffffc90009007bc0
[141812.084448] ---[ end trace 44e472684c7a32cc ]---
Which happens because the code that logs a trailing hole when the no-holes
feature is enabled, did not consider that a compressed inline extent can
represent a range with a size matching the sector size, in which case
expanding the inode's i_size, through a truncate operation, won't lead
to padding with zeroes the page that represents the inline extent, and
therefore the inline extent remains after the truncation.
Fix this by adapting the assertion to accept inline extents representing
data with a sector size length if, and only if, the inline extents are
compressed.
A sample and trivial reproducer (for systems with a 4K page size) for this
issue:
mkfs.btrfs -O no-holes -f /dev/sdc
mount -o compress /dev/sdc /mnt
xfs_io -f -c "pwrite -S 0xab 0 4K" /mnt/foobar
sync
xfs_io -c "truncate 32K" /mnt/foobar
xfs_io -c "fsync" /mnt/foobar
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If the range being cleared was not marked for defrag and we are not
about to clear the range from the defrag status, we don't need to
lock and unlock the inode.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Chris Mason <clm@fb.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The error return variable ret is initialized to zero and then is
checked to see if it is non-zero in the if-block that follows it.
It is therefore impossible for ret to be non-zero after the if-block
hence the check is redundant and can be removed.
Detected by CoverityScan, CID#1021040 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The internal free space tree management routines are always exposed for
testing purposes. Make them dependent on SANITY_TESTS being on so that
they are exposed only when they really have to.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This variable was added in 1abe9b8a138c ("Btrfs: add initial tracepointi
support for btrfs"), yet it never really got used, only assigned to. So
let's remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We have a WARN_ON(!var) inside an if branch which is executed (among
others) only when var is true.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We aren't using this define, so removing it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Though BTRFS_FSID_SIZE and BTRFS_UUID_SIZE are of the same size, we
should use the matching constant for the fsid buffer.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Our dir_context->pos is supposed to hold the next position we're
supposed to look. If we successfully insert a delayed dir index we
could end up with a duplicate entry because we don't increase ctx->pos
after doing the dir_emit.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This reverts commit 68c4a4f8abc60c9440ede9cd123d48b78325f7a3, with
various conflict clean-ups.
The capability check required too much privilege compared to simple DAC
controls. A system builder was forced to have crash handler processes
run with CAP_SYSLOG which would give it the ability to read (and wipe)
the _current_ dmesg, which is much more access than being given access
only to the historical log stored in pstorefs.
With the prior commit to make the root directory 0750, the files are
protected by default but a system builder can now opt to give access
to a specific group (via chgrp on the pstorefs root directory) without
being forced to also give away CAP_SYSLOG.
Suggested-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
|
Currently only DMESG and CONSOLE record types are protected, and it isn't
obvious that they are using a capability check. Instead switch to explicit
root directory mode of 0750 to keep files private by default. This will
allow the removal of the capability check, which was non-obvious and
forces a process to have possibly too much privilege when simple post-boot
chgrp for readers would be possible without it.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
|
dq_data_lock is currently used to protect all modifications of quota
accounting information, consistency of quota accounting on the inode,
and dquot pointers from inode. As a result contention on the lock can be
pretty heavy.
Reduce the contention on the lock by protecting quota accounting
information by a new dquot->dq_dqb_lock and consistency of quota
accounting with inode usage by inode->i_lock.
This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 6% when user quota is
turned on. When those 50 processes belong to 50 different users, the
improvement is about 9%.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Provide helper __inode_get_bytes() which assumes i_lock is already
acquired. Quota code will need this to be able to use i_lock to protect
consistency of quota accounting information and inode usage.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
dquot_claim_reserved_space() and dquot_reclaim_reserved_space() have
only a single callsite. Inline them there.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
inode_incr_space() and inode_decr_space() have only two callsites.
Inline them there as that will make locking changes simpler.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
inode_add_rsv_space() and inode_sub_rsv_space() had only one callsite.
Inline them there directly. inode_claim_rsv_space() and
inode_reclaim_rsv_space() had two callsites so inline them there as
well. This will simplify further locking changes.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
When journalling quotas, we writeback all dquots immediately after
changing them as part of current transation. Thus there's no need to
write anything in dquot_writeback_dquots() and so we can avoid updating
list of dirty dquots to reduce dq_list_lock contention.
This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 15% when user quota is
turned on.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Filesystems that are journalling quotas generally don't need tracking of
dirty dquots in a list since forcing a transaction commit flushes all
quotas anyway. Allow filesystem to say it doesn't want dquots to be
tracked as it reduces contention on the dq_list_lock.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Currently every dquot carries a wait_queue_head_t used only when we are
turning quotas off to wait for last users to drop dquot references.
Since such rare case is not performance sensitive in any means, just use
a global waitqueue for this and save space in struct dquot. Also convert
the logic to use wait_event() instead of open-coding it.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Move locking of dq_list_lock into clear_dquot_dirty(). It makes the
function more self-contained and will simplify our life later.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Currently we mark dirty even dquots that are not active (i.e.,
initialization or reading failed for them). Thus later we have to check
whether dirty dquot is really active and just clear the dirty bit if
not. Avoid this complication by just never marking non-active dquot as
dirty.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
dqi_flags modifications are protected by dq_data_lock. However the
modifications in vfs_load_quota_inode() and in mark_info_dirty() were
not which could lead to corruption of dqi_flags. Since modifications to
dqi_flags are rare, this is hard to observe in practice but in theory it
could happen. Fix the problem by always using dq_data_lock for
protection.
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
If we fail a mount on account of cow recovery errors, it's possible that
a previous quotacheck left some dquots in memory. The bailout clause of
xfs_mountfs forgets to purge these, and so we leak them. Fix that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.
To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release. Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes. So, set
MS_ACTIVE right before we do the last part of log recovery and clear it
immediately after we finish the log recovery so that everything
will be torn down properly if we abort the mount.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
Currently we return -EIO on any error (or short read) from
->quota_read() while reading quota info. Propagate the error code
instead.
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
v2_read_file_info() returned -1 instead of proper error codes on error.
Luckily this is not easily visible from userspace as we have called
->check_quota_file shortly before and thus already verified the quota
file is sane. Still set the error codes to proper values.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->read_file_info() callback. This
is for consistency with other operations and it also allows us to get
rid of an ugliness in OCFS2.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->write_file_info() callback.
Mostly for consistency with other operations.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->get_next_id() callback. Mostly
for consistency with other operations.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->release_dqblk() callback. It
will allow quota formats to decide whether they need it or not.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
The old quota quota format has fixed offset in quota file based on ID so
there's no locking needed against concurrent modifications of the file
(locking against concurrent IO on the same dquot is still provided by
dq_lock).
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
When dquot has space already allocated in a quota file, we just
overwrite that place when writing dquot. So we don't need any protection
against other modifications of quota file as these keep dquot in place.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->write_dqblk() callback. It will
allow quota formats to decide whether they need it or not.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
The old quota format has fixed offset in quota file based on ID so
there's no locking needed against concurrent modifications of the file
(locking against concurrent IO on the same dquot is still provided by
dq_lock).
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Push down acquisition of dqio_sem into ->read_dqblk() callback. It will
allow quota formats to decide whether they need it or not.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Currently dquot writeout is only protected by dqio_sem held for writing.
As we transition to a finer grained locking we will use dquot->dq_lock
instead. So acquire it in dquot_commit() and move dqio_sem just around
->commit_dqblk() call as it is still needed to serialize quota file
changes.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
|