summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2016-11-24Merge branch 'xfs-4.10-extent-lookup' into for-nextDave Chinner
2016-11-24xfs: remove NULLEXTNUMChristoph Hellwig
We only ever set a field to this constant for an impossible to reach error case in xfs_bmap_search_extents. That functions has been removed, so we can remove the constant as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: remove xfs_bmap_search_extentsChristoph Hellwig
Now that all users are gone. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in xfs_reflink_end_cowChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in xfs_reflink_cancel_cow_blocksChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in xfs_reflink_trim_irec_to_next_cowChristoph Hellwig
And remove the unused return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: cleanup xfs_reflink_find_cow_mappingChristoph Hellwig
Use xfs_iext_lookup_extent to look up the extent, drop a useless check, drop a unneeded return value and clean up the general style a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in __xfs_reflink_reserve_cowChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers xfs_file_iomap_begin_delayChristoph Hellwig
And only lookup the previous extent inside xfs_iomap_prealloc_size if we actually need it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: remove prev argument to xfs_bmapi_reserve_delallocChristoph Hellwig
We can easily lookup the previous extent for the cases where we need it, which saves the callers from looking it up for us later in the series. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in __xfs_bunmapiChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in xfs_bmapi_writeChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: use new extent lookup helpers in xfs_bmapi_readChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: cleanup xfs_bmap_last_beforeChristoph Hellwig
Rewrite the function using xfs_iext_lookup_extent and xfs_iext_get_extent, and massage the flow into something easily understandable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24xfs: new inode extent list lookup helpersChristoph Hellwig
xfs_iext_lookup_extent looks up a single extent at the passed in offset, and returns the extent covering the area, or the one behind it in case of a hole, as well as the index of the returned extent in arguments, as well as a simple bool as return value that is set to false if no extent could be found because the offset is behind EOF. It is a simpler replacement for xfs_bmap_search_extent that leaves looking up the rarely needed previous extent to the caller and has a nicer calling convention. xfs_iext_get_extent is a helper for iterating over the extent list, it takes an extent index as input, and returns the extent at that index in it's expanded form in an argument if it exists. The actual return value is a bool whether the index is valid or not. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-24Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux ↵James Morris
into next
2016-11-23Merge tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Anna Schumaker: "Most of these fix regressions or races, but there is one patch for stable that Arnd sent me Stable bugfix: - Hide array-bounds warning Bugfixes: - Keep a reference on lock states while checking - Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state - Don't call close if the open stateid has already been cleared - Fix CLOSE rases with OPEN - Fix a regression in DELEGRETURN" * tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.x: hide array-bounds warning NFSv4.1: Keep a reference on lock states while checking NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state NFSv4: Don't call close if the open stateid has already been cleared NFSv4: Fix CLOSE races with OPEN NFSv4.1: Fix a regression in DELEGRETURN
2016-11-23Btrfs: fix emptiness check for dirtied extent buffers at check_leaf()Filipe Manana
We can not simply use the owner field from an extent buffer's header to get the id of the respective tree when the extent buffer is from a relocation tree. When we create the root for a relocation tree we leave (on purpose) the owner field with the same value as the subvolume's tree root (we do this at ctree.c:btrfs_copy_root()). So we must ignore extent buffers from relocation trees, which have the BTRFS_HEADER_FLAG_RELOC flag set, because otherwise we will always consider the extent buffer as not being the root of the tree (the root of original subvolume tree is always different from the root of the respective relocation tree). This lead to assertion failures when running with the integrity checker enabled (CONFIG_BTRFS_FS_CHECK_INTEGRITY=y) such as the following: [ 643.393409] BTRFS critical (device sdg): corrupt leaf, non-root leaf's nritems is 0: block=38506496, root=260, slot=0 [ 643.397609] BTRFS info (device sdg): leaf 38506496 total ptrs 0 free space 3995 [ 643.407075] assertion failed: 0, file: fs/btrfs/disk-io.c, line: 4078 [ 643.408425] ------------[ cut here ]------------ [ 643.409112] kernel BUG at fs/btrfs/ctree.h:3419! [ 643.409773] invalid opcode: 0000 [#1] PREEMPT SMP [ 643.410447] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq ppdev psmouse acpi_cpufreq parport_pc evdev parport tpm_tis tpm_tis_core pcspkr serio_raw i2c_piix4 sg tpm i2c_core button processor loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring scsi_mod virtio e1000 floppy [ 643.414356] CPU: 11 PID: 32726 Comm: btrfs Not tainted 4.8.0-rc8-btrfs-next-35+ #1 [ 643.414356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 643.414356] task: ffff880145e95b00 task.stack: ffff88014826c000 [ 643.414356] RIP: 0010:[<ffffffffa0352759>] [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP: 0018:ffff88014826fa28 EFLAGS: 00010292 [ 643.414356] RAX: 0000000000000039 RBX: ffff88014e2d7c38 RCX: 0000000000000001 [ 643.414356] RDX: ffff88023f4d2f58 RSI: ffffffff81806c63 RDI: 00000000ffffffff [ 643.414356] RBP: ffff88014826fa28 R08: 0000000000000001 R09: 0000000000000000 [ 643.414356] R10: ffff88014826f918 R11: ffffffff82f3c5ed R12: ffff880172910000 [ 643.414356] R13: ffff880233992230 R14: ffff8801a68a3310 R15: fffffffffffffff8 [ 643.414356] FS: 00007f9ca305e8c0(0000) GS:ffff88023f4c0000(0000) knlGS:0000000000000000 [ 643.414356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 643.414356] CR2: 00007f9ca3071000 CR3: 000000015d01b000 CR4: 00000000000006e0 [ 643.414356] Stack: [ 643.414356] ffff88014826fa50 ffffffffa02d655a 000000000000000a ffff88014e2d7c38 [ 643.414356] 0000000000000000 ffff88014826faa8 ffffffffa02b72f3 ffff88014826fab8 [ 643.414356] 00ffffffa03228e4 0000000000000000 0000000000000000 ffff8801bbd4e000 [ 643.414356] Call Trace: [ 643.414356] [<ffffffffa02d655a>] btrfs_mark_buffer_dirty+0xdf/0xe5 [btrfs] [ 643.414356] [<ffffffffa02b72f3>] btrfs_copy_root+0x18a/0x1d1 [btrfs] [ 643.414356] [<ffffffffa0322921>] create_reloc_root+0x72/0x1ba [btrfs] [ 643.414356] [<ffffffffa03267c2>] btrfs_init_reloc_root+0x7b/0xa7 [btrfs] [ 643.414356] [<ffffffffa02d9e44>] record_root_in_trans+0xdf/0xed [btrfs] [ 643.414356] [<ffffffffa02db04e>] btrfs_record_root_in_trans+0x50/0x6a [btrfs] [ 643.414356] [<ffffffffa030ad2b>] create_subvol+0x472/0x773 [btrfs] [ 643.414356] [<ffffffffa030b406>] btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [<ffffffffa030b406>] ? btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [<ffffffff810781ac>] ? preempt_count_add+0x65/0x68 [ 643.414356] [<ffffffff811a6e97>] ? __mnt_want_write+0x62/0x77 [ 643.414356] [<ffffffffa030b55d>] btrfs_ioctl_snap_create_transid+0xce/0x187 [btrfs] [ 643.414356] [<ffffffffa030b67d>] btrfs_ioctl_snap_create+0x67/0x81 [btrfs] [ 643.414356] [<ffffffffa030ecfd>] btrfs_ioctl+0x508/0x20dd [btrfs] [ 643.414356] [<ffffffff81293e39>] ? __this_cpu_preempt_check+0x13/0x15 [ 643.414356] [<ffffffff81155eca>] ? handle_mm_fault+0x976/0x9ab [ 643.414356] [<ffffffff81091300>] ? arch_local_irq_save+0x9/0xc [ 643.414356] [<ffffffff8119a2b0>] vfs_ioctl+0x18/0x34 [ 643.414356] [<ffffffff8119a8e8>] do_vfs_ioctl+0x581/0x600 [ 643.414356] [<ffffffff814b9552>] ? entry_SYSCALL_64_fastpath+0x5/0xa8 [ 643.414356] [<ffffffff81093fe9>] ? trace_hardirqs_on_caller+0x17b/0x197 [ 643.414356] [<ffffffff8119a9be>] SyS_ioctl+0x57/0x79 [ 643.414356] [<ffffffff814b9565>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 643.414356] [<ffffffff81091b08>] ? trace_hardirqs_off_caller+0x3f/0xaa [ 643.414356] Code: 89 83 88 00 00 00 31 c0 5b 41 5c 41 5d 5d c3 55 89 f1 48 c7 c2 98 bc 35 a0 48 89 fe 48 c7 c7 05 be 35 a0 48 89 e5 e8 13 46 dd e0 <0f> 0b 55 89 f1 48 c7 c2 9f d3 35 a0 48 89 fe 48 c7 c7 7a d5 35 [ 643.414356] RIP [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP <ffff88014826fa28> [ 643.468267] ---[ end trace 6a1b3fb1a9d7d6e3 ]--- This can be easily reproduced by running xfstests with the integrity checker enabled. Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
2016-11-23Btrfs: fix BUG_ON in btrfs_mark_buffer_dirtyLiu Bo
This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y. Commit 1ba98d0 ("Btrfs: detect corruption when non-root leaf has zero item") assumes that a leaf is its root when leaf->bytenr == btrfs_root_bytenr(root), however, we should not use btrfs_root_bytenr(root) since it's mainly got updated during committing transaction. So the check can fail when doing COW on this leaf while it is a root. This changes to use "if (leaf == btrfs_root_node(root))" instead, just like how we check whether leaf is a root in __btrfs_cow_block(). Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Cc: stable@vger.kernel.org # 4.8+ Reported-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com>
2016-11-23f2fs: return directly if block has been removed from the victimYunlei He
If one block has been to written to a new place, just return in move data process. This patch check it again with holding page lock. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23Revert "f2fs: do not recover from previous remained wrong dnodes"Chao Yu
i_times of inode will be set with current system time which can be configured through 'date', so it's not safe to judge dnode block as garbage data or unchanged inode depend on i_times. Now, we have used enhanced 'cp_ver + cp' crc method to verify valid dnode block, so I expect recoverying invalid dnode is almost not possible. This reverts commit 807b1e1c8e08452948495b1a9985ab46d329e5c2. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: remove checkpoint in f2fs_freezeJaegeuk Kim
The generic freeze_super() calls sync_filesystems() before f2fs_freeze(). So, basically we don't need to do checkpoint in f2fs_freeze(). But, in xfs/068, it triggers circular locking problem below due to gc_mutex for checkpoint. ====================================================== [ INFO: possible circular locking dependency detected ] 4.9.0-rc1+ #132 Tainted: G OE ------------------------------------------------------- 1. wait for __sb_start_write() by [<ffffffff9845f353>] dump_stack+0x85/0xc2 [<ffffffff980e80bf>] print_circular_bug+0x1cf/0x230 [<ffffffff980eb4d0>] __lock_acquire+0x19e0/0x1bc0 [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffff9826bdd0>] __sb_start_write+0x130/0x200 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffffc08c7c3b>] f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffff98289991>] iput+0x171/0x2c0 [<ffffffffc08cfccf>] f2fs_sync_inode_meta+0x3f/0xf0 [f2fs] [<ffffffffc08cfe04>] block_operations+0x84/0x110 [f2fs] [<ffffffffc08cff78>] write_checkpoint+0xe8/0xf20 [f2fs] [<ffffffff980e979d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffff9803e9d9>] ? sched_clock+0x9/0x10 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc08c6df5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffff982a4f90>] ? do_fsync+0x70/0x70 [<ffffffff982a4f90>] ? do_fsync+0x70/0x70 [<ffffffff982a4fb0>] sync_fs_one_sb+0x20/0x30 [<ffffffff9826ca3e>] iterate_supers+0xae/0x100 [<ffffffff982a50b5>] sys_sync+0x55/0x90 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 2. wait for sbi->gc_mutex by [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220 [<ffffffff989063d6>] mutex_lock_nested+0x76/0x3f0 [<ffffffffc08c6de9>] f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc08c7a6c>] f2fs_freeze+0x1c/0x20 [f2fs] [<ffffffff9826b6ef>] freeze_super+0xcf/0x190 [<ffffffff9827eebc>] do_vfs_ioctl+0x53c/0x6a0 [<ffffffff9827f099>] SyS_ioctl+0x79/0x90 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: assign segments correctly for direct_ioJaegeuk Kim
Previously, we assigned CURSEG_WARM_DATA for direct_io, but if we have two or four logs, we do not use that type at all. Let's fix it. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: fix wrong i_atime recoveryChao Yu
Shouldn't update in-memory i_atime with on-disk i_mtime of inode when recovering inode. Shuoran found this bug which is hidden for a long time, honour is belong to him. Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: record inode updating status correctlyChao Yu
We should record updating status of inode only for living inode, for those unlinked inode it needs to clear its ino cache, otherwise after the ino was been reused, it will cause unneeded node page writing during ->fsync. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Trace reset zone eventsDamien Le Moal
Similarly to the regular discard, trace zone reset events. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Reset sequential zones on zoned block devicesDamien Le Moal
When a zoned block device is mounted, discarding sections contained in sequential zones must reset the zone write pointer. For sections contained in conventional zones, the regular discard is used if the drive supports it. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Cache zoned block devices zone typeDamien Le Moal
With the zoned block device feature enabled, section discard need to do a zone reset for sections contained in sequential zones, and a regular discard (if supported) for sections stored in conventional zones. Avoid the need for a costly report zones to obtain a section zone type when discarding it by caching the types of the device zones in the super block information. This cache is initialized at mount time for mounts with the zoned block device feature enabled. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Do not allow adaptive mode for host-managed zoned block devicesDamien Le Moal
The LFS mode is mandatory for host-managed zoned block devices as update in place optimizations are not possible for segments in sequential zones. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Always enable discard for zoned blocks devicesDamien Le Moal
Zone write pointer reset acts as discard for zoned block devices. So if the zoned block device feature is enabled, always declare that discard is enabled, even if the device does not actually support the command. For the same reason, prevent the use the "nodicard" mount option. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Suppress discard warning message for zoned block devicesDamien Le Moal
For zoned block devices, discard is replaced by zone reset. So do not warn if the device does not supports discard. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Check zoned block feature for host-managed zoned block devicesDamien Le Moal
The F2FS_FEATURE_BLKZONED feature indicates that the drive was formatted with zone alignment optimization. This is optional for host-aware devices, but mandatory for host-managed zoned block devices. So check that the feature is set in this latter case. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Use generic zoned block device terminologyDamien Le Moal
SMR stands for "Shingled Magnetic Recording" which makes sense only for hard disk drives (spinning rust). The ZBC/ZAC standards enable management of SMR disks, but solid state drives may also support those standards. So rename the HMSMR feature to BLKZONED to avoid a HDD centric terminology. For the same reason, rename f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Add missing break in switch-caseDamien Le Moal
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: avoid infinite loop in the EIO case on recover_orphan_inodesJaegeuk Kim
This patch should fix an infinite loop case below. F2FS-fs : inject IO error in f2fs_read_end_io+0xf3/0x120 [f2fs] F2FS-fs (nvme0n1p1): recover_orphan_inode: orphan failed (ino=39ac1a), run fsck to fix. ... [<ffffffffc0b11ede>] sync_meta_pages+0xae/0x270 [f2fs] [<ffffffffc0b288dd>] ? flush_sit_entries+0x8d/0x960 [f2fs] [<ffffffffc0b13801>] write_checkpoint+0x361/0xf20 [f2fs] [<ffffffffb40e979d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0b0a199>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0b0a1a5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc0b2560e>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0b216c4>] f2fs_write_node_pages+0x34/0x320 [f2fs] [<ffffffffb41dff21>] do_writepages+0x21/0x30 [<ffffffffb429edb1>] __writeback_single_inode+0x61/0x760 [<ffffffffb490a937>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffb42a0805>] writeback_single_inode+0xd5/0x190 [<ffffffffb42a0959>] write_inode_now+0x99/0xc0 [<ffffffffb4289a16>] iput+0x1f6/0x2c0 [<ffffffffc0b0e3be>] f2fs_fill_super+0xe0e/0x1300 [f2fs] [<ffffffffb426c394>] ? sget_userns+0x4f4/0x530 [<ffffffffb426c692>] mount_bdev+0x182/0x1b0 [<ffffffffc0b0d5b0>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0b0a375>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffffb426d038>] mount_fs+0x38/0x170 [<ffffffffb428ec9b>] vfs_kern_mount+0x6b/0x160 [<ffffffffb4291d9e>] do_mount+0x1be/0xd60 [<ffffffffb4291a57>] ? copy_mount_options+0xb7/0x220 [<ffffffffb4292c54>] SyS_mount+0x94/0xd0 [<ffffffffb490b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: report error of f2fs_fill_dentriesChao Yu
Report error of f2fs_fill_dentries to ->iterate_shared, otherwise when error ocurrs, user may just list part of dirents in target directory without any hints. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: hide a maybe-uninitialized warningArnd Bergmann
gcc is unsure about the use of last_ofs_in_node, which might happen without a prior initialization: fs/f2fs//git/arm-soc/fs/f2fs/data.c: In function ‘f2fs_map_blocks’: fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) { As pointed out by Chao Yu, the code is actually correct as 'prealloc' is only set if the last_ofs_in_node has been set, the two always get updated together. This initializes last_ofs_in_node to dn.ofs_in_node for each new dnode at the start of the 'next_block' loop, which at that point is a correct initialization as well. I assume that compilers that correctly track the contents of the variables and do not warn about the condition also figure out that they can eliminate the extra assignment here. Fixes: 46008c6d4232 ("f2fs: support in batch multi blocks preallocation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: remove percpu_count due to performance regressionJaegeuk Kim
This patch removes percpu_count usage due to performance regression in iozone. Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: make clean inodes when flushing inode pageJaegeuk Kim
This patch tries to make more clean inodes when flushing dirty inodes in checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: keep dirty inodes selectively for checkpointJaegeuk Kim
This is to avoid no free segment bug during checkpoint caused by a number of dirty inodes. The case was reported by Chao like this. 1. mount with lazytime option 2. fill 4k file until disk is full 3. sync filesystem 4. read all files in the image 5. umount In this case, we actually don't need to flush dirty inode to inode page during checkpoint. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: use BIO_MAX_PAGES for bio allocationJaegeuk Kim
We don't need to allocate bio partially in order to maximize sequential writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: declare static function for __build_free_nidsJaegeuk Kim
This patch avoids build warning. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: call f2fs_balance_fs for setattrJaegeuk Kim
If inode becomes dirty, we need to check the # of dirty inodes whether or not further checkpoint would be required. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: count dirty inodes to flush node pages during checkpointJaegeuk Kim
If there are a lot of dirty inodes, we need to flush all of them when doing checkpoint. So, we need to count this for enough free space. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: avoid casted negative value as shrink countChao Yu
This patch makes sure it returns a positive value instead of a probable casted negative value as shrink count. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: don't interrupt free nids building during nid allocationChao Yu
Let build_free_nids support sync/async methods, in allocation flow of nids, we use synchronuous method, so that we can avoid looping in alloc_nid when free memory is low; in unblock_operations and f2fs_balance_fs_bg we use asynchronuous method in where low memory condition can interrupt us. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: clean up free nid list operationsJaegeuk Kim
This patch cleans up to use consistent free nid list ops. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: split free nid listChao Yu
During free nid allocation, in order to do preallocation, we will tag free nid entry as allocated one and still leave it in free nid list, for other allocators who want to grab free nids, it needs to traverse the free nid list for lookup. It becomes overhead in scenario of allocating free nid intensively by multithreads. This patch splits free nid list to two list: {free,alloc}_nid_list, to keep free nids and preallocated free nids separately, after that, traverse latency will be gone, besides split nid_cnt for separate statistic. Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: clear nlink if fail to add_linkChao Yu
We don't need to keep incomplete created inode in cache, so if we fail to add link into directory during new inode creation, it's better to set nlink of inode to zero, then we can evict inode immediately. Otherwise release of nid belong to inode will be delayed until inode cache is being shrunk, it may cause a seemingly endless loop while allocating free nids in time of testing generic/269 case of fstest suit. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: add update_inode_page to fix kernel panic] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: fix sparse warningsEric Biggers
f2fs contained a number of endianness conversion bugs. Also, one function should have been 'static'. Found with sparse by running 'make C=2 CF=-D__CHECK_ENDIAN__ fs/f2fs/' Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>