summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-26edac: remove tile driverArnd Bergmann
The Tile architecture is obsolete and getting removed from the kernel, this driver appears to only be used there, and not on the ARM based successors (Tile-Mx, BlueField), so we should remove it as well. Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26scripts/checkstack.pl: remove blackfin supportTobias Klauser
The Blackfin port has been removed from the kernel, also remove the blackfin specific bits from the checkstack.pl script. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26ktest: remove obsolete architecturesArnd Bergmann
A number of architectures are being removed from the kernel, so we no longer need to test them. Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26recordmcount.pl: drop blackin and tile supportArnd Bergmann
These two architectures are getting removed, so we no longer need the special cases. Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26Documentation: arch-support: remove obsolete architecturesArnd Bergmann
A number of architecture ports are obsolete and getting dropped, so we no longer want to track the respective features. We already removed the lines for metag and mn10300, this does the same edits for all the others. For the remaining 21 architectures, this shows how many are known to implement each given feature: 19 time/modern-timekeeping/arch-support.txt 19 time/clockevents/arch-support.txt 15 core/tracehook/arch-support.txt 14 core/generic-idle-thread/arch-support.txt 13 locking/lockdep/arch-support.txt 12 io/dma-api-debug/arch-support.txt 11 debug/kgdb/arch-support.txt 10 time/virt-cpuacct/arch-support.txt 9 debug/kretprobes/arch-support.txt 9 debug/kprobes/arch-support.txt 8 vm/THP/arch-support.txt 8 vm/pte_special/arch-support.txt 8 vm/numa-memblock/arch-support.txt 8 io/sg-chain/arch-support.txt 7 perf/kprobes-event/arch-support.txt 7 locking/rwsem-optimized/arch-support.txt 7 debug/gcov-profile-all/arch-support.txt 7 core/jump-labels/arch-support.txt 7 core/BPF-JIT/arch-support.txt 6 vm/ELF-ASLR/arch-support.txt 6 time/context-tracking/arch-support.txt 6 seccomp/seccomp-filter/arch-support.txt 6 debug/stackprotector/arch-support.txt 5 time/irq-time-acct/arch-support.txt 5 io/dma-contiguous/arch-support.txt 5 debug/uprobes/arch-support.txt 4 vm/ioremap_prot/arch-support.txt 4 time/arch-tick-broadcast/arch-support.txt 4 perf/perf-stackdump/arch-support.txt 4 perf/perf-regs/arch-support.txt 3 debug/KASAN/arch-support.txt 2 vm/PG_uncached/arch-support.txt 2 vm/huge-vmap/arch-support.txt 2 sched/numa-balancing/arch-support.txt 2 sched/membarrier-sync-core/arch-support.txt 2 locking/cmpxchg-local/arch-support.txt 2 debug/optprobes/arch-support.txt 2 debug/kprobes-on-ftrace/arch-support.txt 1 vm/TLB/arch-support.txt 1 locking/queued-spinlocks/arch-support.txt 1 locking/queued-rwlocks/arch-support.txt 1 debug/user-ret-profiler/arch-support.txt 0 lib/strncasecmp/arch-support.txt Note that the list does not include riscv or nds32 yet, these still need to be added. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26asm-generic: clean up asm/unistd.hArnd Bergmann
The score architecture used a number of old system calls for compatibility with a traditional libc port, all architectures that got added later skip these. With score out of the way, we can finally clean up the syscall list to no longer provide these. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26asm-generic: siginfo: define ia64 si_codes unconditionallyArnd Bergmann
Unlike system call numbers the assignment of si_codes has never had a reason to be made per architecture. Some architectures have had unique conditions to report and reporting those conditions needed new si_codes. Nothing has ever needed si_codes to have different values on different architectures. The si_code space is vast so even with defining all si_codes on all architectures there is no danger in running out of si_code values. The history of the si_codes BUS_MCEERR_AR, BUS_MCEER_AO, SEGV_BNDERR, and SEGV_PKUERR show that a need of one architecture frequently becomes a need of another architecture which makes sharing si_codes between architectures a positive benefit and something to be encouraged. Where there are no conflicts with the historical ia64 arch specific si_codes and any other si_codes make them generic si_codes. We might need them on another architecture someday. This leaves only the good example of arch generic si_codes in the kernel for future architectures and architecture enhancments to follow. Without bad examples to follow it should be easy to avoid the mistakes of the past. Reported-by: Eric W. Biederman <ebiederm@xmission.com> [arnd: took Eric's changelog text] Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26asm-generic: siginfo: remove obsolete #ifdefsArnd Bergmann
The frv, tile and blackfin architectures are being removed, so we can clean up this header by removing all the special cases except those for ia64. The SEGV_BNDERR and BUS_MCEERR_AR si_code macros are now defined unconditionally on all remaining architectures. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26treewide: simplify Kconfig dependencies for removed archsArnd Bergmann
A lot of Kconfig symbols have architecture specific dependencies. In those cases that depend on architectures we have already removed, they can be omitted. Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26x86/devicetree: Use CPU description from Device TreeIvan Gorinov
Current x86 Device Tree implementation does not support multiprocessing. Use new DT bindings to describe the processors. Signed-off-by: Ivan Gorinov <ivan.gorinov@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Link: https://lkml.kernel.org/r/c291fb2cef51b730b59916d7745be0eaa4378c6c.1521753738.git.ivan.gorinov@intel.com
2018-03-26of/Documentation: Specify local APIC ID in "reg"Ivan Gorinov
Use the "reg" property to specify the processor's local APIC ID instead of setting it to the CPU node index in Device Tree. Local APIC ID is assigned by hardware and visible in the APIC ID register. Some processor models allow APIC ID to be changed by software, but CPUID instruction executed with %eax = 0x0b always returns the initial ID in %edx. Local APIC ID does not match the node index in many systems. Signed-off-by: Ivan Gorinov <ivan.gorinov@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Link: https://lkml.kernel.org/r/4b1a471a56ac0ebd7510f4759afce9104595d6da.1521753738.git.ivan.gorinov@intel.com
2018-03-26Btrfs: dev-replace: make sure target is identical to source when raid56 ↵Liu Bo
rebuild fails In the last step of scrub_handle_error_block, we try to combine good copies on all possible mirrors, this works fine for raid1 and raid10, but not for raid56 as it's doing parity rebuild. If parity rebuild doesn't get back with correct data which matches its checksum, in case of replace we'd rather write what is stored in the source device than the data calculuated from parity. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: raid56: remove redundant async_missing_raid56Liu Bo
async_missing_raid56() is identical to async_read_rebuild(). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: adjust return values of btrfs_inode_by_nameSu Yue
Previously, btrfs_inode_by_name() returned 0 which left caller to check objectid of location even location if the type was invalid. Let btrfs_inode_by_name() return -EUCLEAN if a corrupted location of a dir entry is found. Removal of label out_err also simplifies the function. Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ drop unlikely ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: rename btrfs_close_extra_device to btrfs_free_extra_devidsAnand Jain
This function btrfs_close_extra_devices() is about freeing extra devids which once it may have belonged to this filesystem. So rename it and add the comment. The _devid suffix is appropriate as this function won't handle devices which are outside of the filesytem being mounted. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove root argument from cow_file_range_inlineNikolay Borisov
This argument is always set to the root of the inode, which is also passed. So let's get a reference inside the function and simplify the arg list. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: send: fix typo in TLV_PUTLiu Bo
According to tlv_put()'s prototype, data and attrlen needs to be exchanged in the macro, but seems all callers are already aware of this misorder and are therefore not affected. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove root argument from btrfs_log_dentry_safeNikolay Borisov
Now that nothing uses the root arg of btrfs_log_dentry_safe it can be safely removed. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove root arg from btrfs_log_inode_parentNikolay Borisov
btrfs_log_inode_parent is called from 2 places (btrfs_log_dentry_safe and btrfs_log_new_name) both of which pass inode->root as the root argument and the inode itself. Remove the redundant root argument and get a reference to the root directly from the inode, also remove redundant root != inode->root check from the same function. No functional change. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove redundant comment from btrfs_search_forwardNikolay Borisov
This function always sets keep_locks to 1 and saves the old value of keep_locks which is restored at the end. So there is no way it can be called without keep_locks being set. Remove comment imposing redundant requirement on callers. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: move btrfs_listxattr prototype to xattr.hDavid Sterba
There's a proper header for xattr handlers. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: adjust return type of btrfs_getxattrDavid Sterba
The xattr_handler::get prototype returns int, use it. The only ssize_t exception is the per-inode listxattr handler. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: drop extern from function declarationsDavid Sterba
Extern for functions does not make any difference, there are only a few so let's remove them before it's too late. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: drop underscores from exported xattr functionsDavid Sterba
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: send, do not issue unnecessary truncate operationsFilipe Manana
When send finishes processing an inode representing a regular file, it always issues a truncate operation for that file, even if its size did not change or the last write sets the file size correctly. In the most common cases, the issued write operations set the file to correct size (either full or incremental sends) or the file size did not change (for incremental sends), so the only case where a truncate operation is needed is when a file size becomes smaller in the send snapshot when compared to the parent snapshot. By not issuing unnecessary truncate operations we reduce the stream size and save time in the receiver. Currently truncating a file to the same size triggers writeback of its last page (if it's dirty) and waits for it to complete (only if the file size is not aligned with the filesystem's sector size). This is being fixed by another patch and is independent of this change (that patch's title is "Btrfs: skip writeback of last page when truncating file to same size"). The following script was used to measure time spent by a receiver without this change applied, with this change applied, and without this change and with the truncate fix applied (the fix to not make it start and wait for writeback to complete). $ cat test_send.sh #!/bin/bash SRC_DEV=/dev/sdc DST_DEV=/dev/sdd SRC_MNT=/mnt/sdc DST_MNT=/mnt/sdd mkfs.btrfs -f $SRC_DEV >/dev/null mkfs.btrfs -f $DST_DEV >/dev/null mount $SRC_DEV $SRC_MNT mount $DST_DEV $DST_MNT echo "Creating source filesystem" for ((t = 0; t < 10; t++)); do ( for ((i = 1; i <= 20000; i++)); do xfs_io -f -c "pwrite -S 0xab 0 5000" \ $SRC_MNT/file_$i > /dev/null done ) & worker_pids[$t]=$! done wait ${worker_pids[@]} echo "Creating and sending snapshot" btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null /usr/bin/time -f "send took %e seconds" \ btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1 /usr/bin/time -f "receive took %e seconds" \ btrfs receive -f $SRC_MNT/send_file $DST_MNT umount $SRC_MNT umount $DST_MNT The results, which are averages for 5 runs for each case, were the following: * Without this change average receive time was 26.49 seconds standard deviation of 2.53 seconds * Without this change and with the truncate fix average receive time was 12.51 seconds standard deviation of 0.32 seconds * With this change and without the truncate fix average receive time was 10.02 seconds standard deviation of 1.11 seconds Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: skip writeback of last page when truncating file to same sizeFilipe Manana
When we truncate a file to the same size and that size is not aligned with the sector size, we end up triggering writeback (and wait for it to complete) of the last page. This is unncessary as we can not have delayed allocation beyond the inode's i_size and the goal of truncating a file to its own size is to discard prealloc extents (allocated via the fallocate(2) system call). Besides the unnecessary IO start and wait, it also breaks the oppurtunity for larger contiguous extents on disk, as before the last dirty page there might be other dirty pages. This scenario is probably not very common in general, however it is common for btrfs receive implementations because currently the send stream always issues a truncate operation for each processed inode as the last operation for that inode (this truncate operation is not always needed and the send implementation will be addressed to avoid them). So improve this by not starting and waiting for writeback of the inode's last page when we are truncating to exactly the same size. The following script was used to quickly measure the time a receive operation takes: $ cat test_send.sh #!/bin/bash SRC_DEV=/dev/sdc DST_DEV=/dev/sdd SRC_MNT=/mnt/sdc DST_MNT=/mnt/sdd mkfs.btrfs -f $SRC_DEV >/dev/null mkfs.btrfs -f $DST_DEV >/dev/null mount $SRC_DEV $SRC_MNT mount $DST_DEV $DST_MNT echo "Creating source filesystem" for ((t = 0; t < 10; t++)); do ( for ((i = 1; i <= 20000; i++)); do xfs_io -f -c "pwrite -S 0xab 0 5000" \ $SRC_MNT/file_$i > /dev/null done ) & worker_pids[$t]=$! done wait ${worker_pids[@]} echo "Creating and sending snapshot" btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null /usr/bin/time -f "send took %e seconds" \ btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1 /usr/bin/time -f "receive took %e seconds" \ btrfs receive -f $SRC_MNT/send_file $DST_MNT umount $SRC_MNT umount $DST_MNT The results for 5 runs were the following: * Without this change average receive time was 26.49 seconds standard deviation of 2.53 seconds * With this change average receive time was 12.51 seconds standard deviation of 0.32 seconds Reported-by: Robbie Ko <robbieko@synology.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: dev-replace: skip prealloc extents when copy nocow pagesLiu Bo
It doens't make sense to process prealloc extents as pages will be filled with zero when reading prealloc extents. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: unify types for metadata_ratio and data_chunk_allocationsAnand Jain
We have btrfs_fs_info::data_chunk_allocations and btrfs_fs_info::metadata_ratio declared as unsigned which would be unsinged int and kernel style prefers unsigned int over bare unsigned. So this patch changes them to u32. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove redundant memory barriers around dio_private error statusNikolay Borisov
Using any kind of memory barriers around atomic operations which have a return value is redundant, since those operations themselves are fully ordered. atomic_t.txt states: - RMW operations that have a return value are fully ordered; Fully ordered primitives are ordered against everything prior and everything subsequent. Therefore a fully ordered primitive is like having an smp_mb() before and an smp_mb() after the primitive. Given this let's replace the extra memory barriers with comments. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: remove assert in btrfs_init_dev_replace_tgtdev()Anand Jain
In the same function we just ran btrfs_alloc_device() which means the btrfs_device::resized_list is sure to be empty and we are protected with the btrfs_fs_info::volume_mutex. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: add more __cold annotationsDavid Sterba
The __cold functions are placed to a special section, as they're expected to be called rarely. This could help i-cache prefetches or help compiler to decide which branches are more/less likely to be taken without any other annotations needed. Though we can't add more __exit annotations, it's still possible to add __cold (that's also added with __exit). That way the following function categories are tagged: - printf wrappers, error messages - exit helpers Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: add (the only possible) __exit annotationDavid Sterba
Recently, the __init annotations have been added. There's unfortunatelly only one case where we can add __exit, because most of the cleanup helpers are also called from the __init phase. As the __exit annotated functions get discarded completely for a built-in code, we'd miss them from the init phase. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: verify subvolid mount parameterAnand Jain
We aren't verifying the parameter passed to the subvolid mount option, so we won't report and fail the mount if a junk value is specified for example, -o subvolid=abc. This patch verifies the subvolid option with match_u64. Up to now the memparse function accepts the K/M/G/ suffixes, that are usually meant for size values and do not make sense for a subvolume it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: fix unexpected cow in run_delalloc_nocowLiu Bo
Fstests generic/475 provides a way to fail metadata reads while checking if checksum exists for the inode inside run_delalloc_nocow(), and csum_exist_in_range() interprets error (-EIO) as inode having checksum and makes its caller enter the cow path. In case of free space inode, this ends up with a warning in cow_file_range(). The same problem applies to btrfs_cross_ref_exist() since it may also read metadata in between. With this, run_delalloc_nocow() bails out when errors occur at the two places. cc: <stable@vger.kernel.org> v2.6.28+ Fixes: 17d217fe970d ("Btrfs: fix nodatasum handling in balancing code") Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove custom crc32c init codeNikolay Borisov
The custom crc32 init code was introduced in 14a958e678cd ("Btrfs: fix btrfs boot when compiled as built-in") to enable using btrfs as a built-in. However, later as pointed out by 60efa5eb2e88 ("Btrfs: use late_initcall instead of module_init") this wasn't enough and finally btrfs was switched to late_initcall which comes after the generic crc32c implementation is initiliased. The latter commit superseeded the former. Now that we don't have to maintain our own code let's just remove it and switch to using the generic implementation. Despite touching a lot of files the patch is really simple. Here is the gist of the changes: 1. Select LIBCRC32C rather than the low-level modules. 2. s/btrfs_crc32c/crc32c/g 3. replace hash.h with linux/crc32c.h 4. Move the btrfs namehash funcs to ctree.h and change the tree accordingly. I've tested this with btrfs being both a module and a built-in and xfstest doesn't complain. Does seem to fix the longstanding problem of not automatically selectiong the crc32c module when btrfs is used. Possibly there is a workaround in dracut. The modinfo confirms that now all the module dependencies are there: before: depends: zstd_compress,zstd_decompress,raid6_pq,xor,zlib_deflate after: depends: libcrc32c,zstd_compress,zstd_decompress,raid6_pq,xor,zlib_deflate Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add more info to changelog from mails ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26libcrc32c: Add crc32c_impl functionNikolay Borisov
This function returns a string with the currently in-use implementation of the crc32c algorithm, i.e crc32c-generic (for unoptimised, generic implementation) or crc32c-intel for the sse optimised version. This will be used by btrfs. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [ use crypto_shash_driver_name as suggested by Herbert ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Refactor __get_raid_index() to btrfs_bg_flags_to_raid_index()Qu Wenruo
Function __get_raid_index() is used to convert block group flags into raid index, which can be used to get various info directly from btrfs_raid_array[]. Refactor this function a little: 1) Rename to btrfs_bg_flags_to_raid_index() Double underline prefix is normally for internal functions, while the function is used by both extent-tree and volumes. Although the name is a little longer, but it should explain its usage quite well. 2) Move it to volumes.h and make it static inline Just several if-else branches, really no need to define it as a normal function. This also makes later code re-use between kernel and btrfs-progs easier. 3) Remove function get_block_group_index() Really no need to do such a simple thing as an exported function. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: tree-checker: Replace root parameter with fs_infoQu Wenruo
When inspecting the error message with real corruption, the "root=%llu" always shows "1" (root tree), instead of the correct owner. The problem is that we are getting @root from page->mapping->host, which points the same btree inode, so we will always get the same root. This makes the root owner output meaningless, and harder to port tree-checker to btrfs-progs. So get rid of the false and meaningless @root parameter and replace it with @fs_info. To get the owner, we can only rely on btrfs_header_owner() now. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: add tracepoint for em's EEXIST caseLiu Bo
This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the subtle bugs around merge_extent_mapping. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Move qgroup rescan on quota enable to btrfs_quota_enableNikolay Borisov
Currently btrfs_run_qgroups is doing a bit too much. Not only is it responsible for synchronizing in-memory state of qgroups to disk but it also contains code to trigger the initial qgroup rescan when quota is enabled initially. This condition is detected by checking that BTRFS_FS_QUOTA_ENABLED is not set and BTRFS_FS_QUOTA_ENABLING is set. Nothing really requires from the code to be structured (and scattered) the way it is so let's streamline things. First move the quota rescan code into btrfs_quota_enable, where its invocation is closer to the use. This also makes the FS_QUOTA_ENABLING flag redundant so let's remove it as well. This has been tested with a full xfstest run with qgroups enabled on the scratch device of every xfstest and no regressions were observed. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: use reada direction enum instead of constant value in ↵Gu JinXiang
load_free_space_tree load_free_space_tree calls either function load_free_space_bitmaps or load_free_space_extents. And either of those two will lead to call btrfs_next_item. So in function load_free_space_tree, use READA_FORWARD to read forward ahead. This also changes the value from READA_BACK to READA_FORWARD, since according to the logic, it should reada_for_search forward, not backward. Signed-off-by: Gu JinXiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: use reada direction enum instead of constant value in ↵Gu Jinxiang
populate_free_space_tree populate_free_space_tree calls function btrfs_search_slot_for_read with parameter int find_higher = 1, it means that, if no exact match is found, then use the next higher item. So in function populate_free_space_tree, use READA_FORWARD to read forward ahead. This also changes the value from READA_BACK to READA_FORWARD, since according to the logic, it should reada_for_search forward, not backward. Signed-off-by: Gu JinXiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Remove btrfs_inode::delayed_iput_countNikolay Borisov
delayed_iput_count wa supposed to be used to implement, well, delayed iput. The idea is that we keep accumulating the number of iputs we do until eventually the inode is deleted. Turns out we never really switched the delayed_iput_count from 0 to 1, hence all conditional code relying on the value of that member being different than 0 was never executed. This, as it turns out, didn't cause any problem due to the simple fact that the generic inode's i_count member was always used to count the number of iputs. So let's just remove the unused member and all unused code. This patch essentially provides no functional changes. While at it, also add proper documentation for btrfs_add_delayed_iput Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ reformat comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: volumes: Cleanup stripe size calculationQu Wenruo
Cleanup the following things: 1) open coded SZ_16M round up 2) use min() to replace open-coded size comparison 3) code style Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com> [ reformat comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Streamline btrfs_delalloc_reserve_metadata initial operationsNikolay Borisov
The behavior of btrfs_delalloc_reserve_metadata depends on whether the inode we are allocating for is the freespace inode or not. As it stands if we are the free node we set 'flush' and 'delalloc_lock' variable to certain values. Subsequently we check the values of those vars and act accordingly. Instead, simplify things by having 1 if which checks whether we are the freespace inode or not and do any specific operation in either branches of that if. This makes the code a bit easier to understand, as an added bonus it also shrinks the compiled size: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-17 (-17) Function old new delta btrfs_delalloc_reserve_metadata 1876 1859 -17 Total: Before=85966, After=85949, chg -0.02% No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Edmund Nadolski <enadolski@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: insert newly opened device to the end of the listAnand Jain
Add opened device to the tail of dev_alloc_list instead of head, so that it maintains the same order as dev_list. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: keep device list sortedAnand Jain
By maintaining the device list sorted lets us reproduce the problems related to missing chunk in the degraded mode much more consistent. So fix this by sorting the devices by devid within the kernel. So that we know which device is assigned to the struct fs_info::latest_bdev when all the devices are having and same SB generation. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26Btrfs: do not check inode's runtime flags under root->orphan_lockLiu Bo
It's not necessary to hold ->orphan_lock when checking inode's runtime flags. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Use schedule_timeout_interruptibleNikolay Borisov
Instead of manually fiddling with the state of the task (RUNNING->INTERRUPTIBLE->RUNNING) again just use schedule_timeout_interruptible which adjusts the task state as needed. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26btrfs: Move error handling of btrfs_start_dirty_block_groups closer to call siteNikolay Borisov
Even though btrfs_start_dirty_block_groups is fairly in the beginning of btrfs_commit_transaction outside of the critical section defined by the transaction states it can only be run by a single comitter. In other words it defines its own critical section thanks to the BTRFS_TRANS_DIRTY_BG run flag and ro_block_group_mutex. However, its error handling is outside of this critical section which is a bit counter-intuitive. So move the error handling righ after the function is executed and let the sole runner of dirty block groups handle the return value. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>