summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-31lib/test_bitmap: correct test data offsets for 32-bitAndy Shevchenko
On 32-bit platform the size of long is only 32 bits which makes wrong offset in the array of 64 bit size. Calculate offset based on BITS_PER_LONG. Link: http://lkml.kernel.org/r/20200109103601.45929-1-andriy.shevchenko@linux.intel.com Fixes: 30544ed5de43 ("lib/bitmap: introduce bitmap_replace() helper") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yury Norov <yury.norov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31kdb: Use for_each_console() helperAndy Shevchenko
Replace open coded single-linked list iteration loop with for_each_console() helper in use. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31kdb: remove redundant assignment to pointer bpColin Ian King
The point bp is assigned a value that is never read, it is being re-assigned later to bp = &kdb_breakpoints[lowbp] in a for-loop. Remove the redundant assignment. Addresses-Coverity ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20191128130753.181246-1-colin.king@canonical.com Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31kdb: Get rid of confusing diag msg from "rd" if current task has no regsDouglas Anderson
If you switch to a sleeping task with the "pid" command and then type "rd", kdb tells you this: No current kdb registers. You may need to select another task diag: -17: Invalid register name The first message makes sense, but not the second. Fix it by just returning 0 after commands accessing the current registers finish if we've already printed the "No current kdb registers" error. While fixing kdb_rd(), change the function to use "if" rather than "ifdef". It cleans the function up a bit and any modern compiler will have no trouble handling still producing good code. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111624.5.I121f4c6f0c19266200bf6ef003de78841e5bfc3d@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31kdb: Gid rid of implicit setting of the current task / regsDouglas Anderson
Some (but not all?) of the kdb backtrace paths would cause the kdb_current_task and kdb_current_regs to remain changed. As discussed in a review of a previous patch [1], this doesn't seem intuitive, so let's fix that. ...but, it turns out that there's actually no longer any reason to set the current task / current regs while backtracing anymore anyway. As of commit 2277b492582d ("kdb: Fix stack crawling on 'running' CPUs that aren't the master") if we're backtracing on a task running on a CPU we ask that CPU to do the backtrace itself. Linux can do that without anything fancy. If we're doing backtrace on a sleeping task we can also do that fine without updating globals. So this patch mostly just turns into deleting a bunch of code. [1] https://lore.kernel.org/r/20191010150735.dhrj3pbjgmjrdpwr@holly.lan Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111624.4.Ibc3d982bbeb9e46872d43973ba808cd4c79537c7@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31kdb: kdb_current_task shouldn't be exportedDouglas Anderson
The kdb_current_task variable has been declared in "kernel/debug/kdb/kdb_private.h" since 2010 when kdb was added to the mainline kernel. This is not a public header. There should be no reason that kdb_current_task should be exported and there are no in-kernel users that need it. Remove the export. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111623.3.I14b22b5eb15ca8f3812ab33e96621231304dc1f7@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31kdb: kdb_current_regs should be privateDouglas Anderson
As of the patch ("MIPS: kdb: Remove old workaround for backtracing on other CPUs") there is no reason for kdb_current_regs to be in the public "kdb.h". Let's move it next to kdb_current_task. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111623.2.Iadbfb484e90b557cc4b5ac9890bfca732cd99d77@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31MIPS: kdb: Remove old workaround for backtracing on other CPUsDouglas Anderson
As of commit 2277b492582d ("kdb: Fix stack crawling on 'running' CPUs that aren't the master") we no longer need any special case for doing stack dumps on CPUs that are not the kdb master. Let's remove. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111623.1.I30a0cac4d9880040c8d41495bd9a567fe3e24989@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-01-31Merge tag 'kvm-5.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "This is the first batch of KVM changes. ARM: - cleanups and corner case fixes. PPC: - Bugfixes x86: - Support for mapping DAX areas with large nested page table entries. - Cleanups and bugfixes here too. A particularly important one is a fix for FPU load when the thread has TIF_NEED_FPU_LOAD. There is also a race condition which could be used in guest userspace to exploit the guest kernel, for which the embargo expired today. - Fast path for IPI delivery vmexits, shaving about 200 clock cycles from IPI latency. - Protect against "Spectre-v1/L1TF" (bring data in the cache via speculative out of bound accesses, use L1TF on the sibling hyperthread to read it), which unfortunately is an even bigger whack-a-mole game than SpectreV1. Sean continues his mission to rewrite KVM. In addition to a sizable number of x86 patches, this time he contributed a pretty large refactoring of vCPU creation that affects all architectures but should not have any visible effect. s390 will come next week together with some more x86 patches" * tag 'kvm-5.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) x86/KVM: Clean up host's steal time structure x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed x86/kvm: Cache gfn to pfn translation x86/kvm: Introduce kvm_(un)map_gfn() x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bit KVM: PPC: Book3S PR: Fix -Werror=return-type build failure KVM: PPC: Book3S HV: Release lock on page-out failure path KVM: arm64: Treat emulated TVAL TimerValue as a signed 32-bit integer KVM: arm64: pmu: Only handle supported event counters KVM: arm64: pmu: Fix chained SW_INCR counters KVM: arm64: pmu: Don't mark a counter as chained if the odd one is disabled KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset KVM: x86: Use a typedef for fastop functions KVM: X86: Add 'else' to unify fastop and execute call path KVM: x86: inline memslot_valid_for_gpte KVM: x86/mmu: Use huge pages for DAX-backed files KVM: x86/mmu: Remove lpage_is_disallowed() check from set_spte() KVM: x86/mmu: Fold max_mapping_level() into kvm_mmu_hugepage_adjust() KVM: x86/mmu: Zap any compound page when collapsing sptes KVM: x86/mmu: Remove obsolete gfn restoration in FNAME(fetch) ...
2020-01-31mlxsw: spectrum_qdisc: Fix 64-bit division error in mlxsw_sp_qdisc_tbf_rate_kbpsNathan Chancellor
When building arm32 allmodconfig: ERROR: "__aeabi_uldivmod" [drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined! rate_bytes_ps has type u64, we need to use a 64-bit division helper to avoid a build error. Fixes: a44f58c41bfb ("mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31ionic: fix rxq comp packet type maskShannon Nelson
Be sure to include all the packet type bits in the mask. Fixes: fbfb8031533c ("ionic: Add hardware init and device commands") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31net: phy: at803x: disable vddio regulatorMichael Walle
The probe() might enable a VDDIO regulator, which needs to be disabled again before calling regulator_put(). Add a remove() function. Fixes: 2f664823a470 ("net: phy: at803x: add device tree binding") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31net: mii_timestamper: fix static allocation by PHY driverMichael Walle
If phydev->mii_ts is set by the PHY driver, it will always be overwritten in of_mdiobus_register_phy(). Fix it. Also make sure, that the unregister() doesn't do anything if the mii_timestamper was provided by the PHY driver. Fixes: 1dca22b18421 ("net: mdio: of: Register discovered MII time stampers.") Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31net: mdio: of: fix potential NULL pointer derefernceMichael Walle
of_find_mii_timestamper() returns NULL if no timestamper is found. Therefore, guard the unregister_mii_timestamper() calls. Fixes: 1dca22b18421 ("net: mdio: of: Register discovered MII time stampers.") Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31Btrfs: send, fix emission of invalid clone operations within the same fileFilipe Manana
When doing an incremental send and a file has extents shared with itself at different file offsets, it's possible for send to emit clone operations that will fail at the destination because the source range goes beyond the file's current size. This happens when the file size has increased in the send snapshot, there is a hole between the shared extents and both shared extents are at file offsets which are greater the file's size in the parent snapshot. Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt/sdb $ xfs_io -f -c "pwrite -S 0xf1 0 64K" /mnt/sdb/foobar $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base $ btrfs send -f /tmp/1.snap /mnt/sdb/base # Create a 320K extent at file offset 512K. $ xfs_io -c "pwrite -S 0xab 512K 64K" /mnt/sdb/foobar $ xfs_io -c "pwrite -S 0xcd 576K 64K" /mnt/sdb/foobar $ xfs_io -c "pwrite -S 0xef 640K 64K" /mnt/sdb/foobar $ xfs_io -c "pwrite -S 0x64 704K 64K" /mnt/sdb/foobar $ xfs_io -c "pwrite -S 0x73 768K 64K" /mnt/sdb/foobar # Clone part of that 320K extent into a lower file offset (192K). # This file offset is greater than the file's size in the parent # snapshot (64K). Also the clone range is a bit behind the offset of # the 320K extent so that we leave a hole between the shared extents. $ xfs_io -c "reflink /mnt/sdb/foobar 448K 192K 192K" /mnt/sdb/foobar $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr $ btrfs send -p /mnt/sdb/base -f /tmp/2.snap /mnt/sdb/incr $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt/sdc $ btrfs receive -f /tmp/1.snap /mnt/sdc $ btrfs receive -f /tmp/2.snap /mnt/sdc ERROR: failed to clone extents to foobar: Invalid argument The problem is that after processing the extent at file offset 256K, which refers to the first 128K of the 320K extent created by the buffered write operations, we have 'cur_inode_next_write_offset' set to 384K, which corresponds to the end offset of the partially shared extent (256K + 128K) and to the current file size in the receiver. Then when we process the extent at offset 512K, we do extent backreference iteration to figure out if we can clone the extent from some other inode or from the same inode, and we consider the extent at offset 256K of the same inode as a valid source for a clone operation, which is not correct because at that point the current file size in the receiver is 384K, which corresponds to the end of last processed extent (at file offset 256K), so using a clone source range from 256K to 256K + 320K is invalid because that goes past the current size of the file (384K) - this makes the receiver get an -EINVAL error when attempting the clone operation. So fix this by excluding clone sources that have a range that goes beyond the current file size in the receiver when iterating extent backreferences. A test case for fstests follows soon. Fixes: 11f2069c113e02 ("Btrfs: send, allow clone operations within the same file") CC: stable@vger.kernel.org # 5.5+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: do not do delalloc reservation under page lockJosef Bacik
We ran into a deadlock in production with the fixup worker. The stack traces were as follows: Thread responsible for the writeout, waiting on the page lock [<0>] io_schedule+0x12/0x40 [<0>] __lock_page+0x109/0x1e0 [<0>] extent_write_cache_pages+0x206/0x360 [<0>] extent_writepages+0x40/0x60 [<0>] do_writepages+0x31/0xb0 [<0>] __writeback_single_inode+0x3d/0x350 [<0>] writeback_sb_inodes+0x19d/0x3c0 [<0>] __writeback_inodes_wb+0x5d/0xb0 [<0>] wb_writeback+0x231/0x2c0 [<0>] wb_workfn+0x308/0x3c0 [<0>] process_one_work+0x1e0/0x390 [<0>] worker_thread+0x2b/0x3c0 [<0>] kthread+0x113/0x130 [<0>] ret_from_fork+0x35/0x40 [<0>] 0xffffffffffffffff Thread of the fixup worker who is holding the page lock [<0>] start_delalloc_inodes+0x241/0x2d0 [<0>] btrfs_start_delalloc_roots+0x179/0x230 [<0>] btrfs_alloc_data_chunk_ondemand+0x11b/0x2e0 [<0>] btrfs_check_data_free_space+0x53/0xa0 [<0>] btrfs_delalloc_reserve_space+0x20/0x70 [<0>] btrfs_writepage_fixup_worker+0x1fc/0x2a0 [<0>] normal_work_helper+0x11c/0x360 [<0>] process_one_work+0x1e0/0x390 [<0>] worker_thread+0x2b/0x3c0 [<0>] kthread+0x113/0x130 [<0>] ret_from_fork+0x35/0x40 [<0>] 0xffffffffffffffff Thankfully the stars have to align just right to hit this. First you have to end up in the fixup worker, which is tricky by itself (my reproducer does DIO reads into a MMAP'ed region, so not a common operation). Then you have to have less than a page size of free data space and 0 unallocated space so you go down the "commit the transaction to free up pinned space" path. This was accomplished by a random balance that was running on the host. Then you get this deadlock. I'm still in the process of trying to force the deadlock to happen on demand, but I've hit other issues. I can still trigger the fixup worker path itself so this patch has been tested in that regard, so the normal case is fine. Fixes: 87826df0ec36 ("btrfs: delalloc for page dirtied out-of-band in fixup worker") Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: drop the -EBUSY case in __extent_writepage_ioJosef Bacik
Now that we only return 0 or -EAGAIN from btrfs_writepage_cow_fixup, we do not need this -EBUSY case. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31Btrfs: keep pages dirty when using btrfs_writepage_fixup_workerChris Mason
For COW, btrfs expects pages dirty pages to have been through a few setup steps. This includes reserving space for the new block allocations and marking the range in the state tree for delayed allocation. A few places outside btrfs will dirty pages directly, especially when unmapping mmap'd pages. In order for these to properly go through COW, we run them through a fixup worker to wait for stable pages, and do the delalloc prep. 87826df0ec36 added a window where the dirty pages were cleaned, but pending more action from the fixup worker. We clear_page_dirty_for_io() before we call into writepage, so the page is no longer dirty. The commit changed it so now we leave the page clean between unlocking it here and the fixup worker starting at some point in the future. During this window, page migration can jump in and relocate the page. Once our fixup work actually starts, it finds page->mapping is NULL and we end up freeing the page without ever writing it. This leads to crc errors and other exciting problems, since it screws up the whole statemachine for waiting for ordered extents. The fix here is to keep the page dirty while we're waiting for the fixup worker to get to work. This is accomplished by returning -EAGAIN from btrfs_writepage_cow_fixup if we queued the page up for fixup, which will cause the writepage function to redirty the page. Because we now expect the page to be dirty once it gets to the fixup worker we must adjust the error cases to call clear_page_dirty_for_io() on the page. That is the bulk of the patch, but it is not the fix, the fix is the -EAGAIN from btrfs_writepage_cow_fixup. We cannot separate these two changes out because the error conditions change with the new expectations. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: take overcommit into account in inc_block_group_roJosef Bacik
inc_block_group_ro does a calculation to see if we have enough room left over if we mark this block group as read only in order to see if it's ok to mark the block group as read only. The problem is this calculation _only_ works for data, where our used is always less than our total. For metadata we will overcommit, so this will almost always fail for metadata. Fix this by exporting btrfs_can_overcommit, and then see if we have enough space to remove the remaining free space in the block group we are trying to mark read only. If we do then we can mark this block group as read only. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: fix force usage in inc_block_group_roJosef Bacik
For some reason we've translated the do_chunk_alloc that goes into btrfs_inc_block_group_ro to force in inc_block_group_ro, but these are two different things. force for inc_block_group_ro is used when we are forcing the block group read only no matter what, for example when the underlying chunk is marked read only. We need to not do the space check here as this block group needs to be read only. btrfs_inc_block_group_ro() has a do_chunk_alloc flag that indicates that we need to pre-allocate a chunk before marking the block group read only. This has nothing to do with forcing, and in fact we _always_ want to do the space check in this case, so unconditionally pass false for force in this case. Then fixup inc_block_group_ro to honor force as it's expected and documented to do. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: Correctly handle empty trees in find_first_clear_extent_bitNikolay Borisov
Raviu reported that running his regular fs_trim segfaulted with the following backtrace: [ 237.525947] assertion failed: prev, in ../fs/btrfs/extent_io.c:1595 [ 237.525984] ------------[ cut here ]------------ [ 237.525985] kernel BUG at ../fs/btrfs/ctree.h:3117! [ 237.525992] invalid opcode: 0000 [#1] SMP PTI [ 237.525998] CPU: 4 PID: 4423 Comm: fstrim Tainted: G U OE 5.4.14-8-vanilla #1 [ 237.526001] Hardware name: ASUSTeK COMPUTER INC. [ 237.526044] RIP: 0010:assfail.constprop.58+0x18/0x1a [btrfs] [ 237.526079] Call Trace: [ 237.526120] find_first_clear_extent_bit+0x13d/0x150 [btrfs] [ 237.526148] btrfs_trim_fs+0x211/0x3f0 [btrfs] [ 237.526184] btrfs_ioctl_fitrim+0x103/0x170 [btrfs] [ 237.526219] btrfs_ioctl+0x129a/0x2ed0 [btrfs] [ 237.526227] ? filemap_map_pages+0x190/0x3d0 [ 237.526232] ? do_filp_open+0xaf/0x110 [ 237.526238] ? _copy_to_user+0x22/0x30 [ 237.526242] ? cp_new_stat+0x150/0x180 [ 237.526247] ? do_vfs_ioctl+0xa4/0x640 [ 237.526278] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [ 237.526283] do_vfs_ioctl+0xa4/0x640 [ 237.526288] ? __do_sys_newfstat+0x3c/0x60 [ 237.526292] ksys_ioctl+0x70/0x80 [ 237.526297] __x64_sys_ioctl+0x16/0x20 [ 237.526303] do_syscall_64+0x5a/0x1c0 [ 237.526310] entry_SYSCALL_64_after_hwframe+0x49/0xbe That was due to btrfs_fs_device::aloc_tree being empty. Initially I thought this wasn't possible and as a percaution have put the assert in find_first_clear_extent_bit. Turns out this is indeed possible and could happen when a file system with SINGLE data/metadata profile has a 2nd device added. Until balance is run or a new chunk is allocated on this device it will be completely empty. In this case find_first_clear_extent_bit should return the full range [0, -1ULL] and let the caller handle this i.e for trim the end will be capped at the size of actual device. Link: https://lore.kernel.org/linux-btrfs/izW2WNyvy1dEDweBICizKnd2KDwDiDyY2EYQr4YCwk7pkuIpthx-JRn65MPBde00ND6V0_Lh8mW0kZwzDiLDv25pUYWxkskWNJnVP0kgdMA=@protonmail.com/ Fixes: 45bfcfc168f8 ("btrfs: Implement find_first_clear_extent_bit") CC: stable@vger.kernel.org # 5.2+ Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: flush write bio if we loop in extent_write_cache_pagesJosef Bacik
There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02e1423 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31Btrfs: fix race between adding and putting tree mod seq elements and nodesFilipe Manana
There is a race between adding and removing elements to the tree mod log list and rbtree that can lead to use-after-free problems. Consider the following example that explains how/why the problems happens: 1) Task A has mod log element with sequence number 200. It currently is the only element in the mod log list; 2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to access the tree mod log. When it enters the function, it initializes 'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock' before checking if there are other elements in the mod seq list. Since the list it empty, 'min_seq' remains set to (u64)-1. Then it unlocks the lock 'tree_mod_seq_lock'; 3) Before task A acquires the lock 'tree_mod_log_lock', task B adds itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a sequence number of 201; 4) Some other task, name it task C, modifies a btree and because there elements in the mod seq list, it adds a tree mod elem to the tree mod log rbtree. That node added to the mod log rbtree is assigned a sequence number of 202; 5) Task B, which is doing fiemap and resolving indirect back references, calls btrfs get_old_root(), with 'time_seq' == 201, which in turn calls tree_mod_log_search() - the search returns the mod log node from the rbtree with sequence number 202, created by task C; 6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating the mod log rbtree and finds the node with sequence number 202. Since 202 is less than the previously computed 'min_seq', (u64)-1, it removes the node and frees it; 7) Task B still has a pointer to the node with sequence number 202, and it dereferences the pointer itself and through the call to __tree_mod_log_rewind(), resulting in a use-after-free problem. This issue can be triggered sporadically with the test case generic/561 from fstests, and it happens more frequently with a higher number of duperemove processes. When it happens to me, it either freezes the VM or it produces a trace like the following before crashing: [ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1 [ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 1245.321287] RIP: 0010:rb_next+0x16/0x50 [ 1245.321307] Code: .... [ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202 [ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b [ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80 [ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000 [ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038 [ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8 [ 1245.321539] FS: 00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000 [ 1245.321591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0 [ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1245.321706] Call Trace: [ 1245.321798] __tree_mod_log_rewind+0xbf/0x280 [btrfs] [ 1245.321841] btrfs_search_old_slot+0x105/0xd00 [btrfs] [ 1245.321877] resolve_indirect_refs+0x1eb/0xc60 [btrfs] [ 1245.321912] find_parent_nodes+0x3dc/0x11b0 [btrfs] [ 1245.321947] btrfs_check_shared+0x115/0x1c0 [btrfs] [ 1245.321980] ? extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322029] extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322066] do_vfs_ioctl+0x45a/0x750 [ 1245.322081] ksys_ioctl+0x70/0x80 [ 1245.322092] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1245.322113] __x64_sys_ioctl+0x16/0x20 [ 1245.322126] do_syscall_64+0x5c/0x280 [ 1245.322139] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1245.322155] RIP: 0033:0x7fdee3942dd7 [ 1245.322177] Code: .... [ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7 [ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004 [ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44 [ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48 [ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50 [ 1245.322423] Modules linked in: .... [ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]--- Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum sequence number and iterates the rbtree while holding the lock 'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock' lock, since it is now redundant. Fixes: bd989ba359f2ac ("Btrfs: add tree modification log functions") Fixes: 097b8a7c9e48e2 ("Btrfs: join tree mod log code with the code holding back delayed refs") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31powerpc: configs: Cleanup old Kconfig optionsKrzysztof Kozlowski
CONFIG_ENABLE_WARN_DEPRECATED is gone since commit 771c035372a0 ("deprecate the '__deprecated' attribute warnings entirely and for good"). CONFIG_IOSCHED_DEADLINE and CONFIG_IOSCHED_CFQ are gone since commit f382fb0bcef4 ("block: remove legacy IO schedulers"). The IOSCHED_DEADLINE was replaced by MQ_IOSCHED_DEADLINE and it will be now enabled by default (along with MQ_IOSCHED_KYBER). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130195223.3843-1-krzk@kernel.org
2020-01-31powerpc/configs/skiroot: Enable some more hardening optionsMichael Ellerman
Enable more hardening options. Note BUG_ON_DATA_CORRUPTION selects DEBUG_LIST and is essentially just a synonym for it. DEBUG_SG, DEBUG_NOTIFIERS, DEBUG_LIST, DEBUG_CREDENTIALS and SCHED_STACK_END_CHECK should all be low overhead and just add a few extra checks. SLAB_FREELIST_RANDOM, and SLUB_DEBUG_ON will add some overhead to the SLAB allocator, but nothing that should be meaningful for skiroot. Unselecting SLAB_MERGE_DEFAULT causes the SLAB to use more memory, but the skiroot kernel shouldn't be memory constrained on any of our systems, all it does is run a small bootloader. Disabling merging has some security/robustness benefit as it means a user-after-free or overflow will be limited to the objects in that slab, rather than potentially affecting objects from unrelated slabs that have been merged. Note also that slab merging is disabled anyway by enabling SLUB_DEBUG_ON, because of the SLAB_NEVER_MERGE mask. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-9-mpe@ellerman.id.au
2020-01-31powerpc/configs/skiroot: Disable xmon default & enable reboot on panicMichael Ellerman
If the skiroot kernel crashes we don't want it sitting at an xmon prompt forever. Instead it's more helpful to reboot and bring the boot loader back up, and if the crash was transient we can then boot successfully. Similarly if we panic we should reboot, with a short timeout in case someone is watching the console. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-8-mpe@ellerman.id.au
2020-01-31powerpc/configs/skiroot: Enable security featuresJoel Stanley
This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and FORTIFY_SOURCE. It also enables SECURITY_LOCKDOWN_LSM with _EARLY and LOCK_DOWN_KERNEL_FORCE_INTEGRITY options enabled. This still allows xmon to be used in read-only mode. MODULE_SIG is selected by lockdown, so it is still enabled. Because we're setting LOCK_DOWN_KERNELFORCE_INTEGRITY=y we also need to enable KEXEC_FILE=y so that kexec continues to work. Signed-off-by: Joel Stanley <joel@jms.id.au> [mpe: Switch to lockdown integrity mode per oohal, enable KEXEC_FILE as reported by jms] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-7-mpe@ellerman.id.au
2020-01-31powerpc/configs/skiroot: Update for symbol movement onlyMichael Ellerman
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-6-mpe@ellerman.id.au
2020-01-31powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIVMichael Ellerman
It's default n so we don't need to disable it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-5-mpe@ellerman.id.au
2020-01-31powerpc/configs/skiroot: Drop HID_LOGITECHMichael Ellerman
Commit bdd08fff4915 ("HID: logitech: Add depends on LEDS_CLASS to Logitech Kconfig entry") made HID_LOGITECH depend on LEDS_CLASS which we do not enable, meaning we are not actually enabling those drivers any more. The Kconfig help text suggests USB HID compliant Logictech devices will continue to work without HID_LOGITECH, so just drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-4-mpe@ellerman.id.au
2020-01-31powerpc/configs: Drop NET_VENDOR_HP which moved to stagingMichael Ellerman
The HP network driver moved to staging in commit 52340b82cf1a ("hp100: Move 100BaseVG AnyLAN driver to staging") meaning we don't need to disable it any more in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-3-mpe@ellerman.id.au
2020-01-31powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCEMichael Ellerman
The NET_CADENCE symbol was renamed to NET_VENDOR_CADENCE, so we don't need to disable the former, see commit 0df5f81c481e ("net: ethernet: Add missing VENDOR to Cadence and Packet Engines symbols"). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-2-mpe@ellerman.id.au
2020-01-31powerpc/configs: Drop CONFIG_QLGE which moved to stagingMichael Ellerman
The QLGE driver moved to staging in commit 955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/"), meaning our defconfigs that enable it have no effect as we don't enable CONFIG_STAGING. It sounds like the device is obsolete, so drop the driver. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200121043000.16212-1-mpe@ellerman.id.au
2020-01-31thermal: stm32: fix spelling mistake "preprare" -> "prepare"Colin Ian King
There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200130100537.18069-1-colin.king@canonical.com
2020-01-31Documentation: cpu-idle-cooling: fix a SEVERE docs build failureRandy Dunlap
Sphinx ('make htmldocs') stops with a SEVERE error: Sphinx parallel build error: SystemMessage: /home/rdunlap/lnx/next/linux-next-20200120/Documentation/driver-api/thermal/cpu-idle-cooling.rst:69: (SEVERE/4) Unexpected section title. ^ | so fix the .rst file so that the SEVERE build error does not happen. Also fix another minor formatting warning (unexpected unindent). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: linux-pm@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/712c1152-56b5-307f-b3f3-ed03a30b804a@infradead.org
2020-01-31Merge branches 'pm-cpufreq' and 'pm-core'Rafael J. Wysocki
* pm-cpufreq: cpufreq: Avoid creating excessively large stack frames * pm-core: PM: core: Fix handling of devices deleted during system-wide resume
2020-01-31powerpc: Do not consider weak unresolved symbol relocations as badAlexandre Ghiti
Commit 8580ac9404f6 ("bpf: Process in-kernel BTF") introduced two weak symbols that may be unresolved at link time which result in an absolute relocation to 0. relocs_check.sh emits the following warning: "WARNING: 2 bad relocations c000000001a41478 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_start c000000001a41480 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_end" whereas those relocations are legitimate even for a relocatable kernel compiled with -pie option. relocs_check.sh already excluded some weak unresolved symbols explicitly: remove those hardcoded symbols and add some logic that parses the symbols using nm, retrieves all the weak unresolved symbols and excludes those from the list of the potential bad relocations. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200118170335.21440-1-alex@ghiti.fr
2020-01-31Merge branch 'ttm-prot-fix' of git://people.freedesktop.org/~thomash/linux ↵Dave Airlie
into drm-next A small fix for the long-standing ttm vm page protection hack. Sent as a separate PR as it touches mm, has all acks in place. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellström (VMware) <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116102411.3056-1-thomas_os@shipmail.org
2020-01-30clk: qoriq: add ls1088a hwaccel clocks supportYangbo Lu
This patch is to add hwaccel clocks information for ls1088a. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Link: https://lkml.kernel.org/r/20191216100111.17122-1-yangbo.lu@nxp.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-01-30clk: ls1028a: Add clock driver for Display output interfaceWen He
Add clock driver for QorIQ LS1028A Display output interfaces(LCD, DPHY), as implemented in TSMC CLN28HPM PLL, this PLL supports the programmable integer division and range of the display output pixel clock's 27-594MHz. Signed-off-by: Wen He <wen.he_1@nxp.com> Signed-off-by: Michael Walle <michael@walle.cc> Link: https://lkml.kernel.org/r/20191213083402.35678-2-wen.he_1@nxp.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-01-30dt/bindings: clk: Add YAML schemas for LS1028A Display Clock bindingsWen He
LS1028A has a clock domain PXLCLK0 used for provide pixel clocks to Display output interface. Add a YAML schema for this. Signed-off-by: Wen He <wen.he_1@nxp.com> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lkml.kernel.org/r/20191213083402.35678-1-wen.he_1@nxp.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-01-30Merge tag 'mpx-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-mpx Pull x86 MPX removal from Dave Hansen: "MPX requires recompiling applications, which requires compiler support. Unfortunately, GCC 9.1 is expected to be be released without support for MPX. This means that there was only a relatively small window where folks could have ever used MPX. It failed to gain wide adoption in the industry, and Linux was the only mainstream OS to ever support it widely. Support for the feature may also disappear on future processors. This set completes the process that we started during the 5.4 merge window when the MPX prctl()s were removed. XSAVE support is left in place, which allows MPX-using KVM guests to continue to function" * tag 'mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-mpx: x86/mpx: remove MPX from arch/x86 mm: remove arch_bprm_mm_init() hook x86/mpx: remove bounds exception code x86/mpx: remove build infrastructure x86/alternatives: add missing insn.h include
2020-01-30Merge tag 'mtd/for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "MTD core - block2mtd: page index should use pgoff_t - maps: physmap: minimal Runtime PM support - maps: pcmciamtd: avoid possible sleep-in-atomic-context bugs - concat: Fix a comment referring to an unknown symbol Raw NAND: - Macronix: Use match_string() helper - Atmel: switch to using devm_fwnode_gpiod_get() - Denali: rework the SKIP_BYTES feature and add reset controlling - Brcmnand: set appropriate DMA mask - Cadence: add unspecified HAS_IOMEM dependency - Various cleanup. Onenand: - Rename Samsung and Omap2 drivers to avoid possible build warnings - Enable compile testing - Various build issues - Kconfig cleanup SPI-NAND: - Support for Toshiba TC58CVG2S0HRAIJ SPI-NOR: - Add support for TB selection using SR bit 6, - Add support for few flashes" * tag 'mtd/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (41 commits) mtd: concat: Fix a comment referring to an unknown symbol mtd: rawnand: add unspecified HAS_IOMEM dependency mtd: block2mtd: page index should use pgoff_t mtd: maps: physmap: Add minimal Runtime PM support mtd: maps: pcmciamtd: fix possible sleep-in-atomic-context bugs in pcmciamtd_set_vpp() mtd: onenand: Rename omap2 driver to avoid a build warning mtd: onenand: Use a better name for samsung driver mtd: rawnand: atmel: switch to using devm_fwnode_gpiod_get() mtd: spinand: add support for Toshiba TC58CVG2S0HRAIJ mtd: rawnand: macronix: Use match_string() helper to simplify the code mtd: sharpslpart: Fix unsigned comparison to zero mtd: onenand: Enable compile testing of OMAP and Samsung drivers mtd: onenand: samsung: Fix printing format for size_t on 64-bit mtd: onenand: samsung: Fix pointer cast -Wpointer-to-int-cast warnings on 64 bit mtd: rawnand: denali: remove hard-coded DENALI_DEFAULT_OOB_SKIP_BYTES mtd: rawnand: denali_dt: add reset controlling dt-bindings: mtd: denali_dt: document reset property mtd: rawnand: denali_dt: Add support for configuring SPARE_AREA_SKIP_BYTES mtd: rawnand: denali_dt: error out if platform has no associated data mtd: rawnand: brcmnand: Set appropriate DMA mask ...
2020-01-30Merge tag 'upstream-5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI/UBIFS updates from Miquel Raynal: "This pull request contains mostly fixes for UBI and UBIFS: UBI: - Fixes for memory leaks in error paths - Fix for an logic error in a fastmap selfcheck UBIFS: - Fix for FS_IOC_SETFLAGS related to fscrypt flag - Support for FS_ENCRYPT_FL - Fix for a dead lock in bulk-read mode" Sent on behalf of Richard Weinberger who is traveling. * tag 'upstream-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: Fix an error pointer dereference in error handling code ubifs: Fix memory leak from c->sup_node ubifs: Fix ino_t format warnings in orphan_delete() ubifs: Fix deadlock in concurrent bulk-read and writepage ubifs: Fix wrong memory allocation ubi: Free the normal volumes in error paths of ubi_attach_mtd_dev() ubi: Check the presence of volume before call ubi_fastmap_destroy_checkmap() ubifs: Add support for FS_ENCRYPT_FL ubifs: Fix FS_IOC_SETFLAGS unexpectedly clearing encrypt flag ubi: wl: Remove set but not used variable 'prev_e' ubi: fastmap: Fix inverted logic in seen selfcheck
2020-01-30Merge tag 'f2fs-for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this series, we've implemented transparent compression experimentally. It supports LZO and LZ4, but will add more later as we investigate in the field more. At this point, the feature doesn't expose compressed space to user directly in order to guarantee potential data updates later to the space. Instead, the main goal is to reduce data writes to flash disk as much as possible, resulting in extending disk life time as well as relaxing IO congestion. Alternatively, we're also considering to add ioctl() to reclaim compressed space and show it to user after putting the immutable bit. Enhancements: - add compression support - avoid unnecessary locks in quota ops - harden power-cut scenario for zoned block devices - use private bio_set to avoid IO congestion - replace GC mutex with rwsem to serialize callers Bug fixes: - fix dentry consistency and memory corruption in rename()'s error case - fix wrong swap extent reports - fix casefolding bugs - change lock coverage to avoid deadlock - avoid GFP_KERNEL under f2fs_lock_op And, we've cleaned up sysfs entries to prepare no debugfs" * tag 'f2fs-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (31 commits) f2fs: fix race conditions in ->d_compare() and ->d_hash() f2fs: fix dcache lookup of !casefolded directories f2fs: Add f2fs stats to sysfs f2fs: delete duplicate information on sysfs nodes f2fs: change to use rwsem for gc_mutex f2fs: update f2fs document regarding to fsync_mode f2fs: add a way to turn off ipu bio cache f2fs: code cleanup for f2fs_statfs_project() f2fs: fix miscounted block limit in f2fs_statfs_project() f2fs: show the CP_PAUSE reason in checkpoint traces f2fs: fix deadlock allocating bio_post_read_ctx from mempool f2fs: remove unneeded check for error allocating bio_post_read_ctx f2fs: convert inline_dir early before starting rename f2fs: fix memleak of kobject f2fs: fix to add swap extent correctly f2fs: run fsck when getting bad inode during GC f2fs: support data compression f2fs: free sysfs kobject f2fs: declare nested quota_sem and remove unnecessary sems f2fs: don't put new_page twice in f2fs_rename ...
2020-01-30Merge tag 'for_v5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF, quota, reiserfs, ext2 fixes and cleanups from Jan Kara: "A few assorted fixes and cleanups for udf, quota, reiserfs, and ext2" * tag 'for_v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs: remove unused macros fs/quota: remove unused macro udf: Clarify meaning of f_files in udf_statfs udf: Allow writing to 'Rewritable' partitions udf: Disallow R/W mode for disk with Metadata partition udf: Fix meaning of ENTITYID_FLAGS_* macros to be really bitwise-or flags udf: Fix free space reporting for metadata and virtual partitions udf: Update header files to UDF 2.60 udf: Move OSTA Identifier Suffix macros from ecma_167.h to osta_udf.h udf: Fix spelling in EXT_NEXT_EXTENT_ALLOCDESCS ext2: Adjust indentation in ext2_fill_super quota: avoid time_t in v1_disk_dqblk definition reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling reiserfs: Fix memory leak of journal device string ext2: set proper errno in error case of ext2_fill_super()
2020-01-30Merge tag 'xfs-5.6-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "In this release we clean out the last of the old 32-bit timestamp code, fix a number of bugs and memory corruptions on 32-bit platforms, and a refactoring of some of the extended attribute code. I think I'll be back next week with some refactoring of how the XFS buffer code returns error codes, however I prefer to hold onto that for another week to let it soak a while longer Summary: - Get rid of compat_time_t - Convert time_t to time64_t in quota code - Remove shadow variables - Prevent ATTR_ flag misuse in the attrmulti ioctls - Clean out strlen in the attr code - Remove some bogus asserts - Fix various file size limit calculation errors with 32-bit kernels - Pack xfs_dir2_sf_entry_t to fix build errors on arm oabi - Fix nowait inode locking calls for directio aio reads - Fix memory corruption bugs when invalidating remote xattr value buffers - Streamline remote attr value removal - Make the buffer log format size consistent across platforms - Strengthen buffer log format size checking - Fix messed up return types of xfs_inode_need_cow - Fix some unused variable warnings" * tag 'xfs-5.6-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (24 commits) xfs: remove unused variable 'done' xfs: fix uninitialized variable in xfs_attr3_leaf_inactive xfs: change return value of xfs_inode_need_cow to int xfs: check log iovec size to make sure it's plausibly a buffer log format xfs: make struct xfs_buf_log_format have a consistent size xfs: complain if anyone tries to create a too-large buffer log item xfs: clean up xfs_buf_item_get_format return value xfs: streamline xfs_attr3_leaf_inactive xfs: fix memory corruption during remote attr value buffer invalidation xfs: refactor remote attr value buffer invalidation xfs: fix IOCB_NOWAIT handling in xfs_file_dio_aio_read xfs: Add __packed to xfs_dir2_sf_entry_t definition xfs: fix s_maxbytes computation on 32-bit kernels xfs: truncate should remove all blocks, not just to the end of the page cache xfs: introduce XFS_MAX_FILEOFF xfs: remove bogus assertion when online repair isn't enabled xfs: Remove all strlen in all xfs_attr_* functions for attr names. xfs: fix misuse of the XFS_ATTR_INCOMPLETE flag xfs: also remove cached ACLs when removing the underlying attr xfs: reject invalid flags combinations in XFS_IOC_ATTRMULTI_BY_HANDLE ...
2020-01-30Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge window, we've added some performance improvements in how we handle inode locking in the read/write paths, and improving the performance of Direct I/O overwrites. We also now record the error code which caused the first and most recent ext4_error() report in the superblock, to make it easier to root cause problems in production systems. There are also many of the usual cleanups and miscellaneous bug fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (49 commits) jbd2: clean __jbd2_journal_abort_hard() and __journal_abort_soft() jbd2: make sure ESHUTDOWN to be recorded in the journal superblock ext4, jbd2: ensure panic when aborting with zero errno jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record jbd2_seq_info_next should increase position index jbd2: remove pointless assertion in __journal_remove_journal_head ext4,jbd2: fix comment and code style jbd2: delete the duplicated words in the comments ext4: fix extent_status trace points ext4: fix symbolic enum printing in trace output ext4: choose hardlimit when softlimit is larger than hardlimit in ext4_statfs_project() ext4: fix race conditions in ->d_compare() and ->d_hash() ext4: make dioread_nolock the default ext4: fix extent_status fragmentation for plain files jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal ext4: drop ext4_kvmalloc() ext4: Add EXT4_IOC_FSGETXATTR/EXT4_IOC_FSSETXATTR to compat_ioctl ext4: remove unused macro MPAGE_DA_EXTENT_TAIL ext4: add missing braces in ext4_ext_drop_refs() ext4: fix some nonstandard indentation in extents.c ...
2020-01-30rxrpc: Fix missing active use pinning of rxrpc_local objectDavid Howells
The introduction of a split between the reference count on rxrpc_local objects and the usage count didn't quite go far enough. A number of kernel work items need to make use of the socket to perform transmission. These also need to get an active count on the local object to prevent the socket from being closed. Fix this by getting the active count in those places. Also split out the raw active count get/put functions as these places tend to hold refs on the rxrpc_local object already, so getting and putting an extra object ref is just a waste of time. The problem can lead to symptoms like: BUG: kernel NULL pointer dereference, address: 0000000000000018 .. CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51 ... RIP: 0010:selinux_socket_sendmsg+0x5/0x13 ... Call Trace: security_socket_sendmsg+0x2c/0x3e sock_sendmsg+0x1a/0x46 rxrpc_send_keepalive+0x131/0x1ae rxrpc_peer_keepalive_worker+0x219/0x34b process_one_work+0x18e/0x271 worker_thread+0x1a3/0x247 kthread+0xe6/0xeb ret_from_fork+0x1f/0x30 Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting") Signed-off-by: David Howells <dhowells@redhat.com>
2020-01-30rxrpc: Fix insufficient receive notification generationDavid Howells
In rxrpc_input_data(), rxrpc_notify_socket() is called if the base sequence number of the packet is immediately following the hard-ack point at the end of the function. However, this isn't sufficient, since the recvmsg side may have been advancing the window and then overrun the position in which we're adding - at which point rx_hard_ack >= seq0 and no notification is generated. Fix this by always generating a notification at the end of the input function. Without this, a long call may stall, possibly indefinitely. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>