summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-27btrfs: remove hole check in prealloc_file_extent_clusterNikolay Borisov
Extents in the extent cluster are guaranteed to be contiguous as such the hole check inside the loop can never trigger. In fact this check was never functional since it was added in 18513091af94 ("btrfs: update btrfs_space_info's bytes_may_use timely") which came after the commit introducing clustered/contiguous extents 0257bb82d21b ("Btrfs: relocate file extents in clusters"). Let's just remove it as it adds noise to the source. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make __btrfs_drop_extents take btrfs_inodeNikolay Borisov
It has only 4 uses of a vfs_inode for inode_sub_bytes but unifies the interface with the non __ prefixed version. Will also makes converting its callers to btrfs_inode easier. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make btrfs_csum_one_bio takae btrfs_inodeNikolay Borisov
Will enable converting btrfs_submit_compressed_write to btrfs_inode more easily. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make extent_clear_unlock_delalloc take btrfs_inodeNikolay Borisov
It has one VFS and 1 btrfs inode usages but converting it to btrfs_inode interface will allow seamless conversion of its callers. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make create_io_em take btrfs_inodeNikolay Borisov
It really wants a btrfs_inode and will allow submit_compressed_extents to be completely converted to btrfs_inode in follow up patches. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make btrfs_reloc_clone_csums take btrfs_inodeNikolay Borisov
It really wants btrfs_inode and not a vfs inode. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make btrfs_lookup_ordered_extent take btrfs_inodeNikolay Borisov
It doesn't use the generic vfs inode for anything use btrfs_inode directly. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make get_extent_allocation_hint take btrfs_inodeNikolay Borisov
It doesn't use the vfs inode for anything, can just as easily take btrfs_inode. Follow up patches will convert callers as well. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: make __btrfs_add_ordered_extent take struct btrfs_inodeNikolay Borisov
This is internal btrfs function what really needs the vfs_inode only for igrab and a tracepoint. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: remove no longer used trans_list member of struct btrfs_ordered_extentFilipe Manana
The 'trans_list' member of an ordered extent was used to keep track of the ordered extents for which a transaction commit had to wait. These were ordered extents that were started and logged by an fsync. However we don't do that anymore and before we stopped doing it we changed the approach to wait for the ordered extents in commit 161c3549b45aee ("Btrfs: change how we wait for pending ordered extents"), which stopped using that list and therefore the 'trans_list' member is not used anymore since that commit. So just remove it since it's doing nothing and making each ordered extent structure waste memory (2 pointers). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: remove no longer used log_list member of struct btrfs_ordered_extentFilipe Manana
The 'log_list' member of an ordered extent was used keep track of which ordered extents we needed to wait after logging metadata, but is not used anymore since commit 5636cf7d6dc86f ("btrfs: remove the logged extents infrastructure"), as we now always wait on ordered extent completion before logging metadata. So just remove it since it's doing nothing and making each ordered extent structure waste more memory (2 pointers). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: add little-endian optimized key helpersDavid Sterba
The CPU and on-disk keys are mapped to two different structures because of the endianness. There's an intermediate buffer used to do the conversion, but this is not necessary when CPU and on-disk endianness match. Add optimized versions of helpers that take disk_key and use the buffer directly for CPU keys or drop the intermediate buffer and conversion. This saves a lot of stack space accross many functions and removes about 6K of generated binary code: text data bss dec hex filename 1090439 17468 14912 1122819 112203 pre/btrfs.ko 1084613 17456 14912 1116981 110b35 post/btrfs.ko Delta: -5826 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: qgroup: catch reserved space leaks at unmount timeQu Wenruo
Before this patch, qgroup completely relies on per-inode extent io tree to detect reserved data space leak. However previous bug has already shown how release page before btrfs_finish_ordered_io() could lead to leak, and since it's QGROUP_RESERVED bit cleared without triggering qgroup rsv, it can't be detected by per-inode extent io tree. So this patch adds another (and hopefully the final) safety net to catch qgroup data reserved space leak. At least the new safety net catches all the leaks during development, so it should be pretty useful in the real world. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: change timing for qgroup reserved space for ordered extents to fix ↵Qu Wenruo
reserved space leak [BUG] The following simple workload from fsstress can lead to qgroup reserved data space leak: 0/0: creat f0 x:0 0 0 0/0: creat add id=0,parent=-1 0/1: write f0[259 1 0 0 0 0] [600030,27288] 0 0/4: dwrite - xfsctl(XFS_IOC_DIOINFO) f0[259 1 0 0 64 627318] return 25, fallback to stat() 0/4: dwrite f0[259 1 0 0 64 627318] [610304,106496] 0 This would cause btrfs qgroup to leak 20480 bytes for data reserved space. If btrfs qgroup limit is enabled, such leak can lead to unexpected early EDQUOT and unusable space. [CAUSE] When doing direct IO, kernel will try to writeback existing buffered page cache, then invalidate them: generic_file_direct_write() |- filemap_write_and_wait_range(); |- invalidate_inode_pages2_range(); However for btrfs, the bi_end_io hook doesn't finish all its heavy work right after bio ends. In fact, it delays its work further: submit_extent_page(end_io_func=end_bio_extent_writepage); end_bio_extent_writepage() |- btrfs_writepage_endio_finish_ordered() |- btrfs_init_work(finish_ordered_fn); <<< Work queue execution >>> finish_ordered_fn() |- btrfs_finish_ordered_io(); |- Clear qgroup bits This means, when filemap_write_and_wait_range() returns, btrfs_finish_ordered_io() is not guaranteed to be executed, thus the qgroup bits for related range are not cleared. Now into how the leak happens, this will only focus on the overlapping part of buffered and direct IO part. 1. After buffered write The inode had the following range with QGROUP_RESERVED bit: 596 616K |///////////////| Qgroup reserved data space: 20K 2. Writeback part for range [596K, 616K) Write back finished, but btrfs_finish_ordered_io() not get called yet. So we still have: 596K 616K |///////////////| Qgroup reserved data space: 20K 3. Pages for range [596K, 616K) get released This will clear all qgroup bits, but don't update the reserved data space. So we have: 596K 616K | | Qgroup reserved data space: 20K That number doesn't match the qgroup bit range anymore. 4. Dio prepare space for range [596K, 700K) Qgroup reserved data space for that range, we got: 596K 616K 700K |///////////////|///////////////////////| Qgroup reserved data space: 20K + 104K = 124K 5. btrfs_finish_ordered_range() gets executed for range [596K, 616K) Qgroup free reserved space for that range, we got: 596K 616K 700K | |///////////////////////| We need to free that range of reserved space. Qgroup reserved data space: 124K - 20K = 104K 6. btrfs_finish_ordered_range() gets executed for range [596K, 700K) However qgroup bit for range [596K, 616K) is already cleared in previous step, so we only free 84K for qgroup reserved space. 596K 616K 700K | | | We need to free that range of reserved space. Qgroup reserved data space: 104K - 84K = 20K Now there is no way to release that 20K unless disabling qgroup or unmounting the fs. [FIX] This patch will change the timing of btrfs_qgroup_release/free_data() call. Here it uses buffered COW write as an example. The new timing | The old timing ----------------------------------------+--------------------------------------- btrfs_buffered_write() | btrfs_buffered_write() |- btrfs_qgroup_reserve_data() | |- btrfs_qgroup_reserve_data() | btrfs_run_delalloc_range() | btrfs_run_delalloc_range() |- btrfs_add_ordered_extent() | |- btrfs_qgroup_release_data() | The reserved is passed into | btrfs_ordered_extent structure | | btrfs_finish_ordered_io() | btrfs_finish_ordered_io() |- The reserved space is passed to | |- btrfs_qgroup_release_data() btrfs_qgroup_record | The resereved space is passed | to btrfs_qgroup_recrod | btrfs_qgroup_account_extents() | btrfs_qgroup_account_extents() |- btrfs_qgroup_free_refroot() | |- btrfs_qgroup_free_refroot() The point of such change is to ensure, when ordered extents are submitted, the qgroup reserved space is already released, to keep the timing aligned with file_write_and_wait_range(). So that qgroup data reserved space is all bound to btrfs_ordered_extent and solve the timing mismatch. Fixes: f695fdcef83a ("btrfs: qgroup: Introduce functions to release/free qgroup reserve data space") Suggested-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: file: reserve qgroup space after the hole punch range is lockedQu Wenruo
The incoming qgroup reserved space timing will move the data reservation to ordered extent completely. However in btrfs_punch_hole_lock_range() will call btrfs_invalidate_page(), which will clear QGROUP_RESERVED bit for the range. In current stage it's OK, but if we're making ordered extents handle the reserved space, then btrfs_punch_hole_lock_range() can clear the QGROUP_RESERVED bit before we submit ordered extent, leading to qgroup reserved space leakage. So here change the timing to make reserve data space after btrfs_punch_hole_lock_range(). The new timing is fine for either current code or the new code. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: inode: move qgroup reserved space release to the callers of ↵Qu Wenruo
insert_reserved_file_extent() This is to prepare for the incoming timing change of qgroup reserved data space and ordered extent. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: inode: refactor the parameters of insert_reserved_file_extent()Qu Wenruo
Function insert_reserved_file_extent() takes a long list of parameters, which are all for btrfs_file_extent_item, even including two reserved members, encryption and other_encoding. This makes the parameter list unnecessary long for a function which only gets called twice. This patch will refactor the parameter list, by using btrfs_file_extent_item as parameter directly to hugely reduce the number of parameters. Also, since there are only two callers, one in btrfs_finish_ordered_io() which inserts file extent for ordered extent, and one __btrfs_prealloc_file_range(). These two call sites have completely different context, where ordered extent can be compressed, but will always be regular extent, while the preallocated one is never going to be compressed and always has PREALLOC type. So use two small wrapper for these two different call sites to improve readability. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: clean up temporary page variables in scrub_checksum_tree_blockDavid Sterba
Add proper variable for the scrub page and use it instead of repeatedly dereferencing the other structures. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: simplify tree block checksum calculationDavid Sterba
Use a simpler iteration over tree block pages, same what csum_tree_block does: first page always exists, loop over the rest. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: clean up temporary page variables in scrub_checksum_dataDavid Sterba
Add proper variable for the scrub page and use it instead of repeatedly dereferencing the other structures. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: simplify data block checksum calculationDavid Sterba
We have sectorsize same as PAGE_SIZE, the checksum can be calculated in one go. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: clean up temporary page variables in scrub_checksum_superDavid Sterba
Add proper variable for the scrub page and use it instead of repeatedly dereferencing the other structures. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: remove temporary csum array in scrub_checksum_superDavid Sterba
The page contents with the checksum is available during the entire function so we don't need to make a copy. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: simplify superblock checksum calculationDavid Sterba
BTRFS_SUPER_INFO_SIZE is 4096, and fits to a page on all supported architectures, so we can calculate the checksum in one go. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: unify naming of page address variablesDavid Sterba
As the page mapping has been removed, rename the variables to 'kaddr' that we use everywhere else. The type is changed to 'char *' so pointer arithmetic works without casts. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: scrub: remove kmap/kunmap of pagesDavid Sterba
All pages that scrub uses in the scrub_block::pagev array are allocated with GFP_KERNEL and never part of any mapping, so kmap is not necessary, we only need to know the page address. In scrub_write_page_to_dev_replace we don't even need to call flush_dcache_page because of the same reason as above. Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: introduce "rescue=" mount optionQu Wenruo
This patch introduces a new "rescue=" mount option group for all mount options for data recovery. Different rescue sub options are seperated by ':'. E.g "ro,rescue=nologreplay:usebackuproot". The original plan was to use ';', but ';' needs to be escaped/quoted, or it will be interpreted by bash, similar to '|'. And obviously, user can specify rescue options one by one like: "ro,rescue=nologreplay,rescue=usebackuproot". The following mount options are converted to "rescue=", old mount options are deprecated but still available for compatibility purpose: - usebackuproot Now it's "rescue=usebackuproot" - nologreplay Now it's "rescue=nologreplay" Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: use btrfs_alloc_data_chunk_ondemand() when allocating space for ↵Filipe Manana
relocation We currently use btrfs_check_data_free_space() when allocating space for relocating data extents, but that is not necessary because that function combines btrfs_alloc_data_chunk_ondemand(), which does the actual space reservation, and btrfs_qgroup_reserve_data(). We can use btrfs_alloc_data_chunk_ondemand() directly because we know we do not need to reserve qgroup space since we are dealing with a relocation tree, which can never have qgroups (btrfs_qgroup_reserve_data() does nothing as is_fstree() returns false for a relocation tree). Conversely we can use btrfs_free_reserved_data_space_noquota() directly instead of btrfs_free_reserved_data_space(), since we had no qgroup reservation when allocating space. This change is preparatory work for another patch in this series that makes relocation reserve the exact amount of space it needs to relocate a data block group. The function btrfs_check_data_free_space() has the incovenient of requiring a start offset argument and we will want to be able to allocate space for multiple ranges, which are not consecutive, at once. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: remove the start argument from btrfs_free_reserved_data_space_noquota()Filipe Manana
The start argument for btrfs_free_reserved_data_space_noquota() is only used to make sure the amount of bytes we decrement from the bytes_may_use counter of the data space_info object is aligned to the filesystem's sector size. It serves no other purpose. All its current callers always pass a length argument that is already aligned to the sector size, so we can make the start argument go away. In fact its presence makes it impossible to use it in a context where we just want to free a number of bytes for a range for which either we do not know its start offset or for freeing multiple ranges at once (which are not contiguous). This change is preparatory work for a patch (third patch in this series) that makes relocation of data block groups that are not full reserve less data space. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: check-integrity: remove unnecessary failure messages during memory ↵Liao Pingfang
allocation As there is a dump_stack() done on memory allocation failures, these messages might as well be deleted instead. Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Reviewed-by: David Sterba <dsterba@suse.com> [ minor tweaks ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: use helper btrfs_get_block_groupAnand Jain
Use the helper function where it is open coded to increment the block_group reference count As btrfs_get_block_group() is a one-liner we could have open-coded it, but its partner function btrfs_put_block_group() isn't one-liner which does the free part in it. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: let btrfs_return_cluster_to_free_space() return voidAnand Jain
__btrfs_return_cluster_to_free_space() returns only 0. And all its parent functions don't need the return value either so make this a void function. Further, as none of the callers of btrfs_return_cluster_to_free_space() is actually using the return from this function, make this function also return void. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: remove no longer necessary chunk mutex locking casesFilipe Manana
Initially when the 'removed' flag was added to a block group to avoid races between block group removal and fitrim, by commit 04216820fe83d5 ("Btrfs: fix race between fs trimming and block group remove/allocation"), we had to lock the chunks mutex because we could be moving the block group from its current list, the pending chunks list, into the pinned chunks list, or we could just be adding it to the pinned chunks if it was not in the pending chunks list. Both lists were protected by the chunk mutex. However we no longer have those lists since commit 1c11b63eff2a67 ("btrfs: replace pending/pinned chunks lists with io tree"), and locking the chunk mutex is no longer necessary because of that. The same happens at btrfs_unfreeze_block_group(), we lock the chunk mutex because the block group's extent map could be part of the pinned chunks list and the call to remove_extent_mapping() could be deleting it from that list, which used to be protected by that mutex. So just remove those lock and unlock calls as they are not needed anymore. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: factor out reading of bg from find_frist_block_groupJohannes Thumshirn
When find_first_block_group() finds a block group item in the extent-tree, it does a lookup of the object in the extent mapping tree and does further checks on the item. Factor out this step from find_first_block_group() so we can further simplify the code. While we're at it, we can also just return early in find_first_block_group(), if the tree slot isn't found. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: get mapping tree directly from fsinfo in find_first_block_groupJohannes Thumshirn
We already have an fs_info in our function parameters, there's no need to do the maths again and get fs_info from the extent_root just to get the mapping_tree. Instead directly grab the mapping_tree from fs_info. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: simplify checks when adding excluded rangesNikolay Borisov
Adresses held in 'logical' array are always guaranteed to fall within the boundaries of the block group. That is, 'start' can never be smaller than cache->start. This invariant follows from the way the address are calculated in btrfs_rmap_block: stripe_nr = physical - map->stripes[i].physical; stripe_nr = div64_u64(stripe_nr, map->stripe_len); bytenr = chunk_start + stripe_nr * io_stripe_size; I.e it's always some IO stripe within the given chunk. Exploit this invariant to simplify the body of the loop by removing the unnecessary 'if' since its 'else' part is the one always executed. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: read stripe len directly in btrfs_rmap_blockNikolay Borisov
extent_map::orig_block_len contains the size of a physical stripe when it's used to describe block groups (calculated in read_one_chunk via calc_stripe_length or calculated in decide_stripe_size and then assigned to extent_map::orig_block_len in create_chunk). Exploit this fact to get the size directly rather than opencoding the calculations. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27btrfs: don't balance btree inode pages from buffered write pathNikolay Borisov
The call to btrfs_btree_balance_dirty has been there since the early days of BTRFS, when the btree was directly modified from the write path, hence dirtied btree inode pages. With the implementation of b888db2bd7b6 ("Btrfs: Add delayed allocation to the extent based page tree code") 13 years ago the btree is no longer modified from the write path, hence there is no point in calling this function. Just remove it. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-27io: Fix return type of _inb and _inlStafford Horne
The return type of functions _inb, _inw and _inl are all u16 which looks wrong. This patch makes them u8, u16 and u32 respectively. The original commit text for these does not indicate that these should be all forced to u16. Fixes: f009c89df79a ("io: Provide _inX() and _outX()") Signed-off-by: Stafford Horne <shorne@gmail.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-07-27powerpc/64s/hash: Fix hash_preload running with interrupts enabledNicholas Piggin
Commit 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the caller") removed the local_irq_disable from hash_preload, but it was required for more than just the page table walk: the hash pte busy bit is effectively a lock which may be taken in interrupt context, and the local update flag test must not be preempted before it's used. This solves apparent lockups with perf interrupting __hash_page_64K. If get_perf_callchain then also takes a hash fault on the same page while it is already locked, it will loop forever taking hash faults, which looks like this: cpu 0x49e: Vector: 100 (System Reset) at [c00000001a4f7d70] pc: c000000000072dc8: hash_page_mm+0x8/0x800 lr: c00000000000c5a4: do_hash_page+0x24/0x38 sp: c0002ac1cc69ac70 msr: 8000000000081033 current = 0xc0002ac1cc602e00 paca = 0xc00000001de1f280 irqmask: 0x03 irq_happened: 0x01 pid = 20118, comm = pread2_processe Linux version 5.8.0-rc6-00345-g1fad14f18bc6 49e:mon> t [c0002ac1cc69ac70] c00000000000c5a4 do_hash_page+0x24/0x38 (unreliable) --- Exception: 300 (Data Access) at c00000000008fa60 __copy_tofrom_user_power7+0x20c/0x7ac [link register ] c000000000335d10 copy_from_user_nofault+0xf0/0x150 [c0002ac1cc69af70] c00032bf9fa3c880 (unreliable) [c0002ac1cc69afa0] c000000000109df0 read_user_stack_64+0x70/0xf0 [c0002ac1cc69afd0] c000000000109fcc perf_callchain_user_64+0x15c/0x410 [c0002ac1cc69b060] c000000000109c00 perf_callchain_user+0x20/0x40 [c0002ac1cc69b080] c00000000031c6cc get_perf_callchain+0x25c/0x360 [c0002ac1cc69b120] c000000000316b50 perf_callchain+0x70/0xa0 [c0002ac1cc69b140] c000000000316ddc perf_prepare_sample+0x25c/0x790 [c0002ac1cc69b1a0] c000000000317350 perf_event_output_forward+0x40/0xb0 [c0002ac1cc69b220] c000000000306138 __perf_event_overflow+0x88/0x1a0 [c0002ac1cc69b270] c00000000010cf70 record_and_restart+0x230/0x750 [c0002ac1cc69b620] c00000000010d69c perf_event_interrupt+0x20c/0x510 [c0002ac1cc69b730] c000000000027d9c performance_monitor_exception+0x4c/0x60 [c0002ac1cc69b750] c00000000000b2f8 performance_monitor_common_virt+0x1b8/0x1c0 --- Exception: f00 (Performance Monitor) at c0000000000cb5b0 pSeries_lpar_hpte_insert+0x0/0x160 [link register ] c0000000000846f0 __hash_page_64K+0x210/0x540 [c0002ac1cc69ba50] 0000000000000000 (unreliable) [c0002ac1cc69bb00] c000000000073ae0 update_mmu_cache+0x390/0x3a0 [c0002ac1cc69bb70] c00000000037f024 wp_page_copy+0x364/0xce0 [c0002ac1cc69bc20] c00000000038272c do_wp_page+0xdc/0xa60 [c0002ac1cc69bc70] c0000000003857bc handle_mm_fault+0xb9c/0x1b60 [c0002ac1cc69bd50] c00000000006c434 __do_page_fault+0x314/0xc90 [c0002ac1cc69be20] c00000000000c5c8 handle_page_fault+0x10/0x2c --- Exception: 300 (Data Access) at 00007fff8c861fe8 SP (7ffff6b19660) is in userspace Fixes: 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the caller") Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727060947.10060-1-npiggin@gmail.com
2020-07-27modpost: explain why we can't use strsepWolfram Sang
Mention why we open-code strsep, so it is clear that it is intentional. Fixes: 736bb11898ef ("modpost: remove use of non-standard strsep() in HOSTCC code") Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-07-27Merge branch 'linux-5.8' of git://github.com/skeggsb/linux into drm-fixesDave Airlie
A couple of fixes for issues relating to format modifiers (there's still a patch pending from James Jones to hopefully address the remaining ones), regression fix from the recent HDA nightmare, and a race fix for Turing modesetting. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Ben Skeggs <skeggsb@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv5aAp+FZMZGTB+Nszc==h5gEbdNV58sSRRQDF1R5qQRGg@mail.gmail.com
2020-07-26signal: fix typo in dequeue_synchronous_signal()Pavel Machek
s/postive/positive/ Signed-off-by: Pavel Machek (CIP) <pavel@denx.de> Link: https://lore.kernel.org/r/20200724090531.GA14409@amd [christian.brauner@ubuntu.com: tweak commit message] Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-26Linux 5.8-rc7v5.8-rc7Linus Torvalds
2020-07-26Merge tag 'kbuild-fixes-v5.8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild into master Pull Kbuild fixes from Masahiro Yamada: - do not use non-portable strsep() in a host program - fix single target builds for external modules - change Clang's --prefix option to make it work for the latest Clang * tag 'kbuild-fixes-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation kbuild: fix single target builds for external modules modpost: remove use of non-standard strsep() in HOSTCC code
2020-07-26drm/mcde: Fix stability issueLinus Walleij
Whenever a display update was sent, apart from updating the memory base address, we called mcde_display_send_one_frame() which also sent a command to the display requesting the TE IRQ and enabling the FIFO. When continuous updates are running this is wrong: we need to only send this to start the flow to the display on the very first update. This lead to the display pipeline locking up and crashing. Check if the flow is already running and in that case do not call mcde_display_send_one_frame(). This fixes crashes on the Samsung GT-S7710 (Skomer). Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Stephan Gerhold <stephan@gerhold.net> Cc: Stephan Gerhold <stephan@gerhold.net> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20200718233323.3407670-1-linus.walleij@linaro.org
2020-07-26Merge branch 'parisc-5.8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux into master Pull parisc fixes from Helge Deller: "Two fixes: - Add the cmpxchg() function for pointers to u8 values. This fixes a kernel linking error when building the tusb1210 driver (from Liam Beguin). - Add a define for atomic64_set_release() to fix CPU soft lockups which happen because of missing unlocks while processing bit operations (from John David Anglin)" * 'parisc-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Add atomic64_set_release() define to avoid CPU soft lockups parisc: add support for cmpxchg on u8 pointers
2020-07-26drm/bridge: nwl-dsi: Drop DRM_BRIDGE_ATTACH_NO_CONNECTOR check.Guido Günther
We don't create a connector but let panel_bridge handle that so there's no point in rejecting DRM_BRIDGE_ATTACH_NO_CONNECTOR. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/8b6545b991afce6add0a24f5f5d116778b0cb763.1595096667.git.agx@sigxcpu.org
2020-07-26drm/panel: Fix auo, kd101n80-45na horizontal noise on edges of panelJitao Shi
Fine tune the HBP and HFP to avoid the dot noise on the left and right edges. Signed-off-by: Jitao Shi <jitao.shi@mediatek.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200714123332.37609-1-jitao.shi@mediatek.com
2020-07-26drm: panel: simple: Delay HPD checking on boe_nv133fhm_n61 for 15 msDouglas Anderson
On boe_nv133fhm_n62 (and presumably on boe_nv133fhm_n61) a scope shows a small spike on the HPD line right when you power the panel on. The picture looks something like this: +-------------------------------------- | | | Power ---+ +--- | ++ | +----+| | HPD -----+ +---------------------------+ So right when power is applied there's a little bump in HPD and then there's small spike right before it goes low. The total time of the little bump plus the spike was measured on one panel as being 8 ms long. The total time for the HPD to go high on the same panel was 51.2 ms, though the datasheet only promises it is < 200 ms. When asked about this glitch, BOE indicated that it was expected and persisted until the TCON has been initialized. If this was a real hotpluggable DP panel then this wouldn't matter a whole lot. We'd debounce the HPD signal for a really long time and so the little blip wouldn't hurt. However, this is not a hotpluggable DP panel and the the debouncing logic isn't needed and just shows down the time needed to get the display working. This is why the code in panel_simple_prepare() doesn't do debouncing and just waits for HPD to go high once. Unfortunately if we get unlucky and happen to poll the HPD line right at the spike we can try talking to the panel before it's ready. Let's handle this situation by putting in a 15 ms prepare delay and decreasing the "hpd absent delay" by 15 ms. That means: * If you don't have HPD hooked up at all you've still got the hardcoded 200 ms delay. * If you've got HPD hooked up you will always wait at least 15 ms before checking HPD. The only case where this could be bad is if the panel is sharing a voltage rail with something else in the system and was already turned on long before the panel came up. In such a case we'll be delaying 15 ms for no reason, but it's not a huge delay and I don't see any other good solution to handle that case. Even though the delay was measured as 8 ms, 15 ms was chosen to give a bit of margin. Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716132120.1.I01e738cd469b61fc9b28b3ef1c6541a4f48b11bf@changeid