linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-05-27	xfs: force writes to delalloc regions to unwritten	Darrick J. Wong
	When writing to a delalloc region in the data fork, commit the new allocations (of the da reservation) as unwritten so that the mappings are only marked written once writeback completes successfully. This fixes the problem of stale data exposure if the system goes down during targeted writeback of a specific region of a file, as tested by generic/042. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: refactor xfs_iomap_prealloc_size	Darrick J. Wong
	Refactor xfs_iomap_prealloc_size to be the function that dynamically computes the per-file preallocation size by moving the allocsize= case to the caller. Break up the huge comment preceding the function to annotate the relevant parts of the code, and remove the impossible check_writeio case. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: measure all contiguous previous extents for prealloc size	Darrick J. Wong
	When we're estimating a new speculative preallocation length for an extending write, we should walk backwards through the extent list to determine the number of number of blocks that are physically and logically contiguous with the write offset, and use that as an input to the preallocation size computation. This way, preallocation length is truly measured by the effectiveness of the allocator in giving us contiguous allocations without being influenced by the state of a given extent. This fixes both the problem where ZERO_RANGE within an EOF can reduce preallocation, and prevents the unnecessary shrinkage of preallocation when delalloc extents are turned into unwritten extents. This was found as a regression in xfs/014 after changing delalloc writes to create unwritten extents during writeback. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: don't fail unwritten extent conversion on writeback due to edquot	Darrick J. Wong
	During writeback, it's possible for the quota block reservation in xfs_iomap_write_unwritten to fail with EDQUOT because we hit the quota limit. This causes writeback errors for data that was already written to disk, when it's not even guaranteed that the bmbt will expand to exceed the quota limit. Irritatingly, this condition is reported to userspace as EIO by fsync, which is confusing. We wrote the data, so allow the reservation. That might put us slightly above the hard limit, but it's better than losing data after a write. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-05-27	xfs: rearrange xfs_inode_walk_ag parameters	Darrick J. Wong
	The perag structure already has a pointer to the xfs_mount, so we don't need to pass that separately and can drop it. Having done that, move iter_flags so that the argument order is the same between xfs_inode_walk and xfs_inode_walk_ag. The latter will make things less confusing for a future patch that enables background scanning work to be done in parallel. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: straighten out all the naming around incore inode tree walks	Darrick J. Wong
	We're not very consistent about function names for the incore inode iteration function. Turn them all into xfs_inode_walk* variants. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-05-27	xfs: move xfs_inode_ag_iterator to be closer to the perag walking code	Darrick J. Wong
	Move the xfs_inode_ag_iterator function to be nearer xfs_inode_ag_walk so that we don't have to scroll back and forth to figure out how the incore inode walking function works. No functional changes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: use bool for done in xfs_inode_ag_walk	Darrick J. Wong
	This is a boolean variable, so use the bool type. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: fix inode ag walk predicate function return values	Darrick J. Wong
	There are a number of predicate functions that help the incore inode walking code decide if we really want to apply the iteration function to the inode. These are boolean decisions, so change the return types to boolean to match. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: refactor eofb matching into a single helper	Darrick J. Wong
	Refactor the two eofb-matching logics into a single helper so that we don't repeat ourselves. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-05-27	xfs: remove __xfs_icache_free_eofblocks	Darrick J. Wong
	This is now a pointless wrapper, so kill it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: remove flags argument from xfs_inode_ag_walk	Darrick J. Wong
	The incore inode walk code passes a flags argument and a pointer from the xfs_inode_ag_iterator caller all the way to the iteration function. We can reduce the function complexity by passing flags through the private pointer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: remove xfs_inode_ag_iterator_flags	Darrick J. Wong
	Combine xfs_inode_ag_iterator_flags and xfs_inode_ag_iterator_tag into a single wrapper function since there's only one caller of the _flags variant. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: remove unused xfs_inode_ag_iterator function	Darrick J. Wong
	Not used by anyone, so get rid of it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: replace open-coded XFS_ICI_NO_TAG	Darrick J. Wong
	Use XFS_ICI_NO_TAG instead of -1 when appropriate. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: move eofblocks conversion function to xfs_ioctl.c	Darrick J. Wong
	Move xfs_fs_eofblocks_from_user into the only file that actually uses it, so that we don't have this function cluttering up the header file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27	xfs: allow individual quota grace period extension	Eric Sandeen
	The only grace period which can be set in the kernel today is for id 0, i.e. the default grace period for all users. However, setting an individual grace period is useful; for example: Alice has a soft quota of 100 inodes, and a hard quota of 200 inodes Alice uses 150 inodes, and enters a short grace period Alice really needs to use those 150 inodes past the grace period The administrator extends Alice's grace period until next Monday vfs quota users such as ext4 can do this today, with setquota -T To enable this for XFS, we simply move the timelimit assignment out from under the (id == 0) test. Default setting remains under (id == 0). Note that this now is consistent with how we set warnings. (Userspace requires updates to enable this as well; xfs_quota needs to parse new options, and setquota needs to set appropriate field flags.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: per-type quota timers and warn limits	Eric Sandeen
	Move timers and warnings out of xfs_quotainfo and into xfs_def_quota so that we can utilize them on a per-type basis, rather than enforcing them based on the values found in the first enabled quota type. Signed-off-by: Eric Sandeen <sandeen@redhat.com> [zlang: new way to get defquota in xfs_qm_init_timelimits] [zlang: remove redundant defq assign] Signed-off-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: switch xfs_get_defquota to take explicit type	Eric Sandeen
	xfs_get_defquota() currently takes an xfs_dquot, and from that obtains the type of default quota we should get (user/group/project). But early in init, we don't have access to a fully set up quota, so that's not possible. The next patch needs go set up default quota timers early, so switch xfs_get_defquota to take an explicit type and add a helper function to obtain that type from an xfs_dquot for the existing callers. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: pass xfs_dquot to xfs_qm_adjust_dqtimers	Eric Sandeen
	Pass xfs_dquot rather than xfs_disk_dquot to xfs_qm_adjust_dqtimers; this makes it symmetric with xfs_qm_adjust_dqlimits and will help the next patch. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: fix up some whitespace in quota code	Eric Sandeen
	There is a fair bit of whitespace damage in the quota code, so fix up enough of it that subsequent patches are restricted to functional change to aid review. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: always return -ENOSPC on project quota reservation failure	Eric Sandeen
	XFS project quota treats project hierarchies as "mini filesysems" and so rather than -EDQUOT, the intent is to return -ENOSPC when a quota reservation fails, but this behavior is not consistent. The only place we make a decision between -EDQUOT and -ENOSPC returns based on quota type is in xfs_trans_dqresv(). This behavior is currently controlled by whether or not the XFS_QMOPT_ENOSPC flag gets passed into the quota reservation. However, its use is not consistent; paths such as xfs_create() and xfs_symlink() don't set the flag, so a reservation failure will return -EDQUOT for project quota reservation failures rather than -ENOSPC for these sorts of operations, even for project quota: # mkdir mnt/project # xfs_quota -x -c "project -s -p mnt/project 42" mnt # xfs_quota -x -c 'limit -p isoft=2 ihard=3 42' mnt # touch mnt/project/file{1,2,3} touch: cannot touch ‘mnt/project/file3’: Disk quota exceeded We can make this consistent by not requiring the flag to be set at the top of the callchain; instead we can simply test whether we are reserving a project quota with XFS_QM_ISPDQ in xfs_trans_dqresv and if so, return -ENOSPC for that failure. This removes the need for the XFS_QMOPT_ENOSPC altogether and simplifies the code a fair bit. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: group quota should return EDQUOT when prj quota enabled	Eric Sandeen
	Long ago, group & project quota were mutually exclusive, and so when we turned on XFS_QMOPT_ENOSPC ("return ENOSPC if project quota is exceeded") when project quota was enabled, we only needed to disable it again for user quota. When group & project quota got separated, this got missed, and as a result if project quota is enabled and group quota is exceeded, the error code returned is incorrectly returned as ENOSPC not EDQUOT. Fix this by stripping XFS_QMOPT_ENOSPC out of flags for group quota when we try to reserve the space. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: remove the m_active_trans counter	Dave Chinner
	It's a global atomic counter, and we are hitting it at a rate of half a million transactions a second, so it's bouncing the counter cacheline all over the place on large machines. We don't actually need it anymore - it used to be required because the VFS freeze code could not track/prevent filesystem transactions that were running, but that problem no longer exists. Hence to remove the counter, we simply have to ensure that nothing calls xfs_sync_sb() while we are trying to quiesce the filesytem. That only happens if the log worker is still running when we call xfs_quiesce_attr(). The log worker is cancelled at the end of xfs_quiesce_attr() by calling xfs_log_quiesce(), so just call it early here and then we can remove the counter altogether. Concurrent create, 50 million inodes, identical 16p/16GB virtual machines on different physical hosts. Machine A has twice the CPU cores per socket of machine B: unpatched patched machine A: 3m16s 2m00s machine B: 4m04s 4m05s Create rates: unpatched patched machine A: 282k+/-31k 468k+/-21k machine B: 231k+/-8k 233k+/-11k Concurrent rm of same 50 million inodes: unpatched patched machine A: 6m42s 2m33s machine B: 4m47s 4m47s The transaction rate on the fast machine went from just under 300k/sec to 700k/sec, which indicates just how much of a bottleneck this atomic counter was. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: separate read-only variables in struct xfs_mount	Dave Chinner
	Seeing massive cpu usage from xfs_agino_range() on one machine; instruction level profiles look similar to another machine running the same workload, only one machine is consuming 10x as much CPU as the other and going much slower. The only real difference between the two machines is core count per socket. Both are running identical 16p/16GB virtual machine configurations Machine A: 25.83% [k] xfs_agino_range 12.68% [k] __xfs_dir3_data_check 6.95% [k] xfs_verify_ino 6.78% [k] xfs_dir2_data_entry_tag_p 3.56% [k] xfs_buf_find 2.31% [k] xfs_verify_dir_ino 2.02% [k] xfs_dabuf_map.constprop.0 1.65% [k] xfs_ag_block_count And takes around 13 minutes to remove 50 million inodes. Machine B: 13.90% [k] __pv_queued_spin_lock_slowpath 3.76% [k] do_raw_spin_lock 2.83% [k] xfs_dir3_leaf_check_int 2.75% [k] xfs_agino_range 2.51% [k] __raw_callee_save___pv_queued_spin_unlock 2.18% [k] __xfs_dir3_data_check 2.02% [k] xfs_log_commit_cil And takes around 5m30s to remove 50 million inodes. Suspect is cacheline contention on m_sectbb_log which is used in one of the macros in xfs_agino_range. This is a read-only variable but shares a cacheline with m_active_trans which is a global atomic that gets bounced all around the machine. The workload is trying to run hundreds of thousands of transactions per second and hence cacheline contention will be occurring on this atomic counter. Hence xfs_agino_range() is likely just be an innocent bystander as the cache coherency protocol fights over the cacheline between CPU cores and sockets. On machine A, this rearrangement of the struct xfs_mount results in the profile changing to: 9.77% [kernel] [k] xfs_agino_range 6.27% [kernel] [k] __xfs_dir3_data_check 5.31% [kernel] [k] __pv_queued_spin_lock_slowpath 4.54% [kernel] [k] xfs_buf_find 3.79% [kernel] [k] do_raw_spin_lock 3.39% [kernel] [k] xfs_verify_ino 2.73% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock Vastly less CPU usage in xfs_agino_range(), but still 3x the amount of machine B and still runs substantially slower than it should. Current rm -rf of 50 million files: vanilla patched machine A 13m20s 6m42s machine B 5m30s 5m02s It's an improvement, hence indicating that separation and further optimisation of read-only global filesystem data is worthwhile, but it clearly isn't the underlying issue causing this specific performance degradation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: reduce free inode accounting overhead	Dave Chinner
	Shaokun Zhang reported that XFS was using substantial CPU time in percpu_count_sum() when running a single threaded benchmark on a high CPU count (128p) machine from xfs_mod_ifree(). The issue is that the filesystem is empty when the benchmark runs, so inode allocation is running with a very low inode free count. With the percpu counter batching, this means comparisons when the counter is less that 128 * 256 = 32768 use the slow path of adding up all the counters across the CPUs, and this is expensive on high CPU count machines. The summing in xfs_mod_ifree() is only used to fire an assert if an underrun occurs. The error is ignored by the higher level code. Hence this is really just debug code and we don't need to run it on production kernels, nor do we need such debug checks to return error values just to trigger an assert. Finally, xfs_mod_icount/xfs_mod_ifree are only called from xfs_trans_unreserve_and_mod_sb(), so get rid of them and just directly call the percpu_counter_add/percpu_counter_compare functions. The compare functions are now run only on debug builds as they are internal to ASSERT() checks and so only compiled in when ASSERTs are active (CONFIG_XFS_DEBUG=y or CONFIG_XFS_WARN=y). Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	xfs: gut error handling in xfs_trans_unreserve_and_mod_sb()	Dave Chinner
	xfs: gut error handling in xfs_trans_unreserve_and_mod_sb() From: Dave Chinner <dchinner@redhat.com> The error handling in xfs_trans_unreserve_and_mod_sb() is largely incorrect - rolling back the changes in the transaction if only one counter underruns makes all the other counters incorrect. We still allow the change to proceed and committing the transaction, except now we have multiple incorrect counters instead of a single underflow. Further, we don't actually report the error to the caller, so this is completely silent except on debug kernels that will assert on failure before we even get to the rollback code. Hence this error handling is broken, untested, and largely unnecessary complexity. Just remove it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-27	x86: Hide the archdata.iommu field behind generic IOMMU_API	Krzysztof Kozlowski
	There is a generic, kernel wide configuration symbol for enabling the IOMMU specific bits: CONFIG_IOMMU_API. Implementations (including INTEL_IOMMU and AMD_IOMMU driver) select it so use it here as well. This makes the conditional archdata.iommu field consistent with other platforms and also fixes any compile test builds of other IOMMU drivers, when INTEL_IOMMU or AMD_IOMMU are not selected). For the case when INTEL_IOMMU/AMD_IOMMU and COMPILE_TEST are not selected, this should create functionally equivalent code/choice. With COMPILE_TEST this field could appear if other IOMMU drivers are chosen but neither INTEL_IOMMU nor AMD_IOMMU are not. Reported-by: kbuild test robot <lkp@intel.com> Fixes: e93a1695d7fb ("iommu: Enable compile testing for some of drivers") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200518120855.27822-2-krzk@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-27	ia64: Hide the archdata.iommu field behind generic IOMMU_API	Krzysztof Kozlowski
	There is a generic, kernel wide configuration symbol for enabling the IOMMU specific bits: CONFIG_IOMMU_API. Implementations (including INTEL_IOMMU driver) select it so use it here as well. This makes the conditional archdata.iommu field consistent with other platforms and also fixes any compile test builds of other IOMMU drivers, when INTEL_IOMMU is not selected). For the case when INTEL_IOMMU and COMPILE_TEST are not selected, this should create functionally equivalent code/choice. With COMPILE_TEST this field could appear if other IOMMU drivers are chosen but INTEL_IOMMU not. Reported-by: kbuild test robot <lkp@intel.com> Fixes: e93a1695d7fb ("iommu: Enable compile testing for some of drivers") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20200518120855.27822-1-krzk@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-27	Merge tag 'gpio-fixes-for-v5.7' of ↵	Linus Walleij
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into fixes gpio fixes for v5.7 - fix mutex and spinlock ordering in gpio-mlxbf2 - fix the return value checks on devm_platform_ioremap_resource in gpio-pxa and gpio-bcm-kona
2020-05-27	MIPS: Loongson64: select NO_EXCEPT_FILL	Jiaxun Yang
	Loongson64 load kernel at 0x82000000 and allocate exception vectors by ebase. So we don't need to reserve space for exception vectors at head of kernel. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-05-27	arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()	Anshuman Khandual
	There is no way to proceed when requested register could not be searched in arm64_ftr_reg[]. Requesting for a non present register would be an error as well. Hence lets just WARN_ON() when search fails in get_arm64_ftr_reg() rather than checking for return value and doing a BUG_ON() instead in some individual callers. But there are also caller instances that dont error out when register search fails. Add a new helper get_arm64_ftr_reg_nowarn() for such cases. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1590573876-19120-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-27	block: blk-crypto-fallback: remove redundant initialization of variable err	Colin Ian King
	The variable err is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Satya Tangirala <satyat@google.com> Addresses-Coverity: ("Unused value") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build	Pablo Neira Ayuso
	>> include/linux/netfilter/nf_conntrack_pptp.h:13:20: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers] extern const char *const pptp_msg_name(u_int16_t msg); ^~~~~~ Reported-by: kbuild test robot <lkp@intel.com> Fixes: 4c559f15efcc ("netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27	netfilter: conntrack: comparison of unsigned in cthelper confirmation	Pablo Neira Ayuso
	net/netfilter/nf_conntrack_core.c: In function nf_confirm_cthelper: net/netfilter/nf_conntrack_core.c:2117:15: warning: comparison of unsigned expression in < 0 is always false [-Wtype-limits] 2117 \| if (protoff < 0 \|\| (frag_off & htons(~0x7)) != 0) \| ^ ipv6_skip_exthdr() returns a signed integer. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: 703acd70f249 ("netfilter: nfnetlink_cthelper: unbreak userspace helper support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27	netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_update	Nathan Chancellor
	Clang warns: net/netfilter/nf_conntrack_core.c:2068:21: warning: variable 'ctinfo' is uninitialized when used here [-Wuninitialized] nf_ct_set(skb, ct, ctinfo); ^~~~~~ net/netfilter/nf_conntrack_core.c:2024:2: note: variable 'ctinfo' is declared here enum ip_conntrack_info ctinfo; ^ 1 warning generated. nf_conntrack_update was split up into nf_conntrack_update and __nf_conntrack_update, where the assignment of ctinfo is in nf_conntrack_update but it is used in __nf_conntrack_update. Pass the value of ctinfo from nf_conntrack_update to __nf_conntrack_update so that uninitialized memory is not used and everything works properly. Fixes: ee04805ff54a ("netfilter: conntrack: make conntrack userspace helpers work again") Link: https://github.com/ClangBuiltLinux/linux/issues/1039 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27	block: reduce part_stat_lock() scope	Christoph Hellwig
	We only need the stats lock (aka preempt_disable()) for updating the states, not for looking up or dropping the hd_struct reference. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: use __this_cpu_add() instead of access by smp_processor_id()	Konstantin Khlebnikov
	Most architectures have fast path to access percpu for current cpu. The required preempt_disable() is provided by part_stat_lock(). [hch: rebased] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: remove rcu_read_lock() from part_stat_lock()	Konstantin Khlebnikov
	The RCU lock is required only in disk_map_sector_rcu() to lookup the partition. After that request holds reference to related hd_struct. Replace get_cpu() with preempt_disable() - returned cpu index is unused. [hch: rebased] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: add a blk_account_io_merge_bio helper	Konstantin Khlebnikov
	Move the non-"new_io" branch of blk_account_io_start() into separate function. Fix merge accounting for discards (they were counted as write merges). The new blk_account_io_merge_bio() doesn't call update_io_ticks() unlike blk_account_io_start(), as there is no reason for that. [hch: rebased] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: account merge of two requests	Konstantin Khlebnikov
	Also rename blk_account_io_merge() into blk_account_io_merge_request() to distinguish it from merging request and bio. [hch: rebased] Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: always use a percpu variable for disk stats	Christoph Hellwig
	percpu variables have a perfectly fine working stub implementation for UP kernels, so use that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: move update_io_ticks to blk-core.c	Christoph Hellwig
	All callers are in blk-core.c, so move update_io_ticks over. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	block: remove generic_{start,end}_io_acct	Christoph Hellwig
	Remove these now unused functions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	zram: nvdimm: use bio_{start,end}_io_acct and disk_{start,end}_io_acct	Christoph Hellwig
	Switch zram to use the nicer bio accounting helpers, and as part of that ensure each bio is counted as a single I/O request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	nvdimm: use bio_{start,end}_io_acct	Christoph Hellwig
	Switch dm to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	dm: use bio_{start,end}_io_acct	Christoph Hellwig
	Switch dm to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	bcache: use bio_{start,end}_io_acct	Christoph Hellwig
	Switch bcache to use the nicer bio accounting helpers, and call the routines where we also sample the start time to give coherent accounting results. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	lightnvm/pblk: use bio_{start,end}_io_acct	Christoph Hellwig
	Switch rsxx to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-27	rsxx: use bio_{start,end}_io_acct	Christoph Hellwig
	Switch rsxx to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>