path: root/include/trace
Age | Commit message | Author
2020-10-13 | Merge tag 'selinux-pr-20201012' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "A decent number of SELinux patches for v5.10, twenty two in total. The highlights are listed below, but all of the patches pass our test suite and merge cleanly. - A number of changes to how the SELinux policy is loaded and managed inside the kernel with the goal of improving the atomicity of a SELinux policy load operation. These changes account for the bulk of the diffstat as well as the patch count. A special thanks to everyone who contributed patches and fixes for this work. - Convert the SELinux policy read-write lock to RCU. - A tracepoint was added for audited SELinux access control events; this should help provide a more unified backtrace across kernel and userspace. - Allow the removal of security.selinux xattrs when a SELinux policy is not loaded. - Enable policy capabilities in SELinux policies created with the scripts/selinux/mdp tool. - Provide some "no sooner than" dates for the SELinux checkreqprot sysfs deprecation" * tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (22 commits) selinux: provide a "no sooner than" date for the checkreqprot removal selinux: Add helper functions to get and set checkreqprot selinux: access policycaps with READ_ONCE/WRITE_ONCE selinux: simplify away security_policydb_len() selinux: move policy mutex to selinux_state, use in lockdep checks selinux: fix error handling bugs in security_load_policy() selinux: convert policy read-write lock to RCU selinux: delete repeated words in comments selinux: add basic filtering for audit trace events selinux: add tracepoint on audited events selinux: Create new booleans and class dirs out of tree selinux: Standardize string literal usage for selinuxfs directory names selinux: Refactor selinuxfs directory populating functions selinux: Create function for selinuxfs directory cleanup selinux: permit removing security.selinux xattr before policy 
load selinux: fix memdup.cocci warnings selinux: avoid dereferencing the policy prior to initialization selinux: fix allocation failure check on newpolicy->sidtab selinux: refactor changing booleans selinux: move policy commit after updating selinuxfs ...
2020-10-13 | Merge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block driver updates from Jens Axboe: "Here are the driver updates for 5.10. A few SCSI updates in here too, in coordination with Martin as they depend on core block changes for the shared tag bitmap. This contains: - NVMe pull requests via Christoph: - fix keep alive timer modification (Amit Engel) - order the PCI ID list more sensibly (Andy Shevchenko) - cleanup the open by controller helper (Chaitanya Kulkarni) - use an xarray for the CSE log lookup (Chaitanya Kulkarni) - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni) - fix nvme_ns_report_zones (Christoph Hellwig) - add a sanity check to nvmet-fc (James Smart) - fix interrupt allocation when too many polled queues are specified (Jeffle Xu) - small nvmet-tcp optimization (Mark Wunderlich) - fix a controller refcount leak on init failure (Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni) - major refactoring of the scanning code (Christoph Hellwig) - MD updates via Song: - Bug fixes in bitmap code, from Zhao Heming - Fix a work queue check, from Guoqing Jiang - Fix raid5 oops with reshape, from Song Liu - Clean up unused code, from Jason Yan - Discard improvements, from Xiao Ni - raid5/6 page offset support, from Yufen Yu - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap, Hannes) - null_blk open/active zone limit support (Niklas) - Set of bcache updates (Coly, Dongsheng, Qinglang)" * tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits) md/raid5: fix oops during stripe resizing md/bitmap: fix memory leak of temporary bitmap md: fix the checking of wrong work queue md/bitmap: md_bitmap_get_counter returns wrong blocks md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks md/raid0: remove unused function is_io_in_chunk_boundary() nvme-core: remove extra condition for vwc nvme-core: remove extra variable nvme: remove nvme_identify_ns_list nvme: refactor nvme_validate_ns nvme: move nvme_validate_ns nvme: query namespace identifiers before adding the namespace 
nvme: revalidate zone bitmaps in nvme_update_ns_info nvme: remove nvme_update_formats nvme: update the known admin effects nvme: set the queue limits in nvme_update_ns_info nvme: remove the 0 lba_shift check in nvme_update_ns_info nvme: clean up the check for too large logic block sizes nvme: freeze the queue over ->lba_shift updates nvme: factor out a nvme_configure_metadata helper ...
2020-10-13 | Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, separating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle:
Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ...
2020-10-13 | Merge tag 'for-5.10-tag' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Mostly core updates with a few user visible bits and fixes. Highlights: - fsync performance improvements - less contention of log mutex (throughput +4%, latency -14%, dbench with 32 clients) - skip unnecessary commits for link and rename (throughput +6%, latency -30%, rename latency -75%, dbench with 16 clients) - make fast fsync wait only for writeback (throughput +10..40%, runtime -1..-20%, dbench with 1 to 64 clients on various file/block sizes) - direct io is now implemented using the iomap infrastructure, that's the main part, we still have a workaround that requires an iomap API update, coming in 5.10 - new sysfs exports: - information about the exclusive filesystem operation status (balance, device add/remove/replace, ...) - supported send stream version Core: - use ticket space reservations for data, fair policy using the same infrastructure as metadata - preparatory work to switch locking from our custom tree locks to standard rwsem, now the locking context is propagated to all callers, actual switch is expected to happen in the next dev cycle - seed device structures are now using list API - extent tracepoints print proper tree id - unified range checks for extent buffer helpers - send: avoid using temporary buffer for copying data - remove unnecessary RCU protection from space infos - remove unused readpage callback for metadata, enabling several cleanups - replace indirect function calls for end io hooks and remove extent_io_ops completely Fixes: - more lockdep warning fixes - fix qgroup reservation for delayed inode and an occasional reservation leak for preallocated files - fix device replace of a seed device - fix metadata reservation for fallocate that leads to transaction aborts - reschedule if necessary when logging directory items or when cloning lots of extents - tree-checker: fix false alert caused by legacy btrfs root item - send: fix rename/link
conflicts for orphanized inodes - properly initialize device stats for seed devices - skip devices without magic signature when mounting Other: - error handling improvements, BUG_ONs replaced by proper handling, fuzz fixes - various function parameter cleanups - various W=1 cleanups - error/info messages improved Mishaps: - commit 62cf5391209a ("btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks") is a rebase leftover after the patch got merged to 5.9-rc8 as a466c85edc6f ("btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks"), the remaining part is trivial and the patch is in the middle of the series so I'm keeping it there instead of rebasing" * tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (161 commits) btrfs: rename BTRFS_INODE_ORDERED_DATA_CLOSE flag btrfs: annotate device name rcu_string with __rcu btrfs: skip devices without magic signature when mounting btrfs: cleanup cow block on error btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK fs: remove no longer used dio_end_io() btrfs: return error if we're unable to read device stats btrfs: init device stats for seed devices btrfs: remove struct extent_io_ops btrfs: call submit_bio_hook directly for metadata pages btrfs: stop calling submit_bio_hook for data inodes btrfs: don't opencode is_data_inode in end_bio_extent_readpage btrfs: call submit_bio_hook directly in submit_one_bio btrfs: remove extent_io_ops::readpage_end_io_hook btrfs: replace readpage_end_io_hook with direct calls btrfs: send, recompute reference path after orphanization of a directory btrfs: send, orphanize first all conflicting inodes when processing references btrfs: tree-checker: fix false alert caused by legacy btrfs root item btrfs: use unaligned helpers for stack and header set/get helpers btrfs: free-space-cache: use unaligned helpers to access data ...
2020-10-12 | Merge tag 'locks-v5.10-1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking fix from Jeff Layton: "Just a single patch to fix up some tracepoint output" * tag 'locks-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: Remove extra "0x" in tracepoint format specifier
2020-10-12 | Merge tag 'x86-paravirt-2020-10-12' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt cleanup from Ingo Molnar: "Clean up the paravirt code after the removal of 32-bit Xen PV support" * tag 'x86-paravirt-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Avoid needless paravirt step clearing page table entries x86/paravirt: Remove set_pte_at() pv-op x86/entry/32: Simplify CONFIG_XEN_PV build dependency x86/paravirt: Use CONFIG_PARAVIRT_XXL instead of CONFIG_PARAVIRT x86/paravirt: Clean up paravirt macros x86/paravirt: Remove 32-bit support from CONFIG_PARAVIRT_XXL
2020-10-12 | Merge tag 'core-static_call-2020-10-12' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull static call support from Ingo Molnar: "This introduces static_call(), which is the idea of static_branch() applied to indirect function calls. Remove a data load (indirection) by modifying the text. They give the flexibility of function pointers, but with better performance. (This is especially important for cases where retpolines would otherwise be used, as retpolines can be pretty slow.) API overview: DECLARE_STATIC_CALL(name, func); DEFINE_STATIC_CALL(name, func); DEFINE_STATIC_CALL_NULL(name, typename); static_call(name)(args...); static_call_cond(name)(args...); static_call_update(name, func); x86 is supported via text patching, otherwise basic indirect calls are used, with function pointers. There's a second variant using inline code patching, inspired by jump-labels, implemented on x86 as well. The new APIs are utilized in the x86 perf code, a heavy user of function pointers, where static calls speed up the PMU handler by 4.2% (!). 
The generic implementation is not really exercised on other architectures, outside of the trivial test_static_call_init() self-test" * tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) static_call: Fix return type of static_call_init tracepoint: Fix out of sync data passing by static caller tracepoint: Fix overly long tracepoint names x86/perf, static_call: Optimize x86_pmu methods tracepoint: Optimize using static_call() static_call: Allow early init static_call: Add some validation static_call: Handle tail-calls static_call: Add static_call_cond() x86/alternatives: Teach text_poke_bp() to emulate RET static_call: Add simple self-test for static calls x86/static_call: Add inline static call implementation for x86-64 x86/static_call: Add out-of-line static call implementation static_call: Avoid kprobes on inline static_call()s static_call: Add inline static call infrastructure static_call: Add basic static call infrastructure compiler.h: Make __ADDRESSABLE() symbol truly unique jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() module: Properly propagate MODULE_STATE_COMING failure module: Fix up module_notifier return values ...
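The static call message above says that, where text patching is unavailable, "basic indirect calls are used, with function pointers". A minimal userspace sketch of that fallback idea follows. The `SKETCH_*` and `sketch_*` names are hypothetical stand-ins invented for illustration, not the kernel's macros, and `__typeof__` assumes a GCC- or Clang-compatible compiler.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins for the define/call/update trio.
 * These are NOT the kernel's static_call macros; they only model the
 * function-pointer fallback described in the commit message.
 */
#define SKETCH_DEFINE_CALL(name, func) \
	static __typeof__(&func) sketch_call_##name = &func
#define sketch_call(name)              (*sketch_call_##name)
#define sketch_call_update(name, func) (sketch_call_##name = &(func))

static int add_one(int x) { return x + 1; }
static int add_two(int x) { return x + 2; }

/* The call site starts out bound to add_one(). */
SKETCH_DEFINE_CALL(bump, add_one);

/* Every call goes through the pointer: one extra data load per call. */
int bump_via_sketch(int x)
{
	return sketch_call(bump)(x);
}

/* Retargeting only rewrites the pointer; callers are left untouched. */
void retarget_bump_to_add_two(void)
{
	sketch_call_update(bump, add_two);
}
```

Calling `bump_via_sketch(1)` before and after `retarget_bump_to_add_two()` exercises both targets. The data load through the pointer is the indirection that, per the message, the text-patching implementation removes.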
2020-10-12 | Merge tag 'sched-core-2020-10-12' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - reorganize & clean up the SD* flags definitions and add a bunch of sanity checks. These new checks caught quite a few bugs or at least inconsistencies, resulting in another set of patches. - rseq updates, add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ - add a new tracepoint to improve CPU capacity tracking - improve overloaded SMP system load-balancing behavior - tweak SMT balancing - energy-aware scheduling updates - NUMA balancing improvements - deadline scheduler fixes and improvements - CPU isolation fixes - misc cleanups, simplifications and smaller optimizations * tag 'sched-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) sched/deadline: Unthrottle PI boosted threads while enqueuing sched/debug: Add new tracepoint to track cpu_capacity sched/fair: Tweak pick_next_entity() rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ rseq/selftests,x86_64: Add rseq_offset_deref_addv() rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ sched/fair: Use dst group while checking imbalance for NUMA balancer sched/fair: Reduce busy load balance interval sched/fair: Minimize concurrent LBs between domain level sched/fair: Reduce minimal imbalance threshold sched/fair: Relax constraint on task's load during load balance sched/fair: Remove the force parameter of update_tg_load_avg() sched/fair: Fix wrong cpu selecting from isolated domain sched: Remove unused inline function uclamp_bucket_base_value() sched/rt: Disable RT_RUNTIME_SHARE by default sched/deadline: Fix stale throttling on de-/boosted tasks sched/numa: Use runnable_avg to classify node sched/topology: Move sd_flag_debug out of #ifdef CONFIG_SYSCTL MAINTAINERS: Add myself as SCHED_DEADLINE reviewer sched/topology: Move SD_DEGENERATE_GROUPS_MASK out of linux/sched/topology.h ...
2020-10-12 | Merge tag 'arm64-upstream' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's quite a lot of code here, but much of it is due to the addition of a new PMU driver as well as some arm64-specific selftests which is an area where we've traditionally been lagging a bit. In terms of exciting features, this includes support for the Memory Tagging Extension which narrowly missed 5.9, hopefully allowing userspace to run with use-after-free detection in production on CPUs that support it. Work is ongoing to integrate the feature with KASAN for 5.11. Another change that I'm excited about (assuming they get the hardware right) is preparing the ASID allocator for sharing the CPU page-table with the SMMU. Those changes will also come in via Joerg with the IOMMU pull. We do stray outside of our usual directories in a few places, mostly due to core changes required by MTE. Although much of this has been Acked, there were a couple of places where we unfortunately didn't get any review feedback. Other than that, we ran into a handful of minor conflicts in -next, but nothing that should pose any issues. Summary: - Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. - Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding.
- Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. - Miscellaneous cleanups and refactoring work" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) Revert "arm64: initialize per-cpu offsets earlier" arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 ...
2020-10-09 | Merge branch 'for-mingo' of ↵ | Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull v5.10 RCU changes from Paul E. McKenney: - Debugging for smp_call_function(). - Strict grace periods for KASAN. The point of this series is to find RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is further disabled by default. Finally, the help text includes a goodly list of scary caveats. - New smp_call_function() torture test. - Torture-test updates. - Documentation updates. - Miscellaneous fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-07 | btrfs: use own btree inode io_tree owner id | Qu Wenruo
Btree inode is special compared to all other inode extent io_trees: although it has a btrfs inode, it doesn't have the track_uptodate bit at all. This means a lot of things like extent locking don't even need to be applied to the btree io tree. Since it's so special, add a new owner value for it to make debugging a little easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 | btrfs: make ordered extent tracepoint take btrfs_inode | Nikolay Borisov
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 | btrfs: tracepoints: output proper root owner for trace_find_free_extent() | Qu Wenruo
The current trace event always outputs results like this:

  find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
  find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
  find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
  find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
  find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
  find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)

It's saying we're allocating a data extent for the EXTENT tree, which is not even possible.

It's because we always use the EXTENT tree as the owner for trace_find_free_extent() without using the @root from btrfs_reserve_extent().

This patch changes the parameter to use the proper @root for trace_find_free_extent(). Now it looks much better:

  find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
  find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
  find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=1(DATA)
  find_free_extent: root=5(FS_TREE) len=4096 empty_size=0 flags=1(DATA)
  find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
  find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
  find_free_extent: root=7(CSUM_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
  find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
  find_free_extent: root=1(ROOT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)

Reported-by: Hans van Kranenburg <hans@knorrie.org> CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
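The mismatch described in this message (a DATA allocation attributed to the EXTENT tree) can be spotted mechanically in captured output. The following is a rough userspace sketch that parses one line in the format quoted in the message and flags that combination. The struct and function names are hypothetical, the parser assumes the line starts at `find_free_extent:` (real trace output usually carries a task/CPU/timestamp prefix), and the DATA bit value of 1 is simply read off the `flags=1(DATA)` samples.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one find_free_extent line; not a kernel structure. */
struct ffe_event {
	unsigned long long root;
	unsigned long long len;
	unsigned long long empty_size;
	unsigned long long flags;
	char root_name[32];
	char flag_names[64];
};

/* Parse one line in the quoted format; returns 0 on success, -1 otherwise. */
int parse_find_free_extent(const char *line, struct ffe_event *ev)
{
	int n = sscanf(line,
		       "find_free_extent: root=%llu(%31[^)]) len=%llu empty_size=%llu flags=%llu(%63[^)])",
		       &ev->root, ev->root_name, &ev->len, &ev->empty_size,
		       &ev->flags, ev->flag_names);

	/* All six fields must convert for the line to count as parsed. */
	return n == 6 ? 0 : -1;
}

/*
 * Returns 1 for the combination the commit message calls impossible:
 * a DATA allocation owned by the EXTENT tree. The bit value 1 is an
 * assumption taken from the "flags=1(DATA)" samples.
 */
int is_data_on_extent_tree(const struct ffe_event *ev)
{
	return (ev->flags & 1ULL) && strcmp(ev->root_name, "EXTENT_TREE") == 0;
}
```

Feeding the pre-patch sample `root=2(EXTENT_TREE) ... flags=1(DATA)` through both functions marks it as suspicious, while the post-patch `root=5(FS_TREE)` lines pass.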
2020-10-06 | Merge tag 'v5.9-rc5' into asoc-5.10 | Mark Brown
Linux 5.9-rc5
2020-10-06 | ASoC: Intel: Remove haswell solution | Cezary Rojewski
Newly added catpt solution found in sound/soc/intel/catpt is a direct replacement for sound/soc/intel/haswell. It covers all features supported by it and more - by aligning to recommended flows and requirement list based on Windows driver equivalent. No harm is done to userspace as catpt - similarly to haswell - loads no external topology files while sharing the exact same ADSP firmware binary. Given the above, existing haswell code is redundant so remove it. Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@intel.com> Link: https://lore.kernel.org/r/20201006064907.16277-2-cezary.rojewski@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-03 | sched/debug: Add new tracepoint to track cpu_capacity | Vincent Donnefort
rq->cpu_capacity is a key element in several scheduler parts, such as EAS task placement and load balancing. Tracking this value enables testing and/or debugging by a toolkit. Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1598605249-72651-1-git-send-email-vincent.donnefort@arm.com
2020-10-02 | scsi: target: core: Add CONTROL field for trace events | Roman Bolshakov
trace-cmd report doesn't show events from target subsystem because scsi_command_size() leaks through event format string:

  [target:target_sequencer_start] function scsi_command_size not defined
  [target:target_cmd_complete] function scsi_command_size not defined

Addition of scsi_command_size() to plugin_scsi.c in trace-cmd doesn't help because an expression is used inside TP_printk(). trace-cmd event parser doesn't understand minus sign inside [ ]:

  Error: expected ']' but read '-'

Rather than duplicating kernel code in plugin_scsi.c, provide a dedicated field for CONTROL byte.

Link: https://lore.kernel.org/r/20200929125957.83069-1-r.bolshakov@yadro.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-02 | bcache: add set_uuid in struct cache_set | Coly Li
This patch adds a separate set_uuid[16] in struct cache_set to store the uuid of the cache set. This is preparation for removing the embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-30 | devlink: Add a tracepoint for trap reports | Ido Schimmel
Add a tracepoint for trap reports so that drop monitor could register its probe on it. Use trace_devlink_trap_report_enabled() to avoid wasting cycles setting the trap metadata if the tracepoint is not enabled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
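The "check whether the tracepoint is enabled before preparing its arguments" pattern mentioned above can be illustrated outside the kernel. This is only a userspace sketch: every `sketch_*` name, the metadata struct, and the trap name used in the usage note are hypothetical, and a plain flag stands in for the kernel's `trace_devlink_trap_report_enabled()` check.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the tracepoint's enabled state. */
static bool sketch_trace_enabled;
/* Counts how often the (notionally expensive) metadata setup ran. */
static unsigned int sketch_builds;

/* Hypothetical metadata; the real devlink trap metadata differs. */
struct sketch_trap_metadata {
	const char *trap_name;
	int in_port;
};

void sketch_set_trace_enabled(bool on) { sketch_trace_enabled = on; }
unsigned int sketch_metadata_builds(void) { return sketch_builds; }

/* Stands in for the work we want to skip when nobody is listening. */
static void sketch_build_metadata(struct sketch_trap_metadata *md,
				  const char *name, int port)
{
	sketch_builds++;
	md->trap_name = name;
	md->in_port = port;
}

/* Stands in for the tracepoint itself. */
static void sketch_trace_trap_report(const struct sketch_trap_metadata *md)
{
	printf("trap_report: trap=%s in_port=%d\n", md->trap_name, md->in_port);
}

/* Guard first, then build the arguments, then emit the event. */
void report_trap(const char *name, int port)
{
	struct sketch_trap_metadata md;

	if (!sketch_trace_enabled)
		return; /* no metadata setup when the event is disabled */

	sketch_build_metadata(&md, name, port);
	sketch_trace_trap_report(&md);
}
```

With the flag off, `report_trap("example_trap", 1)` returns without touching the metadata, which is the cycle saving the message describes. After `sketch_set_trace_enabled(true)` the same call builds the metadata once and prints the event.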
2020-09-28 | KVM: x86: Allow deflecting unknown MSR accesses to user space | Alexander Graf
MSRs are weird. Some of them are normal control registers, such as EFER. Some however are registers that really are model specific, not very interesting to virtualization workloads, and not performance critical. Others again are really just windows into package configuration. Out of these MSRs, only the first category is necessary to implement in kernel space. Rarely accessed MSRs, MSRs that should be fine tuned against certain CPU models and MSRs that contain information on the package level are much better suited for user space to process. However, over time we have accumulated a lot of MSRs that are not in the first category, but are still handled by in-kernel KVM code. This patch adds a generic interface to handle WRMSR and RDMSR from user space. With this, any future MSR that is part of the latter categories can be handled in user space. Furthermore, it allows us to replace the existing "ignore_msrs" logic with something that applies per-VM rather than on the full system. That way you can run productive VMs in parallel to experimental ones where you don't care about proper MSR handling. Signed-off-by: Alexander Graf <graf@amazon.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200925143422.21718-3-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-25 | iocost: add iocg_forgive_debt tracepoint | Tejun Heo
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 | Merge branch 'for-5.10/block' into for-5.10/drivers | Jens Axboe
* for-5.10/block: (140 commits) bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag bdi: invert BDI_CAP_NO_ACCT_WB bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag mm: use SWP_SYNCHRONOUS_IO more intelligently bdi: remove BDI_CAP_SYNCHRONOUS_IO bdi: remove BDI_CAP_CGROUP_WRITEBACK block: lift setting the readahead size into the block layer md: update the optimal I/O size on reshape bdi: initialize ->ra_pages and ->io_pages in bdi_init aoe: set an optimal I/O size bcache: inherit the optimal I/O size drbd: remove dead code in device_to_statistics fs: remove the unused SB_I_MULTIROOT flag block: mark blkdev_get static PM: mm: cleanup swsusp_swap_check mm: split swap_type_of PM: rewrite is_hibernate_resume_dev to not require an inode mm: cleanup claim_swapfile ocfs2: cleanup o2hb_region_dev_store dasd: cleanup dasd_scan_partitions ...
2020-09-21 | SUNRPC: Remove dprintk call sites in RPC queuing functions | Chuck Lever
Remove redundant call sites or call sites that are already covered by tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Clean up RPC scheduler tracepoints | Chuck Lever
Remove several redundant dprintk call sites, and replace a couple of potentially useful ones with tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Replace rpcbind dprintk call sites with tracepoints | Chuck Lever
In many cases, tracepoints already report these errors. In others, the dprintks were mainly useful when this code was less mature. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Hoist trace_xprtrdma_op_setport into generic code | Chuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Remove rpcb_getport_async dprintk call sites | Chuck Lever
In many cases, tracepoints already report these errors. In others, the dprintks were mainly useful when this code was less mature. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Clean up call_bind_status() observability | Chuck Lever
Time to remove dprintk call sites in here. Regarding the rpc_bind_status tracepoint: It's friendlier to administrators if they don't have to look up the error code to figure out what went wrong. Replace trace_rpc_bind_status with a set of tracepoints that report more specifically what the problem was, and what RPC program/version was being queried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Trace call_refresh events | Chuck Lever
Clean up: Replace dprintk call sites. Note that rpc_call_rpcerror() already has a trace point, so perhaps adding trace_rpc_refresh_status() isn't necessary. However, it does report a particular category of error. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Add trace_rpc_timeout_status() | Chuck Lever
For a long while we've wanted a tracepoint that fires when a major timeout is reported in the system log. Such a tracepoint can be attached to other actions that can take place when a timeout is detected (eg, server or connection health assessment). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 | SUNRPC: Replace connect dprintk call sites with a tracepoint | Chuck Lever
This trace event can be used to audit transport connections from the client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace dprintk() call site in xs_nospace()Chuck Lever
"no socket space" is an exceptional and infrequent condition that troubleshooters want to know about. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace dprintk() call site in xprt_prepare_transmitChuck Lever
Generate a trace event when an RPC request is queued without being sent immediately. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Update debugging instrumentation in xprt_do_reserve()Chuck Lever
Replace a dprintk() with a tracepoint. The tracepoint marks the point where an RPC request is assigned an XID. Additional clean up: Remove trace_xprt_enq_xmit, which reports much the same thing. That tracepoint was added for debugging commit 918f3c1fe83c ("SUNRPC: Improve latency for interactive tasks"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Remove debugging instrumentation from xprt_releaseChuck Lever
These instruments don't appear to add any substantial value. We already have this at the termination of each RPC: iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58 iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384 iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Hoist trace_xprtrdma_op_allocate into generic codeChuck Lever
Introduce a tracepoint in call_allocate that reports the exact sizes in the RPC buffer allocation request and the status of the result. This helps catch problems with XDR buffer provisioning, and replaces transport-specific debugging instrumentation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Remove trace_xprt_complete_rqst()Chuck Lever
Request completion is already recorded by an "rpc_task_wakeup queue=xprt_pending" trace record. A subsequent rpc_xdr_recvfrom trace record shows the number of bytes received. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-18Merge branch 'mlx5_active_speed' into rdma.git for-nextJason Gunthorpe
Leon Romanovsky says: ==================== IBTA declares speed as 16 bits, but kernel stores it in u8. This series fixes in-kernel declaration while keeping external interface intact. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_active_speed': RDMA: Fix link active_speed size RDMA/mlx5: Delete duplicated mlx5_ptys_width enum net/mlx5: Refactor query port speed functions
2020-09-14rxrpc: Fix a missing NULL-pointer check in a traceDavid Howells
Fix the rxrpc_client tracepoint to not dereference conn to get the cid if conn is NULL, as it does for other fields. RIP: 0010:trace_event_raw_event_rxrpc_client+0x7e/0xe0 [rxrpc] Call Trace: rxrpc_activate_channels+0x62/0xb0 [rxrpc] rxrpc_connect_call+0x481/0x650 [rxrpc] ? wake_up_q+0xa0/0xa0 ? rxrpc_kernel_begin_call+0x12a/0x1b0 [rxrpc] rxrpc_new_client_call+0x2a5/0x5e0 [rxrpc] Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com>
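The guard described above can be sketched in plain C as follows. This is only an illustration of the pattern, not the kernel's tracepoint code: the `demo_conn` and `demo_trace_entry` structures and the `demo_trace_client()` helper are hypothetical stand-ins for the real rxrpc structures and the TP_fast_assign body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the connection object. */
struct demo_conn {
	uint32_t cid;          /* connection ID */
	unsigned int debug_id; /* debugging identifier */
};

/* Hypothetical, simplified stand-in for the trace record. */
struct demo_trace_entry {
	uint32_t cid;
	unsigned int conn_debug_id;
};

/*
 * Fill a trace record. conn may legitimately be NULL, so every field
 * read from it is guarded, including cid.
 */
static void demo_trace_client(struct demo_trace_entry *entry,
			      const struct demo_conn *conn)
{
	assert(entry != NULL);
	entry->conn_debug_id = conn ? conn->debug_id : 0;
	entry->cid = conn ? conn->cid : 0;
}
```

Recording 0 for a missing connection is an assumption of this sketch; the point is only that cid gets the same NULL check as the other fields.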
2020-09-11f2fs: trace: fix typoChao Yu
Fixes a typo from 'compreesed' to 'compressed'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: support age threshold based garbage collectionChao Yu
There are several issues in the current background GC algorithm: - the number of valid blocks is one of the key factors in the cost calculation, so if a segment has few valid blocks, the CB algorithm will still choose it as the victim even if it is young or located in a hot segment, which is not appropriate. - GCed data/nodes go to existing logs regardless of whether the update frequency of the data already there is the same, which may mix hot and cold data again. - the GC allocator mainly uses LFS type segments, which consumes free segments more quickly. This patch introduces a new algorithm named age threshold based garbage collection to solve the above issues. There are three main steps: 1. select a source victim: - set an age threshold, and select candidates based on the threshold: e.g. 0 means youngest, 100 means oldest, if we set the age threshold to 80 then select dirty segments whose age is in the range [80, 100] as candidates; - set a candidate_ratio threshold, and select candidates based on the ratio, so that we can shrink the candidates to the oldest segments; - select the target segment with the fewest valid blocks in order to migrate blocks with minimum cost; 2. select a target victim: - select candidates based on the age threshold; - set a candidate_radius threshold, search for candidates whose age is around that of the source victim; the search radius should be less than the radius threshold. - select the target segment with the most valid blocks in order to avoid migrating the current target segment. 3. merge valid blocks from the source victim into the target victim with the SSR allocator. 
Test steps: - create 160 dirty segments: * half of them have 128 valid blocks per segment * left of them have 384 valid blocks per segment - run background GC Benefit: GC count and block movement count both decrease obviously: - Before: - Valid: 86 - Dirty: 1 - Prefree: 11 - Free: 6001 (6001) GC calls: 162 (BG: 220) - data segments : 160 (160) - node segments : 2 (2) Try to move 41454 blocks (BG: 41454) - data blocks : 40960 (40960) - node blocks : 494 (494) IPU: 0 blocks SSR: 0 blocks in 0 segments LFS: 41364 blocks in 81 segments - After: - Valid: 87 - Dirty: 0 - Prefree: 4 - Free: 6008 (6008) GC calls: 75 (BG: 76) - data segments : 74 (74) - node segments : 1 (1) Try to move 12813 blocks (BG: 12813) - data blocks : 12544 (12544) - node blocks : 269 (269) IPU: 0 blocks SSR: 12032 blocks in 77 segments LFS: 855 blocks in 2 segments Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
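A minimal sketch of the source-victim selection from step 1 of the f2fs commit above, assuming a flat array of dirty segments with a precomputed age in the range 0 to 100. The `demo_segment` type and `demo_pick_source()` are hypothetical names for illustration; the candidate_ratio filtering is left out, and this is not the f2fs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-segment summary used only by this sketch. */
struct demo_segment {
	unsigned int age;          /* 0 = youngest, 100 = oldest */
	unsigned int valid_blocks; /* blocks that would need migrating */
};

/*
 * Return the index of the source victim, or -1 if no segment is old
 * enough. Candidates are the segments with age >= age_threshold; among
 * those, the one with the fewest valid blocks is the cheapest to migrate.
 */
static int demo_pick_source(const struct demo_segment *segs, size_t n,
			    unsigned int age_threshold)
{
	int best = -1;

	assert(segs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (segs[i].age < age_threshold)
			continue; /* too young to be a candidate */
		if (best < 0 || segs[i].valid_blocks < segs[best].valid_blocks)
			best = (int)i;
	}
	return best;
}
```

With an age threshold of 80, a young segment is skipped even if it holds very few valid blocks, which is the behaviour the commit message contrasts with the cost-benefit algorithm.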
2020-09-08rxrpc: Rewrite the client connection managerDavid Howells
Rewrite the rxrpc client connection manager so that it can support multiple connections for a given security key to a peer. The following changes are made: (1) For each open socket, the code currently maintains an rbtree with the connections placed into it, keyed by communications parameters. This is tricky to maintain as connections can be culled from the tree or replaced within it. Connections can require replacement for a number of reasons, e.g. their IDs span too great a range for the IDR data type to represent efficiently, the call ID numbers on that conn would overflow or the conn got aborted. This is changed so that there's now a connection bundle object placed in the tree, keyed on the same parameters. The bundle, however, does not need to be replaced. (2) An rxrpc_bundle object can now manage the available channels for a set of parallel connections. The lock that manages this is moved there from the rxrpc_connection struct (channel_lock). (3) There's a dummy bundle for all incoming connections to share so that they have a channel_lock too. It might be better to give each incoming connection its own bundle. This bundle is not needed to manage which channels incoming calls are made on because that's solely at the whim of the client. (4) The restrictions on how many client connections are around are removed. Instead, a previous patch limits the number of client calls that can be allocated. Ordinarily, client connections are reaped after 2 minutes on the idle queue, but when more than a certain number of connections are in existence, the reaper starts reaping them after 2s of idleness instead to get the numbers back down. It could also be made such that new call allocations are forced to wait until the number of outstanding connections subsides. Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-04mm: Add PG_arch_2 page flagSteven Price
For arm64 MTE support it is necessary to be able to mark pages that contain user space visible tags that will need to be saved/restored e.g. when swapped out. To support this add a new arch specific flag (PG_arch_2). This flag is only available on 64-bit architectures due to the limited number of spare page flags on the 32-bit ones. Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi Kivilinna. 2) Fix loss of RTT samples in rxrpc, from David Howells. 3) Memory leak in hns_nic_dev_probe(), from Dinghao Liu. 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka. 5) We disable BH for too long in sctp_get_port_local(), add a cond_resched() here as well, from Xin Long. 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu. 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from Yonghong Song. 8) Missing of_node_put() in mt7530 DSA driver, from Sumera Priyadarsini. 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan. 10) Fix geneve tunnel checksumming bug in hns3, from Yi Li. 11) Memory leak in rxkad_verify_response, from Dinghao Liu. 12) In tipc, don't use smp_processor_id() in preemptible context. From Tuong Lien. 13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu. 14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter. 15) Fix ABI mismatch between driver and firmware in nfp, from Louis Peens. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits) net/smc: fix sock refcounting in case of termination net/smc: reset sndbuf_desc if freed net/smc: set rx_off for SMCR explicitly net/smc: fix toleration of fake add_link messages tg3: Fix soft lockup when tg3_reset_task() fails. 
doc: net: dsa: Fix typo in config code sample net: dp83867: Fix WoL SecureOn password nfp: flower: fix ABI mismatch between driver and firmware tipc: fix shutdown() of connectionless socket ipv6: Fix sysctl max for fib_multipath_hash_policy drivers/net/wan/hdlc: Change the default of hard_header_len to 0 net: gemini: Fix another missing clk_disable_unprepare() in probe net: bcmgenet: fix mask check in bcmgenet_validate_flow() amd-xgbe: Add support for new port mode net: usb: dm9601: Add USB ID of Keenetic Plus DSL vhost: fix typo in error message net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() pktgen: fix error message with wrong function name net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode cxgb4: fix thermal zone device registration ...
2020-09-01blk-iocost: restore inuse update tracepointsTejun Heo
Update and restore the inuse update tracepoints. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: decouple vrate adjustment from surplus transfersTejun Heo
Budget donations are inaccurate and could take multiple periods to converge. To prevent triggering vrate adjustments while surplus transfers were catching up, vrate adjustment was suppressed if donations were increasing, which was indicated by non-zero nr_surpluses. This entangling won't be necessary with the scheduled rewrite of donation mechanism which will make it precise and immediate. Let's decouple the two in preparation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_usTejun Heo
Currently, iocg->usages[] which are used to guide inuse adjustments are calculated from vtime deltas. This, however, assumes that the hierarchical inuse weight at the time of calculation held for the entire period, which often isn't true and can lead to significant errors. Now that we have absolute usage information collected, we can derive iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment decisions are made based on actual absolute usage. The calculated usage is clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal saturation regardless of the current hierarchical inuse weight. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
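The clamping described above can be illustrated with a small sketch. It assumes that usage is expressed as the fraction of a period spent busy, scaled so that a fully busy period maps to WEIGHT_ONE, and that WEIGHT_ONE is 1 << 16. Both the `demo_calc_usage()` helper and the `DEMO_WEIGHT_ONE` value are assumptions of this example, not code taken from blk-iocost.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point value representing 100% in this sketch. */
#define DEMO_WEIGHT_ONE (1u << 16)

/*
 * Convert absolute usage (microseconds busy within a period) into a
 * fixed-point share of the period, rounded up and clamped to
 * [1, DEMO_WEIGHT_ONE]. Hitting DEMO_WEIGHT_ONE signals saturation.
 */
static uint32_t demo_calc_usage(uint64_t usage_us, uint64_t period_us)
{
	uint64_t scaled;

	assert(period_us > 0);
	scaled = (usage_us * DEMO_WEIGHT_ONE + period_us - 1) / period_us;
	if (scaled < 1)
		return 1;
	if (scaled > DEMO_WEIGHT_ONE)
		return DEMO_WEIGHT_ONE;
	return (uint32_t)scaled;
}
```

Because the input is measured time rather than a vtime delta, the result does not depend on which hierarchical inuse weight happened to be in effect during the period.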
2020-09-01locks: Remove extra "0x" in tracepoint format specifierChuck Lever
Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-09-01tracepoint: Optimize using static_call()Steven Rostedt (VMware)
Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
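The dispatch strategy in this commit message can be approximated in userspace C as below. This is a sketch under stated assumptions: an ordinary function pointer (`demo_call_target`) stands in for static_call(), which in the kernel patches the call site itself, and all `demo_*` names are hypothetical. With one registered probe the target is the probe; with more than one it is a function that walks the whole vector.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_probe_fn)(int value);

#define DEMO_MAX_PROBES 4

static demo_probe_fn demo_probes[DEMO_MAX_PROBES];
static size_t demo_nr_probes;

/* Slow path: call every registered probe in turn. */
static void demo_iterate_probes(int value)
{
	for (size_t i = 0; i < demo_nr_probes; i++)
		demo_probes[i](value);
}

/* Stand-in for the static call target; retargeted on registration. */
static demo_probe_fn demo_call_target = demo_iterate_probes;

/* Register a probe and pick the cheapest dispatch for the new count. */
static int demo_register_probe(demo_probe_fn fn)
{
	assert(fn != NULL);
	if (demo_nr_probes == DEMO_MAX_PROBES)
		return -1;
	demo_probes[demo_nr_probes++] = fn;
	demo_call_target = (demo_nr_probes == 1) ? demo_probes[0]
						 : demo_iterate_probes;
	return 0;
}

/* The "tracepoint site": a single call through the current target. */
static void demo_trace(int value)
{
	if (demo_nr_probes)
		demo_call_target(value);
}

/* Two sample probes that accumulate what they were called with. */
static int demo_sum_a, demo_sum_b;
static void demo_probe_a(int value) { demo_sum_a += value; }
static void demo_probe_b(int value) { demo_sum_b += value; }
```

Registering `demo_probe_a` alone makes `demo_trace()` reach it without walking the vector; registering `demo_probe_b` as well switches the target back to the iterating function.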
2020-08-31Merge tag 'v5.9-rc3' into rdma.git for-nextJason Gunthorpe
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>