summaryrefslogtreecommitdiff
path: root/include/trace
AgeCommit message (Collapse)Author
2023-02-13btrfs: pass find_free_extent_ctl to allocator tracepointsBoris Burkov
The allocator tracepoints currently have a pile of values from ffe_ctl. In modifying the allocator and adding more tracepoints, I found myself adding to the already long argument list of the tracepoints. It makes it a lot simpler to just send in the ffe_ctl itself. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-12tracing: Fix TASK_COMM_LEN in trace event format fileYafang Shao
After commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"), the content of the format file under /sys/kernel/tracing/events/task/task_newtask was changed from field:char comm[16]; offset:12; size:16; signed:0; to field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; John reported that this change breaks older versions of perfetto. Then Mathieu pointed out that this behavioral change was caused by the use of __stringify(_len), which happens to work on macros, but not on enum labels. And he also gave the suggestion on how to fix it: :One possible solution to make this more robust would be to extend :struct trace_event_fields with one more field that indicates the length :of an array as an actual integer, without storing it in its stringified :form in the type, and do the formatting in f_show where it belongs. The result as follows after this change, $ cat /sys/kernel/tracing/events/task/task_newtask/format field:char comm[16]; offset:12; size:16; signed:0; Link: https://lore.kernel.org/lkml/Y+QaZtz55LIirsUO@google.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230210155921.4610-1-laoar.shao@gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230212151303.12353-1-laoar.shao@gmail.com Cc: stable@vger.kernel.org Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Kajetan Puchalski <kajetan.puchalski@arm.com> CC: Qais Yousef <qyousef@layalina.io> Fixes: 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN") Reported-by: John Stultz <jstultz@google.com> Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-07rxrpc: Trace ack.rwindDavid Howells
Log ack.rwind in the rxrpc_tx_ack tracepoint. This value is useful to see as it represents flow-control information to the peer. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-02-07Merge branch 'for-6.3/cxl' into cxl/nextDan Williams
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3. Resolve a conflict with the fix and move of cxl_report_and_clear() from pci.c to core/pci.c.
2023-02-07f2fs: use iostat_lat_type directly as a parameter in the ↵Yangtao Li
iostat_update_and_unbind_ctx() Convert to use iostat_lat_type as parameter instead of raw number. BTW, move NUM_PREALLOC_IOSTAT_CTXS to the header file, adjust iostat_lat[{0,1,2}] to iostat_lat[{READ_IO,WRITE_SYNC_IO,WRITE_ASYNC_IO}] in tracepoint function, and rename iotype to page_type to match the definition. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-07tracing: Acquire buffer from temparary trace sequenceLinyu Yuan
there is one dwc3 trace event declare as below, DECLARE_EVENT_CLASS(dwc3_log_event, TP_PROTO(u32 event, struct dwc3 *dwc), TP_ARGS(event, dwc), TP_STRUCT__entry( __field(u32, event) __field(u32, ep0state) __dynamic_array(char, str, DWC3_MSG_MAX) ), TP_fast_assign( __entry->event = event; __entry->ep0state = dwc->ep0state; ), TP_printk("event (%08x): %s", __entry->event, dwc3_decode_event(__get_str(str), DWC3_MSG_MAX, __entry->event, __entry->ep0state)) ); the problem is when trace function called, it will allocate up to DWC3_MSG_MAX bytes from trace event buffer, but never fill the buffer during fast assignment, it only fill the buffer when output function are called, so this means if output function are not called, the buffer will never used. add __get_buf(len) which acquiree buffer from iter->tmp_seq when trace output function called, it allow user write string to acquired buffer. the mentioned dwc3 trace event will changed as below, DECLARE_EVENT_CLASS(dwc3_log_event, TP_PROTO(u32 event, struct dwc3 *dwc), TP_ARGS(event, dwc), TP_STRUCT__entry( __field(u32, event) __field(u32, ep0state) ), TP_fast_assign( __entry->event = event; __entry->ep0state = dwc->ep0state; ), TP_printk("event (%08x): %s", __entry->event, dwc3_decode_event(__get_buf(DWC3_MSG_MAX), DWC3_MSG_MAX, __entry->event, __entry->ep0state)) );. Link: https://lore.kernel.org/linux-trace-kernel/1675065249-23368-1-git-send-email-quic_linyyuan@quicinc.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-06net: bridge: Add a tracepoint for MDB overflowsPetr Machata
The following patch will add two more maximum MDB allowances to the global one, mcast_hash_max, that exists today. In all these cases, attempts to add MDB entries above the configured maximums through netlink, fail noisily and obviously. Such visibility is missing when adding entries through the control plane traffic, by IGMP or MLD packets. To improve visibility in those cases, add a trace point that reports the violation, including the relevant netdevice (be it a slave or the bridge itself), and the MDB entry parameters: # perf record -e bridge:br_mdb_full & # [...] # perf script | cut -d: -f4- dev v2 af 2 src ::ffff:0.0.0.0 grp ::ffff:239.1.1.112/00:00:00:00:00:00 vid 0 dev v2 af 10 src :: grp ff0e::112/00:00:00:00:00:00 vid 0 dev v2 af 2 src ::ffff:0.0.0.0 grp ::ffff:239.1.1.112/00:00:00:00:00:00 vid 10 dev v2 af 10 src 2001:db8:1::1 grp ff0e::1/00:00:00:00:00:00 vid 10 dev v2 af 2 src ::ffff:192.0.2.1 grp ::ffff:239.1.1.1/00:00:00:00:00:00 vid 10 CC: Steven Rostedt <rostedt@goodmis.org> CC: linux-trace-kernel@vger.kernel.org Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-02mm: discard __GFP_ATOMICNeilBrown
__GFP_ATOMIC serves little purpose. Its main effect is to set ALLOC_HARDER which adds a few little boosts to increase the chance of an allocation succeeding, one of which is to lower the water-mark at which it will succeed. It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also adjusts this watermark. It is probable that other users of __GFP_HIGH should benefit from the other little bonuses that __GFP_ATOMIC gets. __GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM. There is little point to this. We already get a might_sleep() warning if __GFP_DIRECT_RECLAIM is set. __GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is probable that testing ALLOC_HARDER is a better fit here. __GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might sleep. This should test __GFP_DIRECT_RECLAIM instead. This patch: - removes __GFP_ATOMIC - allows __GFP_HIGH allocations to ignore watermark boosting as well as GFP_ATOMIC requests. - makes other adjustments as suggested by the above. The net result is not change to GFP_ATOMIC allocations. Other allocations that use __GFP_HIGH will benefit from a few different extra privileges. This affects: xen, dm, md, ntfs3 the vermillion frame buffer hibernation ksm swap all of which likely produce more benefit than cost if these selected allocation are more likely to succeed quickly. [mgorman: Minor adjustments to rework on top of a series] Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31rxrpc: Change rx_packet tracepoint to display securityIndex not type twiceDavid Howells
Change the rx_packet tracepoint to display the securityIndex from the packet header instead of displaying the type in numeric form. There's no need for the latter, as the display of the type in symbolic form will fall back automatically to displaying the hex value if no symbol is available. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-31rxrpc: Simplify ACK handlingDavid Howells
Now that general ACK transmission is done from the same thread as incoming DATA packet wrangling, there's no possibility that the SACK table will be being updated by the latter whilst the former is trying to copy it to an ACK. This means that we can safely rotate the SACK table whilst updating it without having to take a lock, rather than keeping all the bits inside it in fixed place and copying and then rotating it in the transmitter. Therefore, simplify SACK handing by keeping track of starting point in the ring and rotate slots down as we consume them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-31rxrpc: De-atomic call->ackr_window and call->ackr_nr_unackedDavid Howells
call->ackr_window doesn't need to be atomic as ACK generation and ACK transmission are now done in the same thread, so drop the atomic64 handling and split it into two separate members. Similarly, call->ackr_nr_unacked doesn't need to be atomic now either. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-31rxrpc: Generate extra pings for RTT during heavy-receive callDavid Howells
When doing a call that has a single transmitted data packet and a massive amount of received data packets, we only ping for one RTT sample, which means we don't get a good reading on it. Fix this by converting occasional IDLE ACKs into PING ACKs to elicit a response. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-31rxrpc: Shrink the tabulation in the rxrpc trace header a bitDavid Howells
Shrink the tabulation in the rxrpc trace header a bit to allow for fields with long type names that have been removed. Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31rxrpc: Remove whitespace before ')' in trace headerDavid Howells
Work around checkpatch warnings in the rxrpc trace header by removing whitespace before ')' on lines defining the trace record struct. Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31Merge v6.2-rc6 into drm-nextDaniel Vetter
Due to holidays we started -next with more -fixes in-flight than usual, and people have been asking where they are. Backmerge to get things better in sync. Conflicts: - Tiny conflict in drm_fbdev_generic.c between variable rename and missing error handling that got added. - Conflict in drm_fb_helper.c between the added call to vgaswitcheroo in drm_fb_helper_single_fb_probe and a refactor patch that extracted lots of helpers and incidentally removed the dev local variable. Readd it to make things compile. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-01-30f2fs: introduce trace_f2fs_replace_atomic_write_blockChao Yu
Commit 3db1de0e582c ("f2fs: change the current atomic write way") removed old tracepoints, but it missed to add new one, this patch fixes to introduce trace_f2fs_replace_atomic_write_block to trace atomic_write commit flow. Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30rxrpc: Fix trace stringDavid Howells
Fix a trace string to indicate that it's discarding the local endpoint for a preallocated peer, not a preallocated connection. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-26habanalabs: define events to trace PCI LBW accessOhad Sharabi
There are cases where it may be useful to dump the whole LBW configs. Yet, doing so while spamming the kernel log will probably shade other important messages since the LBW access is done in sheer volume. To answer this we add trace events for those too. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26habanalabs: define traces for COMMS protocolOhad Sharabi
As the COMMS protocol is being used more widely in our driver, an available debug tool for the handshake will be handy. This commit defines tracepoints to various key points of the COMMS protocol. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-25bpf/tracing: Use stage6 of tracing to not duplicate macrosSteven Rostedt (Google)
The bpf events are created by the same macro magic as tracefs trace events are. But to hook into bpf, it has its own code. It duplicates many of the same macros as the tracefs macros and this is an issue because it misses bug fixes as well as any new enhancements that come with the other trace macros. As the trace macros have been put into their own staging files, have bpf take advantage of this and use the tracefs stage 6 macros that the "fast ssign" portion of the trace event macro uses. Link: https://lkml.kernel.org/r/20230124202515.873075730@goodmis.org Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/ Cc: bpf@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25perf/tracing: Use stage6 of tracing to not duplicate macrosSteven Rostedt (Google)
The perf events are created by the same macro magic as tracefs trace events are. But to hook into perf, it has its own code. It duplicates many of the same macros as the tracefs macros and this is an issue because it misses bug fixes as well as any new enhancements that come with the other trace macros. As the trace macros have been put into their own staging files, have perf take advantage of this and use the tracefs stage 6 macros that the "fast assign" portion of the trace event macro uses. Link: https://lkml.kernel.org/r/20230124202515.716458410@goodmis.org Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/ Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-24Merge tag 'scmi-updates-6.3' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm SCMI updates for v6.3 The main addition is a unified userspace interface for SCMI irrespective of the underlying transport and along with some changed to refactor the SCMI stack probing sequence. 1. SCMI unified userspace interface This is to have a unified way of testing an SCMI platform firmware implementation for compliance, fuzzing etc., from the perspective of the non-secure OSPM irrespective of the underlying transport supporting SCMI. It is just for testing/development and not a feature intended fo use in production. Currently an SCMI Compliance Suite[1] can only work by injecting SCMI messages using the mailbox test driver only which makes it transport specific and can't be used with any other transport like virtio, smc/hvc, optee, etc. Also the shared memory can be transport specific and it is better to even abstract/hide those details while providing the userspace access. So in order to scale with any transport, we need a unified interface for the same. In order to achieve that, SCMI "raw mode support" is being added through debugfs which is more configurable as well. A userspace application can inject bare SCMI binary messages into the SCMI core stack; such messages will be routed by the SCMI regular kernel stack to the backend platform firmware using the configured transport transparently. This eliminates the to know about the specific underlying transport internals that will be taken care of by the SCMI core stack itself. Further no additional changes needed in the device tree like in the mailbox-test driver. [1] https://gitlab.arm.com/tests/scmi-tests 2. Refactoring of the SCMI stack probing sequence On some platforms, SCMI transport can be provide by OPTEE/TEE which introduces certain dependency in the probe ordering. In order to address the same, the SCMI bus is split into its own module which continues to be initialized at subsys_initcall, while the SCMI core stack, including its various transport backends (like optee, mailbox, virtio, smc), is now moved into a separate module at module_init level. This allows the other possibly dependent subsystems to register and/or access SCMI bus well before the core SCMI stack and its dependent transport backends. * tag 'scmi-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (31 commits) firmware: arm_scmi: Clarify raw per-channel ABI documentation firmware: arm_scmi: Add per-channel raw injection support firmware: arm_scmi: Add the raw mode co-existence support firmware: arm_scmi: Call raw mode hooks from the core stack firmware: arm_scmi: Reject SCMI drivers when configured in raw mode firmware: arm_scmi: Add debugfs ABI documentation for raw mode firmware: arm_scmi: Add core raw transmission support firmware: arm_scmi: Add debugfs ABI documentation for common entries firmware: arm_scmi: Populate a common SCMI debugfs root debugfs: Export debugfs_create_str symbol include: trace: Add platform and channel instance references firmware: arm_scmi: Add internal platform/channel identifiers firmware: arm_scmi: Move errors defs and code to common.h firmware: arm_scmi: Add xfer helpers to provide raw access firmware: arm_scmi: Add flags field to xfer firmware: arm_scmi: Refactor scmi_wait_for_message_response firmware: arm_scmi: Refactor polling helpers firmware: arm_scmi: Refactor xfer in-flight registration routines firmware: arm_scmi: Split bus and driver into distinct modules firmware: arm_scmi: Introduce a new lifecycle for protocol devices ... Link: https://lore.kernel.org/r/20230120162152.1438456-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-23net/sock: Introduce trace_sk_data_ready()Peilin Ye
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready() callback implementations. For example: <...> iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable <...> Suggested-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-20include: trace: Add platform and channel instance referencesCristian Marussi
Add the channel and platform instance indentifier to SCMI message dump traces in order to easily associate message flows to specific transport channels. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20230118121426.492864-9-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-18cma: tracing: print alloc result in trace_cma_alloc_finishWenchao Hao
The result of the allocation attempt is not printed in trace_cma_alloc_finish, but it's important to do it so we can set filters to catch specific errors on allocation or to trigger some operations on specific errors. We have printed the result in log, but the log is conditional and could not be filtered by tracing events. It introduces little overhead to print this result. The result of allocation is named `errorno' in the trace. Link: https://lkml.kernel.org/r/20221208142130.1501195-1-haowenchao@huawei.com Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-13iommu: Remove detach_dev callbackLu Baolu
The detach_dev callback of domain ops is not called in the IOMMU core. Remove this callback to avoid dead code. The trace event for detaching domain from device is removed accordingly. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13sock: add tracepoint for send recv lengthYunhui Cui
Add 2 tracepoints to monitor the tcp/udp traffic of per process and per cgroup. Regarding monitoring the tcp/udp traffic of each process, there are two existing solutions, the first one is https://www.atoptool.nl/netatop.php. The second is via kprobe/kretprobe. Netatop solution is implemented by registering the hook function at the hook point provided by the netfilter framework. These hook functions may be in the soft interrupt context and cannot directly obtain the pid. Some data structures are added to bind packets and processes. For example, struct taskinfobucket, struct taskinfo ... Every time the process sends and receives packets it needs multiple hashmaps,resulting in low performance and it has the problem fo inaccurate tcp/udp traffic statistics(for example: multiple threads share sockets). We can obtain the information with kretprobe, but as we know, kprobe gets the result by trappig in an exception, which loses performance compared to tracepoint. We compared the performance of tracepoints with the above two methods, and the results are as follows: ab -n 1000000 -c 1000 -r http://127.0.0.1/index.html without trace: Time per request: 39.660 [ms] (mean) Time per request: 0.040 [ms] (mean, across all concurrent requests) netatop: Time per request: 50.717 [ms] (mean) Time per request: 0.051 [ms] (mean, across all concurrent requests) kr: Time per request: 43.168 [ms] (mean) Time per request: 0.043 [ms] (mean, across all concurrent requests) tracepoint: Time per request: 41.004 [ms] (mean) Time per request: 0.041 [ms] (mean, across all concurrent requests It can be seen that tracepoint has better performance. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Signed-off-by: Xiongchun Duan <duanxiongchun@bytedance.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11f2fs: support accounting iostat count and avg_bytesYangtao Li
Previously, we supported to account iostat io_bytes, in this patch, it adds to account iostat count and avg_bytes: time: 1671648667 io_bytes count avg_bytes [WRITE] app buffered data: 31 2 15 Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-06f2fs: remove the create argument to f2fs_map_blocksChristoph Hellwig
The create argument is always identicaly to map->m_may_create, so use that consistently. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-06rxrpc: Move client call connection to the I/O threadDavid Howells
Move the connection setup of client calls to the I/O thread so that a whole load of locking and barrierage can be eliminated. This necessitates the app thread waiting for connection to complete before it can begin encrypting data. This also completes the fix for a race that exists between call connection and call disconnection whereby the data transmission code adds the call to the peer error distribution list after the call has been disconnected (say by the rxrpc socket getting closed). The fix is to complete the process of moving call connection, data transmission and call disconnection into the I/O thread and thus forcibly serialising them. Note that the issue may predate the overhaul to an I/O thread model that were included in the merge window for v6.2, but the timing is very much changed by the change given below. Fixes: cf37b5987508 ("rxrpc: Move DATA transmission into call processor work item") Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Set up a connection bundle from a call, not rxrpc_conn_parametersDavid Howells
Use the information now stored in struct rxrpc_call to configure the connection bundle and thence the connection, rather than using the rxrpc_conn_parameters struct. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Offload the completion of service conn security to the I/O threadDavid Howells
Offload the completion of the challenge/response cycle on a service connection to the I/O thread. After the RESPONSE packet has been successfully decrypted and verified by the work queue, offloading the changing of the call states to the I/O thread makes iteration over the conn's channel list simpler. Do this by marking the RESPONSE skbuff and putting it onto the receive queue for the I/O thread to collect. We put it on the front of the queue as we've already received the packet for it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Tidy up abort generation infrastructureDavid Howells
Tidy up the abort generation infrastructure in the following ways: (1) Create an enum and string mapping table to list the reasons an abort might be generated in tracing. (2) Replace the 3-char string with the values from (1) in the places that use that to log the abort source. This gets rid of a memcpy() in the tracepoint. (3) Subsume the rxrpc_rx_eproto tracepoint with the rxrpc_abort tracepoint and use values from (1) to indicate the trace reason. (4) Always make a call to an abort function at the point of the abort rather than stashing the values into variables and using goto to get to a place where it reported. The C optimiser will collapse the calls together as appropriate. The abort functions return a value that can be returned directly if appropriate. Note that this extends into afs also at the points where that generates an abort. To aid with this, the afs sources need to #define RXRPC_TRACE_ONLY_DEFINE_ENUMS before including the rxrpc tracing header because they don't have access to the rxrpc internal structures that some of the tracepoints make use of. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Clean up connection abortDavid Howells
Clean up connection abort, using the connection state_lock to gate access to change that state, and use an rxrpc_call_completion value to indicate the difference between local and remote aborts as these can be pasted directly into the call state. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Implement a mechanism to send an event notification to a connectionDavid Howells
Provide a means by which an event notification can be sent to a connection through such that the I/O thread can pick it up and handle it rather than doing it in a separate workqueue. This is then used to move the deferred final ACK of a call into the I/O thread rather than a separate work queue as part of the drive to do all transmission from the I/O thread. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Only disconnect calls in the I/O threadDavid Howells
Only perform call disconnection in the I/O thread to reduce the locking requirement. This is the first part of a fix for a race that exists between call connection and call disconnection whereby the data transmission code adds the call to the peer error distribution list after the call has been disconnected (say by the rxrpc socket getting closed). The fix is to complete the process of moving call connection, data transmission and call disconnection into the I/O thread and thus forcibly serialising them. Note that the issue may predate the overhaul to an I/O thread model that were included in the merge window for v6.2, but the timing is very much changed by the change given below. Fixes: cf37b5987508 ("rxrpc: Move DATA transmission into call processor work item") Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Only set/transmit aborts in the I/O threadDavid Howells
Only set the abort call completion state in the I/O thread and only transmit ABORT packets from there. rxrpc_abort_call() can then be made to actually send the packet. Further, ABORT packets should only be sent if the call has been exposed to the network (ie. at least one attempted DATA transmission has occurred for it). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-06rxrpc: Make the local endpoint hold a ref on a connected callDavid Howells
Make the local endpoint and it's I/O thread hold a reference on a connected call until that call is disconnected. Without this, we're reliant on either the AF_RXRPC socket to hold a ref (which is dropped when the call is released) or a queued work item to hold a ref (the work item is being replaced with the I/O thread). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-05Merge tag 'net-6.2-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, wifi, and netfilter. Current release - regressions: - bpf: fix nullness propagation for reg to reg comparisons, avoid null-deref - inet: control sockets should not use current thread task_frag - bpf: always use maximal size for copy_array() - eth: bnxt_en: don't link netdev to a devlink port for VFs Current release - new code bugs: - rxrpc: fix a couple of potential use-after-frees - netfilter: conntrack: fix IPv6 exthdr error check - wifi: iwlwifi: fw: skip PPAG for JF, avoid FW crashes - eth: dsa: qca8k: various fixes for the in-band register access - eth: nfp: fix schedule in atomic context when sync mc address - eth: renesas: rswitch: fix getting mac address from device tree - mobile: ipa: use proper endpoint mask for suspend Previous releases - regressions: - tcp: add TIME_WAIT sockets in bhash2, fix regression caught by Jiri / python tests - net: tc: don't intepret cls results when asked to drop, fix oob-access - vrf: determine the dst using the original ifindex for multicast - eth: bnxt_en: - fix XDP RX path if BPF adjusted packet length - fix HDS (header placement) and jumbo thresholds for RX packets - eth: ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf, avoid memory corruptions Previous releases - always broken: - ulp: prevent ULP without clone op from entering the LISTEN status - veth: fix race with AF_XDP exposing old or uninitialized descriptors - bpf: - pull before calling skb_postpull_rcsum() (fix checksum support and avoid a WARN()) - fix panic due to wrong pageattr of im->image (when livepatch and kretfunc coexist) - keep a reference to the mm, in case the task is dead - mptcp: fix deadlock in fastopen error path - netfilter: - nf_tables: perform type checking for existing sets - nf_tables: honor set timeout and garbage collection updates - ipset: fix hash:net,port,net hang with /0 subnet - ipset: avoid hung task warning when adding/deleting entries - selftests: net: - fix cmsg_so_mark.sh test hang on non-x86 systems - fix the arp_ndisc_evict_nocarrier test for IPv6 - usb: rndis_host: secure rndis_query check against int overflow - eth: r8169: fix dmar pte write access during suspend/resume with WOL - eth: lan966x: fix configuration of the PCS - eth: sparx5: fix reading of the MAC address - eth: qed: allow sleep in qed_mcp_trace_dump() - eth: hns3: - fix interrupts re-initialization after VF FLR - fix handling of promisc when MAC addr table gets full - refine the handling for VF heartbeat - eth: mlx5: - properly handle ingress QinQ-tagged packets on VST - fix io_eq_size and event_eq_size params validation on big endian - fix RoCE setting at HCA level if not supported at all - don't turn CQE compression on by default for IPoIB - eth: ena: - fix toeplitz initial hash key value - account for the number of XDP-processed bytes in interface stats - fix rx_copybreak value update Misc: - ethtool: harden phy stat handling against buggy drivers - docs: netdev: convert maintainer's doc from FAQ to a normal document" * tag 'net-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits) caif: fix memory leak in cfctrl_linkup_request() inet: control sockets should not use current thread task_frag net/ulp: prevent ULP without clone op from entering the LISTEN status qed: allow sleep in qed_mcp_trace_dump() MAINTAINERS: Update maintainers for ptp_vmw driver usb: rndis_host: Secure rndis_query check against int overflow net: dpaa: Fix dtsec check for PCS availability octeontx2-pf: Fix lmtst ID used in aura free drivers/net/bonding/bond_3ad: return when there's no aggregator netfilter: ipset: Rework long task execution when adding/deleting entries netfilter: ipset: fix hash:net,port,net hang with /0 subnet net: sparx5: Fix reading of the MAC address vxlan: Fix memory leaks in error path net: sched: htb: fix htb_classify() kernel-doc net: sched: cbq: dont intepret cls results when asked to drop net: sched: atm: dont intepret cls results when asked to drop dt-bindings: net: marvell,orion-mdio: Fix examples dt-bindings: net: sun8i-emac: Add phy-supply property net: ipa: use proper endpoint mask for suspend selftests: net: return non-zero for failures reported in arp_ndisc_evict_nocarrier ...
2023-01-04cxl/pci: Move tracepoint definitions to drivers/cxl/core/Dan Williams
CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This also organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders, however that would also entail a move away from the expander-specific cxl_dev_state context, save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system, however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-02Merge tag 'for-6.2-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "First batch of regression and regular fixes: - regressions: - fix error handling after conversion to qstr for paths - fix raid56/scrub recovery caused by uninitialized variable after conversion to error bitmaps - restore qgroup backref lookup behaviour after recent refactoring - fix leak of device lists at module exit time - fix resolving backrefs for inline extent followed by prealloc - reset defrag ioctl buffer on memory allocation error" * tag 'for-6.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix fscrypt name leak after failure to join log transaction btrfs: scrub: fix uninitialized return value in recover_scrub_rbio btrfs: fix resolving backrefs for inline extent followed by prealloc btrfs: fix trace event name typo for FLUSH_DELAYED_REFS btrfs: restore BTRFS_SEQ_LAST when looking up qgroup backref lookup btrfs: fix leak of fs devices after removing btrfs module btrfs: fix an error handling path in btrfs_defrag_leaves() btrfs: fix an error handling path in btrfs_rename()
2022-12-28rxrpc: Fix a couple of potential use-after-freesDavid Howells
At the end of rxrpc_recvmsg(), if a call is found, the call is put and then a trace line is emitted referencing that call in a couple of places - but the call may have been deallocated by the time those traces happen. Fix this by stashing the call debug_id in a variable and passing that to the tracepoint rather than the call pointer. Fixes: 849979051cbc ("rxrpc: Add a tracepoint to follow what recvmsg does") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-27tracing/rseq: Add mm_cid field to rseq_updateMathieu Desnoyers
Add the mm_cid field to the rseq_update event, allowing tracers to follow which mm_cid is observed by user-space, and whether negative mm_cid values are visible in case of internal scheduler implementation issues. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221122203932.231377-22-mathieu.desnoyers@efficios.com
2022-12-27rseq: Extend struct rseq with numa node idMathieu Desnoyers
Adding the NUMA node id to struct rseq is a straightforward thing to do, and a good way to figure out if anything in the user-space ecosystem prevents extending struct rseq. This NUMA node id field allows memory allocators such as tcmalloc to take advantage of fast access to the current NUMA node id to perform NUMA-aware memory allocation. It can also be useful for implementing fast-paths for NUMA-aware user-space mutexes. It also allows implementing getcpu(2) purely in user-space. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221122203932.231377-5-mathieu.desnoyers@efficios.com
2022-12-21Merge tag 'pwm/for-6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "Various changes across the board, mostly improvements and cleanups" * tag 'pwm/for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (42 commits) pwm: pca9685: Convert to i2c's .probe_new() pwm: sun4i: Propagate errors in .get_state() to the caller pwm: Handle .get_state() failures pwm: sprd: Propagate errors in .get_state() to the caller pwm: rockchip: Propagate errors in .get_state() to the caller pwm: mtk-disp: Propagate errors in .get_state() to the caller pwm: imx27: Propagate errors in .get_state() to the caller pwm: cros-ec: Propagate errors in .get_state() to the caller pwm: crc: Propagate errors in .get_state() to the caller leds: qcom-lpg: Propagate errors in .get_state() to the caller drm/bridge: ti-sn65dsi86: Propagate errors in .get_state() to the caller pwm/tracing: Also record trace events for failed API calls pwm: Make .get_state() callback return an error code pwm: pxa: Enable for MMP platform pwm: pxa: Add reference manual link and limitations pwm: pxa: Use abrupt shutdown mode pwm: pxa: Remove clk enable/disable from pxa_pwm_config pwm: pxa: Set duty cycle to 0 when disabling PWM pwm: pxa: Remove pxa_pwm_enable/disable pwm: mediatek: Add support for MT7986 ...
2022-12-21Merge tag 'net-6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, netfilter and can. Current release - regressions: - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func - rxrpc: - fix security setting propagation - fix null-deref in rxrpc_unuse_local() - fix switched parameters in peer tracing Current release - new code bugs: - rxrpc: - fix I/O thread startup getting skipped - fix locking issues in rxrpc_put_peer_locked() - fix I/O thread stop - fix uninitialised variable in rxperf server - fix the return value of rxrpc_new_incoming_call() - microchip: vcap: fix initialization of value and mask - nfp: fix unaligned io read of capabilities word Previous releases - regressions: - stop in-kernel socket users from corrupting socket's task_frag - stream: purge sk_error_queue in sk_stream_kill_queues() - openvswitch: fix flow lookup to use unmasked key - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port() - devlink: - hold region lock when flushing snapshots - protect devlink dump by the instance lock Previous releases - always broken: - bpf: - prevent leak of lsm program after failed attach - resolve fext program type when checking map compatibility - skbuff: account for tail adjustment during pull operations - macsec: fix net device access prior to holding a lock - bonding: switch back when high prio link up - netfilter: flowtable: really fix NAT IPv6 offload - enetc: avoid buffer leaks on xdp_do_redirect() failure - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg() - dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq" * tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits) net: fec: check the return value of build_skb() net: simplify sk_page_frag Treewide: Stop corrupting socket's task_frag net: Introduce sk_use_task_frag in struct sock. mctp: Remove device type check at unregister net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len can: flexcan: avoid unbalanced pm_runtime_enable warning Documentation: devlink: add missing toc entry for etas_es58x devlink doc mctp: serial: Fix starting value for frame check sequence nfp: fix unaligned io read of capabilities word net: stream: purge sk_error_queue in sk_stream_kill_queues() myri10ge: Fix an error handling path in myri10ge_probe() net: microchip: vcap: Fix initialization of value and mask rxrpc: Fix the return value of rxrpc_new_incoming_call() rxrpc: rxperf: Fix uninitialised variable rxrpc: Fix I/O thread stop rxrpc: Fix switched parameters in peer tracing rxrpc: Fix locking issues in rxrpc_put_peer_locked() rxrpc: Fix I/O thread startup getting skipped ...
2022-12-20Merge tag 'asm-generic-6.2-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are only three fairly simple patches. The #include change to linux/swab.h addresses a userspace build issue, and the change to the mmio tracing logic helps provide more useful traces" * tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info include/uapi/linux/swab: Fix potentially missing __always_inline
2022-12-19rxrpc: Fix switched parameters in peer tracingDavid Howells
Fix the switched parameters on rxrpc_alloc_peer() and rxrpc_get_peer(). The ref argument and the why argument got mixed. Fixes: 47c810a79844 ("rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracing") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-15Merge tag 'trace-v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Add options to the osnoise tracer: - 'panic_on_stop' option that panics the kernel if osnoise is greater than some user defined threshold. - 'preempt' option, to test noise while preemption is disabled - 'irq' option, to test noise when interrupts are disabled - Add .percent and .graph suffix to histograms to give different outputs - Add nohitcount to disable showing hitcount in histogram output - Add new __cpumask() to trace event fields to annotate that a unsigned long array is a cpumask to user space and should be treated as one. - Add trace_trigger kernel command line parameter to enable trace event triggers at boot up. Useful to trace stack traces, disable tracing and take snapshots. - Fix x86/kmmio mmio tracer to work with the updates to lockdep - Unify the panic and die notifiers - Add back ftrace_expect reference that is used to extract more information in the ftrace_bug() code. - Have trigger filter parsing errors show up in the tracing error log. - Updated MAINTAINERS file to add kernel tracing mailing list and patchwork info - Use IDA to keep track of event type numbers. - And minor fixes and clean ups * tag 'trace-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (44 commits) tracing: Fix cpumask() example typo tracing: Improve panic/die notifiers ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY kernels tracing: Do not synchronize freeing of trigger filter on boot up tracing: Remove pointer (asterisk) and brackets from cpumask_t field tracing: Have trigger filter parsing errors show up in error_log x86/mm/kmmio: Remove redundant preempt_disable() tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line Documentation/osnoise: Add osnoise/options documentation tracing/osnoise: Add preempt and/or irq disabled options tracing/osnoise: Add PANIC_ON_STOP option Documentation/osnoise: Escape underscore of NO_ prefix tracing: Fix some checker warnings tracing/osnoise: Make osnoise_options static tracing: remove unnecessary trace_trigger ifdef ring-buffer: Handle resize in early boot up tracing/hist: Fix issue of losting command info in error_log tracing: Fix issue of missing one synthetic field tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx' tracing/hist: Fix wrong return value in parse_action_params() ...