summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-10-10bpf: Update bpf_override_return() commentMartin Kelly
The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only on x86. This was presumably true at the time of writing, but it's now supported on many other architectures too. Drop this statement, since it's not correct anymore and it fits better in other documentation anyway. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc3). No conflicts and no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11Merge tag 'drm-misc-next-2024-10-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.13: UAPI Changes: - Add drm fdinfo support to panthor, and add sysfs knob to toggle. Cross-subsystem Changes: - Convert fbdev drivers to use backlight power constants. - Some small dma-fence fixes. - Some kernel-doc fixes. Core Changes: - Small drm client fixes. - Document requirements that you need to file a bug before marking a test as flaky. - Remove swapped and pinned bo's from TTM lru list. Driver Changes: - Assorted small fixes to panel/elida-kd35t133, nouveau, vc4, imx. - Fix some bridges to drop cached edids on power off. - Add Jenson BL-JT60050-01A, Samsung s6e3ha8 & AMS639RQ08 panels. - Make 180° rotation work on ilitek-ili9881c, even for already-rotated panels. - Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # drivers/gpu/drm/panthor/panthor_drv.c From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/8dc111ca-d20c-4e0d-856e-c12d208cbf2a@linux.intel.com
2024-10-10Merge tag 'net-6.12-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Current release - regressions: - dsa: sja1105: fix reception from VLAN-unaware bridges - Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled" - eth: fec: don't save PTP state if PTP is unsupported Current release - new code bugs: - smc: fix lack of icsk_syn_mss with IPPROTO_SMC, prevent null-deref - eth: airoha: update Tx CPU DMA ring idx at the end of xmit loop - phy: aquantia: AQR115c fix up PMA capabilities Previous releases - regressions: - tcp: 3 fixes for retrans_stamp and undo logic Previous releases - always broken: - net: do not delay dst_entries_add() in dst_release() - netfilter: restrict xtables extensions to families that are safe, syzbot found a way to combine ebtables with extensions that are never used by userspace tools - sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start - mptcp: handle consistently DSS corruption, and prevent corruption due to large pmtu xmit" * tag 'net-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) MAINTAINERS: Add headers and mailing list to UDP section MAINTAINERS: consistently exclude wireless files from NETWORKING [GENERAL] slip: make slhc_remember() more robust against malicious packets net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC ppp: fix ppp_async_encode() illegal access docs: netdev: document guidance on cleanup patches phonet: Handle error of rtnl_register_module(). mpls: Handle error of rtnl_register_module(). mctp: Handle error of rtnl_register_module(). bridge: Handle error of rtnl_register_module(). vxlan: Handle error of rtnl_register_module(). rtnetlink: Add bulk registration helpers for rtnetlink message handlers. net: do not delay dst_entries_add() in dst_release() mptcp: pm: do not remove closing subflows mptcp: fallback when MPTCP opts are dropped after 1st data tcp: fix mptcp DSS corruption due to large pmtu xmit mptcp: handle consistently DSS corruption net: netconsole: fix wrong warning net: dsa: refuse cross-chip mirroring operations net: fec: don't save PTP state if PTP is unsupported ...
2024-10-10clocksource: Remove unused clocksource_change_ratingDr. David Alan Gilbert
clocksource_change_rating() has been unused since 2017's commit 63ed4e0c67df ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code") Remove it. __clocksource_change_rating now only has one use which is ifdef'd. Move it into the ifdef'd section. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241010135446.213098-1-linux@treblig.org
2024-10-10fgraph: Simplify return address printing in function graph tracerMasami Hiramatsu (Google)
Simplify return address printing in the function graph tracer by removing fgraph_extras. Since this feature is only used by the function graph tracer and the feature flags can directly accessible from the function graph tracer, fgraph_extras can be removed from the fgraph callback. Cc: Donglin Peng <dolinux.peng@gmail.com> Link: https://lore.kernel.org/172857234900.270774.15378354017601069781.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-10bpf: fix argument type in bpf_loop documentationMatteo Croce
The `index` argument to bpf_loop() is threaded as an u64. This lead in a subtle verifier denial where clang cloned the argument in another register[1]. [1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895 Signed-off-by: Matteo Croce <teknoraver@meta.com> Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-10Merge branch 'net-introduce-tx-h-w-shaping-api'Jakub Kicinski
Paolo Abeni says: ==================== net: introduce TX H/W shaping API We have a plurality of shaping-related drivers API, but none flexible enough to meet existing demand from vendors[1]. This series introduces new device APIs to configure in a flexible way TX H/W shaping. The new functionalities are exposed via a newly defined generic netlink interface and include introspection capabilities. Some self-tests are included, on top of a dummy netdevsim implementation. Finally a basic implementation for the iavf driver is provided. Some usage examples: * Configure shaping on a given queue: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ --do set --json '{"ifindex": '$IFINDEX', "shaper": {"handle": {"scope": "queue", "id":'$QUEUEID'}, "bw-max": 2000000}}' * Container B/W sharing The orchestration infrastructure wants to group the container-related queues under a RR scheduling and limit the aggregate bandwidth: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ --do group --json '{"ifindex": '$IFINDEX', "leaves": [ {"handle": {"scope": "queue", "id":'$QID1'}, "weight": '$W1'}, {"handle": {"scope": "queue", "id":'$QID2'}, "weight": '$W2'}], {"handle": {"scope": "queue", "id":'$QID3'}, "weight": '$W3'}], "handle": {"scope":"node"}, "bw-max": 10000000}' {'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 0}} Q1 \ \ Q2 -- node 0 ------- netdev / (bw-max: 10M) Q3 / * Delegation A containers wants to limit the aggregate B/W bandwidth of 2 of the 3 queues it owns - the starting configuration is the one from the previous point: SPEC=Documentation/netlink/specs/net_shaper.yaml ./tools/net/ynl/cli.py --spec $SPEC \ --do group --json '{"ifindex": '$IFINDEX', "leaves": [ {"handle": {"scope": "queue", "id":'$QID1'}, "weight": '$W1'}, {"handle": {"scope": "queue", "id":'$QID2'}, "weight": '$W2'}], "handle": {"scope": "node"}, "bw-max": 5000000 }' {'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 1}} Q1 -- node 1 --------\ / (bw-max: 5M) \ Q2 / node 0 ------- netdev /(bw-max: 10M) Q3 ------------------/ In a group operation, when prior to the op itself, the leaves have different parents, the user must specify the parent handle for the group. I.e., starting from the previous config: ./tools/net/ynl/cli.py --spec $SPEC \ --do group --json '{"ifindex": '$IFINDEX', "leaves": [ {"handle": {"scope": "queue", "id":'$QID1'}, "weight": '$W1'}, {"handle": {"scope": "queue", "id":'$QID3'}, "weight": '$W3'}], "handle": {"scope": "node"}, "bw-max": 3000000 }' Netlink error: Invalid argument nl_len = 96 (80) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'All the leaves shapers must have the same old parent'} ./tools/net/ynl/cli.py --spec $SPEC \ --do group --json '{"ifindex": '$IFINDEX', "leaves": [ {"handle": {"scope": "queue", "id":'$QID1'}, "weight": '$W1'}, {"handle": {"scope": "queue", "id":'$QID3'}, "weight": '$W3'}], "handle": {"scope": "node"}, "parent": {"scope": "node", "id": 1}, "bw-max": 3000000 } {'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 2}} Q1 -- node 2 --- /(bw-max:3M)\ Q3 / \ ---- node 1 \ / (bw-max: 5M)\ Q2 node 0 ------- netdev (bw-max: 10M) * Cleanup: Still starting from config 1To delete a single queue shaper ./tools/net/ynl/cli.py --spec $SPEC --do delete --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "queue", "id":'$QID3'}}' Q1 -- node 2 --- (bw-max:3M)\ \ ---- node 1 \ / (bw-max: 5M)\ Q2 node 0 ------- netdev (bw-max: 10M) Deleting a node shaper relinks all its leaves to the node's parent: ./tools/net/ynl/cli.py --spec $SPEC --do delete --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "node", "id":2}}' Q1 ---\ \ node 1----- \ / (bw-max: 5M)\ Q2----/ node 0 ------- netdev (bw-max: 10M) Deleting the last shaper under a node shaper deletes the node, too: ./tools/net/ynl/cli.py --spec $SPEC --do delete --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "queue", "id":'$QID1'}}' ./tools/net/ynl/cli.py --spec $SPEC --do delete --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "queue", "id":'$QID2'}}' ./tools/net/ynl/cli.py --spec $SPEC --do get --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "node", "id": 1}}' Netlink error: No such file or directory nl_len = 44 (28) nl_flags = 0x300 nl_type = 2 error: -2 extack: {'bad-attr': '.handle'} Such delete recurses on parents that are left over with no leaves: ./tools/net/ynl/cli.py --spec $SPEC --do get --json \ '{"ifindex": '$IFINDEX', "handle": {"scope": "node", "id": 0}}' Netlink error: No such file or directory nl_len = 44 (28) nl_flags = 0x300 nl_type = 2 error: -2 extack: {'bad-attr': '.handle'} v8: https://lore.kernel.org/cover.1727704215.git.pabeni@redhat.com v7: https://lore.kernel.org/cover.1725919039.git.pabeni@redhat.com v6: https://lore.kernel.org/cover.1725457317.git.pabeni@redhat.com v5: https://lore.kernel.org/cover.1724944116.git.pabeni@redhat.com v4: https://lore.kernel.org/cover.1724165948.git.pabeni@redhat.com v3: https://lore.kernel.org/cover.1722357745.git.pabeni@redhat.com RFC v2: https://lore.kernel.org/cover.1721851988.git.pabeni@redhat.com RFC v1: https://lore.kernel.org/cover.1719518113.git.pabeni@redhat.com ==================== Link: https://patch.msgid.link/cover.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10virtchnl: support queue rate limit and quanta size configurationWenjun Wu
This patch adds new virtchnl opcodes and structures for rate limit and quanta size configuration, which include: 1. VIRTCHNL_OP_CONFIG_QUEUE_BW, to configure max bandwidth for each VF per queue. 2. VIRTCHNL_OP_CONFIG_QUANTA, to configure quanta size per queue. 3. VIRTCHNL_OP_GET_QOS_CAPS, VF queries current QoS configuration, such as enabled TCs, arbiter type, up2tc and bandwidth of VSI node. The configuration is previously set by DCB and PF, and now is the potential QoS capability of VF. VF can take it as reference to configure queue TC mapping. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/839002f7bd6f63b985a060a51b079f6e6dbbe237.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10netlink: spec: add shaper introspection supportPaolo Abeni
Allow the user-space to fine-grain query the shaping features supported by the NIC on each domain. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/3ddd10e450e3fe7d4b944c0d0b886d4483529ee6.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10net-shapers: implement NL get operationPaolo Abeni
Introduce the basic infrastructure to implement the net-shaper core functionality. Each network devices carries a net-shaper cache, the NL get() operation fetches the data from such cache. The cache is initially empty, will be fill by the set()/group() operation implemented later and is destroyed at device cleanup time. The net_shaper_fill_handle(), net_shaper_ctx_init(), and net_shaper_generic_pre() implementations handle generic index type attributes, despite the current caller always pass a constant value to avoid more noise in later patches using them with different attributes. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/ddd10fd645a9367803ad02fca4a5664ea5ace170.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10netlink: spec: add shaper YAML specPaolo Abeni
Define the user-space visible interface to query, configure and delete network shapers via yaml definition. Add dummy implementations for the relevant NL callbacks. set() and delete() operations touch a single shaper creating/updating or deleting it. The group() operation creates a shaper's group, nesting multiple input shapers under the specified output shaper. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/7a33a1ff370bdbcd0cd3f909575c912cd56f41da.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10genetlink: extend info user-storage to match NL cb ctxPaolo Abeni
This allows a more uniform implementation of non-dump and dump operations, and will be used later in the series to avoid some per-operation allocation. Additionally rename the NL_ASSERT_DUMP_CTX_FITS macro, to fit a more extended usage. Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/1130cc2896626b84587a2a5f96a5c6829638f4da.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10mctp: Handle error of rtnl_register_module().Kuniyuki Iwashima
Since introduced, mctp has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 583be982d934 ("mctp: Add device handling and netlink interface") Fixes: 831119f88781 ("mctp: Add neighbour netlink interface") Fixes: 06d2f4c583a7 ("mctp: Add netlink route management") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10rtnetlink: Add bulk registration helpers for rtnetlink message handlers.Kuniyuki Iwashima
Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"), once rtnl_msg_handlers[protocol] was allocated, the following rtnl_register_module() for the same protocol never failed. However, after the commit, rtnl_msg_handler[protocol][msgtype] needs to be allocated in each rtnl_register_module(), so each call could fail. Many callers of rtnl_register_module() do not handle the returned error, and we need to add many error handlings. To handle that easily, let's add wrapper functions for bulk registration of rtnetlink message handlers. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10OPP: Drop redundant *_opp_attach|detach_genpd()Ulf Hansson
All users of *_opp_attach|detach_genpd(), have been converted to use dev|devm_pm_domain_attach|detach_list(), hence let's drop it along with its corresponding exported functions. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-12-ulf.hansson@linaro.org
2024-10-10pmdomain: core: Set the required dev for a required OPP during genpd attachUlf Hansson
In the single PM domain case there is no need for platform code to specify the index of the corresponding required OPP in DT, as the index must be zero. This allows us to assign a required dev for the required OPP from genpd, while attaching a device to its PM domain. In this way, we can remove some of the genpd specific code in the OPP core for the single PM domain case. Although, this cleanup is made from a subsequent change. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-7-ulf.hansson@linaro.org
2024-10-10PM: domains: Support required OPPs in dev_pm_domain_attach_list()Ulf Hansson
In the multiple PM domain case we need platform code to specify the index of the corresponding required OPP in DT for a device, which is what *_opp_attach_genpd() is there to help us with. However, attaching a device to its PM domains is in general better done with dev_pm_domain_attach_list(). To avoid having two different ways to manage this and to prepare for the removal of *_opp_attach_genpd(), let's extend dev_pm_domain_attach|detach_list() to manage the required OPPs too. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-5-ulf.hansson@linaro.org
2024-10-10OPP: Rework _set_required_devs() to manage a single device per callUlf Hansson
At this point there are no consumer drivers that makes use of _set_required_devs(), hence it should be straightforward to rework the code to enable it to better integrate with the PM domain attach procedure. During attach, one device at the time is being hooked up to its corresponding PM domain. Therefore, let's update the _set_required_devs() to align to this behaviour, allowing callers to fill out one required_dev per call. Subsequent changes starts making use of this. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-4-ulf.hansson@linaro.org
2024-10-10regulator: max5970: Drop unused structsPatrick Rudolph
After splitting the max5970 into a MFD device clean the remaining code and drop unused structs. The struct max5970_data and enum max5970_chip_type aren't used. Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com> Acked-by: Lee Jones <lee@kernel.org> Link: https://patch.msgid.link/20241002125500.78278-1-patrick.rudolph@9elements.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-10ASoC: topology: Bump minimal topology ABI versionAmadeusz Sławiński
When v4 topology support was removed, minimal topology ABI version should have been bumped. Fixes: fe4a07454256 ("ASoC: Drop soc-topology ABI v4 support") Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Link: https://patch.msgid.link/20241009081230.304918-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-10net/mlx5: qos: Flesh out element_attributes in mlx5_ifc.hCosmin Ratiu
This is used for multiple purposes, depending on the scheduling element created. There are a few helper struct defined a long time ago, but they are not easy to find in the file and they are about to get new members. This commit cleans up this area a bit by: - moving the helper structs closer to where they are relevant. - defining a helper union to include all of them to help discoverability. - making use of it everywhere element_attributes is used. - using a consistent 'attr' name. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10net: Remove likely from l3mdev_master_ifindex_by_indexBreno Leitao
The likely() annotation in l3mdev_master_ifindex_by_index() has been found to be incorrect 100% of the time in real-world workloads (e.g., web servers). Annotated branches shows the following in these servers: correct incorrect % Function File Line 0 169053813 100 l3mdev_master_ifindex_by_index l3mdev.h 81 This is happening because l3mdev_master_ifindex_by_index() is called from __inet_check_established(), which calls l3mdev_master_ifindex_by_index() passing the socked bounded interface. l3mdev_master_ifindex_by_index(net, sk->sk_bound_dev_if); Since most sockets are not going to be bound to a network device, the likely() is giving the wrong assumption. Remove the likely() annotation to ensure more accurate branch prediction. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241008163205.3939629-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10Add dev_warn_probe() and improve error handling inMark Brown
Merge series from Dragan Simic <dsimic@manjaro.org>: This is a small series that introduces dev_warn_probe() function, which produces warnings on failed resource acquisitions, and improves error handling in the probe paths of Rockchip SPI drivers, by using functions dev_err_probe() and dev_warn_probe() properly in multiple places. This series also performs a bunch of small, rather trivial code cleanups, to make the code neater and a bit easier to read.
2024-10-10Merge branch 'for-v6.13/clk-dt-bindings' into next/clkKrzysztof Kozlowski
2024-10-10dt-bindings: clock: exynosautov920: add peric1, misc and hsi0/1 clock ↵Sunyeal Hong
definitions Add peric1, misc and hsi0/1 clock definitions. - CMU_PERIC1 for USI, IC2 and I3C - CMU_MISC for MISC, GIC and OTP - HSI0 for PCIE - HSI1 for USB and MMC Signed-off-by: Sunyeal Hong <sunyeal.hong@samsung.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241009042110.2379903-2-sunyeal.hong@samsung.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2024-10-10Merge patch series "timekeeping/fs: multigrain timestamp redux"Christian Brauner
Jeff Layton <jlayton@kernel.org> says: The VFS has always used coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide when to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. What we need is a way to only use fine-grained timestamps when they are being actively queried. Use the (unused) top bit in inode->i_ctime_nsec as a flag that indicates whether the current timestamps have been queried via stat() or the like. When it's set, we allow the kernel to use a fine-grained timestamp iff it's necessary to make the ctime show a different value. This solves the problem of being able to distinguish the timestamp between updates, but introduces a new problem: it's now possible for a file being changed to get a fine-grained timestamp. A file that is altered just a bit later can then get a coarse-grained one that appears older than the earlier fine-grained time. This violates timestamp ordering guarantees. To remedy this, keep a global monotonic atomic64_t value that acts as a timestamp floor. When we go to stamp a file, we first get the latter of the current floor value and the current coarse-grained time. If the inode ctime hasn't been queried then we just attempt to stamp it with that value. If it has been queried, then first see whether the current coarse time is later than the existing ctime. If it is, then we accept that value. If it isn't, then we get a fine-grained time and try to swap that into the global floor. Whether that succeeds or fails, we take the resulting floor time, convert it to realtime and try to swap that into the ctime. We take the result of the ctime swap whether it succeeds or fails, since either is just as valid. Filesystems can opt into this by setting the FS_MGTIME fstype flag. Others should be unaffected (other than being subject to the same floor value as multigrain filesystems). * patches from https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org: tmpfs: add support for multigrain timestamps btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps Documentation: add a new file documenting multigrain timestamps fs: add percpu counters for significant multigrain timestamp events fs: tracepoints around multigrain timestamp events fs: handle delegated timestamps in setattr_copy_mgtime fs: have setattr_copy handle multigrain timestamps appropriately fs: add infrastructure for multigrain timestamps Link: https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10fs: tracepoints around multigrain timestamp eventsJeff Layton
Add some tracepoints around various multigrain timestamp events. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20241002-mgtime-v10-6-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10fs: handle delegated timestamps in setattr_copy_mgtimeJeff Layton
An update to the inode ctime typically requires the latest clock value possible. The exception to this rule is when there is a nfsd write delegation and the server is proxying timestamps from the client. When nfsd gets a CB_GETATTR response, update the timestamp value in the inode to the values that the client is tracking. The client doesn't send a ctime value (since that's always determined by the exported filesystem), but it can send a mtime value. In the case where it does, update the ctime to a value commensurate with that instead of the current time. If ATTR_DELEG is set, then use ia_ctime value instead of setting the timestamp to the current time. With the addition of delegated timestamps, the server may receive a request to update only the atime, which doesn't involve a ctime update. Trust the ATTR_CTIME flag in the update and only update the ctime when it's set. Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20241002-mgtime-v10-5-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10timekeeping: Add percpu counter for tracking floor swap eventsJeff Layton
The mgtime_floor value is a global variable for tracking the latest fine-grained timestamp handed out. Because it's a global, track the number of times that a new floor value is assigned. Add a new percpu counter to the timekeeping code to track the number of floor swap events that have occurred. A later patch will add a debugfs file to display this counter alongside other stats involving multigrain timestamps. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Link: https://lore.kernel.org/all/20241002-mgtime-v10-2-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10timekeeping: Add interfaces for handling timestamps with a floor valueJeff Layton
Multigrain timestamps allow the kernel to use fine-grained timestamps when an inode's attributes is being actively observed via ->getattr(). With this support, it's possible for a file to get a fine-grained timestamp, and another modified after it to get a coarse-grained stamp that is earlier than the fine-grained time. If this happens then the files can appear to have been modified in reverse order, which breaks VFS ordering guarantees [1]. To prevent this, maintain a floor value for multigrain timestamps. Whenever a fine-grained timestamp is handed out, record it, and when later coarse-grained stamps are handed out, ensure they are not earlier than that value. If the coarse-grained timestamp is earlier than the fine-grained floor, return the floor value instead. Add a static singleton atomic64_t into timekeeper.c that is used to keep track of the latest fine-grained time ever handed out. This is tracked as a monotonic ktime_t value to ensure that it isn't affected by clock jumps. Because it is updated at different times than the rest of the timekeeper object, the floor value is managed independently of the timekeeper via a cmpxchg() operation, and sits on its own cacheline. Add two new public interfaces: - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the coarse-grained clock and the floor time - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries to swap it into the floor. A timespec64 is filled with the result. The floor value is global and updated via a single try_cmpxchg(). If that fails then the operation raced with a concurrent update. Any concurrent update must be later than the existing floor value, so any racing tasks can accept any resulting floor value without retrying. [1]: POSIX requires that files be stamped with realtime clock values, and makes no provision for dealing with backward clock jumps. If a backward realtime clock jump occurs, then files can appear to have been modified in reverse order. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/all/20241002-mgtime-v10-1-d1c4717f5284@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-09ipv4: Retire global IPv4 hash table inet_addr_lst.Kuniyuki Iwashima
No one uses inet_addr_lst anymore, so let's remove it. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-5-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Namespacify IPv4 address GC.Kuniyuki Iwashima
Each IPv4 address could have a lifetime, which is useful for DHCP, and GC is periodically executed as check_lifetime_work. check_lifetime() does the actual GC under RTNL. 1. Acquire RTNL 2. Iterate inet_addr_lst 3. Remove IPv4 address if expired 4. Release RTNL Namespacifying the GC is required for per-netns RTNL, but using the per-netns hash table will shorten the time on the hash bucket iteration under RTNL. Let's add per-netns GC work and use the per-netns hash table. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Link IPv4 address to per-netns hash table.Kuniyuki Iwashima
As a prep for per-netns RTNL conversion, we want to namespacify the IPv4 address hash table and the GC work. Let's allocate the per-netns IPv4 address hash table to net->ipv4.inet_addr_lst and link IPv4 addresses into it. The actual users will be converted later. Note that the IPv6 address hash table is already namespacified. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Fix misspelling of "accept*" in netAlexander Zubkov
Several files have "accept*" misspelled as "accpet*" in the comments. Fix all such occurrences. Signed-off-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008162756.22618-2-green@qrator.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Convert fib_validate_source() to dscp_t.Guillaume Nault
Pass a dscp_t variable to fib_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. All callers of fib_validate_source() already have a dscp_t variable to pass as parameter. We just need to remove the inet_dscp_to_dsfield() conversions. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/08612a4519bc5a3578bb493fbaad82437ebb73dc.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Convert ip_mc_validate_source() to dscp_t.Guillaume Nault
Pass a dscp_t variable to ip_mc_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Callers of ip_mc_validate_source() to consider are: * ip_route_input_mc() which already has a dscp_t variable to pass as parameter. We just need to remove the inet_dscp_to_dsfield() conversion. * udp_v4_early_demux() which gets the DSCP directly from the IPv4 header and can simply use the ip4h_dscp() helper. Also, stop including net/inet_dscp.h in udp.c as we don't use any of its declarations anymore. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/c91b2cca04718b7ee6cf5b9c1d5b40507d65a8d4.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Convert ip_route_use_hint() to dscp_t.Guillaume Nault
Pass a dscp_t variable to ip_route_use_hint(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only ip_rcv_finish_core() actually calls ip_route_use_hint(). Use the ip4h_dscp() helper to get the DSCP from the IPv4 header. While there, modify the declaration of ip_route_use_hint() in include/net/route.h so that it matches the prototype of its implementation in net/ipv4/route.c. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/c40994fdf804db7a363d04fdee01bf48dddda676.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09of: kunit: Extract some overlay boiler plate into macrosStephen Boyd
Make the lives of __of_overlay_apply_kunit() callers easier by extracting some of the boiler plate involved in referencing the DT overlays. Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Peng Fan <peng.fan@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240822002433.1163814-3-sboyd@kernel.org
2024-10-09clk: test: Add test managed of_clk_add_hw_provider()Stephen Boyd
Add a test managed version of of_clk_add_hw_provider() that automatically unregisters the clk_hw provider upon test conclusion. Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Peng Fan <peng.fan@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240822002433.1163814-2-sboyd@kernel.org
2024-10-09clk: Remove unused clk_hw_rate_is_protectedDr. David Alan Gilbert
clk_hw_rate_is_protected() was added in 2017's commit e55a839a7a1c ("clk: add clock protection mechanism to clk core") but has been unused. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241009003552.254675-1-linux@treblig.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-10-09tracing/bpf: Add might_fault check to syscall probesMathieu Desnoyers
Add a might_fault() check to validate that the bpf sys_enter/sys_exit probe callbacks are indeed called from a context where page faults can be handled. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-9-mathieu.desnoyers@efficios.com Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Andrii Nakryiko <andrii@kernel.org> # BPF parts Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing/perf: Add might_fault check to syscall probesMathieu Desnoyers
Add a might_fault() check to validate that the perf sys_enter/sys_exit probe callbacks are indeed called from a context where page faults can be handled. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-8-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing/ftrace: Add might_fault check to syscall probesMathieu Desnoyers
Add a might_fault() check to validate that the ftrace sys_enter/sys_exit probe callbacks are indeed called from a context where page faults can be handled. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-7-mathieu.desnoyers@efficios.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing: Allow system call tracepoints to handle page faultsMathieu Desnoyers
Use Tasks Trace RCU to protect iteration of system call enter/exit tracepoint probes to allow those probes to handle page faults. In preparation for this change, all tracers registering to system call enter/exit tracepoints should expect those to be called with preemption enabled. This allows tracers to fault-in userspace system call arguments such as path strings within their probe callbacks. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-6-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing/bpf: disable preemption in syscall probeMathieu Desnoyers
In preparation for allowing system call enter/exit instrumentation to handle page faults, make sure that bpf can handle this change by explicitly disabling preemption within the bpf system call tracepoint probes to respect the current expectations within bpf tracing code. This change does not yet allow bpf to take page faults per se within its probe, but allows its existing probes to adapt to the upcoming change. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-5-mathieu.desnoyers@efficios.com Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Andrii Nakryiko <andrii@kernel.org> # BPF parts Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing/perf: disable preemption in syscall probeMathieu Desnoyers
In preparation for allowing system call enter/exit instrumentation to handle page faults, make sure that perf can handle this change by explicitly disabling preemption within the perf system call tracepoint probes to respect the current expectations within perf ring buffer code. This change does not yet allow perf to take page faults per se within its probe, but allows its existing probes to adapt to the upcoming change. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-4-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing/ftrace: disable preemption in syscall probeMathieu Desnoyers
In preparation for allowing system call enter/exit instrumentation to handle page faults, make sure that ftrace can handle this change by explicitly disabling preemption within the ftrace system call tracepoint probes to respect the current expectations within ftrace ring buffer code. This change does not yet allow ftrace to take page faults per se within its probe, but allows its existing probes to adapt to the upcoming change. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-3-mathieu.desnoyers@efficios.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALLMathieu Desnoyers
In preparation for allowing system call tracepoints to handle page faults, introduce TRACE_EVENT_SYSCALL to declare the sys_enter/sys_exit tracepoints. Move the common code between __DECLARE_TRACE and __DECLARE_TRACE_SYSCALL into __DECLARE_TRACE_COMMON. This change is not meant to alter the generated code, and only prepares the following modifications. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-2-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-09closures: Add closure_wait_event_timeout()Kent Overstreet
Add a closure version of wait_event_timeout(), with the same semantics. The closure version is useful because unlike wait_event(), it allows blocking code to run in the conditional expression. Cc: Coly Li <colyli@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>