path: root/include
2023-01-26  fuse: add request extension  (Miklos Szeredi)
Will need to add supplementary groups to create messages, so add the general concept of a request extension. A request extension is appended to the end of the main request. It has a header indicating the size and type of the extension. The create security context (fuse_secctx_*) is similar to the generic request extension, so include that as well in a backward compatible manner. Add the total extension length to the request header. The offset of the extension block within the request can be calculated by: inh->len - inh->total_extlen * 8 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
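For illustration only (not code from the patch), the offset arithmetic above could be used roughly as follows, assuming inh points at the fuse_in_header at the start of the request and total_extlen counts 8-byte units:

    #include <linux/fuse.h>

    /*
     * Illustrative sketch: locate the appended request extension block.
     * Assumes inh->len is the total request length, as described above.
     */
    static void *fuse_req_extension(struct fuse_in_header *inh)
    {
    	if (!inh->total_extlen)
    		return NULL;	/* no extension appended */

    	return (char *)inh + inh->len - inh->total_extlen * 8;
    }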
2023-01-26  gpu: host1x: External timeout/cancellation for fences  (Mikko Perttunen)
Currently all fences have a 30 second timeout to ensure they are cleaned up if the fence never completes otherwise. However, this one size fits all solution doesn't actually fit in every case, such as syncpoint waiting where we want to be able to have timeouts longer than 30 seconds. As such, we want to be able to give control over fence cancellation to the caller (and maybe eventually get rid of the internal timeout altogether). Here we add this cancellation mechanism by essentially adding a function for entering the timeout path by function call, and changing the syncpoint wait function to use it. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2023-01-26  gpu: host1x: Implement job tracking using DMA fences  (Mikko Perttunen)
In anticipation of removal of the intr API, implement job tracking using DMA fences instead. The main two things about this are making cdma_update schedule the work since fence completion can now be called from interrupt context, and some complication in ensuring the callback is not running when we free the fence. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2023-01-26  net: ethtool: provide shims for stats aggregation helpers when CONFIG_ETHTOOL_NETLINK=n  (Vladimir Oltean)
ethtool_aggregate_*_stats() are implemented in net/ethtool/stats.c, a file which is compiled out when CONFIG_ETHTOOL_NETLINK=n. In order to avoid adding Kbuild dependencies from drivers (which call these helpers) on CONFIG_ETHTOOL_NETLINK, let's add some shim definitions which simply make the helpers dead code. This means the function prototypes should have been located in include/linux/ethtool_netlink.h rather than include/linux/ethtool.h. Fixes: 449c5459641a ("net: ethtool: add helpers for aggregate statistics") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230125110214.4127759-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26  Revert "gpiolib: of: Introduce hook for missing gpio-ranges"  (Andy Shevchenko)
This reverts commit 3550bba25d5587a701e6edf20e20984d2ee72c78. No users for this one, revert it for good. The ->add_pin_ranges() can be used instead. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20230113215352.44272-5-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-01-26  nfsd: remove fetch_iversion export operation  (Jeff Layton)
Now that the i_version counter is reported in struct kstat, there is no need for this export operation. Acked-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-01-26  vfs: plumb i_version handling into struct kstat  (Jeff Layton)
The NFS server has a lot of special handling for different types of change attribute access, depending on the underlying filesystem. In most cases, it's doing a getattr anyway and then fetching that value after the fact. Rather than do that, add a new STATX_CHANGE_COOKIE flag that is a kernel-only symbol (for now). If requested and getattr can implement it, it can fill out this field. For IS_I_VERSION inodes, add a generic implementation in vfs_getattr_nosec. Take care to mask STATX_CHANGE_COOKIE off in requests from userland and in the result mask. Since not all filesystems can give the same guarantees of monotonicity, claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to indicate that they offer an i_version value that can never go backward. Eventually if we decide to make the i_version available to userland, we can just designate a field for it in struct statx, and move the STATX_CHANGE_COOKIE definition to the uapi header. Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org>
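As a hedged illustration of the plumbing described above (not the patch itself), a filesystem's getattr path might report the cookie roughly like this, assuming struct kstat gained a change_cookie member:

    #include <linux/fs.h>
    #include <linux/iversion.h>
    #include <linux/stat.h>

    /*
     * Sketch only: fill the change cookie when the caller asked for it.
     * STATX_CHANGE_COOKIE is a kernel-only flag per the description above.
     */
    static void fill_change_cookie(struct inode *inode, u32 request_mask,
    			       struct kstat *stat)
    {
    	if (!(request_mask & STATX_CHANGE_COOKIE) || !IS_I_VERSION(inode))
    		return;

    	stat->change_cookie = inode_query_iversion(inode);
    	stat->result_mask |= STATX_CHANGE_COOKIE;
    	/* only if the value is guaranteed never to go backward */
    	stat->attributes |= STATX_ATTR_CHANGE_MONOTONIC;
    }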
2023-01-26  fs: clarify when the i_version counter must be updated  (Jeff Layton)
The i_version field in the kernel has had different semantics over the decades, but NFSv4 has certain expectations. Update the comments in iversion.h to describe when the i_version must change. Cc: Colin Walters <walters@verbum.org> Cc: NeilBrown <neilb@suse.de> Cc: Trond Myklebust <trondmy@hammerspace.com> Cc: Dave Chinner <david@fromorbit.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-01-26  fs: uninline inode_query_iversion  (Jeff Layton)
Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-01-26  icmp: Add counters for rate limits  (Jamie Bainbridge)
There are multiple ICMP rate limiting mechanisms:
* Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec
* v4 per-host limits: net.ipv4.icmp_ratelimit/ratemask
* v6 per-host limits: net.ipv6.icmp_ratelimit/ratemask
However, when ICMP output is limited, there is no way to tell which limit has been hit or even if the limits are responsible for the lack of ICMP output. Add counters for each of the cases above. As we are within local_bh_disable(), use the __INC stats variant. Example output:
    # nstat -sz "*RateLimit*"
    IcmpOutRateLimitGlobal  134  0.0
    IcmpOutRateLimitHost    770  0.0
    Icmp6OutRateLimitHost   84   0.0
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Suggested-by: Abhishek Rawal <rawal.abhishek92@gmail.com> Link: https://lore.kernel.org/r/273b32241e6b7fdc5c609e6f5ebc68caf3994342.1674605770.git.jamie.bainbridge@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26  accel: Add .mmap to DRM_ACCEL_FOPS  (Jeffrey Hugo)
In reviewing the ivpu driver, DEFINE_DRM_ACCEL_FOPS could have been used if DRM_ACCEL_FOPS defined .mmap to be drm_gem_mmap. Let's add that: accel drivers are a variant of DRM drivers, modern DRM drivers are expected to use GEM, and mmap() is a common operation expected to be heavily used in accel drivers, so the common accel driver should be able to just use DEFINE_DRM_ACCEL_FOPS() for convenience. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
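A hedged sketch of the convenience this enables; the my_accel names below are illustrative and not from the patch:

    #include <drm/drm_accel.h>
    #include <drm/drm_drv.h>

    /* With .mmap = drm_gem_mmap part of DRM_ACCEL_FOPS, a GEM-based accel
     * driver can use the stock fops instead of rolling its own. */
    DEFINE_DRM_ACCEL_FOPS(my_accel_fops);

    static const struct drm_driver my_accel_driver = {
    	.driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL,
    	.fops		 = &my_accel_fops,
    	.name		 = "my_accel",
    };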
2023-01-26  habanalabs: define events to trace PCI LBW access  (Ohad Sharabi)
There are cases where it may be useful to dump the whole LBW configuration. Yet, doing so by spamming the kernel log would probably drown out other important messages, since LBW accesses happen in sheer volume. To address this, we add trace events for those too. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  habanalabs: add uapi to flush inbound HBM transactions  (Ohad Sharabi)
When doing p2p with a NIC device, the NIC needs to make sure all the writes to the HBM (through the PCI bar of the Gaudi device) were flushed. It can be done by either the NIC or the host reading through the PCI bar. To support the host side, we supply a simple uapi to perform this flush through the driver, because the user can't create such a transaction by itself (the PCI bar isn't exposed to normal users). Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  habanalabs/uapi: move uapi file to drm  (Oded Gabbay)
Move the habanalabs.h uapi file from include/uapi/misc to include/uapi/drm, and rename it to habanalabs_accel.h. This is required before moving the actual driver to the accel subsystem. Update MAINTAINERS file accordingly. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  habanalabs: pass-through request from user to f/w  (farah kassabri)
Add a uAPI, as part of the INFO IOCTL, to allow users to send requests directly to f/w, according to a pre-defined set of opcodes that the f/w exposes. The f/w will put the result in a kernel-allocated buffer, which the driver will then copy to the user-supplied buffer. This will allow f/w tools to communicate directly with the f/w without the need to add a new uAPI to the driver for each new type of request. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  habanalabs: modify export dmabuf API  (Ohad Sharabi)
A previous commit deprecated the option to export from handle, leaving the code with no support for devices with virtual memory. This commit modifies the export API in a way that unifies the uAPI around a user address for both cases (i.e. with and without MMU support) and adds the actual support for devices with virtual memory. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  habanalabs: define traces for COMMS protocol  (Ohad Sharabi)
As the COMMS protocol is being used more widely in our driver, an available debug tool for the handshake will be handy. This commit defines tracepoints to various key points of the COMMS protocol. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26  drm/fb-helper: Initialize fb-helper's preferred BPP in prepare function  (Thomas Zimmermann)
Initialize the fb-helper's preferred_bpp field early from within drm_fb_helper_prepare(), instead of in the later client hotplug callback. This simplifies the generic fbdev setup function. No real changes, but all drivers' fbdev code has to be adapted. v3: * build with CONFIG_DRM_FBDEV_EMULATION unset (kernel test bot) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-7-tzimmermann@suse.de
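A hedged sketch of the adapted driver call, assuming drm_fb_helper_prepare() now takes the preferred bpp as an argument (check drm_fb_helper.h for the exact prototype):

    #include <drm/drm_fb_helper.h>

    static void my_fbdev_setup(struct drm_device *dev,
    			   struct drm_fb_helper *fb_helper,
    			   const struct drm_fb_helper_funcs *funcs)
    {
    	/* Sketch: the preferred bpp is handed over at prepare time instead
    	 * of being filled in later from the client hotplug callback. */
    	drm_fb_helper_prepare(dev, fb_helper, 32 /* preferred_bpp */, funcs);
    }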
2023-01-26  drm/fb-helper: Introduce drm_fb_helper_unprepare()  (Thomas Zimmermann)
Move the fb-helper clean-up code into drm_fb_helper_unprepare(). No functional changes. v2: * declare as static inline (kernel test robot) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-4-tzimmermann@suse.de
2023-01-26  drm/client: Add hotplug_failed flag  (Thomas Zimmermann)
Signal failed hotplugging with a flag in struct drm_client_dev. If set, the client helpers will not further try to set up the fbdev display. This used to be signalled with a combination of cleared pointers in struct drm_fb_helper, which prevents us from initializing these pointers early after allocation. The change also harmonizes behavior among DRM clients. Additional DRM clients will now handle failed hotplugging like fbdev does. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-3-tzimmermann@suse.de
2023-01-25  inet: Add IP_LOCAL_PORT_RANGE socket option  (Jakub Sitnicki)
Users who want to share a single public IP address for outgoing connections between several hosts traditionally reach for SNAT. However, SNAT requires state keeping on the node(s) performing the NAT. A stateless alternative exists, where a single IP address used for egress can be shared between several hosts by partitioning the available ephemeral port range. In such a setup:
1. Each host gets assigned a disjoint range of ephemeral ports.
2. Applications open connections from the host-assigned port range.
3. Return traffic gets routed to the host based on both the destination IP and the destination port.
An application which wants to open an outgoing connection (connect) from a given port range today can choose between two solutions:
1. Manually pick the source port by bind()'ing to it before connect()'ing the socket. This approach has a couple of downsides:
a) Search for a free port has to be implemented in user-space. If the chosen 4-tuple happens to be busy, the application needs to retry from a different local port number. Detecting if a 4-tuple is busy can be either easy (TCP) or hard (UDP). In the TCP case, the application simply has to check if connect() returned an error (EADDRNOTAVAIL). That is assuming that local port sharing was enabled (REUSEADDR) by all the sockets.
    # Assume desired local port range is 60_000-60_511
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    s.bind(("192.0.2.1", 60_000))
    s.connect(("1.1.1.1", 53))
    # Fails only if 192.0.2.1:60000 -> 1.1.1.1:53 is busy
    # Application must retry with another local port
In case of UDP, the network stack allows binding more than one socket to the same 4-tuple, when local port sharing is enabled (REUSEADDR). Hence detecting the conflict is much harder and involves querying sock_diag and toggling the REUSEADDR flag [1].
b) For TCP, bind()-ing to a port within the ephemeral port range means that no connecting sockets, that is those which leave it to the network stack to find a free local port at connect() time, can use this port. IOW, the bind hash bucket tb->fastreuse will be 0 or 1, and the port will be skipped during the free port search at connect() time.
2. Isolate the app in a dedicated netns and use the per-netns ip_local_port_range sysctl to adjust the ephemeral port range bounds. The per-netns setting affects all sockets, so this approach can be used only if:
- there is just one egress IP address, or
- the desired egress port range is the same for all egress IP addresses used by the application.
For TCP, this approach avoids the downsides of (1). Free port search and 4-tuple conflict detection is done by the network stack:
    system("sysctl -w net.ipv4.ip_local_port_range='60000 60511'")
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1)
    s.bind(("192.0.2.1", 0))
    s.connect(("1.1.1.1", 53))
    # Fails if all 4-tuples 192.0.2.1:60000-60511 -> 1.1.1.1:53 are busy
For UDP this approach has limited applicability. Setting the IP_BIND_ADDRESS_NO_PORT socket option does not result in the local source port being shared with other connected UDP sockets. Hence relying on the network stack to find a free source port limits the number of outgoing UDP flows from a single IP address down to the number of available ephemeral ports. To put it another way, partitioning the ephemeral port range between hosts using the existing Linux networking API is cumbersome. To address this use case, add a new socket option at the SOL_IP level, named IP_LOCAL_PORT_RANGE.
The new option can be used to clamp down the ephemeral port range for each socket individually. The option can be used only to narrow down the per-netns local port range. If the per-socket range lies outside of the per-netns range, the latter takes precedence. UAPI-wise, the low and high range bounds are passed to the kernel as a pair of u16 values in host byte order packed into a u32. This avoids pointer passing.
    PORT_LO = 40_000
    PORT_HI = 40_511
    s = socket(AF_INET, SOCK_STREAM)
    v = struct.pack("I", PORT_HI << 16 | PORT_LO)
    s.setsockopt(SOL_IP, IP_LOCAL_PORT_RANGE, v)
    s.bind(("127.0.0.1", 0))
    s.getsockname()
    # Local address between ("127.0.0.1", 40_000) and ("127.0.0.1", 40_511),
    # if there is a free port. EADDRINUSE otherwise.
[1] https://github.com/cloudflare/cloudflare-blog/blob/232b432c1d57/2022-02-connectx/connectx.py#L116
Reviewed-by: Marek Majkowski <marek@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
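The same usage from C, as a sketch mirroring the Python above; the option number shown is an assumption taken from the new UAPI definition and should be verified against linux/in.h:

    #include <netinet/in.h>
    #include <stdint.h>
    #include <sys/socket.h>

    #ifndef IP_LOCAL_PORT_RANGE
    #define IP_LOCAL_PORT_RANGE 51	/* assumed UAPI value; verify locally */
    #endif

    /* Clamp the socket's ephemeral port range to [lo, hi]: two host-order
     * u16 bounds packed into a u32, low bound in the lower 16 bits. */
    static int clamp_local_port_range(int fd, uint16_t lo, uint16_t hi)
    {
    	uint32_t range = ((uint32_t)hi << 16) | lo;

    	return setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE,
    			  &range, sizeof(range));
    }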
2023-01-25  bpf/selftests: Verify struct_ops prog sleepable behavior  (David Vernet)
In a set of prior changes, we added the ability for struct_ops programs to be sleepable. This patch enhances the dummy_st_ops selftest suite to validate this behavior by adding a new sleepable struct_ops entry to dummy_st_ops. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230125164735.785732-5-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-25  bpf: Pass const struct bpf_prog * to .check_member  (David Vernet)
The .check_member field of struct bpf_struct_ops is currently passed the member's btf_type via const struct btf_type *t, and a const struct btf_member *member. This allows the struct_ops implementation to check whether e.g. an ops is supported, but it would be useful to also enforce that the struct_ops prog being loaded for that member has other qualities, like being sleepable (or not). This patch therefore updates the .check_member() callback to also take a const struct bpf_prog *prog argument. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230125164735.785732-4-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
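A hedged sketch of what a struct_ops implementation could now do with the extra argument; the member-offset check and MY_SLEEPABLE_MEMBER_OFF below are purely illustrative, not from the patch:

    #include <linux/bpf.h>
    #include <linux/btf.h>

    #define MY_SLEEPABLE_MEMBER_OFF 0	/* hypothetical member offset */

    /* Sketch: with the prog passed in, .check_member can refuse sleepable
     * programs for members that must not sleep. */
    static int my_ops_check_member(const struct btf_type *t,
    			       const struct btf_member *member,
    			       const struct bpf_prog *prog)
    {
    	u32 moff = __btf_member_bit_offset(t, member) / 8;

    	if (prog->aux->sleepable && moff != MY_SLEEPABLE_MEMBER_OFF)
    		return -EINVAL;	/* this member may not be sleepable */

    	return 0;
    }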
2023-01-25  virtchnl: i40e/iavf: rename iwarp to rdma  (Jesse Brandeburg)
Since the latest Intel hardware does both IWARP and ROCE, rename the term IWARP in the virtchnl header to be RDMA. Do this for both upper and lower case instances. Many of the non-virtchnl.h changes were done with regular expression replacements using perl like:
    perl -p -i -e 's/_IWARP/_RDMA/' <files>
    perl -p -i -e 's/_iwarp/_rdma/' <files>
and I had to pick up a few instances manually. The virtchnl.h header has some comments and clarity added around when to use certain defines. note: had to fix a checkpatch warning for a long line by wrapping one of the lines I changed. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-25  virtchnl: do structure hardening  (Jesse Brandeburg)
The virtchnl interface can have a bunch of "soft" defined structures hardened by using explicit sizes for declarations, and then referring to the enum type that uses them in a comment. None of these changes should change any of the structure sizes. Also, remove a duplicate line in a switch statement and let two cases use the same code. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-25  virtchnl: update header and increase header clarity  (Jesse Brandeburg)
We already have the SPDX header, so just leave a copyright notice with an updated year and get rid of the boilerplate header (so 2002!). In addition, update a couple of comments to clarify how the various parts of the virtchannel header interaction work. No functional changes. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-25  virtchnl: remove unused structure declaration  (Jesse Brandeburg)
Nothing uses virtchnl_msg, just remove it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-25  bpf/tracing: Use stage6 of tracing to not duplicate macros  (Steven Rostedt (Google))
The bpf events are created by the same macro magic as tracefs trace events are. But to hook into bpf, it has its own code. It duplicates many of the same macros as the tracefs macros and this is an issue because it misses bug fixes as well as any new enhancements that come with the other trace macros. As the trace macros have been put into their own staging files, have bpf take advantage of this and use the tracefs stage 6 macros that the "fast assign" portion of the trace event macro uses. Link: https://lkml.kernel.org/r/20230124202515.873075730@goodmis.org Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/ Cc: bpf@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25  perf/tracing: Use stage6 of tracing to not duplicate macros  (Steven Rostedt (Google))
The perf events are created by the same macro magic as tracefs trace events are. But to hook into perf, it has its own code. It duplicates many of the same macros as the tracefs macros and this is an issue because it misses bug fixes as well as any new enhancements that come with the other trace macros. As the trace macros have been put into their own staging files, have perf take advantage of this and use the tracefs stage 6 macros that the "fast assign" portion of the trace event macro uses. Link: https://lkml.kernel.org/r/20230124202515.716458410@goodmis.org Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/ Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25  soc: mediatek: add cmdq support of mtk-mmsys config API for mt8195 vdosys1  (Nancy.Lin)
Add cmdq support for the mtk-mmsys config API. The mmsys config register settings need to take effect together with the other HW settings (like OVL_ADAPTOR...) at the same vblanking time. If we use the CPU to write the mmsys regs, we can't guarantee all the settings can be written in the same vblanking time. Cmdq is used for this purpose. We prepare all the related HW settings in one cmdq packet. The first command in the packet is "wait stream done", followed by all the HW settings. After the cmdq packet is flushed to the GCE HW, the GCE waits for the "stream done" event to come and then starts flushing all the HW settings. This can guarantee all the settings are flushed in the same vblanking. Signed-off-by: Nancy.Lin <nancy.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Bo-Chen Chen <rex-bc.chen@mediatek.com> Link: https://lore.kernel.org/r/20230113104434.28023-8-nancy.lin@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-01-25  soc: mediatek: add mtk-mmsys config API for mt8195 vdosys1  (Nancy.Lin)
Add four mmsys config APIs. The config APIs are used to configure mmsys registers. Some mmsys regs need to be set according to the HW engine binding to the mmsys simultaneously.
1. mtk_mmsys_merge_async_config: config merge async width/height. async is used for cross-clock domain synchronization.
2. mtk_mmsys_hdr_config: config hdr backend async width/height.
3. mtk_mmsys_mixer_in_config and mtk_mmsys_mixer_in_config: config mixer related settings.
Signed-off-by: Nancy.Lin <nancy.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Bo-Chen Chen <rex-bc.chen@mediatek.com> Link: https://lore.kernel.org/r/20230113104434.28023-7-nancy.lin@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-01-25  soc: mediatek: add mtk-mmsys ethdr and mdp_rdma components  (Nancy.Lin)
Add new mmsys component: ethdr_mixer and mdp_rdma. These components will use in mt8195 vdosys1. Signed-off-by: Nancy.Lin <nancy.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Bo-Chen Chen <rex-bc.chen@mediatek.com> Link: https://lore.kernel.org/r/20230113104434.28023-4-nancy.lin@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-01-25  dt-bindings: reset: mt8195: add vdosys1 reset control bit  (Nancy.Lin)
Add vdosys1 reset control bit for MT8195 platform. Signed-off-by: Nancy.Lin <nancy.lin@mediatek.com> Reviewed-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Rex-BC Chen <rex-bc.chen@mediatek.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20230113104434.28023-3-nancy.lin@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2023-01-25  usb: gadget: Add support for RZ/V2M USB3DRD driver  (Biju Das)
The RZ/V2M USB3.1 Gen1 Interface (USB) is composed of a USB3.1 Gen1 Dual Role Device controller (USB3DRD), a USB3.1 Gen1 Host controller (USB3HOST), and a USB3.1 Gen1 Peripheral controller (USB3PERI). The resets for both host and peri are located in the USB3DRD block. The USB3DRD registers are mapped in the AXI address space of the Peripheral module. Add a USB3DRD driver to handle reset for both host and peri modules. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/20230121145853.4792-6-biju.das.jz@bp.renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-25  Merge branch 'iommu-memory-accounting' into core  (Joerg Roedel)
Merge patch-set from Jason: "Let iommufd charge IOPTE allocations to the memory cgroup" Description: IOMMUFD follows the same design as KVM and uses memory cgroups to limit the amount of kernel memory an iommufd file descriptor can pin down. The various internal data structures already use GFP_KERNEL_ACCOUNT to charge their own memory. However, one of the biggest consumers of kernel memory is the IOPTEs stored under the iommu_domain and these allocations are not tracked. This series is the first step in fixing it. The iommu driver contract already includes a 'gfp' argument to the map_pages op, allowing iommufd to specify GFP_KERNEL_ACCOUNT and then having the driver allocate the IOPTE tables with that flag will capture a significant amount of the allocations. Update the iommu_map() API to pass in the GFP argument, and fix all call sites. Replace iommu_map_atomic(). Audit the "enterprise" iommu drivers to make sure they do the right thing. Intel and S390 ignore the GFP argument and always use GFP_ATOMIC. This is problematic for iommufd anyhow, so fix it. AMD and ARM SMMUv2/3 are already correct. A follow-up series will be needed to capture the allocations made when the iommu_domain itself is allocated, which will complete the job. Link: https://lore.kernel.org/linux-iommu/0-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com/
2023-01-25  iommu: Add a gfp parameter to iommu_map_sg()  (Jason Gunthorpe)
Follow the pattern for iommu_map() and remove iommu_map_sg_atomic(). This allows __iommu_dma_alloc_noncontiguous() to use a GFP_KERNEL allocation here, based on the provided gfp flags. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25  iommu: Remove iommu_map_atomic()  (Jason Gunthorpe)
There is only one call site and it can now just pass the GFP_ATOMIC to the normal iommu_map(). Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25  iommu: Add a gfp parameter to iommu_map()  (Jason Gunthorpe)
The internal mechanisms support this, but instead of exposing the gfp to the caller it is wrapped inside iommu_map() and iommu_map_atomic(). Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/1-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
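A sketch of what a call site looks like with the gfp now explicit; the prot flags and wrapper below are illustrative only:

    #include <linux/iommu.h>

    /* Sketch: sleeping callers pass GFP_KERNEL, atomic callers GFP_ATOMIC,
     * and iommufd can pass GFP_KERNEL_ACCOUNT so IOPTE allocations get
     * charged to the memory cgroup. */
    static int map_one_page(struct iommu_domain *domain, unsigned long iova,
    			phys_addr_t paddr)
    {
    	return iommu_map(domain, iova, paddr, PAGE_SIZE,
    			 IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
    }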
2023-01-25  iommu: Implement of_iommu_get_resv_regions()  (Thierry Reding)
This is an implementation that IOMMU drivers can use to obtain reserved memory regions from a device tree node. It uses the reserved-memory DT bindings to find the regions associated with a given device. If these regions are marked accordingly, identity mappings will be created for them in the IOMMU domain that the devices will be attached to. Cc: Frank Rowand <frowand.list@gmail.com> Cc: devicetree@vger.kernel.org Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230120174251.4004100-4-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25  of: Introduce of_translate_dma_region()  (Thierry Reding)
This function is similar to of_translate_dma_address() but also reads a length in addition to an address from a device tree property. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230120174251.4004100-2-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25  net/smc: De-tangle ism and smc device initialization  (Stefan Raspl)
The struct device for ISM devices was part of struct smcd_dev. Move to struct ism_dev, provide a new API call in struct smcd_ops, and convert existing SMCD code accordingly. Furthermore, remove struct smcd_dev from struct ism_dev. This is the final part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  s390/ism: Consolidate SMC-D-related code  (Stefan Raspl)
The ism module had SMC-D-specific code sprinkled across the entire module. We are now consolidating the SMC-D-specific parts into the latter parts of the module, so it becomes more clear what code is intended for use with ISM, and which parts are glue code for usage in the context of SMC-D. This is the fourth part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  net/smc: Separate SMC-D and ISM APIs  (Stefan Raspl)
We separate the code implementing the struct smcd_ops API in the ISM device driver from the functions that may be used by other exploiters of ISM devices. Note: We start out small, and don't offer the whole breadth of the ISM device for public use, as many functions are specific to or likely only ever used in the context of SMC-D. This is the third part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  net/smc: Register SMC-D as ISM client  (Stefan Raspl)
Register the smc module with the new ism device driver API. This is the second part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  net/ism: Add new API for client registration  (Stefan Raspl)
Add a new API that allows other drivers to concurrently access ISM devices. To do so, we introduce a new API that allows other modules to register for ISM device usage. Furthermore, we move the GID to struct ism, where it belongs conceptually, and rename and relocate struct smcd_event to struct ism_event. This is the first part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  s390/ism: Introduce struct ism_dmb  (Stefan Raspl)
Conceptually, a DMB is a structure that belongs to ISM devices. However, SMC currently 'owns' this structure. So future exploiters of ISM devices would be forced to include SMC headers to work - which is just weird. Therefore, we switch ISM to struct ism_dmb, introduce a new public header with the definition (will be populated with further API calls later on), and, add a thin wrapper to please SMC. Since structs smcd_dmb and ism_dmb are identical, we can simply convert between the two for now. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25  ALSA: ac97: make remove callback of ac97 driver void returned  (Dawei Li)
Since commit fc7a6209d571 ("bus: Make remove callback return void") forces bus_type::remove to be void-returned, it doesn't make much sense for any bus-based driver implementing the remove callback to return non-void to its caller. As such, change the remove function for ac97-based drivers to return void. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB2323A5AB1B2578EF4FA15DA7CAFB9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-01-24  bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener  (Jakub Sitnicki)
A listening socket linked to a sockmap has its sk_prot overridden. It points to one of the struct proto variants in tcp_bpf_prots. The variant depends on the socket's family and which sockmap programs are attached. A child socket cloned from a TCP listener initially inherits their sk_prot. But before cloning is finished, we restore the child's proto to the listener's original non-tcp_bpf_prots one. This happens in tcp_create_openreq_child -> tcp_bpf_clone. Today, in tcp_bpf_clone we detect if the child's proto should be restored by checking only for the TCP_BPF_BASE proto variant. This is not correct. The sk_prot of a listening socket linked to a sockmap can point to any variant in tcp_bpf_prots. If the listener's sk_prot happens to be not the TCP_BPF_BASE variant, then the child socket is unintentionally left with the inherited sk_prot by tcp_bpf_clone. This leads to issues like infinite recursion on close [1], because the child state is otherwise not set up for use with tcp_bpf_prot operations. Adjust the check in tcp_bpf_clone to detect all of the tcp_bpf_prots variants. Note that it wouldn't be sufficient to check the socket state when overriding the sk_prot in tcp_bpf_update_proto in order to always use the TCP_BPF_BASE variant for listening sockets. Since commit b8b8315e39ff ("bpf, sockmap: Remove unhash handler for BPF sockmap usage") it is possible for a socket to transition to TCP_LISTEN state while already linked to a sockmap, e.g. connect() -> insert into map -> connect(AF_UNSPEC) -> listen(). [1]: https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/ Fixes: e80251555f0b ("tcp_bpf: Don't let child socket inherit parent protocol ops on copy") Reported-by: syzbot+04c21ed96d861dccc5cd@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-2-1e0ee7ac2f90@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-24  bpf: Allow trusted args to walk struct when checking BTF IDs  (David Vernet)
When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier currently enforces that the top-level type must match when calling the kfunc. In other words, the verifier does not allow the BPF program to pass a bitwise equivalent struct, despite it being allowed according to the C standard. For example, if you have the following type: struct nf_conn___init { struct nf_conn ct; }; The C standard stipulates that it would be safe to pass a struct nf_conn___init to a kfunc expecting a struct nf_conn. The verifier currently disallows this, however, as semantically kfuncs may want to enforce that structs that have equivalent types according to the C standard, but have different BTF IDs, are not able to be passed to kfuncs expecting one or the other. For example, struct nf_conn___init may not be queried / looked up, as it is allocated but may not yet be fully initialized. On the other hand, being able to pass types that are equivalent according to the C standard will be useful for other types of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a series of kfuncs will be added which allow programs to do bitwise queries on cpumasks that are either allocated by the program (in which case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as its first element), or a cpumask that was allocated by the main kernel (in which case it will just be a straight cpumask_t, as in task->cpus_ptr). Having the two types of cpumasks allows us to distinguish between the two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t cannot be. On the other hand, a struct bpf_cpumask can of course be queried in the exact same manner as a cpumask_t, with e.g. bpf_cpumask_test_cpu(). If we were to enforce that top level types match, then a user that's passing a struct bpf_cpumask to a read-only cpumask_t argument would have to cast with something like bpf_cast_to_kern_ctx() (which itself would need to be updated to expect the alias, and currently it only accommodates a single alias per prog type). Additionally, not specifying KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a struct bpf_cpumask *, and another as a struct cpumask * (i.e. cpumask_t). In order to enable this, this patch relaxes the constraint that a KF_TRUSTED_ARGS kfunc must have strict type matching, and instead only enforces strict type matching if a type is observed to be a "no-cast alias" (i.e., that the type names are equivalent, but one is suffixed with ___init). Additionally, in order to try and be conservative and match existing behavior / expectations, this patch also enforces strict type checking for acquire kfuncs. We were already enforcing it for release kfuncs, so this should also improve the consistency of the semantics for kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230120192523.3650503-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-24  bpf: Enable annotating trusted nested pointers  (David Vernet)
In kfuncs, a "trusted" pointer is a pointer that the kfunc can assume is safe, and which the verifier will allow to be passed to a KF_TRUSTED_ARGS kfunc. Currently, a KF_TRUSTED_ARGS kfunc disallows any pointer to be passed at a nonzero offset, but sometimes this is in fact safe if the "nested" pointer's lifetime is inherited from its parent. For example, the const cpumask_t *cpus_ptr field in a struct task_struct will remain valid until the task itself is destroyed, and thus would also be safe to pass to a KF_TRUSTED_ARGS kfunc. While it would be conceptually simple to enable this by using BTF tags, gcc unfortunately does not yet support this. In the interim, this patch enables support for this by using a type-naming convention. A new BTF_TYPE_SAFE_NESTED macro is defined in verifier.c which allows a developer to specify the nested fields of a type which are considered trusted if its parent is also trusted. The verifier is also updated to account for this. A patch with selftests will be added in a follow-on change, along with documentation for this feature. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230120192523.3650503-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
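A hedged sketch of the naming-convention annotation described above, roughly as it might appear in verifier.c (the exact macro spelling should be checked against the patch):

    #include <linux/compiler_types.h>
    #include <linux/sched.h>

    /* Sketch: the macro pastes a suffix onto the type name so the verifier
     * can find, by BTF name, which nested fields inherit "trusted" from a
     * trusted parent pointer; cpus_ptr is the example given above. */
    #define BTF_TYPE_SAFE_NESTED(__t)	__PASTE(__t, __safe_fields)

    BTF_TYPE_SAFE_NESTED(struct task_struct) {
    	const cpumask_t *cpus_ptr;
    };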