git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message	Author
2022-09-20	bpf: Remove unused btf_struct_access stub	Daniel Xu
	This stub was not being used anywhere. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/590e7bd6172ffe0f3d7b51cd40e8ded941aaf7e8.1663683114.git.dxu@dxuuu.xyz Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-20	ASoC: SOF: Adding amd HS functionality to the sof core	V sujith kumar Reddy
	Add I2S HS control instance to the sof core. This will help the amd topology to use the I2S HS Dai. Signed-off-by: V sujith kumar Reddy <Vsujithkumar.Reddy@amd.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20220913144319.1055302-4-Vsujithkumar.Reddy@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-20	tcp: Introduce optional per-netns ehash.	Kuniyuki Iwashima
	The more sockets we have in the hash table, the longer we spend looking up the socket. While running a number of small workloads on the same host, they penalise each other and cause performance degradation. The root cause might be a single workload that consumes much more resources than the others. It often happens on a cloud service where different workloads share the same computing resource.
	On EC2 c5.24xlarge instance (196 GiB memory and 524288 (1Mi / 2) ehash entries), after running iperf3 in different netns, creating 24Mi sockets without data transfer in the root netns causes about 10% performance regression for the iperf3's connection.
	  thash_entries  sockets  length  Gbps
	         524288        1       1  50.7
	                    24Mi      48  45.1
	It is basically related to the length of the list of each hash bucket. For testing purposes to see how performance drops along the length, I set 131072 (1Mi / 8) to thash_entries, and here's the result.
	  thash_entries  sockets  length  Gbps
	         131072        1       1  50.7
	                     1Mi       8  49.9
	                     2Mi      16  48.9
	                     4Mi      32  47.3
	                     8Mi      64  44.6
	                    16Mi     128  40.6
	                    24Mi     192  36.3
	                    32Mi     256  32.5
	                    40Mi     320  27.0
	                    48Mi     384  25.0
	To resolve the socket lookup degradation, we introduce an optional per-netns hash table for TCP, but it's just ehash, and we still share the global bhash, bhash2 and lhash2. With a smaller ehash, we can look up non-listener sockets faster and isolate such noisy neighbours. In addition, we can reduce lock contention.
	We can control the ehash size by a new sysctl knob. However, depending on workloads, it will require very sensitive tuning, so we disable the feature by default (net.ipv4.tcp_child_ehash_entries == 0). Moreover, we can fall back to using the global ehash in case we fail to allocate enough memory for a new ehash. The maximum size is 16Mi, which is large enough that even if we have 48Mi sockets, the average list length is 3, and regression would be less than 1%.
	We can check the current ehash size by another read-only sysctl knob, net.ipv4.tcp_ehash_entries.
	A negative value means the netns shares the global ehash (per-netns ehash is disabled or failed to allocate memory).
	  # dmesg | cut -d ' ' -f 5- | grep "established hash"
	  TCP established hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc hugepage)
	  # sysctl net.ipv4.tcp_ehash_entries
	  net.ipv4.tcp_ehash_entries = 524288  # can be changed by thash_entries
	  # sysctl net.ipv4.tcp_child_ehash_entries
	  net.ipv4.tcp_child_ehash_entries = 0  # disabled by default
	  # ip netns add test1
	  # ip netns exec test1 sysctl net.ipv4.tcp_ehash_entries
	  net.ipv4.tcp_ehash_entries = -524288  # share the global ehash
	  # sysctl -w net.ipv4.tcp_child_ehash_entries=100
	  net.ipv4.tcp_child_ehash_entries = 100
	  # ip netns add test2
	  # ip netns exec test2 sysctl net.ipv4.tcp_ehash_entries
	  net.ipv4.tcp_ehash_entries = 128  # own a per-netns ehash with 2^n buckets
	When more than two processes in the same netns create per-netns ehash concurrently with different sizes, we need to guarantee the size in one of the following ways:
	  1) Share the global ehash and create per-netns ehash
	     First, unshare() with tcp_child_ehash_entries==0. It creates dedicated netns sysctl knobs where we can safely change tcp_child_ehash_entries and clone()/unshare() to create a per-netns ehash.
	  2) Control write on sysctl by BPF
	     We can use BPF_PROG_TYPE_CGROUP_SYSCTL to allow/deny read/write on sysctl knobs.
	Note that the global ehash allocated at the boot time is spread over available NUMA nodes, but inet_pernet_hashinfo_alloc() will allocate pages for each per-netns ehash depending on the current process's NUMA policy. By default, the allocation is done in the local node only, so the per-netns hash table could fully reside on a random node. Thus, depending on the NUMA policy the netns is created with and the CPU the current thread is running on, we could see some performance differences for highly optimised networking applications.
	Note also that the default values of two sysctl knobs depend on the ehash size and should be tuned carefully:
	  tcp_max_tw_buckets  : tcp_child_ehash_entries / 2
	  tcp_max_syn_backlog : max(128, tcp_child_ehash_entries / 128)
	As a bonus, we can dismantle netns faster. Currently, while destroying netns, we call inet_twsk_purge(), which walks through the global ehash. It can be potentially big because it can have many sockets other than TIME_WAIT in all netns. Splitting ehash changes that situation, where it's only necessary for inet_twsk_purge() to clean up TIME_WAIT sockets in each netns.
	With regard to this, we do not free the per-netns ehash in inet_twsk_kill() to avoid UAF while iterating the per-netns ehash in inet_twsk_purge(). Instead, we do it in tcp_sk_exit_batch() after calling tcp_twsk_purge() to keep it protocol-family-independent.
	In the future, we could optimise ehash lookup/iteration further by removing netns comparison for the per-netns ehash. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
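The arithmetic in the message above can be checked with a small sketch. It is only an illustration: the helper names are hypothetical, the power-of-two rounding is inferred from the `100` to `128` example in the message, and the two default formulas are applied to the sysctl value exactly as the message writes them.

```python
MI = 1 << 20  # "Mi" as used in the commit message


def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two that is >= n (hypothetical helper name)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def average_chain_length(sockets: int, ehash_entries: int) -> float:
    """Average number of sockets per hash bucket."""
    return sockets / ehash_entries


def derived_defaults(child_ehash_entries: int) -> dict:
    """Defaults quoted in the message, computed from the given ehash size."""
    return {
        "tcp_max_tw_buckets": child_ehash_entries // 2,
        "tcp_max_syn_backlog": max(128, child_ehash_entries // 128),
    }


if __name__ == "__main__":
    print(roundup_pow_of_two(100))                   # size requested via sysctl
    print(average_chain_length(24 * MI, 524288))     # first table, 24Mi sockets
    print(average_chain_length(48 * MI, 16 * MI))    # maximum-size ehash
    print(derived_defaults(16 * MI))
```

The chain lengths it prints correspond to the "length" column of the tables and to the "average list length is 3" claim for a 16Mi-entry table holding 48Mi sockets.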
2022-09-20	tcp: Save unnecessary inet_twsk_purge() calls.	Kuniyuki Iwashima
	While destroying netns, we call inet_twsk_purge() in tcp_sk_exit_batch() and tcpv6_net_exit_batch() for AF_INET and AF_INET6. These commands trigger the kernel to walk through the potentially big ehash twice even though the netns has no TIME_WAIT sockets.
	  # ip netns add test
	  # ip netns del test
	  or
	  # unshare -n /bin/true >/dev/null
	When tw_refcount is 1, we need not call inet_twsk_purge() at least for the net. We can save such unneeded iterations if all netns in net_exit_list have no TIME_WAIT sockets. This change eliminates the tax by the additional unshare() described in the next patch to guarantee the per-netns ehash size.
	Tested:
	  # mount -t debugfs none /sys/kernel/debug/
	  # echo cleanup_net > /sys/kernel/debug/tracing/set_ftrace_filter
	  # echo inet_twsk_purge >> /sys/kernel/debug/tracing/set_ftrace_filter
	  # echo function > /sys/kernel/debug/tracing/current_tracer
	  # cat ./add_del_unshare.sh
	  for i in `seq 1 40`
	  do
	      (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
	  done
	  wait;
	  # ./add_del_unshare.sh
	Before the patch:
	  # cat /sys/kernel/debug/tracing/trace_pipe
	  kworker/u128:0-8 [031] ...1. 174.162765: cleanup_net <-process_one_work
	  kworker/u128:0-8 [031] ...1. 174.240796: inet_twsk_purge <-cleanup_net
	  kworker/u128:0-8 [032] ...1. 174.244759: inet_twsk_purge <-tcp_sk_exit_batch
	  kworker/u128:0-8 [034] ...1. 174.290861: cleanup_net <-process_one_work
	  kworker/u128:0-8 [039] ...1. 175.245027: inet_twsk_purge <-cleanup_net
	  kworker/u128:0-8 [046] ...1. 175.290541: inet_twsk_purge <-tcp_sk_exit_batch
	  kworker/u128:0-8 [037] ...1. 175.321046: cleanup_net <-process_one_work
	  kworker/u128:0-8 [024] ...1. 175.941633: inet_twsk_purge <-cleanup_net
	  kworker/u128:0-8 [025] ...1. 176.242539: inet_twsk_purge <-tcp_sk_exit_batch
	After:
	  # cat /sys/kernel/debug/tracing/trace_pipe
	  kworker/u128:0-8 [038] ...1. 428.116174: cleanup_net <-process_one_work
	  kworker/u128:0-8 [038] ...1. 428.262532: cleanup_net <-process_one_work
	  kworker/u128:0-8 [030] ...1. 429.292645: cleanup_net <-process_one_work
	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
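The skip condition described in this message can be modelled in a few lines. This is a simplified sketch, not the kernel code: `Netns` and `needs_twsk_purge` are hypothetical names, and the only rule taken from the message is that a `tw_refcount` of 1 means the netns has no TIME_WAIT sockets.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Netns:
    """Hypothetical stand-in for a network namespace being torn down."""
    name: str
    tw_refcount: int = 1  # 1 == no TIME_WAIT socket holds a reference


def needs_twsk_purge(net_exit_list: Iterable[Netns]) -> bool:
    """Walk the ehash only if some exiting netns may own TIME_WAIT sockets."""
    return any(net.tw_refcount != 1 for net in net_exit_list)


if __name__ == "__main__":
    idle = [Netns("test"), Netns("test2")]
    busy = idle + [Netns("web", tw_refcount=4)]
    print(needs_twsk_purge(idle))  # every netns is free of TIME_WAIT sockets
    print(needs_twsk_purge(busy))  # one netns still has TIME_WAIT sockets
```

In this model the decision is made once per batch, which matches the message's point that the walk is saved only when all netns in the exit list have no TIME_WAIT sockets.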
2022-09-20	tcp: Set NULL to sk->sk_prot->h.hashinfo.	Kuniyuki Iwashima
	We will soon introduce an optional per-netns ehash. This means we cannot use the global sk->sk_prot->h.hashinfo to fetch a TCP hashinfo. Instead, set NULL to sk->sk_prot->h.hashinfo for TCP and get a proper hashinfo from net->ipv4.tcp_death_row.hashinfo. Note that we need not use sk->sk_prot->h.hashinfo if DCCP is disabled. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20	tcp: Don't allocate tcp_death_row outside of struct netns_ipv4.	Kuniyuki Iwashima
	We will soon introduce an optional per-netns ehash and access hash tables via net->ipv4.tcp_death_row->hashinfo instead of &tcp_hashinfo in most places. It could harm the fast path because dereferences of two fields in net and tcp_death_row might incur two extra cache line misses. To save one dereference, let's place tcp_death_row back in netns_ipv4 and fetch hashinfo via net->ipv4.tcp_death_row"."hashinfo. Note tcp_death_row was initially placed in netns_ipv4, and commit fbb8295248e1 ("tcp: allocate tcp_death_row outside of struct netns_ipv4") changed it to a pointer so that we can fire TIME_WAIT timers after freeing net. However, we don't do so after commit 04c494e68a13 ("Revert "tcp/dccp: get rid of inet_twsk_purge()""), so we need not define tcp_death_row as a pointer. Also, we move refcount_dec_and_test(&tw_refcount) from tcp_sk_exit() to tcp_sk_exit_batch() as a debug check. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20	headers: Remove some left-over license text	Christophe JAILLET
	Remove a left-over from commit 2874c5fd2842 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152") There is no need for an empty "License:". Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/0e5ff727626b748238f4b78932f81572143d8f0b.1662896317.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20	firmware: xilinx: add support for sd/gem config	Ronak Jain
	Add new APIs in firmware to configure SD/GEM registers. Internally it calls PM IOCTL for below SD/GEM register configuration:
	  - SD/EMMC select
	  - SD slot type
	  - SD base clock
	  - SD 8 bit support
	  - SD fixed config
	  - GEM SGMII Mode
	  - GEM fixed config
	Signed-off-by: Ronak Jain <ronak.jain@xilinx.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20	block: remove PSI accounting from the bio layer	Christoph Hellwig
	PSI accounting is now done by the VM code, where it should have been since the beginning. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220915094200.139713-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-20	mm: add PSI accounting around ->read_folio and ->readahead calls	Christoph Hellwig
	PSI tries to account for the cost of bringing back in pages discarded by the MM LRU management. Currently the prime place for that is hooked into the bio submission path, which is a rather bad place:
	  - it does not actually account I/O for non-block file systems, of which we have many
	  - it adds overhead and a layering violation to the block layer
	Add the accounting into the two places in the core MM code that read pages into an address space by calling into ->read_folio and ->readahead so that the entire file system operations are covered, to broaden the coverage and allow removing the accounting in the block layer going forward.
	As psi_memstall_enter can deal with nested calls this will not lead to double accounting even while the bio annotations are still present. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220915094200.139713-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
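The message relies on nested stall sections not being counted twice. A toy model of that property follows. It assumes, as a simplification, that nesting is handled by remembering whether the task was already inside a stall when the section was entered; `Task`, `memstall_enter` and `memstall_leave` are illustrative names, not the kernel API.

```python
class Task:
    """Hypothetical task with a per-task 'already stalled' marker."""

    def __init__(self) -> None:
        self.in_memstall = False
        self.stall_sections = 0  # how many sections were actually accounted


def memstall_enter(task: Task) -> bool:
    """Start accounting unless an outer section is already active.

    Returns the previous state so the matching leave can restore it.
    """
    was_in_memstall = task.in_memstall
    if not was_in_memstall:
        task.in_memstall = True
        task.stall_sections += 1
    return was_in_memstall


def memstall_leave(task: Task, was_in_memstall: bool) -> None:
    """Stop accounting only when leaving the outermost section."""
    if not was_in_memstall:
        task.in_memstall = False


if __name__ == "__main__":
    t = Task()
    outer = memstall_enter(t)   # e.g. the MM read path
    inner = memstall_enter(t)   # e.g. a bio annotation that is still present
    memstall_leave(t, inner)
    memstall_leave(t, outer)
    print(t.stall_sections)     # number of accounted sections in this model
```

In this model the inner enter and leave are no-ops, so adding the MM-level annotation while the block-layer one still exists does not change what is accounted.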
2022-09-20	Support for CS42L83 on Apple machines	Mark Brown
	Merge series from Martin Povišer <povik+lin@cutebit.org>: there's a CS42L83 headphone jack codec found in Apple computers (in the recent 'Apple Silicon' ones as well as in earlier models, one example [1]). The part isn't publicly documented, but it appears almost identical to CS42L42, for which we have a driver in kernel. This series adapts the CS42L42 driver to the new part, and makes one change in anticipation of a machine driver for the Apple computers. Patch 1 adds new compatible to the cs42l42 schema. Patches 2 to 7 are taken from Richard's recent series [2] adding soundwire support to cs42l42. They are useful refactorings to build on in the later patches, and also this way our work doesn't diverge. (I fixed missing free_irq path in cs42l42_init, did s/Soundwire/SoundWire/ in changelogs, rebased.) Patch 8 exports some regmap-related symbols from cs42l42.c so they can be used to create cs42l83 regmap in cs42l83-i2c.c later. Patch 9 is the cs42l83 support proper. Patch 10 implements 'set_bclk_ratio' on the cs42l42 core. This will be called by the upcoming ASoC machine driver for 'Apple Silicon' Macs. (We have touched on this change to be made in earlier discussion, see [3] and replies.) Patch 11 brings cs42l42-i2c.c in sync with cs42l83-i2c.c on dev_err_probe() usage.
2022-09-20	ASoC: SOF: Intel: override mclk_id for ES8336 support	Mark Brown
	Merge series from Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>: This patchset solves a known issue with ES8336 platforms wrt MCLK selection. Most of the devices use the MCLK0 signal, but some devices do use the MCLK1 signal. The MCLK is defined in the topology, it would be a nightmare to generate more topology files just for one MCLK difference. With a minor extension to the intel-nhlt library, the MCLK information can be found by parsing the NHLT table, and we can override the mclk_id at boot time. The only known issues for this platform remain the detection of GPIO and microphone connections, currently only possible with manual quirks. Thanks to Eugene J. Markow for testing this patchset.
2022-09-20	ALSA: hda: intel-nhlt: add intel_nhlt_ssp_mclk_mask()	Pierre-Louis Bossart
	SOF topologies hard-code the MCLK used for SSP connections. That was a bad idea in hindsight, this information should really come from BIOS and/or machine driver. This patch introduces a helper to scan all SSP endpoints connected to a codec, and all formats to see what MCLK is used. When BIT(0) of the mdivc offset is set in the SSP blob, MCLK0 is used, and likewise when BIT(1) is set MCLK1 is used. The case where both MCLKs are used is possible but has never been seen in practice so should be treated as an error by the caller. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20220919115350.43104-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
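How a caller might interpret such a mask can be sketched as follows. The function names are hypothetical and the NHLT parsing is replaced by a plain list of per-format values. Treating an empty mask as an error is an assumption of this sketch; the message only says that the "both MCLKs" case should be rejected.

```python
from typing import Iterable

MCLK0_BIT = 1 << 0  # BIT(0): MCLK0 is used
MCLK1_BIT = 1 << 1  # BIT(1): MCLK1 is used


def collect_mclk_mask(mdivc_values: Iterable[int]) -> int:
    """OR together the MCLK bits seen across all endpoints and formats."""
    mask = 0
    for value in mdivc_values:
        mask |= value & (MCLK0_BIT | MCLK1_BIT)
    return mask


def mclk_id_from_mask(mask: int) -> int:
    """Map a mask to an mclk_id, rejecting ambiguous or empty masks."""
    if mask == MCLK0_BIT:
        return 0
    if mask == MCLK1_BIT:
        return 1
    raise ValueError(f"unsupported MCLK mask {mask:#x}")


if __name__ == "__main__":
    mask = collect_mclk_mask([0x2, 0x2])  # all formats use MCLK1
    print(mclk_id_from_mask(mask))
```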
2022-09-20	ASoC: soc.h: use array instead of playback/capture_widget	Kuninori Morimoto
	snd_soc_pcm_runtime has playback/capture_widget for Codec2Codec. The naming is unclear. This patch renames it to c2c_widget and uses an array.
	  struct snd_soc_pcm_runtime {
	      ...
	  =>  struct snd_soc_dapm_widget playback_widget;
	  =>  struct snd_soc_dapm_widget capture_widget;
	      ...
	  }
	Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87pmfqv9mk.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-20	ASoC: soc.h: use defined number instead of direct number	Kuninori Morimoto
	snd_soc_pcm_runtime has dpcm for Playback/Capture, but its size is hard-coded as "2". It should use a defined constant instead.
	  struct snd_soc_pcm_runtime {
	      ...
	  =>  struct snd_soc_dpcm_runtime dpcm[2];
	      ...
	  }
	This patch fixes it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87r106v9mv.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-20	ASoC: soc.h: remove num_cpus/codecs	Kuninori Morimoto
	Current rtd has both dai_link pointer (A) and num_cpus/codecs (B).
	  (A) rtd->dai_link = dai_link;
	  (B) rtd->num_cpus = dai_link->num_cpus;
	  (B) rtd->num_codecs = dai_link->num_codecs;
	But, we can get num_cpus/codecs (B) via dai_link (A). This means we don't need to keep num_cpus/codecs on rtd. This patch removes these. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87sfkmv9n3.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-20	HID: convert defines of HID class requests into a proper enum	Benjamin Tissoires
	This allows exporting the type in BTF and so in the automatically generated vmlinux.h. It will also add some static checks on the users when we change the ll driver API (see note below). Note that we also need to change the ll_driver API, but given that this will have a wider impact outside of this tree, we leave this as a TODO for the future. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220902132938.2409206-11-benjamin.tissoires@redhat.com
2022-09-20	HID: export hid_report_type to uapi	Benjamin Tissoires
	When we are dealing with eBPF, we need to have access to the report type. Currently our implementation differs from the USB standard, making it impossible for users to know the exact value besides hardcoding it themselves. And instead of a blank define, convert it to an enum. Note that we also need to change the ll_driver API, but given that this will have a wider impact outside of this tree, we leave this as a TODO for the future. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220902132938.2409206-10-benjamin.tissoires@redhat.com
2022-09-20	HID: core: store the unique system identifier in hid_device	Benjamin Tissoires
	This unique identifier is currently used only for ensuring uniqueness in sysfs. However, it could be handy for userspace to refer to a specific hid_device by this id. Two use cases come to mind: LEDs (and their naming convention), and HID-BPF. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220902132938.2409206-9-benjamin.tissoires@redhat.com
2022-09-20	seg6: add NEXT-C-SID support for SRv6 End behavior	Andrea Mayer
	The NEXT-C-SID mechanism described in [1] offers the possibility of encoding several SRv6 segments within a single 128 bit SID address. Such a SID address is called a Compressed SID (C-SID) container. In this way, the length of the SID List can be drastically reduced.
	A SID instantiated with the NEXT-C-SID flavor considers an IPv6 address logically structured in three main blocks: i) Locator-Block; ii) Locator-Node Function; iii) Argument.
	                          C-SID container
	  +------------------------------------------------------------------+
	  |     Locator-Block      |Loc-Node|            Argument            |
	  |                        |Function|                                |
	  +------------------------------------------------------------------+
	  <--------- B -----------> <- NF -> <------------- A --------------->
	  (i) The Locator-Block can be any IPv6 prefix available to the provider;
	  (ii) The Locator-Node Function represents the node and the function to be triggered when a packet is received on the node;
	  (iii) The Argument carries the remaining C-SIDs in the current C-SID container.
	The NEXT-C-SID mechanism relies on the "flavors" framework defined in [2]. The flavors represent additional operations that can modify or extend a subset of the existing behaviors.
	This patch introduces the support for flavors in SRv6 End behavior implementing the NEXT-C-SID one. An SRv6 End behavior with NEXT-C-SID flavor works as an End behavior but it is capable of processing the compressed SID List encoded in C-SID containers.
	An SRv6 End behavior with NEXT-C-SID flavor can be configured to support user-provided Locator-Block and Locator-Node Function lengths. In this implementation, such lengths must be evenly divisible by 8 (i.e. must be byte-aligned), otherwise the kernel informs the user about invalid values with a meaningful error code and message through netlink_ext_ack.
If Locator-Block and/or Locator-Node Function lengths are not provided by the user during configuration of an SRv6 End behavior instance with NEXT-C-SID flavor, the kernel will choose their default values i.e., 32-bit Locator-Block and 16-bit Locator-Node Function. [1] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression [2] - https://datatracker.ietf.org/doc/html/rfc8986 Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
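The length rules stated above (byte alignment, 32-bit and 16-bit defaults) can be illustrated with a short sketch. `csid_layout` is a hypothetical helper. Requiring a non-empty Argument is an assumption of this sketch, not a rule quoted from the message.

```python
SID_BITS = 128            # a C-SID container is one IPv6 address
DEFAULT_LBLOCK_BITS = 32  # default Locator-Block length from the message
DEFAULT_NFUNC_BITS = 16   # default Locator-Node Function length


def csid_layout(lblock_bits: int = DEFAULT_LBLOCK_BITS,
                nfunc_bits: int = DEFAULT_NFUNC_BITS) -> tuple:
    """Return (B, NF, A) bit lengths for a C-SID container."""
    for name, bits in (("Locator-Block", lblock_bits),
                       ("Locator-Node Function", nfunc_bits)):
        if bits <= 0 or bits % 8 != 0:
            raise ValueError(f"{name} length must be a positive multiple of 8")
    arg_bits = SID_BITS - lblock_bits - nfunc_bits
    if arg_bits <= 0:  # assumption: some room must remain for the Argument
        raise ValueError("no room left for the Argument")
    return lblock_bits, nfunc_bits, arg_bits


if __name__ == "__main__":
    b, nf, a = csid_layout()
    # With the defaults, the Argument has room for a // nf further C-SIDs.
    print(b, nf, a, a // nf)
```

With the default lengths, the Argument is 80 bits wide, so one container can carry five more 16-bit C-SIDs after the active one.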
2022-09-20	clocksource/drivers/timer-ti-dm: Move struct omap_dm_timer fields to driver	Tony Lindgren
	There is no longer any need to expose the elements of struct omap_dm_timer outside the driver. The pwm and remoteproc drivers just use struct omap_dm_timer as a cookie. Let's move the elements of struct omap_dm_timer into struct dmtimer that is private to the driver. To do this, we mostly rename omap_dm_timer to dmtimer in the driver. We keep omap_dm_timer only for the exposed functions in the platform_data for the pwm and remoteproc drivers. Let's also add a note about not using the exposed functions internally as those will get deprecated eventually in favor of Linux generic frameworks. Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Link: https://lore.kernel.org/r/20220815131250.34603-8-tony@atomide.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-09-20	clocksource/drivers/timer-ti-dm: Move private defines to the driver	Tony Lindgren
	These defines are only used by timer-ti-dm driver. Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Link: https://lore.kernel.org/r/20220815131250.34603-6-tony@atomide.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-09-20	clocksource/drivers/timer-ti-dm: Simplify register access further	Tony Lindgren
	Let's unify register access and use dmtimer_read() and dmtimer_write() also for the timer revision specific registers like we now do for the shared registers. Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Link: https://lore.kernel.org/r/20220815131250.34603-5-tony@atomide.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-09-20	clocksource/drivers/timer-ti-dm: Drop unused functions	Tony Lindgren
	We still have some unused functions left, let's drop them. Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Link: https://lore.kernel.org/r/20220815131250.34603-2-tony@atomide.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-09-20	net: dsa: felix: add support for changing DSA master	Vladimir Oltean
	Changing the DSA master means different things depending on the tagging protocol in use. For NPI mode ("ocelot" and "seville"), there is a single port which can be configured as NPI, but DSA only permits changing the CPU port affinity of user ports one by one. So changing a user port to a different NPI port globally changes what the NPI port is, and breaks the user ports still using the old one. To address this while still permitting the change of the NPI port, require that the user ports which are still affine to the old NPI port are down, and cannot be brought up until they are all affine to the same NPI port. The tag_8021q mode ("ocelot-8021q") is more flexible, in that each user port can be freely assigned to one CPU port or to the other. This works by filtering host addresses towards both tag_8021q CPU ports, and then restricting the forwarding from a certain user port only to one of the two tag_8021q CPU ports. Additionally, the 2 tag_8021q CPU ports can be placed in a LAG. This works by enabling forwarding via PGID_SRC from a certain user port towards the logical port ID containing both tag_8021q CPU ports, but then restricting forwarding per packet, via the LAG hash codes in PGID_AGGR, to either one or the other. When we change the DSA master to a LAG device, DSA guarantees us that the LAG has at least one lower interface as a physical DSA master. But DSA masters can come and go as lowers of that LAG, and ds->ops->port_change_master() will not get called, because the DSA master is still the same (the LAG). So we need to hook into the ds->ops->port_lag_{join,leave} calls on the CPU ports and update the logical port ID of the LAG that user ports are assigned to. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	net: dsa: allow masters to join a LAG	Vladimir Oltean
	There are 2 ways in which a DSA user port may become handled by 2 CPU ports in a LAG:
	(1) its current DSA master joins a LAG
	    ip link del bond0 && ip link add bond0 type bond mode 802.3ad
	    ip link set eno2 master bond0
	    When this happens, all user ports with "eno2" as DSA master get automatically migrated to "bond0" as DSA master.
	(2) it is explicitly configured as such by the user
	    # Before, the DSA master was eno3
	    ip link set swp0 type dsa master bond0
	The design of this configuration is that the LAG device dynamically becomes a DSA master through dsa_master_setup() when the first physical DSA master becomes a LAG slave, and stops being so through dsa_master_teardown() when the last physical DSA master leaves. A LAG interface is considered as a valid DSA master only if it contains existing DSA masters, and no other lower interfaces. Therefore, we mainly rely on method (1) to enter this configuration.
	Each physical DSA master (LAG slave) retains its dev->dsa_ptr for when it becomes a standalone DSA master again. But the LAG master also has a dev->dsa_ptr, and this is actually duplicated from one of the physical LAG slaves, and therefore needs to be balanced when LAG slaves come and go.
	To the switch driver, putting DSA masters in a LAG is seen as putting their associated CPU ports in a LAG. We need to prepare cross-chip host FDB notifiers for CPU ports in a LAG, by calling the driver's ->lag_fdb_add method rather than ->port_fdb_add. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
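The validity rule for a LAG acting as DSA master ("contains existing DSA masters, and no other lower interfaces") reduces to a subset check. The sketch below uses interface names as plain strings, and `lag_is_valid_dsa_master` is a hypothetical name rather than a kernel function.

```python
from typing import Iterable


def lag_is_valid_dsa_master(lag_lowers: Iterable[str],
                            dsa_masters: Iterable[str]) -> bool:
    """A LAG qualifies only if it has lowers and all of them are DSA masters."""
    lowers = set(lag_lowers)
    return bool(lowers) and lowers <= set(dsa_masters)


if __name__ == "__main__":
    masters = {"eno2", "eno3"}
    print(lag_is_valid_dsa_master({"eno2"}, masters))          # only a DSA master
    print(lag_is_valid_dsa_master({"eno2", "eth5"}, masters))  # foreign lower
    print(lag_is_valid_dsa_master(set(), masters))             # empty bond
```

Under this rule a bond that contains no DSA master fails the check, which is consistent with the message's remark that method (1) is the main way into this configuration.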
2022-09-20	net: dsa: propagate extack to port_lag_join	Vladimir Oltean
	Drivers could refuse to offload a LAG configuration for a variety of reasons, mainly having to do with its TX type. Additionally, since DSA masters may now also be LAG interfaces, and this will translate into a call to port_lag_join on the CPU ports, there may be extra restrictions there. Propagate the netlink extack to this DSA method in order for drivers to give a meaningful error message back to the user. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	net: dsa: allow the DSA master to be seen and changed through rtnetlink	Vladimir Oltean
	Some DSA switches have multiple CPU ports, which can be used to improve CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(), sets up only the first one, leading to suboptimal use of hardware. The desire is to not change the default configuration but to permit the user to create a dynamic mapping between individual user ports and the CPU port that they are served by, configurable through rtnetlink. It is also intended to permit load balancing between CPU ports, and in that case, the foreseen model is for the DSA master to be a bonding interface whose lowers are the physical DSA masters. To that end, we create a struct rtnl_link_ops for DSA user ports with the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that contains the ifindex of the newly desired DSA master. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	net: dsa: introduce dsa_port_get_master()	Vladimir Oltean
	There is a desire to support DSA masters in a LAG. That configuration is intended to work by simply enslaving the master to a bonding/team device. But the physical DSA master (the LAG slave) still has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical CPU port. However, we would like to be able to retrieve the LAG that's the upper of the physical DSA master. In preparation for that, introduce a helper called dsa_port_get_master() that replaces all occurrences of the dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will be made later within the helper itself. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	net: introduce iterators over synced hw addresses	Vladimir Oltean
	Some network drivers use __dev_mc_sync()/__dev_uc_sync() and therefore program the hardware only with addresses with a non-zero sync_cnt. Some of the above drivers also need to save/restore the address filtering lists when certain events happen, and they need to walk through the struct net_device :: uc and struct net_device :: mc lists. But these lists contain unsynced addresses too. To keep the appearance of an elementary form of data encapsulation, provide iterators through these lists that only look at entries with a non-zero sync_cnt, instead of filtering entries out from device drivers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
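The idea of an iterator that hides unsynced entries can be shown with a generator. This is a model only: `HwAddr` and `for_each_synced` are hypothetical names, and the single rule taken from the message is that entries with a non-zero `sync_cnt` are the ones programmed into hardware.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class HwAddr:
    """Hypothetical address-list entry with its sync counter."""
    addr: str
    sync_cnt: int


def for_each_synced(addr_list: Iterable[HwAddr]) -> Iterator[HwAddr]:
    """Yield only the entries that have been synced to hardware."""
    for ha in addr_list:
        if ha.sync_cnt != 0:
            yield ha


if __name__ == "__main__":
    uc = [HwAddr("00:11:22:33:44:55", 1), HwAddr("00:11:22:33:44:66", 0)]
    for ha in for_each_synced(uc):
        print(ha.addr)  # a driver would save or restore this filter entry
```

Keeping the filter inside the iterator means drivers never test `sync_cnt` themselves, which is the encapsulation the message argues for.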
2022-09-20	efi/libstub: implement generic EFI zboot	Ard Biesheuvel
	Implement a minimal EFI app that decompresses the real kernel image and launches it using the firmware's LoadImage and StartImage boot services. This removes the need for any arch-specific hacks. Note that on systems that have UEFI secure boot policies enabled, LoadImage/StartImage require images to be signed, or their hashes known a priori, in order to be permitted to boot. There are various possible strategies to work around this requirement, but they all rely either on overriding internal PI/DXE protocols (which are not part of the EFI spec) or omitting the firmware provided LoadImage() and StartImage() boot services, which is also undesirable, given that they encapsulate platform specific policies related to secure boot and measured boot, but also related to memory permissions (whether or not and which types of heap allocations have both write and execute permissions.) The only generic and truly portable way around this is to simply sign both the inner and the outer image with the same key/cert pair, so this is what is implemented here. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-20	drm/plane-helper: Provide DRM_PLANE_NON_ATOMIC_FUNCS initializer macro	Thomas Zimmermann
	Provide DRM_PLANE_NON_ATOMIC_FUNCS, which initializes plane functions of non-atomic drivers to default values. The macro is not supposed to be used in new code, but helps with documenting and finding existing users. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Lyude Paul <lyude@redhat.com> # nouveau Link: https://patchwork.freedesktop.org/patch/msgid/20220909105947.6487-5-tzimmermann@suse.de
2022-09-20	drm/plane: Allocate planes with drm_universal_plane_alloc()	Thomas Zimmermann
	Provide drm_universal_plane_alloc() to allocate and initialize a plane. Code for non-atomic drivers uses this pattern. Convert them to the new function. The modeset helpers contain a quirk for handling their color formats differently. Set the flag outside plane allocation. The new function is already deprecated to some extent. Drivers should rather use drmm_universal_plane_alloc() or drm_universal_plane_init(). v2: * kerneldoc fixes (Javier) * grammar fixes in commit message Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> # nouveau Link: https://patchwork.freedesktop.org/patch/msgid/20220909105947.6487-3-tzimmermann@suse.de
2022-09-20	drm/plane: Remove drm_plane_init()	Thomas Zimmermann
	Open-code drm_plane_init() and remove the function from DRM. The implementation of drm_plane_init() is a simple wrapper around a call to drm_universal_plane_init(), so drivers can just use that instead. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Lyude Paul <lyude@redhat.com> # nouveau Acked-by: Jyri Sarha <jyri.sarha@iki.fi> Link: https://patchwork.freedesktop.org/patch/msgid/20220909105947.6487-2-tzimmermann@suse.de
2022-09-20	flow_offload: Introduce flow_match_l2tpv3	Wojciech Drewek
	Allow offloading L2TPv3 filters by adding flow_rule_match_l2tpv3. Drivers can now extract L2TPv3-specific fields. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	net/sched: flower: Add L2TPv3 filter	Wojciech Drewek
	Add support for matching on L2TPv3 session ID. Session ID can be specified only when ip proto was set to IPPROTO_L2TP. Example filter: # tc filter add dev $PF1 ingress prio 1 protocol ip \ flower \ ip_proto l2tp \ l2tpv3_sid 1234 \ skip_sw \ action mirred egress redirect dev $VF1_PR Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	flow_dissector: Add L2TPv3 dissectors	Wojciech Drewek
	Allow dissecting the L2TPv3-specific field: - session ID (32 bits) L2TPv3 may be transported over IP or over UDP; this implementation covers only L2TPv3 over IP. An IP packet carries L2TPv3 when ip_proto is IPPROTO_L2TP (115). Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
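As a rough userspace illustration of what is being dissected (not the kernel's flow dissector code), the sketch below reads the 32-bit session ID that directly follows the IPv4 header when the protocol field is 115. The function name `l2tpv3_ip_session_id()` and the macro `IPPROTO_L2TP_NUM` are hypothetical. The sketch assumes IPv4 only and ignores fragmentation and the UDP encapsulation.

```c
#include <stddef.h>
#include <stdint.h>

#define IPPROTO_L2TP_NUM 115	/* value of IPPROTO_L2TP given in the commit message */

/*
 * Hypothetical helper: extract the session ID from an L2TPv3-over-IP
 * packet that starts at the IPv4 header.
 * Returns 0 on success and -1 if the packet does not qualify.
 */
static int l2tpv3_ip_session_id(const uint8_t *pkt, size_t len, uint32_t *sid)
{
	size_t ihl;

	/* Need at least a minimal IPv4 header with version 4. */
	if (len < 20 || (pkt[0] >> 4) != 4)
		return -1;

	/* Header length is given in 32-bit words. */
	ihl = (size_t)(pkt[0] & 0x0f) * 4;
	if (ihl < 20 || len < ihl + 4)
		return -1;

	/* Byte 9 of the IPv4 header is the protocol field. */
	if (pkt[9] != IPPROTO_L2TP_NUM)
		return -1;

	/* The session ID is stored in network byte order (big endian). */
	*sid = ((uint32_t)pkt[ihl] << 24) | ((uint32_t)pkt[ihl + 1] << 16) |
	       ((uint32_t)pkt[ihl + 2] << 8) | (uint32_t)pkt[ihl + 3];
	return 0;
}
```

Assembling the value byte by byte avoids unaligned loads and host-endianness assumptions, at the cost of being more verbose than a cast plus a byte-order conversion.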
2022-09-20	uapi: move IPPROTO_L2TP to in.h	Wojciech Drewek
	IPPROTO_L2TP is currently defined in l2tp.h, but most IP protocols are defined in in.h. Move it there to keep the code clean. Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20	ALSA: hda: ext: fix locking in stream_release	Pierre-Louis Bossart
	The snd_hdac_ext_stream_release() routine uses the bus reg_lock, but releases it before calling snd_hdac_stream_release() where the bus reg_lock is taken again. This creates a timing window where the link stream release could test an invalid 'opened' boolean status and fail to recouple the host and link parts. Fix by exposing a locked version of snd_hdac_stream_release() and use it without releasing the spinlock. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20220919121041.43463-8-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-20	ALSA: hda: add snd_hdac_stop_streams() helper	Pierre-Louis Bossart
	Minor code reuse, no functionality change. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20220919121041.43463-6-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-20	ALSA: hda: Use hdac_ext prefix in snd_hdac_stream_free_all() for clarity	Pierre-Louis Bossart
	Make sure there's no ambiguity on layering with the appropriate prefix added. Pure rename, no functionality changed. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20220919121041.43463-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-20	ALSA: hda: ext: make snd_hdac_ext_stream_init() static	Pierre-Louis Bossart
	There are no external users of this helper, so make it static and remove the symbol export. No functionality change. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20220919121041.43463-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-20	ALSA: hda: make snd_hdac_stream_clear() static	Pierre-Louis Bossart
	This helper has no users outside of hdac_stream.c. External users should only use snd_hdac_stream_start() and snd_hdac_stream_stop(). No functional change beyond making the function static and removing the symbol export. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20220919121041.43463-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-19	Merge tag 'for-net-2022-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	Jakub Kicinski
	Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix HCIGETDEVINFO regression * tag 'for-net-2022-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: Fix HCIGETDEVINFO regression ==================== Link: https://lore.kernel.org/r/20220909201642.3810565-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19	lib/cpumask: deprecate nr_cpumask_bits	Yury Norov
	Cpumask code is written under the assumption that when CONFIG_CPUMASK_OFFSTACK is enabled, all cpumasks have a boot-time defined size, and that otherwise the size is always NR_CPUS. The latter is wrong because the number of possible CPUs is always calculated at boot, and it may be less than NR_CPUS. On my 4-CPU arm64 VM, nr_cpu_ids is 4, as expected, and nr_cpumask_bits is 256, which corresponds to NR_CPUS. This not only leads to useless traversal of cpumask bits beyond the first 4, it also makes some cpumask routines fail. For example, cpumask_full(0b1111000..000) would erroneously return false in the example above because the tail bits in the mask are all unset. This patch deprecates nr_cpumask_bits and wires it to nr_cpu_ids unconditionally, so that cpumask routines will not waste time traversing the unused part of CPU masks. It also fixes cpumask_full() and similar routines. As a side effect, because the length of cpumasks is now defined at run time even if CPUMASK_OFFSTACK is disabled, the compiler can't optimize the corresponding functions. This increases kernel size by ~2.5KB if OFFSTACK is off. This is addressed in the following patch. Signed-off-by: Yury Norov <yury.norov@gmail.com>
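The cpumask_full() failure described above can be reproduced with a small userspace model. This is a sketch only and does not use the kernel's bitmap code: `struct mask`, `mask_set()`, `mask_test()` and `mask_full()` are hypothetical helpers, and the number of bits to check is passed explicitly so the two cases (256 versus 4) can be compared.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS		256	/* compile-time maximum, as in the example above */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS	((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for a CPU mask: a fixed bitmap of NR_CPUS bits. */
struct mask {
	unsigned long bits[MASK_LONGS];
};

static void mask_set(struct mask *m, unsigned int cpu)
{
	m->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static bool mask_test(const struct mask *m, unsigned int cpu)
{
	return (m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}

/* True when the first nbits bits of the mask are all set. */
static bool mask_full(const struct mask *m, unsigned int nbits)
{
	unsigned int i;

	for (i = 0; i < nbits; i++)
		if (!mask_test(m, i))
			return false;
	return true;
}
```

With bits 0 to 3 set, `mask_full(&m, 256)` reports false because the tail bits are clear, while `mask_full(&m, 4)` reports true. That mirrors the difference between checking up to the NR_CPUS value and checking up to nr_cpu_ids.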
2022-09-19	lib/cpumask: delete misleading comment	Yury Norov
	The comment says that HOTPLUG config option enables all cpus in cpu_possible_mask up to NR_CPUs. This is wrong. Even if HOTPLUG is enabled, the mask is populated on boot with respect to ACPI/DT records. Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-19	smp: add set_nr_cpu_ids()	Yury Norov
	In preparation to support compile-time nr_cpu_ids, add a setter for the variable. This is a no-op for all arches. Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-19	Merge tag 'ib-mfd-net-pinctrl-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd	Jakub Kicinski
	Lee Jones says: ==================== Immutable branch between MFD, Net and Pinctrl due for the v6.0 merge window * tag 'ib-mfd-net-pinctrl-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: ocelot: Add support for the vsc7512 chip via spi dt-bindings: mfd: ocelot: Add bindings for VSC7512 resource: add define macro for register address resources pinctrl: microchip-sgpio: add ability to be used in a non-mmio configuration pinctrl: microchip-sgpio: allow sgpio driver to be used as a module pinctrl: ocelot: add ability to be used in a non-mmio configuration net: mdio: mscc-miim: add ability to be used in a non-mmio configuration mfd: ocelot: Add helper to get regmap from a resource ==================== Link: https://lore.kernel.org/r/YxrjyHcceLOFlT/c@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19	drm/amdgpu: add MES and MES-KIQ version in debugfs	Yifan Zhang
	This patch adds the MES and MES-KIQ versions to debugfs. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19	drm/amdgpu: add two new subquery ids	Hawking Zhang
	Add two subquery IDs to support querying the rlcp and rlcv firmware versions through the existing AMDGPU_INFO_FW_VERSION interface. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>