Age  Commit message  Author
2023-10-27  Merge branch 'topic/ppc-kvm' into next  Michael Ellerman
Merge our KVM topic branch, this has been independently included in linux-next for most of the development cycle.
2023-10-27  futex: Don't include process MM in futex key on no-MMU  Ben Wolsieffer
On no-MMU, all futexes are treated as private because there is no need to map a virtual address to physical to match the futex across processes. This doesn't quite work though, because private futexes include the current process's mm_struct as part of their key. This makes it impossible for one process to wake up a shared futex being waited on in another process.

Fix this bug by excluding the mm_struct from the key. With a single address space, the futex address is already a unique key.

Fixes: 784bdf3bb694 ("futex: Assume all mappings are private on !MMU systems")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/r/20231019204548.1236437-2-ben.wolsieffer@hefring.com
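The effect of the fix can be illustrated with a small userspace model. This is only a sketch: `FutexKey`, `private_key` and the integer `mm_id` values are hypothetical stand-ins, not kernel structures. It shows that a key that includes the process identity never matches across processes, while an address-only key does.

```python
from typing import NamedTuple, Optional


class FutexKey(NamedTuple):
    """Hypothetical model of a private futex key (not the kernel's layout)."""
    mm: Optional[int]  # identity of the owning process's mm, or None if excluded
    address: int       # address of the futex word


def private_key(mm_id: int, address: int, include_mm: bool) -> FutexKey:
    # include_mm=True models the old behaviour, False models the fixed no-MMU one.
    return FutexKey(mm_id if include_mm else None, address)


FUTEX_ADDR = 0x20001000  # same futex word seen by both processes (single address space)

# Old behaviour: waiter (process 1) and waker (process 2) compute different keys.
waiter_old = private_key(1, FUTEX_ADDR, include_mm=True)
waker_old = private_key(2, FUTEX_ADDR, include_mm=True)

# Fixed behaviour: the address alone identifies the futex.
waiter_new = private_key(1, FUTEX_ADDR, include_mm=False)
waker_new = private_key(2, FUTEX_ADDR, include_mm=False)
```

With the old keys the waker looks up a key that no waiter is queued on, so the wake-up is lost; with the new keys both sides agree.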
2023-10-27  Merge branch 'mdb-get'  David S. Miller
Ido Schimmel says:

====================
Add MDB get support

This patchset adds MDB get support, allowing user space to request a single MDB entry to be retrieved instead of dumping the entire MDB. Support is added in both the bridge and VXLAN drivers.

Patches #1-#6 are small preparations in both drivers.

Patches #7-#8 add the required uAPI attributes for the new functionality and the MDB get net device operation (NDO), respectively.

Patches #9-#10 implement the MDB get NDO in both drivers.

Patch #11 registers a handler for RTM_GETMDB messages in rtnetlink core. The handler derives the net device from the ifindex specified in the ancillary header and invokes its MDB get NDO.

Patches #12-#13 add selftests by converting tests that use MDB dump with grep to the new MDB get functionality.

iproute2 changes can be found here [1].

v2:
* Patch #7: Add a comment to describe attributes structure.
* Patch #9: Add a comment above spin_lock_bh().

[1] https://github.com/idosch/iproute2/tree/submit/mdb_get_v1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  selftests: vxlan_mdb: Use MDB get instead of dump  Ido Schimmel
Test the new MDB get functionality by converting dump and grep to MDB get. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  selftests: bridge_mdb: Use MDB get instead of dump  Ido Schimmel
Test the new MDB get functionality by converting dump and grep to MDB get. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  rtnetlink: Add MDB get support  Ido Schimmel
Now that both the bridge and VXLAN drivers implement the MDB get net device operation, expose the functionality to user space by registering a handler for RTM_GETMDB messages. Derive the net device from the ifindex specified in the ancillary header and invoke its MDB get NDO. Note that unlike other get handlers, the allocation of the skb containing the response is not performed in the common rtnetlink code as the size is variable and needs to be determined by the respective driver. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  vxlan: mdb: Add MDB get support  Ido Schimmel
Implement support for MDB get operation by looking up a matching MDB entry, allocating the skb according to the entry's size and then filling in the response. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  bridge: mcast: Add MDB get support  Ido Schimmel
Implement support for MDB get operation by looking up a matching MDB entry, allocating the skb according to the entry's size and then filling in the response. The operation is performed under the bridge multicast lock to ensure that the entry does not change between the time the reply size is determined and when the reply is filled in. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net: Add MDB get device operation  Ido Schimmel
Add MDB net device operation that will be invoked by rtnetlink code in response to received RTM_GETMDB messages. Subsequent patches will implement the operation in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  bridge: add MDB get uAPI attributes  Ido Schimmel
Add MDB get attributes that correspond to the MDB set attributes used in RTM_NEWMDB messages. Specifically, add 'MDBA_GET_ENTRY' which will hold a 'struct br_mdb_entry' and 'MDBA_GET_ENTRY_ATTRS' which will hold 'MDBE_ATTR_*' attributes that are used as indexes (source IP and source VNI).

An example request will look as follows:

[ struct nlmsghdr ]
[ struct br_port_msg ]
[ MDBA_GET_ENTRY ]
	struct br_mdb_entry
[ MDBA_GET_ENTRY_ATTRS ]
	[ MDBE_ATTR_SOURCE ]
		struct in_addr / struct in6_addr
	[ MDBE_ATTR_SRC_VNI ]
		u32

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
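The nesting in the example request can be sketched with plain netlink attribute (TLV) packing. This is a hedged illustration, not iproute2's code: the numeric attribute types below are placeholders chosen for the example (the real values live in the kernel uAPI headers), `build_get_attrs` is a hypothetical helper, and the `struct br_mdb_entry` payload is passed in as opaque bytes. Only the `struct nlattr` framing (16-bit length, 16-bit type, 4-byte alignment, `NLA_F_NESTED` flag) is standard netlink. The `nlmsghdr` and `br_port_msg` headers that precede the attributes are omitted.

```python
import socket
import struct

NLA_HDRLEN = 4          # struct nlattr: u16 nla_len, u16 nla_type
NLA_F_NESTED = 1 << 15  # marks an attribute whose payload is more attributes

# Placeholder type numbers for illustration only (assumption, check the uAPI headers).
MDBA_GET_ENTRY = 1
MDBA_GET_ENTRY_ATTRS = 2
MDBE_ATTR_SOURCE = 1
MDBE_ATTR_SRC_VNI = 2


def nla(attr_type: int, payload: bytes) -> bytes:
    """Pack one attribute; nla_len covers header + payload, padding is extra."""
    length = NLA_HDRLEN + len(payload)
    padding = -length % 4
    return struct.pack("=HH", length, attr_type) + payload + b"\x00" * padding


def nested(attr_type: int, *children: bytes) -> bytes:
    """Pack a nest: an attribute whose payload is a sequence of attributes."""
    return nla(attr_type | NLA_F_NESTED, b"".join(children))


def build_get_attrs(entry: bytes, source_ip: str, src_vni: int) -> bytes:
    """Attribute part of an MDB get request for an IPv4 source (hypothetical helper)."""
    return nla(MDBA_GET_ENTRY, entry) + nested(
        MDBA_GET_ENTRY_ATTRS,
        nla(MDBE_ATTR_SOURCE, socket.inet_aton(source_ip)),  # struct in_addr
        nla(MDBE_ATTR_SRC_VNI, struct.pack("=I", src_vni)),  # u32
    )


# 8 zero bytes stand in for a real struct br_mdb_entry here.
attrs = build_get_attrs(bytes(8), "192.0.2.1", 100)
```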
2023-10-27  vxlan: mdb: Factor out a helper for remote entry size calculation  Ido Schimmel
Currently, netlink notifications are sent for individual remote entries and not for the entire MDB entry itself. Subsequent patches are going to add MDB get support which will require the VXLAN driver to reply with an entire MDB entry. Therefore, as a preparation, factor out a helper to calculate the size of an individual remote entry. When determining the size of the reply this helper will be invoked for each remote entry in the MDB entry. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  vxlan: mdb: Adjust function arguments  Ido Schimmel
Adjust the function's arguments and rename it to allow it to be reused by future call sites that only have access to 'struct vxlan_mdb_entry_key', but not to 'struct vxlan_mdb_config'. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  bridge: mcast: Rename MDB entry get function  Ido Schimmel
The current name is going to conflict with the upcoming net device operation for the MDB get operation. Rename the function to br_mdb_entry_skb_get(). No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  bridge: mcast: Factor out a helper for PG entry size calculation  Ido Schimmel
Currently, netlink notifications are sent for individual port group entries and not for the entire MDB entry itself. Subsequent patches are going to add MDB get support which will require the bridge driver to reply with an entire MDB entry. Therefore, as a preparation, factor out a helper to calculate the size of an individual port group entry. When determining the size of the reply this helper will be invoked for each port group entry in the MDB entry.

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
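The idea of a per-entry size helper that is summed over all port group entries can be sketched as follows. The alignment arithmetic mirrors standard netlink attribute sizing (4-byte header, 4-byte alignment); the per-entry attribute layout and the function names `pg_entry_size` and `mdb_entry_reply_size` are hypothetical and do not reproduce the bridge driver's helpers.

```python
NLA_HDRLEN = 4  # size of struct nlattr


def nla_align(length: int) -> int:
    """Round up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def nla_total_size(payload_len: int) -> int:
    """Bytes one attribute occupies, header and padding included."""
    return nla_align(NLA_HDRLEN + payload_len)


def pg_entry_size(attr_payload_lens: list) -> int:
    """Size of one port group entry: a nest header plus its attributes."""
    return nla_total_size(0) + sum(nla_total_size(n) for n in attr_payload_lens)


def mdb_entry_reply_size(port_groups: list) -> int:
    """Reply size for a whole MDB entry: the helper is invoked once per port group."""
    return sum(pg_entry_size(pg) for pg in port_groups)


# Two port groups: one carrying a 4-byte and a 16-byte attribute, one with a single
# 4-byte attribute (payload sizes are made up for the example).
reply_size = mdb_entry_reply_size([[4, 16], [4]])
```

Sizing the reply this way is what lets the skb be allocated before it is filled in.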
2023-10-27  bridge: mcast: Account for missing attributes  Ido Schimmel
The 'MDBA_MDB' and 'MDBA_MDB_ENTRY' nest attributes are not accounted for when calculating the size of MDB notifications. Add them along with comments for existing attributes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  bridge: mcast: Dump MDB entries even when snooping is disabled  Ido Schimmel
Currently, the bridge driver does not dump MDB entries when multicast snooping is disabled although the entries are present in the kernel:

 # bridge mdb add dev br0 port swp1 grp 239.1.1.1 permanent
 # bridge mdb show dev br0
 dev br0 port swp1 grp 239.1.1.1 permanent
 dev br0 port br0 grp ff02::6a temp
 dev br0 port br0 grp ff02::1:ff9d:e61b temp
 # ip link set dev br0 type bridge mcast_snooping 0
 # bridge mdb show dev br0
 # ip link set dev br0 type bridge mcast_snooping 1
 # bridge mdb show dev br0
 dev br0 port swp1 grp 239.1.1.1 permanent
 dev br0 port br0 grp ff02::6a temp
 dev br0 port br0 grp ff02::1:ff9d:e61b temp

This behavior differs from other netlink dump interfaces that dump entries regardless of whether they are used or not. For example, VLANs are dumped even when VLAN filtering is disabled:

 # ip link set dev br0 type bridge vlan_filtering 0
 # bridge vlan show dev swp1
 port              vlan-id
 swp1              1 PVID Egress Untagged

Remove the check and always dump MDB entries:

 # bridge mdb add dev br0 port swp1 grp 239.1.1.1 permanent
 # bridge mdb show dev br0
 dev br0 port swp1 grp 239.1.1.1 permanent
 dev br0 port br0 grp ff02::6a temp
 dev br0 port br0 grp ff02::1:ffeb:1a4d temp
 # ip link set dev br0 type bridge mcast_snooping 0
 # bridge mdb show dev br0
 dev br0 port swp1 grp 239.1.1.1 permanent
 dev br0 port br0 grp ff02::6a temp
 dev br0 port br0 grp ff02::1:ffeb:1a4d temp
 # ip link set dev br0 type bridge mcast_snooping 1
 # bridge mdb show dev br0
 dev br0 port swp1 grp 239.1.1.1 permanent
 dev br0 port br0 grp ff02::6a temp
 dev br0 port br0 grp ff02::1:ffeb:1a4d temp

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  Merge tag 'thunderbolt-for-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next  Greg Kroah-Hartman
Mika writes:

thunderbolt: Changes for v6.7 merge window

This includes following USB4/Thunderbolt changes for the v6.7 merge window:

  - Configure asymmetric link if the DisplayPort bandwidth requires so
  - Enable path power management packet support for USB4 v2 routers
  - Make the bandwidth reservations to follow the USB4 v2 connection manager guide suggestions
  - DisplayPort tunneling improvements
  - Small cleanups and improvements around the driver.

All these have been in linux-next with no reported issues.

* tag 'thunderbolt-for-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (25 commits)
  thunderbolt: Fix one kernel-doc comment
  thunderbolt: Configure asymmetric link if needed and bandwidth allows
  thunderbolt: Add support for asymmetric link
  thunderbolt: Introduce tb_switch_depth()
  thunderbolt: Introduce tb_for_each_upstream_port_on_path()
  thunderbolt: Introduce tb_port_path_direction_downstream()
  thunderbolt: Set path power management packet support bit for USB4 v2 routers
  thunderbolt: Change bandwidth reservations to comply USB4 v2
  thunderbolt: Make is_gen4_link() available to the rest of the driver
  thunderbolt: Use weight constants in tb_usb3_consumed_bandwidth()
  thunderbolt: Use constants for path weight and priority
  thunderbolt: Add DP IN added last in the head of the list of DP resources
  thunderbolt: Create multiple DisplayPort tunnels if there are more DP IN/OUT pairs
  thunderbolt: Log NVM version of routers and retimers
  thunderbolt: Use tb_tunnel_xxx() log macros in tb.c
  thunderbolt: Expose tb_tunnel_xxx() log macros to the rest of the driver
  thunderbolt: Use tb_tunnel_dbg() where possible to make logging more consistent
  thunderbolt: Fix typo of HPD bit for Hot Plug Detect
  thunderbolt: Fix typo in enum tb_link_width kernel-doc
  thunderbolt: Fix debug log when DisplayPort adapter not available for pairing
  ...
2023-10-27  Merge tag 'extcon-next-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next  Greg Kroah-Hartman
Chanwoo writes:

Update extcon next for v6.7

Detailed description for this pull request:
- Add new Realtek DHC (Digital Home Hub) RTD SoC external connector driver: it detects USB Type-C cables for the USB and USB_HOST cable types and supports the USB Type-C connector class. The extcon-rtk-type-c.c driver supports the following Realtek RTD SoCs:
  - realtek,rtd1295-type-c
  - realtek,rtd1312c-type-c
  - realtek,rtd1315e-type-c
  - realtek,rtd1319-type-c
  - realtek,rtd1319d-type-c
  - realtek,rtd1395-type-c
  - realtek,rtd1619-type-c
  - realtek,rtd1619b-type-c
- Add device-tree compatible string for extcon-max77693 and extcon-max77843.

* tag 'extcon-next-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
  extcon: realtek: add the error handler for nvmem_cell_read
  extcon: max77843: add device-tree compatible string
  extcon: max77693: add device-tree compatible string
  dt-bindings: usb: Add Realtek DHC RTD SoC Type-C
  extcon: add Realtek DHC RTD SoC Type-C driver
2023-10-27  Merge branch 'tcp-ao'  David S. Miller
Dmitry Safonov says:

====================
net/tcp: Add TCP-AO support

This is version 16 of TCP-AO support. It addresses the build warning in the middle of the patch set, reported by the kernel test robot.

There's one Sparse warning introduced by tcp_sigpool_start(): __cond_acquires() seems to be currently broken. I've described the reasoning for it in the v9 cover letter. Also, checkpatch.pl warnings were addressed, but I've left the ones that are more personal preferences (i.e. the 80 columns limit). Please ping me if you have a strong feeling about one of them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  Documentation/tcp: Add TCP-AO documentation  Dmitry Safonov
It has Frequently Asked Questions (FAQ) on RFC 5925 - I found it very useful answering those before writing the actual code. It provides answers to common questions that arise on a quick read of the RFC, as well as how they were answered. There's also comparison to TCP-MD5 option, evaluation of per-socket vs in-kernel-DB approaches and description of uAPI provided. Hopefully, it will be as useful for reviewing the code as it was for writing. Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add TCP_AO_REPAIR  Dmitry Safonov
Add TCP_AO_REPAIR setsockopt(), getsockopt(). They let a user repair TCP-AO ISNs/SNEs. Also let the user hack around when (tp->repair) is on and add ao_info on a socket in any supported state. As SNEs can now be read/written at any moment, use WRITE_ONCE()/READ_ONCE() to set/read them.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Wire up l3index to TCP-AO  Dmitry Safonov
Similarly to how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds the MKT to a specified L3 ifindex. Similarly, without this flag the key will work in the default VRF (l3index = 0) for connections.

To prevent AO-keys from overlapping, it is not allowed to add key B to a socket that has key A when both have the same sndid/rcvid and one of the following is true:
- !(A.keyflags & TCP_AO_KEYF_IFINDEX) or !(B.keyflags & TCP_AO_KEYF_IFINDEX)
  so that any key is non-bound to a VRF
- A.l3index == B.l3index
  both want to work for the same VRF

Additionally, matching against TCP-MD5 keys for the same peer is restricted in the following way:

|--------------|--------------------|----------------|---------------|
|              | MD5 key without    |     MD5 key    |    MD5 key    |
|              |     l3index        |    l3index=0   |   l3index=N   |
|--------------|--------------------|----------------|---------------|
|  TCP-AO key  |                    |                |               |
|  without     |       reject       |    reject      |   reject      |
|  l3index     |                    |                |               |
|--------------|--------------------|----------------|---------------|
|  TCP-AO key  |                    |                |               |
|  l3index=0   |       reject       |    reject      |   allow       |
|--------------|--------------------|----------------|---------------|
|  TCP-AO key  |                    |                |               |
|  l3index=N   |       reject       |    allow       |   reject      |
|--------------|--------------------|----------------|---------------|

This is done with the help of tcp_md5_do_lookup_any_l3index() to reject adding an AO key without TCP_AO_KEYF_IFINDEX if there's TCP-MD5 in any VRF. This is important for the case where sysctl_tcp_l3mdev_accept = 1.

Similarly, for TCP-AO lookups tcp_ao_do_lookup() may be used with l3index < 0, so that __tcp_ao_key_cmp() will match a TCP-AO key in any VRF.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
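The two restrictions above can be expressed as small predicates. This is a sketch under stated assumptions, not the kernel's logic: `AoKey`, `ao_keys_overlap` and `md5_blocks_ao` are hypothetical names, `l3index=None` models a key without the IFINDEX flag, and the case of two different non-zero l3index values (not shown in the table) is assumed to be allowed because the keys live in different VRFs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AoKey:
    """Hypothetical model of an AO key; l3index=None means TCP_AO_KEYF_IFINDEX is unset."""
    sndid: int
    rcvid: int
    l3index: Optional[int] = None


def ao_keys_overlap(a: AoKey, b: AoKey) -> bool:
    """True when key B must not be added next to key A."""
    if (a.sndid, a.rcvid) != (b.sndid, b.rcvid):
        return False
    if a.l3index is None or b.l3index is None:
        return True  # at least one key is not bound to a VRF
    return a.l3index == b.l3index  # both want to work for the same VRF


def md5_blocks_ao(ao_l3index: Optional[int], md5_l3index: Optional[int]) -> bool:
    """True when an existing TCP-MD5 key for the peer rejects the new TCP-AO key."""
    if ao_l3index is None or md5_l3index is None:
        return True  # an unbound key on either side matches every VRF
    return ao_l3index == md5_l3index
```

For example, `md5_blocks_ao(0, 5)` is `False` (the "allow" cell for AO l3index=0 against MD5 l3index=N), while `md5_blocks_ao(None, 5)` is `True`.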
2023-10-27  net/tcp: Add static_key for TCP-AO  Dmitry Safonov
Similarly to TCP-MD5, add a static key to TCP-AO that is patched out when there are no keys on a machine and dynamically enabled when the first setsockopt(TCP_AO) adds a key on any socket. The static key is likewise dynamically disabled later when the socket is destructed.

The lifetime of the enabled static key here is the same as that of ao_info: it is enabled on allocation, passed over from the full socket to twsk and destructed when ao_info is scheduled for destruction.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)  Dmitry Safonov
Delete becomes very, very fast - almost free, but after setsockopt() syscall returns, the key is still alive until next RCU grace period. Which is fine for listen sockets as userspace needs to be aware of setsockopt(TCP_AO) and accept() race and resolve it with verification by getsockopt() after TCP connection was accepted.

The benchmark results (on non-loaded box, worse with more RCU work pending):
> ok 33 Worst case delete 16384 keys: min=5ms max=10ms mean=6.93904ms stddev=0.263421
> ok 34 Add a new key 16384 keys: min=1ms max=4ms mean=2.17751ms stddev=0.147564
> ok 35 Remove random-search 16384 keys: min=5ms max=10ms mean=6.50243ms stddev=0.254999
> ok 36 Remove async 16384 keys: min=0ms max=0ms mean=0.0296107ms stddev=0.0172078

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add TCP-AO getsockopt()s  Dmitry Safonov
Introduce getsockopt(TCP_AO_GET_KEYS) that lets a user get TCP-AO keys and their properties from a socket. The user can provide a filter to match the specific key to be dumped or ::get_all = 1 may be used to dump all keys in one syscall. Add another getsockopt(TCP_AO_INFO) for providing per-socket/per-ao_info stats: packet counters, Current_key/RNext_key and flags like ::ao_required and ::accept_icmps. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add option for TCP-AO to (not) hash header  Dmitry Safonov
Provide a setsockopt() key flag that makes TCP-AO exclude TCP header options from hashing for peers that match the key. This is needed for interaction with middleboxes that may change TCP options, see RFC5925 (9.2).

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Ignore specific ICMPs for TCP-AO connections  Dmitry Safonov
Similarly to IPsec, RFC5925 prescribes:
  ">> A TCP-AO implementation MUST default to ignore incoming ICMPv4 messages of Type 3 (destination unreachable), Codes 2-4 (protocol unreachable, port unreachable, and fragmentation needed -- 'hard errors'), and ICMPv6 Type 1 (destination unreachable), Code 1 (administratively prohibited) and Code 4 (port unreachable) intended for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs."

A selftest (later in patch series) verifies that this attack is not possible in this TCP-AO implementation.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
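The quoted RFC rule translates directly into a predicate. This is a minimal sketch of the rule as quoted, not the kernel's code path: `should_ignore_icmp` and its parameters are hypothetical, and `accept_icmps` models a per-socket opt-out from the default (the RFC only mandates the default).

```python
# Synchronized TCP states listed in the quoted RFC text.
SYNCHRONIZED_STATES = frozenset({
    "ESTABLISHED", "FIN-WAIT-1", "FIN-WAIT-2", "CLOSE-WAIT",
    "CLOSING", "LAST-ACK", "TIME-WAIT",
})


def should_ignore_icmp(ip_version: int, icmp_type: int, icmp_code: int,
                       tcp_state: str, matches_mkt: bool,
                       accept_icmps: bool = False) -> bool:
    """Return True if the ICMP error must be ignored for this connection."""
    if accept_icmps or not matches_mkt:
        return False
    if tcp_state not in SYNCHRONIZED_STATES:
        return False
    if ip_version == 4:
        # Type 3 (destination unreachable), Codes 2-4: the 'hard errors'.
        return icmp_type == 3 and 2 <= icmp_code <= 4
    if ip_version == 6:
        # Type 1 (destination unreachable), Code 1 or Code 4.
        return icmp_type == 1 and icmp_code in (1, 4)
    return False
```

A forged "port unreachable" aimed at an established TCP-AO connection is therefore dropped instead of tearing the connection down.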
2023-10-27  net/tcp: Add tcp_hash_fail() ratelimited logs  Dmitry Safonov
Add a helper for logging connection-detailed messages for failed TCP hash verification (both MD5 and AO). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add TCP-AO SNE support  Dmitry Safonov
Add Sequence Number Extension (SNE) for TCP-AO. This is needed to protect long-living TCP-AO connections from replaying attacks after sequence number roll-over, see RFC5925 (6.2). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
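To make the roll-over handling concrete, here is one way to derive the SNE that belongs to a given segment from the SNE and sequence number the connection currently expects. It is a sketch built on ordinary 32-bit wrap-around comparison; the function names and the exact decision rule are assumptions for illustration and are not claimed to match the patch.

```python
MASK32 = 0xFFFFFFFF


def before(seq1: int, seq2: int) -> bool:
    """True if seq1 precedes seq2 in 32-bit wrap-around sequence space."""
    return ((seq1 - seq2) & MASK32) >= 0x80000000


def compute_sne(next_sne: int, next_seq: int, seq: int) -> int:
    """SNE to use for a segment with sequence number seq.

    next_sne/next_seq describe the point the connection has reached.
    """
    sne = next_sne
    if before(seq, next_seq):
        if seq > next_seq:
            # Older segment that is numerically larger: sent before the roll-over.
            sne = (sne - 1) & MASK32
    elif seq < next_seq:
        # Newer segment that is numerically smaller: sequence space has wrapped.
        sne = (sne + 1) & MASK32
    return sne


def extended_seq(sne: int, seq: int) -> int:
    """64-bit extended sequence number, which is what gets protected by the MAC."""
    return (sne << 32) | seq
```

Because the extended value never repeats within a connection, a segment captured before the roll-over cannot be replayed after it with a valid MAC.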
2023-10-27  net/tcp: Add TCP-AO segments counters  Dmitry Safonov
Introduce segment counters that are useful for troubleshooting/debugging as well as for writing tests. Now there are global snmp counters as well as per-socket and per-key. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Verify inbound TCP-AO signed segments  Dmitry Safonov
Now there is a common function to verify signature on TCP segments: tcp_inbound_hash(). It has checks for all possible cross-interactions with MD5 signs as well as with unsigned segments.

The rules from RFC5925 are:
(1) Any TCP segment can have at max only one signature.
(2) TCP connections can't switch between using TCP-MD5 and TCP-AO.
(3) TCP-AO connections can't stop using AO, as well as unsigned connections can't suddenly start using AO.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
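A strict reading of the three rules can be sketched as a check on which signature option a segment carries versus what the connection uses. This is a simplification for illustration: `Auth`, `segment_auth` and `option_check_passes` are hypothetical names, the real function handles more cases, and verifying the signature itself is a separate step not shown here.

```python
from enum import Enum
from typing import Optional


class Auth(Enum):
    NONE = "unsigned"
    MD5 = "tcp-md5"
    AO = "tcp-ao"


def segment_auth(has_md5_option: bool, has_ao_option: bool) -> Optional[Auth]:
    """Classify a segment; None means it violates rule (1)."""
    if has_md5_option and has_ao_option:
        return None  # more than one signature in a single segment
    if has_md5_option:
        return Auth.MD5
    if has_ao_option:
        return Auth.AO
    return Auth.NONE


def option_check_passes(connection_auth: Auth, has_md5_option: bool,
                        has_ao_option: bool) -> bool:
    """Rules (2) and (3): the segment must use exactly what the connection uses."""
    seg = segment_auth(has_md5_option, has_ao_option)
    return seg is not None and seg is connection_auth
```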
2023-10-27  net/tcp: Sign SYN-ACK segments with TCP-AO  Dmitry Safonov
Similarly to RST segments, wire SYN-ACKs to TCP-AO. tcp_rsk_used_ao() is handy here to check if the request socket used AO and needs a signature on the outgoing segments. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Wire TCP-AO to request sockets  Dmitry Safonov
Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add TCP-AO sign to twsk  Dmitry Safonov
Add support for sockets in time-wait state. ao_info as well as all keys are inherited on transition to time-wait socket. The lifetime of ao_info is now protected by ref counter, so that tcp_ao_destroy_sock() will destruct it only when the last user is gone. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add AO sign to RST packets  Dmitry Safonov
Wire up sending resets to TCP-AO hashing. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add tcp_parse_auth_options()  Dmitry Safonov
Introduce a helper that:
(1) shares the common code with TCP-MD5 header options parsing
(2) looks for hash signature only once for both TCP-MD5 and TCP-AO
(3) fails with -EEXIST if any TCP sign option is present twice, see RFC5925 (2.2):
    ">> A single TCP segment MUST NOT have more than one TCP-AO in its options sequence. When multiple TCP-AOs appear, TCP MUST discard the segment."

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
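A single-pass scan of the TCP options area in the spirit of this helper might look like the sketch below. Option kinds 19 (TCP-MD5) and 29 (TCP-AO) are the IANA-assigned values; everything else is an assumption for illustration: `parse_auth_options` and `DuplicateSignatureError` are hypothetical, any second signature option is treated as the duplicate case, and malformed lengths simply stop the scan.

```python
from typing import Optional, Tuple

TCPOPT_EOL = 0
TCPOPT_NOP = 1
TCPOPT_MD5SIG = 19
TCPOPT_AO = 29


class DuplicateSignatureError(Exception):
    """Stand-in for the -EEXIST result: the segment must be discarded."""


def parse_auth_options(options: bytes) -> Tuple[Optional[bytes], Optional[bytes]]:
    """Return (md5_body, ao_body) found in one pass over the options bytes."""
    md5 = ao = None
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option header
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed length
        if kind in (TCPOPT_MD5SIG, TCPOPT_AO):
            if md5 is not None or ao is not None:
                raise DuplicateSignatureError("more than one TCP signature option")
            body = options[i + 2:i + length]
            if kind == TCPOPT_MD5SIG:
                md5 = body
            else:
                ao = body  # KeyID, RNextKeyID, then the MAC
        i += length
    return md5, ao


# NOP, NOP, then a TCP-AO option with KeyID=5, RNextKeyID=6 and a 12-byte MAC.
single = bytes([1, 1, 29, 16, 5, 6]) + bytes(12)
md5_found, ao_found = parse_auth_options(single)
```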
2023-10-27  net/tcp: Add TCP-AO sign to outgoing packets  Dmitry Safonov
Using precalculated traffic keys, sign TCP segments as prescribed by RFC5925. Per RFC, TCP header options are included in sign calculation: "The TCP header, by default including options, and where the TCP checksum and TCP-AO MAC fields are set to zero, all in network-byte order." (5.1.3)

tcp_ao_hash_header() has an exclude_options parameter to optionally exclude TCP header options from the hash calculation, as described in RFC5925 (9.1). This is needed for interaction with middleboxes that may change "some TCP options". This is wired up to AO key flags and setsockopt() later.

Similarly to TCP-MD5, hash TCP segment fragments.

From this moment a user can start sending TCP-AO signed segments with one of the crypto ahash algorithms supported by the Linux kernel. It can have a user-specified MAC length, to either save TCP option header space or provide higher protection using a longer signature. The inbound segments are not yet verified, TCP-AO option is ignored and they are accepted.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
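The signing step can be sketched in userspace with Python's `hmac` module. This is an illustration under assumptions, not the kernel's implementation: `tcp_ao_mac` and `zero_checksum_and_mac` are hypothetical helpers, the caller supplies the pseudo-header bytes and the offset of the MAC inside the TCP header, and HMAC-SHA1 with a 12-byte MAC is used as the example algorithm. The SNE is prepended to the hashed data, as RFC5925 describes for the MAC input.

```python
import hmac
import struct


def zero_checksum_and_mac(tcp_header: bytes, mac_offset: int, maclen: int) -> bytes:
    """Copy of the header with the TCP checksum and the TCP-AO MAC field zeroed."""
    header = bytearray(tcp_header)
    header[16:18] = b"\x00\x00"  # TCP checksum sits at bytes 16-17
    header[mac_offset:mac_offset + maclen] = bytes(maclen)
    return bytes(header)


def tcp_ao_mac(traffic_key: bytes, sne: int, pseudo_header: bytes,
               tcp_header: bytes, payload: bytes, mac_offset: int,
               maclen: int = 12, digest: str = "sha1") -> bytes:
    """MAC over SNE || pseudo-header || TCP header (zeroed fields) || payload."""
    header = zero_checksum_and_mac(tcp_header, mac_offset, maclen)
    message = struct.pack("!I", sne) + pseudo_header + header + payload
    # Truncate to the user-selected MAC length.
    return hmac.new(traffic_key, message, digest).digest()[:maclen]


# 20-byte fixed header followed by a TCP-AO option (kind 29, length 16).
header = bytes(20) + bytes([29, 16, 1, 2]) + bytes(12)
mac = tcp_ao_mac(b"traffic-key", 0, bytes(12), header, b"data", mac_offset=24)
```

A shorter `maclen` saves option space, a longer one (up to the digest size) gives a stronger signature, which is the trade-off the commit message mentions.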
2023-10-27  net/tcp: Calculate TCP-AO traffic keys  Dmitry Safonov
Add traffic key calculation the way it's described in RFC5926. Wire it up to tcp_finish_connect() and cache the new keys straight away on already established TCP connections. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
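As a rough illustration of the derivation, the sketch below follows my reading of the KDF_HMAC_SHA1 construction in RFC5926: the traffic key is HMAC-SHA1 of the master key over a counter byte, the label "TCP-AO", the connection context and the output length in bits. The helper names are hypothetical, only IPv4 is handled, and the byte layout should be checked against the RFCs before relying on it.

```python
import hashlib
import hmac
import socket
import struct


def connection_context(src_ip: str, dst_ip: str, sport: int, dport: int,
                       src_isn: int, dst_isn: int) -> bytes:
    """IPv4 context: addresses, ports and both ISNs in network byte order."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HHII", sport, dport, src_isn, dst_isn))


def kdf_hmac_sha1(master_key: bytes, context: bytes, out_bits: int = 160) -> bytes:
    """Derive a traffic key: HMAC-SHA1(master_key, i || Label || Context || Output_Length)."""
    data = b"\x01" + b"TCP-AO" + context + struct.pack("!H", out_bits)
    return hmac.new(master_key, data, hashlib.sha1).digest()


# Send and receive keys differ because the context is direction-dependent.
send_ctx = connection_context("192.0.2.1", "192.0.2.2", 40000, 179, 1000, 2000)
recv_ctx = connection_context("192.0.2.2", "192.0.2.1", 179, 40000, 2000, 1000)
send_key = kdf_hmac_sha1(b"master-key", send_ctx)
recv_key = kdf_hmac_sha1(b"master-key", recv_ctx)
```

Since the ISNs are part of the context, the keys can only be computed once both are known, which is consistent with the commit wiring the calculation up to tcp_finish_connect().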
2023-10-27  net/tcp: Prevent TCP-MD5 with TCP-AO being set  Dmitry Safonov
Be as conservative as possible: if there is a TCP-MD5 key for a given peer, regardless of L3 interface, don't allow setting a TCP-AO key for the same peer. According to RFC5925, TCP-AO is supposed to replace TCP-MD5 and there can't be any switch between both on any connected tuple. Later it can be relaxed, if there's a use, but in the beginning restrict any intersection.

Note: it should still be possible to set both TCP-MD5 and TCP-AO keys on a listening socket for *different* peers.

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Introduce TCP_AO setsockopt()s  Dmitry Safonov
Add 3 setsockopt()s:
1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket
2. TCP_AO_DEL_KEY to delete present MKT from a socket
3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk

Userspace has to introduce keys on every socket it wants to use TCP-AO option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT. RFC5925 prohibits definition of MKTs that would match the same peer, so do sanity checks on the data provided by userspace. Be as conservative as possible, including refusing to define an MKT on an established connection with no AO, refusing to remove a key that is in use, etc.

(1) and (2) are to be used by a userspace key manager to add/remove keys. The main purpose of (3) is to set RNext_key, which (as prescribed by RFC5925) is the KeyID that will be requested in the TCP-AO header from the peer to sign their segments with.

At this moment the life of ao_info ends in tcp_v4_destroy_sock().

Co-developed-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Co-developed-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27  net/tcp: Add TCP-AO config and structures  Dmitry Safonov
Introduce new kernel config option and common structures as well as helpers to be used by TCP-AO code. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27net/tcp: Prepare tcp_md5sig_pool for TCP-AODmitry Safonov
TCP-AO, similarly to TCP-MD5, needs to allocate tfms on a slow path, which is setsockopt(), and use crypto ahash requests on fast paths, which are RX/TX softirqs. Also, it needs a temporary/scratch buffer for preparing the hash. Rework tcp_md5sig_pool in order to support hashing algorithms other than MD5. It will make it possible to share pre-allocated crypto_ahash descriptors and scratch area between all TCP hash users. Internally tcp_sigpool calls the crypto_clone_ahash() API over a pre-allocated crypto ahash tfm. Kudos to Herbert, who provided this new crypto API. I was a little concerned over GFP_ATOMIC allocations of ahash and crypto_request in RX/TX (see tcp_sigpool_start()), so I benchmarked both "backends" with different algorithms, using a patched version of iperf3[2]. On my laptop with i7-7600U @ 2.80GHz:

                          clone-tfm             per-CPU-requests
TCP-MD5                   2.25 Gbits/sec        2.30 Gbits/sec
TCP-AO(hmac(sha1))        2.53 Gbits/sec        2.54 Gbits/sec
TCP-AO(hmac(sha512))      1.67 Gbits/sec        1.64 Gbits/sec
TCP-AO(hmac(sha384))      1.77 Gbits/sec        1.80 Gbits/sec
TCP-AO(hmac(sha224))      1.29 Gbits/sec        1.30 Gbits/sec
TCP-AO(hmac(sha3-512))    481 Mbits/sec         480 Mbits/sec
TCP-AO(hmac(md5))         2.07 Gbits/sec        2.12 Gbits/sec
TCP-AO(hmac(rmd160))      1.01 Gbits/sec        995 Mbits/sec
TCP-AO(cmac(aes128))      [not supported yet]   2.11 Gbits/sec

So, it seems that my concerns don't have strong grounds and per-CPU crypto_request allocation can be dropped/removed from tcp_sigpool once ciphers get crypto_clone_ahash() support. [1]: https://lore.kernel.org/all/ZDefxOq6Ax0JeTRH@gondor.apana.org.au/T/#u [2]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
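To make the "no strong grounds" conclusion concrete, the relative gap between the two backends can be computed from the figures quoted above. The Python sketch below only restates the benchmark numbers from the commit message (converted to Gbit/s); the helper names `relative_gap` and `worst_gap` are made up for illustration, and the cmac(aes128) row is left out because it has no clone-tfm result.

```python
# Throughput in Gbit/s as (clone-tfm, per-CPU-requests), copied from the
# benchmark table in the commit message.
RESULTS = {
    "TCP-MD5": (2.25, 2.30),
    "TCP-AO(hmac(sha1))": (2.53, 2.54),
    "TCP-AO(hmac(sha512))": (1.67, 1.64),
    "TCP-AO(hmac(sha384))": (1.77, 1.80),
    "TCP-AO(hmac(sha224))": (1.29, 1.30),
    "TCP-AO(hmac(sha3-512))": (0.481, 0.480),
    "TCP-AO(hmac(md5))": (2.07, 2.12),
    "TCP-AO(hmac(rmd160))": (1.01, 0.995),
}


def relative_gap(clone_tfm, per_cpu):
    """Difference of clone-tfm relative to per-CPU-requests, in percent."""
    return (clone_tfm - per_cpu) / per_cpu * 100.0


def worst_gap(results):
    """Largest absolute relative gap across all benchmarked algorithms."""
    return max(abs(relative_gap(c, p)) for c, p in results.values())


if __name__ == "__main__":
    for name, (clone_tfm, per_cpu) in RESULTS.items():
        print(f"{name:24s} {relative_gap(clone_tfm, per_cpu):+.2f}%")
```

On these numbers the two backends stay within 3% of each other for every algorithm, in both directions.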
2023-10-27Merge tag 'icc-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-nextGreg Kroah-Hartman
Georgi writes: interconnect changes for 6.7 This pull request contains the interconnect changes for the 6.7-rc1 merge window which contains just driver changes with the following highlights:
Driver changes:
- New interconnect driver for the SDX75 platform.
- Support for coefficients to allow node-specific rate adjustments.
- Update DT bindings according to the recent changes of how we represent the SMD and RPM bus clocks on Qualcomm platforms.
- Misc fixes and cleanups.
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: (36 commits)
  interconnect: qcom: Convert to platform remove callback returning void
  dt-bindings: interconnect: qcom,rpmh: do not require reg on SDX65 MC virt
  interconnect: imx: Replace inclusion of kernel.h in the header
  interconnect: fix error handling in qnoc_probe()
  interconnect: qcom: osm-l3: Replace custom implementation of COUNT_ARGS()
  interconnect: msm8974: Replace custom implementation of COUNT_ARGS()
  interconnect: imx: Replace custom implementation of COUNT_ARGS()
  interconnect: qcom: Add SDX75 interconnect provider driver
  dt-bindings: interconnect: Add compatibles for SDX75
  interconnect: qcom: sm8350: Set ACV enable_mask
  interconnect: qcom: sm8250: Set ACV enable_mask
  interconnect: qcom: sm8150: Set ACV enable_mask
  interconnect: qcom: sm6350: Set ACV enable_mask
  interconnect: qcom: sdm845: Set ACV enable_mask
  interconnect: qcom: sdm670: Set ACV enable_mask
  interconnect: qcom: sc8280xp: Set ACV enable_mask
  interconnect: qcom: sc8180x: Set ACV enable_mask
  interconnect: qcom: sc7280: Set ACV enable_mask
  interconnect: qcom: sc7180: Set ACV enable_mask
  interconnect: qcom: qdu1000: Set ACV enable_mask
  ...
2023-10-27ALSA: virtio: use ack callbackMatias Ezequiel Vara Larsen
This commit uses the ack() callback to determine when a buffer has been updated, then exposes it to the guest. The current mechanism splits a DMA buffer into descriptors that are exposed to the device. This DMA buffer is shared with the user application. When the device consumes a buffer, the driver moves the request from the used ring to the available ring. The driver exposes the buffer to the device without knowing whether the content has been updated by the user. Section 2.8.21.1 of the virtio spec states that: "The device MAY access the descriptor chains the driver created and the memory they refer to immediately". If the device picks up buffers from the available ring just after it is notified, the content may be stale. When the ack() callback is invoked, the driver exposes only the buffers that have already been updated, i.e., enqueued in the available ring. Thus, the device always picks up a buffer that is updated. For capturing, the driver starts by exposing all the available buffers to the device. After the device updates the content of a buffer, it enqueues it in the used ring. It is only after the ack() for capturing is issued that the driver re-enqueues the buffer in the available ring. Co-developed-by: Anton Yakovlev <anton.yakovlev@opensynergy.com> Signed-off-by: Anton Yakovlev <anton.yakovlev@opensynergy.com> Signed-off-by: Matias Ezequiel Vara Larsen <mvaralar@redhat.com> Link: https://lore.kernel.org/r/ZTjkn1YAFz67yfqx@fedora Signed-off-by: Takashi Iwai <tiwai@suse.de>
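The playback side of this scheme can be pictured with a small queue model. This is a simplified Python sketch, not the driver's data structures: `PlaybackQueue` and its methods are hypothetical, periods are identified by integers, and the three deques stand in for "held by the driver", the available ring and the used ring.

```python
from collections import deque


class PlaybackQueue:
    """Toy model: periods reach the device only after the application acks them."""

    def __init__(self, num_periods):
        self.pending = deque(range(num_periods))  # held by the driver, data not yet fresh
        self.available = deque()                  # exposed to the device
        self.used = deque()                       # consumed by the device

    def ack(self, count):
        """Application reports that `count` more periods have been filled."""
        for _ in range(min(count, len(self.pending))):
            self.available.append(self.pending.popleft())

    def device_consume(self):
        """Device takes the next exposed period, or None if nothing is exposed."""
        if not self.available:
            return None
        period = self.available.popleft()
        self.used.append(period)
        return period

    def reclaim_used(self):
        """Driver takes back consumed periods without re-exposing them."""
        while self.used:
            self.pending.append(self.used.popleft())
```

The key design point is that `reclaim_used()` does not touch `available`: a consumed period goes back to the device only through a later `ack()`, so the device never sees stale content.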
2023-10-27ALSA: scarlett2: Remap Level Meter valuesGeoffrey D. Bennett
The values previously returned by the Level Meter control were passed through from the interface without interpretation, but it has been discovered that the order of the values matches the mux assignment order (which is not presented to userspace). In addition, the values for disabled mux outputs, and mux outputs which share a source are invalid. This patch adds a per-device meter_map[], and a dynamic meter_level_map[] which is updated on routing changes. The meter level map gets used by scarlett2_meter_ctl_get() to both present the values in a standard order, and to fix up the invalid values by zeroing them (for disabled outputs) and copying them (for mux outputs which share a source). Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/d437ace603eff685d2e0c3d0960589d7a09dd647.1698342632.git.g@b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
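A rough model of the two-level mapping, in Python rather than the driver's C. Everything here is illustrative: `build_level_map` and `read_meters` are hypothetical helpers, source id 0 is assumed to mean "disabled", and the choice of which sharing output carries the valid reading (the first one encountered) is an assumption.

```python
DISABLED = None


def build_level_map(meter_map, mux_src):
    """Recompute the dynamic map after a routing change.

    meter_map: raw (mux-order) index for each user-facing meter position.
    mux_src:   source id routed to each raw index; 0 is assumed to mean disabled.
    Returns, per user-facing position, the raw index to read from or DISABLED.
    """
    first_seen = {}  # source id -> raw index that holds its valid reading
    level_map = []
    for raw_idx in meter_map:
        src = mux_src[raw_idx]
        if src == 0:
            level_map.append(DISABLED)
        else:
            # Outputs sharing a source all read from the first one seen.
            level_map.append(first_seen.setdefault(src, raw_idx))
    return level_map


def read_meters(level_map, raw_values):
    """Present raw device values in standard order, fixing up invalid entries."""
    return [0 if idx is DISABLED else raw_values[idx] for idx in level_map]
```

For example, with `meter_map = [2, 0, 1, 3]` and `mux_src = [5, 0, 5, 7]`, positions 0 and 1 share source 5 and both read raw index 2, while position 2 is disabled and reads as zero.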
2023-10-27ALSA: scarlett2: Allow passing any output to line_out_remap()Geoffrey D. Bennett
Line outputs 3 & 4 on the Gen 3 18i8 are internally the analogue 7 and 8 outputs, and this renumbering is hidden from the user by line_out_remap(). By allowing higher values (representing non-analogue outputs) to be passed to line_out_remap(), repeated code from scarlett2_mux_src_enum_ctl_get() and scarlett2_mux_src_enum_ctl_put() can be removed. Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/3b70267931f5994628ab27306c73cddd17b93c8f.1698342632.git.g@b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
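The pass-through behaviour can be sketched like this. It is a Python illustration, not the driver code: the table `LINE_OUT_REMAP` is hypothetical apart from the two entries stated in the commit message (user-facing line outputs 3 and 4 map to internal analogue outputs 7 and 8).

```python
# Zero-based remap table for a device like the Gen 3 18i8. Only entries 2 and 3
# (line outputs 3 and 4 -> analogue 7 and 8) come from the commit message; the
# remaining entries are illustrative assumptions.
LINE_OUT_REMAP = [0, 1, 6, 7, 2, 3, 4, 5]


def line_out_remap(index, remap=LINE_OUT_REMAP):
    """Map a user-facing output index to the internal one.

    Indices beyond the analogue outputs (non-analogue outputs) are returned
    unchanged, so callers no longer need to special-case them.
    """
    if index >= len(remap):
        return index
    return remap[index]
```

With the pass-through in place, a caller can apply `line_out_remap()` to every mux destination without first checking whether it is an analogue output.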
2023-10-27ALSA: scarlett2: Add support for reading firmware versionGeoffrey D. Bennett
The 84 bytes read during initialisation step 2 were previously ignored. This patch retrieves the firmware version from bytes 8-11, stores it in the scarlett2_data struct, and makes it available through a new control "Firmware Version". Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/e76cd80c3445769e60c95df12c4635fc8abfe5c7.1698342632.git.g@b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
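Extracting the version from that reply could look like the following Python sketch. The function name `parse_firmware_version` is hypothetical, and treating bytes 8-11 as a little-endian unsigned 32-bit integer is an assumption; the commit message only gives the offsets and the total length.

```python
import struct

INIT_STEP2_LEN = 84      # size of the reply read during initialisation step 2
FIRMWARE_VERSION_OFFSET = 8  # version occupies bytes 8-11


def parse_firmware_version(buf):
    """Return the firmware version stored in bytes 8-11 of the step 2 reply.

    Assumes a little-endian unsigned 32-bit value.
    """
    if len(buf) != INIT_STEP2_LEN:
        raise ValueError(f"expected {INIT_STEP2_LEN} bytes, got {len(buf)}")
    (version,) = struct.unpack_from("<I", buf, FIRMWARE_VERSION_OFFSET)
    return version
```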
2023-10-27ALSA: scarlett2: Rename Gen 3 config setsGeoffrey D. Bennett
The config sets are named NO_MIXER, GEN_2, GEN_3, and CLARETT currently. Rename NO_MIXER and GEN_3 to GEN_3A and GEN_3B respectively as NO_MIXER is only for the smaller Gen 3 devices. Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/19ae5eea7fc499945efa8eeda7fcd8afe73f62d9.1698342632.git.g@b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-10-27ALSA: scarlett2: Rename scarlett_gen2 to scarlett2Geoffrey D. Bennett
This driver was originally developed for the Focusrite Scarlett Gen 2 series. Since then Focusrite have used a similar protocol for their Gen 3, Gen 4, Clarett USB, Clarett+, and Vocaster series. Let's call this common protocol the "Scarlett 2 Protocol" and rename the driver to scarlett2 to not imply that it is restricted to Gen 2 series devices. Signed-off-by: Geoffrey D. Bennett <g@b4.vu> Link: https://lore.kernel.org/r/e1ad7f69a1e20cdb39094164504389160c1a0a0b.1698342632.git.g@b4.vu Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-10-27pmdomain: Merge branch fixes into nextUlf Hansson
Merge the pmdomain fixes for v6.6-rc[n] into the next branch, to allow them to get tested together with the new pmdomain changes that are targeted for v6.7. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>