summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2024-09-03ethtool: RX software timestamp for allGal Pressman
All devices support SOF_TIMESTAMPING_RX_SOFTWARE by virtue of net_timestamp_check() being called in the device independent code. Move the responsibility of reporting SOF_TIMESTAMPING_RX_SOFTWARE and SOF_TIMESTAMPING_SOFTWARE, and setting PHC index to -1 to the core. Device drivers no longer need to use them. Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Link: https://lore.kernel.org/netdev/661550e348224_23a2b2294f7@willemb.c.googlers.com.notmuch/ Co-developed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240901112803.212753-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03bpf, net: Fix a potential race in do_sock_getsockopt()Tze-nan Wu
There's a potential race when `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` is false during the execution of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`, but becomes true when `BPF_CGROUP_RUN_PROG_GETSOCKOPT` is called. This inconsistency can lead to `BPF_CGROUP_RUN_PROG_GETSOCKOPT` receiving an "-EFAULT" from `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`. Scenario shown as below: `process A` `process B` ----------- ------------ BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN enable CGROUP_GETSOCKOPT BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT) To resolve this, remove the `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` macro and directly uses `copy_from_sockptr` to ensure that `max_optlen` is always set before `BPF_CGROUP_RUN_PROG_GETSOCKOPT` is invoked. Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Co-developed-by: Yanghui Li <yanghui.li@mediatek.com> Signed-off-by: Yanghui Li <yanghui.li@mediatek.com> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://patch.msgid.link/20240830082518.23243-1-Tze-nan.Wu@mediatek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03net: dqs: Do not use extern for unused dql_groupBreno Leitao
When CONFIG_DQL is not enabled, dql_group should be treated as a dead declaration. However, its current extern declaration assumes the linker will ignore it, which is generally true across most compiler and architecture combinations. But in certain cases, the linker still attempts to resolve the extern struct, even when the associated code is dead, resulting in a linking error. For instance the following error in loongarch64: >> loongarch64-linux-ld: net-sysfs.c:(.text+0x589c): undefined reference to `dql_group' Modify the declaration of the dead object to be an empty declaration instead of an extern. This change will prevent the linker from attempting to resolve an undefined reference. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409012047.eCaOdfQJ-lkp@intel.com/ Fixes: 74293ea1c4db ("net: sysfs: Do not create sysfs for non BQL device") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://patch.msgid.link/20240902101734.3260455-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03sch/netem: fix use after free in netem_dequeueStephen Hemminger
If netem_dequeue() enqueues packet to inner qdisc and that qdisc returns __NET_XMIT_STOLEN. The packet is dropped but qdisc_tree_reduce_backlog() is not called to update the parent's q.qlen, leading to the similar use-after-free as Commit e04991a48dbaf382 ("netem: fix return value if duplicate enqueue fails") Commands to trigger KASAN UaF: ip link add type dummy ip link set lo up ip link set dummy0 up tc qdisc add dev lo parent root handle 1: drr tc filter add dev lo parent 1: basic classid 1:1 tc class add dev lo classid 1:1 drr tc qdisc add dev lo parent 1:1 handle 2: netem tc qdisc add dev lo parent 2: handle 3: drr tc filter add dev lo parent 3: basic classid 3:1 action mirred egress redirect dev dummy0 tc class add dev lo classid 3:1 drr ping -c1 -W0.01 localhost # Trigger bug tc class del dev lo classid 1:1 tc class add dev lo classid 1:1 drr ping -c1 -W0.01 localhost # UaF Fixes: 50612537e9ab ("netem: fix classful handling") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://patch.msgid.link/20240901182438.4992-1-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03ioam6: improve checks on user dataJustin Iurman
This patch improves two checks on user data. The first one prevents bit 23 from being set, as specified by RFC 9197 (Sec 4.4.1): Bit 23 Reserved; MUST be set to zero upon transmission and be ignored upon receipt. This bit is reserved to allow for future extensions of the IOAM Trace-Type bit field. The second one checks that the tunnel destination address != IPV6_ADDR_ANY, just like we already do for the tunnel source address. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20240830191919.51439-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-03netfilter: nf_tables: set element timeout update supportPablo Neira Ayuso
Store new timeout and expiration in transaction object, use them to update elements from .commit path. Otherwise, discard update if .abort path is exercised. Use update_flags in the transaction to note whether the timeout, expiration, or both need to be updated. Annotate access to timeout extension now that it can be updated while lockless read access is possible. Reject timeout updates on elements with no timeout extension. Element transaction remains in the 96 bytes kmalloc slab on x86_64 after this update. This patch requires ("netfilter: nf_tables: use timestamp to check for set element timeout") to make sure an element does not expire while transaction is ongoing. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: zero timeout means element never times outPablo Neira Ayuso
This patch uses zero as timeout marker for those elements that never expire when the element is created. If userspace provides no timeout for an element, then the default set timeout applies. However, if no default set timeout is specified and timeout flag is set on, then timeout extension is allocated and timeout is set to zero to allow for future updates. Use of zero a never timeout marker has been suggested by Phil Sutter. Note that, in older kernels, it is already possible to define elements that never expire by declaring a set with the set timeout flag set on and no global set timeout, in this case, new element with no explicit timeout never expire do not allocate the timeout extension, hence, they never expire. This approach makes it complicated to accomodate element timeout update, because element extensions do not support reallocations. Therefore, allocate the timeout extension and use the new marker for this case, but do not expose it to userspace to retain backward compatibility in the set listing. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: consolidate timeout extension for elementsPablo Neira Ayuso
Expiration and timeout are stored in separated set element extensions, but they are tightly coupled. Consolidate them in a single extension to simplify and prepare for set element updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: annotate data-races around element expirationPablo Neira Ayuso
element expiration can be read-write locklessly, it can be written by dynset and read from netlink dump, add annotation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nft_dynset: annotate data-races around set timeoutPablo Neira Ayuso
set timeout can be read locklessly while being updated from control plane, add annotation. Fixes: 123b99619cca ("netfilter: nf_tables: honor set timeout and garbage collection updates") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: remove annotation to access set timeout while holding lockPablo Neira Ayuso
Mutex is held when adding an element, no need for READ_ONCE, remove it. Fixes: 123b99619cca ("netfilter: nf_tables: honor set timeout and garbage collection updates") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: reject expiration higher than timeoutPablo Neira Ayuso
Report ERANGE to userspace if user specifies an expiration larger than the timeout. Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: reject element expiration with no timeoutPablo Neira Ayuso
If element timeout is unset and set provides no default timeout, the element expiration is silently ignored, reject this instead to let user know this is unsupported. Also prepare for supporting timeout that never expire, where zero timeout and expiration must be also rejected. Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nf_tables: elements with timeout below CONFIG_HZ never expirePablo Neira Ayuso
Element timeout that is below CONFIG_HZ never expires because the timeout extension is not allocated given that nf_msecs_to_jiffies64() returns 0. Set timeout to the minimum value to honor timeout. Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netdev_features: remove NETIF_F_ALL_FCOEAlexander Lobakin
NETIF_F_ALL_FCOE is used only in vlan_dev.c, 2 times. Now that it's only 2 bits, open-code it and remove the definition from netdev_features.h. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netdev_features: convert NETIF_F_FCOE_MTU to dev->fcoe_mtuAlexander Lobakin
Ability to handle maximum FCoE frames of 2158 bytes can never be changed and thus more of an attribute, not a toggleable feature. Move it from netdev_features_t to "cold" priv flags (bitfield bool) and free yet another feature bit. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_localAlexander Lobakin
"Interface can't change network namespaces" is rather an attribute, not a feature, and it can't be changed via Ethtool. Make it a "cold" private flag instead of a netdev_feature and free one more bit. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netdev_features: convert NETIF_F_LLTX to dev->lltxAlexander Lobakin
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netdevice: convert private flags > BIT(31) to bitfieldsAlexander Lobakin
Make dev->priv_flags `u32` back and define bits higher than 31 as bitfield booleans as per Jakub's suggestion. This simplifies code which accesses these bits with no optimization loss (testb both before/after), allows to not extend &netdev_priv_flags each time, but also scales better as bits > 63 in the future would only add a new u64 to the structure with no complications, comparing to that extending ::priv_flags would require converting it to a bitmap. Note that I picked `unsigned long :1` to not lose any potential optimizations comparing to `bool :1` etc. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03netfilter: nf_tables: drop unused 3rd argument from validate callback opsFlorian Westphal
Since commit a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation") the validate() callback no longer needs the return pointer argument. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: conntrack: Convert to use ERR_CAST()Shen Lichuan
Use the ERR_CAST macro to clearly indicate that this is a pointer to an error value and that a type conversion was performed. Signed-off-by: Shen Lichuan <shenlichuan@vivo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: Use kmemdup_array instead of kmemdup for multiple allocationYan Zhen
When we are allocating an array, using kmemdup_array() to take care about multiplication and possible overflows. Also it makes auditing the code easier. Signed-off-by: Yan Zhen <yanzhen@vivo.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: nft_counter: Use u64_stats_t for statistic.Sebastian Andrzej Siewior
The nft_counter uses two s64 counters for statistics. Those two are protected by a seqcount to ensure that the 64bit variable is always properly seen during updates even on 32bit architectures where the store is performed by two writes. A side effect is that the two counter (bytes and packet) are written and read together in the same window. This can be replaced with u64_stats_t. write_seqcount_begin()/ end() is replaced with u64_stats_update_begin()/ end() and behaves the same way as with seqcount_t on 32bit architectures. Additionally there is a preempt_disable on PREEMPT_RT to ensure that a reader does not preempt a writer. On 64bit architectures the macros are removed and the reads happen without any retries. This also means that the reader can observe one counter (bytes) from before the update and the other counter (packets) but that is okay since there is no requirement to have both counter from the same update window. Convert the statistic to u64_stats_t. There is one optimisation: nft_counter_do_init() and nft_counter_clone() allocate a new per-CPU counter and assign a value to it. During this assignment preemption is disabled which is not needed because the counter is not yet exposed to the system so there can not be another writer or reader. Therefore disabling preemption is omitted and raw_cpu_ptr() is used to obtain a pointer to a counter for the assignment. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03netfilter: ctnetlink: support CTA_FILTER for flushChangliang Wu
From cb8aa9a, we can use kernel side filtering for dump, but this capability is not available for flush. This Patch allows advanced filter with CTA_FILTER for flush Performace 1048576 ct flows in total, delete 50,000 flows by origin src ip 3.06s -> dump all, compare and delete 584ms -> directly flush with filter Signed-off-by: Changliang Wu <changliang.wu@smartx.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-03net/9p/usbg: Add new usb gadget function transportMichael Grzeschik
Add the new gadget function for 9pfs transport. This function is defining an simple 9pfs transport interface that consists of one in and one out endpoint. The endpoints transmit and receive the 9pfs protocol payload when mounting a 9p filesystem over usb. Tested-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Link: https://lore.kernel.org/r/20240116-ml-topic-u9p-v12-2-9a27de5160e0@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-02Merge tag 'linux-can-next-for-6.12-20240830' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2024-08-30 The first patch is by Duy Nguyen and document the R-Car V4M support in the rcar-canfd DT bindings. Frank Li's patch converts the microchip,mcp251x.txt DT bindings documentation to yaml. A patch by Zhang Changzhong update a comment in the j1939 CAN networking stack. Stefan Mätje's patch updates the CAN configuration netlink code, so that the bit timing calculation doesn't work on stale can_priv::ctrlmode data. Martin Jocic contributes a patch for the kvaser_pciefd driver to convert some ifdefs into if (IS_ENABLED()). The last patch is by Yan Zhen and simplifies the probe() function of the kvaser USB driver by using dev_err_probe(). * tag 'linux-can-next-for-6.12-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: kvaser_usb: Simplify with dev_err_probe() can: kvaser_pciefd: Use IS_ENABLED() instead of #ifdef can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode can: j1939: use correct function name in comment dt-bindings: can: convert microchip,mcp251x.txt to yaml dt-bindings: can: renesas,rcar-canfd: Document R-Car V4M support ==================== Link: https://patch.msgid.link/20240830214406.1605786-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02Merge tag 'for-net-2024-08-30' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - qca: If memdump doesn't work, re-enable IBS - MGMT: Fix not generating command complete for MGMT_OP_DISCONNECT - Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE" - MGMT: Ignore keys being loaded with invalid type * tag 'for-net-2024-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Ignore keys being loaded with invalid type Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE" Bluetooth: MGMT: Fix not generating command complete for MGMT_OP_DISCONNECT Bluetooth: hci_sync: Introduce hci_cmd_sync_run/hci_cmd_sync_run_once Bluetooth: qca: If memdump doesn't work, re-enable IBS ==================== Link: https://patch.msgid.link/20240830220300.1316772-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02Merge tag 'linux-can-fixes-for-6.11-20240830' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-08-30 The first patch is by Kuniyuki Iwashima for the CAN BCM protocol that adds a missing proc entry removal when a device unregistered. Simon Horman fixes the cleanup in the error cleanup path of the m_can driver's open function. Markus Schneider-Pargmann contributes 7 fixes for the m_can driver, all related to the recently added IRQ coalescing support. The next 2 patches are by me, target the mcp251xfd driver and fix ring and coalescing configuration problems when switching from CAN-CC to CAN-FD mode. Simon Arlott's patch fixes a possible deadlock in the mcp251x driver. The last patch is by Martin Jocic for the kvaser_pciefd driver and fixes a problem with lost IRQs, which result in starvation, under high load situations. * tag 'linux-can-fixes-for-6.11-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: kvaser_pciefd: Use a single write when releasing RX buffers can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open can: mcp251xfd: mcp251xfd_ring_init(): check TX-coalescing configuration can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode can: m_can: Limit coalescing to peripheral instances can: m_can: Reset cached active_interrupts on start can: m_can: disable_all_interrupts, not clear active_interrupts can: m_can: Do not cancel timer from within timer can: m_can: Remove m_can_rx_peripheral indirection can: m_can: Remove coalesing disable in isr during suspend can: m_can: Reset coalescing during suspend/resume can: m_can: Release irq on error in m_can_open can: bcm: Remove proc entry when dev is unregistered. ==================== Link: https://patch.msgid.link/20240830215914.1610393-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02netdev-genl: Set extack and fix error on napi-getJoe Damato
In commit 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi"), when an invalid NAPI ID is specified the return value -EINVAL is used and no extack is set. Change the return value to -ENOENT and set the extack. Before this commit: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --do napi-get --json='{"id": 451}' Netlink error: Invalid argument nl_len = 36 (20) nl_flags = 0x100 nl_type = 2 error: -22 After this commit: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --do napi-get --json='{"id": 451}' Netlink error: No such file or directory nl_len = 44 (28) nl_flags = 0x300 nl_type = 2 error: -2 extack: {'bad-attr': '.id'} Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20240831121707.17562-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02tcp_bpf: Remove an unused parameter for bpf_tcp_ingress()Yaxin Chen
Parameter flags is not used in bpf_tcp_ingress(). Signed-off-by: Yaxin Chen <yaxin.chen1@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20240823224843.1985277-1-yaxin.chen1@bytedance.com
2024-09-02bpf, sockmap: Correct spelling skmsg.cSimon Horman
Correct spelling in skmsg.c. As reported by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20240829-sockmap-spell-v1-1-a614d76564cc@kernel.org
2024-09-01SUNRPC: make various functions static, or not exported.NeilBrown
Various functions are only used within the sunrpc module, and several are only use in the one file. So clean up: These are marked static, and any EXPORT is removed. svc_rcpb_setup() svc_rqst_alloc() svc_rqst_free() - also moved before first use svc_rpcbind_set_version() svc_drop() - also moved to svc.c These are now not EXPORTed, but are not static. svc_authenticate() svc_sock_update_bufs() Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-31bpf: Unmask upper DSCP bits in __bpf_redirect_neigh_v4()Ido Schimmel
Unmask the upper DSCP bits when calling ip_route_output_flow() so that in the future it could perform the FIB lookup according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv6: sit: Unmask upper DSCP bits in ipip6_tunnel_xmit()Ido Schimmel
The function calls flowi4_init_output() to initialize an IPv4 flow key with which it then performs a FIB lookup using ip_route_output_flow(). The 'tos' variable with which the TOS value in the IPv4 flow key (flowi4_tos) is initialized contains the full DS field. Unmask the upper DSCP bits so that in the future the FIB lookup could be performed according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv4: Unmask upper DSCP bits in ip_send_unicast_reply()Ido Schimmel
The function calls flowi4_init_output() to initialize an IPv4 flow key with which it then performs a FIB lookup using ip_route_output_flow(). 'arg->tos' with which the TOS value in the IPv4 flow key (flowi4_tos) is initialized contains the full DS field. Unmask the upper DSCP bits so that in the future the FIB lookup could be performed according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31xfrm: Unmask upper DSCP bits in xfrm_get_tos()Ido Schimmel
The function returns a value that is used to initialize 'flowi4_tos' before being passed to the FIB lookup API in the following call chain: xfrm_bundle_create() tos = xfrm_get_tos(fl, family) xfrm_dst_lookup(..., tos, ...) __xfrm_dst_lookup(..., tos, ...) xfrm4_dst_lookup(..., tos, ...) __xfrm4_dst_lookup(..., tos, ...) fl4->flowi4_tos = tos __ip_route_output_key(net, fl4) Unmask the upper DSCP bits so that in the future the output route lookup could be performed according to the full DSCP value. Remove IPTOS_RT_MASK since it is no longer used. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv4: Unmask upper DSCP bits when building flow keyIdo Schimmel
build_sk_flow_key() and __build_flow_key() are used to build an IPv4 flow key before calling one of the FIB lookup APIs. Unmask the upper DSCP bits so that in the future the lookup could be performed according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()Ido Schimmel
The function is called to resolve a route for an ICMP message that is sent in response to a situation. Based on the type of the generated ICMP message, the function is either passed the DS field of the packet that generated the ICMP message or a DS field that is derived from it. Unmask the upper DSCP bits before resolving and output route via ip_route_output_key_hash() so that in the future the lookup could be performed according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv4: Unmask upper DSCP bits in ip_route_output_key_hash()Ido Schimmel
Unmask the upper DSCP bits so that in the future output routes could be looked up according to the full DSCP value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-31ipv4: Unmask upper DSCP bits in RTM_GETROUTE output route lookupIdo Schimmel
Unmask the upper DSCP bits when looking up an output route via the RTM_GETROUTE netlink message so that in the future the lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-30Bluetooth: MGMT: Ignore keys being loaded with invalid typeLuiz Augusto von Dentz
Due to 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 there could be keys stored with the wrong address type so this attempt to detect it and ignore them instead of just failing to load all keys. Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE"Luiz Augusto von Dentz
This reverts commit 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 which breaks compatibility with commands like: bluetoothd[46328]: @ MGMT Command: Load.. (0x0013) plen 74 {0x0001} [hci0] Keys: 2 BR/EDR Address: C0:DC:DA:A5:E5:47 (Samsung Electronics Co.,Ltd) Key type: Authenticated key from P-256 (0x03) Central: 0x00 Encryption size: 16 Diversifier[2]: 0000 Randomizer[8]: 0000000000000000 Key[16]: 6ed96089bd9765be2f2c971b0b95f624 LE Address: D7:2A:DE:1E:73:A2 (Static) Key type: Unauthenticated key from P-256 (0x02) Central: 0x00 Encryption size: 16 Diversifier[2]: 0000 Randomizer[8]: 0000000000000000 Key[16]: 87dd2546ededda380ffcdc0a8faa4597 @ MGMT Event: Command Status (0x0002) plen 3 {0x0001} [hci0] Load Long Term Keys (0x0013) Status: Invalid Parameters (0x0d) Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Bluetooth: MGMT: Fix not generating command complete for MGMT_OP_DISCONNECTLuiz Augusto von Dentz
MGMT_OP_DISCONNECT can be called while mgmt_device_connected has not been called yet, which will cause the connection procedure to be aborted, so mgmt_device_disconnected shall still respond with command complete to MGMT_OP_DISCONNECT and just not emit MGMT_EV_DEVICE_DISCONNECTED since MGMT_EV_DEVICE_CONNECTED was never sent. To fix this MGMT_OP_DISCONNECT is changed to work similarly to other command which do use hci_cmd_sync_queue and then use hci_conn_abort to disconnect and returns the result, in order for hci_conn_abort to be used from hci_cmd_sync context it now uses hci_cmd_sync_run_once. Link: https://github.com/bluez/bluez/issues/932 Fixes: 12d4a3b2ccb3 ("Bluetooth: Move check for MGMT_CONNECTED flag into mgmt.c") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Bluetooth: hci_sync: Introduce hci_cmd_sync_run/hci_cmd_sync_run_onceLuiz Augusto von Dentz
This introduces hci_cmd_sync_run/hci_cmd_sync_run_once which acts like hci_cmd_sync_queue/hci_cmd_sync_queue_once but runs immediately when already on hdev->cmd_sync_work context. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30can: j1939: use correct function name in commentZhang Changzhong
The function j1939_cancel_all_active_sessions() was renamed to j1939_cancel_active_session() but name in comment wasn't updated. Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Link: https://patch.msgid.link/1724935703-44621-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-08-30ethtool: pse-pd: move pse validation into setDiogo Jahchan Koike
Move validation into set, removing .set_validate operation as its current implementation holds the rtnl lock for acquiring the PHY device, defeating the intended purpose of checking before grabbing the lock. Reported-by: syzbot+ec369e6d58e210135f71@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ec369e6d58e210135f71 Fixes: 31748765bed3 ("net: ethtool: pse-pd: Target the command to the requested PHY") Signed-off-by: Diogo Jahchan Koike <djahchankoike@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20240829184830.5861-1-djahchankoike@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30icmp: icmp_msgs_per_sec and icmp_msgs_burst sysctls become per netnsEric Dumazet
Previous patch made ICMP rate limits per netns, it makes sense to allow each netns to change the associated sysctl. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240829144641.3880376-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30icmp: move icmp_global.credit and icmp_global.stamp to per netns storageEric Dumazet
Host wide ICMP ratelimiter should be per netns, to provide better isolation. Following patch in this series makes the sysctl per netns. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240829144641.3880376-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30icmp: change the order of rate limitsEric Dumazet
ICMP messages are ratelimited : After the blamed commits, the two rate limiters are applied in this order: 1) host wide ratelimit (icmp_global_allow()) 2) Per destination ratelimit (inetpeer based) In order to avoid side-channels attacks, we need to apply the per destination check first. This patch makes the following change : 1) icmp_global_allow() checks if the host wide limit is reached. But credits are not yet consumed. This is deferred to 3) 2) The per destination limit is checked/updated. This might add a new node in inetpeer tree. 3) icmp_global_consume() consumes tokens if prior operations succeeded. This means that host wide ratelimit is still effective in keeping inetpeer tree small even under DDOS. As a bonus, I removed icmp_global.lock as the fast path can use a lock-free operation. Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Reported-by: Keyu Man <keyu.man@email.ucr.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20240829144641.3880376-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30net: openvswitch: Use ERR_CAST() to returnYan Zhen
Using ERR_CAST() is more reasonable and safer, When it is necessary to convert the type of an error pointer and return it. Signed-off-by: Yan Zhen <yanzhen@vivo.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20240829095509.3151987-1-yanzhen@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>