Age | Commit message (Collapse) | Author |
|
The cn10k hardware ptp timestamp format has been modified primarily
to support 1-step ptp clock. The 64-bit timestamp used by hardware is
split into two 32-bit fields, the upper one holds seconds, the lower
one nanoseconds. A new register (PTP_CLOCK_SEC) has been added that
returns the current seconds value. The nanoseconds register PTP_CLOCK_HI
resets after every second. The cn10k RPM block provides Rx/Tx timestamps
to the NIX block using the new timestamp format. The software can read
the current timestamp in nanoseconds by reading both PTP_CLOCK_SEC &
PTP_CLOCK_HI registers.
This patch provides support for new timestamp format.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hangbin Liu says:
====================
bonding: add IPv6 NS/NA monitor support
This patch add bond IPv6 NS/NA monitor support. A new option
ns_ip6_target is added, which is similar with arp_ip_target.
The IPv6 NS/NA monitor will take effect when there is a valid IPv6
address. Both ARP monitor and NS monitor will working at the same time.
A new extra storage field is added to struct bond_opt_value for IPv6 support.
Function bond_handle_vlan() is split from bond_arp_send() for both
IPv4/IPv6 usage.
To alloc NS message and send out. ndisc_ns_create() and ndisc_send_skb()
are exported.
v1 -> v2:
1. remove sysfs entry[1] and only keep netlink support.
RFC -> v1:
1. define BOND_MAX_ND_TARGETS as BOND_MAX_ARP_TARGETS
2. adjust for reverse xmas tree ordering of local variables
3. remove bond_do_ns_validate()
4. add extra field for bond_opt_value
5. set IS_ENABLED(CONFIG_IPV6) for IPv6 codes
[1] https://lore.kernel.org/netdev/8863.1645071997@famine
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch add a new bonding option ns_ip6_target, which correspond
to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS
request to determine the health of the link.
For other related options like the validation, we still use
arp_validate, and will change to ns_validate later.
Note: the sysfs configuration support was removed based on
https://lore.kernel.org/netdev/8863.1645071997@famine
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new bonding parameter ns_targets to store IPv6 address.
Add required bond_ns_send/rcv functions first before adding
IPv6 address option setting.
Add two functions bond_send/rcv_validate so we can send/recv
ARP and NS at the same time.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adding an extra storage field for bond_opt_value so we can set large
bytes of data for bonding options in future, e.g. IPv6 address.
Define a new call bond_opt_initextra(). Also change the checking order of
__bond_opt_init() and check values first.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Function bond_handle_vlan() is split from bond_arp_send() for later
IPv6 usage.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch separate NS message allocation steps from ndisc_send_ns(),
so it could be used in other places, like bonding, to allocate and
send IPv6 NS message.
Also export ndisc_send_skb() and ndisc_ns_create() for later bonding usage.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'max_rx_len' can be up to GBETH_RX_BUFF_MAX (i.e. 8192) (see
'gbeth_hw_info').
The default value of 'num_rx_ring' can be BE_RX_RING_SIZE (i.e. 1024).
So this loop can allocate 8 Mo of memory.
Previous memory allocations in this function already use GFP_KERNEL, so
use __netdev_alloc_skb() and an explicit GFP_KERNEL instead of a
implicit GFP_ATOMIC.
This gives more opportunities of successful allocation.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use skb_put_zero() instead of hand-writing it. This saves a few lines of
code and is more readable.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
ipv4: Invalidate neighbour for broadcast address upon address addition
Patch #1 solves a recently reported issue [1]. See detailed description
in the changelog.
Patch #2 adds a matching test case.
Targeting at net-next since as far as I can tell this use case never
worked.
There are no regressions in fib_tests.sh with this change:
# ./fib_tests.sh
...
Tests passed: 186
Tests failed: 0
[1] https://lore.kernel.org/netdev/55a04a8f-56f3-f73c-2aea-2195923f09d1@huawei.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test that resolved neighbours for IPv4 broadcast addresses are
unaffected by the configuration of matching broadcast routes, whereas
unresolved neighbours are invalidated.
Without previous patch:
# ./fib_tests.sh -t ipv4_bcast_neigh
IPv4 broadcast neighbour tests
TEST: Resolved neighbour for broadcast address [ OK ]
TEST: Resolved neighbour for network broadcast address [ OK ]
TEST: Unresolved neighbour for broadcast address [FAIL]
TEST: Unresolved neighbour for network broadcast address [FAIL]
Tests passed: 2
Tests failed: 2
With previous patch:
# ./fib_tests.sh -t ipv4_bcast_neigh
IPv4 broadcast neighbour tests
TEST: Resolved neighbour for broadcast address [ OK ]
TEST: Resolved neighbour for network broadcast address [ OK ]
TEST: Unresolved neighbour for broadcast address [ OK ]
TEST: Unresolved neighbour for network broadcast address [ OK ]
Tests passed: 4
Tests failed: 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case user space sends a packet destined to a broadcast address when a
matching broadcast route is not configured, the kernel will create a
unicast neighbour entry that will never be resolved [1].
When the broadcast route is configured, the unicast neighbour entry will
not be invalidated and continue to linger, resulting in packets being
dropped.
Solve this by invalidating unresolved neighbour entries for broadcast
addresses after routes for these addresses are internally configured by
the kernel. This allows the kernel to create a broadcast neighbour entry
following the next route lookup.
Another possible solution that is more generic but also more complex is
to have the ARP code register a listener to the FIB notification chain
and invalidate matching neighbour entries upon the addition of broadcast
routes.
It is also possible to wave off the issue as a user space problem, but
it seems a bit excessive to expect user space to be that intimately
familiar with the inner workings of the FIB/neighbour kernel code.
[1] https://lore.kernel.org/netdev/55a04a8f-56f3-f73c-2aea-2195923f09d1@huawei.com/
Reported-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Open coded calculation can be avoided and replaced by the
equivalent csum_replace_by_diff() and csum_sub().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Menglong Dong says:
====================
net: add skb drop reasons to TCP packet receive
In the commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()"),
we added the support of reporting the reasons of skb drops to kfree_skb
tracepoint. And in this series patches, reasons for skb drops are added
to TCP layer (both TCPv4 and TCPv6 are considered).
Following functions are processed:
tcp_v4_rcv()
tcp_v6_rcv()
tcp_v4_inbound_md5_hash()
tcp_v6_inbound_md5_hash()
tcp_add_backlog()
tcp_v4_do_rcv()
tcp_v6_do_rcv()
tcp_rcv_established()
tcp_data_queue()
tcp_data_queue_ofo()
The functions we handled are mostly for packet ingress, as skb drops
hardly happens in the egress path of TCP layer. However, it's a little
complex for TCP state processing, as I find that it's hard to report skb
drop reasons to where it is freed. For example, when skb is dropped in
tcp_rcv_state_process(), the reason can be caused by the call of
tcp_v4_conn_request(), and it's hard to return a drop reason from
tcp_v4_conn_request(). So such cases are skipped for this moment.
Following new drop reasons are introduced (what they mean can be see
in the document for them):
/* SKB_DROP_REASON_TCP_MD5* corresponding to LINUX_MIB_TCPMD5* */
SKB_DROP_REASON_TCP_MD5NOTFOUND
SKB_DROP_REASON_TCP_MD5UNEXPECTED
SKB_DROP_REASON_TCP_MD5FAILURE
SKB_DROP_REASON_SOCKET_BACKLOG
SKB_DROP_REASON_TCP_FLAGS
SKB_DROP_REASON_TCP_ZEROWINDOW
SKB_DROP_REASON_TCP_OLD_DATA
SKB_DROP_REASON_TCP_OVERWINDOW
/* corresponding to LINUX_MIB_TCPOFOMERGE */
SKB_DROP_REASON_TCP_OFOMERGE
Here is a example to get TCP packet drop reasons from ftrace:
$ echo 1 > /sys/kernel/debug/tracing/events/skb/kfree_skb/enable
$ cat /sys/kernel/debug/tracing/trace
$ <idle>-0 [036] ..s1. 647.428165: kfree_skb: skbaddr=000000004d037db6 protocol=2048 location=0000000074cd1243 reason: NO_SOCKET
$ <idle>-0 [020] ..s2. 639.676674: kfree_skb: skbaddr=00000000bcbfa42d protocol=2048 location=00000000bfe89d35 reason: PROTO_MEM
From the reason 'PROTO_MEM' we can know that the skb is dropped because
the memory configured in net.ipv4.tcp_mem is up to the limition.
Changes since v2:
- remove the 'inline' of tcp_drop() in the 1th patch, as Jakub
suggested
Changes since v1:
- enrich the document for this series patches in the cover letter,
as Eric suggested
- fix compile warning report by Jakub in the 6th patch
- let NO_SOCKET trump the XFRM failure in the 2th and 3th patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace tcp_drop() used in tcp_data_queue_ofo with tcp_drop_reason().
Following drop reasons are introduced:
SKB_DROP_REASON_TCP_OFOMERGE
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace tcp_drop() used in tcp_data_queue() with tcp_drop_reason().
Following drop reasons are introduced:
SKB_DROP_REASON_TCP_ZEROWINDOW
SKB_DROP_REASON_TCP_OLD_DATA
SKB_DROP_REASON_TCP_OVERWINDOW
SKB_DROP_REASON_TCP_OLD_DATA is used for the case that end_seq of skb
less than the left edges of receive window. (Maybe there is a better
name?)
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace tcp_drop() used in tcp_rcv_established() with tcp_drop_reason().
Following drop reasons are added:
SKB_DROP_REASON_TCP_FLAGS
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb() used in tcp_v4_do_rcv() and tcp_v6_do_rcv() with
kfree_skb_reason().
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass the address of drop_reason to tcp_add_backlog() to store the
reasons for skb drops when fails. Following drop reasons are
introduced:
SKB_DROP_REASON_SOCKET_BACKLOG
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass the address of drop reason to tcp_v4_inbound_md5_hash() and
tcp_v6_inbound_md5_hash() to store the reasons for skb drops when this
function fails. Therefore, the drop reason can be passed to
kfree_skb_reason() when the skb needs to be freed.
Following drop reasons are added:
SKB_DROP_REASON_TCP_MD5NOTFOUND
SKB_DROP_REASON_TCP_MD5UNEXPECTED
SKB_DROP_REASON_TCP_MD5FAILURE
SKB_DROP_REASON_TCP_MD5* above correspond to LINUX_MIB_TCPMD5*
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb() used in tcp_v6_rcv() with kfree_skb_reason().
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use kfree_skb_reason() for some path in tcp_v4_rcv() that missed before,
including:
SKB_DROP_REASON_SOCKET_FILTER
SKB_DROP_REASON_XFRM_POLICY
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For TCP protocol, tcp_drop() is used to free the skb when it needs
to be dropped. To make use of kfree_skb_reason() and pass the drop
reason to it, introduce the function tcp_drop_reason(). Meanwhile,
make tcp_drop() an inline call to tcp_drop_reason().
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
smatch warnings:
drivers/net/ethernet/marvell/prestera/prestera_acl.c:103
prestera_acl_chain_to_client() error: buffer overflow
'client_map' 3 <= 3
prestera_acl_chain_to_client(u32 chain_index, ...)
...
u32 client_map[] = {
PRESTERA_HW_COUNTER_CLIENT_LOOKUP_0,
PRESTERA_HW_COUNTER_CLIENT_LOOKUP_1,
PRESTERA_HW_COUNTER_CLIENT_LOOKUP_2
};
if (chain_index > ARRAY_SIZE(client_map))
...
Fixes: fa5d824ce5dd ("net: prestera: acl: add multi-chain support offload")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The KSZ9477 SPI driver already has support for the KSZ8563. The same switch
chip can also be managed via i2c and we have an KSZ9477 I2C driver, but
that one lacks the relevant compatible entry. Add it.
DT bindings already describe this compatible.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These two error paths need to release_sock(sk) before returning.
Fixes: a6a6fe27bab4 ("net/smc: Dynamic control handshake limitation by socket options")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide access to HW offloaded packets over stats64 interface.
The rx/tx_bytes values needed some fixing since HW is accounting size of
the Ethernet frame together with FCS.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Russell King says:
====================
net: phylink: remove pcs_poll
This small series removes the now unused pcs_poll members from DSA and
phylink. "git grep pcs_poll drivers/net/ net/" on net-next confirms that
the only places that reference this are in DSA core code and phylink
code:
drivers/net/phy/phylink.c: if (pl->config->pcs_poll || pcs->poll)
drivers/net/phy/phylink.c: poll |= pl->config->pcs_poll;
net/dsa/port.c: dp->pl_config.pcs_poll = ds->pcs_poll;
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
phylink_config's pcs_poll is no longer used, let's get rid of it.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With drivers converted over to using phylink PCS, there is no need for
the struct dsa_switch member "pcs_poll" to exist anymore - there is a
flag in the struct phylink_pcs which indicates whether this PCS needs
to be polled which supersedes this.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When hsr_create_self_node() calls hsr_node_get_first(), the suspicious
RCU usage warning is occurred. The reason why this warning is raised is
the callers of hsr_node_get_first() use rcu_read_lock_bh() and
other different synchronization mechanisms. Thus, this patch solved by
replacing rcu_dereference() with rcu_dereference_bh_check().
The kernel test robot reports:
[ 50.083470][ T3596] =============================
[ 50.088648][ T3596] WARNING: suspicious RCU usage
[ 50.093785][ T3596] 5.17.0-rc3-next-20220208-syzkaller #0 Not tainted
[ 50.100669][ T3596] -----------------------------
[ 50.105513][ T3596] net/hsr/hsr_framereg.c:34 suspicious rcu_dereference_check() usage!
[ 50.113799][ T3596]
[ 50.113799][ T3596] other info that might help us debug this:
[ 50.113799][ T3596]
[ 50.124257][ T3596]
[ 50.124257][ T3596] rcu_scheduler_active = 2, debug_locks = 1
[ 50.132368][ T3596] 2 locks held by syz-executor.0/3596:
[ 50.137863][ T3596] #0: ffffffff8d3357e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x3be/0xb80
[ 50.147470][ T3596] #1: ffff88807ec9d5f0 (&hsr->list_lock){+...}-{2:2}, at: hsr_create_self_node+0x225/0x650
[ 50.157623][ T3596]
[ 50.157623][ T3596] stack backtrace:
[ 50.163510][ T3596] CPU: 1 PID: 3596 Comm: syz-executor.0 Not tainted 5.17.0-rc3-next-20220208-syzkaller #0
[ 50.173381][ T3596] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 50.183623][ T3596] Call Trace:
[ 50.186904][ T3596] <TASK>
[ 50.189844][ T3596] dump_stack_lvl+0xcd/0x134
[ 50.194640][ T3596] hsr_node_get_first+0x9b/0xb0
[ 50.199499][ T3596] hsr_create_self_node+0x22d/0x650
[ 50.204688][ T3596] hsr_dev_finalize+0x2c1/0x7d0
[ 50.209669][ T3596] hsr_newlink+0x315/0x730
[ 50.214113][ T3596] ? hsr_dellink+0x130/0x130
[ 50.218789][ T3596] ? rtnl_create_link+0x7e8/0xc00
[ 50.223803][ T3596] ? hsr_dellink+0x130/0x130
[ 50.228397][ T3596] __rtnl_newlink+0x107c/0x1760
[ 50.233249][ T3596] ? rtnl_setlink+0x3c0/0x3c0
[ 50.238043][ T3596] ? is_bpf_text_address+0x77/0x170
[ 50.243362][ T3596] ? lock_downgrade+0x6e0/0x6e0
[ 50.248219][ T3596] ? unwind_next_frame+0xee1/0x1ce0
[ 50.253605][ T3596] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 50.259669][ T3596] ? __sanitizer_cov_trace_cmp4+0x1c/0x70
[ 50.265423][ T3596] ? is_bpf_text_address+0x99/0x170
[ 50.270819][ T3596] ? kernel_text_address+0x39/0x80
[ 50.275950][ T3596] ? __kernel_text_address+0x9/0x30
[ 50.281336][ T3596] ? unwind_get_return_address+0x51/0x90
[ 50.286975][ T3596] ? create_prof_cpu_mask+0x20/0x20
[ 50.292178][ T3596] ? arch_stack_walk+0x93/0xe0
[ 50.297172][ T3596] ? kmem_cache_alloc_trace+0x42/0x2c0
[ 50.302637][ T3596] ? rcu_read_lock_sched_held+0x3a/0x70
[ 50.308194][ T3596] rtnl_newlink+0x64/0xa0
[ 50.312524][ T3596] ? __rtnl_newlink+0x1760/0x1760
[ 50.317545][ T3596] rtnetlink_rcv_msg+0x413/0xb80
[ 50.322631][ T3596] ? rtnl_newlink+0xa0/0xa0
[ 50.327159][ T3596] netlink_rcv_skb+0x153/0x420
[ 50.331931][ T3596] ? rtnl_newlink+0xa0/0xa0
[ 50.336436][ T3596] ? netlink_ack+0xa80/0xa80
[ 50.341095][ T3596] ? netlink_deliver_tap+0x1a2/0xc40
[ 50.346532][ T3596] ? netlink_deliver_tap+0x1b1/0xc40
[ 50.351839][ T3596] netlink_unicast+0x539/0x7e0
[ 50.356633][ T3596] ? netlink_attachskb+0x880/0x880
[ 50.361750][ T3596] ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70
[ 50.368003][ T3596] ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70
[ 50.374707][ T3596] ? __phys_addr_symbol+0x2c/0x70
[ 50.379753][ T3596] ? __sanitizer_cov_trace_cmp8+0x1d/0x70
[ 50.385568][ T3596] ? __check_object_size+0x16c/0x4f0
[ 50.390859][ T3596] netlink_sendmsg+0x904/0xe00
[ 50.395715][ T3596] ? netlink_unicast+0x7e0/0x7e0
[ 50.400722][ T3596] ? __sanitizer_cov_trace_const_cmp4+0x1c/0x70
[ 50.407003][ T3596] ? netlink_unicast+0x7e0/0x7e0
[ 50.412119][ T3596] sock_sendmsg+0xcf/0x120
[ 50.416548][ T3596] __sys_sendto+0x21c/0x320
[ 50.421052][ T3596] ? __ia32_sys_getpeername+0xb0/0xb0
[ 50.426427][ T3596] ? lockdep_hardirqs_on_prepare+0x400/0x400
[ 50.432721][ T3596] ? __context_tracking_exit+0xb8/0xe0
[ 50.438188][ T3596] ? lock_downgrade+0x6e0/0x6e0
[ 50.443041][ T3596] ? lock_downgrade+0x6e0/0x6e0
[ 50.447902][ T3596] __x64_sys_sendto+0xdd/0x1b0
[ 50.452759][ T3596] ? lockdep_hardirqs_on+0x79/0x100
[ 50.457964][ T3596] ? syscall_enter_from_user_mode+0x21/0x70
[ 50.464150][ T3596] do_syscall_64+0x35/0xb0
[ 50.468565][ T3596] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 50.474452][ T3596] RIP: 0033:0x7f3148504e1c
[ 50.479052][ T3596] Code: fa fa ff ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 20 fb ff ff 48 8b
[ 50.498926][ T3596] RSP: 002b:00007ffeab5f2ab0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
[ 50.507342][ T3596] RAX: ffffffffffffffda RBX: 00007f314959d320 RCX: 00007f3148504e1c
[ 50.515393][ T3596] RDX: 0000000000000048 RSI: 00007f314959d370 RDI: 0000000000000003
[ 50.523444][ T3596] RBP: 0000000000000000 R08: 00007ffeab5f2b04 R09: 000000000000000c
[ 50.531492][ T3596] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 50.539455][ T3596] R13: 00007f314959d370 R14: 0000000000000003 R15: 0000000000000000
Fixes: 4acc45db7115 ("net: hsr: use hlist_head instead of list_head for mac addresses")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: syzbot+f0eb4f3876de066b128c@syzkaller.appspotmail.com
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use kcalloc() instead of kmalloc_array() and a loop to set all the values
of the array to NULL.
While at it, remove a duplicated assignment to 'scq->num_entries'.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Radu Bulie says:
====================
Provide direct access to 1588 one step register
DPAA2 MAC supports 1588 one step timestamping.
If this option is enabled then for each transmitted PTP event packet,
the 1588 SINGLE_STEP register is accessed to modify the following fields:
-offset of the correction field inside the PTP packet
-UDP checksum update bit, in case the PTP event packet has
UDP encapsulation
These values can change any time, because there may be multiple
PTP clients connected, that receive various 1588 frame types:
- L2 only frame
- UDP / Ipv4
- UDP / Ipv6
- other
The current implementation uses dpni_set_single_step_cfg to update the
SINLGE_STEP register.
Using an MC command on the Tx datapath for each transmitted 1588 message
introduces high delays, leading to low throughput and consequently to a
small number of supported PTP clients. Besides these, the nanosecond
correction field from the PTP packet will contain the high delay from the
driver which together with the originTimestamp will render timestamp
values that are unacceptable in a GM clock implementation.
This patch series replaces the dpni_set_single_step_cfg function call from
the Tx datapath for 1588 messages (when one step timestamping is enabled)
with a callback that either implements direct access to the SINGLE_STEP
register, eliminating the overhead caused by the MC command that will need
to be dispatched by the MC firmware through the MC command portal
interface or falls back to the dpni_set_single_step_cfg in case the MC
version does not have support for returning the single step register
base address.
In other words all the delay introduced by dpni_set_single_step_cfg
function will be eliminated (if MC version has support for returning the
base address of the single step register), improving the egress driver
performance for PTP packets when single step timestamping is enabled.
The first patch adds a new attribute that contains the base address of
the SINGLE_STEP register. It will be used to directly update the register
on the Tx datapath.
The second patch updates the driver such that the SINGLE_STEP
register is either accessed directly if MC version >= 10.32 or is
accessed through dpni_set_single_step_cfg command when 1588 messages
are transmitted.
Changes in v2:
- move global function pointer into the driver's private structure in 2/2
- move repetitive code outside the body of the callback functions in 2/2
- update function dpaa2_ptp_onestep_reg_update_method and remove goto
statement from non error path in 2/2
Changes in v3:
- remove static storage class specifier from within the structure in 2/2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DPAA2 MAC supports 1588 one step timestamping.
If this option is enabled then for each transmitted PTP event packet,
the 1588 SINGLE_STEP register is accessed to modify the following fields:
-offset of the correction field inside the PTP packet
-UDP checksum update bit, in case the PTP event packet has
UDP encapsulation
These values can change any time, because there may be multiple
PTP clients connected, that receive various 1588 frame types:
- L2 only frame
- UDP / Ipv4
- UDP / Ipv6
- other
The current implementation uses dpni_set_single_step_cfg to update the
SINLGE_STEP register.
Using an MC command on the Tx datapath for each transmitted 1588 message
introduces high delays, leading to low throughput and consequently to a
small number of supported PTP clients. Besides these, the nanosecond
correction field from the PTP packet will contain the high delay from the
driver which together with the originTimestamp will render timestamp
values that are unacceptable in a GM clock implementation.
This patch updates the Tx datapath for 1588 messages when single step
timestamp is enabled and provides direct access to SINGLE_STEP register,
eliminating the overhead caused by the dpni_set_single_step_cfg
MC command. MC version >= 10.32 implements this functionality.
If the MC version does not have support for returning the
single step register base address, the driver will use
dpni_set_single_step_cfg command for updates operations.
All the delay introduced by dpni_set_single_step_cfg
function will be eliminated (if MC version has support for returning the
base address of the single step register), improving the egress driver
performance for PTP packets when single step timestamping is enabled.
Before these changes the maximum throughput for 1588 messages with
single step hardware timestamp enabled was around 2000pps.
After the updates the throughput increased up to 32.82 Mbps / 46631.02 pps.
Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dpni_get_single_step_cfg is an MC firmware command used for
retrieving the contents of SINGLE_STEP 1588 register available
in a DPMAC.
This patch adds a new version of this command that returns as an extra
argument the physical base address of the aforementioned register.
The address will be used to directly modify the contents of the
SINGLE_STEP register instead of invoking the MC command
dpni_set_single_step_cgf. The former approach introduced huge delays on
the TX datapath when one step PTP events were transmitted. This led to low
throughput and high latencies observed in the PTP correction field.
Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After recent patches, and in particular commits
faab39f63c1f ("net: allow out-of-order netdev unregistration") and
e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev")
we no longer need the barrier implemented in rtnl_lock_unregistering().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix flower destroy template callback to release template
only for specific tc chain instead of all chain tempaltes.
The issue was intruduced by previous commit that introduced
multi-chain support.
Fixes: fa5d824ce5dd ("net: prestera: acl: add multi-chain support offload")
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
cleanup_net() is competing with other rtnl users.
Instead of calling br_net_exit() for each netns,
call br_net_exit_batch() once.
This gives cleanup_net() ability to group more devices
and call unregister_netdevice_many() only once for all bridge devices.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Matt Johnston says:
====================
MCTP I2C driver
This patch series adds a netdev driver providing MCTP transport over
I2C.
I think I've addressed all the points raised in v5. It now has
mctp_i2c_unregister() to run things in the correct order, waiting for
the worker thread and I2C rx to complete.
Cheers,
Matt
--
v6:
- Changed netdev register/unregister/free to avoid races. Ensure that
netif functions are not used by irq handler/threads after unregister.
- Fix incoming I2C hwaddr that was previously incorrect (left
shifted 1 bit)
- Add a check that byte_count wire header matches the length received
- Renamed I2C driver to mctp-i2c-interface
- Removed __func__ from print messages, added missing newlines
- Removed sysfs mctp_current_mux file which was used for debug
- Renamed curr_lock to sel_lock
- Tidied comment formatting
- Fix newline in Kconfig
v5:
- Fix incorrect format string
v4:
- Switch to __i2c_transfer() rather than __i2c_smbus_xfer(), drop 255 byte
smbus patches
- Use wait_event_idle() for the sleeping TX thread
- Use dev_addr_set()
v3:
- Added Reviewed-bys for npcm7xx
- Resend with net-next open
v2:
- Simpler Kconfig condition for i2c-mux dependency, from Randy Dunlap
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provides MCTP network transport over an I2C bus, as specified in
DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes.
Each I2C bus to be used for MCTP is flagged in devicetree by a
'mctp-controller' property on the bus node. Each flagged bus gets a
mctpi2cX net device created based on the bus number. A
'mctp-i2c-controller' I2C client needs to be added under the adapter. In
an I2C mux situation the mctp-i2c-controller node must be attached only
to the root I2C bus. The I2C client will handle incoming I2C slave block
write data for subordinate busses as well as its own bus.
In configurations without devicetree a driver instance can be attached
to a bus using the I2C slave new_device mechanism.
The MCTP core will hold/release the MCTP I2C device while responses
are pending (a 6 second timeout or once a socket is closed, response
received etc). While held the MCTP I2C driver will lock the I2C bus so
that the correct I2C mux remains selected while responses are received.
(Ideally we would just lock the mux to keep the current bus selected for
the response rather than a full I2C bus lock, but that isn't exposed in
the I2C mux API)
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Wolfram Sang <wsa@kernel.org> # I2C transport parts
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Used to define a local endpoint to communicate with MCTP peripherals
attached to an I2C bus. This I2C endpoint can communicate with remote
MCTP devices on the I2C bus.
In the example I2C topology below (matching the second yaml example) we
have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are
indicated by the 'mctp-controller' DT property on an I2C bus node.
A mctp-i2c-controller I2C client DT node is placed at the top of the
mux topology, since only the root I2C adapter will support I2C slave
functionality.
.-------.
|eeprom |
.------------. .------. /'-------'
| adapter | | mux --@0,i2c5------'
| i2c1 ----.*| --@1,i2c6--.--.
|............| \'------' \ \ .........
| mctp-i2c- | \ \ \ .mctpB .
| controller | \ \ '.0x30 .
| | \ ......... \ '.......'
| 0x50 | \ .mctpA . \ .........
'------------' '.0x1d . '.mctpC .
'.......' '.0x31 .
'.......'
(mctpX boxes above are remote MCTP devices not included in the DT at
present, they can be hotplugged/probed at runtime. A DT binding for
specific fixed MCTP devices could be added later if required)
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support for MRT6MSG_WRMIFWHOLE which is used to pass
full packet and real vif id when the incoming interface is wrong.
While the RP and FHR are setting up state we need to be sending the
registers encapsulated with all the data inside otherwise we lose it.
The RP then decapsulates it and forwards it to the interested parties.
Currently with WRONGMIF we can only be sending empty register packets
and will lose that data.
This behaviour can be enabled by using MRT_PIM with
val == MRT6MSG_WRMIFWHOLE. This doesn't prevent MRT6MSG_WRONGMIF from
happening, it happens in addition to it, also it is controlled by the same
throttling parameters as WRONGMIF (i.e. 1 packet per 3 seconds currently).
Both messages are generated to keep backwards compatibily and avoid
breaking someone who was enabling MRT_PIM with val == 4, since any
positive val is accepted and treated the same.
Signed-off-by: Mobashshera Rasool <mobash.rasool.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 'if (ntu == rx_ring->count)' block in i40e_alloc_rx_buffers_zc()
was previously residing in the loop, but after introducing the
batched interface it is used only to wrap-around the NTU descriptor,
thus no more need to assign 'xdp'.
'cleaned_count' in i40e_clean_rx_irq_zc() was previously being
incremented in the loop, but after commit f12738b6ec06
("i40e: remove unnecessary cleaned_count updates") it gets
assigned only once after it, so the initialization can be dropped.
Fixes: 6aab0bb0c5cd ("i40e: Use the xsk batched rx allocation interface")
Fixes: f12738b6ec06 ("i40e: remove unnecessary cleaned_count updates")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jeremy Kerr says:
====================
Add checks for incoming packet addresses
This series adds a couple of checks for valid addresses on incoming MCTP
packets. We introduce a couple of helpers in 1/2, and use them in the
ingress path in 2/2.
====================
Link: https://lore.kernel.org/r/20220218042554.564787-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This change adds some basic sanity checks for the source and dest
headers of packets on initial receive.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, we have mctp_address_ok(), which checks if an EID is in the
"valid" range of 8-254 inclusive. However, 0 and 255 may also be valid
addresses, depending on context. 0 is the NULL EID, which may be set
when physical addressing is used. 255 is valid as a destination address
for broadcasts.
This change renames mctp_address_ok to mctp_address_unicast, and adds
similar helpers for broadcast and null EIDs, which will be used in an
upcoming commit.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds a new protocol attribute to IPv4 and IPv6 addresses.
Inspiration was taken from the protocol attribute of routes. User space
applications like iproute2 can set/get the protocol with the Netlink API.
The attribute is stored as an 8-bit unsigned integer.
The protocol attribute is set by kernel for these categories:
- IPv4 and IPv6 loopback addresses
- IPv6 addresses generated from router announcements
- IPv6 link local addresses
User space may pass custom protocols, not defined by the kernel.
Grouping addresses on their origin is useful in scenarios where you want
to distinguish between addresses based on who added them, e.g. kernel
vs. user space.
Tagging addresses with a string label is an existing feature that could be
used as a solution. Unfortunately the max length of a label is
15 characters, and for compatibility reasons the label must be prefixed
with the name of the device followed by a colon. Since device names also
have a max length of 15 characters, only -1 characters is guaranteed to be
available for any origin tag, which is not that much.
A reference implementation of user space setting and getting protocols
is available for iproute2:
https://github.com/westermo/iproute2/commit/9a6ea18bd79f47f293e5edc7780f315ea42ff540
Signed-off-by: Jacques de Laval <Jacques.De.Laval@westermo.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220217150202.80802-1-Jacques.De.Laval@westermo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Shannon Nelson says:
====================
ionic: driver updates
These are a couple of checkpatch cleanup patches, a bug fix,
and something to alleviate memory pressure in tight places.
====================
Link: https://lore.kernel.org/r/20220217220252.52293-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix up some checkpatch complaints that have crept in:
doubled words words, mispellled words, doubled lines.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace strlcpy with strscpy to clean up a checkpatch complaint.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|