Age | Commit message (Collapse) | Author |
|
No need to postpone this to the commit release path, since no packets
are walking over this object, this is accessed from control plane only.
This helped uncovered UAF triggered by races with the netlink notifier.
Fixes: 9dd732e0bdf5 ("netfilter: nf_tables: memleak flow rule from commit path")
Reported-by: syzbot+8f747f62763bc6c32916@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
commit release path is invoked via call_rcu and it runs lockless to
release the objects after rcu grace period. The netlink notifier handler
might win race to remove objects that the transaction context is still
referencing from the commit release path.
Call rcu_barrier() to ensure pending rcu callbacks run to completion
if the list of transactions to be destroyed is not empty.
Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership")
Reported-by: syzbot+8f747f62763bc6c32916@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In nft_inner_parse_l2l3(), the return value of skb_header_pointer() is
'veth' instead of 'eth' when case 'htons(ETH_P_8021Q)' and fix it.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Peng Wu <wupeng58@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
GRE_VERSION and GRE_VERSION0 are expressed in network byte order,
use __be16. Uncovered by sparse:
net/netfilter/nft_payload.c:112:25: warning: incorrect type in assignment (different base types)
net/netfilter/nft_payload.c:112:25: expected unsigned int [usertype] version
net/netfilter/nft_payload.c:112:25: got restricted __be16
net/netfilter/nft_payload.c:114:22: warning: restricted __be16 degrades to integer
Fixes: c247897d7c19 ("netfilter: nft_payload: access GRE payload via inner offset")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
As a first strep in introducing proper PAN management and association,
we need to be able to create coordinator interfaces which might act as
coordinator or PAN coordinator.
Hence, let's add the minimum support to allow the creation of these
interfaces.
Even though the necessary logic to handle several interfaces on the same
device is added to make this future move easier, in practice only
several interfaces of type MONITOR are allowed at the same time. The
other combinations are not allowed (interface creation is possible but
only one can be opened at a time) because, with a single PHY featuring a
single set of address filters, we cannot afford handling two distinct
interfaces (with different address filters or filtering requirements):
* Having 2 NODEs, 2 COORDs or 1 NODE + 1 COORD
-> cannot work because the address filters would be different
* Having 1 MONITOR + either 1 NODE or 1 COORD
-> cannot work because the filtering levels are incompatible
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221026093502.602734-4-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
While going through the whole interface opening logic in my head I was
consistently bothered by the condition checking whether there was only
one interface of type NODE/COORD opened at the same time. What actually
bothered me was the fact that in one case we would use the wpan_dev
pointer directly while in the other case we would use the sdata pointer,
making it harder to differentiate both. In practice the condition should
be straightforward to read. IMHO dropping the wpan_dev indirection
allows to clarify the check.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221026093502.602734-3-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
It may appear clearer to free the skb at the end of the path rather than
in the middle, within a helper.
Move kfree_skb() from the end of __ieee802154_rx_handle_packet() to
right after it in the calling function ieee802154_rx(). Doing so implies
reworking a little bit the exit path.
Suggested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221026093502.602734-2-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
IPv4 reassembly unit can decide to drop frags based on
/proc/sys/net/ipv4/ipfrag_max_dist sysctl.
Add a specific drop reason to track this specific
and weird case.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Used to track skbs freed after a timeout happened
in a reassmbly unit.
Passing a @reason argument to inet_frag_rbtree_purge()
allows to use correct consumed status for frags
that have been successfully re-assembled.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is used to track when a duplicate segment received by various
reassembly units is dropped.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When an skb with a frag list is consumed, we currently
pretend all skbs in the frag list were dropped.
In order to fix this, add a @reason argument to skb_release_data()
and skb_release_all().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This will allow to simply use in the future:
kfree_skb_reason(skb, reason);
Instead of repeating sequences like:
if (dropped)
kfree_skb_reason(skb, reason);
else
consume_skb(skb);
For instance, following patch in the series is adding
@reason to skb_release_data() and skb_release_all(),
so that we can propagate a meaningful @reason whenever
consume_skb()/kfree_skb() have to take care of a potential frag_list.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch use the new helper unregister_netdevice_many_notify() for
rtnl_delete_link(), so that the kernel could reply unicast when userspace
set NLM_F_ECHO flag to request the new created interface info.
At the same time, the parameters of rtnl_delete_link() need to be updated
since we need nlmsghdr and portid info.
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch pass the netlink header message in rtnl_newlink_create() to
the new updated rtnl_configure_link(), so that the kernel could reply
unicast when userspace set NLM_F_ECHO flag to request the new created
interface info.
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add new helper unregister_netdevice_many_notify(), pass netlink message
header and portid, which could be used to notify userspace when flag
NLM_F_ECHO is set.
Make the unregister_netdevice_many() as a wrapper of new function
unregister_netdevice_many_notify().
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch pass netlink message header and portid to rtnl_configure_link()
All the functions in this call chain need to add the parameters so we can
use them in the last call rtnl_notify(), and notify the userspace about
the new link info if NLM_F_ECHO flag is set.
- rtnl_configure_link()
- __dev_notify_flags()
- rtmsg_ifinfo()
- rtmsg_ifinfo_event()
- rtmsg_ifinfo_build_skb()
- rtmsg_ifinfo_send()
- rtnl_notify()
Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub
suggested.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We can't use "skb" again after passing it to qdisc_enqueue(). This is
basically identical to commit 2f09707d0c97 ("sch_sfb: Also store skb
len before calling child enqueue").
Fixes: d7f4f332f082 ("sch_red: update backlog as well")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Clean up the use of unsafe_memcpy() by adding a flexible array
at the end of netlink message header and splitting up the header
and data copies.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend packet socket option PACKET_IGNORE_OUTGOING to fanout groups.
The socket option sets ptype.ignore_outgoing, which makes
dev_queue_xmit_nit skip the socket.
When the socket joins a fanout group, the option is not reflected in
the struct ptype of the group. dev_queue_xmit_nit only tests the
fanout ptype, so the flag is ignored once a socket joins a
fanout group.
Inheriting the option from a socket would change established behavior.
Different sockets in the group can set different flags, and can also
change them at runtime.
Testing in packet_rcv_fanout defeats the purpose of the original
patch, which is to avoid skb_clone in dev_queue_xmit_nit (esp. for
MSG_ZEROCOPY packets).
Instead, introduce a new fanout group flag with the same behavior.
Tested with https://github.com/wdebruij/kerneltools/blob/master/tests/test_psock_fanout_ignore_outgoing.c
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221027211014.3581513-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DSA tagging protocol drivers can be changed at runtime through sysfs and
at probe time through the device tree (support for the latter was added
later).
When changing through sysfs, it is assumed that the module for the new
tagging protocol was already loaded into the kernel (in fact this is
only a concern for Ocelot/Felix switches, where we have tag_ocelot.ko
and tag_ocelot_8021q.ko; for every other switch, the default and
alternative protocols are compiled within the same .ko, so there is
nothing for the user to load).
The kernel cannot currently call request_module(), because it has no way
of constructing the modalias name of the tagging protocol driver
("dsa_tag-%d", where the number is one of DSA_TAG_PROTO_*_VALUE).
The device tree only contains the string name of the tagging protocol
("ocelot-8021q"), and the only mapping between the string and the
DSA_TAG_PROTO_OCELOT_8021Q_VALUE is present in tag_ocelot_8021q.ko.
So this is a chicken-and-egg situation and dsa_core.ko has nothing based
on which it can automatically request the insertion of the module.
As a consequence, if CONFIG_NET_DSA_TAG_OCELOT_8021Q is built as module,
the switch will forever defer probing.
The long-term solution is to make DSA call request_module() somehow,
but that probably needs some refactoring.
What we can do to keep operating with existing device tree blobs is to
cancel the attempt to change the tagging protocol with the one specified
there, and to remain operating with the default one. Depending on the
situation, the default protocol might still allow some functionality
(in the case of ocelot, it does), and it's better to have that than to
fail to probe.
Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT")
Link: https://lore.kernel.org/lkml/20221027113248.420216-1-michael@walle.cc/
Reported-by: Heiko Thiery <heiko.thiery@gmail.com>
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221027145439.3086017-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Without this only the client initiated tcp sockets have SOCK_SUPPORT_ZC.
The listening socket on the server also has it, but the accepted
connections didn't, which meant IORING_OP_SEND[MSG]_ZC will always
fails with -EOPNOTSUPP.
Fixes: e993ffe3da4b ("net: flag sockets supporting msghdr originated zerocopy")
Cc: <stable@vger.kernel.org> # 6.0
CC: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/io-uring/20221024141503.22b4e251@kernel.org/T/#m38aa19b0b825758fb97860a38ad13122051f9dda
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove SOCK_SUPPORT_ZC when we're setting ulp as it might not support
msghdr::ubuf_info, e.g. like TLS replacing ->sk_prot with a new set of
handlers.
Cc: <stable@vger.kernel.org> # 6.0
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sockmap replaces ->sk_prot with its own callbacks, we should remove
SOCK_SUPPORT_ZC as the new proto doesn't support msghdr::ubuf_info.
Cc: <stable@vger.kernel.org> # 6.0
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Mark udp ipv6 as supporting msghdr::ubuf_info. In the original commit
SOCK_SUPPORT_ZC was supposed to be set by a udp_init_sock() call from
udp6_init_sock(), but
d38afeec26ed4 ("tcp/udp: Call inet6_destroy_sock() in IPv6 ...")
removed it and so ipv6 udp misses the flag.
Cc: <stable@vger.kernel.org> # 6.0
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I missed one of the families in OvS when annotating .resv_start_op.
This triggers the warning added in commit ce48ebdd5651 ("genetlink:
limit the use of validation workarounds to old ops").
Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com
Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
====================
pull-request: wireless-next-2022-10-28
First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7.
All mac80211 driver are now converted to use internal TX queues, this
might cause some regressions so we wanted to do this early in the
cycle.
Note: wireless tree was merged[1] to wireless-next to avoid some
conflicts with mac80211 patches between the trees. Unfortunately there
are still two smaller conflicts in net/mac80211/util.c which Stephen
also reported[2]. In the first conflict initialise scratch_len to
"params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and
in the second conflict take the version which uses elems->scratch_pos.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91
[2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/
mac80211
- preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues
- add API to show the link STAs in debugfs
- all mac80211 drivers are now using mac80211 internal TX queues (iTXQs)
rtw89
- support 8852BE
rtl8xxxu
- support RTL8188FU
brmfmac
- support two station interfaces concurrently
bcma
- support SPROM rev 11
====================
Link: https://lore.kernel.org/r/20221028132943.304ECC433B5@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
rcv_wnd can be useful to diagnose TCP performance where receiver window
becomes the bottleneck. rehash reports the PLB and timeout triggered
rehash attempts by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A u32 counter is added to tcp_sock for counting the number of PLB
triggered rehashes for a TCP connection. An SNMP counter is also
added to count overall PLB triggered rehash events for a host. These
counters are hooked up to PLB implementation for DCTCP.
TCP_NLA_REHASH is added to SCM_TIMESTAMPING_OPT_STATS that reports
the rehash attempts triggered due to PLB or timeouts. This gives
a historical view of sustained congestion or timeouts experienced
by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PLB support is added to TCP DCTCP code. As DCTCP uses ECN as the
congestion signal, PLB also uses ECN to make decisions whether to change
the path or not upon sustained congestion.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Congestion control algorithms track PLB state and cause the connection
to trigger a path change when either of the 2 conditions is satisfied:
- No packets are in flight and (# consecutive congested rounds >=
sysctl_tcp_plb_idle_rehash_rounds)
- (# consecutive congested rounds >= sysctl_tcp_plb_rehash_rounds)
A round (RTT) is marked as congested when congestion signal
(ECN ce_ratio) over an RTT is greater than sysctl_tcp_plb_cong_thresh.
In the event of RTO, PLB (via tcp_write_timeout()) triggers a path
change and disables congestion-triggered path changes for random time
between (sysctl_tcp_plb_suspend_rto_sec, 2*sysctl_tcp_plb_suspend_rto_sec)
to avoid hopping onto the "connectivity blackhole". RTO-triggered
path changes can still happen during this cool-off period.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PLB (Protective Load Balancing) is a host based mechanism for load
balancing across switch links. It leverages congestion signals(e.g. ECN)
from transport layer to randomly change the path of the connection
experiencing congestion. PLB changes the path of the connection by
changing the outgoing IPv6 flow label for IPv6 connections (implemented
in Linux by calling sk_rethink_txhash()). Because of this implementation
mechanism, PLB can currently only work for IPv6 traffic. For more
information, see the SIGCOMM 2022 paper:
https://doi.org/10.1145/3544216.3544226
This commit adds new sysctl knobs and sets their default values for
TCP PLB.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
1) Move struct nft_payload_set definition to .c file where it is
only used.
2) Shrink transport and inner header offset fields in the nft_pktinfo
structure to 16-bits, from Florian Westphal.
3) Get rid of nft_objref Kbuild toggle, make it built-in into
nf_tables. This expression is used to instantiate conntrack helpers
in nftables. After removing the conntrack helper auto-assignment
toggle it this feature became more important so move it to the nf_tables
core module. Also from Florian.
4) Extend the existing function to calculate payload inner header offset
to deal with the GRE and IPIP transport protocols.
6) Add inner expression support for nf_tables. This new expression
provides a packet parser for tunneled packets which uses a userspace
description of the expected inner headers. The inner expression
invokes the payload expression (via direct call) to match on the
inner header protocol fields using the inner link, network and
transport header offsets.
An example of the bytecode generated from userspace to match on
IP source encapsulated in a VxLAN packet:
# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
netdev x y
[ meta load l4proto => reg 1 ]
[ cmp eq reg 1 0x00000011 ]
[ payload load 2b @ transport header + 2 => reg 1 ]
[ cmp eq reg 1 0x0000b512 ]
[ inner type vxlan hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
[ cmp eq reg 1 0x00000008 ]
[ inner type vxlan hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
[ cmp eq reg 1 0x04030201 ]
7) Store inner link, network and transport header offsets in percpu
area to parse inner packet header once only. Matching on a different
tunnel type invalidates existing offsets in the percpu area and it
invokes the inner tunnel parser again.
8) Add support for inner meta matching. This support for
NFTA_META_PROTOCOL, which specifies the inner ethertype, and
NFT_META_L4PROTO, which specifies the inner transport protocol.
9) Extend nft_inner to parse GENEVE optional fields to calculate the
link layer offset.
10) Update inner expression so tunnel offset points to GRE header
to normalize tunnel header handling. This also allows to perform
different interpretations of the GRE header from userspace.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nft_inner: set tunnel offset to GRE header offset
netfilter: nft_inner: add geneve support
netfilter: nft_meta: add inner match support
netfilter: nft_inner: add percpu inner context
netfilter: nft_inner: support for inner tunnel header matching
netfilter: nft_payload: access ipip payload for inner offset
netfilter: nft_payload: access GRE payload via inner offset
netfilter: nft_objref: make it builtin
netfilter: nf_tables: reduce nft_pktinfo by 8 bytes
netfilter: nft_payload: move struct nft_payload_set definition where it belongs
====================
Link: https://lore.kernel.org/r/20221026132227.3287-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion")
abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If a zero length is passed to kmalloc() it returns 0x10, which is
not a valid address. gss_unwrap_resp_integ() subsequently crashes
when it attempts to dereference that pointer.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
There is a null-ptr-deref when xps sysfs alloc failed:
BUG: KASAN: null-ptr-deref in sysfs_do_create_link_sd+0x40/0xd0
Read of size 8 at addr 0000000000000030 by task gssproxy/457
CPU: 5 PID: 457 Comm: gssproxy Not tainted 6.0.0-09040-g02357b27ee03 #9
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
kasan_report+0xa3/0x120
sysfs_do_create_link_sd+0x40/0xd0
rpc_sysfs_client_setup+0x161/0x1b0
rpc_new_client+0x3fc/0x6e0
rpc_create_xprt+0x71/0x220
rpc_create+0x1d4/0x350
gssp_rpc_create+0xc3/0x160
set_gssp_clnt+0xbc/0x140
write_gssp+0x116/0x1a0
proc_reg_write+0xd6/0x130
vfs_write+0x177/0x690
ksys_write+0xb9/0x150
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
When the xprt_switch sysfs alloc failed, should not add xprt and
switch sysfs to it, otherwise, maybe null-ptr-deref; also initialize
the 'xps_sysfs' to NULL to avoid oops when destroy it.
Fixes: 2a338a543163 ("sunrpc: add a symlink from rpc-client directory to the xprt_switch")
Fixes: d408ebe04ac5 ("sunrpc: add add sysfs directory per xprt under each xprt_switch")
Fixes: baea99445dd4 ("sunrpc: add xprt_switch direcotry to sunrpc's sysfs")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Similar to changes done in TCP in blamed commit.
We should not sense pfmemalloc status in sendpage() methods.
Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221027040637.1107703-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
skb_append_pagefrags() is used by af_unix and udp sendpage()
implementation so far.
In commit 326140063946 ("tcp: TX zerocopy should not sense
pfmemalloc status") we explained why we should not sense
pfmemalloc status for pages owned by user space.
We should also use skb_fill_page_desc_noacc()
in skb_append_pagefrags() to avoid following KCSAN report:
BUG: KCSAN: data-race in lru_add_fn / skb_append_pagefrags
write to 0xffffea00058fc1c8 of 8 bytes by task 17319 on cpu 0:
__list_add include/linux/list.h:73 [inline]
list_add include/linux/list.h:88 [inline]
lruvec_add_folio include/linux/mm_inline.h:323 [inline]
lru_add_fn+0x327/0x410 mm/swap.c:228
folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
lru_add_drain_cpu+0x73/0x250 mm/swap.c:669
lru_add_drain+0x21/0x60 mm/swap.c:773
free_pages_and_swap_cache+0x16/0x70 mm/swap_state.c:311
tlb_batch_pages_flush mm/mmu_gather.c:59 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:256 [inline]
tlb_flush_mmu+0x5b2/0x640 mm/mmu_gather.c:263
tlb_finish_mmu+0x86/0x100 mm/mmu_gather.c:363
exit_mmap+0x190/0x4d0 mm/mmap.c:3098
__mmput+0x27/0x1b0 kernel/fork.c:1185
mmput+0x3d/0x50 kernel/fork.c:1207
copy_process+0x19fc/0x2100 kernel/fork.c:2518
kernel_clone+0x166/0x550 kernel/fork.c:2671
__do_sys_clone kernel/fork.c:2812 [inline]
__se_sys_clone kernel/fork.c:2796 [inline]
__x64_sys_clone+0xc3/0xf0 kernel/fork.c:2796
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffffea00058fc1c8 of 8 bytes by task 17325 on cpu 1:
page_is_pfmemalloc include/linux/mm.h:1817 [inline]
__skb_fill_page_desc include/linux/skbuff.h:2432 [inline]
skb_fill_page_desc include/linux/skbuff.h:2453 [inline]
skb_append_pagefrags+0x210/0x600 net/core/skbuff.c:3974
unix_stream_sendpage+0x45e/0x990 net/unix/af_unix.c:2338
kernel_sendpage+0x184/0x300 net/socket.c:3561
sock_sendpage+0x5a/0x70 net/socket.c:1054
pipe_to_sendpage+0x128/0x160 fs/splice.c:361
splice_from_pipe_feed fs/splice.c:415 [inline]
__splice_from_pipe+0x222/0x4d0 fs/splice.c:559
splice_from_pipe fs/splice.c:594 [inline]
generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x80/0xa0 fs/splice.c:931
splice_direct_to_actor+0x305/0x620 fs/splice.c:886
do_splice_direct+0xfb/0x180 fs/splice.c:974
do_sendfile+0x3bf/0x910 fs/read_write.c:1255
__do_sys_sendfile64 fs/read_write.c:1323 [inline]
__se_sys_sendfile64 fs/read_write.c:1309 [inline]
__x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1309
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x0000000000000000 -> 0xffffea00058fc188
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17325 Comm: syz-executor.0 Not tainted 6.1.0-rc1-syzkaller-00158-g440b7895c990-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022
Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221027040346.1104204-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-10-27
Anssi Hannula fixes the use of the completions in the kvaser_usb
driver.
Biju Das contributes 2 patches for the rcar_canfd driver. A IRQ storm
that can be triggered by high CAN bus load and channel specific IRQ
handlers are fixed.
Yang Yingliang fixes the j1939 transport protocol by moving a
kfree_skb() out of a spin_lock_irqsave protected section.
* tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb()
can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L
can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive
can: kvaser_usb: Fix possible completions during init_completion
====================
Link: https://lore.kernel.org/r/20221027114356.1939821-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As explained by Julian, fib_nh_scope is related to fib_nh_gw4, but
fib_info_update_nhc_saddr() needs the scope of the route, which is
the scope "before" fib_nh_scope, ie fib_nh_scope - 1.
This patch fixes the problem described in commit 747c14307214 ("ip: fix
dflt addr selection for connected nexthop").
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Link: https://lore.kernel.org/netdev/6c8a44ba-c2d5-cdf-c5c7-5baf97cba38@ssi.bg/
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 747c14307214b55dbd8250e1ab44cad8305756f1.
As explained by Julian, nhc_scope is related to nhc_gw, not to the route.
Revert the original patch. The initial problem is fixed differently in the
next commit.
Link: https://lore.kernel.org/netdev/6c8a44ba-c2d5-cdf-c5c7-5baf97cba38@ssi.bg/
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit eb55dc09b5dd040232d5de32812cc83001a23da6.
The patch that introduces this bug is reverted right after this one.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During review of previous change another thing came up - we should
limit the use of validation workarounds to old commands.
Don't list the workarounds one by one, as we're rejecting all existing
ones. We can deal with the masking in the unlikely event that new flag
is added.
Link: https://lore.kernel.org/all/6ba9f727e555fd376623a298d5d305ad408c3d47.camel@sipsolutions.net/
Link: https://lore.kernel.org/r/20221026001524.1892202-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The function pfkey_send_acquire may race with pfkey_register
(which could even be in a different name space). This may result
in a buffer overrun.
Allocating the maximum amount of memory that could be used prevents
this.
Reported-by: syzbot+1e9af9185d8850e2c2fa@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Instead of discovering the kmalloc bucket size _after_ allocation, round
up proactively so the allocation is explicitly made for the full size,
allowing the compiler to correctly reason about the resulting size of
the buffer through the existing __alloc_size() hint.
This will allow for kernels built with CONFIG_UBSAN_BOUNDS or the
coming dynamic bounds checking under CONFIG_FORTIFY_SOURCE to gain
back the __alloc_size() hints that were temporarily reverted in commit
93dd04ab0b2b ("slab: remove __alloc_size attribute from __kmalloc_track_caller")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20221021234713.you.031-kees@kernel.org/
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221025223811.up.360-kees@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
spin_unlock_irqrestore() before kfree_skb()
It is not allowed to call kfree_skb() from hardware interrupt context
or with interrupts being disabled. The skb is unlinked from the queue,
so it can be freed after spin_unlock_irqrestore().
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.com
Cc: stable@vger.kernel.org
[mkl: adjust subject]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
As noted by Paolo Abeni, pr_warn doesn't generate any splat and can still
preserve the warning to the user that feature downgrade occurred. We
likely cannot introduce other kinds of checks / enforcement here because
syzbot can generate different genl versions to the datapath.
Reported-by: syzbot+31cde0bef4bbf8ba2d86@syzkaller.appspotmail.com
Fixes: 44da5ae5fbea ("openvswitch: Drop user features if old user space attempted to create datapath")
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use KUNIT_EXPECT_MEMEQ to compare memory blocks in replacement of the
KUNIT_EXPECT_EQ macro. Therefor, the statement
KUNIT_EXPECT_EQ(test, memcmp(foo, bar, size), 0);
is replaced by:
KUNIT_EXPECT_MEMEQ(test, foo, bar, size);
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Acked-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Variable total_copied is just being incremented and it's never used
anywhere else. The variable and the increment are redundant so
remove it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20221024135046.2159523-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
My recent patch missed that mptcp_subflow_create_socket()
was creating a 'kernel' socket, then converted it to 'user' socket.
Fixes: 0cafd77dcd03 ("net: add a refcount tracker for kernel sockets")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20221025180546.652251-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|