Age | Commit message (Collapse) | Author |
|
All users of this call are in socket or tty code, so handling
it there means we can avoid the table entry in fs/compat_ioctl.c.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Unlike the normal SIOCOUTQ, SIOCOUTQNSD was never handled in compat
mode. Add it to the common socket compat handler along with similar
ones.
Fixes: 2f4e1b397097 ("tcp: ioctl type SIOCOUTQNSD returns amount of data not sent")
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The af_unix protocol family has a custom ioctl command (inexplicibly
based on SIOCPROTOPRIVATE), but never had a compat_ioctl handler for
32-bit applications.
Since all commands are compatible here, add a trivial wrapper that
performs the compat_ptr() conversion for SIOCOUTQ/SIOCINQ. SIOCUNIXFILE
does not use the argument, but it doesn't hurt to also use compat_ptr()
here.
Fixes: ba94f3088b79 ("unix: add ioctl to open a unix socket file with O_PATH")
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
All these ioctl commands are compatible, so we can handle
them with a trivial wrapper in hci_sock.c and remove
the listing in fs/compat_ioctl.c.
A few of the commands pass integer arguments instead of
pointers, so for correctness skip the compat_ptr() conversion
here.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
All these ioctl commands are compatible, so we can handle
them with a trivial wrapper in rfcomm/sock.c and remove
the listing in fs/compat_ioctl.c.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.
One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.
I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Unbind callbacks on chain deletion.
Fixes: 8fc618c52d16 ("netfilter: nf_tables_offload: refactor the nft_flow_offload_chain function")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Other garbage collector might remove an entry not fully set up yet.
[570953.958293] RIP: 0010:memcmp+0x9/0x50
[...]
[570953.958567] flow_offload_hash_cmp+0x1e/0x30 [nf_flow_table]
[570953.958585] flow_offload_lookup+0x8c/0x110 [nf_flow_table]
[570953.958606] nf_flow_offload_ip_hook+0x135/0xb30 [nf_flow_table]
[570953.958624] nf_flow_offload_inet_hook+0x35/0x37 [nf_flow_table_inet]
[570953.958646] nf_hook_slow+0x3c/0xb0
[570953.958664] __netif_receive_skb_core+0x90f/0xb10
[570953.958678] ? ip_rcv_finish+0x82/0xa0
[570953.958692] __netif_receive_skb_one_core+0x3b/0x80
[570953.958711] __netif_receive_skb+0x18/0x60
[570953.958727] netif_receive_skb_internal+0x45/0xf0
[570953.958741] napi_gro_receive+0xcd/0xf0
[570953.958764] ixgbe_clean_rx_irq+0x432/0xe00 [ixgbe]
[570953.958782] ixgbe_poll+0x27b/0x700 [ixgbe]
[570953.958796] net_rx_action+0x284/0x3c0
[570953.958817] __do_softirq+0xcc/0x27c
[570953.959464] irq_exit+0xe8/0x100
[570953.960097] do_IRQ+0x59/0xe0
[570953.960734] common_interrupt+0xf/0xf
Fixes: 43c8f131184f ("netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch allows you to register one netdev basechain to multiple
devices. This adds a new NFTA_HOOK_DEVS netlink attribute to specify
the list of netdevices. Basechains store a list of hooks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
After unbinding the list of flow_block callbacks, iterate over it to
remove the existing rules in the netdevice that has just been
unregistered.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add helper function to set up the flow_cls_offload object.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This allows to reuse nft_setup_cb_call() from the callback unbind path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add nft_flow_block_chain() helper function to reuse this function from
netdev event handler.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Rise the maximum limit of devices per flowtable up to 256. Rename
NFT_FLOWTABLE_DEVICE_MAX to NFT_NETDEVICE_MAX in preparation to reuse
the netdev hook parser for ingress basechain.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Allow netdevice only once per flowtable, otherwise hit EEXIST.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Use a list of hooks per device instead an array.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Hardware offload needs access to the priority field, store this field in
the nf_flowtable object.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Since commit 342db221829f ("sched: Call skb_get_hash_perturb
in sch_fq_codel") we no longer need anything from this file.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Include <net/addrconf.h> for the missing declarations of
various functions. Fixes the following sparse warnings:
net/ipv6/addrconf_core.c:94:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:100:5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:106:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:112:5: warning: symbol 'register_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:118:5: warning: symbol 'unregister_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:125:5: warning: symbol 'inet6addr_validator_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:237:6: warning: symbol 'in6_dev_finish_destroy' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
syzbot found the following crash on:
HEAD commit: 1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a600000
kernel config: https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
dashboard link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=136aa8c4600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=109ba792600000
=====================================================================
BUG: memory leak
unreferenced object 0xffff8881207e4100 (size 128):
comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
hex dump (first 32 bytes):
00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff .p........."....
00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 ..#.............
backtrace:
[<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
[<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
[<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
[<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
[<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
[<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
[<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
[<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
BUG: memory leak
unreferenced object 0xffff88811723b600 (size 64):
comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
hex dump (first 32 bytes):
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1 .............5..
backtrace:
[<00000000352f46d8>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<00000000352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<00000000352f46d8>] slab_alloc mm/slab.c:3319 [inline]
[<00000000352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
[<00000000352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
[<000000008e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
[<000000008e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0 net/openvswitch/vport.c:343
[<00000000541e4f4a>] ovs_vport_alloc+0x7f/0xf0 net/openvswitch/vport.c:139
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
BUG: memory leak
unreferenced object 0xffff8881228ca500 (size 128):
comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
hex dump (first 32 bytes):
00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff ..'........"....
40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 @.#.............
backtrace:
[<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
[<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
[<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
[<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
[<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
[<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
[<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
[<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
=====================================================================
The function in net core, register_netdevice(), may fail with vport's
destruction callback either invoked or not. After commit 309b66970ee2
("net: openvswitch: do not free vport if register_netdevice() is failed."),
the duty to destroy vport is offloaded from the driver OTOH, which ends
up in the memory leak reported.
It is fixed by releasing vport unless device is registered successfully.
To do that, the callback assignment is defered until device is registered.
Reported-by: syzbot+13210896153522fe1ee5@syzkaller.appspotmail.com
Fixes: 309b66970ee2 ("net: openvswitch: do not free vport if register_netdevice() is failed.")
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
[sbrivio: this was sent to dev@openvswitch.org and never made its way
to netdev -- resending original patch]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
We get one warnings when build kernel W=1:
net/sched/sch_taprio.c:1155:6: warning: no previous prototype for ‘taprio_offload_config_changed’ [-Wmissing-prototypes]
Make the function static to fix this.
Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Now that ports are dynamically listed in the fabric, there is no need
to provide a special helper to allocate the dsa_switch structure. This
will give more flexibility to drivers to embed this structure as they
wish in their private structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Allocate the struct dsa_port the first time it is accessed with
dsa_port_touch, and remove the static dsa_port array from the
dsa_switch structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when setting up the default CPU port. Unassign it on teardown.
Now that we can iterate over multiple CPU ports, remove dst->cpu_dp.
At the same time, provide a better error message for CPU-less tree.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when looking up the first CPU port in the tree.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Now that we have a potential list of CPU ports, make use of it instead
of only configuring the master device of an unique CPU port.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports to find a port from a given node.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of accessing the dsa_switch array
of ports when iterating over DSA ports of a switch to set up the
routing table.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when setting up the switches and their ports.
At the same time, provide setup states and messages for ports and
switches as it is done for the trees.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when looking for a slave device from a given master interface.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Add a list of switch ports within the switch fabric. This will help the
lookup of a port inside the whole fabric, and it is the first step
towards supporting multiple CPU ports, before deprecating the usage of
the unique dst->cpu_dp pointer.
In preparation for a future allocation of the dsa_port structures,
return -ENOMEM in case no structure is returned, even though this
error cannot be reached yet.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Do not let the drivers access the ds->ports static array directly
while there is a dsa_to_port helper for this purpose.
At the same time, un-const this helper since the SJA1105 driver
assigns the priv member of the returned dsa_port structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
With the introduction of the link group termination worker there is
no longer a need to postpone smc_close_active_abort() to a worker.
To protect socket destruction due to normal and abnormal socket
closing, the socket refcount is increased.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use a worker for link group termination to guarantee process context.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
If a link group and its connections must be terminated,
* wake up socket waiters
* do not enable buffer reuse
A linkgroup might be terminated while normal connection closing
is running. Avoid buffer reuse and its related LLC DELETE RKEY
call, if linkgroup termination has started. And use the earliest
indication of linkgroup termination possible, namely the removal
from the linkgroup list.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
There are lots of link group termination scenarios. Most of them
still allow to inform the peer of the terminating sockets about aborting.
This patch tries to call smc_close_abort() for terminating sockets.
And the internal TCP socket is reset with tcp_abort().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Usually link groups are freed delayed to enable quick connection
creation for a follow-on SMC socket. Terminated link groups are
freed faster. This patch makes sure, fast schedule of link group
freeing is not rescheduled by a delayed schedule. And it makes sure
link group freeing is not rescheduled, if the real freeing is already
running.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Locking hierarchy requires that the link group conns_lock can be
taken if the socket lock is held, but not vice versa. Nevertheless
socket termination during abnormal link group termination should
be protected by the socket lock.
This patch reduces the time segments the link group conns_lock is
held to enable usage of lock_sock in smc_lgr_terminate().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
When a link group is to be terminated, it is sufficient to hold
the lgr lock when unlinking the link group from its list.
Move the lock-protected link group unlinking into smc_lgr_terminate().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The resources for a terminated socket are being cleaned up.
This patch makes sure
* no more data is received for an actively terminated socket
* no more data is sent for an actively or passively terminated socket
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use tcf_tm_dump(), instead of an open coded variant (no functional change
in this patch).
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes the iph field from the state structure, which is not
properly initialized. Instead, add a new field to make the "do we want
to set DF" be the state bit and move the code to set the DF flag from
ip_frag_next().
Joint work with Pablo and Linus.
Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators")
Reported-by: Patrick Schönthaler <patrick@notvads.ovh>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Several cases of overlapping changes which were for the most
part trivially resolvable.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
"I was battling a cold after some recent trips, so quite a bit piled up
meanwhile, sorry about that.
Highlights:
1) Fix fd leak in various bpf selftests, from Brian Vazquez.
2) Fix crash in xsk when device doesn't support some methods, from
Magnus Karlsson.
3) Fix various leaks and use-after-free in rxrpc, from David Howells.
4) Fix several SKB leaks due to confusion of who owns an SKB and who
should release it in the llc code. From Eric Biggers.
5) Kill a bunc of KCSAN warnings in TCP, from Eric Dumazet.
6) Jumbo packets don't work after resume on r8169, as the BIOS resets
the chip into non-jumbo mode during suspend. From Heiner Kallweit.
7) Corrupt L2 header during MPLS push, from Davide Caratti.
8) Prevent possible infinite loop in tc_ctl_action, from Eric
Dumazet.
9) Get register bits right in bcmgenet driver, based upon chip
version. From Florian Fainelli.
10) Fix mutex problems in microchip DSA driver, from Marek Vasut.
11) Cure race between route lookup and invalidation in ipv4, from Wei
Wang.
12) Fix performance regression due to false sharing in 'net'
structure, from Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits)
net: reorder 'struct net' fields to avoid false sharing
net: dsa: fix switch tree list
net: ethernet: dwmac-sun8i: show message only when switching to promisc
net: aquantia: add an error handling in aq_nic_set_multicast_list
net: netem: correct the parent's backlog when corrupted packet was dropped
net: netem: fix error path for corrupted GSO frames
macb: propagate errors when getting optional clocks
xen/netback: fix error path of xenvif_connect_data()
net: hns3: fix mis-counting IRQ vector numbers issue
net: usb: lan78xx: Connect PHY before registering MAC
vsock/virtio: discard packets if credit is not respected
vsock/virtio: send a credit update when buffer size is changed
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
net: ensure correct skb->tstamp in various fragmenters
net: bcmgenet: reset 40nm EPHY on energy detect
net: bcmgenet: soft reset 40nm EPHYs before MAC init
net: phy: bcm7xxx: define soft_reset for 40nm EPHY
net: bcmgenet: don't set phydev->link from MAC
net: Update address for MediaTek ethernet driver in MAINTAINERS
ipv4: fix race condition between route lookup and invalidation
...
|
|
If there are multiple switch trees on the device, only the last one
will be listed, because the arguments of list_add_tail are swapped.
Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If packet corruption failed we jump to finish_segs and return
NET_XMIT_SUCCESS. Seeing success will make the parent qdisc
increment its backlog, that's incorrect - we need to return
NET_XMIT_DROP.
Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To corrupt a GSO frame we first perform segmentation. We then
proceed using the first segment instead of the full GSO skb and
requeue the rest of the segments as separate packets.
If there are any issues with processing the first segment we
still want to process the rest, therefore we jump to the
finish_segs label.
Commit 177b8007463c ("net: netem: fix backlog accounting for
corrupted GSO frames") started using the pointer to the first
segment in the "rest of segments processing", but as mentioned
above the first segment may had already been freed at this point.
Backlog corrections for parent qdiscs have to be adjusted.
Fixes: 177b8007463c ("net: netem: fix backlog accounting for corrupted GSO frames")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the remote peer doesn't respect the credit information
(buf_alloc, fwd_cnt), sending more data than it can send,
we should drop the packets to prevent a malicious peer
from using all of our memory.
This is patch follows the VIRTIO spec: "VIRTIO_VSOCK_OP_RW data
packets MUST only be transmitted when the peer has sufficient
free buffer space for the payload"
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the user application set a new buffer size value, we should
update the remote peer about this change, since it uses this
information to calculate the credit available.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Thomas found that some forwarded packets would be stuck
in FQ packet scheduler because their skb->tstamp contained
timestamps far in the future.
We thought we addressed this point in commit 8203e2d844d3
("net: clear skb->tstamp in forwarding paths") but there
is still an issue when/if a packet needs to be fragmented.
In order to meet EDT requirements, we have to make sure all
fragments get the original skb->tstamp.
Note that this original skb->tstamp should be zero in
forwarding path, but might have a non zero value in
output path if user decided so.
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Thomas Bartschies <Thomas.Bartschies@cvk.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|