Age | Commit message (Collapse) | Author |
|
Users reported a v5.3 performance regression and inability to establish
huge page mappings. A revised version of the ndctl "dax.sh" huge page
unit test identifies commit 23c84eb78375 "dax: Fix missed wakeup with
PMD faults" as the source.
Update get_unlocked_entry() to check for NULL entries before checking
the entry order, otherwise NULL is misinterpreted as a present pte
conflict. The 'order' check needs to happen before the locked check as
an unlocked entry at the wrong order must fallback to lookup the correct
order.
Reported-by: Jeff Smits <jeff.smits@intel.com>
Reported-by: Doug Nelson <doug.nelson@intel.com>
Cc: <stable@vger.kernel.org>
Fixes: 23c84eb78375 ("dax: Fix missed wakeup with PMD faults")
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Link: https://lore.kernel.org/r/157167532455.3945484.11971474077040503994.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The list_kref reaches a count of 0 when all the static OPPs are removed,
for example when dev_pm_opp_of_cpumask_remove_table() is called, though
the actual OPP table may not get freed as it may still be referenced by
other parts of the kernel, like from a call to
dev_pm_opp_set_supported_hw(). And if we call
dev_pm_opp_of_cpumask_add_table() again at this point, we must
reinitialize the list_kref otherwise the kernel will hit a WARN() in
kref infrastructure for incrementing a kref with value 0.
Fixes: 11e1a1648298 ("opp: Don't decrement uninitialized list_kref")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
There is one more problematic case I noticed while recently fixing BPF kallsyms
handling in cd7455f1013e ("bpf: Fix use after free in subprog's jited symbol
removal") and that is bpf_get_prog_name().
If BTF has been attached to the prog, then we may be able to fetch the function
signature type id in kallsyms through prog->aux->func_info[prog->aux->func_idx].type_id.
However, while the BTF object itself is torn down via RCU callback, the prog's
aux->func_info is immediately freed via kvfree(prog->aux->func_info) once the
prog's refcount either hit zero or when subprograms were already exposed via
kallsyms and we hit the error path added in 5482e9a93c83 ("bpf: Fix memleak in
aux->func_info and aux->btf").
This violates RCU as well since kallsyms could be walked in parallel where we
could access aux->func_info. Hence, defer kvfree() to after RCU grace period.
Looking at ba64e7d85252 ("bpf: btf: support proper non-jit func info") there
is no reason/dependency where we couldn't defer the kvfree(aux->func_info) into
the RCU callback.
Fixes: 5482e9a93c83 ("bpf: Fix memleak in aux->func_info and aux->btf")
Fixes: ba64e7d85252 ("bpf: btf: support proper non-jit func info")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/875f2906a7c1a0691f2d567b4d8e4ea2739b1e88.1571779205.git.daniel@iogearbox.net
|
|
Add HD Audio Device PCI ID for the Intel Tigerlake and Jasperlake
platform.
Signed-off-by: Pan Xiuli <xiuli.pan@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20191022194402.23178-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Since commit 342db221829f ("sched: Call skb_get_hash_perturb
in sch_fq_codel") we no longer need anything from this file.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Include <net/addrconf.h> for the missing declarations of
various functions. Fixes the following sparse warnings:
net/ipv6/addrconf_core.c:94:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:100:5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:106:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:112:5: warning: symbol 'register_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:118:5: warning: symbol 'unregister_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:125:5: warning: symbol 'inet6addr_validator_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:237:6: warning: symbol 'in6_dev_finish_destroy' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
syzbot found the following crash on:
HEAD commit: 1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a600000
kernel config: https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
dashboard link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=136aa8c4600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=109ba792600000
=====================================================================
BUG: memory leak
unreferenced object 0xffff8881207e4100 (size 128):
comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
hex dump (first 32 bytes):
00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff .p........."....
00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 ..#.............
backtrace:
[<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
[<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
[<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
[<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
[<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
[<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
[<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
[<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
BUG: memory leak
unreferenced object 0xffff88811723b600 (size 64):
comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
hex dump (first 32 bytes):
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1 .............5..
backtrace:
[<00000000352f46d8>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<00000000352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<00000000352f46d8>] slab_alloc mm/slab.c:3319 [inline]
[<00000000352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
[<00000000352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
[<000000008e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
[<000000008e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0 net/openvswitch/vport.c:343
[<00000000541e4f4a>] ovs_vport_alloc+0x7f/0xf0 net/openvswitch/vport.c:139
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
BUG: memory leak
unreferenced object 0xffff8881228ca500 (size 128):
comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
hex dump (first 32 bytes):
00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff ..'........"....
40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 @.#.............
backtrace:
[<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
[<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
[<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
[<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
[<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
[<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
[<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
[<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
[<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
[<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
[<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
[<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
[<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
[<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
[<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
[<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
[<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
[<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
[<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
[<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
[<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
[<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
=====================================================================
The function in net core, register_netdevice(), may fail with vport's
destruction callback either invoked or not. After commit 309b66970ee2
("net: openvswitch: do not free vport if register_netdevice() is failed."),
the duty to destroy vport is offloaded from the driver OTOH, which ends
up in the memory leak reported.
It is fixed by releasing vport unless device is registered successfully.
To do that, the callback assignment is defered until device is registered.
Reported-by: syzbot+13210896153522fe1ee5@syzkaller.appspotmail.com
Fixes: 309b66970ee2 ("net: openvswitch: do not free vport if register_netdevice() is failed.")
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
[sbrivio: this was sent to dev@openvswitch.org and never made its way
to netdev -- resending original patch]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Kernel test robot reported that the l2tp.sh test script failed:
# selftests: net: l2tp.sh
# Warning: file l2tp.sh is not executable, correct this.
Set executable bits.
Fixes: e858ef1cd4bc ("selftests: Add l2tp tests")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
We get one warnings when build kernel W=1:
net/sched/sch_taprio.c:1155:6: warning: no previous prototype for ‘taprio_offload_config_changed’ [-Wmissing-prototypes]
Make the function static to fix this.
Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Michael Chan says:
====================
Devlink and error recovery bug fix patches.
Most of the work is by Vasundhara Volam.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
disabled device.
With the recently added error recovery logic, the device may already
be disabled if the firmware recovery is unsuccessful. In
bnxt_remove_one(), check that the device is still enabled first
before calling pci_disable_device().
Fixes: 3bc7d4a352ef ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Minor formatting changes to diagnose cb for FW devlink health
reporter.
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
When firmware indicates that driver needs to invoke firmware reset
which is common for both error recovery and live firmware reset path,
driver needs a different time to wait before polling for firmware
readiness.
Modify the wait time to fw_reset_min_dsecs, which is initialised to
correct timeout for error recovery and firmware reset.
Fixes: 4037eb715680 ("bnxt_en: Add a new BNXT_FW_RESET_STATE_POLL_FW_DOWN state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The current code does not do endian swapping between the devlink
parameter and the internal NVRAM representation. Define a union to
represent the little endian NVRAM data and add 2 helper functions to
copy to and from the NVRAM data with the proper byte swapping.
Fixes: 782a624d00fa ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The current code that rounds up the NVRAM parameter bit size to the next
byte size for the devlink parameter is not always correct. The MSIX
devlink parameters are 4 bytes and we don't get the correct size
using this method.
Fix it by adding a new dl_num_bytes member to the bnxt_dl_nvm_param
structure which statically provides bytesize information according
to the devlink parameter type definition.
Fixes: 782a624d00fa ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
When the address width of DMA is greater than 32, the packet header occupies
a BD descriptor. The starting address of the data should be added to the
header length.
Fixes: a993db88d17d ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: yuqi jin <jinyuqi@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The ionic driver started using dymamic_hex_dump(), but
that is not always defined:
drivers/net/ethernet/pensando/ionic/ionic_main.c:229:2: error: implicit declaration of function 'dynamic_hex_dump' [-Werror,-Wimplicit-function-declaration]
Add a dummy implementation to use when CONFIG_DYNAMIC_DEBUG
is disabled, printing nothing.
Fixes: 938962d55229 ("ionic: Add adminq action")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The table entry in __XDP_ACT_SYM_TAB for the last item is set
to { -1, 0 } where it should be { -1, NULL } as the second item
is a pointer to a string.
Fixes the following sparse warnings:
./include/trace/events/xdp.h:28:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:53:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:82:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:140:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:155:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:190:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:225:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:260:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:318:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:356:1: warning: Using plain integer as NULL pointer
./include/trace/events/xdp.h:390:1: warning: Using plain integer as NULL pointer
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191022125925.10508-1-ben.dooks@codethink.co.uk
|
|
Vivien Didelot says:
====================
The dsa_switch structure represents the physical switch device itself,
and is allocated by the driver. The dsa_switch_tree and dsa_port structures
represent the logical switch fabric (eventually composed of multiple switch
devices) and its ports, and are allocated by the DSA core.
This branch lists the logical ports directly in the fabric which simplifies
the iteration over all ports when assigning the default CPU port or configuring
the D in DSA in drivers like mv88e6xxx.
This also removes the unique dst->cpu_dp pointer and is a first step towards
supporting multiple CPU ports and dropping the DSA_MAX_PORTS limitation.
Because the dsa_port structures are not tied to the dsa_switch structure
anymore, we do not need to provide an helper for the drivers to allocate a
switch structure. Like in many other subsystems, drivers can now embed their
dsa_switch structure as they wish into their private structure. This will
be particularly interesting for the Broadcom drivers which were currently
limited by the dynamically allocated array of DSA ports.
The series implements the list of dsa_port structures, makes use of it,
then drops dst->cpu_dp and the dsa_switch_alloc helper.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Now that ports are dynamically listed in the fabric, there is no need
to provide a special helper to allocate the dsa_switch structure. This
will give more flexibility to drivers to embed this structure as they
wish in their private structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Allocate the struct dsa_port the first time it is accessed with
dsa_port_touch, and remove the static dsa_port array from the
dsa_switch structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Like the dsa_switch_tree structures, the dsa_port structures will be
allocated on switch registration.
The SJA1105 driver is the only one accessing the dsa_port structure
after the switch allocation and before the switch registration.
For that reason, move switch registration prior to assigning the priv
member of the dsa_port structures.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Instead of digging into the other dsa_switch structures of the fabric
and relying too much on the dsa_to_port helper, use the new list
of switch fabric ports to remap the Port VLAN Map of local bridge
group members or remap the Port VLAN Table entry of external bridge
group members.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Instead of digging into the other dsa_switch structures of the fabric
and relying too much on the dsa_to_port helper, use the new list of
switch fabric ports to define the mask of the local ports allowed to
receive frames from another port of the fabric.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Since mv88e6xxx_pvt_map is a static helper, no need to return
-EOPNOTSUPP if the chip has no PVT, simply silently skip the operation.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when setting up the default CPU port. Unassign it on teardown.
Now that we can iterate over multiple CPU ports, remove dst->cpu_dp.
At the same time, provide a better error message for CPU-less tree.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when looking up the first CPU port in the tree.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Now that we have a potential list of CPU ports, make use of it instead
of only configuring the master device of an unique CPU port.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports to find a port from a given node.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of accessing the dsa_switch array
of ports when iterating over DSA ports of a switch to set up the
routing table.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when setting up the switches and their ports.
At the same time, provide setup states and messages for ports and
switches as it is done for the trees.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of iterating over switches and their
ports when looking for a slave device from a given master interface.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use the new ports list instead of accessing the dsa_switch array
of ports in the dsa_to_port helper.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Add a list of switch ports within the switch fabric. This will help the
lookup of a port inside the whole fabric, and it is the first step
towards supporting multiple CPU ports, before deprecating the usage of
the unique dst->cpu_dp pointer.
In preparation for a future allocation of the dsa_port structures,
return -ENOMEM in case no structure is returned, even though this
error cannot be reached yet.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Do not let the drivers access the ds->ports static array directly
while there is a dsa_to_port helper for this purpose.
At the same time, un-const this helper since the SJA1105 driver
assigns the priv member of the returned dsa_port structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
LIBBPF_OPTS is implemented as a mix of field declaration and memset
+ assignment. This makes it neither variable declaration nor purely
statements, which is a problem, because you can't mix it with either
other variable declarations nor other function statements, because C90
compiler mode emits warning on mixing all that together.
This patch changes LIBBPF_OPTS into a strictly declaration of variable
and solves this problem, as can be seen in case of bpftool, which
previously would emit compiler warning, if done this way (LIBBPF_OPTS as
part of function variables declaration block).
This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow
kernel convention for similar macros more closely.
v1->v2:
- rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki).
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
|
|
LLVM alu32 was introduced in LLVM7:
https://reviews.llvm.org/rL325987
https://reviews.llvm.org/rL325989
Experiments showed that in general performance is better with alu32
enabled:
https://lwn.net/Articles/775316/
This patch turns on alu32 with no-flavor test_progs which is tested
most often. The flavor test at no_alu32/test_progs can be used to
test without alu32 enabled. The Makefile check for whether LLVM
supports '-mattr=+alu32 -mcpu=v3' is removed as LLVM7 should be
available for recent distributions and also latest LLVM is preferred
to run BPF selftests.
Note that jmp32 is checked by -mcpu=probe and will be enabled if the
host kernel supports it.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191022043119.2625263-1-yhs@fb.com
|
|
Heiner Kallweit says:
====================
The attempt to improve performance by changing the PCIe max read request
size was added in the vendor driver more than 10 years back and copied
to r8169 driver. In the vendor driver this has been removed long ago.
Obviously it had no effect, also in my tests I didn't see any
difference. Typically the max payload size is less than 512 bytes
anyway, and the PCI core takes care that the maximum supported value
is set. So let's remove fiddling with PCIe max read request size from
r8169 too. This change allows to simplify the driver in the subsequent
three patches of this series.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
We can remove rtl_hw_start_8168bef() and use rtl_hw_start_8168b()
instead because setting register Config4 is done in
rtl_jumbo_config(), being called from rtl_hw_start().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
We can remove rtl_hw_start_8168dp() because it's the same as
rtl_hw_start_8168dp() now.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
r8168b_0_hw_jumbo_enable() and r8168b_0_hw_jumbo_disable() both do the
same and just set PCI_EXP_DEVCTL_NOSNOOP_EN. We can simplify the code
by moving this setting for RTL8168B to rtl_hw_start_8168().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The attempt to improve performance by changing the PCIe max read request
size was added in the vendor driver more than 10 years back and copied
to r8169 driver. In the vendor driver this has been removed long ago.
Obviously it had no effect, also in my tests I didn't see any
difference. Typically the max payload size is less than 512 bytes
anyway, and the PCI core takes care that the maximum supported value
is set. So let's remove fiddling with PCIe max read request size from
r8169 too.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
syzkaller managed to trigger the following crash:
[...]
BUG: unable to handle page fault for address: ffffc90001923030
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline]
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline]
RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709
[...]
Call Trace:
kernel_text_address kernel/extable.c:147 [inline]
__kernel_text_address+0x9a/0x110 kernel/extable.c:102
unwind_get_return_address+0x4c/0x90 arch/x86/kernel/unwind_frame.c:19
arch_stack_walk+0x98/0xe0 arch/x86/kernel/stacktrace.c:26
stack_trace_save+0xb6/0x150 kernel/stacktrace.c:123
save_stack mm/kasan/common.c:69 [inline]
set_track mm/kasan/common.c:77 [inline]
__kasan_kmalloc+0x11c/0x1b0 mm/kasan/common.c:510
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:518
slab_post_alloc_hook mm/slab.h:584 [inline]
slab_alloc mm/slab.c:3319 [inline]
kmem_cache_alloc+0x1f5/0x2e0 mm/slab.c:3483
getname_flags+0xba/0x640 fs/namei.c:138
getname+0x19/0x20 fs/namei.c:209
do_sys_open+0x261/0x560 fs/open.c:1091
__do_sys_open fs/open.c:1115 [inline]
__se_sys_open fs/open.c:1110 [inline]
__x64_sys_open+0x87/0x90 fs/open.c:1110
do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
[...]
After further debugging it turns out that we walk kallsyms while in parallel
we tear down a BPF program which contains subprograms that have been JITed
though the program itself has not been fully exposed and is eventually bailing
out with error.
The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes
the symbols, however, bpf_prog_free() tears down the JIT memory too early via
scheduled work. Instead, it needs to properly respect RCU grace period as the
kallsyms walk for BPF is under RCU.
Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error
path where we defer final destruction when we have subprogs in the program.
Fixes: 7d1982b4e335 ("bpf: fix panic in prog load calls cleanup")
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/bpf/55f6367324c2d7e9583fa9ccf5385dcbba0d7a6e.1571752452.git.daniel@iogearbox.net
|
|
Karsten Graul says:
====================
More patches to address abnormal termination processing of
sockets and link groups.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
With the introduction of the link group termination worker there is
no longer a need to postpone smc_close_active_abort() to a worker.
To protect socket destruction due to normal and abnormal socket
closing, the socket refcount is increased.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Use a worker for link group termination to guarantee process context.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
If a link group and its connections must be terminated,
* wake up socket waiters
* do not enable buffer reuse
A linkgroup might be terminated while normal connection closing
is running. Avoid buffer reuse and its related LLC DELETE RKEY
call, if linkgroup termination has started. And use the earliest
indication of linkgroup termination possible, namely the removal
from the linkgroup list.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
There are lots of link group termination scenarios. Most of them
still allow to inform the peer of the terminating sockets about aborting.
This patch tries to call smc_close_abort() for terminating sockets.
And the internal TCP socket is reset with tcp_abort().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Usually link groups are freed delayed to enable quick connection
creation for a follow-on SMC socket. Terminated link groups are
freed faster. This patch makes sure, fast schedule of link group
freeing is not rescheduled by a delayed schedule. And it makes sure
link group freeing is not rescheduled, if the real freeing is already
running.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Locking hierarchy requires that the link group conns_lock can be
taken if the socket lock is held, but not vice versa. Nevertheless
socket termination during abnormal link group termination should
be protected by the socket lock.
This patch reduces the time segments the link group conns_lock is
held to enable usage of lock_sock in smc_lgr_terminate().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|