summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-03-09ionic: print pci bus lane infoShannon Nelson
Print the PCI bus lane information so people can keep an eye out for poor configurations that might not support the advertised speed. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ionic: support ethtool rxhash disableShannon Nelson
We can disable rxhashing by setting rss_types to 0. The user can toggle this with "ethtool -K <ethX> rxhash off|on", which calls into the .ndo_set_features callback with the NETIF_F_RXHASH feature bit set or cleared. This patch adds a check for that bit and updates the FW if necessary. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ionic: clean up bitflag usageShannon Nelson
Remove the unused flags field and and fix the bitflag names to include the _F_ flag hint. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ionic: improve irq numa localityShannon Nelson
Spreading the interrupts across the CPU cores is good for load balancing, but not necessarily as good when using a CPU/core that is not part of the NUMA local CPU. If it can be localized, the kernel's cpumask_local_spread() service will pick a core that is on the node close to the PCI device. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ionic: remove pragma packedShannon Nelson
Replace the misguided "#pragma packed" with tags on each struct/union definition that actually needs it. This is safer and more efficient on the various compilers and architectures. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ionic: keep ionic dev on lif init failShannon Nelson
If the basic ionic interface works but the lif creation fails, don't fail the probe. This will allow us to use the driver to help inspect the hw/fw/pci interface for debugging purposes. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09Merge branch 'mptcp-don-t-auto-adjust-rcvbuf-size-if-locked'David S. Miller
Florian Westphal says: ==================== mptcp: don't auto-adjust rcvbuf size if locked The mptcp receive buffer is auto-sized based on the subflow receive buffer. Don't do this if userspace specfied a value via SO_RCVBUF setsockopt. Also update selftest program to provide a new option to set a fixed size. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09mptcp: don't grow mptcp socket receive buffer when rcvbuf is lockedFlorian Westphal
The mptcp rcvbuf size is adjusted according to the subflow rcvbuf size. This should not be done if userspace did set a fixed value. Fixes: 600911ff5f72bae ("mptcp: add rmem queue accounting") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09mptcp: selftests: add rcvbuf set optionFlorian Westphal
allows to run the tests with fixed receive buffer by passing "-R <value>" to mptcp_connect.sh. While at it, add a default 10 second poll timeout so the "-t" becomes optional -- this makes mptcp_connect simpler to use during manual testing. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09net: dsa: mt7530: add support for port mirroringDENG Qingfang
Add support for configuring port mirroring through the cls_matchall classifier. We do a full ingress and/or egress capture towards a capture port. MT7530 supports one monitor port and multiple mirrored ports. Signed-off-by: DENG Qingfang <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09Merge tag 'batadv-next-for-davem-20200306' of ↵David S. Miller
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Avoid RCU list-traversal in spinlock, by Sven Eckelmann - Replace zero-length array with flexible-array member, by Gustavo A. R. Silva ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09Merge tag 'batadv-net-for-davem-20200306' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - Don't schedule OGM for disabled interface, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09Merge branch 'r8169-series-with-improvements-to-rtl_tx'David S. Miller
Heiner Kallweit says: ==================== r8169: series with improvements to rtl_tx This series includes few improvements to rtl_tx(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09r8169: remove now unneeded barrier in rtl_txHeiner Kallweit
Until ae84bc187337 ("r8169: don't use bit LastFrag in tx descriptor after send") we used to access another bit in the descriptor, therefore it seems the barrier was needed. Since this commit DescOwn is the only bit we're interested in, so the barrier isn't needed any longer. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09r8169: simplify usage of rtl8169_unmap_tx_skbHeiner Kallweit
Simplify the parameters taken by rtl8169_unmap_tx_skb, this makes usage of this function easier to read and understand. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09r8169: ensure tx_skb is fully reset after calling rtl8169_unmap_tx_skbHeiner Kallweit
So far tx_skb->skb is the only member of the two structs that is not reset. Make understanding the code easier by resetting both structs completely in rtl8169_unmap_tx_skb. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09r8169: convert while to for loop in rtl_txHeiner Kallweit
Slightly improve the code by converting this while to a for loop. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09net: mscc: ocelot: properly account for VLAN header length when setting MRUVladimir Oltean
What the driver writes into MAC_MAXLEN_CFG does not actually represent VLAN_ETH_FRAME_LEN but instead ETH_FRAME_LEN + ETH_FCS_LEN. Yes they are numerically equal, but the difference is important, as the switch treats VLAN-tagged traffic specially and knows to increase the maximum accepted frame size automatically. So it is always wrong to account for VLAN in the MAC_MAXLEN_CFG register. Unconditionally increase the maximum allowed frame size for double-tagged traffic. Accounting for the additional length does not mean that the other VLAN membership checks aren't performed, so there's no harm done. Also, stop abusing the MTU name for configuring the MRU. There is no support for configuring the MRU on an interface at the moment. Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Fixes: fa914e9c4d94 ("net: mscc: ocelot: create a helper for changing the port MTU") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()Eric Dumazet
Commit e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") added a cond_resched_rcu() in a loop using rcu protection to iterate over slaves. This is breaking rcu rules, so lets instead use cond_resched() at a point we can reschedule Fixes: e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09Merge branch 's390-qeth-next'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: updates 2020-03-06 please apply the following patch series for qeth to netdev's net-next tree. Just a small update to take care of a regression wrt to IRQ handling in net-next, reported by Qian Cai. The fix needs some qdio layer changes, so you will find Vasily's Acked-by in that patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09s390/qeth: remove VNICC callback parameter structJulian Wiedmann
After recent cleanups this is just a complicated wrapper around an u32*. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09s390/qdio: add tighter controls for IRQ pollingJulian Wiedmann
Once the call to qdio_establish() has completed, qdio is free to deliver data IRQs to the device driver's IRQ poll handler. For qeth (the only qdio driver that currently uses IRQ polling) this is problematic, since the IRQs can arrive before its NAPI instance is even registered. Calling napi_schedule() from qeth_qdio_start_poll() then crashes in various nasty ways. Until recently qeth checked for IFF_UP to drop such early interrupts, but that's fragile as well since it doesn't enforce any ordering. Fix this properly by bringing up the qdio device in IRQS_DISABLED mode, and have the driver explicitly opt-in to receive data IRQs. qeth does so from qeth_open(), which kick-starts a NAPI poll and then calls qdio_start_irq() from qeth_poll(). Also add a matching qdio_stop_irq() in qeth_stop() to switch the qdio dataplane back into a disabled state. Fixes: 3d35dbe6224e ("s390/qeth: don't check for IFF_UP when scheduling napi") CC: Qian Cai <cai@lca.pw> Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09cgroup, netclassid: periodically release file_lock on classid updatingDmitry Yakunin
In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts. This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion. Now update is non atomic and socket may be skipped using calls: dup2(oldfd, newfd); close(oldfd); But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer. New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration. So in common cases this patch has no side effects. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09net: sched: pie: change tc_pie_xstats->probLeslie Monis
Commit 105e808c1da2 ("pie: remove pie_vars->accu_prob_overflows") changes the scale of probability values in PIE from (2^64 - 1) to (2^56 - 1). This affects the precision of tc_pie_xstats->prob in user space. This patch ensures user space is unaffected. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09macvlan: add cond_resched() during multicast processingMahesh Bandewar
The Rx bound multicast packets are deferred to a workqueue and macvlan can also suffer from the same attack that was discovered by Syzbot for IPvlan. This solution is not as effective as in IPvlan. IPvlan defers all (Tx and Rx) multicast packet processing to a workqueue while macvlan does this way only for the Rx. This fix should address the Rx codition to certain extent. Tx is still suseptible. Tx multicast processing happens when .ndo_start_xmit is called, hence we cannot add cond_resched(). However, it's not that severe since the user which is generating / flooding will be affected the most. Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ipvlan: add cond_resched_rcu() while processing muticast backlogMahesh Bandewar
If there are substantial number of slaves created as simulated by Syzbot, the backlog processing could take much longer and result into the issue found in the Syzbot report. INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 1, t=10502 jiffies, g=5049, c=5048, q=752) All QSes seen, last rcu_sched kthread activity 10502 (4294965563-4294955061), jiffies_till_next_fqs=1, root ->qsmask 0x0 syz-executor.1 R running task on cpu 1 10984 11210 3866 0x30020008 179034491270 Call Trace: <IRQ> [<ffffffff81497163>] _sched_show_task kernel/sched/core.c:8063 [inline] [<ffffffff81497163>] _sched_show_task.cold+0x2fd/0x392 kernel/sched/core.c:8030 [<ffffffff8146a91b>] sched_show_task+0xb/0x10 kernel/sched/core.c:8073 [<ffffffff815c931b>] print_other_cpu_stall kernel/rcu/tree.c:1577 [inline] [<ffffffff815c931b>] check_cpu_stall kernel/rcu/tree.c:1695 [inline] [<ffffffff815c931b>] __rcu_pending kernel/rcu/tree.c:3478 [inline] [<ffffffff815c931b>] rcu_pending kernel/rcu/tree.c:3540 [inline] [<ffffffff815c931b>] rcu_check_callbacks.cold+0xbb4/0xc29 kernel/rcu/tree.c:2876 [<ffffffff815e3962>] update_process_times+0x32/0x80 kernel/time/timer.c:1635 [<ffffffff816164f0>] tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:161 [<ffffffff81616ae4>] tick_sched_timer+0x44/0x130 kernel/time/tick-sched.c:1193 [<ffffffff815e75f7>] __run_hrtimer kernel/time/hrtimer.c:1393 [inline] [<ffffffff815e75f7>] __hrtimer_run_queues+0x307/0xd90 kernel/time/hrtimer.c:1455 [<ffffffff815e90ea>] hrtimer_interrupt+0x2ea/0x730 kernel/time/hrtimer.c:1513 [<ffffffff844050f4>] local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1031 [inline] [<ffffffff844050f4>] smp_apic_timer_interrupt+0x144/0x5e0 arch/x86/kernel/apic/apic.c:1056 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 RIP: 0010:do_raw_read_lock+0x22/0x80 kernel/locking/spinlock_debug.c:153 RSP: 0018:ffff8801dad07ab8 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff12 RAX: 0000000000000000 RBX: ffff8801c4135680 RCX: 0000000000000000 RDX: 1ffff10038826afe RSI: ffff88019d816bb8 RDI: ffff8801c41357f0 RBP: ffff8801dad07ac0 R08: 0000000000004b15 R09: 0000000000310273 R10: ffff88019d816bb8 R11: 0000000000000001 R12: ffff8801c41357e8 R13: 0000000000000000 R14: ffff8801cfb19850 R15: ffff8801cfb198b0 [<ffffffff8101460e>] __raw_read_lock_bh include/linux/rwlock_api_smp.h:177 [inline] [<ffffffff8101460e>] _raw_read_lock_bh+0x3e/0x50 kernel/locking/spinlock.c:240 [<ffffffff840d78ca>] ipv6_chk_mcast_addr+0x11a/0x6f0 net/ipv6/mcast.c:1006 [<ffffffff84023439>] ip6_mc_input+0x319/0x8e0 net/ipv6/ip6_input.c:482 [<ffffffff840211c8>] dst_input include/net/dst.h:449 [inline] [<ffffffff840211c8>] ip6_rcv_finish+0x408/0x610 net/ipv6/ip6_input.c:78 [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:292 [inline] [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:286 [inline] [<ffffffff840214de>] ipv6_rcv+0x10e/0x420 net/ipv6/ip6_input.c:278 [<ffffffff83a29efa>] __netif_receive_skb_one_core+0x12a/0x1f0 net/core/dev.c:5303 [<ffffffff83a2a15c>] __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:5417 [<ffffffff83a2f536>] process_backlog+0x216/0x6c0 net/core/dev.c:6243 [<ffffffff83a30d1b>] napi_poll net/core/dev.c:6680 [inline] [<ffffffff83a30d1b>] net_rx_action+0x47b/0xfb0 net/core/dev.c:6748 [<ffffffff846002c8>] __do_softirq+0x2c8/0x99a kernel/softirq.c:317 [<ffffffff813e656a>] invoke_softirq kernel/softirq.c:399 [inline] [<ffffffff813e656a>] irq_exit+0x16a/0x1a0 kernel/softirq.c:439 [<ffffffff84405115>] exiting_irq arch/x86/include/asm/apic.h:561 [inline] [<ffffffff84405115>] smp_apic_timer_interrupt+0x165/0x5e0 arch/x86/kernel/apic/apic.c:1058 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 </IRQ> RIP: 0010:__sanitizer_cov_trace_pc+0x26/0x50 kernel/kcov.c:102 RSP: 0018:ffff880196033bd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12 RAX: ffff88019d8161c0 RBX: 00000000ffffffff RCX: ffffc90003501000 RDX: 0000000000000002 RSI: ffffffff816236d1 RDI: 0000000000000005 RBP: ffff880196033bd8 R08: ffff88019d8161c0 R09: 0000000000000000 R10: 1ffff10032c067f0 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [<ffffffff816236d1>] do_futex+0x151/0x1d50 kernel/futex.c:3548 [<ffffffff816260f0>] C_SYSC_futex kernel/futex_compat.c:201 [inline] [<ffffffff816260f0>] compat_SyS_futex+0x270/0x3b0 kernel/futex_compat.c:175 [<ffffffff8101da17>] do_syscall_32_irqs_on arch/x86/entry/common.c:353 [inline] [<ffffffff8101da17>] do_fast_syscall_32+0x357/0xe1c arch/x86/entry/common.c:415 [<ffffffff84401a9b>] entry_SYSENTER_compat+0x8b/0x9d arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f23c69 RSP: 002b:00000000f5d1f12c EFLAGS: 00000282 ORIG_RAX: 00000000000000f0 RAX: ffffffffffffffda RBX: 000000000816af88 RCX: 0000000000000080 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000816af8c RBP: 00000000f5d1f228 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 rcu_sched kthread starved for 10502 jiffies! g5049 c5048 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1 rcu_sched R running task on cpu 1 13048 8 2 0x90000000 179099587640 Call Trace: [<ffffffff8147321f>] context_switch+0x60f/0xa60 kernel/sched/core.c:3209 [<ffffffff8100095a>] __schedule+0x5aa/0x1da0 kernel/sched/core.c:3934 [<ffffffff810021df>] schedule+0x8f/0x1b0 kernel/sched/core.c:4011 [<ffffffff8101116d>] schedule_timeout+0x50d/0xee0 kernel/time/timer.c:1803 [<ffffffff815c13f1>] rcu_gp_kthread+0xda1/0x3b50 kernel/rcu/tree.c:2327 [<ffffffff8144b318>] kthread+0x348/0x420 kernel/kthread.c:246 [<ffffffff84400266>] ret_from_fork+0x56/0x70 arch/x86/entry/entry_64.S:393 Fixes: ba35f8588f47 (“ipvlan: Defer multicast / broadcast processing to a work-queue”) Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09ipvlan: don't deref eth hdr before checking it's setMahesh Bandewar
IPvlan in L3 mode discards outbound multicast packets but performs the check before ensuring the ether-header is set or not. This is an error that Eric found through code browsing. Fixes: 2ad7bf363841 (“ipvlan: Initial check-in of the IPVLAN driver.”) Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09tcp: add bytes not sent to SCM_TIMESTAMPING_OPT_STATSYousuk Seung
Add TCP_NLA_BYTES_NOTSENT to SCM_TIMESTAMPING_OPT_STATS that reports bytes in the write queue but not sent. This is the same metric as what is exported with tcp_info.tcpi_notsent_bytes. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09sfc: detach from cb_page in efx_copy_channel()Edward Cree
It's a resource, not a parameter, so we can't copy it into the new channel's TX queues, otherwise aliasing will lead to resource- management bugs if the channel is subsequently torn down without being initialised. Before the Fixes:-tagged commit there was a similar bug with tsoh_page, but I'm not sure it's worth doing another fix for such old kernels. Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Suggested-by: Derek Shute <Derek.Shute@stratus.com> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09net/mlx5e: Show/set Rx network flow classification rules on ul repVlad Buslov
Reuse infrastructure that already exists for pf in legacy mode to show/set Rx network flow classification rules for uplink representors. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09net/mlx5e: Init ethtool steering for representorsVlad Buslov
During transition to uplink representors the code responsible for initializing ethtool steering functionality wasn't added to representor init rx routine. This causes NULL pointer dereference during configuration of network flow classification rule with ethtool (only possible to reproduce with next commit in this series which registers necessary ethtool callbacks). Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09net/mlx5e: Show/set Rx flow indir table and RSS hash key on ul repVlad Buslov
Reuse infrastructure that already exists for pf in legacy mode to show/set Rx flow hash indirection table and RSS hash key for uplink representors. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09net/mlx5e: Introduce root ft concept for representors netdevsSaeed Mahameed
Uplink representor traffic will be redirected to an empty root ft rather than directly to a direct tir or ttc table, this root ft will be empty and will be used as a link for auto-chaining with ttc table or ethtool tables in downstream patches. On load, fs core will connect uplink rep root_ft with ttc table. In case ethtool steering will be used, fs core will auto connect root_ft with the ethtool bypass tables, which will be connected with the ttc table. vport_rx_rule[uplink_rep]->root_ft->ethtool->ttc. For non-uplink representors, for simplicity root_ft will always point at ttc table, hence the replace vport_rx rule logic is removed. vport_rx_rule[non_uplink_rep]->root_ft(ttc). For now ethtool steering support can only be available on uplink rep. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-09net/mlx5: E-switch, make query inline mode a static functionParav Pandit
mlx5_eswitch_inline_mode_get() is used only in eswitch_offloads.c. Hence, make it static and adjacent to its caller function. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: Allocate smaller size tables for ft offloadPaul Blakey
Instead of giving ft tables one of the largest tables available - 4M, give it a more reasonable size - 64k. Especially since it will always be created as a miss hook in the following patch. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5e: Fix an IS_ERR() vs NULL checkDan Carpenter
The esw_vport_tbl_get() function returns error pointers on error. Fixes: 96e326878fa5 ("net/mlx5e: Eswitch, Use per vport tables for mirroring") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: Verify goto chain offload supportEli Cohen
According to PRM, forward to flow table along with either packet reformat or decap is supported only if reformat_and_fwd_to_table capability is set for the flow table. Add dependency on the capability and pack all the conditions for "goto chain" in a single function. Fix language in error message in case of not supporting forward to a lower numbered flow table. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: E-Switch, Use vport metadata matching only when mandatoryMajd Dibbiny
Multi-port RoCE mode requires tagging traffic that passes through the vport. This matching can cause performance degradation, therefore disable it and use the legacy matching on vhca_id and source_port when possible. Fixes: 92ab1eb392c6 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: Tidy up and fix reverse christmas ordringMark Bloch
Use reverse chirstmas tree inside mlx5e_ethtool_get_link_ksettings. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: Expose port speed when possibleMark Bloch
When port speed can't be reported based on ext_eth_proto_capability or eth_proto_capability instead of reporting speed as unknown check if the port's speed can be inferred based on the data_rate_oper field. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux This series adds some HW bits and definitions for mlx5 driver, to be used by downstream features in both rdma and netdev branches. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: HW bit for goto chain offload support net/mlx5: Expose link speed directly net/mlx5: Introduce TLS and IPSec objects enums net/mlx5: Introduce egress acl forward-to-vport capability net/mlx5: Expose raw packet pacing APIs net/mlx5e: Replace zero-length array with flexible-array member net/mlx5: fix spelling mistake "reserverd" -> "reserved" Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09Merge tag 'ktest-v5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull Ktest fixes and clean ups from Steven Rostedt: - Make the default option oldconfig instead of randconfig (one too many times I lost my config because I left the build type out) - Add timeout to ssh sync to sync before reboot (prevents test hangs) - A couple of spelling fix patches * tag 'ktest-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest: Fix typos in ktest.pl ktest: Add timeout for ssh sync testing ktest: Make default build option oldconfig not randconfig ktest: Fix some typos in sample.conf
2020-03-09Merge tag 'mmc-v5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - sdhci-msm: Silence warning about turning function into static - sdhci-pci-gli: Fix support for GL975x by enabling MSI interrupt * tag 'mmc-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-pci-gli: Enable MSI interrupt for GL975x mmc: sdhci-msm: Mark sdhci_msm_cqe_disable static
2020-03-10bpftool: Fix typo in bash-completionSong Liu
_bpftool_get_map_names => _bpftool_get_prog_names for prog-attach|detach. Fixes: 99f9863a0c45 ("bpftool: Match maps by name") Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200309173218.2739965-5-songliubraving@fb.com
2020-03-10bpftool: Bash completion for "bpftool prog profile"Song Liu
Add bash completion for "bpftool prog profile" command. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200309173218.2739965-4-songliubraving@fb.com
2020-03-10bpftool: Documentation for bpftool prog profileSong Liu
Add documentation for the new bpftool prog profile command. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200309173218.2739965-3-songliubraving@fb.com
2020-03-10bpftool: Introduce "prog profile" commandSong Liu
With fentry/fexit programs, it is possible to profile BPF program with hardware counters. Introduce bpftool "prog profile", which measures key metrics of a BPF program. bpftool prog profile command creates per-cpu perf events. Then it attaches fentry/fexit programs to the target BPF program. The fentry program saves perf event value to a map. The fexit program reads the perf event again, and calculates the difference, which is the instructions/cycles used by the target program. Example input and output: ./bpftool prog profile id 337 duration 3 cycles instructions llc_misses 4228 run_cnt 3403698 cycles (84.08%) 3525294 instructions # 1.04 insn per cycle (84.05%) 13 llc_misses # 3.69 LLC misses per million isns (83.50%) This command measures cycles and instructions for BPF program with id 337 for 3 seconds. The program has triggered 4228 times. The rest of the output is similar to perf-stat. In this example, the counters were only counting ~84% of the time because of time multiplexing of perf counters. Note that, this approach measures cycles and instructions in very small increments. So the fentry/fexit programs introduce noticeable errors to the measurement results. The fentry/fexit programs are generated with BPF skeletons. Therefore, we build bpftool twice. The first time _bpftool is built without skeletons. Then, _bpftool is used to generate the skeletons. The second time, bpftool is built with skeletons. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200309173218.2739965-2-songliubraving@fb.com
2020-03-09Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Some bug fixes all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: Adjust label in virtballoon_probe virtio-blk: improve virtqueue error to BLK_STS virtio-blk: fix hw_queue stopped on arbitrary error virtio_ring: Fix mem leak with vring_new_virtqueue()
2020-03-09pid: make ENOMEM return value more obviousChristian Brauner
The alloc_pid() codepath used to be simpler. With the introducation of the ability to choose specific pids in 49cb2fc42ce4 ("fork: extend clone3() to support setting a PID") it got more complex. It hasn't been super obvious that ENOMEM is returned when the pid namespace init process/child subreaper of the pid namespace has died. As can be seen from multiple attempts to improve this see e.g. [1] and most recently [2]. We regressed returning ENOMEM in [3] and [2] restored it. Let's add a comment on top explaining that this is historic and documented behavior and cannot easily be changed. [1]: 35f71bc0a09a ("fork: report pid reservation failure properly") [2]: b26ebfe12f34 ("pid: Fix error return value in some cases") [3]: 49cb2fc42ce4 ("fork: extend clone3() to support setting a PID") Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-03-09bpf, doc: Update maintainers for L7 BPFLorenz Bauer
Add Jakub and myself as maintainers for sockmap related code. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200309111243.6982-13-lmb@cloudflare.com