summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-07Merge branch 'virtio_net-XDP-adjust_head'David S. Miller
John Fastabend says: ==================== XDP adjust head support for virtio This series adds adjust head support for virtio. The following is my test setup. I use qemu + virtio as follows, ./x86_64-softmmu/qemu-system-x86_64 \ -hda /var/lib/libvirt/images/Fedora-test0.img \ -m 4096 -enable-kvm -smp 2 -netdev tap,id=hn0,queues=4,vhost=on \ -device virtio-net-pci,netdev=hn0,mq=on,guest_tso4=off,guest_tso6=off,guest_ecn=off,guest_ufo=off,vectors=9 In order to use XDP with virtio until LRO is supported TSO must be turned off in the host. The important fields in the above command line are the following, guest_tso4=off,guest_tso6=off,guest_ecn=off,guest_ufo=off Also note it is possible to conusme more queues than can be supported because when XDP is enabled for retransmit XDP attempts to use a queue per cpu. My standard queue count is 'queues=4'. After loading the VM I run the relevant XDP test programs in, ./sammples/bpf For this series I tested xdp1, xdp2, and xdp_tx_iptunnel. I usually test with iperf (-d option to get bidirectional traffic), ping, and pktgen. I also have a modified xdp1 that returns XDP_PASS on any packet to ensure the normal traffic path to the stack continues to work with XDP loaded. It would be great to automate this soon. At the moment I do it by hand which is starting to get tedious. v2: original series dropped trace points after merge. ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: XDP support for adjust_headJohn Fastabend
Add support for XDP adjust head by allocating a 256B header region that XDP programs can grow into. This is only enabled when a XDP program is loaded. In order to ensure that we do not have to unwind queue headroom push queue setup below bpf_prog_add. It reads better to do a prog ref unwind vs another queue setup call. At the moment this code must do a full reset to ensure old buffers without headroom on program add or with headroom on program removal are not used incorrectly in the datapath. Ideally we would only have to disable/enable the RX queues being updated but there is no API to do this at the moment in virtio so use the big hammer. In practice it is likely not that big of a problem as this will only happen when XDP is enabled/disabled changing programs does not require the reset. There is some risk that the driver may either have an allocation failure or for some reason fail to correctly negotiate with the underlying backend in this case the driver will be left uninitialized. I have not seen this ever happen on my test systems and for what its worth this same failure case can occur from probe and other contexts in virtio framework. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: refactor freeze/restore logic into virtnet reset logicJohn Fastabend
For XDP we will need to reset the queues to allow for buffer headroom to be configured. In order to do this we need to essentially run the freeze()/restore() code path. Unfortunately the locking requirements between the freeze/restore and reset paths are different however so we can not simply reuse the code. This patch refactors the code path and adds a reset helper routine. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: remove duplicate queue pair binding in XDPJohn Fastabend
Factor out qp assignment. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: factor out xdp handler for readabilityJohn Fastabend
At this point the do_xdp_prog is mostly if/else branches handling the different modes of virtio_net. So remove it and handle running the program in the per mode handlers. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: wrap rtnl_lock in test for calling with lock already heldJohn Fastabend
For XDP use case and to allow ethtool reset tests it is useful to be able to use reset paths from contexts where rtnl lock is already held. This requries updating virtnet_set_queues and free_receive_bufs the two places where rtnl_lock is taken in virtio_net. To do this we use the following pattern, _foo(...) { do stuff } foo(...) { rtnl_lock(); _foo(...); rtnl_unlock()}; this allows us to use freeze()/restore() flow from both contexts. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07mac80211: add back lost debugfs filesJohannes Berg
Somehow these files were never present or lost, but the code is there and they seem somewhat useful, so add them back. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-07ath9k: fix race condition in enabling/disabling IRQsFelix Fietkau
The code currently relies on refcounting to disable IRQs from within the IRQ handler and re-enabling them again after the tasklet has run. However, due to race conditions sometimes the IRQ handler might be called twice, or the tasklet may not run at all (if interrupted in the middle of a reset). This can cause nasty imbalances in the irq-disable refcount which will get the driver permanently stuck until the entire radio has been stopped and started again (ath_reset will not recover from this). Instead of using this fragile logic, change the code to ensure that running the irq handler during tasklet processing is safe, and leave the refcount untouched. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath9k_hw: check if the chip failed to wake upFelix Fietkau
In an RFC patch, Sven Eckelmann and Simon Wunderlich reported: "QCA 802.11n chips (especially AR9330/AR9340) sometimes end up in a state in which a read of AR_CFG always returns 0xdeadbeef. This should not happen when when the power_mode of the device is ATH9K_PM_AWAKE." Include the check for the default register state in the existing MAC hang check. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath9k: rename tx_complete_work to hw_check_workFelix Fietkau
Also include common MAC alive check. This should make the hang checks more reliable for modes where beacons are not sent and is used as a starting point for further hang check improvements Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07wcn36xx: Implement cancel_hw_scanBjorn Andersson
In the even that the wcn36xx interface is brought down while a hw_scan is active we must abort and wait for the ongoing scan to signal completion to mac80211. Reported-by: Mart Raudsepp <leio@gentoo.org> Fixes: 886039036c20 ("wcn36xx: Implement firmware assisted scan") Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: fix reading sram contents for QCA4019Ashok Raj Nagarajan
With QCA4019 platform, SRAM address can be accessed directly from host but currently, we are assuming sram addresses cannot be accessed directly and hence we convert the addresses. While there, clean up growing hw checks during conversion of target CPU address to CE address. Now we have function pointer pertaining to different chips. Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: fix boot failure in UTF mode/testmodeTamizh chelvam
Rx filter reset and the dynamic tx switch mode (EXT_RESOURCE_CFG) configuration are causing the following errors when UTF firmware is loaded to the target. Error message 1: [ 598.015629] ath10k_pci 0001:01:00.0: failed to ping firmware: -110 [ 598.020828] ath10k_pci 0001:01:00.0: failed to reset rx filter: -110 [ 598.141556] ath10k_pci 0001:01:00.0: failed to start core (testmode): -110 Error message 2: [ 668.615839] ath10k_ahb a000000.wifi: failed to send ext resource cfg command : -95 [ 668.618902] ath10k_ahb a000000.wifi: failed to start core (testmode): -95 Avoiding these configurations while bringing the target in testmode is solving the problem. Cc: stable@vger.kernel.org Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: add debugfs support to get per peer tids log via tracingMaharaja Kennadyrajan
This patch provides support to get per peer tids log. echo 1 > /sys/kernel/debug/ieee80211/phyX/netdev\:wlanX/stations/ XX:XX/peer_debug_trigger These logs will be the part of FWLOGS which we collect the logs via tracing interface. Here we will get the peer tigd logs only once(not repeatedly) when we write 1 to the debugfs file. Signed-off-by: Maharaja Kennadyrajan <c_mkenna@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: few whitespace fixesKalle Valo
Fixes checkpatch warnings: drivers/net/wireless/ath/ath10k/pci.c:1593: Statements should start on a tabstop drivers/net/wireless/ath/ath10k/ce.c:962: Alignment should match open parenthesis Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: use names in function definition argumentsKalle Valo
Fixes new checkpatch warnings: drivers/net/wireless/ath/ath10k/htt.h:1823: function definition argument 'struct sk_buff *' should also have an identifier name drivers/net/wireless/ath/ath10k/wmi.h:6607: function definition argument 'struct wmi_start_scan_arg *' should also have an identifier name Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07ath10k: prefer unsigned int over just unsignedKalle Valo
Fixes new checkpatch warnings: drivers/net/wireless/ath/ath10k/htt.h:1639: Prefer 'unsigned int' to bare use of 'unsigned' drivers/net/wireless/ath/ath10k/htt.h:1660: Prefer 'unsigned int' to bare use of 'unsigned' Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07Revert "ALSA: line6: Only determine control port properties if needed"Takashi Iwai
This reverts commit f6a0dd107ad0c8b59d1c9735eea4b8cb9f460949. The commit caused a regression on LINE6 Transport that has no control caps. Although reverting the commit may result back in a spurious error message for some device again, it's the simplest regression fix, hence it's taken as is at first. The further code fix will follow later. Fixes: f6a0dd107ad0 ("ALSA: line6: Only determine control port properties if needed") Reported-by: Igor Zinovev <zinigor@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-02-07rtlwifi: rtl8192c-common: Fix "BUG: KASAN:Larry Finger
Kernels built with CONFIG_KASAN=y report the following BUG for rtl8192cu and rtl8192c-common: ================================================================== BUG: KASAN: slab-out-of-bounds in rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common] at addr ffff8801c90edb08 Read of size 1 by task kworker/0:1/38 page:ffffea0007243800 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x8000000000004000(head) page dumped because: kasan: bad access detected CPU: 0 PID: 38 Comm: kworker/0:1 Not tainted 4.9.7-gentoo #3 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77-DS3H, BIOS F11a 11/13/2013 Workqueue: rtl92c_usb rtl_watchdog_wq_callback [rtlwifi] 0000000000000000 ffffffff829eea33 ffff8801d7f0fa30 ffff8801c90edb08 ffffffff824c0f09 ffff8801d4abee80 0000000000000004 0000000000000297 ffffffffc070b57c ffff8801c7aa7c48 ffff880100000004 ffffffff000003e8 Call Trace: [<ffffffff829eea33>] ? dump_stack+0x5c/0x79 [<ffffffff824c0f09>] ? kasan_report_error+0x4b9/0x4e0 [<ffffffffc070b57c>] ? _usb_read_sync+0x15c/0x280 [rtl_usb] [<ffffffff824c0f75>] ? __asan_report_load1_noabort+0x45/0x50 [<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common] [<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common] [<ffffffffc06d0cbe>] ? rtl92c_dm_rf_saving+0x96e/0x1330 [rtl8192c_common] ... The problem is due to rtl8192ce and rtl8192cu sharing routines, and having different layouts of struct rtl_pci_priv, which is used by rtl8192ce, and struct rtl_usb_priv, which is used by rtl8192cu. The problem was resolved by placing the struct bt_coexist_info at the head of each of those private areas. Reported-and-tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> # 4.0+ Cc: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtcoutsrc.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtcoutsrc.c to use the standard routines. It also deletes the definitions of the now-unused debugging macros, and turns on compilation of all the routines in btcoexist. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtc8821a2ant.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtc8821a2ant.c to use the standard routines. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtc8821a1ant.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtc8821a1ant.c to use the standard routines. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtc8723b2ant.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtc8723b2ant.c to use the standard routines. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtc8723b1ant.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtc8723b1ant.c to use the standard routines. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Convert halbtc8192e2ant.c to use standard debuggingLarry Finger
The routines in btcoexist use different debugging routines than are used in the other drivers. This patch converts halbtc8192e2ant.c to use the standard routines. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: rtl8821ae: Fix typo in symbol for bandwidth numbersLarry Finger
In several places, "BANDWITH" is used when "BANDWIDTH" should have been used. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07mwifiex: Avoid skipping WEP key deletion for APGanapathi Bhat
This patch fixes the issue specific to AP. AP is started with WEP security and external station is connected to it. Data path works in this case. Now if AP is restarted with WPA/WPA2 security, station is able to connect but ping fails. Driver skips the deletion of WEP keys if interface type is AP. Removing that redundant check resolves the issue. Fixes: e57f1734d87a ("mwifiex: add key material v2 support") Signed-off-by: Ganapathi Bhat <gbhat@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rt2x00: avoid introducing a USB dependency in the rt2x00lib moduleStanislaw Gruszka
As reported by Felix: Though protected by an ifdef, introducing an usb symbol dependency in the rt2x00lib module is a major inconvenience for distributions that package kernel modules split into individual packages. Get rid of this unnecessary dependency by calling the usb related function from a more suitable place. Cc: Vishal Thanki <vishalthanki@gmail.com> Reported-by: Felix Fietkau <nbd@nbd.name> Fixes: 8b4c0009313f ("rt2x00usb: Use usb anchor to manage URB") Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07brcmfmac: use wiphy_read_of_freq_limits to respect limits from DTRafał Miłecki
This new helper reads extra frequency limits specified in DT and disables unavailable chanels. This is useful for devices (like home routers) with chipsets limited e.g. by board design. In order to respect info read from DT we simply need to check for IEEE80211_CHAN_DISABLED bit when constructing channel info. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07mwifiex: don't include mac80211.hJohannes Berg
This driver doesn't use mac80211, so it shouldn't include mac80211.h, include only the necessary cfg80211.h instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07Merge tag 'iwlwifi-next-for-kalle-2017-02-06' of ↵Kalle Valo
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next Second batch of improvements and fixes for v4.11. * A bunch of bugfixes for the DQA code; * Work on support for new A000 devices continues; * Some clean-ups and general improvements
2017-02-07rtlwifi: Add work queue for c2h cmd.Ping-Ke Shih
btcoex needs to sleep, thus it must run in thread context. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: rtl8723be: fix ant_sel codePing-Ke Shih
When ant_sel is set, we need to fill single_ant_path to select correct antenna path. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoex: move bt_type declarationPing-Ke Shih
Routine rtl_get_hwpg_bt_type() is better in halbtcoutsrc.c than in rtl_btc.c. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: rtl8723be: btcoex: add package_type function to btcoexPing-Ke Shih
The new code handles the package-type of the chip. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: move btcoex's ant_num declarationPing-Ke Shih
File halbtcoutsrc.c is a better place for routine rtl_get_hwpg_ant_num() than file rtl_btc.c. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: rtl8723be: btcoexist: Add single_ant_pathPing-Ke Shih
Some devices with RTL8732BE wifi/Bluetooth adapters are shipped with only a single antenna. On a subset of these, the EEPROM is incorectly coded to indicate the wrong connection. The resulting problems have been fixed for wifi. This change handles them for BT. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Change logging in halbtc8192e2ant.cLarry Finger
This routine uses its own debugging macros These are changed to use the the recently rewritten RT_TRACE macro. There are also some renamed variables that were missed in the previous step. The only functional change is that some debugging statements have been dropped based on the final code supplied by Realtek. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: btcoexist: Add vendor definition for new btcoexistPing-Ke Shih
Routine halbtc_get() will need to be able to get the vendor ID. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: Add a new enumeration value to btc_set_typePing-Ke Shih
The new value is needed for future capability. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: Set retry limit depends on vif type.Ping-Ke Shih
We assign different retry limit according to vif type, because it can boost performance in field. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: shaofu <shaofu@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-07rtlwifi: Fix programing CAM content sequence.Ping-Ke Shih
There is a potential race condition when the control byte of a CAM entry is written first. Write in reverse order to correct the condition. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: shaofu <shaofu@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-02-06Merge branch 'bridge-improve-cache-utilization'David S. Miller
Nikolay Aleksandrov says: ==================== bridge: improve cache utilization This is the first set which begins to deal with the bad bridge cache access patterns. The first patch rearranges the bridge and port structs a little so the frequently (and closely) accessed members are in the same cache line. The second patch then moves the garbage collection to a workqueue trying to improve system responsiveness under load (many fdbs) and more importantly removes the need to check if the matched entry is expired in __br_fdb_get which was a major source of false-sharing. The third patch is a preparation for the final one which If properly configured, i.e. ports bound to CPUs (thus updating "updated" locally) then the bridge's HitM goes from 100% to 0%, but even without binding we get a win because previously every lookup that iterated over the hash chain caused false-sharing due to the first cache line being used for both mac/vid and used/updated fields. Some results from tests I've run: (note that these were run in good conditions for the baseline, everything ran on a single NUMA node and there were only 3 fdbs) 1. baseline 100% Load HitM on the fdbs (between everyone who has done lookups and hit one of the 3 hash chains of the communicating src/dst fdbs) Overall 5.06% Load HitM for the bridge, first place in the list 2. patched & ports bound to CPUs 0% Local load HitM, bridge is not even in the c2c report list Also there's 3% consistent improvement in netperf tests. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: fdb: write to used and updated at most once per jiffyNikolay Aleksandrov
Writing once per jiffy is enough to limit the bridge's false sharing. After this change the bridge doesn't show up in the local load HitM stats. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move write-heavy fdb members in their own cache lineNikolay Aleksandrov
Fdb's used and updated fields are written to on every packet forward and packet receive respectively. Thus if we are receiving packets from a particular fdb, they'll cause false-sharing with everyone who has looked it up (even if it didn't match, since mac/vid share cache line!). The "used" field is even worse since it is updated on every packet forward to that fdb, thus the standard config where X ports use a single gateway results in 100% fdb false-sharing. Note that this patch does not prevent the last scenario, but it makes it better for other bridge participants which are not using that fdb (and are only doing lookups over it). The point is with this move we make sure that only communicating parties get the false-sharing, in a later patch we'll show how to avoid that too. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move to workqueue gcNikolay Aleksandrov
Move the fdb garbage collector to a workqueue which fires at least 10 milliseconds apart and cleans chain by chain allowing for other tasks to run in the meantime. When having thousands of fdbs the system is much more responsive. Most importantly remove the need to check if the matched entry has expired in __br_fdb_get that causes false-sharing and is completely unnecessary if we cleanup entries, at worst we'll get 10ms of traffic for that entry before it gets deleted. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: modify bridge and port to have often accessed fields in one cache lineNikolay Aleksandrov
Move around net_bridge so the vlan fields are in the beginning since they're checked on every packet even if vlan filtering is disabled. For the port move flags & vlan group to the beginning, so they're in the same cache line with the port's state (both flags and state are checked on each packet). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bpf: enable verifier to add 0 to packet ptrWilliam Tu
The patch fixes the case when adding a zero value to the packet pointer. The zero value could come from src_reg equals type BPF_K or CONST_IMM. The patch fixes both, otherwise the verifer reports the following error: [...] R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=4) R2=pkt_end R3=fp-12 R4=imm4,min_value=4,max_value=4 R5=pkt(id=0,off=4,r=4) 269: (bf) r2 = r0 // r2 becomes imm0 270: (77) r2 >>= 3 271: (bf) r4 = r1 // r4 becomes pkt ptr 272: (0f) r4 += r2 // r4 += 0 addition of negative constant to packet pointer is not allowed Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Mihai Budiu <mbudiu@vmware.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06Merge branch 'read-vnet_hdr_sz-once'David S. Miller
Willem de Bruijn says: ==================== read vnet_hdr_sz once Tuntap devices allow concurrent use and update of field vnet_hdr_sz. Read the field once to avoid TOCTOU. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06macvtap: read vnet_hdr_size onceWillem de Bruijn
When IFF_VNET_HDR is enabled, a virtio_net header must precede data. Data length is verified to be greater than or equal to expected header length tun->vnet_hdr_sz before copying. Macvtap functions read the value once, but unless READ_ONCE is used, the compiler may ignore this and read multiple times. Enforce a single read and locally cached value to avoid updates between test and use. Signed-off-by: Willem de Bruijn <willemb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>