Age  Commit message  Author
2020-06-30  net: qed: convert to SPDX License Identifiers  Alexander Lobakin
QLogic QED drivers source code is dual licensed under GPL-2.0/BSD-3-Clause. Remove all the boilerplates in the existing code and replace it with the correct SPDX tag. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: qed: correct existing SPDX tags  Alexander Lobakin
QLogic QED drivers source code is dual licensed under GPL-2.0/BSD-3-Clause. Correct already existing but wrong SPDX tags to match the actual license. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  selftests/bpf: Allow substituting custom vmlinux.h for selftests build  Andrii Nakryiko
Similarly to bpftool Makefile, allow to specify custom location of vmlinux.h to be used during the build. This allows simpler testing setups with checked-in pre-generated vmlinux.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630004759.521530-2-andriin@fb.com
2020-06-30  tools/bpftool: Allow substituting custom vmlinux.h for the build  Andrii Nakryiko
In some build contexts (e.g., Travis CI build for outdated kernel), vmlinux.h, generated from available kernel, doesn't contain all the types necessary for BPF program compilation. For such set up, the most maintainable way to deal with this problem is to keep pre-generated (almost up-to-date) vmlinux.h checked in and use it for compilation purposes. bpftool after that can deal with kernel missing some of the features in runtime with no problems. To that effect, allow to specify path to custom vmlinux.h to bpftool's Makefile with VMLINUX_H variable. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630004759.521530-1-andriin@fb.com
2020-06-30  PCI: Make pcie_find_root_port() work for Root Ports  Mika Westerberg
Commit 6ae72bfa656e ("PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()") broke acpi_pci_bridge_d3() because calling pcie_find_root_port() on a Root Port returned NULL when it should return the Root Port, which in turn broke power management of PCIe hierarchies. Rework pcie_find_root_port() so it returns its argument when it is already a Root Port. [bhelgaas: test device only once, test for PCIe] Fixes: 6ae72bfa656e ("PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()") Link: https://lore.kernel.org/r/20200622161248.51099-1-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
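The reworked lookup can be sketched as a loop that tests the starting device before walking upstream. This is a toy model under stated assumptions, not the kernel code: `struct toy_pci_dev`, its fields, and `toy_find_root_port()` are hypothetical names invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct toy_pci_dev {
	bool is_pcie;                  /* device is a PCIe device */
	bool is_root_port;             /* device is a PCIe Root Port */
	struct toy_pci_dev *upstream;  /* upstream bridge, NULL at the top */
};

/*
 * Return the Root Port at or above dev, or NULL if there is none.
 * The starting device is tested first, so a Root Port maps to itself
 * instead of falling through to NULL.
 */
static struct toy_pci_dev *toy_find_root_port(struct toy_pci_dev *dev)
{
	while (dev) {
		if (dev->is_pcie && dev->is_root_port)
			return dev;
		dev = dev->upstream;
	}
	return NULL;
}
```

Testing the device inside the loop, rather than starting from its parent, is what lets a caller pass a Root Port directly and still get a usable result.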
2020-06-30  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf  David S. Miller
Daniel Borkmann says:
====================
pull-request: bpf 2020-06-30
The following pull-request contains BPF updates for your *net* tree.
We've added 28 non-merge commits during the last 9 day(s) which contain a total of 35 files changed, 486 insertions(+), 232 deletions(-).
The main changes are:
1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer types, from Yonghong Song.
2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.
3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb internals and integrate it properly into DMA core, from Christoph Hellwig.
4) Fix RCU splat from recent changes to avoid skipping ingress policy when kTLS is enabled, from John Fastabend.
5) Fix BPF ringbuf map to enforce size to be the power of 2 in order for its position masking to work, from Andrii Nakryiko.
6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading of network programs, from Maciej Żenczykowski.
7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.
8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
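Item 5 of the pull request depends on a general property of ring buffers: reducing a position with a bit mask equals a modulo only when the size is a power of 2. The sketch below illustrates that property; the function names are hypothetical and this is not the BPF ringbuf code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n has exactly one bit set. */
static bool toy_is_power_of_two(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Map an ever-increasing position onto a slot by masking.
 * This matches pos % size only when size is a power of 2.
 */
static uint64_t toy_ring_index(uint64_t pos, uint64_t size)
{
	return pos & (size - 1);
}
```

With a size such as 3000 the mask has gaps in its bit pattern, so some slots are never reached and others are hit by unrelated positions, which is why the size has to be rejected up front.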
2020-06-30  tcp: call tcp_ack_tstamp() when not fully acked  Yousuk Seung
When skb is coalesced tcp_ack_tstamp() still needs to be called when not fully acked in tcp_clean_rtx_queue(), otherwise SCM_TSTAMP_ACK timestamps may never be fired. Since the original patch series had dependent commits, this patch fixes the issue instead of reverting by restoring calls to tcp_ack_tstamp() when skb is not fully acked. Fixes: fdb7eb21ddd3 ("tcp: stamp SCM_TSTAMP_ACK later in tcp_clean_rtx_queue()") Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net/mlx5e: fix memory leak of tls  Colin Ian King
The error return path when create_singlethread_workqueue fails currently does not kfree tls and leads to a memory leak. Fix this by kfree'ing tls before returning -ENOMEM. Addresses-Coverity: ("Resource leak") Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  mptcp: do nonce initialization at subflow creation time  Paolo Abeni
This cleans up the code a bit, reduces the number of hooks and indirect calls used, and allows better error reporting from __mptcp_subflow_connect(). Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net/tls: fix sign extension issue when left shifting u16 value  Colin Ian King
Left shifting the u16 value promotes it to an int, and the result is then sign extended to a u64. If len << 16 is greater than 0x7fffffff then the upper bits get set to 1 because of the implicit sign extension. Fix this by casting len to u64 before shifting it. Addresses-Coverity: ("integer handling issues") Fixes: ed9b7646b06a ("net/tls: Add asynchronous resync") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
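The effect can be reproduced in a few lines of user-space C. This is a minimal sketch with hypothetical function names; the first function models the 32-bit signed intermediate explicitly (assuming the usual two's-complement conversion) so that the example does not itself rely on shifting into the sign bit of an int.

```c
#include <stdint.h>

/*
 * Models `len << 16` assigned to a 64-bit value: the shift result is a
 * 32-bit signed int, which is sign-extended when widened.
 */
static uint64_t toy_shift_as_int(uint16_t len)
{
	int32_t promoted = (int32_t)((uint32_t)len << 16);

	return (uint64_t)promoted;
}

/* The fix: widen to 64 bits first, then shift. */
static uint64_t toy_shift_as_u64(uint16_t len)
{
	return (uint64_t)len << 16;
}
```

For len = 0x8000 the first form yields 0xffffffff80000000 while the second yields 0x80000000; for any len below 0x8000 the two agree, which is why the bug only shows up for large values.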
2020-06-30  Merge branch 'sfc-prerequisites-for-EF100-driver-part-2'  David S. Miller
Edward Cree says: ==================== sfc: prerequisites for EF100 driver, part 2 Continuing on from [1], this series further prepares the sfc codebase for the introduction of the EF100 driver. [1]: https://lore.kernel.org/netdev/20200629.173812.1532344417590172093.davem@davemloft.net/T/ ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  bpf: Add tests for PTR_TO_BTF_ID vs. null comparison  Yonghong Song
Add two tests for PTR_TO_BTF_ID vs. null ptr comparison, one for PTR_TO_BTF_ID in the ctx structure and the other for PTR_TO_BTF_ID after one level pointer chasing. In both cases, the test ensures condition is not removed.
For example, for this test
    struct bpf_fentry_test_t {
        struct bpf_fentry_test_t *a;
    };
    int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
    {
        if (arg == 0)
            test7_result = 1;
        return 0;
    }
Before the previous verifier change, we have xlated codes:
    int test7(long long unsigned int * ctx):
    ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
       0: (79) r1 = *(u64 *)(r1 +0)
    ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
       1: (b4) w0 = 0
       2: (95) exit
After the previous verifier change, we have:
    int test7(long long unsigned int * ctx):
    ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
       0: (79) r1 = *(u64 *)(r1 +0)
    ; if (arg == 0)
       1: (55) if r1 != 0x0 goto pc+4
    ; test7_result = 1;
       2: (18) r1 = map[id:6][0]+48
       4: (b7) r2 = 1
       5: (7b) *(u64 *)(r1 +0) = r2
    ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
       6: (b4) w0 = 0
       7: (95) exit
Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630171241.2523875-1-yhs@fb.com
2020-06-30  bpf: Fix an incorrect branch elimination by verifier  Yonghong Song
Wenbo reported an issue in [1] where a null pointer check is evaluated as always false. In this particular case, the program type is tp_btf and the pointer to compare is a PTR_TO_BTF_ID. The current verifier considers that PTR_TO_BTF_ID always represents a non-null pointer, hence all comparisons of a PTR_TO_BTF_ID to 0 are evaluated as always not-equal, which resulted in the branch elimination.
For example,
    struct bpf_fentry_test_t {
        struct bpf_fentry_test_t *a;
    };
    int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
    {
        if (arg == 0)
            test7_result = 1;
        return 0;
    }
    int BPF_PROG(test8, struct bpf_fentry_test_t *arg)
    {
        if (arg->a == 0)
            test8_result = 1;
        return 0;
    }
In the above bpf programs, both branches, arg == 0 and arg->a == 0, are removed. This may not be what the developer expected.
The bug was introduced by commit cac616db39c2 ("bpf: Verifier track null pointer branch_taken with JNE and JEQ"), where PTR_TO_BTF_ID is considered to be non-null when evaluating pointer vs. scalar comparison. This may have been added considering we have PTR_TO_BTF_ID_OR_NULL in the verifier as well.
PTR_TO_BTF_ID_OR_NULL was added to explicitly require non-NULL testing in selected cases. The current generic pointer tracing framework in the verifier always assigns PTR_TO_BTF_ID so users do not need to check for a NULL pointer at every pointer level like a->b->c->d.
We may not want to assign every PTR_TO_BTF_ID as PTR_TO_BTF_ID_OR_NULL as this will require a null test before pointer dereference, which may cause inconvenience for developers. But we could avoid branch elimination to preserve original code intention.
This patch simply removed PTR_TO_BTF_ID from reg_type_not_null() in the verifier, which prevented the above branches from being eliminated.
[1]: https://lore.kernel.org/bpf/79dbb7c0-449d-83eb-5f4f-7af0cc269168@fb.com/T/
Fixes: cac616db39c2 ("bpf: Verifier track null pointer branch_taken with JNE and JEQ") Reported-by: Wenbo Zhang <ethercflow@gmail.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630171240.2523722-1-yhs@fb.com
2020-06-30  Merge branch 'net-ipa-three-bug-fixes'  David S. Miller
Alex Elder says: ==================== net: ipa: three bug fixes This series contains three bug fixes for the Qualcomm IPA driver. In practice these bugs are unlikely to be harmful, but they do represent incorrect code. Version 2 adds "Fixes" tags to two of the patches and fixes a typo in one (found by checkpatch.pl). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: ipa: introduce ipa_cmd_tag_process()  Alex Elder
Create a new function ipa_cmd_tag_process() that simply allocates a transaction, adds a tag process command to it to clear the hardware pipeline, and commits the transaction. Call it from ipa_endpoint_suspend(), after suspending the modem endpoints but before suspending the AP command TX and AP LAN RX endpoints (which are used by the tag sequence). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: ipa: no checksum offload for SDM845 LAN RX  Alex Elder
The AP LAN RX endpoint should not have download checksum offload enabled. The receive handler does properly accommodate the trailer that's added by the hardware, but we ignore it. Fixes: 1ed7d0c0fdba ("soc: qcom: ipa: configuration data") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: ipa: always check for stopped channel  Alex Elder
In gsi_channel_stop(), there's a check to see if the channel might have entered STOPPED state since a previous call, which might have timed out before stopping completed. That check actually belongs in gsi_channel_stop_command(), which is called repeatedly by gsi_channel_stop() for RX channels. Fixes: 650d1603825d ("soc: qcom: ipa: the generic software interface") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: don't call tx_remove if there isn't one  Edward Cree
EF100 won't have an efx->type->tx_remove method, because there's nothing for it to do. So make the call conditional. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
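The pattern described here, an operations table in which a method may be left NULL and is therefore called conditionally, can be sketched as follows. All names are hypothetical stand-ins rather than the sfc driver's structures.

```c
#include <stddef.h>

/* Hypothetical TX queue; records whether teardown ran. */
struct toy_tx_queue {
	int removed;
};

/* Hypothetical per-NIC operations table; tx_remove may be NULL. */
struct toy_nic_type {
	void (*tx_remove)(struct toy_tx_queue *txq);
};

/* Example teardown for a NIC type that does have work to do. */
static void toy_legacy_tx_remove(struct toy_tx_queue *txq)
{
	txq->removed = 1;
}

/* Common code: call the method only when the NIC type provides it. */
static void toy_remove_tx_queue(const struct toy_nic_type *type,
				struct toy_tx_queue *txq)
{
	if (type->tx_remove)
		type->tx_remove(txq);
}
```

The alternative would be an empty stub for every NIC type with nothing to do; the NULL check keeps that boilerplate out of the new driver.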
2020-06-30  sfc: commonise initialisation of efx->vport_id  Edward Cree
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise efx->[rt]xq_entries initialisation  Edward Cree
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: initialise max_[tx_]channels in efx_init_channels()  Edward Cree
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: move definition of EFX_MC_STATS_GENERATION_INVALID  Edward Cree
Saves a whole #include from nic.c. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: factor out efx_tx_tso_header_length() and understand encapsulation  Edward Cree
ef100 will need to check this against NIC limits. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: remove duplicate declaration of efx_enqueue_skb_tso()  Edward Cree
Define it in nic_common.h, even though the ef100 driver will have a different implementation backing it (actually a WARN_ON_ONCE as it should never get called by ef100. But it needs to still exist because common TX path code references it). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise TSO fallback code  Edward Cree
ef100 will need this if it gets GSO skbs it can't handle (e.g. too long header length). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise efx_sync_rx_buffer()  Edward Cree
The ef100 RX path will also need to DMA-sync RX buffers. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise some MAC configuration code  Edward Cree
Refactor it a little as we go, and introduce efx_mcdi_set_mtu() which we will later use for ef100 to change MTU without touching other MAC settings. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise miscellaneous efx functions  Edward Cree
Various left-over bits and pieces from efx.c that are needed by ef100. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: add missing licence info to mcdi_filters.c  Edward Cree
Both the licence notice and the SPDX tag were missing from this file. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: commonise MCDI MAC stats handling  Edward Cree
Most of it was already declared in mcdi_port_common.h, so just move the implementations to mcdi_port_common.c. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  sfc: move NIC-specific mcdi_port declarations out of common header  Edward Cree
These functions are implemented in mcdi_port.c, which will not be linked into the EF100 driver; thus their prototypes should not be visible in common header files. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  Merge branch 'Convert-Broadcom-SF2-to-mac_link_up-resolved-state'  David S. Miller
Russell King says: ==================== Convert Broadcom SF2 to mac_link_up() resolved state Convert Broadcom SF2 DSA support to use the newly provided resolved link state via mac_link_up() rather than using the state in mac_config(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa/bcm_sf2: move pause mode setting into mac_link_up()  Russell King
bcm_sf2 only appears to support pause modes on RGMII interfaces (the enable bits are in the RGMII control register). Set up the pause modes for RGMII connections. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa/bcm_sf2: move speed/duplex forcing to mac_link_up()  Russell King
Convert the bcm_sf2 to use the finalised speed and duplex in its mac_link_up() call rather than the parameters in mac_config(). Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa/bcm_sf2: fix incorrect usage of state->link  Russell King
state->link has never been valid in mac_config() implementations - while it may be correct in some calls, it is not true that it can be relied upon. Fix bcm_sf2 to use the correct method of handling forced link status. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  Merge branch 'Convert-Broadcom-B53-to-mac_link_up-resolved-state'  David S. Miller
Russell King says: ==================== Convert Broadcom B53 to mac_link_up() resolved state These two patches update the Broadcom B53 DSA support to use the newly provided resolved link state via mac_link_up() rather than using the state in mac_config(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa/b53: use resolved link config in mac_link_up()  Russell King
Convert the B53 driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(). This is just a matter of moving the call to b53_force_port_config(). Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa/b53: change b53_force_port_config() pause argument  Russell King
Replace the b53_force_port_config() pause argument, which is based on phylink's MLO_PAUSE_* definitions, to use a pair of booleans. This will allow us to move b53_force_port_config() from b53_phylink_mac_config() to b53_phylink_mac_link_up(). Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
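The change of argument style can be illustrated with a small conversion helper that turns a pause bitmask into a pair of booleans. The constants and names below are hypothetical; in particular, the bit values are assumptions chosen for the sketch and are not copied from phylink's MLO_PAUSE_* definitions.

```c
#include <stdbool.h>

/* Assumed bit assignments, for illustration only. */
#define TOY_PAUSE_RX (1u << 0)
#define TOY_PAUSE_TX (1u << 1)

/* The resolved pause state as two independent booleans. */
struct toy_pause {
	bool tx_pause;
	bool rx_pause;
};

/* Split a pause bitmask into the boolean pair a link-up hook receives. */
static struct toy_pause toy_pause_from_flags(unsigned int flags)
{
	struct toy_pause p = {
		.tx_pause = (flags & TOY_PAUSE_TX) != 0,
		.rx_pause = (flags & TOY_PAUSE_RX) != 0,
	};

	return p;
}
```

Taking booleans means the callee no longer needs to know the bitmask encoding, so it can be called from a hook that only has the resolved TX and RX pause state.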
2020-06-30  net: mvneta: fix use of state->speed  Russell King
When support for short preambles was added, it incorrectly keyed its decision off state->speed instead of state->interface. state->speed is not guaranteed to be correct for in-band modes, which can lead to short preambles being unexpectedly disabled. Fix this by keying off the interface mode, which is the only way that mvneta can operate at 2.5Gbps. Fixes: da58a931f248 ("net: mvneta: Add support for 2500Mbps SGMII") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  Merge tag 'batadv-next-for-davem-20200630' of git://git.open-mesh.org/linux-merge  David S. Miller
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
 - bump version strings, by Simon Wunderlich
 - update mailing list URL, by Sven Eckelmann
 - fix typos and grammar in documentation, by Sven Eckelmann
 - introduce a configurable per interface hop penalty, by Linus Luessing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: dsa: Improve subordinate PHY error message  Florian Fainelli
It is not very informative to know the DSA master device when a subordinate network device fails to get its PHY setup. Provide the device name and capitalize PHY while we are at it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  hinic: remove unused but set variable  Luo bin
Remove an unused but set variable to avoid an auto build test WARNING. Signed-off-by: Luo bin <luobin9@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  Merge tag 'exfat-for-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat  Linus Torvalds
Pull exfat fixes from Namjae Jeon:
 - Zero out unused characters of FileName field to avoid a complaint from some fsck tool.
 - Fix memory leak on error paths.
 - Fix unnecessary VOL_DIRTY set when calling rmdir on non-empty directory.
 - Call sync_filesystem() for read-only remount (Fix generic/452 test in xfstests)
 - Add own fsync() to flush dirty metadata.
* tag 'exfat-for-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: flush dirty metadata in fsync
  exfat: move setting VOL_DIRTY over exfat_remove_entries()
  exfat: call sync_filesystem for read-only remount
  exfat: add missing brelse() calls on error paths
  exfat: Set the unused characters of FileName field to the value 0000h
2020-06-30  Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue  David S. Miller
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2020-06-29
This series contains updates to only the igc driver.
Sasha added Energy Efficient Ethernet (EEE) support and Latency Tolerance Reporting (LTR) support for the igc driver. Added Low Power Idle (LPI) counters and cleaned up unused TCP segmentation counters. Removed igc_power_down_link() and call igc_power_down_phy_copper_base() directly. Removed unneeded copper media check.
Andre cleaned up timestamping by removing unsupported features and duplicate code for i225. Fixed the timestamp check on the proper flag instead of the skb for pending transmit timestamps. Refactored igc_ptp_set_timestamp_mode() to simplify the flow.
v2: Removed the log message in patch 1 as suggested by David Miller. Note: The locking issue Jakub Kicinski saw in patch 5 currently exists in the current net-next tree, so Andre will resolve the locking issue in a follow-on patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  Merge branch 'support-AF_PACKET-for-layer-3-devices'  David S. Miller
Jason A. Donenfeld says:
====================
support AF_PACKET for layer 3 devices
Hans reported that packets injected by a correct-looking and trivial libpcap-based program were not being accepted by wireguard. In investigating that, I noticed that a few devices weren't properly handling AF_PACKET-injected packets, and so this series introduces a bit of shared infrastructure to support that.
The basic problem begins with socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)) sockets. When sendto is called, AF_PACKET examines the headers of the packet with this logic:
    static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
    {
        if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) &&
            sock->type == SOCK_RAW) {
            skb_reset_mac_header(skb);
            skb->protocol = dev_parse_header_protocol(skb);
        }
        skb_probe_transport_header(skb);
    }
The middle condition there triggers, and we jump to dev_parse_header_protocol. Note that this is the only caller of dev_parse_header_protocol in the kernel, and I assume it was designed for this purpose:
    static inline __be16 dev_parse_header_protocol(const struct sk_buff *skb)
    {
        const struct net_device *dev = skb->dev;
        if (!dev->header_ops || !dev->header_ops->parse_protocol)
            return 0;
        return dev->header_ops->parse_protocol(skb);
    }
Since AF_PACKET already knows which netdev the packet is going to, the dev_parse_header_protocol function can see if that netdev has a way it prefers to figure out the protocol from the header. This, again, is the only use of parse_protocol in the kernel. At the moment, it's only used with ethernet devices, via eth_header_parse_protocol. This makes sense, as mostly people are used to AF_PACKET-injecting ethernet frames rather than layer 3 frames. But with nothing in place for layer 3 netdevs, this function winds up returning 0, and skb->protocol then is set to 0, and then by the time it hits the netdev's ndo_start_xmit, the driver doesn't know what to do with it.
This is a problem because drivers very much rely on skb->protocol being correct, and routinely reject packets where it's incorrect. That's why having this parsing happen for injected packets is quite important. In wireguard, ipip, and ipip6, for example, packets from AF_PACKET are just dropped entirely. For tun devices, it's sort of uglier, with the tun "packet information" header being passed to userspace containing a bogus protocol value. Some userspace programs are ill-equipped to deal with that. (But of course, that doesn't happen with tap devices, which benefit from the similar shared infrastructure for layer 2 netdevs, further motivating this patchset for layer 3 netdevs.)
This patchset addresses the issue by first adding a layer 3 header parse function, much akin to the existing one for layer 2 packets, and then adding a shared header_ops structure, also much akin to the existing one for layer 2 packets. Then it wires it up to a few immediate places that stuck out as requiring it, and does a bit of cleanup.
This patchset seems like it's fixing real bugs, so it might be appropriate for stable. But they're also very old bugs, so if you'd rather not backport to stable, that'd make sense to me too.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
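A layer 3 header parse function of the kind the cover letter describes can be approximated in user space by looking at the IP version nibble of the first byte. This is a minimal sketch under stated assumptions: `toy_l3_parse_protocol()` and the `TOY_` constants are hypothetical names, the result is returned in host byte order rather than as a `__be16`, and the length checks use the fixed minimum IPv4 and IPv6 header sizes.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define TOY_ETH_P_IPV6 0x86DD  /* EtherType for IPv6 */

/*
 * Guess the protocol of a raw layer 3 packet from its first byte.
 * Returns 0 when the buffer is too short or the version is unknown,
 * mirroring the "no answer" case of a parse_protocol hook.
 */
static uint16_t toy_l3_parse_protocol(const uint8_t *pkt, size_t len)
{
	if (len >= 20 && (pkt[0] >> 4) == 4)
		return TOY_ETH_P_IP;
	if (len >= 40 && (pkt[0] >> 4) == 6)
		return TOY_ETH_P_IPV6;
	return 0;
}
```

A layer 3 device has no link-layer header to consult, so the version field is the only in-band hint available; anything else is reported as 0 and left for the driver to reject.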
2020-06-30  net: xfrmi: implement header_ops->parse_protocol for AF_PACKET  Jason A. Donenfeld
The xfrm interface uses skb->protocol to determine packet type, and bails out if it's not set. For AF_PACKET injection, we need to support its call chain of: packet_sendmsg -> packet_snd -> packet_parse_headers -> dev_parse_header_protocol -> parse_protocol Without a valid parse_protocol, this returns zero, and xfrmi rejects the skb. So, this wires up the ip_tunnel handler for layer 3 packets for that case. Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: sit: implement header_ops->parse_protocol for AF_PACKET  Jason A. Donenfeld
Sit uses skb->protocol to determine packet type, and bails out if it's not set. For AF_PACKET injection, we need to support its call chain of: packet_sendmsg -> packet_snd -> packet_parse_headers -> dev_parse_header_protocol -> parse_protocol Without a valid parse_protocol, this returns zero, and sit rejects the skb. So, this wires up the ip_tunnel handler for layer 3 packets for that case. Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  net: vti: implement header_ops->parse_protocol for AF_PACKET  Jason A. Donenfeld
Vti uses skb->protocol to determine packet type, and bails out if it's not set. For AF_PACKET injection, we need to support its call chain of: packet_sendmsg -> packet_snd -> packet_parse_headers -> dev_parse_header_protocol -> parse_protocol Without a valid parse_protocol, this returns zero, and vti rejects the skb. So, this wires up the ip_tunnel handler for layer 3 packets for that case. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  tun: implement header_ops->parse_protocol for AF_PACKET  Jason A. Donenfeld
The tun driver passes up skb->protocol to userspace in the form of PI headers. For AF_PACKET injection, we need to support its call chain of: packet_sendmsg -> packet_snd -> packet_parse_headers -> dev_parse_header_protocol -> parse_protocol Without a valid parse_protocol, this returns zero, and the tun driver then gives userspace bogus values that it can't deal with. Note that this isn't the case with tap, because tap already benefits from the shared infrastructure for ethernet headers. But with tun, there's nothing. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30  wireguard: queueing: make use of ip_tunnel_parse_protocol  Jason A. Donenfeld
Now that wg_examine_packet_protocol has been added for general consumption as ip_tunnel_parse_protocol, it's possible to remove wg_examine_packet_protocol and simply use the new ip_tunnel_parse_protocol function directly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>