summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2018-09-20net/ipv4: Move device validation to helperDavid Ahern
Move the device matching check in __fib_validate_source to a helper and export it for use by netfilter modules. Code move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18netlink: add ethernet address policy typesJohannes Berg
Commonly, ethernet addresses are just using a policy of { .len = ETH_ALEN } which leaves userspace free to send more data than it should, which may hide bugs. Introduce NLA_EXACT_LEN which checks for exact size, rejecting the attribute if it's not exactly that length. Also add NLA_EXACT_LEN_WARN which requires the minimum length and will warn on longer attributes, for backward compatibility. Use these to define NLA_POLICY_ETH_ADDR (new strict policy) and NLA_POLICY_ETH_ADDR_COMPAT (compatible policy with warning); these are used like this: static const struct nla_policy <name>[...] = { [NL_ATTR_NAME] = NLA_POLICY_ETH_ADDR, ... }; Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18netlink: add NLA_REJECT policy typeJohannes Berg
In some situations some netlink attributes may be used for output only (kernel->userspace) or may be reserved for future use. It's then helpful to be able to prevent userspace from using them in messages sent to the kernel, since they'd otherwise be ignored and any future will become impossible if this happens. Add NLA_REJECT to the policy which does nothing but reject (with EINVAL) validation of any messages containing this attribute. Allow for returning a specific extended ACK error message in the validation_data pointer. While at it clear up the documentation a bit - the NLA_BITFIELD32 documentation was added to the list of len field descriptions. Also, use NL_SET_BAD_ATTR() in one place where it's open-coded. The specific case I have in mind now is a shared nested attribute containing request/response data, and it would be pointless and potentially confusing to have userspace include response data in the messages that actually contain a request. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two new tls tests added in parallel in both net and net-next. Used Stephen Rothwell's linux-next resolution. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-17tls: async support causes out-of-bounds access in crypto APIsJohn Fastabend
When async support was added it needed to access the sk from the async callback to report errors up the stack. The patch tried to use space after the aead request struct by directly setting the reqsize field in aead_request. This is an internal field that should not be used outside the crypto APIs. It is used by the crypto code to define extra space for private structures used in the crypto context. Users of the API then use crypto_aead_reqsize() and add the returned amount of bytes to the end of the request memory allocation before posting the request to encrypt/decrypt APIs. So this breaks (with general protection fault and KASAN error, if enabled) because the request sent to decrypt is shorter than required causing the crypto API out-of-bounds errors. Also it seems unlikely the sk is even valid by the time it gets to the callback because of memset in crypto layer. Anyways, fix this by holding the sk in the skb->sk field when the callback is set up and because the skb is already passed through to the callback handler via void* we can access it in the handler. Then in the handler we need to be careful to NULL the pointer again before kfree_skb. I added comments on both the setup (in tls_do_decryption) and when we clear it from the crypto callback handler tls_decrypt_done(). After this selftests pass again and fixes KASAN errors/warnings. Fixes: 94524d8fc965 ("net/tls: Add support for async decryption of tls records") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Vakul Garg <Vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13vxlan: Remove duplicated include from vxlan.hYueHaibing
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13tls: zero the crypto information from tls_context before freeingSabrina Dubroca
This contains key material in crypto_send_aes_gcm_128 and crypto_recv_aes_gcm_128. Introduce union tls_crypto_context, and replace the two identical unions directly embedded in struct tls_context with it. We can then use this union to clean up the memory in the new tls_ctx_free() function. Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13net: sock: introduce SOCK_XDPJason Wang
This patch introduces a new sock flag - SOCK_XDP. This will be used for notifying the upper layer that XDP program is attached on the lower socket, and requires for extra headroom. TUN will be the first user. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13llc: avoid blocking in llc_sap_close()Cong Wang
llc_sap_close() is called by llc_sap_put() which could be called in BH context in llc_rcv(). We can't block in BH. There is no reason to block it here, kfree_rcu() should be sufficient. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13net: dsa: Add Lantiq / Intel GSWIP tag supportHauke Mehrtens
This handles the tag added by the PMAC on the VRX200 SoC line. The GSWIP uses internally a GSWIP special tag which is located after the Ethernet header. The PMAC which connects the GSWIP to the CPU converts this special tag used by the GSWIP into the PMAC special tag which is added in front of the Ethernet header. This was tested with GSWIP 2.1 found in the VRX200 SoCs, other GSWIP versions use slightly different PMAC special tags. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-09-12scsi: libcxgbi: fib6_ino reference in rt6_info is rcu protectedDavid Ahern
The fib6_info reference in rt6_info is rcu protected. Add a helper to extract prefsrc from and update cxgbi_check_route6 to use it. Fixes: 0153167aebd0 ("net/ipv6: Remove rt6i_prefsrc") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for you net tree: 1) Remove duplicated include at the end of UDP conntrack, from Yue Haibing. 2) Restore conntrack dependency on xt_cluster, from Martin Willi. 3) Fix splat with GSO skbs from the checksum target, from Florian Westphal. 4) Rework ct timeout support, the template strategy to attach custom timeouts is not correct since it will not work in conjunction with conntrack zones and we have a possible free after use when removing the rule due to missing refcounting. To fix these problems, do not use conntrack template at all and set custom timeout on the already valid conntrack object. This fix comes with a preparation patch to simplify timeout adjustment by initializating the first position of the timeout array for all of the existing trackers. Patchset from Florian Westphal. 5) Fix missing dependency on from IPv4 chain NAT type, from Florian. 6) Release chain reference counter from the flush path, from Taehee Yoo. 7) After flushing an iptables ruleset, conntrack hooks are unregistered and entries are left stale to be cleaned up by the timeout garbage collector. No TCP tracking is done on established flows by this time. If ruleset is reloaded, then hooks are registered again and TCP tracking is restored, which considers packets to be invalid. Clear window tracking to exercise TCP flow pickup from the middle given that history is lost for us. Again from Florian. 8) Fix crash from netlink interface with CONFIG_NF_CONNTRACK_TIMEOUT=y and CONFIG_NF_CT_NETLINK_TIMEOUT=n. 9) Broken CT target due to returning incorrect type from ctnl_timeout_find_get(). 10) Solve conntrack clash on NF_REPEAT verdicts too, from Michal Vaner. 11) Missing conversion of hashlimit sysctl interface to new API, from Cong Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10net: sched: cls_flower: dump offload count valueVlad Buslov
Change flower in_hw_count type to fixed-size u32 and dump it as TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared blocks and re-offload functionality. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10sch_netem: Move private queue handler to generic location.David S. Miller
By hand copies of SKB list handlers do not belong in individual packet schedulers. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10sch_htb: Remove local SKB queue handling code.David S. Miller
Instead, adjust __qdisc_enqueue_tail() such that HTB can use it instead. The only other caller of __qdisc_enqueue_tail() is qdisc_enqueue_tail() so we can move the backlog and return value handling (which HTB doesn't need/want) to the latter. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10net/ipv6: Remove rt6i_prefsrcDavid Ahern
After the conversion to fib6_info, rt6i_prefsrc has a single user that reads the value and otherwise it is only set. The one reader can be converted to use rt->from so rt6i_prefsrc can be removed, reducing rt6_info by another 20 bytes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05rtnetlink: add rtnl_get_net_ns_capable()Christian Brauner
get_target_net() will be used in follow-up patches in ipv{4,6} codepaths to retrieve network namespaces based on network namespace identifiers. So remove the static declaration and export in the rtnetlink header. Also, rename it to rtnl_get_net_ns_capable() to make it obvious what this function is doing. Export rtnl_get_net_ns_capable() so it can be used when ipv6 is built as a module. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05mac80211: add an option for drivers to check if packets can be aggregatedSara Sharon
Some hardwares have limitations on the packets' type in AMSDU. Add an optional driver callback to determine if two skbs can be used in the same AMSDU or not. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: allow AMSDU size limitation per-TIDSara Sharon
Some drivers may have AMSDU size limitation per TID, due to HW constrains. Add an option to set this limit. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: add an option for station management TXQSara Sharon
We have a TXQ abstraction for non-data packets that need powersave buffering. Since the AP cannot sleep, in case of station we can use this TXQ for all management frames, regardless if they are bufferable. Add HW flag to allow that. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: support reporting 0-length PSDU in radiotapShaul Triebitz
For certain sounding frames, it may be useful to report them to userspace even though they don't have a PSDU in order to determine the PHY parameters (e.g. VHT rate/stream config.) Add support for this to mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: Fix PTK rekey freezes and clear text leakAlexander Wetzel
Rekeying PTK keys without "Extended Key ID for Individually Addressed Frames" did use a procedure not suitable to replace in-use keys and could caused the following issues: 1) Freeze caused by incoming frames: If the local STA installed the key prior to the remote STA we still had the old key active in the hardware when mac80211 switched over to the new key. Therefore there was a window where the card could hand over frames decoded with the old key to mac80211 and bump the new PN (IV) value to an incorrect high number. When it happened the local replay detection silently started to drop all frames sent with the new key. 2) Freeze caused by outgoing frames: If mac80211 was providing the PN (IV) and handed over a clear text frame for encryption to the hardware prior to a key change the driver/card could have processed the queued frame after switching to the new key. This bumped the PN value on the remote STA to an incorrect high number, tricking the remote STA to discard all frames we sent later. 3) Freeze caused by RX aggregation reorder buffer: An aggregation session started with the old key and ending after the switch to the new key also bumped the PN to an incorrect high number, freezing the connection quite similar to 1). 4) Freeze caused by repeating lost frames in an aggregation session: A driver could repeat a lost frame and encrypt it with the new key while in a TX aggregation session without updating the PN for the new key. This also could freeze connections similar to 2). 5) Clear text leak: Removing encryption offload from the card cleared the encryption offload flag only after the card had deleted the key and we did not stop TX during the rekey. The driver/card could therefore get unencrypted frames from mac80211 while no longer be instructed to encrypt them. To prevent those issues the key install logic has been changed: - Mac80211 divers known to be able to rekey PTK0 keys have to set @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0, - mac80211 stops queuing frames depending on the key during the replace - the key is first replaced in the hardware and after that in mac80211 - and mac80211 stops/blocks new aggregation sessions during the rekey. For drivers not setting @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 the user space must avoid PTK rekeys if "Extended Key ID for Individually Addressed Frames" is not being used. Rekeys for mac80211 drivers without this flag will generate a warning and use an extra call to ieee80211_flush_queues() to both highlight and try to prevent the issues with not updated drivers. The core of the fix changes the key install procedure from: - atomic switch over to the new key in mac80211 - remove the old key in the hardware (stops encryption offloading, fall back to software encryption with a potential clear text packet leak in between) - delete the inactive old key in mac80211 - enable hardware encryption offloading for the new key to: - if it's a PTK mark the old key as tainted to drop TX frames with the outgoing key - replace the key in hardware with the new one - atomic switch over to the new (not marked as tainted) key in mac80211 (which also resumes TX) - delete the inactive old key in mac80211 With the new sequence the hardware will be unable to decrypt frames encrypted with the old key prior to switching to the new key in mac80211 and thus prevent PNs from packets decrypted with the old key to be accounted against the new key. For that to work the drivers have to provide a clear boundary. Mac80211 drivers setting @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 confirm to provide it and mac80211 will then be able to correctly rekey in-use PTK keys with those drivers. The mac80211 requirements for drivers to set the flag have been added to the "Hardware crypto acceleration" documentation section. It drills down to: The drivers must not hand over frames decrypted with the old key to mac80211 once the call to set_key() with %DISABLE_KEY has been completed. It's allowed to either drop or continue to use the old key for any outgoing frames which are already in the queues, but it must not send out any of them unencrypted or encrypted with the new key. Even with the new boundary in place aggregation sessions with the reorder buffer are problematic: RX aggregation session started prior and completed after the rekey could still dump frames received with the old key at mac80211 after it switched over to the new key. This is side stepped by stopping all (RX and TX) aggregation sessions when replacing a PTK key and hardware key offloading. Stopping TX aggregation sessions avoids the need to get the PNs (IVs) updated in frames prepared for the old key and (re)transmitted after the switch to the new key. As a bonus it improves the compatibility when the remote STA is not handling rekeys as it should. When using software crypto aggregation sessions are not stopped. Mac80211 won't be able to decode the dangerous frames and discard them without special handling. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> [trim overly long rekey warning] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: support radiotap L-SIG dataShaul Triebitz
As before with HE, the data needs to be provided by the driver in the skb head, since there's not enough space in the skb CB. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: Store sk_pacing_shift in ieee80211_hwWen Gong
Make it possibly for drivers to adjust the default skb_pacing_shift by storing it in the hardware struct. Signed-off-by: Wen Gong <wgong@codeaurora.org> [adjust commit log, move & adjust comment] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: introduce capability flags for VHT EXT NSS supportJohannes Berg
Depending on whether or not rate control supports selecting rates depending on the bandwidth, we can use VHT extended NSS support. In essence, this is dot11VHTExtendedNSSBWCapable from the spec, since depending on that we'll need to parse the bandwidth. If needed, also set/clear the VHT Capability Element bit for this capability so that we don't advertise it erroneously or don't advertise it when we actually use it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05cfg80211: add he_capabilities (ext) IE to AP settingsShaul Triebitz
Same as for HT and VHT. This helps the lower level to know whether the AP supports HE. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05mac80211: add an optional TXQ for other PS-buffered framesJohannes Berg
Some drivers may want to also use the TXQ abstraction with non-data packets that need powersave buffering, so add a hardware flag to allow this. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-09-03Merge tag 'mac80211-for-davem-2018-09-03' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Here are quite a large number of fixes, notably: * various A-MSDU building fixes (currently only affects mt76) * syzkaller & spectre fixes in hwsim * TXQ vs. teardown fix that was causing crashes * embed WMM info in reg rule, bad code here had been causing crashes * one compilation issue with fix from Arnd (rfkill-gpio includes) * fixes for a race and bad data during/after channel switch * nl80211: a validation fix, attribute type & unit fixes along with other small fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-01net/tls: Add support for async decryption of tls recordsVakul Garg
When tls records are decrypted using asynchronous acclerators such as NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on getting -EINPROGRESS, the tls record processing stops till the time the crypto accelerator finishes off and returns the result. This incurs a context switch and is not an efficient way of accessing the crypto accelerators. Crypto accelerators work efficient when they are queued with multiple crypto jobs without having to wait for the previous ones to complete. The patch submits multiple crypto requests without having to wait for for previous ones to complete. This has been implemented for records which are decrypted in zero-copy mode. At the end of recvmsg(), we wait for all the asynchronous decryption requests to complete. The references to records which have been sent for async decryption are dropped. For cases where record decryption is not possible in zero-copy mode, asynchronous decryption is not used and we wait for decryption crypto api to complete. For crypto requests executing in async fashion, the memory for aead_request, sglists and skb etc is freed from the decryption completion handler. The decryption completion handler wakesup the sleeping user context when recvmsg() flags that it has done sending all the decryption requests and there are no more decryption requests pending to be completed. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31Revert "net: sched: act: add extack for lookup callback"Cong Wang
This reverts commit 331a9295de23 ("net: sched: act: add extack for lookup callback"). This extack is never used after 6 months... In fact, it can be just set in the caller, right after ->lookup(). Cc: Alexander Aring <aring@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-09-01 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus. 2) BPF verifier improvements by giving each register its own liveness chain which allows to simplify and getting rid of skip_callee() logic, from Edward. 3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap and percpu lru hashmap. Also add generic percpu formatted print on bpftool so the same can be dumped there, from Yonghong. 4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and TCP_SAVED_SYN options to allow reflection of tos/tclass from received SYN packet, from Nikita. 5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2 interaction and removal of incorrect shutdown() calls, from John. 6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-01xsk: i40e: get rid of useless struct xdp_umem_propsMagnus Karlsson
This commit gets rid of the structure xdp_umem_props. It was there to be able to break a dependency at one point, but this is no longer needed. The values in the struct are instead stored directly in the xdp_umem structure. This simplifies the xsk code as well as af_xdp zero-copy drivers and as a bonus gets rid of one internal header file. The i40e driver is also adapted to the new interface in this commit. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-29Merge tag 'mac80211-next-for-davem-2018-08-29' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Only a few changes at this point: * new channels in 60 GHz * clarify (average) ACK signal reporting API * expose ieee80211_send_layer2_update() for all drivers * start/stop mac80211's TXQs properly when required * avoid regulatory restore with IE ignoring * spelling: contidion -> condition * fully implement WFA Multi-AP backhaul ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29genetlink: constify genl_err_attr() argumentMichal Kubecek
genl_err_attr() sets netlink_ext_ack::bad_attr which is a pointer to const struct nlattr so make the attr argument also const. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-29xsk: expose xdp_umem_get_{data,dma} to driversBjörn Töpel
Move the xdp_umem_get_{data,dma} functions to include/net/xdp_sock.h, so that the upcoming zero-copy implementation in the Ethernet drivers can utilize them. Also, supply some dummy function implementations for CONFIG_XDP_SOCKETS=n configs. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29xdp: export xdp_rxq_info_unreg_mem_modelBjörn Töpel
Export __xdp_rxq_info_unreg_mem_model as xdp_rxq_info_unreg_mem_model, so it can be used from netdev drivers. Also, add additional checks for the memory type. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29xdp: implement convert_to_xdp_frame for MEM_TYPE_ZERO_COPYBjörn Töpel
This commit adds proper MEM_TYPE_ZERO_COPY support for convert_to_xdp_frame. Converting a MEM_TYPE_ZERO_COPY xdp_buff to an xdp_frame is done by transforming the MEM_TYPE_ZERO_COPY buffer into a MEM_TYPE_PAGE_ORDER0 frame. This is costly, and in the future it might make sense to implement a more sophisticated thread-safe alloc/free scheme for MEM_TYPE_ZERO_COPY, so that no allocation and copy is required in the fast-path. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29netfilter: nf_tables: rework ct timeout set supportFlorian Westphal
Using a private template is problematic: 1. We can't assign both a zone and a timeout policy (zone assigns a conntrack template, so we hit problem 1) 2. Using a template needs to take care of ct refcount, else we'll eventually free the private template due to ->use underflow. This patch reworks template policy to instead work with existing conntrack. As long as such conntrack has not yet been placed into the hash table (unconfirmed) we can still add the timeout extension. The only caveat is that we now need to update/correct ct->timeout to reflect the initial/new state, otherwise the conntrack entry retains the default 'new' timeout. Side effect of this change is that setting the policy must now occur from chains that are evaluated *after* the conntrack lookup has taken place. No released kernel contains the timeout policy feature yet, so this change should be ok. Changes since v2: - don't handle 'ct is confirmed case' - after previous patch, no need to special-case tcp/dccp/sctp timeout anymore Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-28cfg80211: Add support for 60GHz band channels 5 and 6Alexei Avshalom Lazar
The current support in the 60GHz band is for channels 1-4. Add support for channels 5 and 6. This requires enlarging ieee80211_channel.center_freq from u16 to u32. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-08-28mac80211: add stop/start logic for software TXQsManikanta Pubbisetty
Sometimes, it is required to stop the transmissions momentarily and resume it later; stopping the txqs becomes very critical in scenarios where the packet transmission has to be ceased completely. For example, during the hardware restart, during off channel operations, when initiating CSA(upon detecting a radar on the DFS channel), etc. The TX queue stop/start logic in mac80211 works well in stopping the TX when drivers make use of netdev queues, i.e, when Qdiscs in network layer take care of traffic scheduling. Since the devices implementing wake_tx_queue can run without Qdiscs, packets will be handed to mac80211 directly without queueing them in the netdev queues. Also, mac80211 does not invoke any of the netif_stop_*/netif_wake_* APIs if wake_tx_queue is implemented. Since the queues are not stopped in this case, transmissions can continue and this will impact negatively on the operation of the wireless device. For example, During hardware restart, we stop the netdev queues so that packets are not sent to the driver. Since ath10k implements wake_tx_queue, TX queues will not be stopped and packets might reach the hardware while it is restarting; this can make hardware unresponsive and the only possible option for recovery is to reboot the entire system. There is another problem to this, it is observed that the packets were sent on the DFS channel for a prolonged duration after radar detection impacting the channel closing time. We can still invoke netif stop/wake APIs when wake_tx_queue is implemented but this could lead to packet drops in network layer; adding stop/start logic for software TXQs in mac80211 instead makes more sense; the change proposed adds the same in mac80211. Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-08-28cfg80211/mac80211: make ieee80211_send_layer2_update a public functionDedy Lansky
Make ieee80211_send_layer2_update() a common function so other drivers can re-use it. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-08-28cfg80211: make wmm_rule part of the reg_rule structureStanislaw Gruszka
Make wmm_rule be part of the reg_rule structure. This simplifies the code a lot at the cost of having bigger memory usage. However in most cases we have only few reg_rule's and when we do have many like in iwlwifi we do not save memory as it allocates a separate wmm_rule for each channel anyway. This also fixes a bug reported in various places where somewhere the pointers were corrupted and we ended up doing a null-dereference. Fixes: 230ebaa189af ("cfg80211: read wmm rules from regulatory database") Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> [rephrase commit message slightly] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) ICE, E1000, IGB, IXGBE, and I40E bug fixes from the Intel folks. 2) Better fix for AB-BA deadlock in packet scheduler code, from Cong Wang. 3) bpf sockmap fixes (zero sized key handling, etc.) from Daniel Borkmann. 4) Send zero IPID in TCP resets and SYN-RECV state ACKs, to prevent attackers using it as a side-channel. From Eric Dumazet. 5) Memory leak in mediatek bluetooth driver, from Gustavo A. R. Silva. 6) Hook up rt->dst.input of ipv6 anycast routes properly, from Hangbin Liu. 7) hns and hns3 bug fixes from Huazhong Tan. 8) Fix RIF leak in mlxsw driver, from Ido Schimmel. 9) iova range check fix in vhost, from Jason Wang. 10) Fix hang in do_tcp_sendpages() with tls, from John Fastabend. 11) More r8152 chips need to disable RX aggregation, from Kai-Heng Feng. 12) Memory exposure in TCA_U32_SEL handling, from Kees Cook. 13) TCP BBR congestion control fixes from Kevin Yang. 14) hv_netvsc, ignore non-PCI devices, from Stephen Hemminger. 15) qed driver fixes from Tomer Tayar. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits) net: sched: Fix memory exposure from short TCA_U32_SEL qed: fix spelling mistake "comparsion" -> "comparison" vhost: correctly check the iova range when waking virtqueue qlge: Fix netdev features configuration. net: macb: do not disable MDIO bus at open/close time Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency" net: macb: Fix regression breaking non-MDIO fixed-link PHYs mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge i40e: fix condition of WARN_ONCE for stat strings i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled ixgbe: fix driver behaviour after issuing VFLR ixgbe: Prevent unsupported configurations with XDP ixgbe: Replace GFP_ATOMIC with GFP_KERNEL igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback() igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init() igb: Use an advanced ctx descriptor for launchtime e1000: ensure to free old tx/rx rings in set_ringparam() e1000: check on netif_running() before calling e1000_up() ixgb: use dma_zalloc_coherent instead of allocator/memset ice: Trivial formatting fixes ...
2018-08-22net_sched: fix unused variable warning in stmmacArnd Bergmann
The new tcf_exts_for_each_action() macro doesn't reference its arguments when CONFIG_NET_CLS_ACT is disabled, which leads to a harmless warning in at least one driver: drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c: In function 'tc_fill_actions': drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c:64:6: error: unused variable 'i' [-Werror=unused-variable] Adding a cast to void lets us avoid this kind of warning. To be on the safe side, do it for all three arguments, not just the one that caused the warning. Fixes: 244cd96adb5f ("net_sched: remove list_head from tc_action") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull core signal handling updates from Eric Biederman: "It was observed that a periodic timer in combination with a sufficiently expensive fork could prevent fork from every completing. This contains the changes to remove the need for that restart. This set of changes is split into several parts: - The first part makes PIDTYPE_TGID a proper pid type instead something only for very special cases. The part starts using PIDTYPE_TGID enough so that in __send_signal where signals are actually delivered we know if the signal is being sent to a a group of processes or just a single process. - With that prep work out of the way the logic in fork is modified so that fork logically makes signals received while it is running appear to be received after the fork completes" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (22 commits) signal: Don't send signals to tasks that don't exist signal: Don't restart fork when signals come in. fork: Have new threads join on-going signal group stops fork: Skip setting TIF_SIGPENDING in ptrace_init_task signal: Add calculate_sigpending() fork: Unconditionally exit if a fatal signal is pending fork: Move and describe why the code examines PIDNS_ADDING signal: Push pid type down into complete_signal. signal: Push pid type down into __send_signal signal: Push pid type down into send_signal signal: Pass pid type into do_send_sig_info signal: Pass pid type into send_sigio_to_task & send_sigurg_to_task signal: Pass pid type into group_send_sig_info signal: Pass pid and pid type into send_sigqueue posix-timers: Noralize good_sigevent signal: Use PIDTYPE_TGID to clearly store where file signals will be sent pid: Implement PIDTYPE_TGID pids: Move the pgrp and session pid pointers from task_struct to signal_struct kvm: Don't open code task_pid in kvm_vcpu_ioctl pids: Compute task_tgid using signal->leader_pid ...
2018-08-21net_sched: remove unused tcfa_capabCong Wang
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove list_head from tc_actionCong Wang
After commit 90b73b77d08e, list_head is no longer needed. Now we just need to convert the list iteration to array iteration for drivers. Fixes: 90b73b77d08e ("net: sched: change action API to use array of pointers to actions") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove unused tcf_idr_check()Cong Wang
tcf_idr_check() is replaced by tcf_idr_check_alloc(), and __tcf_idr_check() now can be folded into tcf_idr_search(). Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>