summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2020-09-28mac80211: get correct default channel width for S1GThomas Pedersen
When deleting a channel context, mac80211 would assing NL80211_CHAN_WIDTH_20_NOHT as the default channel width. This is wrong in S1G however, so instead get the allowed channel width for a given channel. Fixes eg. configuring strange (20Mhz) width during a scan on the S1G band. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-2-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28wireless: radiotap: fix some kernel-docJohannes Berg
The vendor namespaces argument isn't described here, add it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200924192511.2bf5cc761d3a.I9b4579ab3eebe3d7889b59eea8fa50d683611bab@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: fix some missing kernel-docJohannes Berg
Two parameters are not described, add them. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200924192511.21411377b0a8.I1add147d782a3bf38287bde412dc98f69323c732@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28nl80211/cfg80211: support 6 GHz scanningTova Mussai
Support 6 GHz scanning, by * a new scan flag to scan for colocated BSSes advertised by (and found) APs on 2.4 & 5 GHz * doing the necessary reduced neighbor report parsing for this, to find them * adding the ability to split the scan request in case the device by itself cannot support this. Also add some necessary bits in mac80211 to not break with these changes. Signed-off-by: Tova Mussai <tova.mussai@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200918113313.232917c93af9.Ida22f0212f9122f47094d81659e879a50434a6a2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: Inform AP when returning operating channelLoic Poulain
Because we can miss AP wakeup (beacon) while scanning other channels, it's better go into wakeup state and inform the AP of that upon returning to the operating channel, rather than staying asleep and waiting for the next TIM indicating traffic for us. This saves precious time, especially when we only have 200ms inter- scan period for monitoring the active connection. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/1593420923-26668-1-git-send-email-loic.poulain@linaro.org [rewrite commit message a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28net: vlan: Fixed signedness in vlan_group_prealloc_vid()Florian Fainelli
After commit d0186842ec5f ("net: vlan: Avoid using BUG() in vlan_proto_idx()"), vlan_proto_idx() was changed to return a signed integer, however one of its called: vlan_group_prealloc_vid() was still using an unsigned integer for its return value, fix that. Fixes: d0186842ec5f ("net: vlan: Avoid using BUG() in vlan_proto_idx()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_rtl4_a: use the generic flow dissector procedureVladimir Oltean
Remove the .flow_dissect procedure, so the flow dissector will call the generic variant which works for this tagging protocol. Cc: Linus Walleij <linus.walleij@linaro.org> Cc: DENG Qingfang <dqfext@gmail.com> Cc: Mauri Sandberg <sandberg@mailfence.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_sja1105: use a custom flow dissector procedureVladimir Oltean
The sja1105 is a bit of a special snowflake, in that not all frames are transmitted/received in the same way. L2 link-local frames are received with the source port/switch ID information put in the destination MAC address. For the rest, a tag_8021q header is used. So only the latter frames displace the rest of the headers and need to use the generic flow dissector procedure. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_qca: use the generic flow dissector procedureVladimir Oltean
Remove the .flow_dissect procedure, so the flow dissector will call the generic variant which works for this tagging protocol. Cc: John Crispin <john@phrozen.org> Cc: Alexander Lobakin <alobakin@pm.me> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_mtk: use the generic flow dissector procedureVladimir Oltean
Remove the .flow_dissect procedure, so the flow dissector will call the generic variant which works for this tagging protocol. Cc: DENG Qingfang <dqfext@gmail.com> Cc: Sean Wang <sean.wang@mediatek.com> Cc: John Crispin <john@phrozen.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_edsa: use the generic flow dissector procedureVladimir Oltean
Remove the .flow_dissect procedure, so the flow dissector will call the generic variant which works for this tagging protocol. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_dsa: use the generic flow dissector procedureVladimir Oltean
Remove the .flow_dissect procedure, so the flow dissector will call the generic variant which works for this tagging protocol. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_brcm: use generic flow dissector procedureVladimir Oltean
There are 2 Broadcom tags in use, one places the DSA tag before the Ethernet destination MAC address, and the other before the EtherType. Nonetheless, both displace the rest of the headers, so this tagger can use the generic flow dissector procedure which accounts for that. The ASCII art drawing is a good reference though, so keep it but move it somewhere else. Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: flow_dissector: avoid indirect call to DSA .flow_dissect for generic caseVladimir Oltean
With the recent mitigations against speculative execution exploits, indirect function calls are more expensive and it would be good to avoid them where possible. In the case of DSA, most switch taggers will shift the EtherType and next headers by a fixed amount equal to that tag's length in bytes. So we can use a generic procedure to determine that, without calling into custom tagger code. However we still leave the flow_dissect method inside struct dsa_device_ops as an override for the generic function. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: point out the tail taggersVladimir Oltean
The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and KSZ9893 switches also use tail tags. Tell that to the DSA core, since this makes a difference for the flow dissector. Most switches break the parsing of frame headers, but these ones don't, so no flow dissector adjustment needs to be done for them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: make the .flow_dissect tagger callback return voidVladimir Oltean
There is no tagger that returns anything other than zero, so just change the return type appropriately. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_ocelot: use a short prefix on both ingress and egressVladimir Oltean
There are 2 goals that we follow: - Reduce the header size - Make the header size equal between RX and TX The issue that required long prefix on RX was the fact that the ocelot DSA tag, being put before Ethernet as it is, would overlap with the area that a DSA master uses for RX filtering (destination MAC address mainly). Now that we can ask DSA to put the master in promiscuous mode, in theory we could remove the prefix altogether and call it a day, but it looks like we can't. Using no prefix on ingress, some packets (such as ICMP) would be received, while others (such as PTP) would not be received. This is because the DSA master we use (enetc) triggers parse errors ("MAC rx frame errors") presumably because it sees Ethernet frames with a bad length. And indeed, when using no prefix, the EtherType (bytes 12-13 of the frame, bits 96-111) falls over the REW_VAL field from the extraction header, aka the PTP timestamp. When turning the short (32-bit) prefix on, the EtherType overlaps with bits 64-79 of the extraction header, which are a reserved area transmitted as zero by the switch. The packets are not dropped by the DSA master with a short prefix. Actually, the frames look like this in tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 - added by a downstream sja1105). 89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \ dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \ ctrl 0x0004: Information, send seq 2, rcv seq 0, \ Flags [Response], length 78 0x0000: 8880 000a 001d 890c a9f2 0100 0000 100f ................ 0x0010: 0400 0000 0180 c200 000e 001f 7b63 0248 ............{c.H 0x0020: dadb 0482 88f7 1202 0036 0000 0000 0000 .........6...... 0x0030: 0000 0000 0000 0000 0000 001f 7bff fe63 ............{..c 0x0040: 0248 0001 1f81 0500 0000 0000 0000 0000 .H.............. 0x0050: 0000 0000 0000 0000 0000 0000 ............ So the short prefix is our new default: we've shortened our RX frames by 12 octets, increased TX by 4, and headers are now equal between RX and TX. Note that we still need promiscuous mode for the DSA master to not drop it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_sja1105: request promiscuous mode for masterVladimir Oltean
Currently PTP is broken when ports are in standalone mode (the tagger keeps printing this message): sja1105 spi0.1: Expected meta frame, is 01-80-c2-00-00-0e in the DSA master multicast filter? Sure, one might say "simply add 01-80-c2-00-00-0e to the master's RX filter" but things become more complicated because: - Actually all frames in the 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx range are trapped to the CPU automatically - The switch mangles bytes 3 and 4 of the MAC address via the incl_srcpt ("include source port [in the DMAC]") option, which is how source port and switch id identification is done for link-local traffic on RX. But this means that an address installed to the RX filter would, at the end of the day, not correspond to the final address seen by the DSA master. Assume RX filtering lists on DSA masters are typically too small to include all necessary addresses for PTP to work properly on sja1105, and just request promiscuous mode unconditionally. Just an example: Assuming the following addresses are trapped to the CPU: 01-80-c2-00-00-00 to 01-80-c2-00-00-ff 01-1b-19-00-00-00 to 01-1b-19-00-00-ff These are 512 addresses. Now let's say this is a board with 3 switches, and 4 ports per switch. The 512 addresses become 6144 addresses that must be managed by the DSA master's RX filtering lists. This may be refined in the future, but for now, it is simply not worth it to add the additional addresses to the master's RX filter, so simply request it to become promiscuous as soon as the driver probes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: allow drivers to request promiscuous mode on masterVladimir Oltean
Currently DSA assumes that taggers don't mess with the destination MAC address of the frames on RX. That is not always the case. Some DSA headers are placed before the Ethernet header (ocelot), and others simply mangle random bytes from the destination MAC address (sja1105 with its incl_srcpt option). Currently the DSA master goes to promiscuous mode automatically when the slave devices go too (such as when enslaved to a bridge), but in standalone mode this is a problem that needs to be dealt with. So give drivers the possibility to signal that their tagging protocol will get randomly dropped otherwise, and let DSA deal with fixing that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25devlink: introduce flash update overwrite maskJacob Keller
Sections of device flash may contain settings or device identifying information. When performing a flash update, it is generally expected that these settings and identifiers are not overwritten. However, it may sometimes be useful to allow overwriting these fields when performing a flash update. Some examples include, 1) customizing the initial device config on first programming, such as overwriting default device identifying information, or 2) reverting a device configuration to known good state provided in the new firmware image, or 3) in case it is suspected that current firmware logic for managing the preservation of fields during an update is broken. Although some devices are able to completely separate these types of settings and fields into separate components, this is not true for all hardware. To support controlling this behavior, a new DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an nla_bitfield32 which will define what subset of fields in a component should be overwritten during an update. If no bits are specified, or of the overwrite mask is not provided, then an update should not overwrite anything, and should maintain the settings and identifiers as they are in the previous image. If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set, then the device should be configured to overwrite any of the settings in the requested component with settings found in the provided image. Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the device should be configured to overwrite any device identifiers in the requested component with the identifiers from the image. Multiple overwrite modes may be combined to indicate that a combination of the set of fields that should be overwritten. Drivers which support the new overwrite mask must set the DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the supported_flash_update_params field of their devlink_ops. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25devlink: convert flash_update to use params structureJacob Keller
The devlink core recently gained support for checking whether the driver supports a flash_update parameter, via `supported_flash_update_params`. However, parameters are specified as function arguments. Adding a new parameter still requires modifying the signature of the .flash_update callback in all drivers. Convert the .flash_update function to take a new `struct devlink_flash_update_params` instead. By using this structure, and the `supported_flash_update_params` bit field, a new parameter to flash_update can be added without requiring modification to existing drivers. As before, all parameters except file_name will require driver opt-in. Because file_name is a necessary field to for the flash_update to make sense, no "SUPPORTED" bitflag is provided and it is always considered valid. All future additional parameters will require a new bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25devlink: check flash_update parameter support in net coreJacob Keller
When implementing .flash_update, drivers which do not support per-component update are manually checking the component parameter to verify that it is NULL. Without this check, the driver might accept an update request with a component specified even though it will not honor such a request. Instead of having each driver check this, move the logic into net/core/devlink.c, and use a new `supported_flash_update_params` field in the devlink_ops. Drivers which will support per-component update must now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in the supported_flash_update_params in their devlink_ops. This helps ensure that drivers do not forget to check for a NULL component if they do not support per-component update. This also enables a slightly better error message by enabling the core stack to set the netlink bad attribute message to indicate precisely the unsupported attribute in the message. Going forward, any new additional parameter to flash update will require a bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Cc: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25tcp: consolidate tcp_mark_skb_lost and tcp_skb_mark_lostYuchung Cheng
tcp_skb_mark_lost is used by RFC6675-SACK and can easily be replaced with the new tcp_mark_skb_lost handler. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25tcp: simplify tcp_mark_skb_lostYuchung Cheng
This patch consolidates and simplifes the loss marking logic used by a few loss detections (RACK, RFC6675, NewReno). Previously each detection uses a subset of several intertwined subroutines. This unncessary complexity has led to bugs (and fixes of bug fixes). tcp_mark_skb_lost now is the single one routine to mark a packet loss when a loss detection caller deems an skb ist lost: 1. rewind tp->retransmit_hint_skb if skb has lower sequence or all lost ones have been retransmitted. 2. book-keeping: adjust flags and counts depending on if skb was retransmitted or not. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25tcp: move tcp_mark_skb_lostYuchung Cheng
A pure refactor to move tcp_mark_skb_lost to tcp_input.c to prepare for the later loss marking consolidation. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25tcp: consistently check retransmit hintYuchung Cheng
tcp_simple_retransmit() used for path MTU discovery may not adjust the retransmit hint properly by deducting retrans_out before checking it to adjust the hint. This patch fixes this by a correct routine tcp_mark_skb_lost() already used by the RACK loss detection. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25net: bridge: mcast: remove only S,G port groups from sg_port hashNikolay Aleksandrov
We should remove a group from the sg_port hash only if it's an S,G entry. This makes it correct and more symmetric with group add. Also since *,G groups are not added to that hash we can hide a bug. Fixes: 085b53c8beab ("net: bridge: mcast: add sg_port rhashtable") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25sunrpc: simplify do_cache_cleanJ. Bruce Fields
Is it just me, or is the logic written in a slightly convoluted way? I find it a little easier to read this way. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25sunrpc: cache : Replace seq_printf with seq_putsXu Wang
seq_puts is a lot cheaper than seq_printf, so use that to print literal strings. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25SUNRPC/NFSD: Implement xdr_reserve_space_vec()Anna Schumaker
Reserving space for a large READ payload requires special handling when reserving space in the xdr buffer pages. One problem we can have is use of the scratch buffer, which is used to get a pointer to a contiguous region of data up to PAGE_SIZE. When using the scratch buffer, calls to xdr_commit_encode() shift the data to it's proper alignment in the xdr buffer. If we've reserved several pages in a vector, then this could potentially invalidate earlier pointers and result in incorrect READ data being sent to the client. I get around this by looking at the amount of space left in the current page, and never reserve more than that for each entry in the read vector. This lets us place data directly where it needs to go in the buffer pages. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25net: sunrpc: delete repeated wordsRandy Dunlap
Drop duplicate words in net/sunrpc/. Also fix "Anyone" to be "Any one". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25net: vlan: Avoid using BUG() in vlan_proto_idx()Florian Fainelli
While we should always make sure that we specify a valid VLAN protocol to vlan_proto_idx(), killing the machine when an invalid value is specified is too harsh and not helpful for debugging. All callers are capable of dealing with an error returned by vlan_proto_idx() so check the index value and propagate it accordingly. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_sk_assign() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL". Meaning it specifically takes a literal NULL. ARG_PTR_TO_BTF_ID_SOCK_COMMON does not allow a literal NULL, so another ARG type is required for this purpose and another follow-up patch can be used if there is such need. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000415.3857374-1-kafai@fb.com
2020-09-25bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_tcp_*_syncookie() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000409.3856725-1-kafai@fb.com
2020-09-25bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau
This patch changes the bpf_sk_storage_*() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64bytes packet. The stats from "kernel.bpf_stats_enabled" shows no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com
2020-09-25bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ↵Martin KaFai Lau
ARG_PTR_TO_BTF_ID_SOCK_COMMON The previous patch allows the networking bpf prog to use the bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer, e.g. "struct tcp_sock *". It allows the bpf prog to read all the fields of the tcp_sock. This patch changes the bpf_sk_release() and bpf_sk_*cgroup_id() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. For example, the following will work: sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0); if (!sk) return; tp = bpf_skc_to_tcp_sock(sk); if (!tp) { bpf_sk_release(sk); return; } lsndtime = tp->lsndtime; /* Pass tp to bpf_sk_release() will also work */ bpf_sk_release(tp); Since PTR_TO_BTF_ID could be NULL, the helper taking ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime. A btf_id of "struct sock" may not always mean a fullsock. Regardless the helper's running context may get a non-fullsock or not, considering fullsock check/handling is pretty cheap, it is better to keep the same verifier expectation on helper that takes ARG_PTR_TO_BTF_ID* will be able to handle the minisock situation. In the bpf_sk_*cgroup_id() case, it will try to get a fullsock by using sk_to_full_sk() as its skb variant bpf_sk"b"_*cgroup_id() has already been doing. bpf_sk_release can already handle minisock, so nothing special has to be done. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000356.3856047-1-kafai@fb.com
2020-09-25bpf: Enable bpf_skc_to_* sock casting helper to networking prog typeMartin KaFai Lau
There is a constant need to add more fields into the bpf_tcp_sock for the bpf programs running at tc, sock_ops...etc. A current workaround could be to use bpf_probe_read_kernel(). However, other than making another helper call for reading each field and missing CO-RE, it is also not as intuitive to use as directly reading "tp->lsndtime" for example. While already having perfmon cap to do bpf_probe_read_kernel(), it will be much easier if the bpf prog can directly read from the tcp_sock. This patch tries to do that by using the existing casting-helpers bpf_skc_to_*() whose func_proto returns a btf_id. For example, the func_proto of bpf_skc_to_tcp_sock returns the btf_id of the kernel "struct tcp_sock". These helpers are also added to is_ptr_cast_function(). It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id. That will keep the ref-tracking works properly. The bpf_skc_to_* helpers are made available to most of the bpf prog types in filter.c. The bpf_skc_to_* helpers will be limited by perfmon cap. This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON. The helper accepting this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON]) or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON). The bpf_skc_to_*() helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will accept pointer obtained from skb->sk. Instead of specifying both arg_type and arg_btf_id in the same func_proto which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the compatible_reg_types[] in verifier.c. The reason is the arg_btf_id is always the same. Discussion in this thread: https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/ The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is expecting a PTR_TO_BTF_ID which could be NULL. This is the same behavior as the existing helper taking ARG_PTR_TO_BTF_ID. The _SOCK_COMMON part means the helper is also expecting the legacy SOCK_COMMON pointer. By excluding the _OR_NULL part, the bpf prog cannot call helper with a literal NULL which doesn't make sense in most cases. e.g. bpf_skc_to_tcp_sock(NULL) will be rejected. All PTR_TO_*_OR_NULL reg has to do a NULL check first before passing into the helper or else the bpf prog will be rejected. This behavior is nothing new and consistent with the current expectation during bpf-prog-load. [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace ARG_PTR_TO_SOCK* of other existing helpers later such that those existing helpers can take the PTR_TO_BTF_ID returned by the bpf_skc_to_*() helpers. The only special case is bpf_sk_lookup_assign() which can accept a literal NULL ptr. It has to be handled specially in another follow up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ] [ When converting the older helpers that take ARG_PTR_TO_SOCK* in the later patch, if the kernel does not support BTF, ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON because no reg->type could have PTR_TO_BTF_ID in this case. It is not a concern for the newer-btf-only helper like the bpf_skc_to_*() here though because these helpers must require BTF vmlinux to begin with. ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200925000350.3855720-1-kafai@fb.com
2020-09-25Bluetooth: MGMT: Fix not checking if BT_HS is enabledLuiz Augusto von Dentz
This checks if BT_HS is enabled relecting it on MGMT_SETTING_HS instead of always reporting it as supported. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25Bluetooth: Disable High Speed by defaultLuiz Augusto von Dentz
Bluetooth High Speed requires hardware support which is very uncommon nowadays since HS has not pickup interest by the industry. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25Bluetooth: L2CAP: Fix calling sk_filter on non-socket based channelLuiz Augusto von Dentz
Only sockets will have the chan->data set to an actual sk, channels like A2MP would have its own data which would likely cause a crash when calling sk_filter, in order to fix this a new callback has been introduced so channels can implement their own filtering if necessary. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25Bluetooth: A2MP: Fix not initializing all membersLuiz Augusto von Dentz
This fixes various places where a stack variable is used uninitialized. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25Merge tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds
Pull NFS server fix from Chuck Lever: "Fix incorrect calculation on platforms that implement flush_dcache_page()" * tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6: SUNRPC: Fix svc_flush_dcache()
2020-09-25Bluetooth: Fix update of own_addr_type if ll_privacy supportedSathish Narasimman
During system powercycle when trying to get the random address hci_get_random_address set own_addr_type as 0x01. In which if we enable ll_privacy it is supposed to be 0x03. Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25xfrm: Use correct address family in xfrm_state_findHerbert Xu
The struct flowi must never be interpreted by itself as its size depends on the address family. Therefore it must always be grouped with its original family value. In this particular instance, the original family value is lost in the function xfrm_state_find. Therefore we get a bogus read when it's coupled with the wrong family which would occur with inter- family xfrm states. This patch fixes it by keeping the original family value. Note that the same bug could potentially occur in LSM through the xfrm_state_pol_flow_match hook. I checked the current code there and it seems to be safe for now as only secid is used which is part of struct flowi_common. But that API should be changed so that so that we don't get new bugs in the future. We could do that by replacing fl with just secid or adding a family field. Reported-by: syzbot+577fbac3145a6eb2e7a5@syzkaller.appspotmail.com Fixes: 48b8d78315bf ("[XFRM]: State selection update to use inner...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-24net: tcp: drop unused function argument from mptcp_incoming_optionsFlorian Westphal
Since commit cfde141ea3faa30e ("mptcp: move option parsing into mptcp_incoming_options()"), the 3rd function argument is no longer used. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24tcp: skip DSACKs with dubious sequence rangesPriyaranjan Jha
Currently, we use length of DSACKed range to compute number of delivered packets. And if sequence range in DSACK is corrupted, we can get bogus dsacked/acked count, and bogus cwnd. This patch put bounds on DSACKed range to skip update of data delivery and spurious retransmission information, if the DSACK is unlikely caused by sender's action: - DSACKed range shouldn't be greater than maximum advertised rwnd. - Total no. of DSACKed segments shouldn't be greater than total no. of retransmitted segs. Unlike spurious retransmits, network duplicates or corrupted DSACKs shouldn't be counted as delivery. Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24net/tls: race causes kernel panicRohit Maheshwari
BUG: kernel NULL pointer dereference, address: 00000000000000b8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 80000008b6fef067 P4D 80000008b6fef067 PUD 8b6fe6067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 12 PID: 23871 Comm: kworker/12:80 Kdump: loaded Tainted: G S 5.9.0-rc3+ #1 Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1 03/29/2018 Workqueue: events tx_work_handler [tls] RIP: 0010:tx_work_handler+0x1b/0x70 [tls] Code: dc fe ff ff e8 16 d4 a3 f6 66 0f 1f 44 00 00 0f 1f 44 00 00 55 53 48 8b 6f 58 48 8b bd a0 04 00 00 48 85 ff 74 1c 48 8b 47 28 <48> 8b 90 b8 00 00 00 83 e2 02 75 0c f0 48 0f ba b0 b8 00 00 00 00 RSP: 0018:ffffa44ace61fe88 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff91da9e45cc30 RCX: dead000000000122 RDX: 0000000000000001 RSI: ffff91da9e45cc38 RDI: ffff91d95efac200 RBP: ffff91da133fd780 R08: 0000000000000000 R09: 000073746e657665 R10: 8080808080808080 R11: 0000000000000000 R12: ffff91dad7d30700 R13: ffff91dab6561080 R14: 0ffff91dad7d3070 R15: ffff91da9e45cc38 FS: 0000000000000000(0000) GS:ffff91dad7d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b8 CR3: 0000000906478003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: process_one_work+0x1a7/0x370 worker_thread+0x30/0x370 ? process_one_work+0x370/0x370 kthread+0x114/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 tls_sw_release_resources_tx() waits for encrypt_pending, which can have race, so we need similar changes as in commit 0cada33241d9de205522e3858b18e506ca5cce2c here as well. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24mptcp: retransmit ADD_ADDR when timeoutGeliang Tang
This patch implemented the retransmition of ADD_ADDR when no ADD_ADDR echo is received. It added a timer with the announced address. When timeout occurs, ADD_ADDR will be retransmitted. Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24mptcp: add sk_stop_timer_sync helperGeliang Tang
This patch added a new helper sk_stop_timer_sync, it deactivates a timer like sk_stop_timer, but waits for the handler to finish. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24mptcp: add struct mptcp_pm_add_entryGeliang Tang
Add a new struct mptcp_pm_add_entry to describe add_addr's entry. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>