summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2019-08-21xprtrdma: Remove rpcrdma_buffer::rb_mrlockChuck Lever
Clean up: Now that the free list is used sparingly, get rid of the separate spin lock protecting it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Cache free MRs in each rpcrdma_reqChuck Lever
Instead of a globally-contended MR free list, cache MRs in each rpcrdma_req as they are released. This means acquiring and releasing an MR will be lock-free in the common case, even outside the transport send lock. The original idea of per-rpcrdma_req MR free lists was suggested by Shirley Ma <shirley.ma@oracle.com> several years ago. I just now figured out how to make that idea work with on-demand MR allocation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xdp: xdp_umem: replace kmap on vmap for umem mapIvan Khoronzhuk
For 64-bit there is no reason to use vmap/vunmap, so use page_address as it was initially. For 32 bits, in some apps, like in samples xdpsock_user.c when number of pgs in use is quite big, the kmap memory can be not enough, despite on this, kmap looks like is deprecated in such cases as it can block and should be used rather for dynamic mm. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21mac80211: minstrel_ht: improve rate probing for devices with static fallbackFelix Fietkau
On some devices that only support static rate fallback tables sending rate control probing packets can be really expensive. Probing lower rates can already hurt throughput quite a bit. What hurts even more is the fact that on mt76x0/mt76x2, single probing packets can only be forced by directing packets at a different internal hardware queue, which causes some heavy reordering and extra latency. The reordering issue is mainly problematic while pushing lots of packets to a particular station. If there is little activity, the overhead of probing is neglegible. The static fallback behavior is designed to pretty much only handle rate control algorithms that use only a very limited set of rates on which the algorithm switches up/down based on packet error rate. In order to better support that kind of hardware, this patch implements a different approach to rate probing where it switches to a slightly higher rate, waits for tx status feedback, then updates the stats and switches back to the new max throughput rate. This only triggers above a packet rate of 100 per stats interval (~50ms). For that kind of probing, the code has to reduce the set of probing rates a lot more compared to single packet probing, so it uses only one packet per MCS group which is either slightly faster, or as close as possible to the max throughput rate. This allows switching between similar rates with different numbers of streams. The algorithm assumes that the hardware will work its way lower within an MCS group in case of retransmissions, so that lower rates don't have to be probed by the high packets per second rate probing code. To further reduce the search space, it also does not probe rates with lower channel bandwidth than the max throughput rate. At the moment, these changes will only affect mt76x0/mt76x2. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-4-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: fix default max throughput rate indexesFelix Fietkau
Use the first supported rate instead of 0 (which can be invalid) Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-3-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: reduce unnecessary rate probing attemptsFelix Fietkau
On hardware with static fallback tables (e.g. mt76x2), rate probing attempts can be very expensive. On such devices, avoid sampling rates slower than the per-group max throughput rate, based on the assumption that the fallback table will take care of probing lower rates within that group if the higher rates fail. To further reduce unnecessary probing attempts, skip duplicate attempts on rates slower than the max throughput rate. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: fix per-group max throughput rate initializationFelix Fietkau
The group number needs to be multiplied by the number of rates per group to get the full rate index Fixes: 5935839ad735 ("mac80211: improve minstrel_ht rate sorting by throughput & probability") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21nl80211: Add support for EDMG channelsAlexei Avshalom Lazar
802.11ay specification defines Enhanced Directional Multi-Gigabit (EDMG) STA and AP which allow channel bonding of 2 channels and more. Introduce new NL attributes that are needed for enabling and configuring EDMG support. Two new attributes are used by kernel to publish driver's EDMG capabilities to the userspace: NL80211_BAND_ATTR_EDMG_CHANNELS - bitmap field that indicates the 2.16 GHz channel(s) that are supported by the driver. When this attribute is not set it means driver does not support EDMG. NL80211_BAND_ATTR_EDMG_BW_CONFIG - represent the channel bandwidth configurations supported by the driver. Additional two new attributes are used by the userspace for connect command and for AP configuration: NL80211_ATTR_WIPHY_EDMG_CHANNELS NL80211_ATTR_WIPHY_EDMG_BW_CONFIG New rate info flag - RATE_INFO_FLAGS_EDMG, can be reported from driver and used for bitrate calculation that will take into account EDMG according to the 802.11ay specification. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Link: https://lore.kernel.org/r/1566138918-3823-2-git-send-email-ailizaro@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: fix possible NULL pointerderef in obss pd codeJohn Crispin
he_spr_ie_elem is dereferenced before the NULL check. fix this by moving the assignment after the check. fixes commit 697f6c507c74 ("mac80211: propagate HE operation info into bss_conf") This was reported by the static code checker. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190813070712.25509-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: add assoc-at supportBen Greear
Report timestamp for when sta becomes associated. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20190809180001.26393-2-greearb@candelatech.com [fix ktime_get_boot_ns() to ktime_get_boottime_ns(), assoc_at type to u64] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: Support assoc-at timer in sta-infoBen Greear
Report timestamp of when sta became associated. This is the boottime clock, units are nano-seconds. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20190809180001.26393-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: apply same mandatory rate flags for 5GHz and 6GHzArend van Spriel
For the new 6GHz band the same rules apply for mandatory rates so add it to set_mandatory_flags_band() function. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-9-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: ibss: use 11a mandatory rates for 6GHz band operationArend van Spriel
The default mandatory rates, ie. when not specified by user-space, is determined by the band. Select 11a rateset for 6GHz band. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-8-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: use same IR permissive rules for 6GHz bandArend van Spriel
The function cfg80211_ir_permissive_chan() is applicable for 6GHz band as well so make sure it is handled. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-7-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: add 6GHz in code handling array with NUM_NL80211_BANDS entriesArend van Spriel
In nl80211.c there is a policy for all bands in NUM_NL80211_BANDS and in trace.h there is a callback trace for multicast rates which is per band in NUM_NL80211_BANDS. Both need to be extended for the new NL80211_BAND_6GHZ. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-6-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: extend ieee80211_operating_class_to_band() for 6GHzArend van Spriel
Add 6GHz operating class range as defined in 802.11ax D4.1 Annex E. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-5-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: util: add 6GHz channel to freq conversion and vice versaArend van Spriel
Extend the functions ieee80211_channel_to_frequency() and ieee80211_frequency_to_channel() to support 6GHz band according specification in 802.11ax D4.1 27.3.22.2. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-4-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: add 6GHz UNII band definitionsArend van Spriel
For the new 6GHz there are new UNII band definitions as listed in the FCC notice [1]. [1] https://docs.fcc.gov/public/attachments/FCC-18-147A1_Rcd.pdf Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-3-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21nl80211: add 6GHz band definition to enum nl80211_bandArend van Spriel
In the 802.11ax specification a new band is introduced, which is also proposed by FCC for unlicensed use. This band is referred to as 6GHz spanning frequency range from 5925 to 7125 MHz. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-2-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21Revert "cfg80211: fix processing world regdomain when non modular"Hodaszi, Robert
This reverts commit 96cce12ff6e0 ("cfg80211: fix processing world regdomain when non modular"). Re-triggering a reg_process_hint with the last request on all events, can make the regulatory domain fail in case of multiple WiFi modules. On slower boards (espacially with mdev), enumeration of the WiFi modules can end up in an intersected regulatory domain, and user cannot set it with 'iw reg set' anymore. This is happening, because: - 1st module enumerates, queues up a regulatory request - request gets processed by __reg_process_hint_driver(): - checks if previous was set by CORE -> yes - checks if regulator domain changed -> yes, from '00' to e.g. 'US' -> sends request to the 'crda' - 2nd module enumerates, queues up a regulator request (which triggers the reg_todo() work) - reg_todo() -> reg_process_pending_hints() sees, that the last request is not processed yet, so it tries to process it again. __reg_process_hint driver() will run again, and: - checks if the last request's initiator was the core -> no, it was the driver (1st WiFi module) - checks, if the previous initiator was the driver -> yes - checks if the regulator domain changed -> yes, it was '00' (set by core, and crda call did not return yet), and should be changed to 'US' ------> __reg_process_hint_driver calls an intersect Besides, the reg_process_hint call with the last request is meaningless since the crda call has a timeout work. If that timeout expires, the first module's request will lost. Cc: stable@vger.kernel.org Fixes: 96cce12ff6e0 ("cfg80211: fix processing world regdomain when non modular") Signed-off-by: Robert Hodaszi <robert.hodaszi@digi.com> Link: https://lore.kernel.org/r/20190614131600.GA13897@a1-hr Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: add missing length field increment when generating Radiotap headerJohn Crispin
The code generating the Tx Radiotap header when using tx_status_ext was missing a field increment after setting the VHT bandwidth. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-4-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: 80Mhz was not reported properly when using tx_status_extJohn Crispin
When reporting 80MHz, we need to set 4 and not 2 inside the corresponding field inside the Tx Radiotap header. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-3-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: fix bad guard when reporting legacy ratesJohn Crispin
When reporting legacy rates inside the TX Radiotap header we need to split the check between "uses tx_statua_ext" and "is legacy rate". Not doing so would make the code drop into the !tx_status_ext path. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-2-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: fix TX legacy rate reporting when tx_status_ext is usedJohn Crispin
The RX Radiotap header length was not calculated properly when reporting legacy rates using tx_status_ext. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190807075949.32414-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: Fix Extended Key ID key install checksAlexander Wetzel
Fix two shortcomings in the Extended Key ID API: 1) Allow the userspace to install pairwise keys using keyid 1 without NL80211_KEY_NO_TX set. This allows the userspace to install and activate pairwise keys with keyid 1 in the same way as for keyid 0, simplifying the API usage for e.g. FILS and FT key installs. 2) IEEE 802.11 - 2016 restricts Extended Key ID usage to CCMP/GCMP ciphers in IEEE 802.11 - 2016 "9.4.2.25.4 RSN capabilities". Enforce that when installing a key. Cc: stable@vger.kernel.org # 5.2 Fixes: 6cdd3979a2bd ("nl80211/cfg80211: Extended Key ID support") Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Link: https://lore.kernel.org/r/20190805123400.51567-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: fix possible sta leakJohannes Berg
If TDLS station addition is rejected, the sta memory is leaked. Avoid this by moving the check before the allocation. Cc: stable@vger.kernel.org Fixes: 7ed5285396c2 ("mac80211: don't initiate TDLS connection if station is not associated to AP") Link: https://lore.kernel.org/r/20190801073033.7892-1-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-20xprtrdma: Ensure creating an MR does not trigger FS writebackChuck Lever
Probably would be good to also pass GFP flags to ib_alloc_mr. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Move rpcrdma_mr_get out of frwr_mapChuck Lever
Refactor: Retrieve an MR and handle error recovery entirely in rpc_rdma.c, as this is not a device-specific function. Note that since commit 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit queue"), the xprt_transmit function handles the cond_resched. The transport no longer has to do this itself. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Combine rpcrdma_mr_put and rpcrdma_mr_unmap_and_putChuck Lever
Clean up. There is only one remaining rpcrdma_mr_put call site, and it can be directly replaced with unmap_and_put because mr->mr_dir is set to DMA_NONE just before the call. Now all the call sites do a DMA unmap, and we can just rename mr_unmap_and_put to mr_put, which nicely matches mr_get. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Simplify rpcrdma_mr_popChuck Lever
Clean up: rpcrdma_mr_pop call sites check if the list is empty first. Let's replace the list_empty with less costly logic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20net: fix __ip_mc_inc_group usageLi RongQing
in ip_mc_inc_group, memory allocation flag, not mcast mode, is expected by __ip_mc_inc_group similar issue in __ip_mc_join_group, both mcase mode and gfp_t are needed here, so use ____ip_mc_inc_group(...) Fixes: 9fb20801dab4 ("net: Fix ip_mc_{dec,inc}_group allocation context") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20net/ncsi: Ensure 32-bit boundary for data cksumTerry S. Duncan
The NCSI spec indicates that if the data does not end on a 32 bit boundary, one to three padding bytes equal to 0x00 shall be present to align the checksum field to a 32-bit boundary. Signed-off-by: Terry S. Duncan <terry.s.duncan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20net: dsa: enable and disable all portsVivien Didelot
Call the .port_enable and .port_disable functions for all ports, not only the user ports, so that drivers may optimize the power consumption of all ports after a successful setup. Unused ports are now disabled on setup. CPU and DSA ports are now enabled on setup and disabled on teardown. User ports were already enabled at slave creation and disabled at slave destruction. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20net: dsa: use a single switch statement for port setupVivien Didelot
It is currently difficult to read the different steps involved in the setup and teardown of ports in the DSA code. Keep it simple with a single switch statement for each port type: UNUSED, CPU, DSA, or USER. Also no need to call devlink_port_unregister from within dsa_port_setup as this step is inconditionally handled by dsa_port_teardown on error. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20net/smc: make sure EPOLLOUT is raisedJason Baron
Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE being set when -EAGAIN is returned to ensure that EPOLLOUT is raised. Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior that Eric Dumazet introduced for tcp sockets. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Ursula Braun <ubraun@linux.ibm.com> Cc: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20xprtrdma: Toggle XPRT_CONGESTED in xprtrdma's slot methodsChuck Lever
Commit 48be539dd44a ("xprtrdma: Introduce ->alloc_slot call-out for xprtrdma") added a separate alloc_slot and free_slot to the RPC/RDMA transport. Later, commit 75891f502f5f ("SUNRPC: Support for congestion control when queuing is enabled") modified the generic alloc/free_slot methods, but neglected the methods in xprtrdma. Found via code review. Fixes: 75891f502f5f ("SUNRPC: Support for congestion control ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Rename rpcrdma_buffer::rb_allChuck Lever
Clean up: There are other "all" list heads. For code clarity distinguish this one as for use only for MRs by renaming it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Rename CQE field in Receive trace pointsChuck Lever
Make the field name the same for all trace points that handle pointers to struct rpcrdma_rep. That makes it easy to grep for matching rep points in trace output. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Boost maximum transport header sizeChuck Lever
Although I haven't seen any performance results that justify it, I've received several complaints that NFS/RDMA no longer supports a maximum rsize and wsize of 1MB. These days it is somewhat smaller. To simplify the logic that determines whether a chunk list is necessary, the implementation uses a fixed maximum size of the transport header. Currently that maximum size is 256 bytes, one quarter of the default inline threshold size for RPC/RDMA v1. Since commit a78868497c2e ("xprtrdma: Reduce max_frwr_depth"), the size of chunks is also smaller to take advantage of inline page lists in device internal MR data structures. The combination of these two design choices has reduced the maximum NFS rsize and wsize that can be used for most RNIC/HCAs. Increasing the maximum transport header size and the maximum number of RDMA segments it can contain increases the negotiated maximum rsize/wsize on common RNIC/HCAs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Fix calculation of ri_max_segs againChuck Lever
Commit 302d3deb206 ("xprtrdma: Prevent inline overflow") added this calculation back in 2016, but got it wrong. I tested only the lower bound, which is why there is a max_t there. The upper bound should be rounded up too. Now, when using DIV_ROUND_UP, that takes care of the lower bound as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Update obsolete commentChuck Lever
Comment was made obsolete by commit 8cec3dba76a4 ("xprtrdma: rpcrdma_regbuf alignment"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xprtrdma: Refresh the documenting comment in frwr_ops.cChuck Lever
Things have changed since this comment was written. In particular, the reworking of connection closing, on-demand creation of MRs, and the removal of fr_state all mean that deferring MR recovery to frwr_map is no longer needed. The description is obsolete. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xdp: unpin xdp umem pages in error pathIvan Khoronzhuk
Fix mem leak caused by missed unpin routine for umem pages. Fixes: 8aef7340ae9695 ("xsk: introduce xdp_umem_page") Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-20SUNRPC: Inline xdr_commit_encodeChuck Lever
Micro-optimization: For xdr_commit_encode call sites in net/sunrpc/xdr.c, eliminate the extra calling sequence. On my client, this change saves about a microsecond for every 30 calls to xdr_reserve_space(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20SUNRPC: Remove rpc_wake_up_queued_task_on_wq()Chuck Lever
Clean up: commit c544577daddb ("SUNRPC: Clean up transport write space handling") appears to have removed the last caller of rpc_wake_up_queued_task_on_wq(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20xfrm: policy: avoid warning splat when merging nodesFlorian Westphal
syzbot reported a splat: xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57 Call Trace: xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline] xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline] xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023 xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139 xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182 xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574 xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670 xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477 xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] There is no reproducer, however, the warning can be reproduced by adding rules with ever smaller prefixes. The sanity check ("does the policy match the node") uses the prefix value of the node before its updated to the smaller value. To fix this, update the prefix earlier. The bug has no impact on tree correctness, this is only to prevent a false warning. Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-08-19sctp: remove net sctp.x_enable working as a global switchXin Long
The netns sctp feature flags shouldn't work as a global switch, which is mostly like a firewall/netfilter's job. Also, it will break asoc as it discard or accept chunks incorrectly when net sctp.x_enable is changed after the asoc is created. Since each type of chunk's processing function will check the corresp asoc's feature flag, this 'global switch' should be removed, and net sctp.x_enable will only work as the default feature flags for the future sctp sockets/endpoints. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19sctp: add SCTP_AUTH_SUPPORTED sockoptXin Long
SCTP_AUTH_SUPPORTED sockopt is used to set enpoint's auth flag. With this feature, each endpoint will have its own flag for its future asoc's auth_capable, instead of netns auth flag. Note that when both ep's auth_enable is enabled, endpoint auth related data should be initialized. If asconf_enable is also set, SCTP_CID_ASCONF/SCTP_CID_ASCONF_ACK should be added into auth_chunk_list. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19sctp: add sctp_auth_init and sctp_auth_freeXin Long
This patch is to factor out sctp_auth_init and sctp_auth_free functions, and sctp_auth_init will also be used in the next patch for SCTP_AUTH_SUPPORTED sockopt. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-19sctp: use ep and asoc auth_enable properlyXin Long
sctp has per endpoint auth flag and per asoc auth flag, and the asoc one should be checked when coming to asoc and the endpoint one should be checked when coming to endpoint. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>