summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-05-29netfilter: nf_tables: add and use nft_sk helperFlorian Westphal
This allows to change storage placement later on without changing readers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: reduce size of nf_hook_state on 32bit platformsFlorian Westphal
Reduce size from 28 to 24 bytes on 32bit platforms. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: x_tables: reduce xt_action_param by 8 byteFlorian Westphal
The fragment offset in ipv4/ipv6 is a 16bit field, so use u16 instead of unsigned int. On 64bit: 40 bytes to 32 bytes. By extension this also reduces nft_pktinfo (56 to 48 byte). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: use nfnetlink_unicast()Pablo Neira Ayuso
Replace netlink_unicast() calls by nfnetlink_unicast() which already deals with translating EAGAIN to ENOBUFS as the nfnetlink core expects. nfnetlink_unicast() calls nlmsg_unicast() which returns zero in case of success, otherwise the netlink core function netlink_rcv_skb() turns err > 0 into an acknowlegment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: xt_CT: Remove redundant assignment to retYang Li
Variable 'ret' is set to zero but this value is never read as it is overwritten with a new value later on, hence it is a redundant assignment and can be removed Clean up the following clang-analyzer warning: net/netfilter/xt_CT.c:175:2: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores] Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: x_tables: improve limit_mt scalabilityJason Baron
We've seen this spin_lock show up high in profiles. Let's introduce a lockless version. I've tested this using pktgen_sample01_simple.sh. Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: Remove leading spaces in KconfigJuerg Haefliger
Remove leading spaces before tabs in Kconfig file(s) by running the following command: $ find net/netfilter -name 'Kconfig*' | xargs sed -r -i 's/^[ ]+\t/\t/' Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29netfilter: nf_tables: prefer direct calls for set lookupsFlorian Westphal
Extend nft_set_do_lookup() to use direct calls when retpoline feature is enabled. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28r8169: Fix fall-through warning for ClangGustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20210528202327.GA39994@embeddedor Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: stmmac: the XPCS obscures a potential "PHY not found" errorVladimir Oltean
stmmac_mdio_register() has logic to search for PHYs on the MDIO bus and assign them IRQ lines, as well as to set priv->plat->phy_addr. If no PHY is found, the "found" variable remains set to 0 and the function errors out. After the introduction of commit f213bbe8a9d6 ("net: stmmac: Integrate it with DesignWare XPCS"), the "found" variable was immediately reused for searching for a PCS on the same MDIO bus. This can result in 2 types of potential problems (none of them seems to be seen on the only Intel system that sets has_xpcs = true, otherwise it would have been reported): 1. If a PCS is found but a PHY is not, then the code happily exits with no error. One might say "yes, but this is not possible, because of_mdiobus_register will probe a PHY for all MDIO addresses, including for the XPCS, so if an XPCS exists, then a PHY certainly exists too". Well, that is not true, see intel_mgbe_common_data(): /* Ensure mdio bus scan skips intel serdes and pcs-xpcs */ plat->mdio_bus_data->phy_mask = 1 << INTEL_MGBE_ADHOC_ADDR; plat->mdio_bus_data->phy_mask |= 1 << INTEL_MGBE_XPCS_ADDR; 2. A PHY is found but an MDIO device with the XPCS PHY ID isn't, and in that case, the error message will be "No PHY found". Confusing. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20210527155959.3270478-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: st95hf: mark ACPI and OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or ACPI ID tables. If one configuration is disabled, the table will be unused: drivers/nfc/st95hf/core.c:1059:34: warning: ‘st95hf_spi_of_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-12-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: st21nfca: mark ACPI and OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or ACPI ID tables. If one configuration is disabled, the table will be unused: drivers/nfc/st21nfca/i2c.c:593:34: warning: ‘of_st21nfca_i2c_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-11-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: st-nci: mark ACPI and OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or ACPI ID tables. If one configuration is disabled, the table will be unused: drivers/nfc/st-nci/spi.c:296:34: warning: ‘of_st_nci_spi_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-10-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: pn544: mark ACPI and OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or ACPI ID tables. If one configuration is disabled, the table will be unused: drivers/nfc/pn544/i2c.c:53:36: warning: ‘pn544_hci_i2c_acpi_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-9-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: s3fwrn5: mark OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or I2C ID tables. If OF is disabled, the table will be unused: drivers/nfc/s3fwrn5/i2c.c:265:34: warning: ‘of_s3fwrn5_i2c_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-8-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: pn533: mark OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or I2C ID tables. If OF is disabled, the table will be unused: drivers/nfc/pn533/i2c.c:252:34: warning: ‘of_pn533_i2c_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-7-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: mrvl: skip impossible NCI_MAX_PAYLOAD_SIZE checkKrzysztof Kozlowski
The nci_ctrl_hdr.plen field us u8, so checkign if it is bigger than NCI_MAX_PAYLOAD_SIZE does not make any sense. Fix warning reported by Smatch: drivers/nfc/nfcmrvl/i2c.c:52 nfcmrvl_i2c_read() warn: impossible condition '(nci_hdr.plen > 255) => (0-255 > 255)' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-6-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: mrvl: mark OF device ID tables as maybe unusedKrzysztof Kozlowski
The driver can match either via OF or I2C ID tables. If OF is disabled, the table will be unused: drivers/nfc/nfcmrvl/spi.c:199:34: warning: ‘of_nfcmrvl_spi_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-5-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: pn533: drop of_match_ptr from device ID tableKrzysztof Kozlowski
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it might be not relevant here). This fixes compile warning (!CONFIG_OF): drivers/nfc/pn533/i2c.c:252:34: warning: ‘of_pn533_i2c_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-4-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: port100: correct kerneldoc for structureKrzysztof Kozlowski
The port100_in_rf_setting structure does not contain valid kerneldoc docummentation, unlike the port100_tg_rf_setting structure. Correct the kerneldoc to fix W=1 warnings: warning: This comment starts with '/**', but isn't a kernel-doc comment. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-3-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: fdp: drop ACPI_PTR from device ID tableKrzysztof Kozlowski
The driver can match only via the ACPI ID table so the table should be always used and the ACPI_PTR does not have any sense. This fixes fixes compile warning (!CONFIG_ACPI): drivers/nfc/fdp/i2c.c:362:36: warning: ‘fdp_nci_i2c_acpi_match’ defined but not used [-Wunused-const-variable=] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-2-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28nfc: fdp: correct kerneldoc for structureKrzysztof Kozlowski
Since structure comments are not kerneldoc, remove the double ** to fix W=1 warnings: warning: This comment starts with '/**', but isn't a kernel-doc comment. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210528124200.79655-1-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28samples: pktgen: add UDP tx checksum supportLorenzo Bianconi
Introduce k parameter in pktgen samples in order to toggle UDP tx checksum Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/cf16417902062c6ea2fd3c79e00510e36a40c31a.1622210713.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28Merge branch 'net-hdlc_fr-clean-up-some-code-style-issues'Jakub Kicinski
Peng Li says: ==================== net: hdlc_fr: clean up some code style issues V1 -> V2: 1, Use appropriate commit prefix suggested by Jakub Kicinski, replace commit prefix "net: wan" by "net: hdlc_fr". ==================== Link: https://lore.kernel.org/r/1622160769-6678-1-git-send-email-huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: remove unnecessary out of memory messagePeng Li
This patch removes unnecessary out of memory message, to fix the following checkpatch.pl warning: "WARNING: Possible unnecessary 'out of memory' message" Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: remove redundant braces {}Peng Li
This patch removes redundant braces {}, to fix the checkpatch.pl warning: "braces {} are not necessary for any arm of this statement" Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: add braces {} to all arms of the statementPeng Li
Braces {} should be used on all arms of this statement. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: remove space after '!'Peng Li
According to the chackpatch.pl, space prohibited after that '!'. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: code indent use tabs where possiblePeng Li
Code indent should use tabs where possible. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: move out assignment in if conditionPeng Li
Should not use assignment in if condition. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: add some required spacesPeng Li
Add spaces required after that close brace '}'. Add spaces required before the open parenthesis '('. Add spaces required after that ','. Add spaces required around that '='. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: fix an code style issue about "foo* bar"Peng Li
Fix the checkpatch error as "foo* bar" and should be "foo *bar", and "(foo*)" should be "(foo *)". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: add blank line after declarationsPeng Li
This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28net: hdlc_fr: remove redundant blank linesPeng Li
This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28Merge branch 'mptcp-miscellaneous-cleanup'Jakub Kicinski
Mat Martineau says: ==================== mptcp: Miscellaneous cleanup Here are some cleanup patches we've collected in the MPTCP tree. Patches 1-4 do some general tidying. Patch 5 adds an explicit check at netlink command parsing time to require a port number when the 'signal' flag is set, to catch the error earlier. Patches 6 & 7 fix up the MPTCP 'enabled' sysctl, enforcing it as a boolean value, and ensuring that the !CONFIG_SYSCTL build still works after the boolean change. ==================== Link: https://lore.kernel.org/r/20210527235430.183465-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: restrict values of 'enabled' sysctlMatthieu Baerts
To avoid confusions, it seems better to parse this sysctl parameter as a boolean. We use it as a boolean, no need to parse an integer and bring confusions if we see a value different from 0 and 1, especially with this parameter name: enabled. It seems fine to do this modification because the default value is 1 (enabled). Then the only other interesting value to set is 0 (disabled). All other values would not have changed the default behaviour. Suggested-by: Florian Westphal <fw@strlen.de> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: support SYSCTL only if enabledMatthieu Baerts
Since the introduction of the sysctl support in MPTCP with commit 784325e9f037 ("mptcp: new sysctl to control the activation per NS"), we don't check CONFIG_SYSCTL. Until now, that was not an issue: the register and unregister functions were replaced by NO-OP one if SYSCTL was not enabled in the config. The only thing we could have avoid is not to reserve memory for the table but that's for the moment only a small table per net-ns. But the following commit is going to use SYSCTL_ZERO and SYSCTL_ONE which are not be defined if SYSCTL is not enabled in the config. This causes 'undefined reference' errors from the linker. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: make sure flag signal is set when add addr with portJianguo Wu
When add address with port, it is mean to create a listening socket, and send an ADD_ADDR to remote, so it must have flag signal set, add this check in mptcp_pm_parse_addr(). Fixes: a77e9179c7651 ("mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink") Acked-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: remove redundant initialization in pm_nl_init_net()Jianguo Wu
Memory of struct pm_nl_pernet{} is allocated by kzalloc() in setup_net()->ops_init(), so it's no need to reset counters and zero bitmap in pm_nl_init_net(). Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: generate subflow hmac after mptcp_finish_join()Jianguo Wu
For outgoing subflow join, when recv SYNACK, in subflow_finish_connect(), the mptcp_finish_join() may return false in some cases, and send a RESET to remote, and no local hmac is required. So generate subflow hmac after mptcp_finish_join(). Fixes: ec3edaa7ca6c ("mptcp: Add handling of outgoing MP_JOIN requests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: using TOKEN_MAX_RETRIES instead of magic numberJianguo Wu
We have macro TOKEN_MAX_RETRIES for the number of token generate retries, so using TOKEN_MAX_RETRIES in subflow_check_req(). And rename TOKEN_MAX_RETRIES to MPTCP_TOKEN_MAX_RETRIES as it is now exposed. Fixes: 535fb8152f31 ("mptcp: token: move retry to caller") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: fix pr_debug in mptcp_token_new_connectJianguo Wu
After commit 2c5ebd001d4f ("mptcp: refactor token container"), pr_debug() is called before mptcp_crypto_key_gen_sha() in mptcp_token_new_connect(), so the output local_key, token and idsn are 0, like: MPTCP: ssk=00000000f6b3c4a2, local_key=0, token=0, idsn=0 Move pr_debug() after mptcp_crypto_key_gen_sha(). Fixes: 2c5ebd001d4f ("mptcp: refactor token container") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28Merge branch 'mptcp-fixes-for-5-13'Jakub Kicinski
Mat Martineau says: ==================== mptcp: Fixes for 5.13 These patches address two issues in MPTCP. Patch 1 fixes a locking issue affecting MPTCP-level retransmissions. Patches 2-4 improve handling of out-of-order packet arrival early in a connection, so it falls back to TCP rather than forcing a reset. Includes a selftest. ==================== Link: https://lore.kernel.org/r/20210527233140.182728-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: update selftest for fallback due to OoOPaolo Abeni
The previous commit noted that we can have fallback scenario due to OoO (or packet drop). Update the self-tests accordingly Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: do not reset MP_CAPABLE subflow on mapping errorsPaolo Abeni
When some mapping related errors occurs we close the main MPC subflow with a RST. We should instead fallback gracefully to TCP, and do the reset only for MPJ subflows. Fixes: d22f4988ffec ("mptcp: process MP_CAPABLE data option") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/192 Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: always parse mptcp options for MPC reqskPaolo Abeni
In subflow_syn_recv_sock() we currently skip options parsing for OoO packet, given that such packets may not carry the relevant MPC option. If the peer generates an MPC+data TSO packet and some of the early segments are lost or get reorder, we server will ignore the peer key, causing transient, unexpected fallback to TCP. The solution is always parsing the incoming MPTCP options, and do the fallback only for in-order packets. This actually cleans the existing code a bit. Fixes: d22f4988ffec ("mptcp: process MP_CAPABLE data option") Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28mptcp: fix sk_forward_memory corruption on retransmissionPaolo Abeni
MPTCP sk_forward_memory handling is a bit special, as such field is protected by the msk socket spin_lock, instead of the plain socket lock. Currently we have a code path updating such field without handling the relevant lock: __mptcp_retrans() -> __mptcp_clean_una_wakeup() Several helpers in __mptcp_clean_una_wakeup() will update sk_forward_alloc, possibly causing such field corruption, as reported by Matthieu. Address the issue providing and using a new variant of blamed function which explicitly acquires the msk spin lock. Fixes: 64b9cea7a0af ("mptcp: fix spurious retransmissions") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/172 Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28netfilter: add and use nft_set_do_lookup helperFlorian Westphal
Followup patch will add a CONFIG_RETPOLINE wrapper to avoid the ops->lookup() indirection cost for retpoline builds. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR stateStefano Brivio
We don't need a valid MXCSR state for the lookup routines, none of the instructions we use rely on or affect any bit in the MXCSR register. Instead of calling kernel_fpu_begin(), we can pass 0 as mask to kernel_fpu_begin_mask() and spare one LDMXCSR instruction. Commit 49200d17d27d ("x86/fpu/64: Don't FNINIT in kernel_fpu_begin()") already speeds up lookups considerably, and by dropping the MCXSR initialisation we can now get a much smaller, but measurable, increase in matching rates. The table below reports matching rates and a wild approximation of clock cycles needed for a match in a "port,net" test with 10 entries from selftests/netfilter/nft_concat_range.sh, limited to the first field, i.e. the port (with nft_set_rbtree initialisation skipped), run on a single AMD Epyc 7351 thread (2.9GHz, 512 KiB L1D$, 8 MiB L2$). The (very rough) estimation of clock cycles is obtained by simply dividing frequency by matching rate. The "cycles spared" column refers to the difference in cycles compared to the previous row, and the rate increase also refers to the previous row. Results are averages of six runs. Merely for context, I'm also reporting packet rates obtained by skipping kernel_fpu_begin() and kernel_fpu_end() altogether (which shows a very limited impact now), as well as skipping the whole lookup function, compared to simply counting and dropping all packets using the netdev hook drop (see nft_concat_range.sh for details). This workload also includes packet generation with pktgen and the receive path of veth. |matching| est. | cycles | rate | | rate | cycles | spared |increase| | (Mpps) | | | | --------------------------------------|--------|--------|--------|--------| FNINIT, LDMXCSR (before 49200d17d27d) | 5.245 | 553 | - | - | LDMXCSR only (with 49200d17d27d) | 6.347 | 457 | 96 | 21.0% | Without LDMXCSR (this patch) | 6.461 | 449 | 8 | 1.8% | -------- for reference only: ---------|--------|--------|--------|--------| Without kernel_fpu_begin() | 6.513 | 445 | 4 | 0.8% | Without actual matching (return true) | 7.649 | 379 | 66 | 17.4% | Without lookup operation (netdev drop)| 10.320 | 281 | 98 | 34.9% | The clock cycles spared by avoiding LDMXCSR appear to be in line with CPI and latency indicated in the manuals of comparable architectures: Intel Skylake (CPI: 1, latency: 7) and AMD 12h (latency: 12) -- I couldn't find this information for AMD 17h. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28netfilter: nft_exthdr: Support SCTP chunksPhil Sutter
Chunks are SCTP header extensions similar in implementation to IPv6 extension headers or TCP options. Reusing exthdr expression to find and extract field values from them is therefore pretty straightforward. For now, this supports extracting data from chunks at a fixed offset (and length) only - chunks themselves are an extensible data structure; in order to make all fields available, a nested extension search is needed. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>