summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-30wifi: iwlwifi: mvm: fix crash on queue removal for MLD API tooMiri Korenblit
The patch linked below fixes the crash on queue removal bug only for the non-MLD API. Do the same for the MLD API. Fixes: c5a976cf6a75 ("wifi: iwlwifi: modify new queue allocation command") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.527dace26147.Ia215df5833634f95688a979f39fae70c1ac4e027@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: fix "modify_mask" value in the link cmd.Miri Korenblit
This bitmap indicates what fields of the cmd got changed. A field will be ignored by the FW if the corresponding flag wasn't set. There are a few cases in which we currently set the wrong bits when sending this cmd, which caused FW asserts. Fix this by setting the correct bits in each case. Fixes: 1ab26632332e ("wifi: iwlwifi: mvm: Add an add_interface() callback for mld mode") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.19ddbee0c98d.I595abb79d0419c9a21e5234303c2c3fd5290a52a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add all missing ops to iwl_mvm_mld_opsMiri Korenblit
Add all the callbacks that are not changing with the new MLD API and register to mac80211 with the new ops. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.a2f724342522.I5d1d6a8f5f14e6275da56ea704c3c0063fee5226@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add support for post_channel_switch in MLD modeMiri Korenblit
Adjust the existing iwl_mvm_post_channel_switch() to the new MLD API and use it in the new MLD ieee80211_ops Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.fa3992f7dfd2.Ie298a9b1522e956d7b699f0432795548bc6e47f9@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: unite sta_modify_disable_tx flowsMiri Korenblit
These flows are the same in both MLD API and the current API, except for the commands that are being sent during this flows. Instead of checking each time before calling these floews what API we use and then call the correct function, call always the old one, which in turn will call the new one in case we're using the MLD API. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.5692d8dea9be.Ib1882b2c2f0b0603abc4b7d4a0ecc45cd1fbf9a7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add cancel/remain_on_channel for MLD modeMiri Korenblit
Add an MLD version of the remain_on_channel and cancel_remain_on_channel callbacks. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.b51813dbebd4.Ia25bbd63d3138e4759237ce2be0cd0436fe01c0a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: refactor iwl_mvm_roc()Miri Korenblit
This flow is almost the same for both MLD and non-MLD modes, except for some function calls. Therefore there is no reason to add an MLD version of this flow. Instead - put the parts that are unique for each mode in helper functions, and in the next patch each version of this flow will call the common part with pointers to its specific helper functions. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.61bc077a7f3c.Ia3aa81d3293792bf8f80528dbc67a711ce334b32@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add some new MLD opsMiri Korenblit
Add MLD version of bss_info_changed/switch_vif_chanctx/ config_iface_filter and conf_tx() callbacks. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.9c83c253d610.Ibf2006be9ece87896c17cb43dfe3654ac73d81ff@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add sta handling flows for MLD modeMiri Korenblit
In MLD mode we have a new STA cmd. As a result, it is also changes the flows of adding/updating/removing and handling state of a station. Add these flows. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.b5548cfd8fe3.I70f9c8f3c95e18d5c9af0a5681e0830893509531@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: add an indication that the new MLD API is usedMiri Korenblit
WE can't mix between the new MLD API and the old API. I.e. - we can't send one of the new cmds and then one of the old ones. This will cause a FW assert. So we need an indication what API should be used. We use the new API if: 1. FW supports it 2. We are registered to mac80211 with the new MLD ops Add an indication which will only be true if both conditions are true. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.5756b0907403.I0adce36d1783cce23d0e080e3c4a8953db33b515@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: sta preparation for MLOGregory Greenman
Split iwl_mvm_sta into general and link specific parts. As a first step, all link dependent parameters reside in deflink. The change was done mostly using the spatch below with some manual adjustments. @iwl_mvm_sta@ struct iwl_mvm_sta *s; identifier var = {sta_id, lq_sta, avg_energy}; @@ ( s-> - var + deflink.var ) Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.34eace06d583.I1f8c5e919a71b21030460fbdd220d42401b688b1@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: vif preparation for MLOGregory Greenman
In MLO, some fields of iwl_mvm_vif should be defined in the context of a link. Define a separate structure for these fields and add a deflink object to hold it as part of iwl_mvm_vif. Non-MLO legacy code will use only deflink object while MLO related code will use the corresponding link from the link array. It follows the strategy applied in mac80211 for introducing MLO changes. The below spatch takes care of updating all driver code to access fields separated into MLD specific data structure via deflink (need to convert all references to the fields listed in var to deflink.var and also to take care of calls like iwl_mvm_vif_from_mac80211(vif)->field). @iwl_mld_vif@ struct iwl_mvm_vif *v; struct ieee80211_vif *vv; identifier fn; identifier var = {bssid, ap_sta_id, bcast_sta, mcast_sta, beacon_stats, smps_requests, probe_resp_data, he_ru_2mhz_block, cab_queue, phy_ctxt, queue_params}; @@ ( v-> - var + deflink.var | fn(vv)-> - var + deflink.var ) Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230328104948.4896576f0a9f.Ifaf0187c96b9fe52b24bd629331165831a877691@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: nl80211: support advertising S1G capabilitiesKieran Frewen
Include S1G capabilities in netlink band info messages. Signed-off-by: Kieran Frewen <kieran.frewen@morsemicro.com> Co-developed-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Signed-off-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Link: https://lore.kernel.org/r/20230223212917.4010246-1-gilad.itzkovitch@virscient.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: S1G capabilities information element in probe requestKieran Frewen
Add the missing S1G capabilities information element to probe requests. Signed-off-by: Kieran Frewen <kieran.frewen@morsemicro.com> Co-developed-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Signed-off-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Link: https://lore.kernel.org/r/20230223032512.3848105-1-gilad.itzkovitch@virscient.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30mac80211: minstrel_ht: remove unused n_supported variableTom Rix
clang with W=1 reports net/mac80211/rc80211_minstrel_ht.c:1711:6: error: variable 'n_supported' set but not used [-Werror,-Wunused-but-set-variable] int n_supported = 0; ^ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230325132610.1334820-1-trix@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded staFelix Fietkau
Avoid potential data corruption issues caused by uninitialized driver private data structures. Reported-by: Brian Coverstone <brian@mainsequence.net> Fixes: 6a9d1b91f34d ("mac80211: add pre-RCU-sync sta removal driver operation") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230324120924.38412-3-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix flow dissection for forwarded packetsFelix Fietkau
Adjust the network header to point at the correct payload offset Fixes: 986e43b19ae9 ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230324120924.38412-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix mesh forwardingFelix Fietkau
Linearize packets (needed for forwarding A-MSDU subframes). Fixes: 986e43b19ae9 ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230324120924.38412-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix receiving mesh packets in forwarding=0 networksFelix Fietkau
When forwarding is set to 0, frames are typically sent with ttl=1. Move the ttl decrement check below the check for local receive in order to fix packet drops. Reported-by: Thomas Hühn <thomas.huehn@hs-nordhausen.de> Reported-by: Nick Hainke <vincent@systemli.org> Fixes: 986e43b19ae9 ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230326151709.17743-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix the size calculation of ieee80211_ie_len_eht_cap()Ryder Lee
Here should return the size of ieee80211_eht_cap_elem_fixed, so fix it. Fixes: 820acc810fb6 ("mac80211: Add EHT capabilities to association/probe request") Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/06c13635fc03bcff58a647b8e03e9f01a74294bd.1679935259.git.ryder.lee@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: iwlwifi: mvm: Use 64-bit division helper in ↵Nathan Chancellor
iwl_mvm_get_crosstimestamp_fw() There is a 64-bit division in iwl_mvm_get_crosstimestamp_fw(), which results in a link failure when building 32-bit architectures with clang: ld.lld: error: undefined symbol: __udivdi3 >>> referenced by ptp.c >>> drivers/net/wireless/intel/iwlwifi/mvm/ptp.o:(iwl_mvm_phc_get_crosstimestamp) in archive vmlinux.a GCC has optimizations for division by a constant that clang does not implement, so this issue is not visible when building with GCC. Use the 64-bit division helper div_u64(), which takes a u64 dividend and u32 divisor, which matches this situation and prevents the emission of a libcall for the division. Fixes: 21fb8da6ebe4 ("wifi: iwlwifi: mvm: read synced time from firmware if supported") Reported-by: Arnd Bergmann <arnd@arndb.de> Link: https://github.com/ClangBuiltLinux/linux/issues/1826 Reported-by: "kernelci.org bot" <bot@kernelci.org> Link: https://lore.kernel.org/6423173a.620a0220.3d5cc.6358@mx.google.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: fix potential null pointer dereferenceFelix Fietkau
rx->sta->amsdu_mesh_control is being passed to ieee80211_amsdu_to_8023s without checking rx->sta. Since it doesn't make sense to accept A-MSDU packets without a sta, simply add a check earlier. Fixes: 6e4c0d0460bd ("wifi: mac80211: add a workaround for receiving non-standard mesh A-MSDU") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230330090001.60750-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30wifi: mac80211: drop bogus static keywords in A-MSDU rxFelix Fietkau
These were unintentional copy&paste mistakes. Cc: stable@vger.kernel.org Fixes: 986e43b19ae9 ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20230330090001.60750-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-30x86/acpi/boot: Correct acpi_is_processor_usable() checkEric DeVolder
The logic in acpi_is_processor_usable() requires the online capable bit be set for hotpluggable CPUs. The online capable bit has been introduced in ACPI 6.3. However, for ACPI revisions < 6.3 which do not support that bit, CPUs should be reported as usable, not the other way around. Reverse the check. [ bp: Rewrite commit message. ] Fixes: e2869bd7af60 ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC") Suggested-by: Miguel Luis <miguel.luis@oracle.com> Suggested-by: Boris Ostrovsky <boris.ovstrosky@oracle.com> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: David R <david@unsolicited.net> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20230327191026.3454-2-eric.devolder@oracle.com
2023-03-30x86/ACPI/boot: Use FADT version to check support for online capableMario Limonciello
ACPI 6.3 introduced the online capable bit, and also introduced MADT version 5. Latter was used to distinguish whether the offset storing online capable could be used. However ACPI 6.2b has MADT version "45" which is for an errata version of the ACPI 6.2 spec. This means that the Linux code for detecting availability of MADT will mistakenly flag ACPI 6.2b as supporting online capable which is inaccurate as it's an ACPI 6.3 feature. Instead use the FADT major and minor revision fields to distinguish this. [ bp: Massage. ] Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable") Reported-by: Eric DeVolder <eric.devolder@oracle.com> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/943d2445-84df-d939-f578-5d8240d342cc@unsolicited.net
2023-03-30Merge branch 'fix-header-length-on-skb-merging'Paolo Abeni
Arseniy Krasnov says: ==================== fix header length on skb merging this patchset fixes appending newly arrived skbuff to the last skbuff of the socket's queue during rx path. Problem fires when we are trying to append data to skbuff which was already processed in dequeue callback at least once. Dequeue callback calls function 'skb_pull()' which changes 'skb->len'. In current implementation 'skb->len' is used to update length in header of last skbuff after new data was copied to it. This is bug, because value in header is used to calculate 'rx_bytes'/'fwd_cnt' and thus must be constant during skbuff lifetime. Here is example, we have two skbuffs: skb0 with length 10 and skb1 with length 4. 1) skb0 arrives, hdr->len == skb->len == 10, rx_bytes == 10 2) Read 3 bytes from skb0, skb->len == 7, hdr->len == 10, rx_bytes == 10 3) skb1 arrives, hdr->len == skb->len == 4, rx_bytes == 14 4) Append skb1 to skb0, skb0 now has skb->len == 11, hdr->len == 11. But value of 11 in header is invalid. 5) Read whole skb0, update rx_bytes by 11 from skb0's header. 6) At this moment rx_bytes == 3, but socket's queue is empty. This bug starts to fire since: commit 077706165717 ("virtio/vsock: don't use skbuff state to account credit") In fact, it presents before, but didn't triggered due to a little bit buggy implementation of credit calculation logic. So i'll use Fixes tag for it. I really forgot about this branch in rx path when implemented patch 077706165717. This patchset contains 3 patches: 1) Fix itself. 2) Patch with WARN_ONCE() to catch such problems in future. 3) Patch with test which triggers skb appending logic. It looks like simple test with several 'send()' and 'recv()', but it checks, that skbuff appending works ok. ==================== Link: https://lore.kernel.org/r/0683cc6e-5130-484c-1105-ef2eb792d355@sberdevices.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-30test/vsock: new skbuff appending testArseniy Krasnov
This adds test which checks case when data of newly received skbuff is appended to the last skbuff in the socket's queue. It looks like simple test with 'send()' and 'recv()', but internally it triggers logic which appends one received skbuff to another. Test checks that this feature works correctly. This test is actual only for virtio transport. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-30virtio/vsock: WARN_ONCE() for invalid state of socketArseniy Krasnov
This adds WARN_ONCE() and return from stream dequeue callback when socket's queue is empty, but 'rx_bytes' still non-zero. This allows the detection of potential bugs due to packet merging (see previous patch). Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-30virtio/vsock: fix header length on skb mergingArseniy Krasnov
This fixes appending newly arrived skbuff to the last skbuff of the socket's queue. Problem fires when we are trying to append data to skbuff which was already processed in dequeue callback at least once. Dequeue callback calls function 'skb_pull()' which changes 'skb->len'. In current implementation 'skb->len' is used to update length in header of the last skbuff after new data was copied to it. This is bug, because value in header is used to calculate 'rx_bytes'/'fwd_cnt' and thus must be not be changed during skbuff's lifetime. Bug starts to fire since: commit 077706165717 ("virtio/vsock: don't use skbuff state to account credit") It presents before, but didn't triggered due to a little bit buggy implementation of credit calculation logic. So use Fixes tag for it. Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit") Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-30tee: Pass a pointer to virt_to_page()Linus Walleij
Like the other calls in this function virt_to_page() expects a pointer, not an integer. However since many architectures implement virt_to_pfn() as a macro, this function becomes polymorphic and accepts both a (unsigned long) and a (void *). Fix this up with an explicit cast. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2023-03-30KVM: arm64: PMU: Restore the guest's EL0 event counting after migrationReiji Watanabe
Currently, with VHE, KVM enables the EL0 event counting for the guest on vcpu_load() or KVM enables it as a part of the PMU register emulation process, when needed. However, in the migration case (with VHE), the same handling is lacking, as vPMU register values that were restored by userspace haven't been propagated yet (the PMU events haven't been created) at the vcpu load-time on the first KVM_RUN (kvm_vcpu_pmu_restore_guest() called from vcpu_load() on the first KVM_RUN won't do anything as events_{guest,host} of kvm_pmu_events are still zero). So, with VHE, enable the guest's EL0 event counting on the first KVM_RUN (after the migration) when needed. More specifically, have kvm_pmu_handle_pmcr() call kvm_vcpu_pmu_restore_guest() so that kvm_pmu_handle_pmcr() on the first KVM_RUN can take care of it. Fixes: d0c94c49792c ("KVM: arm64: Restore PMU configuration on first run") Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20230329023944.2488484-1-reijiw@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-03-29Merge tag 'mlx5-updates-2023-03-28' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-03-28 Dragos Tatulea says: ==================== net/mlx5e: RX, Drop page_cache and fully use page_pool For page allocation on the rx path, the mlx5e driver has been using an internal page cache in tandem with the page pool. The internal page cache uses a queue for page recycling which has the issue of head of queue blocking. This patch series drops the internal page_cache altogether and uses the page_pool to implement everything that was done by the page_cache before: * Let the page_pool handle dma mapping and unmapping. * Use fragmented pages with fragment counter instead of tracking via page ref. * Enable skb recycling. The patch series has the following effects on the rx path: * Improved performance for the cases when there was low page recycling due to head of queue blocking in the internal page_cache. The test for this was running a single iperf TCP stream to a rx queue which is bound on the same cpu as the application. |-------------+--------+--------+------+---------| | rq type | before | after | unit | diff | |-------------+--------+--------+------+---------| | striding rq | 30.1 | 31.4 | Gbps | 4.14 % | | legacy rq | 30.2 | 33.0 | Gbps | 8.48 % | |-------------+--------+--------+------+---------| * Small XDP performance degradation. The test was is XDP drop program running on a single rx queue with small packets incoming it looks like this: |-------------+----------+----------+------+---------| | rq type | before | after | unit | diff | |-------------+----------+----------+------+---------| | striding rq | 19725449 | 18544617 | pps | -6.37 % | | legacy rq | 19879931 | 18631841 | pps | -6.70 % | |-------------+----------+----------+------+---------| This will be handled in a different patch series by adding support for multi-packet per page. * For other cases the performance is roughly the same. The above numbers were obtained on the following system: 24 core Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz 32 GB RAM ConnectX-7 single port The breakdown on the patch series is the following: * Preparations for introducing the mlx5e_frag_page struct. * Delete the mlx5e_page_cache struct. * Enable dma mapping from page_pool. * Enable skb recycling and fragment counting. * Do deferred release of pages (just before alloc) to ensure better page_pool cache utilization. ==================== * tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats net/mlx5e: RX, Break the wqe bulk refill in smaller chunks net/mlx5e: RX, Increase WQE bulk size for legacy rq net/mlx5e: RX, Split off release path for xsk buffers for legacy rq net/mlx5e: RX, Defer page release in legacy rq for better recycling net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags net/mlx5e: RX, Defer page release in striding rq for better recycling net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name net/mlx5e: RX, Enable skb page recycling through the page_pool net/mlx5e: RX, Enable dma map and sync from page_pool allocator net/mlx5e: RX, Remove internal page_cache net/mlx5e: RX, Store SHAMPO header pages in array net/mlx5e: RX, Remove alloc unit layout constraint for striding rq net/mlx5e: RX, Remove alloc unit layout constraint for legacy rq net/mlx5e: RX, Remove mlx5e_alloc_unit argument in page allocation ==================== Link: https://lore.kernel.org/r/20230328205623.142075-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29Merge branch 'bnxt_en-3-bug-fixes'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: 3 Bug fixes This series contains 3 small bug fixes covering ethtool self test, PCI ID string typos, and some missing 200G link speed ethtool reporting logic. ==================== Link: https://lore.kernel.org/r/20230329013021.5205-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29bnxt_en: Add missing 200G link speed reportingMichael Chan
bnxt_fw_to_ethtool_speed() is missing the case statement for 200G link speed reported by firmware. As a result, ethtool will report unknown speed when the firmware reports 200G link speed. Fixes: 532262ba3b84 ("bnxt_en: ethtool: support PAM4 link speeds up to 200G") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29bnxt_en: Fix typo in PCI id to device description string mappingKalesh AP
Fix 57502 and 57508 NPAR description string entries. The typos caused these devices to not match up with lspci output. Fixes: 49c98421e6ab ("bnxt_en: Add PCI IDs for 57500 series NPAR devices.") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29bnxt_en: Fix reporting of test result in ethtool selftestKalesh AP
When the selftest command fails, driver is not reporting the failure by updating the "test->flags" when bnxt_close_nic() fails. Fixes: eb51365846bc ("bnxt_en: Add basic ethtool -t selftest support.") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29i40e: fix registers dump after run ethtool adapter self testRadoslaw Tyl
Fix invalid registers dump from ethtool -d ethX after adapter self test by ethtool -t ethY. It causes invalid data display. The problem was caused by overwriting i40e_reg_list[].elements which is common for ethtool self test and dump. Fixes: 22dd9ae8afcc ("i40e: Rework register diagnostic") Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230328172659.3906413-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-28 (ice) This series contains updates to ice driver only. Jesse fixes mismatched header documentation reported when building with W=1. Brett restricts setting of VSI context to only applicable fields for the given ICE_AQ_VSI_PROP_Q_OPT_VALID bit. Junfeng adds check when adding Flow Director filters that conflict with existing filter rules. Jakob Koschel adds interim variable for iterating to prevent possible misuse after looping. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg() ice: add profile conflict check for AVF FDIR ice: Fix ice_cfg_rdma_fltr() to only update relevant fields ice: fix W=1 headers mismatch ==================== Link: https://lore.kernel.org/r/20230328172035.3904953-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29Merge tag 'ieee802154-for-net-2023-03-29' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan Stefan Schmidt says: ==================== ieee802154 for net 2023-03-29 Two small fixes this time. Dongliang Mu removed an unnecessary null pointer check. Harshit Mogalapalli fixed an int comparison unsigned against signed from a recent other fix in the ca8210 driver. * tag 'ieee802154-for-net-2023-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan: net: ieee802154: remove an unnecessary null pointer check ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx() ==================== Link: https://lore.kernel.org/r/20230329064541.2147400-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29net: ena: removed unused tx_bytes variableSimon Horman
clang 16.0.0 with W=1 reports: drivers/net/ethernet/amazon/ena/ena_netdev.c:1901:6: error: variable 'tx_bytes' set but not used [-Werror,-Wunused-but-set-variable] u32 tx_bytes = 0; The variable is not used so remove it. Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230328151958.410687-1-horms@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29octeon_ep: unlock the correct lock on error pathDan Carpenter
The h and the f letters are swapped so it unlocks the wrong lock. Fixes: 577f0d1b1c5f ("octeon_ep: add separate mailbox command and response queues") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/251aa2a2-913e-4868-aac9-0a90fc3eeeda@kili.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29bnx2x: use the right build_skb() helperJakub Kicinski
build_skb() no longer accepts slab buffers. Since slab use is fairly uncommon we prefer the drivers to call a separate slab_build_skb() function appropriately. bnx2x uses the old semantics where size of 0 meant buffer from slab. It sets the fp->rx_frag_size to 0 for MTUs which don't fit in a page. It needs to call slab_build_skb(). This fixes the WARN_ONCE() of incorrect API use seen with bnx2x. Reported-by: Thomas Voegtle <tv@lio96.de> Link: https://lore.kernel.org/all/b8f295e4-ba57-8bfb-7d9c-9d62a498a727@lio96.de/ Fixes: ce098da1497c ("skbuff: Introduce slab_build_skb()") Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230329000013.2734957-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29net: ipa: compute DMA pool size properlyAlex Elder
In gsi_trans_pool_init_dma(), the total size of a pool of memory used for DMA transactions is calculated. However the calculation is done incorrectly. For 4KB pages, this total size is currently always more than one page, and as a result, the calculation produces a positive (though incorrect) total size. The code still works in this case; we just end up with fewer DMA pool entries than we intended. Bjorn Andersson tested booting a kernel with 16KB pages, and hit a null pointer derereference in sg_alloc_append_table_from_pages(), descending from gsi_trans_pool_init_dma(). The cause of this was that a 16KB total size was going to be allocated, and with 16KB pages the order of that allocation is 0. The total_size calculation yielded 0, which eventually led to the crash. Correcting the total_size calculation fixes the problem. Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com> Tested-by: Bjorn Andersson <quic_bjorande@quicinc.com> Fixes: 9dd441e4ed57 ("soc: qcom: ipa: GSI transactions") Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230328162751.2861791-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29ptp: add ToD device driver for Intel FPGA cardsTianfei Zhang
Adding a DFL (Device Feature List) device driver of ToD device for Intel FPGA cards. The Intel FPGA Time of Day(ToD) IP within the FPGA DFL bus is exposed as PTP Hardware clock(PHC) device to the Linux PTP stack to synchronize the system clock to its ToD information using phc2sys utility of the Linux PTP stack. The DFL is a hardware List within FPGA, which defines a linked list of feature headers within the device MMIO space to provide an extensible way of adding subdevice features. Signed-off-by: Raghavendra Khadatare <raghavendrax.anand.khadatare@intel.com> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20230328142455.481146-1-tianfei.zhang@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30nvme-tcp: fix a possible UAF when failing to allocate an io queueSagi Grimberg
When we allocate a nvme-tcp queue, we set the data_ready callback before we actually need to use it. This creates the potential that if a stray controller sends us data on the socket before we connect, we can trigger the io_work and start consuming the socket. In this case reported: we failed to allocate one of the io queues, and as we start releasing the queues that we already allocated, we get a UAF [1] from the io_work which is running before it should really. Fix this by setting the socket ops callbacks only before we start the queue, so that we can't accidentally schedule the io_work in the initialization phase before the queue started. While we are at it, rename nvme_tcp_restore_sock_calls to pair with nvme_tcp_setup_sock_ops. [1]: [16802.107284] nvme nvme4: starting error recovery [16802.109166] nvme nvme4: Reconnecting in 10 seconds... [16812.173535] nvme nvme4: failed to connect socket: -111 [16812.173745] nvme nvme4: Failed reconnect attempt 1 [16812.173747] nvme nvme4: Reconnecting in 10 seconds... [16822.413555] nvme nvme4: failed to connect socket: -111 [16822.413762] nvme nvme4: Failed reconnect attempt 2 [16822.413765] nvme nvme4: Reconnecting in 10 seconds... [16832.661274] nvme nvme4: creating 32 I/O queues. [16833.919887] BUG: kernel NULL pointer dereference, address: 0000000000000088 [16833.920068] nvme nvme4: Failed reconnect attempt 3 [16833.920094] #PF: supervisor write access in kernel mode [16833.920261] nvme nvme4: Reconnecting in 10 seconds... [16833.920368] #PF: error_code(0x0002) - not-present page [16833.921086] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [16833.921191] RIP: 0010:_raw_spin_lock_bh+0x17/0x30 ... [16833.923138] Call Trace: [16833.923271] <TASK> [16833.923402] lock_sock_nested+0x1e/0x50 [16833.923545] nvme_tcp_try_recv+0x40/0xa0 [nvme_tcp] [16833.923685] nvme_tcp_io_work+0x68/0xa0 [nvme_tcp] [16833.923824] process_one_work+0x1e8/0x390 [16833.923969] worker_thread+0x53/0x3d0 [16833.924104] ? process_one_work+0x390/0x390 [16833.924240] kthread+0x124/0x150 [16833.924376] ? set_kthread_struct+0x50/0x50 [16833.924518] ret_from_fork+0x1f/0x30 [16833.924655] </TASK> Reported-by: Yanjun Zhang <zhangyanjun@cestc.cn> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Yanjun Zhang <zhangyanjun@cestc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-03-29selftests/bpf: Rewrite two infinite loops in bound check casesXu Kuohai
The two infinite loops in bound check cases added by commit 1a3148fc171f ("selftests/bpf: Check when bounds are not in the 32-bit range") increased the execution time of test_verifier from about 6 seconds to about 9 seconds. Rewrite these two infinite loops to finite loops to get rid of this extra time cost. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20230329011048.1721937-1-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29Merge branch 'veristat: add better support of freplace programs'Alexei Starovoitov
Andrii Nakryiko says: ==================== Teach veristat how to deal with freplace BPF programs. As they can't be directly loaded by veristat without custom user-space part that sets correct target program FD, veristat always fails freplace programs. This patch set teaches veristat to guess target program type that will be inherited by freplace program itself, and subtitute it for BPF_PROG_TYPE_EXT (freplace) one for the purposes of BPF verification. Patch #1 fixes bug in libbpf preventing overriding freplace with specific program type. Patch #2 adds convenient -d flag to request veristat to emit libbpf debug logs. It help debugging why a specific BPF program fails to load, if the problem is not due to BPF verification itself. v3->v4: - fix optional kern_name check when guessing prog type (Alexei); v2->v3: - fix bpf_obj_id selftest that uses legacy bpf_prog_test_load() helper, which always sets program type programmatically; teach the helper to do it only if actually necessary (Stanislav); v1->v2: - fix compilation error reported by old GCC (my GCC v11 doesn't produce even a warning) and Clang (see CI failure at [0]): GCC version: veristat.c: In function ‘fixup_obj’: veristat.c:908:1: error: label at end of compound statement 908 | skip_freplace_fixup: | ^~~~~~~~~~~~~~~~~~~ Clang version: veristat.c:909:1: error: label at end of compound statement is a C2x extension [-Werror,-Wc2x-extensions] } ^ 1 error generated. [0] https://github.com/kernel-patches/bpf/actions/runs/4515972059/jobs/7953845335 ==================== Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29veristat: guess and substitue underlying program type for freplace (EXT) progsAndrii Nakryiko
SEC("freplace") (i.e., BPF_PROG_TYPE_EXT) programs are not loadable as is through veristat, as kernel expects actual program's FD during BPF_PROG_LOAD time, which veristat has no way of knowing. Unfortunately, freplace programs are a pretty important class of programs, especially when dealing with XDP chaining solutions, which rely on EXT programs. So let's do our best and teach veristat to try to guess the original program type, based on program's context argument type. And if guessing process succeeds, we manually override freplace/EXT with guessed program type using bpf_program__set_type() setter to increase chances of proper BPF verification. We rely on BTF and maintain a simple lookup table. This process is obviously not 100% bulletproof, as valid program might not use context and thus wouldn't have to specify correct type. Also, __sk_buff is very ambiguous and is the context type across many different program types. We pick BPF_PROG_TYPE_CGROUP_SKB for now, which seems to work fine in practice so far. Similarly, some program types require specifying attach type, and so we pick one out of possible few variants. Best effort at its best. But this makes veristat even more widely applicable. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230327185202.1929145-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29veristat: add -d debug mode option to see debug libbpf logAndrii Nakryiko
Add -d option to allow requesting libbpf debug logs from veristat. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230327185202.1929145-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29libbpf: disassociate section handler on explicit bpf_program__set_type() callAndrii Nakryiko
If user explicitly overrides programs's type with bpf_program__set_type() API call, we need to disassociate whatever SEC_DEF handler libbpf determined initially based on program's SEC() definition, as it's not goind to be valid anymore and could lead to crashes and/or confusing failures. Also, fix up bpf_prog_test_load() helper in selftests/bpf, which is force-setting program type (even if that's completely unnecessary; this is quite a legacy piece of code), and thus should expect auto-attach to not work, yet one of the tests explicitly relies on auto-attach for testing. Instead, force-set program type only if it differs from the desired one. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230327185202.1929145-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>