summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-21page_pool: add a comment explaining the fragment counter usageIlias Apalodimas
When reading the page_pool code the first impression is that keeping two separate counters, one being the page refcnt and the other being fragment pp_frag_count, is counter-intuitive. However without that fragment counter we don't know when to reliably destroy or sync the outstanding DMA mappings. So let's add a comment explaining this part. Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20230217222130.85205-1-ilias.apalodimas@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net: ethtool: fix __ethtool_dev_mm_supported() implementationVladimir Oltean
The MAC Merge layer is supported when ops->get_mm() returns 0. The implementation was changed during review, and in this process, a bug was introduced. Link: https://lore.kernel.org/netdev/20230111161706.1465242-5-vladimir.oltean@nxp.com/ Fixes: 04692c9020b7 ("net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC)") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu> Link: https://lore.kernel.org/all/20230220122343.1156614-2-vladimir.oltean@nxp.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21ethtool: pse-pd: Fix double word in commentsBo Liu
Remove the repeated word "for" in comments. Signed-off-by: Bo Liu <liubo03@inspur.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20230221083036.2414-1-liubo03@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21xsk: add linux/vmalloc.h to xsk.cXuan Zhuo
Fix the failure of the compilation under the sh4. Because we introduced remap_vmalloc_range() earlier, this has caused the compilation failure on the sh4 platform. So this introduction of the header file of linux/vmalloc.h. config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20230221/202302210041.kpPQLlNQ-lkp@intel.com/config) compiler: sh4-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=9f78bf330a66cd400b3e00f370f597e9fa939207 git remote add net-next https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git git fetch --no-tags net-next master git checkout 9f78bf330a66cd400b3e00f370f597e9fa939207 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh SHELL=/bin/bash net/ Fixes: 9f78bf330a66 ("xsk: support use vaddr as ring") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302210041.kpPQLlNQ-lkp@intel.com/ Link: https://lore.kernel.org/r/20230221075140.46988-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx5e: Align IPsec ASO result memory to be as required by hardwareLeon Romanovsky
Hardware requires an alignment to 64 bytes to return ASO data. Missing this alignment caused to unpredictable results while ASO events were generated. Fixes: 8518d05b8f9a ("net/mlx5e: Create Advanced Steering Operation object for IPsec") Reported-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/de0302c572b90c9224a72868d4e0d657b6313c4b.1676797613.git.leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20Merge tag 'mlx5-updates-2023-02-15' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-02-15 1) From Gal Tariq and Parav, Few cleanups for mlx5 driver. 2) From Vlad: Allow offloading of ct 'new' match based on [1] [1] https://lore.kernel.org/netdev/20230201163100.1001180-1-vladbu@nvidia.com/ * tag 'mlx5-updates-2023-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: RX, Remove doubtful unlikely call net/mlx5e: Fix outdated TLS comment net/mlx5e: Remove unused function mlx5e_sq_xmit_simple net/mlx5e: Allow offloading of ct 'new' match net/mlx5e: Implement CT entry update net/mlx5: Simplify eq list traversal net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle() net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff() net/mlx5e: Switch to using napi_build_skb() ==================== Link: https://lore.kernel.org/r/20230218090513.284718-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20Merge branch 'net-sched-cls_api-support-hardware-miss-to-tc-action'Jakub Kicinski
Paul Blakey says: ==================== net/sched: cls_api: Support hardware miss to tc action This series adds support for hardware miss to instruct tc to continue execution in a specific tc action instance on a filter's action list. The mlx5 driver patch (besides the refactors) shows its usage instead of using just chain restore. Currently a filter's action list must be executed all together or not at all as driver are only able to tell tc to continue executing from a specific tc chain, and not a specific filter/action. This is troublesome with regards to action CT, where new connections should be sent to software (via tc chain restore), and established connections can be handled in hardware. Checking for new connections is done when executing the ct action in hardware (by checking the packet's tuple against known established tuples). But if there is a packet modification (pedit) action before action CT and the checked tuple is a new connection, hardware will need to revert the previous packet modifications before sending it back to software so it can re-match the same tc filter in software and re-execute its CT action. The following is an example configuration of stateless nat on mlx5 driver that isn't supported before this patchet: #Setup corrosponding mlx5 VFs in namespaces $ ip netns add ns0 $ ip netns add ns1 $ ip link set dev enp8s0f0v0 netns ns0 $ ip netns exec ns0 ifconfig enp8s0f0v0 1.1.1.1/24 up $ ip link set dev enp8s0f0v1 netns ns1 $ ip netns exec ns1 ifconfig enp8s0f0v1 1.1.1.2/24 up #Setup tc arp and ct rules on mxl5 VF representors $ tc qdisc add dev enp8s0f0_0 ingress $ tc qdisc add dev enp8s0f0_1 ingress $ ifconfig enp8s0f0_0 up $ ifconfig enp8s0f0_1 up #Original side $ tc filter add dev enp8s0f0_0 ingress chain 0 proto ip flower \ ct_state -trk ip_proto tcp dst_port 8888 \ action pedit ex munge tcp dport set 5001 pipe \ action csum ip tcp pipe \ action ct pipe \ action goto chain 1 $ tc filter add dev enp8s0f0_0 ingress chain 1 proto ip flower \ ct_state +trk+est \ action mirred egress redirect dev enp8s0f0_1 $ tc filter add dev enp8s0f0_0 ingress chain 1 proto ip flower \ ct_state +trk+new \ action ct commit pipe \ action mirred egress redirect dev enp8s0f0_1 $ tc filter add dev enp8s0f0_0 ingress chain 0 proto arp flower \ action mirred egress redirect dev enp8s0f0_1 #Reply side $ tc filter add dev enp8s0f0_1 ingress chain 0 proto arp flower \ action mirred egress redirect dev enp8s0f0_0 $ tc filter add dev enp8s0f0_1 ingress chain 0 proto ip flower \ ct_state -trk ip_proto tcp \ action ct pipe \ action pedit ex munge tcp sport set 8888 pipe \ action csum ip tcp pipe \ action mirred egress redirect dev enp8s0f0_0 #Run traffic $ ip netns exec ns1 iperf -s -p 5001& $ sleep 2 #wait for iperf to fully open $ ip netns exec ns0 iperf -c 1.1.1.2 -p 8888 #dump tc filter stats on enp8s0f0_0 chain 0 rule and see hardware packets: $ tc -s filter show dev enp8s0f0_0 ingress chain 0 proto ip | grep "hardware.*pkt" Sent hardware 9310116832 bytes 6149672 pkt Sent hardware 9310116832 bytes 6149672 pkt Sent hardware 9310116832 bytes 6149672 pkt A new connection executing the first filter in hardware will first rewrite the dst port to the new port, and then the ct action is executed, because this is a new connection, hardware will need to be send this back to software, on chain 0, to execute the first filter again in software. The dst port needs to be reverted otherwise it won't re-match the old dst port in the first filter. Because of that, currently mlx5 driver will reject offloading the above action ct rule. This series adds support for hardware partially executing a filter's action list, and letting tc software continue processing in the specific action instance where hardware left off (in the above case after the "action pedit ex munge tcp dport... of the first rule") allowing support for scenarios such as the above. ==================== Link: https://lore.kernel.org/r/20230217223620.28508-1-paulb@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx5e: TC, Set CT miss to the specific ct action instancePaul Blakey
Currently, CT misses restore the missed chain on the tc skb extension so tc will continue from the relevant chain. Instead, restore the CT action's miss cookie on the extension, which will instruct tc to continue from the this specific CT action instance on the relevant filter's action list. Map the CT action's miss_cookie to a new miss object (ACT_MISS), and use this miss mapping instead of the current chain miss object (CHAIN_MISS) for CT action misses. To restore this new miss mapping value, add a RX restore rule for each such mapping value. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Sholmo <ozsh@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REGPaul Blakey
This reg usage is always a mapped object, not necessarily containing chain info. Rename to properly convey what it stores. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx5: Refactor tc miss handling to a single functionPaul Blakey
Move tc miss handling code to en_tc.c, and remove duplicate code. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx5: Kconfig: Make tc offload depend on tc skb extensionPaul Blakey
Tc skb extension is a basic requirement for using tc offload to support correct restoration on action miss. Depend on it. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/sched: flower: Support hardware miss to tc actionPaul Blakey
To support hardware miss to tc action in actions on the flower classifier, implement the required getting of filter actions, and setup filter exts (actions) miss by giving it the filter's handle and actions. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/sched: flower: Move filter handle initialization earlierPaul Blakey
To support miss to action during hardware offload the filter's handle is needed when setting up the actions (tcf_exts_init()), and before offloading. Move filter handle initialization earlier. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/sched: cls_api: Support hardware miss to tc actionPaul Blakey
For drivers to support partial offload of a filter's action list, add support for action miss to specify an action instance to continue from in sw. CT action in particular can't be fully offloaded, as new connections need to be handled in software. This imposes other limitations on the actions that can be offloaded together with the CT action, such as packet modifications. Assign each action on a filter's action list a unique miss_cookie which drivers can then use to fill action_miss part of the tc skb extension. On getting back this miss_cookie, find the action instance with relevant cookie and continue classifying from there. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/sched: Rename user cookie and act cookiePaul Blakey
struct tc_action->act_cookie is a user defined cookie, and the related struct flow_action_entry->act_cookie is used as an handle similar to struct flow_cls_offload->cookie. Rename tc_action->act_cookie to user_cookie, and flow_action_entry->act_cookie to cookie so their names would better fit their usage. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20Merge tag 'ieee802154-for-net-next-2023-02-20' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2023-02-20 Miquel Raynal build upon his earlier work and introduced two new features into the ieee802154 stack. Beaconing to announce existing PAN's and passive scanning to discover the beacons and associated PAN's. The matching changes to the userspace configuration tool have been posted as well and will be released together with the kernel release. Arnd Bergmann and Dmitry Torokhov worked on converting the at86rf230 and cc2520 drivers away from the unused platform_data usage and towards the new gpiod API. (I had to add a revert as Dmitry found a regression on an already pushed tree on my side). Changes since v1 (pull request 2023-02-02) - Netlink API extack and NLA_POLICY* usage as suggested by Jakub - Removed always true condition found by kernel test robot - Simplify device removal with running background job for scanning - Fix problems with beacon sending in some cases by using the MLME tx path * tag 'ieee802154-for-net-next-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next: ieee802154: Drop device trackers mac802154: Fix an always true condition mac802154: Send beacons using the MLME Tx path ieee802154: Change error code on monitor scan netlink request ieee802154: Convert scan error messages to extack ieee802154: Use netlink policies when relevant on scan parameters ieee802154: at86rf230: switch to using gpiod API ieee802154: at86rf230: drop support for platform data Revert "at86rf230: convert to gpio descriptors" cc2520: move to gpio descriptors mac802154: Avoid superfluous endianness handling at86rf230: convert to gpio descriptors mac802154: Handle basic beaconing ieee802154: Add support for user beaconing requests mac802154: Handle passive scanning mac802154: Add MLME Tx locked helpers mac802154: Prepare forcing specific symbol duration ieee802154: Introduce a helper to validate a channel ieee802154: Define a beacon frame header ieee802154: Add support for user scanning requests ==================== Link: https://lore.kernel.org/r/20230220213749.386451-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20sfc: fix builds without CONFIG_RTC_LIBAlejandro Lucero
Add an embarrassingly missed semicolon plus and embarrassingly missed parenthesis breaking kernel building when CONFIG_RTC_LIB is not set like the one reported with ia64 config. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302170047.EjCPizu3-lkp@intel.com/ Fixes: 14743ddd2495 ("sfc: add devlink info support for ef100") Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Link: https://lore.kernel.org/r/20230220110133.29645-1-alejandro.lucero-palau@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20sfc: clean up some inconsistent indentingsYang Li
Fix some indentngs and remove the warning below: drivers/net/ethernet/sfc/mae.c:657 efx_mae_enumerate_mports() warn: inconsistent indenting Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4117 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20230220065958.52941-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/mlx4_en: Introduce flexible array to silence overflow warningKees Cook
The call "skb_copy_from_linear_data(skb, inl + 1, spc)" triggers a FORTIFY memcpy() warning on ppc64 platform: In function ‘fortify_memcpy_chk’, inlined from ‘skb_copy_from_linear_data’ at ./include/linux/skbuff.h:4029:2, inlined from ‘build_inline_wqe’ at drivers/net/ethernet/mellanox/mlx4/en_tx.c:722:4, inlined from ‘mlx4_en_xmit’ at drivers/net/ethernet/mellanox/mlx4/en_tx.c:1066:3: ./include/linux/fortify-string.h:513:25: error: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 513 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Same behaviour on x86 you can get if you use "__always_inline" instead of "inline" for skb_copy_from_linear_data() in skbuff.h The call here copies data into inlined tx destricptor, which has 104 bytes (MAX_INLINE) space for data payload. In this case "spc" is known in compile-time but the destination is used with hidden knowledge (real structure of destination is different from that the compiler can see). That cause the fortify warning because compiler can check bounds, but the real bounds are different. "spc" can't be bigger than 64 bytes (MLX4_INLINE_ALIGN), so the data can always fit into inlined tx descriptor. The fact that "inl" points into inlined tx descriptor is determined earlier in mlx4_en_xmit(). Avoid confusing the compiler with "inl + 1" constructions to get to past the inl header by introducing a flexible array "data" to the struct so that the compiler can see that we are not dealing with an array of inl structs, but rather, arbitrary data following the structure. There are no changes to the structure layout reported by pahole, and the resulting machine code is actually smaller. Reported-by: Josef Oskera <joskera@redhat.com> Link: https://lore.kernel.org/lkml/20230217094541.2362873-1-joskera@redhat.com Fixes: f68f2ff91512 ("fortify: Detect struct member overflows in memcpy() at compile-time") Cc: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20230218183842.never.954-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net/ulp: Remove redundant ->clone() test in inet_clone_ulp().Kuniyuki Iwashima
Commit 2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status") guarantees that all ULP listeners have clone() op, so we no longer need to test it in inet_clone_ulp(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230217200920.85306-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-02-17 We've added 64 non-merge commits during the last 7 day(s) which contain a total of 158 files changed, 4190 insertions(+), 988 deletions(-). The main changes are: 1) Add a rbtree data structure following the "next-gen data structure" precedent set by recently-added linked-list, that is, by using kfunc + kptr instead of adding a new BPF map type, from Dave Marchevsky. 2) Add a new benchmark for hashmap lookups to BPF selftests, from Anton Protopopov. 3) Fix bpf_fib_lookup to only return valid neighbors and add an option to skip the neigh table lookup, from Martin KaFai Lau. 4) Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accouting for container environments, from Yafang Shao. 5) Batch of ice multi-buffer and driver performance fixes, from Alexander Lobakin. 6) Fix a bug in determining whether global subprog's argument is PTR_TO_CTX, which is based on type names which breaks kprobe progs, from Andrii Nakryiko. 7) Prep work for future -mcpu=v4 LLVM option which includes usage of BPF_ST insn. Thus improve BPF_ST-related value tracking in verifier, from Eduard Zingerman. 8) More prep work for later building selftests with Memory Sanitizer in order to detect usages of undefined memory, from Ilya Leoshkevich. 9) Fix xsk sockets to check IFF_UP earlier to avoid a NULL pointer dereference via sendmsg(), from Maciej Fijalkowski. 10) Implement BPF trampoline for RV64 JIT compiler, from Pu Lehui. 11) Fix BPF memory allocator in combination with BPF hashtab where it could corrupt special fields e.g. used in bpf_spin_lock, from Hou Tao. 12) Fix LoongArch BPF JIT to always use 4 instructions for function address so that instruction sequences don't change between passes, from Hengqi Chen. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits) selftests/bpf: Add bpf_fib_lookup test bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup riscv, bpf: Add bpf trampoline support for RV64 riscv, bpf: Add bpf_arch_text_poke support for RV64 riscv, bpf: Factor out emit_call for kernel and bpf context riscv: Extend patch_text for multiple instructions Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES" selftests/bpf: Add global subprog context passing tests selftests/bpf: Convert test_global_funcs test to test_loader framework bpf: Fix global subprog context argument resolution logic LoongArch, bpf: Use 4 instructions for function address in JIT bpf: bpf_fib_lookup should not return neigh in NUD_FAILED state bpf: Disable bh in bpf_test_run for xdp and tc prog xsk: check IFF_UP earlier in Tx path Fix typos in selftest/bpf files selftests/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() samples/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() bpftool: Use bpf_{btf,link,map,prog}_get_info_by_fd() libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd() ... ==================== Link: https://lore.kernel.org/r/20230217221737.31122-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20sched/topology: fix KASAN warning in hop_cmp()Yury Norov
Despite that prev_hop is used conditionally on cur_hop is not the first hop, it's initialized unconditionally. Because initialization implies dereferencing, it might happen that the code dereferences uninitialized memory, which has been spotted by KASAN. Fix it by reorganizing hop_cmp() logic. Reported-by: Bruno Goncalves <bgoncalv@redhat.com> Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()") Signed-off-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/Y+7avK6V9SyAWsXi@yury-laptop/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20net: bcmgenet: Support wake-up from s2idleFlorian Fainelli
When we suspend into s2idle we also need to enable the interrupt line that generates the MPD and HFB interrupts towards the host CPU interrupt controller (typically the ARM GIC or MIPS L1) to make it exit s2idle. When we suspend into other modes such as "standby" or "mem" we engage a power management state machine which will gate off the CPU L1 controller (priv->irq0) and ungate the side band wake-up interrupt (priv->wol_irq). It is safe to have both enabled as wake-up sources because they are mutually exclusive given any suspend mode. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20scm: add user copy checks to put_cmsg()Eric Dumazet
This is a followup of commit 2558b8039d05 ("net: use a bounce buffer for copying skb->mark") x86 and powerpc define user_access_begin, meaning that they are not able to perform user copy checks when using user_write_access_begin() / unsafe_copy_to_user() and friends [1] Instead of waiting bugs to trigger on other arches, add a check_object_size() in put_cmsg() to make sure that new code tested on x86 with CONFIG_HARDENED_USERCOPY=y will perform more security checks. [1] We can not generically call check_object_size() from unsafe_copy_to_user() because UACCESS is enabled at this point. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20devlink: drop leftover duplicate/unused codePaolo Abeni
The recent merge from net left-over some unused code in leftover.c - nomen omen. Just drop the unused bits. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge tag 'linux-can-next-for-6.3-20230217' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2023-02-17 - fixed this is a pull request of 4 patches for net-next/master. The first patch is by Yang Li and converts the ctucanfd driver to devm_platform_ioremap_resource(). The last 3 patches are by Frank Jungclaus, target the esd_usb driver and contains preparations for the upcoming support of the esd CAN-USB/3 hardware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: lan966x: Use automatic selection of VCAP rule actionsetHoratiu Vultur
Since commit 81e164c4aec5 ("net: microchip: sparx5: Add automatic selection of VCAP rule actionset") the VCAP API has the capability to select automatically the actionset based on the actions that are attached to the rule. So it is not needed anymore to hardcode the actionset in the driver, therefore it is OK to remove this. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge branch 'default_rps_mask-follow-up'David S. Miller
Paolo Abeni says: ==================== net: default_rps_mask follow-up The first patch namespacify the setting. In the common case, once proper isolation is in place in the main namespace, forwarding to/from each child netns will allways happen on the desidered CPUs. Any additional RPS stage inside the child namespace will not provide additional isolation and could hurt performance badly if picking a CPU on a remote node. The 2nd patch adds more self-tests coverage. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20self-tests: more rps self testsPaolo Abeni
Explicitly check for child netns and main ns independency Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: make default_rps_mask a per netns attributePaolo Abeni
That really was meant to be a per netns attribute from the beginning. The idea is that once proper isolation is in place in the main namespace, additional demux in the child namespaces will be redundant. Let's make child netns default rps mask empty by default. To avoid bloating the netns with a possibly large cpumask, allocate it on-demand during the first write operation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge tag 'wireless-next-2023-02-17' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.3 Third set of patches for v6.3. This time only a set of small fixes submitted during the last day or two. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for net-next: 1) Add safeguard to check for NULL tupe in objects updates via NFT_MSG_NEWOBJ, this should not ever happen. From Alok Tiwari. 2) Incorrect pointer check in the new destroy rule command, from Yang Yingliang. 3) Incorrect status bitcheck in nf_conntrack_udp_packet(), from Florian Westphal. 4) Simplify seq_print_acct(), from Ilia Gavrilov. 5) Use 2-arg optimal variant of kfree_rcu() in IPVS, from Julian Anastasov. 6) TCP connection enters CLOSE state in conntrack for locally originated TCP reset packet from the reject target, from Florian Westphal. The fixes #2 and #3 in this series address issues from the previous pull nf-next request in this net-next cycle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: microchip: sparx5: reduce stack usageArnd Bergmann
The vcap_admin structures in vcap_api_next_lookup_advanced_test() take several hundred bytes of stack frame, but when CONFIG_KASAN_STACK is enabled, each one of them also has extra padding before and after it, which ends up blowing the warning limit: In file included from drivers/net/ethernet/microchip/vcap/vcap_api.c:3521: drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c: In function 'vcap_api_next_lookup_advanced_test': drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1954:1: error: the frame size of 1448 bytes is larger than 1400 bytes [-Werror=frame-larger-than=] 1954 | } Reduce the total stack usage by replacing the five structures with an array that only needs one pair of padding areas. Fixes: 1f741f001160 ("net: microchip: sparx5: Add KUNIT tests for enabling/disabling chains") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20sfc: use IS_ENABLED() checks for CONFIG_SFC_SRIOVArnd Bergmann
One local variable has become unused after a recent change: drivers/net/ethernet/sfc/ef100_nic.c: In function 'ef100_probe_netdev_pf': drivers/net/ethernet/sfc/ef100_nic.c:1155:21: error: unused variable 'net_dev' [-Werror=unused-variable] struct net_device *net_dev = efx->net_dev; ^~~~~~~ The variable is still used in an #ifdef. Replace the #ifdef with an if(IS_ENABLED()) check that lets the compiler see where it is used, rather than adding another #ifdef. This also fixes an uninitialized return value in ef100_probe_netdev_pf() that gcc did not spot. Fixes: 7e056e2360d9 ("sfc: obtain device mac address based on firmware handle for ef100") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ice: properly alloc ICE_VSI_LBMichal Swiatkowski
Devlink reload patchset introduced regression. ICE_VSI_LB wasn't taken into account when doing default allocation. Fix it by adding a case for ICE_VSI_LB in ice_vsi_alloc_def(). Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20sfc: Fix spelling mistake "creationg" -> "creating"Colin Ian King
There is a spelling mistake in a pci_warn message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by:  Alejandro Lucero <alejandro.lucero-palau@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20octeontx2-af: Add NIX Errata workaround on CN10K siliconGeetha sowjanya
This patch adds workaround for below 2 HW erratas 1. Due to improper clock gating, NIXRX may free the same NPA buffer multiple times.. to avoid this, always enable NIX RX conditional clock. 2. NIX FIFO does not get initialized on reset, if the SMQ flush is triggered before the first packet is processed, it will lead to undefined state. The workaround to perform SMQ flush only if packet count is non-zero in MDQ. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: phy: Read EEE abilities when using .featuresAndrew Lunn
A PHY driver can use a static integer value to indicate what link mode features it supports, i.e, its abilities.. This is the old way, but useful when dynamically determining the devices features does not work, e.g. support of fibre. EEE support has been moved into phydev->supported_eee. This needs to be set otherwise the code assumes EEE is not supported. It is normally set as part of reading the devices abilities. However if a static integer value was used, the dynamic reading of the abilities is not performed. Add a call to genphy_c45_read_eee_abilities() to read the EEE abilities. Fixes: 8b68710a3121 ("net: phy: start using genphy_c45_ethtool_get/set_eee()") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge branch 'phydev-locks'David S. Miller
Andrew Lunn says: ==================== Add additional phydev locks The phydev lock should be held when accessing members of phydev, or calling into the driver. Some of the phy_ethtool_ functions are missing locks. Add them. To avoid deadlock the marvell driver is modified since it calls one of the functions which gain locks, which would result in a deadlock. The missing locks have not caused noticeable issues, so these patches are for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: phy: Add locks to ethtool functionsAndrew Lunn
The phydev lock should be held while accessing members of phydev, or calling into the driver. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: phy: marvell: Use the unlocked genphy_c45_ethtool_get_eee()Andrew Lunn
phy_ethtool_get_eee() is about to gain locking of the phydev lock. This means it cannot be used within a PHY driver without causing a deadlock. Swap to using genphy_c45_ethtool_get_eee() which assumes the lock has already been taken. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge branch 'icmp6-drop-reason'David S. Miller
Eric Dumazet says: ==================== ipv6: icmp6: better drop reason support This series aims to have more precise drop reason reports for icmp6. This should reduce false positives on most usual cases. This can be extended as needed later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to icmpv6_echo_reply()Eric Dumazet
Change icmpv6_echo_reply() to return a drop reason. For the moment, return NOT_SPECIFIED or SKB_CONSUMED. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_NS_OTHERHOSTEric Dumazet
Hosts can often receive neighbour discovery messages that are not for them. Use a dedicated drop reason to make clear the packet is dropped for this normal case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_BAD_OPTIONSEric Dumazet
This is a generic drop reason for any error detected in ndisc_parse_options(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to ndisc_redirect_rcv()Eric Dumazet
Change ndisc_redirect_rcv() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED and values from icmpv6_notify(). More reasons are added later. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to ndisc_router_discovery()Eric Dumazet
Change ndisc_router_discovery() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED and SKB_CONSUMED. More reasons are added later. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to ndisc_recv_rs()Eric Dumazet
Change ndisc_recv_rs() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED or SKB_CONSUMED. More reasons are added later. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to ndisc_recv_na()Eric Dumazet
Change ndisc_recv_na() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED or SKB_CONSUMED. More reasons are added later. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add drop reason support to ndisc_recv_ns()Eric Dumazet
Change ndisc_recv_ns() to return a drop reason. For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED or SKB_CONSUMED. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>