summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-03-16Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2021-03-16 This series contains updates to i40e, ixgbe, and ice drivers. Magnus Karlsson says: Optimize run_xdp_zc() for the XDP program verdict being XDP_REDIRECT in the xsk zero-copy path. This path is only used when having AF_XDP zero-copy on and in that case most packets will be directed to user space. This provides around 100k extra packets in throughput on my server when running l2fwd in xdpsock. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Merge branch 'mlxsw-Add-support-for-egress-and-policy-based-sampling'David S. Miller
Ido Schimmel says: ==================== mlxsw: Add support for egress and policy-based sampling So far mlxsw only supported ingress sampling using matchall classifier. This series adds support for egress sampling and policy-based sampling using flower classifier on Spectrum-2 and newer ASICs. As such, it is now possible to issue these commands: # tc filter add dev swp1 egress pref 1 proto all matchall action sample rate 100 group 1 # tc filter add dev swp2 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action sample rate 100 group 2 When performing egress sampling (using either matchall or flower) the ASIC is able to report the end-to-end latency which is passed to the psample module. Series overview: Patches #1-#3 are preparations without any functional changes Patch #4 generalizes the idea of sampling triggers and creates a hash table to track active sampling triggers in preparation for egress and policy-based triggers. The motivation is explained in the changelog Patch #5 flips mlxsw to start using this hash table instead of storing ingress sampling triggers as an attribute of the sampled port Patch #6 finally adds support for egress sampling using matchall classifier Patches #7-#8 add support for policy-based sampling using flower classifier Patches #9 extends the mlxsw sampling selftest to cover the new triggers Patch #10 makes sure that egress sampling configuration only fails on Spectrum-1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16selftests: mlxsw: Test egress sampling limitation on Spectrum-1 onlyIdo Schimmel
Make sure egress sampling configuration only fails on Spectrum-1, given that mlxsw now supports it on Spectrum-{2,3}. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16selftests: mlxsw: Add tc sample tests for new triggersIdo Schimmel
Test that packets are sampled when tc-sample is used with matchall egress binding and flower classifier. Verify that when performing sampling on egress the end-to-end latency is reported as metadata. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum_acl: Offload FLOW_ACTION_SAMPLEIdo Schimmel
Implement support for action sample when used with a flower classifier by implementing the required sampler_add() / sampler_del() callbacks and registering an Rx listener for the sampled packets. The sampler_add() callback returns an error for Spectrum-1 as the functionality is not supported. In Spectrum-{2,3} the callback creates a mirroring agent towards the CPU. The agent's identifier is used by the policy engine code to mirror towards the CPU with probability. The Rx listener for the sampled packet is registered with the 'policy engine' mirroring reason and passes trapped packets to the psample module after looking up their parameters (e.g., sampling group). Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: core_acl_flex_actions: Add mirror sampler actionIdo Schimmel
Add core functionality required to support mirror sampler action in the policy engine. The switch driver (e.g., 'mlxsw_spectrum') is required to implement the sampler_add() / sampler_del() callbacks that perform the necessary configuration before the sampler action can be installed. The next patch will implement it for Spectrum-{2,3}, while Spectrum-1 will return an error, given it is not supported. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum_matchall: Add support for egress samplingIdo Schimmel
Allow user space to install a matchall classifier with sample action on egress. This is only supported on Spectrum-2 onwards, so Spectrum-1 will continue to return an error. Programming the hardware to sample on egress is identical to ingress sampling with the sole change of using a different sampling trigger. Upon receiving a sampled packet, the sampling trigger (ingress vs. egress) will be encoded in the mirroring reason in the Completion Queue Element (CQE). The mirroring reason is used to lookup the sampling parameters (e.g., psample group) which are passed to the psample module. Note that locally generated packets that are sampled are simply consumed. This is done for several reasons. First, such packets do not have an ingress netdev given that their Rx local port is the CPU port. This breaks several basic assumptions. Second, sampling using the same interface (tc), but with flower classifier will not result in locally generated packets being sampled given that such packets are not subject to the policy engine. Third, realistically, this is not a big deal given that the vast majority of the packets being transmitted through the port are not locally generated packets. Fourth, if such packets do need to be sampled, they can be sampled with a 'skip_hw' filter and reported to the same sampling group as the data path packets. The software sampling rate can also be adjusted to fit the rate of the locally generated packets which is much lower than the rate of the data path traffic. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum: Start using sampling triggers hash tableIdo Schimmel
Start using the previously introduced sampling triggers hash table to store sampling parameters instead of storing them as attributes of the sampled port. This makes it easier to introduce new sampling triggers. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum: Track sampling triggers in a hash tableIdo Schimmel
Currently, mlxsw supports a single sampling trigger type (i.e., received packet). When sampling is configured on an ingress port, the sampling parameters (e.g., pointer to the psample group) are stored as an attribute of the port, so that they could be passed to psample_sample_packet() when a sampled packet is trapped to the CPU. Subsequent patches are going to add more types of sampling triggers, making it difficult to maintain the current scheme. Instead, store all the active sampling triggers with their associated parameters in a hash table. That way, more trigger types can be easily added. The next patch will flip mlxsw to use the hash table instead of the current scheme. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum_matchall: Pass matchall entry to sampling operationsIdo Schimmel
The entry will be required by the next patches, so pass it. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum_matchall: Push sampling checks to per-ASIC operationsIdo Schimmel
Push some sampling checks to the per-ASIC operations, as they are no longer relevant for all ASICs. The sampling rate validation against the MPSC maximum rate is only relevant for Spectrum-1, as Spectrum-2 and later ASICs no longer use MPSC register for sampling. The ingress / egress validation is pushed down to the per-ASIC operations since subsequent patches are going to remove it for Spectrum-2 and later ASICs. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16mlxsw: spectrum_matchall: Propagate extack furtherIdo Schimmel
Due to the differences between Spectrum-1 and later ASICs, some of the checks currently performed at the common code (where extack is available) will need to be pushed to the per-ASIC operations. As a preparation, propagate extack further to maintain proper error reporting. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Merge branch 'dpaa2-switch-small-cleanup'David S. Miller
Ioana Ciornei says: ==================== dpaa2-switch: small cleanup This patch set addresses various low-hanging issues in both dpaa2-switch and dpaa2-eth drivers. Unused ABI functions are removed from dpaa2-switch, all the kernel-doc warnings are fixed up in both drivers and the coding style for the remaining ABIs is fixed-up a bit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16dpaa2-eth: fixup kdoc warningsIoana Ciornei
Running kernel-doc over the dpaa2-eth driver generates a bunch of warnings. Fix them up by removing code comments for macros which are self-explanatory, respecting the kdoc format for macro documentation and other small changes like describing the expected return values of functions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16dpaa2-switch: fit the function declaration on the same lineIoana Ciornei
Multiple ABI function declarations are split unnecessarry on multiple lines. Fix this so that we have a consistent coding style. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16dpaa2-switch: reduce the size of the if_id bitmap to 64 bitsIoana Ciornei
The maximum number of DPAA2 switch interfaces, including the control interface, is 64. Even though this restriction existed from the first place, the command structures which use an interface id bitmap were poorly described and even though a single uint64_t is enough, all of them used an array of 4 uint64_t's. Fix this by reducing the size of the interface id field to a single uint64_t. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16dpaa2-switch: fix kdoc warningsIoana Ciornei
Running kernel-doc over the dpaa2-switch driver generates a bunch of warnings. Fix them up by removing code comments for macros which are self-explanatory and adding a bit more context for the dpsw_if_get_port_mac_addr() function and the fields of the dpsw_vlan_if_cfg structure. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16dpaa2-switch: remove unused ABI functionsIoana Ciornei
Cleanup the dpaa2-switch driver a bit by removing any unused MC firmware ABI definitions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16net: broadcom: BCM4908_ENET should not default to y, unconditionallyGeert Uytterhoeven
Merely enabling compile-testing should not enable additional code. To fix this, restrict the automatic enabling of BCM4908_ENET to ARCH_BCM4908. Fixes: 4feffeadbcb2e5b1 ("net: broadcom: bcm4908enet: add BCM4908 controller driver") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16net: ipa: Remove useless error messageZihao Tang
Fix the following coccicheck report: drivers/net/ipa/gsi.c:1341:2-9: line 1341 is redundant because platform_get_irq() already prints an error Remove dev_err() messages after platform_get_irq_byname() failures. Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com> Signed-off-by: Jay Fang <f.fangjian@huawei.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ctwenxu
When openvswitch conntrack offload with act_ct action. The first rule do conntrack in the act_ct in tc subsystem. And miss the next rule in the tc and fallback to the ovs datapath but miss set post_ct flag which will lead the ct_state_key with -trk flag. Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Merge branch 'switchdev-dsa-docs'David S. Miller
Vladimir Oltean says: ==================== Documentation updates for switchdev and DSA Many changes were made to the code but of course the documentation was not kept up to date. This is an attempt to update some of the verbiage. The documentation is still not complete, but it's time to make some more changes to the code first, before documenting the rest. Changes in v2: Integrated feedback from Andrew, Florian, Tobias, Ido, George. ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: switchdev: fix command for static FDB entriesVladimir Oltean
The "bridge fdb add" command provided in the switchdev documentation is junk now, not only because it is syntactically incorrect and rejected by the iproute2 bridge program, but also because it was not updated in light of Arkadi Sharshevsky's radical switchdev refactoring in commit 29ab586c3d83 ("net: switchdev: Remove bridge bypass support from switchdev"). Try to explain what the intended usage pattern is with the new kernel implementation. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: switchdev: clarify device driver behaviorFlorian Fainelli
This patch provides details on the expected behavior of switchdev enabled network devices when operating in a "stand alone" mode, as well as when being bridge members. This clarifies a number of things that recently came up during a bug fixing session on the b53 DSA switch driver. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: add paragraph for the HSR/PRP offloadVladimir Oltean
Add a short summary of the methods that a driver writer must implement for offloading a HSR/PRP network interface. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: add paragraph for the MRP offloadVladimir Oltean
Add a short summary of the methods that a driver writer must implement for getting an MRP instance to work on top of a DSA switch. Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: add paragraph for the LAG offloadVladimir Oltean
Add a short summary of the methods that a driver writer must implement for offloading a link aggregation group, and what is still missing. Cc: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: mention integration with devlinkVladimir Oltean
Add a short summary of the devlink features supported by the DSA core. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: document the port_bridge_flags methodVladimir Oltean
The documentation was already lagging behind by not mentioning the old version of port_bridge_flags (port_set_egress_floods). So now we are skipping one step and just explaining how a DSA driver should configure address learning and flooding settings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: remove TODO about porting more vendor driversVladimir Oltean
On one hand, the link is dead and therefore useless. On the other hand, there are always more drivers to port, but at this stage, DSA does not need to affirm itself as the driver model to use for Ethernet-connected switches (since we already have 15 tagging protocols supported and probably more switch families from various vendors), so there is nothing actionable to do. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: remove references to switchdev prepare/commitVladimir Oltean
After the recent series containing commit bae33f2b5afe ("net: switchdev: remove the transaction structure from port attributes"), there aren't prepare/commit transactional phases anymore in most of the switchdev objects/attributes, and as a result, there aren't any in the DSA driver API either. So remove this piece. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: remove static port count from limitationsVladimir Oltean
After Vivien's series from 2019 containing commits 27d4d19d7c82 ("net: dsa: remove limitation of switch index value") and ab8ccae122a4 ("net: dsa: add ports list in the switch fabric"), this is basically no longer true. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: dsa: rewrite chapter about tagging protocolVladimir Oltean
The chapter about tagging protocols is out of date because it doesn't mention all taggers that have been added since last documentation update. But judging based on that, it will always tend to lag behind, and there's no good reason why we would enumerate the supported hardware. Instead we could do something more useful and explain what there is to know about tagging protocols instead. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Documentation: networking: update the graphical representationVladimir Oltean
While preparing some slides for a customer presentation, I found the existing high-level view to be a bit confusing, so I modified it a little bit. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Merge tag 'linux-can-fixes-for-5.12-20210316' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-03-16 this is a pull request of 11 patches for net/master. The first patch is by Martin Willi and fixes the deletion of network name spaces with physical CAN interfaces in them. The next two patches are by me an fix the ISOTP protocol, to ensure that unused flags in classical CAN frames are properly initialized to zero. Stephane Grosjean contributes a patch for the pcan_usb_fd driver, which add MODULE_SUPPORTED_DEVICE lines for two supported devices. Angelo Dureghello's patch for the flexcan driver fixes a potential div by zero, if the bitrate is not set during driver probe. Jimmy Assarsson's patch for the kvaser_pciefd disables bus load reporting in the device, if it was previously enabled by the vendor's out of tree drier. A patch for the kvaser_usb adds support for a new device, by adding the appropriate USB product ID. Tong Zhang contributes two patches for the c_can driver. First a use-after-free in the c_can_pci driver is fixed, in the second patch the runtime PM for the c_can_pci is fixed by moving the runtime PM enable/disable from the core driver to the platform driver. The last two patches are by Torin Cooper-Bennun for the m_can driver. First a extraneous msg loss warning is removed then he fixes the RX-path, which might be blocked by errors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16docs: net: ena: Fix ena_start_xmit() function name typoZenghui Yu
The ena.rst documentation referred to end_start_xmit() when it should refer to ena_start_xmit(). Fix the typo. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 1GbE Intel Wired LAN Driver Updates 2021-03-15 This series contains updates to e1000e only. Chen Yu says: The NIC is put in runtime suspend status when there is no cable connected. As a result, it is safe to keep non-wakeup NIC in runtime suspended during s2ram because the system does not rely on the NIC plug event nor WoL to wake up the system. Besides that, unlike the s2idle, s2ram does not need to manipulate S0ix settings during suspend. ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16selftests/net: fix warnings on reuseaddr_ports_exhaustedCarlos Llamas
Fix multiple warnings seen with gcc 10.2.1: reuseaddr_ports_exhausted.c:32:41: warning: missing braces around initializer [-Wmissing-braces] 32 | struct reuse_opts unreusable_opts[12] = { | ^ 33 | {0, 0, 0, 0}, | { } { } Fixes: 7f204a7de8b0 ("selftests: net: Add SO_REUSEADDR test to check if 4-tuples are fully utilized.") Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16MIPS: vmlinux.lds.S: Fix appended dtb not properly alignedPaul Cercueil
Commit 6654111c893f ("MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes") changed the alignment from STRUCT_ALIGNMENT bytes to 8 bytes. The commit's message makes it sound like it was actually done on purpose, but this is not the case. The commit was written when raw appended dtb were not aligned at all. The STRUCT_ALIGN() was added a few days before, in commit 7a05293af39f ("MIPS: boot/compressed: Copy DTB to aligned address"). The true purpose of the commit was not to align specifically to 8 bytes, but to make sure that the generated vmlinux' size was properly padded to the alignment required for DTBs. While the switch to 8-byte alignment worked for vmlinux-appended dtb blobs, it broke vmlinuz-appended dtb blobs, as the decompress routine moves the blob to a STRUCT_ALIGNMENT aligned address. Fix this by changing the raw appended dtb blob alignment from 8 bytes back to STRUCT_ALIGNMENT bytes in vmlinux.lds.S. Fixes: 6654111c893f ("MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes") Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-16net: ipv4: route.c: simplify procfs codeYejune Deng
proc_creat_seq() that directly take a struct seq_operations, and deal with network namespaces in ->open. Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTARTOleg Nesterov
Save the current_thread_info()->status of X86 in the new restart_block->arch_data field so TS_COMPAT_RESTART can be removed again. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210201174716.GA17898@redhat.com
2021-03-16x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()Oleg Nesterov
The comment in get_nr_restart_syscall() says: * The problem is that we can get here when ptrace pokes * syscall-like values into regs even if we're not in a syscall * at all. Yes, but if not in a syscall then the status & (TS_COMPAT|TS_I386_REGS_POKED) check below can't really help: - TS_COMPAT can't be set - TS_I386_REGS_POKED is only set if regs->orig_ax was changed by 32bit debugger; and even in this case get_nr_restart_syscall() is only correct if the tracee is 32bit too. Suppose that a 64bit debugger plays with a 32bit tracee and * Tracee calls sleep(2) // TS_COMPAT is set * User interrupts the tracee by CTRL-C after 1 sec and does "(gdb) call func()" * gdb saves the regs by PTRACE_GETREGS * does PTRACE_SETREGS to set %rip='func' and %orig_rax=-1 * PTRACE_CONT // TS_COMPAT is cleared * func() hits int3. * Debugger catches SIGTRAP. * Restore original regs by PTRACE_SETREGS. * PTRACE_CONT get_nr_restart_syscall() wrongly returns __NR_restart_syscall==219, the tracee calls ia32_sys_call_table[219] == sys_madvise. Add the sticky TS_COMPAT_RESTART flag which survives after return to user mode. It's going to be removed in the next step again by storing the information in the restart block. As a further cleanup it might be possible to remove also TS_I386_REGS_POKED with that. Test-case: $ cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests $ gcc -o erestartsys-trap-debuggee ptrace-tests/tests/erestartsys-trap-debuggee.c --m32 $ gcc -o erestartsys-trap-debugger ptrace-tests/tests/erestartsys-trap-debugger.c -lutil $ ./erestartsys-trap-debugger Unexpected: retval 1, errno 22 erestartsys-trap-debugger: ptrace-tests/tests/erestartsys-trap-debugger.c:421 Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174709.GA17895@redhat.com
2021-03-16x86: Move TS_COMPAT back to asm/thread_info.hOleg Nesterov
Move TS_COMPAT back to asm/thread_info.h, close to TS_I386_REGS_POKED. It was moved to asm/processor.h by b9d989c7218a ("x86/asm: Move the thread_info::status field to thread_struct"), then later 37a8f7c38339 ("x86/asm: Move 'status' from thread_struct to thread_info") moved the 'status' field back but TS_COMPAT was forgotten. Preparatory patch to fix the COMPAT case for get_nr_restart_syscall() Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174649.GA17880@redhat.com
2021-03-16kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()Oleg Nesterov
Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT. Add a new helper which sets restart_block->fn and calls a dummy arch_set_restart_data() helper. Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
2021-03-16perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENTKan Liang
On a Haswell machine, the perf_fuzzer managed to trigger this message: [117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20) [117248.089957] Call Trace: [117248.092685] intel_pmu_pebs_enable_all+0x31/0x40 [117248.097737] intel_pmu_enable_all+0xa/0x10 [117248.102210] __perf_event_task_sched_in+0x2df/0x2f0 [117248.107511] finish_task_switch.isra.0+0x15f/0x280 [117248.112765] schedule_tail+0xc/0x40 [117248.116562] ret_from_fork+0x8/0x30 A fake event called VLBR_EVENT may use the bit 58 of the PEBS_ENABLE, if the precise_ip is set. The bit 58 is reserved by the HW. Accessing the bit causes the unchecked MSR access error. The fake event doesn't support PEBS. The case should be rejected. Fixes: 097e4311cda9 ("perf/x86: Add constraint to create guest LBR event without hw counter") Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-2-git-send-email-kan.liang@linux.intel.com
2021-03-16perf/x86/intel: Fix a crash caused by zero PEBS statusKan Liang
A repeatable crash can be triggered by the perf_fuzzer on some Haswell system. https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/ For some old CPUs (HSW and earlier), the PEBS status in a PEBS record may be mistakenly set to 0. To minimize the impact of the defect, the commit was introduced to try to avoid dropping the PEBS record for some cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates the local pebs_status accordingly. However, it doesn't correct the PEBS status in the PEBS record, which may trigger the crash, especially for the large PEBS. It's possible that all the PEBS records in a large PEBS have the PEBS status 0. If so, the first get_next_pebs_record_by_bit() in the __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large PEBS, the 'count' parameter must > 1. The second get_next_pebs_record_by_bit() will crash. Besides the local pebs_status, correct the PEBS status in the PEBS record as well. Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event") Reported-by: Vince Weaver <vincent.weaver@maine.edu> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
2021-03-16wireless/nl80211: fix wdev_id may be used uninitializedJarod Wilson
Build currently fails with -Werror=maybe-uninitialized set: net/wireless/nl80211.c: In function '__cfg80211_wdev_from_attrs': net/wireless/nl80211.c:124:44: error: 'wdev_id' may be used uninitialized in this function [-Werror=maybe-uninitialized] Easy fix is to just initialize wdev_id to 0, since it's value doesn't otherwise matter unless have_wdev_id is true. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") CC: Johannes Berg <johannes@sipsolutions.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jakub Kicinski <kuba@kernel.org> CC: linux-wireless@vger.kernel.org CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Link: https://lore.kernel.org/r/20210312163651.1398207-1-jarod@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-03-16mac80211: choose first enabled channel for monitorKarthikeyan Kathirvel
Even if the first channel from sband channel list is invalid or disabled mac80211 ends up choosing it as the default channel for monitor interfaces, making them not usable. Fix this by assigning the first available valid or enabled channel instead. Signed-off-by: Karthikeyan Kathirvel <kathirve@codeaurora.org> Link: https://lore.kernel.org/r/1615440547-7661-1-git-send-email-kathirve@codeaurora.org [reword commit message, comment, code cleanups] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-03-16nl80211: fix locking for wireless device netns changeJohannes Berg
We have all the network interfaces marked as netns-local since the only reasonable thing to do right now is to set a whole device, including all netdevs, into a different network namespace. For this reason, we also have our own way of changing the network namespace. Unfortunately, the RTNL locking changes broke this, and it now results in many RTNL assertions. The trivial fix for those (just hold RTNL for the changes) however leads to deadlocks in the cfg80211 netdev notifier. Since we only need the wiphy, and that's still protected by the RTNL, add a new NL80211_FLAG_NO_WIPHY_MTX flag to the nl80211 ops and use it to _not_ take the wiphy mutex but only the RTNL. This way, the notifier does all the work necessary during unregistration/registration of the netdevs from the old and in the new namespace. Reported-by: Sid Hayn <sidhayn@gmail.com> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210310215839.eadf7c43781b.I5fc6cf6676f800ab8008e03bbea9c3349b02d804@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-03-16mac80211: Check crypto_aead_encrypt for errorsDaniel Phan
crypto_aead_encrypt returns <0 on error, so if these calls are not checked, execution may continue with failed encrypts. It also seems that these two crypto_aead_encrypt calls are the only instances in the codebase that are not checked for errors. Signed-off-by: Daniel Phan <daniel.phan36@gmail.com> Link: https://lore.kernel.org/r/20210309204137.823268-1-daniel.phan36@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>