summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-08-25ravb: Remove the macros NUM_TX_DESC_GEN[23]Biju Das
For addressing 4 bytes alignment restriction on transmission buffer for R-Car Gen2 we use 2 descriptors whereas it is a single descriptor for other cases. Replace the macros NUM_TX_DESC_GEN[23] with magic number and add a comment to explain it. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net/sched: ets: fix crash when flipping from 'strict' to 'quantum'Davide Caratti
While running kselftests, Hangbin observed that sch_ets.sh often crashes, and splats like the following one are seen in the output of 'dmesg': BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 159f12067 P4D 159f12067 PUD 159f13067 PMD 0 Oops: 0000 [#1] SMP NOPTI CPU: 2 PID: 921 Comm: tc Not tainted 5.14.0-rc6+ #458 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: 0010:__list_del_entry_valid+0x2d/0x50 Code: 48 8b 57 08 48 b9 00 01 00 00 00 00 ad de 48 39 c8 0f 84 ac 6e 5b 00 48 b9 22 01 00 00 00 00 ad de 48 39 ca 0f 84 cf 6e 5b 00 <48> 8b 32 48 39 fe 0f 85 af 6e 5b 00 48 8b 50 08 48 39 f2 0f 85 94 RSP: 0018:ffffb2da005c3890 EFLAGS: 00010217 RAX: 0000000000000000 RBX: ffff9073ba23f800 RCX: dead000000000122 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff9073ba23fbc8 RBP: ffff9073ba23f890 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: dead000000000100 R13: ffff9073ba23fb00 R14: 0000000000000002 R15: 0000000000000002 FS: 00007f93e5564e40(0000) GS:ffff9073bba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000014ad34000 CR4: 0000000000350ee0 Call Trace: ets_qdisc_reset+0x6e/0x100 [sch_ets] qdisc_reset+0x49/0x1d0 tbf_reset+0x15/0x60 [sch_tbf] qdisc_reset+0x49/0x1d0 dev_reset_queue.constprop.42+0x2f/0x90 dev_deactivate_many+0x1d3/0x3d0 dev_deactivate+0x56/0x90 qdisc_graft+0x47e/0x5a0 tc_get_qdisc+0x1db/0x3e0 rtnetlink_rcv_msg+0x164/0x4c0 netlink_rcv_skb+0x50/0x100 netlink_unicast+0x1a5/0x280 netlink_sendmsg+0x242/0x480 sock_sendmsg+0x5b/0x60 ____sys_sendmsg+0x1f2/0x260 ___sys_sendmsg+0x7c/0xc0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x3a/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f93e44b8338 Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 43 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55 RSP: 002b:00007ffc0db737a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000061255c06 RCX: 00007f93e44b8338 RDX: 0000000000000000 RSI: 00007ffc0db73810 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000687880 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common joydev i2c_i801 pcspkr i2c_smbus lpc_ich virtio_balloon ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata serio_raw virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 When the change() function decreases the value of 'nstrict', we must take into account that packets might be already enqueued on a class that flips from 'strict' to 'quantum': otherwise that class will not be added to the bandwidth-sharing list. Then, a call to ets_qdisc_reset() will attempt to do list_del(&alist) with 'alist' filled with zero, hence the NULL pointer dereference. For classes flipping from 'strict' to 'quantum', initialize an empty list and eventually add it to the bandwidth-sharing list, if there are packets already enqueued. In this way, the kernel will: a) prevent crashing as described above. b) avoid retaining the backlog packets (for an arbitrarily long time) in case no packet is enqueued after a change from 'strict' to 'quantum'. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge branch 'dsa-sja1105-vlan-tags'David S. Miller
Vladimir Oltean says: ==================== Make sja1105 treat tag_8021q VLANs more like real DSA tags This series solves a nuisance with the sja1105 driver, which is that non-DSA tagged packets sent directly by the DSA master would still exit the switch just fine. We also had an issue for packets coming from the outside world with a crafted DSA tag, the switch would not reject that tag but think it was valid. ====================
2021-08-25net: dsa: tag_sja1105: stop asking the sja1105 driver in sja1105_xmit_tpidVladimir Oltean
Introduced in commit 38b5beeae7a4 ("net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously"), the sja1105_xmit_tpid function solved quite a different problem than our needs are now. Then, we used best-effort VLAN filtering and we were using the xmit_tpid to tunnel packets coming from an 8021q upper through the TX VLAN allocated by tag_8021q to that egress port. The need for a different VLAN protocol depending on switch revision came from the fact that this in itself was more of a hack to trick the hardware into accepting tunneled VLANs in the first place. Right now, we deny 8021q uppers (see sja1105_prechangeupper). Even if we supported them again, we would not do that using the same method of {tunneling the VLAN on egress, retagging the VLAN on ingress} that we had in the best-effort VLAN filtering mode. It seems rather simpler that we just allocate a VLAN in the VLAN table that is simply not used by the bridge at all, or by any other port. Anyway, I have 2 gripes with the current sja1105_xmit_tpid: 1. When sending packets on behalf of a VLAN-aware bridge (with the new TX forwarding offload framework) plus untagged (with the tag_8021q VLAN added by the tagger) packets, we can see that on SJA1105P/Q/R/S and later (which have a qinq_tpid of ETH_P_8021AD), some packets sent through the DSA master have a VLAN protocol of 0x8100 and others of 0x88a8. This is strange and there is no reason for it now. If we have a bridge and are therefore forced to send using that bridge's TPID, we can as well blend with that bridge's VLAN protocol for all packets. 2. The sja1105_xmit_tpid introduces a dependency on the sja1105 driver, because it looks inside dp->priv. It is desirable to keep as much separation between taggers and switch drivers as possible. Now it doesn't do that anymore. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: sja1105: drop untagged packets on the CPU and DSA portsVladimir Oltean
The sja1105 driver is a bit special in its use of VLAN headers as DSA tags. This is because in VLAN-aware mode, the VLAN headers use an actual TPID of 0x8100, which is understood even by the DSA master as an actual VLAN header. Furthermore, control packets such as PTP and STP are transmitted with no VLAN header as a DSA tag, because, depending on switch generation, there are ways to steer these control packets towards a precise egress port other than VLAN tags. Transmitting control packets as untagged means leaving a door open for traffic in general to be transmitted as untagged from the DSA master, and for it to traverse the switch and exit a random switch port according to the FDB lookup. This behavior is a bit out of line with other DSA drivers which have native support for DSA tagging. There, it is to be expected that the switch only accepts DSA-tagged packets on its CPU port, dropping everything that does not match this pattern. We perhaps rely a bit too much on the switches' hardware dropping on the CPU port, and place no other restrictions in the kernel data path to avoid that. For example, sja1105 is also a bit special in that STP/PTP packets are transmitted using "management routes" (sja1105_port_deferred_xmit): when sending a link-local packet from the CPU, we must first write a SPI message to the switch to tell it to expect a packet towards multicast MAC DA 01-80-c2-00-00-0e, and to route it towards port 3 when it gets it. This entry expires as soon as it matches a packet received by the switch, and it needs to be reinstalled for the next packet etc. All in all quite a ghetto mechanism, but it is all that the sja1105 switches offer for injecting a control packet. The driver takes a mutex for serializing control packets and making the pairs of SPI writes of a management route and its associated skb atomic, but to be honest, a mutex is only relevant as long as all parties agree to take it. With the DSA design, it is possible to open an AF_PACKET socket on the DSA master net device, and blast packets towards 01-80-c2-00-00-0e, and whatever locking the DSA switch driver might use, it all goes kaput because management routes installed by the driver will match skbs sent by the DSA master, and not skbs generated by the driver itself. So they will end up being routed on the wrong port. So through the lens of that, maybe it would make sense to avoid that from happening by doing something in the network stack, like: introduce a new bit in struct sk_buff, like xmit_from_dsa. Then, somewhere around dev_hard_start_xmit(), introduce the following check: if (netdev_uses_dsa(dev) && !skb->xmit_from_dsa) kfree_skb(skb); Ok, maybe that is a bit drastic, but that would at least prevent a bunch of problems. For example, right now, even though the majority of DSA switches drop packets without DSA tags sent by the DSA master (and therefore the majority of garbage that user space daemons like avahi and udhcpcd and friends create), it is still conceivable that an aggressive user space program can open an AF_PACKET socket and inject a spoofed DSA tag directly on the DSA master. We have no protection against that; the packet will be understood by the switch and be routed wherever user space says. Furthermore: there are some DSA switches where we even have register access over Ethernet, using DSA tags. So even user space drivers are possible in this way. This is a huge hole. However, the biggest thing that bothers me is that udhcpcd attempts to ask for an IP address on all interfaces by default, and with sja1105, it will attempt to get a valid IP address on both the DSA master as well as on sja1105 switch ports themselves. So with IP addresses in the same subnet on multiple interfaces, the routing table will be messed up and the system will be unusable for traffic until it is configured manually to not ask for an IP address on the DSA master itself. It turns out that it is possible to avoid that in the sja1105 driver, at least very superficially, by requesting the switch to drop VLAN-untagged packets on the CPU port. With the exception of control packets, all traffic originated from tag_sja1105.c is already VLAN-tagged, so only STP and PTP packets need to be converted. For that, we need to uphold the equivalence between an untagged and a pvid-tagged packet, and to remember that the CPU port of sja1105 uses a pvid of 4095. Now that we drop untagged traffic on the CPU port, non-aggressive user space applications like udhcpcd stop bothering us, and sja1105 effectively becomes just as vulnerable to the aggressive kind of user space programs as other DSA switches are (ok, users can also create 8021q uppers on top of the DSA master in the case of sja1105, but in future patches we can easily deny that, but it still doesn't change the fact that VLAN-tagged packets can still be injected over raw sockets). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: sja1105: prevent tag_8021q VLANs from being received on user portsVladimir Oltean
Currently it is possible for an attacker to craft packets with a fake DSA tag and send them to us, and our user ports will accept them and preserve that VLAN when transmitting towards the CPU. Then the tagger will be misled into thinking that the packets came on a different port than they really came on. Up until recently there wasn't a good option to prevent this from happening. In SJA1105P and later, the MAC Configuration Table introduced two options called: - DRPSITAG: Drop Single Inner Tagged Frames - DRPSOTAG: Drop Single Outer Tagged Frames Because the sja1105 driver classifies all VLANs as "outer VLANs" (S-Tags), it would be in principle possible to enable the DRPSOTAG bit on ports using tag_8021q, and drop on ingress all packets which have a VLAN tag. When the switch is VLAN-unaware, this works, because it uses a custom TPID of 0xdadb, so any "tagged" packets received on a user port are probably a spoofing attempt. But when the switch overall is VLAN-aware, and some ports are standalone (therefore they use tag_8021q), the TPID is 0x8100, and the port can receive a mix of untagged and VLAN-tagged packets. The untagged ones will be classified to the tag_8021q pvid, and the tagged ones to the VLAN ID from the packet header. Yes, it is true that since commit 4fbc08bd3665 ("net: dsa: sja1105: deny 8021q uppers on ports") we no longer support this mixed mode, but that is a temporary limitation which will eventually be lifted. It would be nice to not introduce one more restriction via DRPSOTAG, which would make the standalone ports of a VLAN-aware switch drop genuinely VLAN-tagged packets. Also, the DRPSOTAG bit is not available on the first generation of switches (SJA1105E, SJA1105T). So since one of the key features of this driver is compatibility across switch generations, this makes it an even less desirable approach. The breakthrough comes from commit bef0746cf4cc ("net: dsa: sja1105: make sure untagged packets are dropped on ingress ports with no pvid"), where it became obvious that untagged packets are not dropped even if the ingress port is not in the VMEMB_PORT vector of that port's pvid. However, VLAN-tagged packets are subject to VLAN ingress checking/dropping. This means that instead of using the catch-all DRPSOTAG bit introduced in SJA1105P, we can drop tagged packets on a per-VLAN basis, and this is already compatible with SJA1105E/T. This patch adds an "allowed_ingress" argument to sja1105_vlan_add(), and we call it with "false" for tag_8021q VLANs on user ports. The tag_8021q VLANs still need to be allowed, of course, on ingress to DSA ports and CPU ports. We also need to refine the drop_untagged check in sja1105_commit_pvid to make it not freak out about this new configuration. Currently it will try to keep the configuration consistent between untagged and pvid-tagged packets, so if the pvid of a port is 1 but VLAN 1 is not in VMEMB_PORT, packets tagged with VID 1 will behave the same as untagged packets, and be dropped. This behavior is what we want for ports under a VLAN-aware bridge, but for the ports with a tag_8021q pvid, we want untagged packets to be accepted, but packets tagged with a header recognized by the switch as a tag_8021q VLAN to be dropped. So only restrict the drop_untagged check to apply to the bridge_pvid, not to the tag_8021q_pvid. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: mt7530: manually set up VLAN ID 0DENG Qingfang
The driver was relying on dsa_slave_vlan_rx_add_vid to add VLAN ID 0. After the blamed commit, VLAN ID 0 won't be set up anymore, breaking software bridging fallback on VLAN-unaware bridges. Manually set up VLAN ID 0 to fix this. Fixes: 06cfb2df7eb0 ("net: dsa: don't advertise 'rx-vlan-filter' when not needed") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25qede: Fix memset corruptionShai Malin
Thanks to Kees Cook who detected the problem of memset that starting from not the first member, but sized for the whole struct. The better change will be to remove the redundant memset and to clear only the msix_cnt member. Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge branch 'mana-EQ-sharing'David S. Miller
Haiyang Zhang says: ==================== net: mana: Add support for EQ sharing The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed the device limit. Support EQ sharing, so that multiple vPorts can share the same set of MSIXs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: mana: Add WARN_ON_ONCE in case of CQE read overflowHaiyang Zhang
This is not an expected case normally. Add WARN_ON_ONCE in case of CQE read overflow, instead of failing silently. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: mana: Add support for EQ sharingHaiyang Zhang
The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed the device limit. Support EQ sharing, so that multiple vPorts (NICs) can share the same set of MSIXs. And, report the EQ-sharing capability bit to the host, which means the host can potentially offer more vPorts and queues to the VM. Also update the resource limit checking and error handling for better robustness. Now, we support up to 256 virtual ports per VF (it was 16/VF), and support up to 64 queues per vPort (it was 16). Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: mana: Move NAPI from EQ to CQHaiyang Zhang
The existing code has NAPI threads polling on EQ directly. To prepare for EQ sharing among vPorts, move NAPI from EQ to CQ so that one EQ can serve multiple CQs from different vPorts. The "arm bit" is only set when CQ processing is completed to reduce the number of EQ entries, which in turn reduce the number of interrupts on EQ. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25netxen_nic: Remove the repeated declarationShaokun Zhang
Function 'netxen_rom_fast_read' is declared twice, so remove the repeated declaration. Cc: Manish Chopra <manishc@marvell.com> Cc: Rahul Verma <rahulv@marvell.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25cxgb4: Properly revert VPD changesNathan Chancellor
Clang warns: drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'kw_offset' is uninitialized when used here [-Werror,-Wuninitialized] FIND_VPD_KW(i, "RV"); ^~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:39: note: expanded from macro 'FIND_VPD_KW' var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \ ^~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:34: note: initialize the variable 'kw_offset' to silence this warning unsigned int vpdr_len, kw_offset, id_len; ^ = 0 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'vpdr_len' is uninitialized when used here [-Werror,-Wuninitialized] FIND_VPD_KW(i, "RV"); ^~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:50: note: expanded from macro 'FIND_VPD_KW' var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \ ^~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:23: note: initialize the variable 'vpdr_len' to silence this warning unsigned int vpdr_len, kw_offset, id_len; ^ = 0 2 errors generated. The series "PCI/VPD: Convert more users to the new VPD API functions" was applied to net-next when it should have been applied to the PCI tree because of build errors. However, commit 82e34c8a9bdf ("Revert "Revert "cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()""") reapplied a change, resulting in the warning above. Properly revert commit 8d63ee602da3 ("cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()") to fix the warning and restore proper functionality. This also reverts commit 3a93bedea050 ("cxgb4: Remove unused vpd_param member ec") to avoid future merge conflicts, as that change has been applied to the PCI tree. Link: https://lore.kernel.org/r/20210823120929.7c6f7a4f@canb.auug.org.au/ Link: https://lore.kernel.org/r/1ca29408-7bc7-4da5-59c7-87893c9e0442@gmail.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge branch 'mptcp-next'David S. Miller
Mat Martineau says: ==================== mptcp: Optimize output options and add MP_FAIL This patch set contains two groups of changes that we've been testing in the MPTCP tree. The first optimizes the code path and data structure for populating MPTCP option headers when transmitting. Patch 1 reorganizes code to reduce the number of conditionals that need to be evaluated in common cases. Patch 2 rearranges struct mptcp_out_options to save 80 bytes (on x86_64). The next five patches add partial support for the MP_FAIL option as defined in RFC 8684. MP_FAIL is an option header used to cleanly handle MPTCP checksum failures. When the MPTCP checksum detects an error in the MPTCP DSS header or the data mapped by that header, the receiver uses a TCP RST with MP_FAIL to close the subflow that experienced the error and provide associated MPTCP sequence number information to the peer. RFC 8684 also describes how a single-subflow connection can discard corrupt data and remain connected under certain conditions using MP_FAIL, but that feature is not implemented here. Patches 3-5 implement MP_FAIL transmit and receive, and integrates with checksum validation. Patches 6 & 7 add MP_FAIL selftests and the MIBs required for those tests. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25selftests: mptcp: add MP_FAIL mibs checkGeliang Tang
This patch added a function chk_fail_nr to check the mibs for MP_FAIL. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: add the mibs for MP_FAILGeliang Tang
This patch added the mibs for MP_FAIL: MPTCP_MIB_MPFAILTX and MPTCP_MIB_MPFAILRX. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: send out MP_FAIL when data checksum failsGeliang Tang
When a bad checksum is detected, set the send_mp_fail flag to send out the MP_FAIL option. Add a new function mptcp_has_another_subflow() to check whether there's only a single subflow. When multiple subflows are in use, close the affected subflow with a RST that includes an MP_FAIL option and discard the data with the bad checksum. Set the sk_state of the subsocket to TCP_CLOSE, then the flag MPTCP_WORK_CLOSE_SUBFLOW will be set in subflow_sched_work_if_closed, and the subflow will be closed. When a single subfow is in use, temporarily handled by sending MP_FAIL with a RST too. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: MP_FAIL suboption receivingGeliang Tang
This patch added handling for receiving MP_FAIL suboption. Add a new members mp_fail and fail_seq in struct mptcp_options_received. When MP_FAIL suboption is received, set mp_fail to 1 and save the sequence number to fail_seq. Then invoke mptcp_pm_mp_fail_received to deal with the MP_FAIL suboption. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: MP_FAIL suboption sendingGeliang Tang
This patch added the MP_FAIL suboption sending support. Add a new flag named send_mp_fail in struct mptcp_subflow_context. If this flag is set, send out MP_FAIL suboption. Add a new member fail_seq in struct mptcp_out_options to save the data sequence number to put into the MP_FAIL suboption. An MP_FAIL option could be included in a RST or on the subflow-level ACK. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: shrink mptcp_out_options structPaolo Abeni
After the previous patch we can alias with a union several fields in mptcp_out_options. Such struct is stack allocated and memset() for each plain TCP out packet. Every saved byted counts. Before: pahole -EC mptcp_out_options # ... /* size: 136, cachelines: 3, members: 17 */ After: pahole -EC mptcp_out_options # ... /* size: 56, cachelines: 1, members: 9 */ Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: optimize out option generationPaolo Abeni
Currently we have several protocol constraints on MPTCP options generation (e.g. MPC and MPJ subopt are mutually exclusive) and some additional ones required by our implementation (e.g. almost all ADD_ADDR variant are mutually exclusive with everything else). We can leverage the above to optimize the out option generation: we check DSS/MPC/MPJ presence in a mutually exclusive way, avoiding many unneeded conditionals in the common cases. Additionally extend the existing constraints on ADD_ADDR opt on all subvariants, so that it becomes fully mutually exclusive with the above and we can skip another conditional statement for the common case. This change is also needed by the next patch. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdpSong Yoong Siang
Ensure a valid XSK buffer before proceed to free the xdp buffer. The following kernel panic is observed without this patch: RIP: 0010:xp_free+0x5/0x40 Call Trace: stmmac_napi_poll_rxtx+0x332/0xb30 [stmmac] ? stmmac_tx_timer+0x3c/0xb0 [stmmac] net_rx_action+0x13d/0x3d0 __do_softirq+0xfc/0x2fb ? smpboot_register_percpu_thread+0xe0/0xe0 run_ksoftirqd+0x32/0x70 smpboot_thread_fn+0x1d8/0x2c0 kthread+0x169/0x1a0 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 ---[ end trace 0000000000000002 ]--- Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Cc: <stable@vger.kernel.org> # 5.13.x Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: fix kernel panic due to NULL pointer dereference of xsk_poolSong Yoong Siang
After free xsk_pool, there is possibility that napi polling is still running in the middle, thus causes a kernel crash due to kernel NULL pointer dereference of rx_q->xsk_pool and tx_q->xsk_pool. Fix this by changing the XDP pool setup sequence to: 1. disable napi before free xsk_pool 2. enable napi after init xsk_pool The following kernel panic is observed without this patch: RIP: 0010:xsk_uses_need_wakeup+0x5/0x10 Call Trace: stmmac_napi_poll_rxtx+0x3a9/0xae0 [stmmac] __napi_poll+0x27/0x130 net_rx_action+0x233/0x280 __do_softirq+0xe2/0x2b6 run_ksoftirqd+0x1a/0x20 smpboot_thread_fn+0xac/0x140 ? sort_range+0x20/0x20 kthread+0x124/0x150 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 ---[ end trace a77c8956b79ac107 ]--- Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Cc: <stable@vger.kernel.org> # 5.13.x Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25x86/build: Move the install rule to arch/x86/MakefileMasahiro Yamada
Currently, the install target in arch/x86/Makefile descends into arch/x86/boot/Makefile to invoke the shell script, but there is no good reason to do so. arch/x86/Makefile can run the shell script directly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210729140023.442101-2-masahiroy@kernel.org
2021-08-25Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 1GbE Intel Wired LAN Driver Updates 2021-08-24 Vinicius Costa Gomes says: This adds support for PCIe PTM (Precision Time Measurement) to the igc driver. PCIe PTM allows the NIC and Host clocks to be compared more precisely, improving the clock synchronization accuracy. Patch 1/4 reverts a commit that made pci_enable_ptm() private to the PCI subsystem, reverting makes it possible for it to be called from the drivers. Patch 2/4 adds the pcie_ptm_enabled() helper. Patch 3/4 calls pci_enable_ptm() from the igc driver. Patch 4/4 implements the PCIe PTM support. Exposing it via the .getcrosststamp() API implies that the time measurements are made synchronously with the ioctl(). The hardware was implemented so the most convenient way to retrieve that information would be asynchronously. So, to follow the expectations of the ioctl() we have to use less convenient ways, triggering an PCIe PTM dialog every time a ioctl() is received. Some questions are raised (also pointed out in the commit message): 1. Using convert_art_ns_to_tsc() is too x86 specific, there should be a common way to create a 'system_counterval_t' from a timestamp. 2. convert_art_ns_to_tsc() says that it should only be used when X86_FEATURE_TSC_KNOWN_FREQ is true, but during tests it works even when it returns false. Should that check be done? ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge branch 'lan7800-improvements'David S. Miller
John Efstathiades says: ==================== LAN7800 driver improvements This patch set introduces a number of improvements and fixes for problems found during testing of a modification to add a NAPI-style approach to packet handling to improve performance. NOTE: the NAPI changes are not part of this patch set and the issues fixed by this patch set are not coupled to the NAPI changes. Patch 1 fixes white space and style issues Patch 2 removes an unused timer Patch 3 introduces macros to set the internal packet FIFO flow control levels, which makes it easier to update the levels in future. Patch 4 removes an unused queue Patch 5 (updated for v2) introduces function return value checks and error propagation to various parts of the driver where a return code was captured but then ignored. This patch is completely different to patch 5 in version 1 of this patch set. The changes in the v1 patch 5 are being set aside for the time being. Patch 6 updates the LAN7800 MAC reset code to ensure there is no PHY register access in progress when the MAC is reset. This change prevents a kernel exception that can otherwise occur. Patch 7 fixes problems with system suspend and resume handling while the device is transmitting and receiving data. Patch 8 fixes problems with auto-suspend and resume handling and depends on changes introduced by patch 7. Patch 9 fixes problems with device disconnect handling that can result in kernel exceptions and/or hang. Patch 10 limits the rate at which driver warning messages are emitted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Limit number of driver warning messagesJohn Efstathiades
Device removal can result in a large burst of driver warning messages (20 - 30) sent to the kernel log. Most of these are register read/write failures. This change limits the rate at which these messages are emitted. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix race condition in disconnect handlingJohn Efstathiades
If there is a device disconnect at roughly the same time as a deferred PHY link reset there is a race condition that can result in a kernel lock up due to a null pointer dereference in the driver's deferred work handling routine lan78xx_delayedwork(). The following changes fix this problem. Add new status flag EVENT_DEV_DISCONNECT to indicate when the device has been removed and use it to prevent operations, such as register access, that will fail once the device is removed. Stop processing of deferred work items when the driver's USB disconnect handler is invoked. Disconnect the PHY only after the network device has been unregistered and all delayed work has been cancelled. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix race conditions in suspend/resume handlingJohn Efstathiades
If the interface is given an IP address while the device is suspended (as a result of an auto-suspend event) there is a race between lan78xx_resume() and lan78xx_open() that can result in an exception or failure to handle incoming packets. The following changes fix this problem. Introduce a mutex to serialise operations in the network interface open and stop entry points with respect to the USB driver suspend and resume entry points. Move Tx and Rx data path start/stop to lan78xx_start() and lan78xx_stop() respectively and flush the packet FIFOs before starting the Tx and Rx data paths. This prevents the MAC and FIFOs getting out of step and delivery of malformed packets to the network stack. Stop processing of received packets before disconnecting the PHY from the MAC to prevent a kernel exception caused by handling packets after the PHY device has been removed. Refactor device auto-suspend code to make it consistent with the the system suspend code and make the suspend handler easier to read. Add new code to stop wake-on-lan packets or PHY events resuming the host or device from suspend if the device has not been opened (typically after an IP address is assigned). This patch is dependent on changes to lan78xx_suspend() and lan78xx_resume() introduced in the previous patch of this patch set. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix partial packet errors on suspend/resumeJohn Efstathiades
The MAC can get out of step with the internal packet FIFOs if the system goes to sleep when the link is active, especially at high data rates. This can result in partial frames in the packet FIFOs that in result in malformed frames being delivered to the host. This occurs because the driver does not enable/disable the internal packet FIFOs in step with the corresponding MAC data path. The following changes fix this problem. Update code that enables/disables the MAC receiver and transmitter to the more general Rx and Tx data path, where the data path in each direction consists of both the MAC function (Tx or Rx) and the corresponding packet FIFO. In the receive path the packet FIFO must be enabled before the MAC receiver but disabled after the MAC receiver. In the transmit path the opposite is true: the packet FIFO must be enabled after the MAC transmitter but disabled before the MAC transmitter. The packet FIFOs can be flushed safely once the corresponding data path is stopped. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix exception on link speed changeJohn Efstathiades
An exception is sometimes seen when the link speed is changed from auto-negotiation to a fixed speed, or vice versa. The exception occurs when the MAC is reset (due to the link speed change) at the same time as the PHY state machine is accessing a PHY register. The following changes fix this problem. Rework the MAC reset to ensure there is no outstanding MDIO register transaction before the reset and then wait until the reset is complete before allowing any further MAC register access. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Add missing return code checksJohn Efstathiades
There are many places in the driver where the return code from a function call is captured but without a subsequent test of the return code and appropriate action taken. This patch adds the missing return code tests and action. In most cases the action is an early exit from the calling function. The function lan78xx_set_suspend() was also updated to make it consistent with lan78xx_suspend(). Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Remove unused pause frame queueJohn Efstathiades
Remove the pause frame queue from the driver. It is initialised but not actually used. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Set flow control threshold to prevent packet lossJohn Efstathiades
Set threshold at which flow control is triggered to 3/4 full of the internal Rx packet FIFO to prevent packet drops at high data rates. The new setting reduces the number of dropped UDP frames and TCP retransmit requests especially on less capable CPUs. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Remove unused timerJohn Efstathiades
Remove kernel timer that is not used by the driver. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix white space and style issuesJohn Efstathiades
Fix white space and code style issues identified by checkpatch. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge series "ASoC: mediatek: Add support for MT8195 SoC" from Trevor Wu ↵Mark Brown
<trevor.wu@mediatek.com>: This series of patches adds support for Mediatek AFE of MT8195 SoC. Patches are based on broonie tree "for-next" branch. Changes since v4: - removed sof related code Changes since v3: - fixed warnings found by kernel test robot - removed unused critical section - corrected the lock protected sections on etdm driver - added DPTX and HDMITX audio support Changes since v2: - added audio clock gate control - added 'mediatek' prefix to private dts properties - added consumed clocks to dt-bindins and adopted suggestions from Rob - refined clock usage and remove unused clock and control code - fixed typos Changes since v1: - fixed some problems related to dt-bindings - added some missing properties to dt-bindings - added depency declaration on dt-bindings - fixed some warnings found by kernel test robot Trevor Wu (11): ASoC: mediatek: mt8195: update mediatek common driver ASoC: mediatek: mt8195: support audsys clock control ASoC: mediatek: mt8195: support etdm in platform driver ASoC: mediatek: mt8195: support adda in platform driver ASoC: mediatek: mt8195: support pcm in platform driver ASoC: mediatek: mt8195: add platform driver dt-bindings: mediatek: mt8195: add audio afe document ASoC: mediatek: mt8195: add machine driver with mt6359, rt1019 and rt5682 ASoC: mediatek: mt8195: add DPTX audio support ASoC: mediatek: mt8195: add HDMITX audio support dt-bindings: mediatek: mt8195: add mt8195-mt6359-rt1019-rt5682 document .../bindings/sound/mt8195-afe-pcm.yaml | 184 + .../sound/mt8195-mt6359-rt1019-rt5682.yaml | 47 + sound/soc/mediatek/Kconfig | 24 + sound/soc/mediatek/Makefile | 1 + sound/soc/mediatek/common/mtk-afe-fe-dai.c | 22 +- sound/soc/mediatek/common/mtk-base-afe.h | 10 +- sound/soc/mediatek/mt8195/Makefile | 15 + sound/soc/mediatek/mt8195/mt8195-afe-clk.c | 441 +++ sound/soc/mediatek/mt8195/mt8195-afe-clk.h | 109 + sound/soc/mediatek/mt8195/mt8195-afe-common.h | 158 + sound/soc/mediatek/mt8195/mt8195-afe-pcm.c | 3281 +++++++++++++++++ sound/soc/mediatek/mt8195/mt8195-audsys-clk.c | 214 ++ sound/soc/mediatek/mt8195/mt8195-audsys-clk.h | 15 + .../soc/mediatek/mt8195/mt8195-audsys-clkid.h | 93 + sound/soc/mediatek/mt8195/mt8195-dai-adda.c | 830 +++++ sound/soc/mediatek/mt8195/mt8195-dai-etdm.c | 2639 +++++++++++++ sound/soc/mediatek/mt8195/mt8195-dai-pcm.c | 389 ++ .../mt8195/mt8195-mt6359-rt1019-rt5682.c | 1087 ++++++ sound/soc/mediatek/mt8195/mt8195-reg.h | 2796 ++++++++++++++ 19 files changed, 12350 insertions(+), 5 deletions(-) create mode 100644 Documentation/devicetree/bindings/sound/mt8195-afe-pcm.yaml create mode 100644 Documentation/devicetree/bindings/sound/mt8195-mt6359-rt1019-rt5682.yaml create mode 100644 sound/soc/mediatek/mt8195/Makefile create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-clk.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-clk.h create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-common.h create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-pcm.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clk.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clk.h create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clkid.h create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-adda.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-etdm.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-pcm.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c create mode 100644 sound/soc/mediatek/mt8195/mt8195-reg.h -- 2.18.0
2021-08-25x86/build: Remove the left-over bzlilo targetMasahiro Yamada
Commit f279b49f13bd ("x86/boot: Modernize genimage script; hdimage+EFI support") removed the bzlilo target from arch/x86/boot/Makefile. Remove the left-over from arch/x86/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210729140023.442101-1-masahiroy@kernel.org
2021-08-25Merge branch 'xen-harden-netfront'David S. Miller
Juergen Gross says: ==================== xen: harden netfront against malicious backends Xen backends of para-virtualized devices can live in dom0 kernel, dom0 user land, or in a driver domain. This means that a backend might reside in a less trusted environment than the Xen core components, so a backend should not be able to do harm to a Xen guest (it can still mess up I/O data, but it shouldn't be able to e.g. crash a guest by other means or cause a privilege escalation in the guest). Unfortunately netfront in the Linux kernel is fully trusting its backend. This series is fixing netfront in this regard. It was discussed to handle this as a security problem, but the topic was discussed in public before, so it isn't a real secret. It should be mentioned that a similar series has been posted some years ago by Marek Marczykowski-Górecki, but this series has not been applied due to a Xen header not having been available in the Xen git repo at that time. Additionally my series is fixing some more DoS cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: don't trust the backend response data blindlyJuergen Gross
Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page. Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: disentangle tx_skb_freelistJuergen Gross
The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request. Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer. Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list. When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it. Remove skb_entry_set_link() and open code it instead as it is really trivial now. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: don't read data from request on the ring pageJuergen Gross
In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: read response from backend only onceJuergen Gross
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: macb: Add a NULL check on desc_ptpHarini Katakam
macb_ptp_desc will not return NULL under most circumstances with correct Kconfig and IP design config register. But for the sake of the extreme corner case, check for NULL when using the helper. In case of rx_tstamp, no action is necessary except to return (similar to timestamp disabled) and warn. In case of TX, return -EINVAL to let the skb be free. Perform this check before marking skb in progress. Fixes coverity warning: (4) Event dereference: Dereferencing a null pointer "desc_ptp" Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25qed: Enable automatic recovery on error condition.Alok Prasad
This patch enables automatic recovery by default in case of various error condition like fw assert , hardware error etc. This also ensure driver can handle multiple iteration of assertion conditions. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warningsMichael Riesch
This reverts commit 2c896fb02e7f65299646f295a007bda043e0f382 "net: stmmac: dwmac-rk: add pd_gmac support for rk3399" and fixes unbalanced pm_runtime_enable warnings. In the commit to be reverted, support for power management was introduced to the Rockchip glue code. Later, power management support was introduced to the stmmac core code, resulting in multiple invocations of pm_runtime_{enable,disable,get_sync,put_sync}. The multiple invocations happen in rk_gmac_powerup and stmmac_{dvr_probe, resume} as well as in rk_gmac_powerdown and stmmac_{dvr_remove, suspend}, respectively, which are always called in conjunction. Fixes: 5ec55823438e850c91c6b92aec93fb04ebde29e2 ("net: stmmac: add clocks management for gmac driver") Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net-next: When a bond have a massive amount of VLANs with IPv6 addresses, ↵Gilad Naaman
performance of changing link state, attaching a VRF, changing an IPv6 address, etc. go down dramtically. The source of most of the slow down is the `dev_addr_lists.c` module, which mainatins a linked list of HW addresses. When using IPv6, this list grows for each IPv6 address added on a VLAN, since each IPv6 address has a multicast HW address associated with it. When performing any modification to the involved links, this list is traversed many times, often for nothing, all while holding the RTNL lock. Instead, this patch adds an auxilliary rbtree which cuts down traversal time significantly. Performance can be seen with the following script: #!/bin/bash ip netns del test || true 2>/dev/null ip netns add test echo 1 | ip netns exec test tee /proc/sys/net/ipv6/conf/all/keep_addr_on_down > /dev/null set -e ip -n test link add foo type veth peer name bar ip -n test link add b1 type bond ip -n test link add florp type vrf table 10 ip -n test link set bar master b1 ip -n test link set foo up ip -n test link set bar up ip -n test link set b1 up ip -n test link set florp up VLAN_COUNT=1500 BASE_DEV=b1 echo Creating vlans ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link add link $BASE_DEV name foo.\$i type vlan id \$i; done" echo Bringing them up ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i up; done" echo Assiging IPv6 Addresses ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test address add dev foo.\$i 2000::\$i/64; done" echo Attaching to VRF ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i master florp; done" On an Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz machine, the performance before the patch is (truncated): Creating vlans real 108.35 Bringing them up real 4.96 Assiging IPv6 Addresses real 19.22 Attaching to VRF real 458.84 After the patch: Creating vlans real 5.59 Bringing them up real 5.07 Assiging IPv6 Addresses real 5.64 Attaching to VRF real 25.37 Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Lu Wei <luwei32@huawei.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Gilad Naaman <gnaaman@drivenets.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mmc: queue: Remove unused parameters(request_queue)ChanWoo Lee
In function mmc_exit_request, the request_queue structure(*q) is not used. I remove the unnecessary code related to the request_queue structure. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20210825074601.8881-1-cw9316.lee@samsung.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-25mmc: pwrseq: sd8787: fix compilation warningClaudiu Beznea
Fixed compilation warning "cast from pointer to integer of different size [-Wpointer-to-int-cast]" Fixes: b2832b96fcf5 ("mmc: pwrseq: sd8787: add support for wilc1000") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20210825081931.598934-1-claudiu.beznea@microchip.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>