Age  Commit message  Author
2021-08-25mptcp: MP_FAIL suboption sendingGeliang Tang
This patch added the MP_FAIL suboption sending support. Add a new flag named send_mp_fail in struct mptcp_subflow_context. If this flag is set, send out MP_FAIL suboption. Add a new member fail_seq in struct mptcp_out_options to save the data sequence number to put into the MP_FAIL suboption. An MP_FAIL option could be included in a RST or on the subflow-level ACK. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: shrink mptcp_out_options structPaolo Abeni
After the previous patch we can alias with a union several fields in mptcp_out_options. Such struct is stack allocated and memset() for each plain TCP out packet. Every saved byte counts.

Before:
  pahole -EC mptcp_out_options
  # ...
  /* size: 136, cachelines: 3, members: 17 */

After:
  pahole -EC mptcp_out_options
  # ...
  /* size: 56, cachelines: 1, members: 9 */

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
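The aliasing idea can be sketched in isolation. This is a minimal illustration, not the real mptcp_out_options layout: the struct names, the field grouping, and bytes_saved() are all hypothetical, chosen only to show how mutually exclusive field groups can share storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: every field has its own storage. */
struct opts_flat {
	uint16_t suboptions;
	uint64_t sndr_key;	/* used only while connecting */
	uint64_t rcvr_key;	/* used only while connecting */
	uint64_t data_seq;	/* used only while transferring data */
	uint64_t data_ack;	/* used only while transferring data */
};

/* Same fields, but the two groups never coexist, so they share storage. */
struct opts_union {
	uint16_t suboptions;
	union {
		struct {
			uint64_t sndr_key;
			uint64_t rcvr_key;
		};
		struct {
			uint64_t data_seq;
			uint64_t data_ack;
		};
	};
};

/* Bytes that no longer need to be cleared for every outgoing packet. */
static size_t bytes_saved(void)
{
	return sizeof(struct opts_flat) - sizeof(struct opts_union);
}
```

The anonymous struct and union members require C11 or the GNU dialect. The saving only holds as long as the code never needs fields from both groups for the same packet, which is the constraint the preceding patch in the series is described as establishing.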
2021-08-25mptcp: optimize out option generationPaolo Abeni
Currently we have several protocol constraints on MPTCP options generation (e.g. MPC and MPJ subopt are mutually exclusive) and some additional ones required by our implementation (e.g. almost all ADD_ADDR variants are mutually exclusive with everything else). We can leverage the above to optimize the out option generation: we check DSS/MPC/MPJ presence in a mutually exclusive way, avoiding many unneeded conditionals in the common cases. Additionally extend the existing constraints on ADD_ADDR opt on all subvariants, so that it becomes fully mutually exclusive with the above and we can skip another conditional statement for the common case. This change is also needed by the next patch. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdpSong Yoong Siang
Ensure a valid XSK buffer before proceeding to free the xdp buffer.

The following kernel panic is observed without this patch:

  RIP: 0010:xp_free+0x5/0x40
  Call Trace:
   stmmac_napi_poll_rxtx+0x332/0xb30 [stmmac]
   ? stmmac_tx_timer+0x3c/0xb0 [stmmac]
   net_rx_action+0x13d/0x3d0
   __do_softirq+0xfc/0x2fb
   ? smpboot_register_percpu_thread+0xe0/0xe0
   run_ksoftirqd+0x32/0x70
   smpboot_thread_fn+0x1d8/0x2c0
   kthread+0x169/0x1a0
   ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
  ---[ end trace 0000000000000002 ]---

Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: <stable@vger.kernel.org> # 5.13.x
Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: fix kernel panic due to NULL pointer dereference of xsk_poolSong Yoong Siang
After xsk_pool is freed, napi polling may still be running, which causes a kernel crash due to a NULL pointer dereference of rx_q->xsk_pool and tx_q->xsk_pool.

Fix this by changing the XDP pool setup sequence to:
 1. disable napi before freeing xsk_pool
 2. enable napi after initializing xsk_pool

The following kernel panic is observed without this patch:

  RIP: 0010:xsk_uses_need_wakeup+0x5/0x10
  Call Trace:
   stmmac_napi_poll_rxtx+0x3a9/0xae0 [stmmac]
   __napi_poll+0x27/0x130
   net_rx_action+0x233/0x280
   __do_softirq+0xe2/0x2b6
   run_ksoftirqd+0x1a/0x20
   smpboot_thread_fn+0xac/0x140
   ? sort_range+0x20/0x20
   kthread+0x124/0x150
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x1f/0x30
  ---[ end trace a77c8956b79ac107 ]---

Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25x86/build: Move the install rule to arch/x86/MakefileMasahiro Yamada
Currently, the install target in arch/x86/Makefile descends into arch/x86/boot/Makefile to invoke the shell script, but there is no good reason to do so. arch/x86/Makefile can run the shell script directly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210729140023.442101-2-masahiroy@kernel.org
2021-08-25Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-08-24

Vinicius Costa Gomes says:

This adds support for PCIe PTM (Precision Time Measurement) to the igc driver. PCIe PTM allows the NIC and Host clocks to be compared more precisely, improving the clock synchronization accuracy.

Patch 1/4 reverts a commit that made pci_enable_ptm() private to the PCI subsystem; reverting it makes it possible to call the function from drivers.

Patch 2/4 adds the pcie_ptm_enabled() helper.

Patch 3/4 calls pci_enable_ptm() from the igc driver.

Patch 4/4 implements the PCIe PTM support. Exposing it via the .getcrosststamp() API implies that the time measurements are made synchronously with the ioctl(). The hardware was implemented so the most convenient way to retrieve that information would be asynchronously. So, to follow the expectations of the ioctl() we have to use less convenient ways, triggering a PCIe PTM dialog every time an ioctl() is received.

Some questions are raised (also pointed out in the commit message):

1. Using convert_art_ns_to_tsc() is too x86 specific, there should be a common way to create a 'system_counterval_t' from a timestamp.

2. convert_art_ns_to_tsc() says that it should only be used when X86_FEATURE_TSC_KNOWN_FREQ is true, but during tests it works even when it returns false. Should that check be done?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge branch 'lan7800-improvements'David S. Miller
John Efstathiades says: ==================== LAN7800 driver improvements This patch set introduces a number of improvements and fixes for problems found during testing of a modification to add a NAPI-style approach to packet handling to improve performance. NOTE: the NAPI changes are not part of this patch set and the issues fixed by this patch set are not coupled to the NAPI changes. Patch 1 fixes white space and style issues Patch 2 removes an unused timer Patch 3 introduces macros to set the internal packet FIFO flow control levels, which makes it easier to update the levels in future. Patch 4 removes an unused queue Patch 5 (updated for v2) introduces function return value checks and error propagation to various parts of the driver where a return code was captured but then ignored. This patch is completely different to patch 5 in version 1 of this patch set. The changes in the v1 patch 5 are being set aside for the time being. Patch 6 updates the LAN7800 MAC reset code to ensure there is no PHY register access in progress when the MAC is reset. This change prevents a kernel exception that can otherwise occur. Patch 7 fixes problems with system suspend and resume handling while the device is transmitting and receiving data. Patch 8 fixes problems with auto-suspend and resume handling and depends on changes introduced by patch 7. Patch 9 fixes problems with device disconnect handling that can result in kernel exceptions and/or hang. Patch 10 limits the rate at which driver warning messages are emitted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Limit number of driver warning messagesJohn Efstathiades
Device removal can result in a large burst of driver warning messages (20 - 30) sent to the kernel log. Most of these are register read/write failures. This change limits the rate at which these messages are emitted. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix race condition in disconnect handlingJohn Efstathiades
If there is a device disconnect at roughly the same time as a deferred PHY link reset there is a race condition that can result in a kernel lock up due to a null pointer dereference in the driver's deferred work handling routine lan78xx_delayedwork(). The following changes fix this problem. Add new status flag EVENT_DEV_DISCONNECT to indicate when the device has been removed and use it to prevent operations, such as register access, that will fail once the device is removed. Stop processing of deferred work items when the driver's USB disconnect handler is invoked. Disconnect the PHY only after the network device has been unregistered and all delayed work has been cancelled. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix race conditions in suspend/resume handlingJohn Efstathiades
If the interface is given an IP address while the device is suspended (as a result of an auto-suspend event) there is a race between lan78xx_resume() and lan78xx_open() that can result in an exception or failure to handle incoming packets. The following changes fix this problem. Introduce a mutex to serialise operations in the network interface open and stop entry points with respect to the USB driver suspend and resume entry points. Move Tx and Rx data path start/stop to lan78xx_start() and lan78xx_stop() respectively and flush the packet FIFOs before starting the Tx and Rx data paths. This prevents the MAC and FIFOs getting out of step and delivery of malformed packets to the network stack. Stop processing of received packets before disconnecting the PHY from the MAC to prevent a kernel exception caused by handling packets after the PHY device has been removed. Refactor device auto-suspend code to make it consistent with the system suspend code and make the suspend handler easier to read. Add new code to stop wake-on-lan packets or PHY events resuming the host or device from suspend if the device has not been opened (typically after an IP address is assigned). This patch is dependent on changes to lan78xx_suspend() and lan78xx_resume() introduced in the previous patch of this patch set. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix partial packet errors on suspend/resumeJohn Efstathiades
The MAC can get out of step with the internal packet FIFOs if the system goes to sleep when the link is active, especially at high data rates. This can result in partial frames in the packet FIFOs that in turn result in malformed frames being delivered to the host. This occurs because the driver does not enable/disable the internal packet FIFOs in step with the corresponding MAC data path. The following changes fix this problem. Update code that enables/disables the MAC receiver and transmitter to the more general Rx and Tx data path, where the data path in each direction consists of both the MAC function (Tx or Rx) and the corresponding packet FIFO. In the receive path the packet FIFO must be enabled before the MAC receiver but disabled after the MAC receiver. In the transmit path the opposite is true: the packet FIFO must be enabled after the MAC transmitter but disabled before the MAC transmitter. The packet FIFOs can be flushed safely once the corresponding data path is stopped. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
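The ordering rules described above can be written down as a small sketch. Nothing here is taken from the lan78xx driver: hw() is a hypothetical stand-in for a register write that only records which unit changed state, and the four path functions are illustrative names.

```c
#include <assert.h>
#include <string.h>

/* Records the order of (pretend) hardware operations for inspection. */
static char trace[128];

/* Stand-in for a register write: append the operation name to the trace. */
static void hw(const char *what)
{
	strcat(trace, what);
	strcat(trace, ";");
}

/* Receive path: FIFO on before the MAC receiver, FIFO off after it. */
static void rx_path_start(void) { hw("rx_fifo_on");  hw("rx_mac_on"); }
static void rx_path_stop(void)  { hw("rx_mac_off");  hw("rx_fifo_off"); }

/* Transmit path: the opposite order, MAC on first and FIFO off first. */
static void tx_path_start(void) { hw("tx_mac_on");   hw("tx_fifo_on"); }
static void tx_path_stop(void)  { hw("tx_fifo_off"); hw("tx_mac_off"); }
```

In both directions the unit that feeds data into the other is started last and stopped first, so neither side sees a frame while its consumer is disabled.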
2021-08-25lan78xx: Fix exception on link speed changeJohn Efstathiades
An exception is sometimes seen when the link speed is changed from auto-negotiation to a fixed speed, or vice versa. The exception occurs when the MAC is reset (due to the link speed change) at the same time as the PHY state machine is accessing a PHY register. The following changes fix this problem. Rework the MAC reset to ensure there is no outstanding MDIO register transaction before the reset and then wait until the reset is complete before allowing any further MAC register access. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Add missing return code checksJohn Efstathiades
There are many places in the driver where the return code from a function call is captured but without a subsequent test of the return code and appropriate action taken. This patch adds the missing return code tests and action. In most cases the action is an early exit from the calling function. The function lan78xx_set_suspend() was also updated to make it consistent with lan78xx_suspend(). Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Remove unused pause frame queueJohn Efstathiades
Remove the pause frame queue from the driver. It is initialised but not actually used. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Set flow control threshold to prevent packet lossJohn Efstathiades
Set threshold at which flow control is triggered to 3/4 full of the internal Rx packet FIFO to prevent packet drops at high data rates. The new setting reduces the number of dropped UDP frames and TCP retransmit requests especially on less capable CPUs. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Remove unused timerJohn Efstathiades
Remove kernel timer that is not used by the driver. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25lan78xx: Fix white space and style issuesJohn Efstathiades
Fix white space and code style issues identified by checkpatch. Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25Merge series "ASoC: mediatek: Add support for MT8195 SoC" from Trevor Wu ↵Mark Brown
<trevor.wu@mediatek.com>:

This series of patches adds support for Mediatek AFE of MT8195 SoC. Patches are based on broonie tree "for-next" branch.

Changes since v4:
 - removed sof related code

Changes since v3:
 - fixed warnings found by kernel test robot
 - removed unused critical section
 - corrected the lock protected sections on etdm driver
 - added DPTX and HDMITX audio support

Changes since v2:
 - added audio clock gate control
 - added 'mediatek' prefix to private dts properties
 - added consumed clocks to dt-bindings and adopted suggestions from Rob
 - refined clock usage and remove unused clock and control code
 - fixed typos

Changes since v1:
 - fixed some problems related to dt-bindings
 - added some missing properties to dt-bindings
 - added dependency declaration on dt-bindings
 - fixed some warnings found by kernel test robot

Trevor Wu (11):
  ASoC: mediatek: mt8195: update mediatek common driver
  ASoC: mediatek: mt8195: support audsys clock control
  ASoC: mediatek: mt8195: support etdm in platform driver
  ASoC: mediatek: mt8195: support adda in platform driver
  ASoC: mediatek: mt8195: support pcm in platform driver
  ASoC: mediatek: mt8195: add platform driver
  dt-bindings: mediatek: mt8195: add audio afe document
  ASoC: mediatek: mt8195: add machine driver with mt6359, rt1019 and rt5682
  ASoC: mediatek: mt8195: add DPTX audio support
  ASoC: mediatek: mt8195: add HDMITX audio support
  dt-bindings: mediatek: mt8195: add mt8195-mt6359-rt1019-rt5682 document

 .../bindings/sound/mt8195-afe-pcm.yaml | 184 +
 .../sound/mt8195-mt6359-rt1019-rt5682.yaml | 47 +
 sound/soc/mediatek/Kconfig | 24 +
 sound/soc/mediatek/Makefile | 1 +
 sound/soc/mediatek/common/mtk-afe-fe-dai.c | 22 +-
 sound/soc/mediatek/common/mtk-base-afe.h | 10 +-
 sound/soc/mediatek/mt8195/Makefile | 15 +
 sound/soc/mediatek/mt8195/mt8195-afe-clk.c | 441 +++
 sound/soc/mediatek/mt8195/mt8195-afe-clk.h | 109 +
 sound/soc/mediatek/mt8195/mt8195-afe-common.h | 158 +
 sound/soc/mediatek/mt8195/mt8195-afe-pcm.c | 3281 +++++++++++++++++
 sound/soc/mediatek/mt8195/mt8195-audsys-clk.c | 214 ++
 sound/soc/mediatek/mt8195/mt8195-audsys-clk.h | 15 +
 .../soc/mediatek/mt8195/mt8195-audsys-clkid.h | 93 +
 sound/soc/mediatek/mt8195/mt8195-dai-adda.c | 830 +++++
 sound/soc/mediatek/mt8195/mt8195-dai-etdm.c | 2639 +++++++++++++
 sound/soc/mediatek/mt8195/mt8195-dai-pcm.c | 389 ++
 .../mt8195/mt8195-mt6359-rt1019-rt5682.c | 1087 ++++++
 sound/soc/mediatek/mt8195/mt8195-reg.h | 2796 ++++++++++++++
 19 files changed, 12350 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/sound/mt8195-afe-pcm.yaml
 create mode 100644 Documentation/devicetree/bindings/sound/mt8195-mt6359-rt1019-rt5682.yaml
 create mode 100644 sound/soc/mediatek/mt8195/Makefile
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-clk.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-clk.h
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-common.h
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-afe-pcm.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clk.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clk.h
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-audsys-clkid.h
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-adda.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-etdm.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-dai-pcm.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c
 create mode 100644 sound/soc/mediatek/mt8195/mt8195-reg.h

-- 
2.18.0
2021-08-25x86/build: Remove the left-over bzlilo targetMasahiro Yamada
Commit f279b49f13bd ("x86/boot: Modernize genimage script; hdimage+EFI support") removed the bzlilo target from arch/x86/boot/Makefile. Remove the left-over from arch/x86/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210729140023.442101-1-masahiroy@kernel.org
2021-08-25Merge branch 'xen-harden-netfront'David S. Miller
Juergen Gross says: ==================== xen: harden netfront against malicious backends Xen backends of para-virtualized devices can live in dom0 kernel, dom0 user land, or in a driver domain. This means that a backend might reside in a less trusted environment than the Xen core components, so a backend should not be able to do harm to a Xen guest (it can still mess up I/O data, but it shouldn't be able to e.g. crash a guest by other means or cause a privilege escalation in the guest). Unfortunately netfront in the Linux kernel is fully trusting its backend. This series is fixing netfront in this regard. It was discussed to handle this as a security problem, but the topic was discussed in public before, so it isn't a real secret. It should be mentioned that a similar series has been posted some years ago by Marek Marczykowski-Górecki, but this series has not been applied due to a Xen header not having been available in the Xen git repo at that time. Additionally my series is fixing some more DoS cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: don't trust the backend response data blindlyJuergen Gross
Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page. Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
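The check that a response must reference an outstanding request, combined with the broken indicator, might look roughly like the sketch below. It is an illustration under assumptions, not the netfront code: struct tx_state, RING_SIZE, and handle_tx_response() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4	/* hypothetical number of request slots */

struct tx_state {
	bool outstanding[RING_SIZE];	/* true while request id is in flight */
	bool broken;			/* set once the backend sent bogus data */
};

/*
 * Accept a response only if its id is in range and matches a request that
 * is still in flight. Anything else marks the device as broken, after which
 * all further responses are refused.
 */
static bool handle_tx_response(struct tx_state *st, uint16_t id)
{
	if (st->broken)
		return false;

	if (id >= RING_SIZE || !st->outstanding[id]) {
		st->broken = true;
		return false;
	}

	st->outstanding[id] = false;	/* request is now completed */
	return true;
}
```

A duplicated response is caught by the same test as an out-of-range id, because completing a request clears its outstanding flag.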
2021-08-25xen/netfront: disentangle tx_skb_freelistJuergen Gross
The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request. Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer. Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list. When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it. Remove skb_entry_set_link() and open code it instead as it is really trivial now. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
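A list linked by request id, with ~0 as the end marker, can be sketched as follows. The function names add_id_to_list() and get_id_from_list() come from the commit message, but their signatures, struct tx_slots, NUM_SLOTS, LINK_END, and tx_slots_init() are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4
#define LINK_END (~0u)	/* end of list, also stored for ids not on a list */

struct tx_slots {
	unsigned int free_head;		/* first free id, or LINK_END */
	unsigned int link[NUM_SLOTS];	/* next id, kept separate from skb */
	void *skb[NUM_SLOTS];		/* NULL: no skb attached to this id */
};

/* Push id onto the front of the list described by head and link. */
static void add_id_to_list(unsigned int *head, unsigned int *link,
			   unsigned int id)
{
	link[id] = *head;
	*head = id;
}

/* Pop the first id from the list, or return LINK_END if it is empty. */
static unsigned int get_id_from_list(unsigned int *head, unsigned int *link)
{
	unsigned int id = *head;

	if (id != LINK_END) {
		*head = link[id];
		link[id] = LINK_END;	/* id is no longer on any list */
	}
	return id;
}

/* Put every id on the free list so that id 0 is handed out first. */
static void tx_slots_init(struct tx_slots *s)
{
	s->free_head = LINK_END;
	for (unsigned int i = NUM_SLOTS; i-- > 0; ) {
		s->skb[i] = NULL;
		add_id_to_list(&s->free_head, s->link, i);
	}
}
```

Because the link no longer shares storage with the skb pointer, the same two helpers can manage any other list of request ids by passing a different head.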
2021-08-25xen/netfront: don't read data from request on the ring pageJuergen Gross
In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: read response from backend only onceJuergen Gross
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
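The read-once pattern can be illustrated with a plain copy into a local buffer. This is a simplified sketch: struct response and consume_response() are hypothetical, and a plain memcpy() does not by itself stop a compiler from re-reading shared memory, so real code needs the appropriate barriers or a dedicated copy primitive.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical response layout living on a page the backend can write. */
struct response {
	uint16_t id;
	int16_t status;
};

/*
 * Copy the response once, then validate and use only the local copy.
 * Returns the status, or -1 if the id is out of range.
 */
static int16_t consume_response(const struct response *shared, uint16_t max_id)
{
	struct response local;

	memcpy(&local, shared, sizeof(local));	/* the only read of shared */

	if (local.id >= max_id)
		return -1;

	/* Later changes to *shared cannot affect what is returned here. */
	return local.status;
}
```

Validating and using the same snapshot closes the window in which the backend could change a field between the check and the use.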
2021-08-25net: macb: Add a NULL check on desc_ptpHarini Katakam
macb_ptp_desc will not return NULL under most circumstances with correct Kconfig and IP design config register. But for the sake of the extreme corner case, check for NULL when using the helper. In case of rx_tstamp, no action is necessary except to return (similar to timestamp disabled) and warn. In case of TX, return -EINVAL to let the skb be freed. Perform this check before marking skb in progress. Fixes coverity warning: (4) Event dereference: Dereferencing a null pointer "desc_ptp" Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25qed: Enable automatic recovery on error condition.Alok Prasad
This patch enables automatic recovery by default in case of various error conditions such as firmware asserts, hardware errors, etc. It also ensures the driver can handle multiple iterations of assertion conditions. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warningsMichael Riesch
This reverts commit 2c896fb02e7f65299646f295a007bda043e0f382 "net: stmmac: dwmac-rk: add pd_gmac support for rk3399" and fixes unbalanced pm_runtime_enable warnings. In the commit to be reverted, support for power management was introduced to the Rockchip glue code. Later, power management support was introduced to the stmmac core code, resulting in multiple invocations of pm_runtime_{enable,disable,get_sync,put_sync}. The multiple invocations happen in rk_gmac_powerup and stmmac_{dvr_probe, resume} as well as in rk_gmac_powerdown and stmmac_{dvr_remove, suspend}, respectively, which are always called in conjunction. Fixes: 5ec55823438e850c91c6b92aec93fb04ebde29e2 ("net: stmmac: add clocks management for gmac driver") Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net-next: When a bond have a massive amount of VLANs with IPv6 addresses, ↵Gilad Naaman
performance of changing link state, attaching a VRF, changing an IPv6 address, etc. goes down dramatically.

The source of most of the slowdown is the `dev_addr_lists.c` module, which maintains a linked list of HW addresses. When using IPv6, this list grows for each IPv6 address added on a VLAN, since each IPv6 address has a multicast HW address associated with it.

When performing any modification to the involved links, this list is traversed many times, often for nothing, all while holding the RTNL lock.

Instead, this patch adds an auxiliary rbtree which cuts down traversal time significantly.

The performance difference can be measured with the following script:

  #!/bin/bash
  ip netns del test || true 2>/dev/null
  ip netns add test

  echo 1 | ip netns exec test tee /proc/sys/net/ipv6/conf/all/keep_addr_on_down > /dev/null

  set -e

  ip -n test link add foo type veth peer name bar
  ip -n test link add b1 type bond
  ip -n test link add florp type vrf table 10

  ip -n test link set bar master b1
  ip -n test link set foo up
  ip -n test link set bar up
  ip -n test link set b1 up
  ip -n test link set florp up

  VLAN_COUNT=1500
  BASE_DEV=b1

  echo Creating vlans
  ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link add link $BASE_DEV name foo.\$i type vlan id \$i; done"

  echo Bringing them up
  ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i up; done"

  echo Assiging IPv6 Addresses
  ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test address add dev foo.\$i 2000::\$i/64; done"

  echo Attaching to VRF
  ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i master florp; done"

On an Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz machine, the performance before the patch is (truncated):

  Creating vlans
  real 108.35
  Bringing them up
  real 4.96
  Assiging IPv6 Addresses
  real 19.22
  Attaching to VRF
  real 458.84

After the patch:

  Creating vlans
  real 5.59
  Bringing them up
  real 5.07
  Assiging IPv6 Addresses
  real 5.64
  Attaching to VRF
  real 25.37

Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Lu Wei <luwei32@huawei.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mmc: queue: Remove unused parameters(request_queue)ChanWoo Lee
In function mmc_exit_request, the request_queue structure(*q) is not used. I remove the unnecessary code related to the request_queue structure. Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20210825074601.8881-1-cw9316.lee@samsung.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-25mmc: pwrseq: sd8787: fix compilation warningClaudiu Beznea
Fixed compilation warning "cast from pointer to integer of different size [-Wpointer-to-int-cast]" Fixes: b2832b96fcf5 ("mmc: pwrseq: sd8787: add support for wilc1000") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20210825081931.598934-1-claudiu.beznea@microchip.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-25mmc: core: Return correct emmc response in case of ioctl errorNishad Kamdar
When a read/write command is sent via ioctl to the kernel, and the command fails, the actual error response of the emmc is not sent to the user. IOCTL read/write tests are carried out using commands 17 (Single Block Read), 24 (Single Block Write), 18 (Multi Block Read), 25 (Multi Block Write). The tests are carried out on a 64Gb emmc device. All of these tests try to access an "out of range" sector address (0x09B2FFFF). It is seen that without the patch the response received by the user is not the OUT_OF_RANGE error (R1 response 31st bit is not set) as per JEDEC specification. After applying the patch the proper response is seen. This is because the function returns without copying the response to the user in case of failure. This patch fixes the issue. Hence, this memcpy is required whether we get an error response or not. Therefore it is moved up from its current position to immediately after we have called mmc_wait_for_req(). The test code and the output of only the CMD17 is included in the commit to limit the message length.
CMD17 (Test Code Snippet):
==========================
  printf("Forming CMD%d\n", opt_idx);
  /* single block read */
  cmd.blksz = 512;
  cmd.blocks = 1;
  cmd.write_flag = 0;
  cmd.opcode = 17;
  //cmd.arg = atoi(argv[3]);
  cmd.arg = 0x09B2FFFF;
  /* Expecting response R1B */
  cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;

  memset(data, 0, sizeof(__u8) * 512);
  mmc_ioc_cmd_set_data(cmd, data);

  printf("Sending CMD%d: ARG[0x%08x]\n", opt_idx, cmd.arg);
  if(ioctl(fd, MMC_IOC_CMD, &cmd))
          perror("Error");

  printf("\nResponse: %08x\n", cmd.response[0]);

CMD17 (Output without patch):
=============================
  test@test-LIVA-Z:~$ sudo ./mmc cmd_test /dev/mmcblk0 17
  Entering the do_mmc_commands:Device: /dev/mmcblk0 nargs:4
  Entering the do_mmc_commands:Device: /dev/mmcblk0 options[17, 0x09B2FFF]
  Forming CMD17
  Sending CMD17: ARG[0x09b2ffff]
  Error: Connection timed out

  Response: 00000000
  (Incorrect response)

CMD17 (Output with patch):
==========================
  test@test-LIVA-Z:~$ sudo ./mmc cmd_test /dev/mmcblk0 17
  [sudo] password for test:
  Entering the do_mmc_commands:Device: /dev/mmcblk0 nargs:4
  Entering the do_mmc_commands:Device: /dev/mmcblk0 options[17, 09B2FFFF]
  Forming CMD17
  Sending CMD17: ARG[0x09b2ffff]
  Error: Connection timed out

  Response: 80000900
  (Correct OUT_OF_ERROR response as per JEDEC specification)

Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20210824191726.8296-1-nishadkamdar@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
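The reordering described in this commit, copying the response before the error check instead of after it, can be sketched outside the kernel. The types and the function below are hypothetical stand-ins, not the mmc core code; only the order of the two steps is the point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result of a completed command, as seen by the driver. */
struct fake_cmd {
	uint32_t resp[4];
	int error;		/* 0 on success, negative errno on failure */
};

/* Hypothetical user-visible ioctl data. */
struct fake_ioc {
	uint32_t response[4];
};

/*
 * Copy the card's response to the caller first, then report the error.
 * With the copy placed after the error check, a failed command would
 * leave ioc->response untouched.
 */
static int finish_ioctl_cmd(struct fake_ioc *ioc, const struct fake_cmd *done)
{
	memcpy(ioc->response, done->resp, sizeof(ioc->response));

	if (done->error)
		return done->error;

	return 0;
}
```

With this order, a failing command still delivers a response such as 80000900 from the output above, where bit 31 signals the out-of-range condition.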
2021-08-25mmc: sdhci-esdhc-imx: Select the correct mode for auto tuningHaibo Chen
The uSDHC hardware auto tuning circuit can check 1, 4, or 8 data lines as well as the cmd line. Out of reset, uSDHC defaults to checking 4 data lines and does not check the cmd line. This is incorrect when 8 data lines are in use, so configure the auto tuning mode according to the current bus width. Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Link: https://lore.kernel.org/r/1629285415-7495-2-git-send-email-haibo.chen@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-25mmc: sdhci-esdhc-imx: Remove redundant code for manual tuningHaibo Chen
For the manual tuning method, esdhc_prepare_tuning() is already called to configure the necessary registers, so remove the redundant code in esdhc_writew_le() for SDHCI_HOST_CONTROL2. Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Link: https://lore.kernel.org/r/1629285415-7495-1-git-send-email-haibo.chen@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-25KVM: s390: generate kvm hypercall functionsHeiko Carstens
Generate kvm hypercall functions with a macro instead of duplicating the more or less identical code seven times. This also reduces number of lines of code. However the main purpose is to get rid of as many as possible open coded error prone register asm constructs in s390 architecture code. For the only user of kvm_hypercall identical code is created before/after this patch (drivers/s390/virtio/virtio_ccw.c). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210713145713.2815167-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
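The general technique of generating a family of near-identical functions from one macro can be shown with a toy example. This is not the s390 macro and contains no inline assembly: GENERATE_ADD_FUNC and the add1/add2/add3 functions are hypothetical and only demonstrate the shape of the approach.

```c
#include <assert.h>

/*
 * Hypothetical generator: expands to one static inline function per use.
 * params is a parenthesised parameter list, expr the value to return.
 */
#define GENERATE_ADD_FUNC(name, params, expr)	\
static inline long name params			\
{						\
	return (expr);				\
}

/* One line per variant instead of one hand-written copy per variant. */
GENERATE_ADD_FUNC(add1, (long a), a)
GENERATE_ADD_FUNC(add2, (long a, long b), a + b)
GENERATE_ADD_FUNC(add3, (long a, long b, long c), a + b + c)
```

Keeping the shared body in one place means a fix to the template reaches every variant, which is the maintenance benefit the commit message points to.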
2021-08-25  s390/sclp: add tracing of SCLP interactions  Peter Oberparleiter

Add tracing of interactions between the SCLP base driver, firmware and other drivers to support problem determination in case of SCLP-related issues.

For that purpose this patch introduces two new s390dbf debug areas:
- sclp: An abbreviated log of all common interactions
- sclp_err: A full log of failed or abnormal interactions

Tracing of full SCCB contents can be enabled for the sclp area by setting its debug level to maximum (6).

Overview of added trace events:
* Firmware interaction:
  - SRV1: Service call about to be issued
  - SRV2: Service call was issued
  - INT: Interrupt received
* Driver interaction:
  - RQAD: Request was added
  - RQOK: Request success
  - RQAB: Request aborted
  - RQTM: Request timed out
  - REG: Event listener registered
  - UREG: Event listener unregistered
  - EVNT: Event callback
  - STCG: State-change callback
* Abnormal events:
  - TMO: A timeout occurred
  - UNEX: Unexpected SCCB completion
* Other (not traced at default level):
  - SYN1: Synchronous wait start
  - SYN2: Synchronous wait end

Since the SCLP interface is used by console drivers this patch also moves s390dbf printks outside the critical section protected by debug area locks to prevent a potential deadlock that would otherwise be introduced between console_owner --> sclp_lock --> sclp_debug.lock.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/debug: add early tracing support  Peter Oberparleiter

Debug areas can currently only be used after s390dbf initialization which occurs as a postcore_initcall. This is too late for tracing earlier code such as that related to console_init().

This patch introduces a macro for defining a statically initialized debug area that can be used to trace very early code. The macro is made available for built-in code only because modules are never running during early boot.

Example usage:

1. Define static debug area:

   DEFINE_STATIC_DEBUG_INFO(my_debug, "my_debug", 4, 1, 16, &debug_hex_ascii_view);

2. Add trace entry:

   debug_event(&my_debug, 0, "DATA", 4);

Note: The debug area is automatically registered in debugfs during boot. A driver must not call any of the debug_register()/_unregister() functions on a static debug_info_t!

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/debug: fix debug area life cycle  Peter Oberparleiter

Currently allocation and registration of s390dbf debug areas are tied together. As a result, a debug area cannot be unregistered and re-registered while any process has an associated debugfs file open.

Fix this by splitting alloc/release from register/unregister.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/debug: keep debug data on resize  Peter Oberparleiter

Any previously recorded s390dbf debug data is reset when a debug area is resized using the 'pages' sysfs attribute. This can make live-debugging unnecessarily complex.

Fix this by copying existing debug data to the newly allocated debug area when resizing.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/diag: make restart_part2 a local label  Heiko Carstens

Prevent the "restart_part2" label, which is in the middle of a function, from appearing in /proc/kallsyms.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/mm,pageattr: fix walk_pte_level() early exit  Heiko Carstens

When splitting to a 4k mapping, the early exit in walk_pte_level() must be taken only if flags is equal to SET_MEMORY_4K. Currently the early exit is taken whenever that flag is set, even if other flags are set as well. This may lead to a situation where a mapping is split but other changes, e.g. setting pages to R/W, are not applied.

There is currently no such caller, but there might be in the future.

Fixes: b3e1a00c8fa4 ("s390/mm: implement set_memory_4k()")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
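The difference between testing whether a flag is set and testing whether it is the only flag set can be shown in isolation. This is a minimal sketch, not the code from walk_pte_level(): the DEMO_ flag values are illustrative and the two helper functions are hypothetical.

```c
#include <stdbool.h>

/* Illustrative flag values; not copied from the kernel headers. */
#define DEMO_SET_MEMORY_RW 0x02UL
#define DEMO_SET_MEMORY_4K 0x10UL

/* Behaviour described as current: exit as soon as the 4K flag is set,
 * regardless of any other requested change. */
bool demo_early_exit_bit_test(unsigned long flags)
{
	return (flags & DEMO_SET_MEMORY_4K) != 0;
}

/* Behaviour described as intended: exit only when splitting to 4K is
 * the sole request. */
bool demo_early_exit_exact(unsigned long flags)
{
	return flags == DEMO_SET_MEMORY_4K;
}
```

With flags set to DEMO_SET_MEMORY_4K | DEMO_SET_MEMORY_RW, the bit test exits early and the R/W change is skipped, while the exact comparison lets processing continue.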
2021-08-25  s390: fix typo in linker script  Heiko Carstens

Rename amod31 to amode31 like it was supposed to be.

Fixes: c78d0c7484f0 ("s390: rename dma section to amode31")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390: remove do_signal() prototype and do_notify_resume() function  Sven Schnelle

Both are no longer used since the conversion to generic entry, therefore remove them.

Fixes: 56e62a737028 ("s390: convert to generic entry")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c  Randy Dunlap

The 0day bot reported some kernel-doc warnings in this file, so clean up all of the kernel-doc and use proper kernel-doc formatting. There are no more kernel-doc errors or warnings reported in this file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/20210806050149.9614-1-rdunlap@infradead.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/pci: improve DMA translation init and exit  Niklas Schnelle

Currently zpci_dma_init_device()/zpci_dma_exit_device() is called as part of zpci_enable_device()/zpci_disable_device() and errors for zpci_dma_exit_device() are always ignored even if we could abort.

Improve upon this by moving zpci_dma_exit_device() out of zpci_disable_device() and checking for errors whenever we have a way to abort the current operation. Note that, for example, in zpci_event_hard_deconfigured() the device is expected to be gone, so we really can't abort and proceed even in case of error.

Similarly move the cc == 3 special case out of zpci_unregister_ioat() and into the callers, allowing them to abort when finding an already disabled device precludes proceeding with the operation.

While we are at it, log IOAT register/unregister errors in the s390 debugfs log.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/pci: simplify CLP List PCI handling  Niklas Schnelle

Currently clp_get_state() and clp_refresh_fh() awkwardly use the clp_list_pci() callback mechanism to find the entry for a specific FID and update its zdev, respectively return its state. This is both needlessly complex and means we are always going through the entire PCI function list even if the FID has already been found.

Instead let's introduce a clp_find_pci() function to find a specific entry and share the CLP List PCI request handling code with clp_list_pci().

With that in place we can also easily make the function handle a simple out parameter instead of directly altering the zdev, allowing easier access to the updated function handle by the caller.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/pci: handle FH state mismatch only on disable  Niklas Schnelle

Instead of always treating CLP_RC_SETPCIFN_ALRDY as success and blindly updating the function handle, restrict this special handling to the disable case by moving it into zpci_disable_device(). There it is still treated as an error, but the function handle is updated so that a subsequent zpci_disable_device() succeeds, or the caller can ignore the error when aborting is not an option, such as for zPCI event 0x304. Also print this occurrence to the log so that an admin can tell why a disable operation returned an error.

A mismatch between the state of the underlying device and our view of it can naturally happen when the device suddenly enters the error state but we haven't gotten the error notification yet. It must not happen on enable though.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/pci: fix misleading rc in clp_set_pci_fn()  Niklas Schnelle

Currently clp_set_pci_fn() always returns 0 as long as the CLP request itself succeeds, even if the operation returns a response code other than CLP_RC_OK or CLP_RC_SETPCIFN_ALRDY. This is highly misleading because calling code assumes that a zero rc means that the operation was successful.

Fix this by returning the response code or cc on failure, with the exception of the special handling for CLP_RC_SETPCIFN_ALRDY. Also let's not assume that the returned function handle for CLP_RC_SETPCIFN_ALRDY is 0; we don't need it anyway.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/boot: factor out offset_vmlinux_info() function  Alexander Gordeev

Move offsetting all of vmlinux_info fields to a separate function for better readability.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25  s390/kasan: fix large PMD pages address alignment check  Alexander Gordeev

It is currently possible to initialize a large PMD page when the address is not aligned on page boundary.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
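An alignment check of the kind the subject line refers to can be sketched as follows. This is not the kasan code: it assumes the intended condition is that the start address is a multiple of the large page size, DEMO_PMD_SIZE is an assumed 1 MiB value, and demo_can_use_large_pmd() is a hypothetical helper that expects end >= address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed large page size for this sketch (1 MiB). */
#define DEMO_PMD_SIZE (UINT64_C(1) << 20)

/* A large page may only be used if the remaining range covers a full
 * large page AND the start address sits on a large page boundary. */
bool demo_can_use_large_pmd(uint64_t address, uint64_t end)
{
	bool aligned = (address & (DEMO_PMD_SIZE - 1)) == 0;
	bool fits = (end - address) >= DEMO_PMD_SIZE;

	return aligned && fits;
}
```

Checking only that the range is large enough would accept a start address such as 0x101000, which lies inside a large page rather than at its beginning. The added mask test rejects it.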