summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-25netfilter: ipset: fix unaligned atomic accessRussell King
When using ip_set with counters and comment, traffic causes the kernel to panic on 32-bit ARM: Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>] Unhandled fault: alignment exception (0x221) at 0xea08133c PC is at ip_set_match_extensions+0xe0/0x224 [ip_set] The problem occurs when we try to update the 64-bit counters - the faulting address above is not 64-bit aligned. The problem occurs due to the way elements are allocated, for example: set->dsize = ip_set_elem_len(set, tb, 0, 0); map = ip_set_alloc(sizeof(*map) + elements * set->dsize); If the element has a requirement for a member to be 64-bit aligned, and set->dsize is not a multiple of 8, but is a multiple of four, then every odd numbered elements will be misaligned - and hitting an atomic64_add() on that element will cause the kernel to panic. ip_set_elem_len() must return a size that is rounded to the maximum alignment of any extension field stored in the element. This change ensures that is the case. Fixes: 95ad1f4a9358 ("netfilter: ipset: Fix extension alignment") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-24qed: add missing error test for DBG_STATUS_NO_MATCHING_FRAMING_MODEColin Ian King
The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum enum dbg_status however there is a missing corresponding entry for this in the array s_status_str. This causes an out-of-bounds read when indexing into the last entry of s_status_str. Fix this by adding in the missing entry. Addresses-Coverity: ("Out-of-bounds read"). Fixes: 2d22bc8354b1 ("qed: FW 8.42.2.0 debug features") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24Merge branch 'net-phy-call-phy_disable_interrupts-in-phy_init_hw'David S. Miller
Jisheng Zhang says: ==================== net: phy: call phy_disable_interrupts() in phy_init_hw() We face an issue with rtl8211f, a pin is shared between INTB and PMEB, and the PHY Register Accessible Interrupt is enabled by default, so the INTB/PMEB pin is always active in polling mode case. As Heiner pointed out "I was thinking about calling phy_disable_interrupts() in phy_init_hw(), to have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." patch1 makes phy_disable_interrupts() non-static so that it could be used in phy_init_hw() to have a defined init state. patch2 calls phy_disable_interrupts() in phy_init_hw() to have a defined init state. Since v3: - call phy_disable_interrupts() have interrupts disabled first then config_init, thank Florian Since v2: - Don't export phy_disable_interrupts() but just make it non-static Since v1: - EXPORT the correct symbol ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: call phy_disable_interrupts() in phy_init_hw()Jisheng Zhang
Call phy_disable_interrupts() in phy_init_hw() to "have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." as pointed out by Heiner. Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: make phy_disable_interrupts() non-staticJisheng Zhang
We face an issue with rtl8211f, a pin is shared between INTB and PMEB, and the PHY Register Accessible Interrupt is enabled by default, so the INTB/PMEB pin is always active in polling mode case. As Heiner pointed out "I was thinking about calling phy_disable_interrupts() in phy_init_hw(), to have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." Make phy_disable_interrupts() non-static so that it could be used in phy_init_hw() to have a defined init state. Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: ethernet: mvneta: Add back interface mode validationSascha Hauer
When writing the serdes configuration register was moved to mvneta_config_interface() the whole code block was removed from mvneta_port_power_up() in the assumption that its only purpose was to write the serdes configuration register. As mentioned by Russell King its purpose was also to check for valid interface modes early so that later in the driver we do not have to care for unexpected interface modes. Add back the test to let the driver bail out early on unhandled interface modes. Fixes: b4748553f53f ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: ethernet: mvneta: Do not error out in non serdes modesSascha Hauer
In mvneta_config_interface() the RGMII modes are catched by the default case which is an error return. The RGMII modes are valid modes for the driver, so instead of returning an error add a break statement to return successfully. This avoids this warning for non comphy SoCs which use RGMII, like SolidRun Clearfog: WARNING: CPU: 0 PID: 268 at drivers/net/ethernet/marvell/mvneta.c:3512 mvneta_start_dev+0x220/0x23c Fixes: b4748553f53f ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24lan743x: Remove duplicated include from lan743x_main.cYueHaibing
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24dsa: Allow forwarding of redirected IGMP trafficDaniel Mack
The driver for Marvell switches puts all ports in IGMP snooping mode which results in all IGMP/MLD frames that ingress on the ports to be forwarded to the CPU only. The bridge code in the kernel can then interpret these frames and act upon them, for instance by updating the mdb in the switch to reflect multicast memberships of stations connected to the ports. However, the IGMP/MLD frames must then also be forwarded to other ports of the bridge so external IGMP queriers can track membership reports, and external multicast clients can receive query reports from foreign IGMP queriers. Currently, this is impossible as the EDSA tagger sets offload_fwd_mark on the skb when it unwraps the tagged frames, and that will make the switchdev layer prevent the skb from egressing on any other port of the same switch. To fix that, look at the To_CPU code in the DSA header and make forwarding of the frame possible for trapped IGMP packets. Introduce some #defines for the frame types to make the code a bit more comprehensive. This was tested on a Marvell 88E6352 variant. Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24Merge branch 'net-bridge-fdb-activity-tracking'David S. Miller
Nikolay Aleksandrov says: ==================== net: bridge: fdb activity tracking This set adds extensions needed for EVPN multi-homing proper and efficient mac sync. User-space (e.g. FRR) needs to be able to track non-dynamic entry activity on per-fdb basis depending if a tracked fdb is currently peer active or locally active and needs to be able to add new peer active fdb (static + track + inactive) without refreshing it to get real activity tracking. Patch 02 adds a new NDA attribute - NDA_FDB_EXT_ATTRS to avoid future pollution of NDA attributes by bridge or vxlan. New bridge/vxlan specific fdb attributes are embedded in NDA_FDB_EXT_ATTRS, which is used in patch 03 to pass the new NFEA_ACTIVITY_NOTIFY attribute which controls if an fdb should be tracked and also reflects its current state when dumping. It is treated as a bitfield, current valid bits are: 1 - mark an entry for activity tracking 2 - mark an entry as inactive to avoid multiple notifications and reflect state properly Patch 04 adds the ability to avoid refreshing an entry when changing it via the NFEA_DONT_REFRESH flag. That allows user-space to mark a static entry for tracking and keep its real activity unchanged. The set has been extensively tested with FRR and those changes will be upstreamed if/after it gets accepted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: add a flag to avoid refreshing fdb when changing/addingNikolay Aleksandrov
When we modify or create a new fdb entry sometimes we want to avoid refreshing its activity in order to track it properly. One example is when a mac is received from EVPN multi-homing peer by FRR, which doesn't want to change local activity accounting. It makes it static and sets a flag to track its activity. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: add option to allow activity notifications for any fdb entriesNikolay Aleksandrov
This patch adds the ability to notify about activity of any entries (static, permanent or ext_learn). EVPN multihoming peers need it to properly and efficiently handle mac sync (peer active/locally active). We add a new NFEA_ACTIVITY_NOTIFY attribute which is used to dump the current activity state and to control if static entries should be monitored at all. We use 2 bits - one to activate fdb entry tracking (disabled by default) and the second to denote that an entry is inactive. We need the second bit in order to avoid multiple notifications of inactivity. Obviously this makes no difference for dynamic entries since at the time of inactivity they get deleted, while the tracked non-dynamic entries get the inactive bit set and get a notification. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: neighbor: add fdb extended attributeNikolay Aleksandrov
Add an attribute to NDA which will contain all future fdb-specific attributes in order to avoid polluting the NDA namespace with e.g. bridge or vxlan specific attributes. The attribute is called NDA_FDB_EXT_ATTRS and the structure would look like: [NDA_FDB_EXT_ATTRS] = { [NFEA_xxx] } Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: fdb_add_entry takes ndm as argumentNikolay Aleksandrov
We can just pass ndm as an argument instead of its fields separately. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24openvswitch: take into account de-fragmentation/gso_size in ↵Lorenzo Bianconi
execute_check_pkt_len ovs connection tracking module performs de-fragmentation on incoming fragmented traffic. Take info account if traffic has been de-fragmented in execute_check_pkt_len action otherwise we will perform the wrong nested action considering the original packet size. This issue typically occurs if ovs-vswitchd adds a rule in the pipeline that requires connection tracking (e.g. OVN stateful ACLs) before execute_check_pkt_len action. Moreover take into account GSO fragment size for GSO packet in execute_check_pkt_len routine Fixes: 4d5ec89fc8d14 ("net: openvswitch: Add a new action check_pkt_len") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24Merge branch 'net-phy-mscc-PHC-and-timestamping-support'David S. Miller
Antoine Tenart says: ==================== net: phy: mscc: PHC and timestamping support This series aims at adding support for PHC and timestamping operations in the MSCC PHY driver, for the VSC858x and VSC8575. Those PHYs are capable of timestamping in 1-step and 2-step for both L2 and L4 traffic. As of this series, only IPv4 support was implemented when using L4 mode. This is because of an hardware limitation which prevents us for supporting both IPv4 and IPv6 at the same time. Implementing support for IPv6 should be quite easy (I do have the modifications needed for the hardware configuration) but I did not see a way to retrieve this information in hwtstamp(). What would you suggest? Those PHYs are distributed in hardware packages containing multiple times the PHY. The VSC8584 for example is composed of 4 PHYs. With hardware packages, parts of the logic is usually common and one of the PHY has to be used for some parts of the initialization. Following this logic, the 1588 blocks of those PHYs are shared between two PHYs and accessing the registers has to be done using the "base" PHY of the group. This is handled thanks to helpers in the PTP code (and locks). We also need the MDIO bus lock while performing a single read or write to the 1588 registers as the read/write are composed of multiple MDIO transactions (and we don't want other threads updating the page). To get and set the PHC time, a GPIO has to be used and changes are only retrieved or committed when on a rising edge. The same GPIO is shared by all PHYs, so the granularity of the lock protecting it has to be different from the ones protecting the 1588 registers (the VSC8584 PHY has 2 1588 blocks, and a single load/save pin). Patch 1 extends the recently added helpers to share information between PHYs of the same hardware package; to allow having part of the probe to be shared (in addition to the already supported init part). This will be used when adding support for PHC/TS to initialize locks. Patches 2 and 3 are mostly cosmetic. Patch 4 takes into account the 1588 block in the MACsec initialization, to allow having both the MACsec and 1588 blocks initialized on a running system. Patches 5 and 6 add support for PHC and timestamping operations in the MSCC driver. An initialization of the 1588 block (plus all the registers definition; and helpers) is added first; and then comes a patch to implement the PHC and timestamping API. Patches 7 and 8 add the required hardware description for device trees, to be able to use the load/save GPIO pin on the PCB120 board. To use this on a PCB120 board, two other series are needed and have already been sent upstream (one is merged). There are no dependency between all those series. Since v3: - Fixed a SKB leak. - Removed ts_lock from the init, as TS and PHC operations aren't registered at this time. - Refectored the ts_base_addr/phy intialization. - Cleaned up the ingr/egr latencies definitons. - Fixed a comment about locking and the shared GPIO. - A few cosmetic fixes. Since v2: - Removed explicit inlines from .c files. - Fixed three warnings. Since v1: - Removed checks in rxtstamp/txtstamp as skb cannot be NULL here. - Reworked get_ptp_header_rx/get_ptp_header. - Reworked the locking logic between the PHC and timestamping operations. - Fixed a compilation issue on x86 reported by Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24MIPS: dts: ocelot: describe the load/save GPIOQuentin Schulz
This patch adds a description of the load/save GPIN pin, used in the VSC8584 PHY for timestamping operations. The related pinctrl description is also added. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24dt-bindings: net: phy: vsc8531: document the load/save GPIOAntoine Tenart
A new optional property can be used to reference the load/save GPIO, used for PTP hardware clock (PHC) operations. This patch documents it in the binding documentation. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: mscc: timestamping and PHC supportAntoine Tenart
This patch adds support for PHC and timestamping operations for the MSCC PHY. PTP 1-step and 2-step modes are supported, over Ethernet and UDP. To get and set the PHC time, a GPIO has to be used and changes are only retrieved or committed when on a rising edge. The same GPIO is shared by all PHYs, so the granularity of the lock protecting it has to be different from the ones protecting the 1588 registers (the VSC8584 PHY has 2 1588 blocks, and a single load/save pin). Co-developed-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: mscc: 1588 block initializationQuentin Schulz
This patch adds the first parts of the 1588 support in the MSCC PHY, with registers definition and the 1588 block initialization. Those PHYs are distributed in hardware packages containing multiple times the PHY. The VSC8584 for example is composed of 4 PHYs. With hardware packages, parts of the logic is usually common and one of the PHY has to be used for some parts of the initialization. Following this logic, the 1588 blocks of those PHYs are shared between two PHYs and accessing the registers has to be done using the "base" PHY of the group. This is handled thanks to helpers in the PTP code (and locks). We also need the MDIO bus lock while performing a single read or write to the 1588 registers as the read/write are composed of multiple MDIO transactions (and we don't want other threads updating the page). Co-developed-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: mscc: take into account the 1588 block in MACsec initAntoine Tenart
This patch takes in account the use of the 1588 block in the MACsec initialization, as a conditional configuration has to be done (when the 1588 block is used). Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: mscc: remove the TR CLK disable magic valueQuentin Schulz
This patch adds a define for the 0x8000 magic value used to perform enable/disable actions on the "token ring clock". The patch is only cosmetic. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: mscc: fix copyright and author information in MACsecAntoine Tenart
All headers in the MSCC PHY driver have been copied and pasted from the original mscc.c file. However the information is not necessarily correct, as in the MACsec support. Fix this. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: phy: add support for a common probe between shared PHYsAntoine Tenart
Shared PHYs (PHYs in the same hardware package) may have shared registers and their drivers would usually need to share information. There is currently a way to have a shared (part of the) init, by using phy_package_init_once(). This patch extends the logic to share parts of the probe to allow sharing the initialization of locks or resources retrieval. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Fixes all over the place. This includes a couple of tests that I would normally defer, but since they have already been helpful in catching some bugs, don't build for any users at all, and having them upstream makes life easier for everyone, I think it's ok even at this late stage" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: tools/virtio: Use tools/include/list.h instead of stubs tools/virtio: Reset index in virtio_test --reset. tools/virtio: Extract virtqueue initialization in vq_reset tools/virtio: Use __vring_new_virtqueue in virtio_test.c tools/virtio: Add --reset tools/virtio: Add --batch=random option tools/virtio: Add --batch option virtio-mem: add memory via add_memory_driver_managed() virtio-mem: silence a static checker warning vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap() vdpa: fix typos in the comments for __vdpa_alloc_device()
2020-06-24Merge tag 'for-linus-2020-06-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread fix from Christian Brauner: "This fixes a regression introduced with 303cc571d107 ("nsproxy: attach to namespaces via pidfds"). The LTP testsuite reported a regression where users would now see EBADF returned instead of EINVAL when an fd was passed that referred to an open file but the file was not a namespace file. Fix this by continuing to report EINVAL and add a regression test" * tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: test for setns() EINVAL regression nsproxy: restore EINVAL for non-namespace file descriptor
2020-06-24IB/hfi1: Add atomic triggered sleep/wakeupMike Marciniszyn
When running iperf in a two host configuration the following trace can occur: [ 319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out The issue happens because the current implementation relies on the netif txq being stopped to control the flushing of the tx list. There are two resources that the transmit logic can wait on and stop the txq: - SDMA descriptors - Ring space to hold completions The ring space is tested on the sending side and relieved when the ring is consumed in the napi tx reaping. Unfortunately, that reaping can run conncurrently with the workqueue flushing of the txlist. If the txq is started just before the workitem executes, the txlist will never be flushed, leading to the txq being stuck. Fix by: - Adding sleep/wakeup wrappers * Use an atomic to control the call to the netif routines inside the wrappers - Use another atomic to record ring space exhaustion * Only wakeup when the a ring space exhaustion has happened and it relieved Add additional wrappers to clarify the ring space resource handling. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24IB/hfi1: Correct -EBUSY handling in tx codeMike Marciniszyn
The current code mishandles -EBUSY in two ways: - The flow change doesn't test the return from the flush and runs on to process the current packet racing with the wakeup processing - The -EBUSY handling for a single packet inserts the tx into the txlist after the submit call, racing with the same wakeup processing Fix the first by dropping the skb and returning NETDEV_TX_OK. Fix the second by insuring the the list entry within the txreq is inited when allocated. This enables the sleep routine to detect that the txreq has used the non-list api and queue the packet to the txlist. Both flaws can lead to having the flushing thread executing in causing two threads to manipulate the txlist. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24IB/hfi1: Fix module use count flaw due to leftover module put callsDennis Dalessandro
When the try_module_get calls were removed from opening and closing of the i2c debugfs file, the corresponding module_put calls were missed. This results in an inaccurate module use count that requires a power cycle to fix. Fixes: 09fbca8e6240 ("IB/hfi1: No need to use try_module_get for debugfs") Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24IB/hfi1: Restore kfree in dummy_netdev cleanupDennis Dalessandro
We need to do some rework on the dummy netdev. Calling the free_netdev() would normally make sense, and that will be addressed in an upcoming patch. For now just revert the behavior to what it was before keeping the unused variable removal part of the patch. The dd->dumm_netdev is mainly used for packet receiving through alloc_netdev_mqs() for typical net devices. A a result, it should be freed with kfree instead of free_netdev() that leads to a crash when unloading the hfi1 module: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 8000000855b54067 P4D 8000000855b54067 PUD 84a4f5067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 73 PID: 10299 Comm: modprobe Not tainted 5.6.0-rc5+ #1 Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016 RIP: 0010:__hw_addr_flush+0x12/0x80 Code: 40 00 48 83 c4 08 4c 89 e7 5b 5d 41 5c e9 76 77 18 00 66 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 1f 48 39 df <48> 8b 2b 75 08 eb 4a 48 89 eb 48 89 c5 48 89 df e8 99 bf d0 ff 84 RSP: 0018:ffffb40e08783db8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 RDX: ffffb40e00000000 RSI: 0000000000000246 RDI: ffff88ab13662298 RBP: ffff88ab13662000 R08: 0000000000001549 R09: 0000000000001549 R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88ab13662298 R13: ffff88ab1b259e20 R14: ffff88ab1b259e42 R15: 0000000000000000 FS: 00007fb39b534740(0000) GS:ffff88b31f940000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000084d3ea004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: dev_addr_flush+0x15/0x30 free_netdev+0x7e/0x130 hfi1_netdev_free+0x59/0x70 [hfi1] remove_one+0x65/0x110 [hfi1] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0xec/0x1b0 driver_detach+0x46/0x90 bus_remove_driver+0x58/0xd0 pci_unregister_driver+0x26/0xa0 hfi1_mod_cleanup+0xc/0xd54 [hfi1] __x64_sys_delete_module+0x16c/0x260 ? exit_to_usermode_loop+0xa4/0xc0 do_syscall_64+0x5b/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 193ba03141bb ("IB/hfi1: Use free_netdev() in hfi1_netdev_free()") Link: https://lore.kernel.org/r/20200623203224.106975.16926.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24bpf: Add SO_KEEPALIVE and related options to bpf_setsockoptDmitry Yakunin
This patch adds support of SO_KEEPALIVE flag and TCP related options to bpf_setsockopt() routine. This is helpful if we want to enable or tune TCP keepalive for applications which don't do it in the userspace code. v3: - update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>) v4: - update kernel-doc in tools too (Alexei Starovoitov) - add test to selftests (Alexei Starovoitov) Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
2020-06-24tcp: Expose tcp_sock_set_keepidle_lockedDmitry Yakunin
This is preparation for usage in bpf_setsockopt. v2: - remove redundant EXPORT_SYMBOL (Alexei Starovoitov) Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200620153052.9439-2-zeil@yandex-team.ru
2020-06-24sock: Move sock_valbool_flag to headerDmitry Yakunin
This is preparation for usage in bpf_setsockopt. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200620153052.9439-1-zeil@yandex-team.ru
2020-06-24selftests/bpf: Workaround for get_stack_rawtp test.Alexei Starovoitov
./test_progs-no_alu32 -t get_stack_raw_tp fails due to: 52: (85) call bpf_get_stack#67 53: (bf) r8 = r0 54: (bf) r1 = r8 55: (67) r1 <<= 32 56: (c7) r1 s>>= 32 ; if (usize < 0) 57: (c5) if r1 s< 0x0 goto pc+26 R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); 58: (1f) r9 -= r8 ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); 59: (bf) r2 = r7 60: (0f) r2 += r1 regs=1 stack=0 before 52: (85) call bpf_get_stack#67 ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); 61: (bf) r1 = r6 62: (bf) r3 = r9 63: (b7) r4 = 0 64: (85) call bpf_get_stack#67 R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R3 unbounded memory access, use 'var &= const' or 'if (var < const)' In the C code: usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK); if (usize < 0) return 0; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0); if (ksize < 0) return 0; We used to have problem with pointer arith in R2. Now it's a problem with two integers in R3. 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800]. Both registers represent the same 'usize' variable. Then R9 -= R8 is doing 800 - [-inf, 800] so the result of "max_len - usize" looks unbounded to the verifier while it's obvious in C code that "max_len - usize" should be [0, 800]. To workaround the problem convert ksize and usize variables from int to long. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-24libbpf: Prevent loading vmlinux BTF twiceAndrii Nakryiko
Prevent loading/parsing vmlinux BTF twice in some cases: for CO-RE relocations and for BTF-aware hooks (tp_btf, fentry/fexit, etc). Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200624043805.1794620-1-andriin@fb.com
2020-06-24libbpf: Fix spelling mistake "kallasyms" -> "kallsyms"Colin Ian King
There is a spelling mistake in a pr_warn message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200623084207.149253-1-colin.king@canonical.com
2020-06-24tools, bpftool: Fix variable shadowing in emit_obj_refs_json()Quentin Monnet
Building bpftool yields the following complaint: pids.c: In function 'emit_obj_refs_json': pids.c:175:80: warning: declaration of 'json_wtr' shadows a global declaration [-Wshadow] 175 | void emit_obj_refs_json(struct obj_refs_table *table, __u32 id, json_writer_t *json_wtr) | ~~~~~~~~~~~~~~~^~~~~~~~ In file included from pids.c:11: main.h:141:23: note: shadowed declaration is here 141 | extern json_writer_t *json_wtr; | ^~~~~~~~ Let's rename the variable. v2: - Rename the variable instead of calling the global json_wtr directly. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200623213600.16643-1-quentin@isovalent.com
2020-06-24ALSA: usb-audio: Fix OOB access of mixer element listTakashi Iwai
The USB-audio mixer code holds a linked list of usb_mixer_elem_list, and several operations are performed for each mixer element. A few of them (snd_usb_mixer_notify_id() and snd_usb_mixer_interrupt_v2()) assume each mixer element being a usb_mixer_elem_info object that is a subclass of usb_mixer_elem_list, cast via container_of() and access it members. This may result in an out-of-bound access when a non-standard list element has been added, as spotted by syzkaller recently. This patch adds a new field, is_std_info, in usb_mixer_elem_list to indicate that the element is the usb_mixer_elem_info type or not, and skip the access to such an element if needed. Reported-by: syzbot+fb14314433463ad51625@syzkaller.appspotmail.com Reported-by: syzbot+2405ca3401e943c538b5@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200624122340.9615-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-06-23Merge branch 'cxgb4-fix-more-warnings-reported-by-sparse'David S. Miller
Rahul Lakkireddy says: ==================== cxgb4: fix more warnings reported by sparse Patch 1 ensures all callers take on-chip memory lock when flashing PHY firmware to fix lock context imbalance warnings. Patch 2 moves all static arrays in header file to respective C file in device dump collection path. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23cxgb4: move device dump arrays in header to C fileRahul Lakkireddy
Move all arrays related to device dump in header file to C file. Also, move the function that shares the arrays to the same C file. Fixes following warnings reported by make W=1 in several places: cudbg_entity.h:513:18: warning: 't6_hma_ireg_array' defined but not used [-Wunused-const-variable=] 513 | static const u32 t6_hma_ireg_array[][IREG_NUM_ELEM] = { Fixes: a7975a2f9a79 ("cxgb4: collect register dump") Fixes: 17b332f48074 ("cxgb4: add support to read serial flash") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23cxgb4: always sync access when flashing PHY firmwareRahul Lakkireddy
Access to on-chip memory for flashing PHY firmware must always be synchronized. So, ensure the callers take on-chip memory lock. Also fixes following sparse warning: sge.c:1641:26: warning: context imbalance in 't4_load_phy_fw' - different lock contexts for basic block Fixes: 01b6961410b7 ("cxgb4: Add PHY firmware support for T420-BT cards") Fixes: 4ee339e1e92a ("cxgb4: add support to flash PHY image") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23Merge branch 'Two-phylink-pause-fixes'David S. Miller
Russell King says: ==================== Two phylink pause fixes While testing, I discovered two issues with ethtool -A with phylink. First, if there is a PHY bound to the network device, we hit a deadlock when phylib tries to notify us of the link changing as a result of triggering a renegotiation. Second, when we are manually forcing the pause settings, and there is no renegotiation triggered, we do not update the MAC via the new mac_link_up approach. These two patches solve both problems, and will need to be backported to v5.7; they do not apply cleanly there due to the introduction of PCS in the v5.8 merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: phylink: ensure manual pause mode configuration takes effectRussell King
We have been relying on link events and mac_config() when the manual pause modes are changed. With recent developments, such as moving the programming of link state to mac_link_up(), this no longer works. To ensure that we update the MAC, we must generate a link-down followed by a link-up event; we can do that by setting mac_link_dropped and triggering a resolve. Fixes: 91a208f2185a ("net: phylink: propagate resolved link config via mac_link_up()") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: phylink: fix ethtool -A with attached PHYsRussell King
Fix a phylink's ethtool set_pauseparam support deadlock caused by phylib interacting with phylink: we must not hold the state lock while calling phylib functions that may call into phylink_phy_change(). Fixes: f904f15ea9b5 ("net: phylink: allow ethtool -A to change flow control advertisement") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: Do not clear the sock TX queue in sk_set_socket()Tariq Toukan
Clearing the sock TX queue in sk_set_socket() might cause unexpected out-of-order transmit when called from sock_orphan(), as outstanding packets can pick a different TX queue and bypass the ones already queued. This is undesired in general. More specifically, it breaks the in-order scheduling property guarantee for device-offloaded TLS sockets. Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it explicitly only where needed. Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23selftests/net: plug rxtimestamp test into kselftest frameworktannerlove
Run rxtimestamp as part of TEST_PROGS. Analogous to other tests, add new rxtimestamp.sh wrapper script, so that the test runs isolated from background traffic in a private network namespace. Also ignore failures of test case #6 by default. This case verifies that a receive timestamp is not reported if timestamp reporting is enabled for a socket, but generation is disabled. Receive timestamp generation has to be enabled globally, as no associated socket is known yet. A background process that enables rx timestamp generation therefore causes a false positive. Ntpd is one example that does. Add a "--strict" option to cause failure in the event that any test case fails, including test #6. This is useful for environments that are known to not have such background processes. Tested: make -C tools/testing/selftests TARGETS="net" run_tests Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: thunderbolt: Add comment clarifying prtcstns flagsMika Westerberg
ThunderboltIP protocol currently has two flags from which we only support and set match frags ID. The first flag is reserved for full E2E flow control. Add a comment that clarifies them. Suggested-by: Yehezkel Bernat <yehezkelshb@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Yehezkel Bernat <YehezkelShB@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23Merge branch 'ACPI-support-for-xgmac_mdio-drivers'David S. Miller
Calvin Johnson says: ==================== ACPI support for xgmac_mdio drivers. This patch series provides ACPI support for xgmac_mdio driver. Changes in v3: - handle case MDIOBUS_NO_CAP Changes in v2: - Reserve "0" to mean that no mdiobus capabilities have been declared. - bus->id: change to appropriate printk format specifier - clean up xgmac_acpi_match - clariy platform_get_resource() usage with comments ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net/fsl: enable extended scanning in xgmac_mdioJeremy Linton
Since we know the xgmac hardware always has a c45 compliant bus, let's try scanning for c22 capable PHYs first. If we fail to find any, then it will fall back to c45 automatically. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net/fsl: acpize xgmac_mdioCalvin Johnson
Add ACPI support for xgmac MDIO bus registration while maintaining the existing DT support. The function mdiobus_register() inside of_mdiobus_register(), brings up all the PHYs on the mdio bus and attach them to the bus. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>