summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-02-28net: phy: provide phy driver start/stop hooksRussell King
Provide phy driver start/stop hooks so that the PHY driver knows when the network driver is starting or stopping. This will be used for the Marvell 10G driver so that we can sanely refuse to start if the PHYs firmware is not present, and also so that we can sanely support SFPs behind the PHY. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: phy: add resolved pause supportRussell King
Allow phylib drivers to pass the hardware-resolved pause state to MAC drivers, rather than using the software-based pause resolution code. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: phylink: pcs: add 802.3 clause 45 helpersRussell King
Implement helpers for PCS accessed via the MII bus using 802.3 clause 45 cycles for 10GBASE-R. Only link up/down is supported, 10G full duplex is assumed. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: phylink: pcs: add 802.3 clause 22 helpersRussell King
Implement helpers for PCS accessed via the MII bus using 802.3 clause 22 cycles, conforming to 802.3 clause 37 and Cisco SGMII specifications for the advertisement word. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: phylink: add separate pcs operations structureRussell King
Add a separate set of PCS operations, which MAC drivers can use to couple phylink with their associated MAC PCS layer. The PCS operations include: - pcs_get_state() - reads the link up/down, resolved speed, duplex and pause from the PCS. - pcs_config() - configures the PCS for the specified mode, PHY interface type, and setting the advertisement. - pcs_an_restart() - restarts 802.3 in-band negotiation with the link partner - pcs_link_up() - informs the PCS that link has come up, and the parameters of the link. Link parameters are used to program the PCS for fixed speed and non-inband modes. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: mdiobus: add APIs for modifying a MDIO device registerRussell King
Add APIs for modifying a MDIO device register, similar to the existing phy_modify() group of functions, but at mdiobus level instead. Adapt __phy_modify_changed() to use the new mdiobus level helper. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: mii: add linkmode_adv_to_mii_adv_x()Russell King
Add a helper to convert a linkmode advertisement to a clause 37 advertisement value for 1000base-x and 2500base-x. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: mii: convert mii_lpa_to_ethtool_lpa_x() to linkmode variantRussell King
Add a LPA to linkmode decoder for 1000BASE-X protocols; this decoder only provides the modify semantics similar to other such decoders. This replaces the unused mii_lpa_to_ethtool_lpa_x() helper. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: dsa: propagate resolved link config via mac_link_up()Russell King
Propagate the resolved link configuration down via DSA's phylink_mac_link_up() operation to allow split PCS/MAC to work. Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-28net: phylink: propagate resolved link config via mac_link_up()Russell King
Propagate the resolved link parameters via the mac_link_up() call for MACs that do not automatically track their PCS state. We propagate the link parameters via function arguments so that inappropriate members of struct phylink_link_state can't be accessed, and creating a new structure just for this adds needless complexity to the API. Tested-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-17net: phylink: clarify flow control settings in documentationRussell King
Clarify the expected flow control settings operation in the phylink documentation for each negotiation mode. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-17net: phylink: resolve fixed link flow controlRussell King
Resolve the fixed link flow control using the recently introduced linkmode_resolve_pause() helper, which we use in phylink_get_fixed_state() only when operating in full duplex mode. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-17net: add linkmode helper for setting flow control advertisementRussell King
Add a linkmode helper to set the flow control advertisement in an ethtool linkmode mask according to the tx/rx capabilities. This implementation is moved from phylib, and documented with an analysis of its shortcomings. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-17net: add helpers to resolve negotiated flow controlRussell King
Add a couple of helpers to resolve negotiated flow control. Two helpers are provided: - linkmode_resolve_pause() which takes the link partner and local advertisements, and decodes whether we should enable TX or RX pause at the MAC. This is useful outside of phylib, e.g. in phylink. - phy_get_pause(), which returns the TX/RX enablement status for the current negotiation results of the PHY. This allows us to centralise the flow control resolution, rather than spreading it around. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-02-17net: linkmode: make linkmode_test_bit() take const pointerRussell King
linkmode_test_bit() does not modify the address; test_bit() is also declared const volatile for the same reason. There's no need for linkmode_test_bit() to be any different, and allows implementation of helpers that take a const linkmode pointer. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-29net: phy: Added IRQ print to phylink_bringup_phy()Florian Fainelli
The information about the PHY attached to the PHYLINK instance is useful but is missing the IRQ prints that phy_attached_info() adds. phy_attached_info() is a bit long and it would not be possible to use phylink_info() anyway. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-29net: phylink: add support for polling MAC PCSVladimir Oltean
Some MAC PCS blocks are unable to provide interrupts when their status changes. As we already have support in phylink for polling status, use this to provide a hook for MACs to enable polling mode. The patch idea was picked up from Russell King's suggestion on the macb phylink patch thread here [0] but the implementation was changed. Instead of introducing a new phylink_start_poll() function, which would make the implementation cumbersome for common PHYLINK implementations for multiple types of devices, like DSA, just add a boolean property to the phylink_config structure, which is just as backwards-compatible. https://lkml.org/lkml/2019/12/16/603 Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27net: phy: add PHY_INTERFACE_MODE_10GBASERRussell King
Recent discussion has revealed that the use of PHY_INTERFACE_MODE_10GKR is incorrect. Add a 10GBASE-R definition, document both the -R and -KR versions, and the fact that 10GKR was used incorrectly. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: phy: provide and use genphy_read_status_fixed()Russell King
There are two drivers and generic code which contain exactly the same code to read the status of a PHY operating without autonegotiation enabled. Rather than duplicate this code, provide a helper to read this information. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: phy: add genphy_check_and_restart_aneg()Russell King
Add a helper for restarting autonegotiation(), similar to the clause 45 variant. Use it in __genphy_config_aneg() Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: phylink: delay MAC configuration for copper SFP modulesRussell King
Knowing whether we need to delay the MAC configuration because a module may have a PHY is useful to phylink to allow NBASE-T modules to work on systems supporting no more than 2.5G speeds. This commit allows us to delay such configuration until after the PHY has been probed by recording the parsed capabilities, and if the module may have a PHY, doing no more until the module_start() notification is called. At that point, we either have a PHY, or we don't. We move the PHY-based setup a little later, and use the PHYs support capabilities rather than the EEPROM parsed capabilities to determine whether we can support the PHY. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: sfp: add module start/stop upstream notificationsRussell King
When dealing with some copper modules, we can't positively know the module capabilities are until we have probed the PHY. Without the full capabilities, we may end up failing a module that we could otherwise drive with a restricted set of capabilities. An example of this would be a module with a NBASE-T PHY plugged into a host that supports phy interface modes 2500BASE-X and SGMII. The PHY supports 10GBASE-R, 5000BASE-X, 2500BASE-X, SGMII interface modes, which means a subset of the capabilities are compatible with the host. However, reading the module EEPROM leads us to believe that the module only supports ethtool link mode 10GBASE-T, which is incompatible with the host - and thus results in the module being rejected. This patch adds an extra notification which are triggered after the SFP module's PHY probe, and a corresponding notification just before the PHY is removed. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: sfp: add more extended compliance codesRussell King
SFF-8024 is used to define various constants re-used in several SFF SFP-related specifications. Split these constants from the enum, and rename them to indicate that they're defined by SFF-8024. Add and use updated SFF-8024 extended compliance code definitions for 10GBASE-T, 5GBASE-T and 2.5GBASE-T modules. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-27net: sfp: derive interface mode from ethtool link modesRussell King
We don't need the EEPROM ID to derive the phy interface mode as we can derive it merely from the ethtool link modes. Remove the EEPROM ID argument to sfp_select_interface(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-26Merge tag 'block-5.5-2020-01-26' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fix from Jens Axboe: "Unfortunately this weekend we had a few last minute reports, one was for block. The partition disable for zoned devices was overly restrictive, it can work (and be supported) just fine for host-aware variants. Here's a fix ensuring that's the case so we don't break existing users of that" * tag 'block-5.5-2020-01-26' of git://git.kernel.dk/linux-block: block: allow partitions on host aware zone devices
2020-01-26block: allow partitions on host aware zone devicesChristoph Hellwig
Host-aware SMR drives can be used with the commands to explicitly manage zone state, but they can also be used as normal disks. In the former case it makes perfect sense to allow partitions on them, in the latter it does not, just like for host managed devices. Add a check to add_partition to allow partitions on host aware devices, but give up any zone management capabilities in that case, which also catches the previously missed case of adding a partition vs just scanning it. Because sd can rescan the attribute at runtime it needs to check if a disk has partitions, for which a new helper is added to genhd.h. Fixes: 5eac3eb30c9a ("block: Remove partition support for zoned block devices") Reported-by: Borislav Petkov <bp@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Off by one in mt76 airtime calculation, from Dan Carpenter. 2) Fix TLV fragment allocation loop condition in iwlwifi, from Luca Coelho. 3) Don't confirm neigh entries when doing ipsec pmtu updates, from Xu Wang. 4) More checks to make sure we only send TSO packets to lan78xx chips that they can actually handle. From James Hughes. 5) Fix ip_tunnel namespace move, from William Dauchy. 6) Fix unintended packet reordering due to cooperation between listification done by GRO and non-GRO paths. From Maxim Mikityanskiy. 7) Add Jakub Kicincki formally as networking co-maintainer. 8) Info leak in airo ioctls, from Michael Ellerman. 9) IFLA_MTU attribute needs validation during rtnl_create_link(), from Eric Dumazet. 10) Use after free during reload in mlxsw, from Ido Schimmel. 11) Dangling pointers are possible in tp->highest_sack, fix from Eric Dumazet. 12) Missing *pos++ in various networking seq_next handlers, from Vasily Averin. 13) CHELSIO_GET_MEM operation neds CAP_NET_ADMIN check, from Michael Ellerman. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (109 commits) firestream: fix memory leaks net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM net: bcmgenet: Use netif_tx_napi_add() for TX NAPI tipc: change maintainer email address net: stmmac: platform: fix probe for ACPI devices net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path net/mlx5e: kTLS, Remove redundant posts in TX resync flow net/mlx5e: kTLS, Fix corner-case checks in TX resync flow net/mlx5e: Clear VF config when switching modes net/mlx5: DR, use non preemptible call to get the current cpu number net/mlx5: E-Switch, Prevent ingress rate configuration of uplink rep net/mlx5: DR, Enable counter on non-fwd-dest objects net/mlx5: Update the list of the PCI supported devices net/mlx5: Fix lowest FDB pool size net: Fix skb->csum update in inet_proto_csum_replace16(). netfilter: nf_tables: autoload modules from the abort path netfilter: nf_tables: add __nft_chain_type_get() netfilter: nf_tables_offload: fix check the chain offload flag netfilter: conntrack: sctp: use distinct states for new SCTP connections ipv6_route_seq_next should increase position index ...
2020-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Missing netlink attribute sanity check for NFTA_OSF_DREG, from Florian Westphal. 2) Use bitmap infrastructure in ipset to fix KASAN slab-out-of-bounds reads, from Jozsef Kadlecsik. 3) Missing initial CLOSED state in new sctp connection through ctnetlink events, from Jiri Wiesner. 4) Missing check for NFT_CHAIN_HW_OFFLOAD in nf_tables offload indirect block infrastructure, from wenxu. 5) Add __nft_chain_type_get() to sanity check family and chain type. 6) Autoload modules from the nf_tables abort path to fix races reported by syzbot. 7) Remove unnecessary skb->csum update on inet_proto_csum_replace16(), from Praveen Chaudhary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24netfilter: nf_tables: autoload modules from the abort pathPablo Neira Ayuso
This patch introduces a list of pending module requests. This new module list is composed of nft_module_request objects that contain the module name and one status field that tells if the module has been already loaded (the 'done' field). In the first pass, from the preparation phase, the netlink command finds that a module is missing on this list. Then, a module request is allocated and added to this list and nft_request_module() returns -EAGAIN. This triggers the abort path with the autoload parameter set on from nfnetlink, request_module() is called and the module request enters the 'done' state. Since the mutex is released when loading modules from the abort phase, the module list is zapped so this is iteration occurs over a local list. Therefore, the request_module() calls happen when object lists are in consistent state (after fulling aborting the transaction) and the commit list is empty. On the second pass, the netlink command will find that it already tried to load the module, so it does not request it again and nft_request_module() returns 0. Then, there is a look up to find the object that the command was missing. If the module was successfully loaded, the command proceeds normally since it finds the missing object in place, otherwise -ENOENT is reported to userspace. This patch also updates nfnetlink to include the reason to enter the abort phase, which is required for this new autoload module rationale. Fixes: ec7470b834fe ("netfilter: nf_tables: store transaction list locally while requesting module") Reported-by: syzbot+29125d208b3dae9a7019@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-23Merge tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray fixes from Matthew Wilcox: "Primarily bugfixes, mostly around handling index wrap-around correctly. A couple of doc fixes and adding missing APIs. I had an oops live on stage at linux.conf.au this year, and it turned out to be a bug in xas_find() which I can't prove isn't triggerable in the current codebase. Then in looking for the bug, I spotted two more bugs. The bots have had a few days to chew on this with no problems reported, and it passes the test-suite (which now has more tests to make sure these problems don't come back)" * tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-dax: XArray: Add xa_for_each_range XArray: Fix xas_find returning too many entries XArray: Fix xa_find_after with multi-index entries XArray: Fix infinite loop with entry at ULONG_MAX XArray: Add wrappers for nested spinlocks XArray: Improve documentation of search marks XArray: Fix xas_pause at ULONG_MAX
2020-01-23Merge tag 'trace-v5.5-rc6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various tracing fixes: - Fix a function comparison warning for a xen trace event macro - Fix a double perf_event linking to a trace_uprobe_filter for multiple events - Fix suspicious RCU warnings in trace event code for using list_for_each_entry_rcu() when the "_rcu" portion wasn't needed. - Fix a bug in the histogram code when using the same variable - Fix a NULL pointer dereference when tracefs lockdown enabled and calling trace_set_default_clock() - A fix to a bug found with the double perf_event linking patch" * tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/uprobe: Fix to make trace_uprobe_filter alignment safe tracing: Do not set trace clock if tracefs lockdown is in effect tracing: Fix histogram code when expression has same var as value tracing: trigger: Replace unneeded RCU-list traversals tracing/uprobe: Fix double perf_event linking on multiprobe uprobe tracing: xen: Ordered comparison of function pointers
2020-01-23net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link()Eric Dumazet
rtnl_create_link() needs to apply dev->min_mtu and dev->max_mtu checks that we apply in do_setlink() Otherwise malicious users can crash the kernel, for example after an integer overflow : BUG: KASAN: use-after-free in memset include/linux/string.h:365 [inline] BUG: KASAN: use-after-free in __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238 Write of size 32 at addr ffff88819f20b9c0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192 memset+0x24/0x40 mm/kasan/common.c:108 memset include/linux/string.h:365 [inline] __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238 alloc_skb include/linux/skbuff.h:1049 [inline] alloc_skb_with_frags+0x93/0x590 net/core/skbuff.c:5664 sock_alloc_send_pskb+0x7ad/0x920 net/core/sock.c:2242 sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2259 mld_newpack+0x1d7/0x7f0 net/ipv6/mcast.c:1609 add_grhead.isra.0+0x299/0x370 net/ipv6/mcast.c:1713 add_grec+0x7db/0x10b0 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:1970 [inline] mld_ifc_timer_expire+0x3d3/0x950 net/ipv6/mcast.c:2477 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61 Code: 98 6b ea f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 44 1c 60 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 34 1c 60 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 4e 5d 9a f9 e8 79 RSP: 0018:ffffffff89807ce8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffffffff13266ae RBX: ffffffff8987a1c0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffffffff8987aa54 RBP: ffffffff89807d18 R08: ffffffff8987a1c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000 R13: ffffffff8a799980 R14: 0000000000000000 R15: 0000000000000000 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:690 default_idle_call+0x84/0xb0 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361 rest_init+0x23b/0x371 init/main.c:451 arch_call_rest_init+0xe/0x1b start_kernel+0x904/0x943 init/main.c:784 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x77/0x7b arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242 The buggy address belongs to the page: page:ffffea00067c82c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 raw: 057ffe0000000000 ffffea00067c82c8 ffffea00067c82c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88819f20b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88819f20b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff88819f20b980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88819f20ba00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88819f20ba80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-22Merge tag 'io_uring-5.5-2020-01-22' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fix from Jens Axboe: "This was supposed to have gone in last week, but due to a brain fart on my part, I forgot that we made this struct addition in the 5.5 cycle. So here it is for 5.5, to prevent having a 32 vs 64-bit compatability issue with the files_update command" * tag 'io_uring-5.5-2020-01-22' of git://git.kernel.dk/linux-block: io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
2020-01-20io_uring: fix compat for IORING_REGISTER_FILES_UPDATEEugene Syromiatnikov
fds field of struct io_uring_files_update is problematic with regards to compat user space, as pointer size is different in 32-bit, 32-on-64-bit, and 64-bit user space. In order to avoid custom handling of compat in the syscall implementation, make fds __u64 and use u64_to_user_ptr in order to retrieve it. Also, align the field naturally and check that no garbage is passed there. Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20netfilter: ipset: use bitmap infrastructure completelyKadlecsik József
The bitmap allocation did not use full unsigned long sizes when calculating the required size and that was triggered by KASAN as slab-out-of-bounds read in several places. The patch fixes all of them. Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix non-blocking connect() in x25, from Martin Schiller. 2) Fix spurious decryption errors in kTLS, from Jakub Kicinski. 3) Netfilter use-after-free in mtype_destroy(), from Cong Wang. 4) Limit size of TSO packets properly in lan78xx driver, from Eric Dumazet. 5) r8152 probe needs an endpoint sanity check, from Johan Hovold. 6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from John Fastabend. 7) hns3 needs short frames padded on transmit, from Yunsheng Lin. 8) Fix netfilter ICMP header corruption, from Eyal Birger. 9) Fix soft lockup when low on memory in hns3, from Yonglong Liu. 10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan. 11) Fix memory leak in act_ctinfo, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: reject overlapped queues in TC-MQPRIO offload cxgb4: fix Tx multi channel port rate limit net: sched: act_ctinfo: fix memory leak bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal. bnxt_en: Fix ipv6 RFS filter matching logic. bnxt_en: Fix NTUPLE firmware command failures. net: systemport: Fixed queue mapping in internal ring map net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec net: dsa: sja1105: Don't error out on disabled ports with no phy-mode net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset net: hns: fix soft lockup when there is not enough memory net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key() net/sched: act_ife: initalize ife->metalist earlier netfilter: nat: fix ICMP header corruption on ICMP errors net: wan: lapbether.c: Use built-in RCU list checking netfilter: nf_tables: fix flowtable list del corruption netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks() netfilter: nf_tables: remove WARN and add NLA_STRING upper limits netfilter: nft_tunnel: ERSPAN_VERSION must not be null netfilter: nft_tunnel: fix null-attribute check ...
2020-01-18Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Back from LCA2020, fixes wasn't too busy last week, seems to have quieten down appropriately, some amdgpu, i915, then a core mst fix and one fix for virtio-gpu and one for rockchip: core mst: - serialize down messages and clear timeslots are on unplug amdgpu: - Update golden settings for renoir - eDP fix i915: - uAPI fix: Remove dash and colon from PMU names to comply with tools/perf - Fix for include file that was indirectly included - Two fixes to make sure VMA are marked active for error capture virtio: - maintain obj reservation lock when submitting cmds rockchip: - increase link rate var size to accommodate rates" * tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Reorder detect_edp_sink_caps before link settings read. drm/amdgpu: update goldensetting for renoir drm/dp_mst: Have DP_Tx send one msg at a time drm/dp_mst: clear time slots for ports invalid drm/i915/pmu: Do not use colons or dashes in PMU names drm/rockchip: fix integer type used for storing dp data rate drm/i915/gt: Mark ring->vma as active while pinned drm/i915/gt: Mark context->state vma as active while pinned drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings drm/i915: Add missing include file <linux/math64.h> drm/virtio: add missing virtio_gpu_array_lock_resv call
2020-01-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull rseq fixes from Ingo Molnar: "Two rseq bugfixes: - CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end up corrupting the TLS of the parent. Technically a change in the ABI but the previous behavior couldn't resonably have been relied on by applications so this looks like a valid exception to the ABI rule. - Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the handling of other flags. This is not thought to impact any applications either" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Unregister rseq for clone CLONE_VM rseq: Reject unknown flags on rseq unregister
2020-01-17XArray: Add xa_for_each_rangeMatthew Wilcox (Oracle)
This function supports iterating over a range of an array. Also add documentation links for xa_for_each_start(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-01-17XArray: Add wrappers for nested spinlocksMatthew Wilcox (Oracle)
Some users need to take an xarray lock while holding another xarray lock. Reported-by: Doug Gilbert <dgilbert@interlog.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-01-18Merge tag 'drm-misc-fixes-2020-01-16' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes virtio: maintain obj reservation lock when submitting cmds (Gerd) rockchip: increase link rate var size to accommodate rates (Tobias) mst: serialize down messages and clear timeslots are on unplug (Wayne) Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Tobias Schramm <t.schramm@manjaro.org> Cc: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200116162856.GA11524@art_vandelay
2020-01-17Merge tag 'block-5.5-2020-01-16' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Three fixes that should go into this release: - The 32-bit segment size fix that I mentioned last week (Ming) - Use uint for the block size (Mikulas) - A null_blk zone write handling fix (Damien)" * tag 'block-5.5-2020-01-16' of git://git.kernel.dk/linux-block: block: fix an integer overflow in logical block size null_blk: Fix zone write handling block: fix get_max_segment_size() overflow on 32bit arch
2020-01-16Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "I've been sitting on these longer than I meant, so the patch count is a bit higher than ideal for this part of the release. There's also some reverts of double-applied patches that brings the diffstat up a bit. With that said, the biggest changes are: - Revert of duplicate i2c device addition on two Aspeed (BMC) Devicetrees. - Move of two device nodes that got applied to the wrong part of the tree on ASpeed G6. - Regulator fix for Beaglebone X15 (adding 12/5V supplies) - Use interrupts for keys on Amlogic SM1 to avoid missed polls In addition to that, there is a collection of smaller DT fixes: - Power supply assignment fixes for i.MX6 - Fix of interrupt line for magnetometer on i.MX8 Librem5 devkit - Build fixlets (selects) for davinci/omap2+ - More interrupt number fixes for Stratix10, Amlogic SM1, etc. - ... and more similar fixes across different platforms And some non-DT stuff: - optee fix to register multiple shared pages properly - Clock calculation fixes for MMP3 - Clock fixes for OMAP as well" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (42 commits) MAINTAINERS: Add myself as the co-maintainer for Actions Semi platforms ARM: dts: imx7: Fix Toradex Colibri iMX7S 256MB NAND flash support ARM: dts: imx6sll-evk: Remove incorrect power supply assignment ARM: dts: imx6sl-evk: Remove incorrect power supply assignment ARM: dts: imx6sx-sdb: Remove incorrect power supply assignment ARM: dts: imx6qdl-sabresd: Remove incorrect power supply assignment ARM: dts: imx6q-icore-mipi: Use 1.5 version of i.Core MX6DL ARM: omap2plus: select RESET_CONTROLLER ARM: davinci: select CONFIG_RESET_CONTROLLER ARM: dts: aspeed: rainier: Fix fan fault and presence ARM: dts: aspeed: rainier: Remove duplicate i2c busses ARM: dts: aspeed: tacoma: Remove duplicate flash nodes ARM: dts: aspeed: tacoma: Remove duplicate i2c busses ARM: dts: aspeed: tacoma: Fix fsi master node ARM: dts: aspeed-g6: Fix FSI master location ARM: dts: mmp3: Fix the TWSI ranges clk: mmp2: Fix the order of timer mux parents ARM: mmp: do not divide the clock rate arm64: dts: rockchip: Fix IR on Beelink A1 optee: Fix multi page dynamic shm pool alloc ...
2020-01-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2020-01-15 The following pull-request contains BPF updates for your *net* tree. We've added 12 non-merge commits during the last 9 day(s) which contain a total of 13 files changed, 95 insertions(+), 43 deletions(-). The main changes are: 1) Fix refcount leak for TCP time wait and request sockets for socket lookup related BPF helpers, from Lorenz Bauer. 2) Fix wrong verification of ARSH instruction under ALU32, from Daniel Borkmann. 3) Batch of several sockmap and related TLS fixes found while operating more complex BPF programs with Cilium and OpenSSL, from John Fastabend. 4) Fix sockmap to read psock's ingress_msg queue before regular sk_receive_queue() to avoid purging data upon teardown, from Lingpeng Chen. 5) Fix printing incorrect pointer in bpftool's btf_dump_ptr() in order to properly dump a BPF map's value with BTF, from Martin KaFai Lau. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15block: fix an integer overflow in logical block sizeMikulas Patocka
Logical block size has type unsigned short. That means that it can be at most 32768. However, there are architectures that can run with 64k pages (for example arm64) and on these architectures, it may be possible to create block devices with 64k block size. For exmaple (run this on an architecture with 64k pages): Mount will fail with this error because it tries to read the superblock using 2-sector access: device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536 EXT4-fs (dm-0): unable to read superblock This patch changes the logical block size from unsigned short to unsigned int to avoid the overflow. Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15bpf: Sockmap/tls, push write_space updates through ulp updatesJohn Fastabend
When sockmap sock with TLS enabled is removed we cleanup bpf/psock state and call tcp_update_ulp() to push updates to TLS ULP on top. However, we don't push the write_space callback up and instead simply overwrite the op with the psock stored previous op. This may or may not be correct so to ensure we don't overwrite the TLS write space hook pass this field to the ULP and have it fixup the ctx. This completes a previous fix that pushed the ops through to the ULP but at the time missed doing this for write_space, presumably because write_space TLS hook was added around the same time. Fixes: 95fa145479fbc ("bpf: sockmap/tls, close can race with map free") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com
2020-01-15bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loopJohn Fastabend
When a sockmap is free'd and a socket in the map is enabled with tls we tear down the bpf context on the socket, the psock struct and state, and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform the tls stack it needs to update its saved sock ops so that when the tls socket is later destroyed it doesn't try to call the now destroyed psock hooks. This is about keeping stacked ULPs in good shape so they always have the right set of stacked ops. However, recently unhash() hook was removed from TLS side. But, the sockmap/bpf side is not doing any extra work to update the unhash op when is torn down instead expecting TLS side to manage it. So both TLS and sockmap believe the other side is managing the op and instead no one updates the hook so it continues to point at tcp_bpf_unhash(). When unhash hook is called we call tcp_bpf_unhash() which detects the psock has already been destroyed and calls sk->sk_prot_unhash() which calls tcp_bpf_unhash() yet again and so on looping and hanging the core. To fix have sockmap tear down logic fixup the stale pointer. Fixes: 5d92e631b8be ("net/tls: partially revert fix transition through disconnect with close") Reported-by: syzbot+83979935eb6304f8cd46@syzkaller.appspotmail.com Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-2-john.fastabend@gmail.com
2020-01-15drm/dp_mst: Have DP_Tx send one msg at a timeWayne Lin
[Why] Noticed this while testing MST with the 4 ports MST hub from StarTech.com. Sometimes can't light up monitors normally and get the error message as 'sideband msg build failed'. Look into aux transactions, found out that source sometimes will send out another down request before receiving the down reply of the previous down request. On the other hand, in drm_dp_get_one_sb_msg(), current code doesn't handle the interleaved replies case. Hence, source can't build up message completely and can't light up monitors. [How] For good compatibility, enforce source to send out one down request at a time. Add a flag, is_waiting_for_dwn_reply, to determine if the source can send out a down request immediately or not. - Check the flag before calling process_single_down_tx_qlock to send out a msg - Set the flag when successfully send out a down request - Clear the flag when successfully build up a down reply - Clear the flag when find erros during sending out a down request - Clear the flag when find errors during building up a down reply - Clear the flag when timeout occurs during waiting for a down reply - Use drm_dp_mst_kick_tx() to try to send another down request in queue at the end of drm_dp_mst_wait_tx_reply() (attempt to send out messages in queue when errors occur) Cc: Lyude Paul <lyude@redhat.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200113093649.11755-1-Wayne.Lin@amd.com
2020-01-15bpf: Fix incorrect verifier simulation of ARSH under ALU32Daniel Borkmann
Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (85) call bpf_get_socket_cookie#46 1: R0_w=invP(id=0) R10=fp0 1: (57) r0 &= 808464432 2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 2: (14) w0 -= 810299440 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 3: (c4) w0 s>>= 1 4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 4: (76) if w0 s>= 0x30303030 goto pc+216 221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 221: (95) exit processed 6 insns (limit 1000000) [...] Taking a closer look, the program was xlated as follows: # ./bpftool p d x i 12 0: (85) call bpf_get_socket_cookie#7800896 1: (bf) r6 = r0 2: (57) r6 &= 808464432 3: (14) w6 -= 810299440 4: (c4) w6 s>>= 1 5: (76) if w6 s>= 0x30303030 goto pc+216 6: (05) goto pc-1 7: (05) goto pc-1 8: (05) goto pc-1 [...] 220: (05) goto pc-1 221: (05) goto pc-1 222: (95) exit Meaning, the visible effect is very similar to f54c7898ed1c ("bpf: Fix precision tracking for unbounded scalars"), that is, the fall-through branch in the instruction 5 is considered to be never taken given the conclusion from the min/max bounds tracking in w6, and therefore the dead-code sanitation rewrites it as goto pc-1. However, real-life input disagrees with verification analysis since a soft-lockup was observed. The bug sits in the analysis of the ARSH. The definition is that we shift the target register value right by K bits through shifting in copies of its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the register into 32 bit mode, same happens after simulating the operation. However, for the case of simulating the actual ARSH, we don't take the mode into account and act as if it's always 64 bit, but location of sign bit is different: dst_reg->smin_value >>= umin_val; dst_reg->smax_value >>= umin_val; dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val); Consider an unknown R0 where bpf_get_socket_cookie() (or others) would for example return 0xffff. With the above ARSH simulation, we'd see the following results: [...] 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (57) r0 &= 808464432 -> R0_runtime = 0x3030 3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 3: (14) w0 -= 810299440 -> R0_runtime = 0xcfb40000 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 (0xffffffff) 4: (c4) w0 s>>= 1 -> R0_runtime = 0xe7da0000 5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0x7ffbfff8) [...] In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011 0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into 0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000' and means here that the simulation didn't retain the sign bit. With above logic, the updates happen on the 64 bit min/max bounds and given we coerced the register, the sign bits of the bounds are cleared as well, meaning, we need to force the simulation into s32 space for 32 bit alu mode. Verification after the fix below. We're first analyzing the fall-through branch on 32 bit signed >= test eventually leading to rejection of the program in this specific case: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r2 = 808464432 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (bf) r6 = r0 3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0 3: (57) r6 &= 808464432 4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 4: (14) w6 -= 810299440 5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 5: (c4) w6 s>>= 1 6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0xfffbfff8) 6: (76) if w6 s>= 0x30303030 goto pc+216 7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 7: (30) r0 = *(u8 *)skb[808464432] BPF_LD_[ABS|IND] uses reserved fields processed 8 insns (limit 1000000) [...] Fixes: 9cbe1f5a32dc ("bpf/verifier: improve register value range tracking with ARSH") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net
2020-01-15Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fixes from Al Viro: "Fixes for mountpoint_last() bugs (by converting to use of lookup_last()) and an autofs regression fix from this cycle (caused by follow_managed() breakage introduced in barrier fixes series)" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix autofs regression caused by follow_managed() changes reimplement path_mountpoint() with less magic