linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-07-23	net/mlx5: FW tracer, Enable tracing	Feras Daoud
	Add the tracer file to the makefile and add the init function to the load one flow. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	net/mlx5: FW tracer, parse traces and kernel tracing support	Feras Daoud
	For each message the driver should do the following: 1- Find the message string in the strings database 2- Count the param number of each message 3- Wait for the param events and accumulate them 4- Calculate the event timestamp using the local event timestamp and the first timestamp event following it. 5- Print message to trace log Enable the tracing by: echo 1 > /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable Read traces by: cat /sys/kernel/debug/tracing/trace Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	net/mlx5: FW tracer, events handling	Feras Daoud
	The tracer has one event, event 0x26, with two subtypes: - Subtype 0: Ownership change - Subtype 1: Traces available An ownership change occurs in the following cases: 1- Owner releases his ownership, in this case, an event will be sent to inform others to reattempt acquire ownership. 2- Ownership was taken by a higher priority tool, in this case the owner should understand that it lost ownership, and go through tear down flow. The second subtype indicates that there are traces in the trace buffer, in this case, the driver polls the tracer buffer for new traces, parse them and prepares the messages for printing. The HW starts tracing from the first address in the tracer buffer. Driver receives an event notifying that new trace block exists. HW posts a timestamp event at the last 8B of every 256B block. Comparing the timestamp to the last handled timestamp would indicate that this is a new trace block. Once the new timestamp is detected, the entire block is considered valid. Block validation and parsing, should be done after copying the current block to a different location, in order to avoid block overwritten during processing. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	net/mlx5: FW tracer, register log buffer memory key	Saeed Mahameed
	Create a memory key and protection domain for the tracer log buffer. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	net/mlx5: FW tracer, create trace buffer and copy strings database	Feras Daoud
	For each PF do the following: 1- Allocate memory for the tracer strings database and read the strings from the FW to the SW. These strings will be used later for parsing traces. 2- Allocate and dma map tracer buffers. Traces that will be written into the buffer will be parsed as a group of one or more traces, referred to as trace message. The trace message represents a C-like printf string. First trace of a message holds the pointer to the correct string in strings database. The following traces holds the variables of the message. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	net/mlx5: FW tracer, implement tracer logic	Feras Daoud
	Implement FW tracer logic and registers access, initialization and cleanup flows. Initializing the tracer will be part of load one flow, as multiple PFs will try to acquire ownership but only one will succeed and will be the tracer owner. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	Merge branch 'mlx5-next' of ↵	Saeed Mahameed
	git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 core infrastructure updates and fixes. From Eran: - Add MPEGC (Management PCIe General Configuration) registers and btis - Fix tristate and description for MLX5 module rom Feras: - Add hardware structures for the firmware tracer From Jainbo: - Core support for double vlan push/pop steering action From Max: - Add XRQ commands definitions From Noa: - Add missing SET_DRIVER_VERSION command translation From Roi: - Use ERR_CAST() instead of coding it From Tariq: - Better return types for CQE API Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23	gpio: uniphier: set legitimate irq trigger type in .to_irq hook	Masahiro Yamada
	If a GPIO chip is a part of a hierarchy IRQ domain, there is no way to specify the trigger type when gpio(d)_to_irq() allocates an interrupt on-the-fly. Currently, uniphier_gpio_to_irq() sets IRQ_TYPE_NONE, but it causes an error in the .alloc() hook of the parent domain. (drivers/irq/irq-uniphier-aidet.c) Even if we change irq-uniphier-aidet.c to accept the NONE type, GIC complains about it since commit 83a86fbb5b56 ("irqchip/gic: Loudly complain about the use of IRQ_TYPE_NONE"). Instead, use IRQ_TYPE_LEVEL_HIGH as a temporary value when an irq is allocated. irq_set_irq_type() will override it when the irq is really requested. Fixes: dbe776c2ca54 ("gpio: uniphier: add UniPhier GPIO controller driver") Reported-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-23	gpio: of: Handle fixed regulator flags properly	Linus Walleij
	This fixes up the handling of fixed regulator polarity inversion flags: while I remembered to fix it for the undocumented "reg-fixed-voltage" I forgot about the official "regulator-fixed" binding, there are two ways to do a fixed regulator. The error was noticed and fixed. Fixes: a603a2b8d86e ("gpio: of: Add special quirk to parse regulator flags") Cc: Mark Brown <broonie@kernel.org> Cc: Thierry Reding <thierry.reding@gmail.com> Reported-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-23	Merge branch 'lan743x-Add-features-to-lan743x-driver'	David S. Miller
	Bryan Whitehead says: ==================== lan743x: Add features to lan743x driver This patch series adds extra features to the lan743x driver. Updates for v4: Patch 6/8 - Modified get/set_wol to use super set of MAC and PHY driver support. Patch 7/9 - In set_eee, return the return value from phy_ethtool_set_eee. Updates for v3: Removed patch 9 from this series, regarding PTP support Patch 6/8 - Add call to phy_ethtool_get_wol to lan743x_ethtool_get_wol Patch 7/8 - Add call to phy_ethtool_set_eee on (!eee->eee_enabled) Updates for v2: Patch 3/9 - Used ARRAY_SIZE macro in lan743x_ethtool_get_ethtool_stats. Patch 5/9 - Used MAX_EEPROM_SIZE in lan743x_ethtool_set_eeprom. Patch 6/9 - Removed unnecessary read of PMT_CTL. Used CRC algorithm from lib. Removed PHY interrupt settings from lan743x_pm_suspend Change "#if CONFIG_PM" to "#ifdef CONFIG_PM" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add RSS support	Bryan Whitehead
	Implement RSS support Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add EEE support	Bryan Whitehead
	Implement EEE support Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add power management support	Bryan Whitehead
	Implement power management Supports suspend, resume, and Wake on LAN Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add support for ethtool eeprom access	Bryan Whitehead
	Implement ethtool eeprom access Also provides access to OTP (One Time Programming) Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add support for ethtool message level	Bryan Whitehead
	Implement ethtool message level Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add support for ethtool statistics	Bryan Whitehead
	Implement ethtool statistics Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add support for ethtool link settings	Bryan Whitehead
	Use default link setting functions Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	lan743x: Add support for ethtool get_drvinfo	Bryan Whitehead
	Implement ethtool get_drvinfo Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	Merge branch 'sh_eth-clean-up-the-TSU-register-accessors'	David S. Miller
	Sergei Shtylyov says: ==================== sh_eth: clean up the TSU register accessors Here's a set of 5 patches against DaveM's 'net-next.git' repo. They do a final clean up of the TSU register accessors... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	sh_eth: make sh_eth_tsu_{read\|write}_entry() prototypes symmetric	Sergei Shtylyov
	sh_eth_tsu_read_entry() is still asymmetric with sh_eth_tsu_write_entry() WRT their prototypes -- make them symmetric by passing to the former a TSU register offset instead of its address and also adding the (now necessary) 'ndev' parameter... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	sh_eth: make sh_eth_tsu_write_entry() take 'offset' parameter	Sergei Shtylyov
	We can add the TSU register base address to a TSU register offset right in sh_eth_tsu_write_entry(), no need to do it in its callers... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	sh_eth: call sh_eth_tsu_get_offset() from TSU register accessors	Sergei Shtylyov
	With sh_eth_tsu_get_offset() now actually returning TSU register's offset, we can at last use it in sh_eth_tsu_{read\|write}(). Somehow this saves 248 bytes of object code with AArch64 gcc 4.8.5... :-) Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	sh_eth: make sh_eth_tsu_get_offset() match its name	Sergei Shtylyov
	sh_eth_tsu_get_offset(), despite its name, returns a TSU register's address, not its offset. Make this function match its name and return a register's offset from the TSU registers base address instead. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	sh_eth: uninline sh_eth_tsu_get_offset()	Sergei Shtylyov
	sh_eth_tsu_get_offset() is called several times by the driver, remove inline and move that function from the header to the driver itself to let gcc decide whether to expand it inline or not... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	Merge branch 'tcp-robust-ooo'	David S. Miller
	Eric Dumazet says: ==================== Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. With tcp_rmem[2] default of 6MB, the ooo queue could contain ~7000 nodes. This patch series makes sure we cut cpu cycles enough to render the attack not critical. We might in the future go further, like disconnecting or black-holing proven malicious flows. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	tcp: add tcp_ooo_try_coalesce() helper	Eric Dumazet
	In case skb in out_or_order_queue is the result of multiple skbs coalescing, we would like to get a proper gso_segs counter tracking, so that future tcp_drop() can report an accurate number. I chose to not implement this tracking for skbs in receive queue, since they are not dropped, unless socket is disconnected. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	tcp: call tcp_drop() from tcp_data_queue_ofo()	Eric Dumazet
	In order to be able to give better diagnostics and detect malicious traffic, we need to have better sk->sk_drops tracking. Fixes: 9f5afeae5152 ("tcp: use an RB tree for ooo receive queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	tcp: detect malicious patterns in tcp_collapse_ofo_queue()	Eric Dumazet
	In case an attacker feeds tiny packets completely out of order, tcp_collapse_ofo_queue() might scan the whole rb-tree, performing expensive copies, but not changing socket memory usage at all. 1) Do not attempt to collapse tiny skbs. 2) Add logic to exit early when too many tiny skbs are detected. We prefer not doing aggressive collapsing (which copies packets) for pathological flows, and revert to tcp_prune_ofo_queue() which will be less expensive. In the future, we might add the possibility of terminating flows that are proven to be malicious. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	tcp: avoid collapses in tcp_prune_queue() if possible	Eric Dumazet
	Right after a TCP flow is created, receiving tiny out of order packets allways hit the condition : if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) tcp_clamp_window(sk); tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc (guarded by tcp_rmem[2]) Calling tcp_collapse_ofo_queue() in this case is not useful, and offers a O(N^2) surface attack to malicious peers. Better not attempt anything before full queue capacity is reached, forcing attacker to spend lots of resource and allow us to more easily detect the abuse. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	tcp: free batches of packets in tcp_prune_ofo_queue()	Eric Dumazet
	Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. out_of_order_queue rb-tree can contain thousands of nodes, iterating over all of them is not nice. Before linux-4.9, we would have pruned all packets in ofo_queue in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB. Since we plan to increase tcp_rmem[2] in the future to cope with modern BDP, can not revert to the old behavior, without great pain. Strategy taken in this patch is to purge ~12.5 % of the queue capacity. Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	ip: hash fragments consistently	Paolo Abeni
	The skb hash for locally generated ip[v6] fragments belonging to the same datagram can vary in several circumstances: * for connected UDP[v6] sockets, the first fragment get its hash via set_owner_w()/skb_set_hash_from_sk() * for unconnected IPv6 UDPv6 sockets, the first fragment can get its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if auto_flowlabel is enabled For the following frags the hash is usually computed via skb_get_hash(). The above can cause OoO for unconnected IPv6 UDPv6 socket: in that scenario the egress tx queue can be selected on a per packet basis via the skb hash. It may also fool flow-oriented schedulers to place fragments belonging to the same datagram in different flows. Fix the issue by copying the skb hash from the head frag into the others at fragmentation time. Before this commit: perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8" netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n & perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1 perf script probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0 After this commit: probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 Fixes: b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf on xmit") Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	ipv6: use fib6_info_hold_safe() when necessary	Wei Wang
	In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	i2c: imx: Fix reinit_completion() use	Esben Haabendal
	Make sure to call reinit_completion() before dma is started to avoid race condition where reinit_completion() is called after complete() and before wait_for_completion_timeout(). Signed-off-by: Esben Haabendal <eha@deif.com> Fixes: ce1a78840ff7 ("i2c: imx: add DMA support for freescale i2c driver") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-07-23	wan/fsl_ucc_hdlc: use IS_ERR_VALUE() to check return value of qe_muram_alloc	YueHaibing
	qe_muram_alloc return a unsigned long integer,which should not compared with zero. check it using IS_ERR_VALUE() to fix this. Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	i2c: davinci: Avoid zero value of CLKH	Alexander Sverdlin
	If CLKH is set to 0 I2C clock is not generated at all, so avoid this value and stretch the clock in this case. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-07-23	Merge tag 'linux-can-fixes-for-4.18-20180723' of ↵	David S. Miller
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-07-23 this is a pull request of 12 patches for net/master. The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem with older firmware. The next patch is by Roman Fietze and fixes the setup of the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas fix the runtime resume and clean up the probe function in the m_can driver. The last 7 patches by Anssi Hannula fix several problem in the xilinx_can driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	Merge branch 'smc-next'	David S. Miller
	Ursula Braun says: ==================== net/smc: patches 2018-07-23 here are some small patches for SMC: Just the first patch contains a functional change. It allows to differ between the modes SMCR and SMCD on s390 when monitoring SMC sockets. The remaining patches are cleanups without functional changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net/smc: remove local variable page in smc_rx_splice()	Ursula Braun
	The page map address is already stored in the RMB descriptor. There is no need to derive it from the cpu_addr value. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net/smc: use DECLARE_BITMAP for rtokens_used_mask	Ursula Braun
	Link group field tokens_used_mask is a bitmap. Use macro DECLARE_BITMAP for its definition. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net/smc: add function to get link group from link	Stefan Raspl
	Replace a frequently used construct with a more readable variant, reducing the code. Also might come handy when we start to support more than a single per link group. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net/smc: eliminate cursor read and write calls	Stefan Raspl
	The functions to read and write cursors are exclusively used to copy cursors. Therefore switch to a respective function instead. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net/smc: provide smc mode in smc_diag.c	Karsten Graul
	Rename field diag_fallback into diag_mode and set the smc mode of a connection explicitly. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	selftests: forwarding: gre_multipath: Drop IPv6 tests	Petr Machata
	Support for device-only IPv6 multipath next hops was dropped in commit 33bd5ac54dc4 ("net/ipv6: Revert attempt to simplify route replace and append") and as of commit b5d2d75e079a ("net/ipv6: Do not allow device only routes via the multipath API"), attempts to add a next hop like that yield an explicit diagnostic. Correspondingly, drop the IPv6 parts of GRE multipath test that are supposed to test that code. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	ipv6: sr: Use kmemdup instead of duplicating it in parse_nla_srh	YueHaibing
	Replace calls to kmalloc followed by a memcpy with a direct call to kmemdup. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	Merge branch 'net-bridge-add-support-for-backup-port'	David S. Miller
	Nikolay Aleksandrov says: ==================== net: bridge: add support for backup port This set introduces a new bridge port option that allows any port to have any other port (in the same bridge of course) as its backup and traffic will be forwarded to the backup port when the primary goes down. This is mainly used in MLAG and EVPN setups where we have peerlink path which is a backup of many (or even all) ports and is a participating bridge port itself. There's more detailed information in patch 02. Patch 01 just prepares the port sysfs code for options that take raw value. The main issues that this set solves are scalability and fallback latency. We have used similar code for over 6 months now to bring the fallback latency of the backup peerlink down and avoid fdb notification storms. Also due to the nature of master devices such setup is currently not possible, and last but not least having tens of thousands of fdbs require thousands of calls to switch. I've also CCed our MLAG experts that have been using similar option. Roopa also adds: "Two switches acting in a MLAG pair are connected by the peerlink interface which is a bridge port. the config on one of the switches looks like the below. The other switch also has a similar config. eth0 is connected to one port on the server. And the server is connected to both switches. br0 -- team0---eth0 \| -- switch-peerlink switch-peerlink becomes the failover/backport port when say team0 to the server goes down. Today, when team0 goes down, control plane has to withdraw all the fdb entries pointing to team0 and re-install the fdb entries pointing to switch-peerlink...and restore the fdb entries when team0 comes back up again. and this is the problem we are trying to solve. This also becomes necessary when multihoming is implemented by a standard like E-VPN https://tools.ietf.org/html/rfc8365#section-8 where the 'switch-peerlink' is an overlay vxlan port (like nikolay mentions in his patch commit). In these implementations, the fdb scale can be much larger. On why bond failover cannot be used here ?: the point that nikolay was alluding to is, switch-peerlink in the above example is a bridge port and is a failover/backport port for more than one or all ports in the bridge br0. And you cannot enslave switch-peerlink into a second level team with other bridge ports. Hence a multi layered team device is not an option (FWIW, switch-peerlink is also a teamed interface to the peer switch)." v3: Added Roopa's explanation and diagram v2: In patch 01 use kstrdup/kfree to avoid casting the const buf. In order to avoid using GFP_ATOMIC or always allocating I kept the spinlock inside each branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net: bridge: add support for backup port	Nikolay Aleksandrov
	This patch adds a new port attribute - IFLA_BRPORT_BACKUP_PORT, which allows to set a backup port to be used for known unicast traffic if the port has gone carrier down. The backup pointer is rcu protected and set only under RTNL, a counter is maintained so when deleting a port we know how many other ports reference it as a backup and we remove it from all. Also the pointer is in the first cache line which is hot at the time of the check and thus in the common case we only add one more test. The backup port will be used only for the non-flooding case since it's a part of the bridge and the flooded packets will be forwarded to it anyway. To remove the forwarding just send a 0/non-existing backup port. This is used to avoid numerous scalability problems when using MLAG most notably if we have thousands of fdbs one would need to change all of them on port carrier going down which takes too long and causes a storm of fdb notifications (and again when the port comes back up). In a Multi-chassis Link Aggregation setup usually hosts are connected to two different switches which act as a single logical switch. Those switches usually have a control and backup link between them called peerlink which might be used for communication in case a host loses connectivity to one of them. We need a fast way to failover in case a host port goes down and currently none of the solutions (like bond) cannot fulfill the requirements because the participating ports are actually the "master" devices and must have the same peerlink as their backup interface and at the same time all of them must participate in the bridge device. As Roopa noted it's normal practice in routing called fast re-route where a precalculated backup path is used when the main one is down. Another use case of this is with EVPN, having a single vxlan device which is backup of every port. Due to the nature of master devices it's not currently possible to use one device as a backup for many and still have all of them participate in the bridge (which is master itself). More detailed information about MLAG is available at the link below. https://docs.cumulusnetworks.com/display/DOCS/Multi-Chassis+Link+Aggregation+-+MLAG Further explanation and a diagram by Roopa: Two switches acting in a MLAG pair are connected by the peerlink interface which is a bridge port. the config on one of the switches looks like the below. The other switch also has a similar config. eth0 is connected to one port on the server. And the server is connected to both switches. br0 -- team0---eth0 \| -- switch-peerlink Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	net: bridge: add support for raw sysfs port options	Nikolay Aleksandrov
	This patch adds a new alternative store callback for port sysfs options which takes a raw value (buf) and can use it directly. It is needed for the backup port sysfs support since we have to pass the device by its name. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23	bluetooth: hci_qca: Replace GFP_ATOMIC with GFP_KERNEL	Jia-Ju Bai
	qca_open() and qca_set_baudrate() are never called in atomic context. They call kzalloc() and bt_skb_alloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. I also manually check the kernel code before reporting it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-23	bluetooth: hci_intel: Replace GFP_ATOMIC with GFP_KERNEL in ↵	Jia-Ju Bai
	inject_cmd_complete() inject_cmd_complete() is only called by intel_dequeue(), which is never called in atomic context. inject_cmd_complete() calls bt_skb_alloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. I also manually check the kernel code before reporting it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-23	bluetooth: btusb: Replace GFP_ATOMIC with GFP_KERNEL in inject_cmd_complete()	Jia-Ju Bai
	inject_cmd_complete() is only called by btusb_send_frame_intel(), which is set to hdev->send, and hdev->send() is never called in atomic context. inject_cmd_complete() calls bt_skb_alloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. I also manually check the kernel code before reporting it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>