linux.git - Linus' kernel tree

Age	Commit message	Author
2018-10-10	fore200e: store a struct device in struct fore200e	Christoph Hellwig
	This can be used much better than the untyped void pointer containing either a PCI or platform device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	fore200e: simplify fore200e_bus usage	Christoph Hellwig
	There is no need to have a global array of the ops, instead PCI and sbus can have their own instances assigned in *_probe. Also switch to C99 initializers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
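The change described above can be pictured with a minimal sketch in C. It assumes a much simplified ops structure; `struct bus_ops`, `struct adapter`, and the `*_map`/`*_probe` helpers are hypothetical names for illustration, not the fore200e driver's actual definitions. Each bus gets its own ops instance, written with C99 designated initializers, and the probe routine assigns it instead of indexing a global array.

```c
#include <stddef.h>

struct adapter;

/* Hypothetical, simplified per-bus operations table. */
struct bus_ops {
	const char *bus_name;
	int (*map)(struct adapter *a);
};

struct adapter {
	const struct bus_ops *bus;	/* set once in the bus-specific probe */
	int mapped;
};

static int pci_map(struct adapter *a)  { a->mapped = 1; return 0; }
static int sbus_map(struct adapter *a) { a->mapped = 2; return 0; }

/* C99 designated initializers: fields are named, order does not matter. */
static const struct bus_ops pci_ops = {
	.bus_name = "pci",
	.map      = pci_map,
};

static const struct bus_ops sbus_ops = {
	.bus_name = "sbus",
	.map      = sbus_map,
};

/* Each probe assigns its own ops instance; no global ops array is needed. */
static int pci_probe(struct adapter *a)
{
	a->bus = &pci_ops;
	return a->bus->map(a);
}

static int sbus_probe(struct adapter *a)
{
	a->bus = &sbus_ops;
	return a->bus->map(a);
}
```

Named fields also mean that reordering or extending the structure does not silently shift positional initializers.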
2018-10-10	net: tun: remove useless codes of tun_automq_select_queue	Wang Li
	Because the function __skb_get_hash_symmetric always returns non-zero. Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Wang Li <wangli39@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	virtio_net: ethtool tx napi configuration	Jason Wang
	Implement ethtool .set_coalesce (-C) and .get_coalesce (-c) handlers. Interrupt moderation is currently not supported, so these accept and display the default settings of 0 usec and 1 frame. Toggle tx napi through setting tx-frames. So as to not interfere with possible future interrupt moderation, value 1 means tx napi while value 0 means not. Only allow the switching when device is down for simplicity. Link: https://patchwork.ozlabs.org/patch/948149/ Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	Merge branch 'nfp-flower-speed-up-stats-update-loop'	David S. Miller
	Jakub Kicinski says: ==================== nfp: flower: speed up stats update loop This set from Pieter improves performance of processing FW stats update notifications. The FW seems to send those at a relatively high rate (roughly ten per second per flow), therefore if we want to approach the million flows mark we have to be very careful about our data structures. We tried rhashtable for stat updates, but according to our experiments rhashtable lookup on a u32 takes roughly 60ns on a Xeon E5-2670 v3, which translates to a hard limit of 16M lookups per second on this CPU, and, according to perf record, jhash and memcmp account for 60% of CPU usage on the core handling the updates. Given that our statistic IDs are already array indices, and considering each statistic is only 24B in size, we decided to forgo the use of hashtables and use a directly indexed array. The CPU savings are considerable. With the recent improvements in TC core and with our own bottlenecks out of the way Pieter removes the artificial limit of 128 flows, and allows the driver to install as many flows as FW supports. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
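The directly indexed array mentioned in this cover letter can be sketched in userspace C as below. This is not the nfp driver's code: the record layout, the field names, and the `stats_table_*` helpers are assumptions chosen only to match the stated 24-byte record size. The point is that an update is one bounds check plus one array access, with no hashing or key comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 24-byte statistic record; the real layout is not shown in the log. */
struct flow_stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t used;
};

struct stats_table {
	struct flow_stats *entries;
	uint32_t count;		/* number of stats ids, e.g. as reported by firmware */
};

static int stats_table_init(struct stats_table *t, uint32_t count)
{
	t->entries = calloc(count, sizeof(*t->entries));
	if (!t->entries)
		return -1;
	t->count = count;
	return 0;
}

/* O(1) update: the stats id is used directly as the array index. */
static int stats_table_update(struct stats_table *t, uint32_t id,
			      uint64_t pkts, uint64_t bytes)
{
	if (id >= t->count)
		return -1;
	t->entries[id].pkts += pkts;
	t->entries[id].bytes += bytes;
	return 0;
}

static void stats_table_free(struct stats_table *t)
{
	free(t->entries);
	t->entries = NULL;
	t->count = 0;
}
```

The trade-off is memory: the array is sized for the maximum number of ids up front, which is acceptable when each record is small.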
2018-10-10	nfp: flower: use host context count provided by firmware	Pieter Jansen van Vuuren
	Read the host context count symbols provided by firmware and use them to determine the number of allocated stats ids. Previously it was not possible to offload more than 2^17 filters even if FW was able to do so. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	nfp: flower: use stats array instead of storing stats per flow	Pieter Jansen van Vuuren
	Make use of an array stats instead of storing stats per flow which would require a hash lookup at critical times. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	nfp: flower: use rhashtable for flow caching	Pieter Jansen van Vuuren
	Make use of relativistic hash tables for tracking flows instead of fixed sized hash tables. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	isdn/hisax: amd7930_fn: Remove unnecessary parentheses	Nathan Chancellor
	Clang warns when multiple sets of parentheses are used for a single conditional statement. drivers/isdn/hisax/amd7930_fn.c:628:32: warning: equality comparison with extraneous parentheses [-Wparentheses-equality] if ((cs->dc.amd7930.ph_state == 8)) { ~~~~~~~~~~~~~~~~~~~~~~~~^~~~ drivers/isdn/hisax/amd7930_fn.c:628:32: note: remove extraneous parentheses around the comparison to silence this warning if ((cs->dc.amd7930.ph_state == 8)) { ~ ^ ~ drivers/isdn/hisax/amd7930_fn.c:628:32: note: use '=' to turn this equality comparison into an assignment if ((cs->dc.amd7930.ph_state == 8)) { ^~ = 1 warning generated. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
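The pattern behind this warning, reduced to a minimal sketch. `struct dc_state` and `is_ph_state_8()` are hypothetical stand-ins for the driver's `cs->dc.amd7930` state, not code from the patch.

```c
/* Hypothetical stand-in for the driver's per-chip state. */
struct dc_state {
	int ph_state;
};

static int is_ph_state_8(const struct dc_state *dc)
{
	/*
	 * Before: if ((dc->ph_state == 8))
	 * Clang treats doubled parentheses as the idiom for an intentional
	 * assignment in a condition, so wrapping a comparison in them
	 * triggers -Wparentheses-equality.
	 * After: a single set of parentheses, same behavior, no warning.
	 */
	if (dc->ph_state == 8)
		return 1;
	return 0;
}
```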
2018-10-10	Merge tag 'rxrpc-fixes-20181008' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fix packet reception code Here is a set of patches that prepare for and fix problems in rxrpc's packet reception code. The serious problems are: (A) There's a window between binding the socket and setting the data_ready hook in which packets can find their way into the UDP socket's receive queues. (B) The skb_recv_udp() will return an error (and clear the error state) if there was an error on the Tx side. rxrpc doesn't handle this. (C) The rxrpc data_ready handler doesn't fully drain the UDP receive queue. (D) The rxrpc data_ready handler assumes it is called in a non-reentrant state. The second patch fixes (A) - (C); the third patch renders (B) and (C) non-issues by using the encap_rcv hook instead of data_ready - and the final patch fixes (D). That last is the most complex. The preparatory patches are: (1) Fix some places that are doing things in the wrong net namespace. (2) Stop taking the rcu read lock as it's held by the IP input routine in the call chain. (3) Only end the Tx phase if we rotated the final packet out of the Tx buffer. (4) Don't assume that the call state won't change after dropping the call_state lock. (5) Only take receive window and MTU size parameters from an ACK packet if it's the latest ACK packet. (6) Record connection-level abort information correctly. (7) Fix a trace line. And then there are three main patches - note that these are mixed in with the preparatory patches somewhat: (1) Fix the setup window (A), skb_recv_udp() error check (B) and packet drainage (C). (2) Switch to using the encap_rcv instead of data_ready to cut out the effects of the UDP read queues and get the packets delivered directly. (3) Add more locking into the various packet input paths to defend against re-entrance (D). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	tcp: refactor DCTCP ECN ACK handling	Yuchung Cheng
	DCTCP has two parts - a new ECN signalling mechanism and the response function to it. The first part can be used by other congestion control for DCTCP-ECN deployed networks. This patch moves that part into a separate tcp_dctcp.h to be used by other congestion control modules (like how Yeah uses Vegas algorithms). For example, BBR is currently experimenting with such an ECN signal https://tinyurl.com/ietf-102-iccrg-bbr2 Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	net/ipv6: Make ipv6_route_table_template static	David Ahern
	ipv6_route_table_template is exported but there are no users outside of route.c. Make it static. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	rtnetlink: Update comment in rtnl_stats_dump regarding strict data checking	David Ahern
	The NLM_F_DUMP_PROPER_HDR netlink flag was replaced by a setsockopt. Update the comment in rtnl_stats_dump. Fixes: 841891ec0c65 ("rtnetlink: Update rtnl_stats_dump for strict data checking") Reported-by: Christian Brauner <christian@brauner.io> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	rtnetlink: Move ifm in valid_fdb_dump_legacy to closer to use	David Ahern
	Move setting of local variable ifm to after the message parsing in valid_fdb_dump_legacy. Avoid potential future use of unchecked variable. Fixes: 8dfbda19a21b ("rtnetlink: Move input checking for rtnl_fdb_dump to helper") Reported-by: Christian Brauner <christian@brauner.io> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	Merge branch 'mlxsw-selftests-Few-small-updates'	David S. Miller
	Ido Schimmel says: ==================== mlxsw: selftests: Few small updates First patch fixes a typo in mlxsw. Second patch fixes a race in a recent test. Third patch makes a recent test executable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	selftests: mlxsw: qos_mc_aware: Make executable	Petr Machata
	This is a self-standing test and as such should be itself executable. Fixes: b5638d46c90a ("selftests: mlxsw: Add a test for UC behavior under MC flood") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	selftests: forwarding: Have lldpad_app_wait_set() wait for unknown, too	Petr Machata
	Immediately after mlxsw module is probed and lldpad started, added APP entries are briefly in "unknown" state before becoming "pending". That's the state that lldpad_app_wait_set() typically sees, and since there are no pending entries at that time, it bails out. However the entries have not been pushed to the kernel yet at that point, and thus the test case fails. Fix by waiting for both unknown and pending entries to disappear before proceeding. Fixes: d159261f3662 ("selftests: mlxsw: Add test for trust-DSCP") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	mlxsw: pci: Fix a typo	Nir Dotan
	Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	rds: RDS (tcp) hangs on sendto() to unresponding address	Ka-Cheong Poon
	In rds_send_mprds_hash(), if the calculated hash value is non-zero and the MPRDS connections are not yet up, it will wait. But it should not wait if the send is non-blocking. In this case, it should just use the base c_path for sending the message. Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11	Merge tag 'for-4.19/dm-fixes-4' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Mike writes: "device mapper fix for 4.19 final - Fix for earlier 4.19 final DM linear change that incorrectly checked for CONFIG_DM_ZONED rather than CONFIG_BLK_DEV_ZONED." * tag 'for-4.19/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm linear: fix linear_end_io conditional definition
2018-10-10	net: aquantia: remove some redundant variable initializations	Colin Ian King
	There are several variables being initialized that are being set later and hence the initialization is redundant and can be removed. Remove them. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11	Merge tag 'xfs-fixes-for-4.19-rc7' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Dave writes: "xfs: fixes for 4.19-rc7 Update for 4.19-rc7 to fix numerous file clone and deduplication issues." * tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix data corruption w/ unaligned reflink ranges xfs: fix data corruption w/ unaligned dedupe ranges xfs: update ctime and remove suid before cloning files xfs: zero posteof blocks when cloning above eof xfs: refactor clonerange preparation into a separate helper
2018-10-10	dm linear: fix linear_end_io conditional definition	Damien Le Moal
	The dm-linear target is independent of the dm-zoned target. For code requiring support for zoned block devices, use CONFIG_BLK_DEV_ZONED instead of CONFIG_DM_ZONED. While at it, similarly to dm linear, also enable the DM_TARGET_ZONED_HM feature in dm-flakey only if CONFIG_BLK_DEV_ZONED is defined. Fixes: beb9caac211c1 ("dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled") Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-10	net/mlx5: WQ, fixes for fragmented WQ buffers API	Tariq Toukan
	mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API. Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue. This obsoletes the following API functions, and their buggy usage in EN driver: * mlx5_wq_cyc_get_frag_size() * mlx5_wq_cyc_ctr2fragix() The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers. New calculation is also more efficient, and improves performance as follows: Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size. Before: 16,477,619 pps After: 17,085,793 pps 3.7% improvement Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type	Huy Nguyen
	The HW spec defines only bits 24-26 of pftype_wq as the page fault type, use the required mask to ensure that. Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
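Extracting a three-bit field at bits 24-26 can be sketched as follows. The macro names and the `pf_type()` helper are hypothetical, and the sketch assumes the 32-bit word has already been converted to host byte order; it is not the mlx5 driver's code.

```c
#include <stdint.h>

/* Hypothetical names: the field occupies bits 24..26 of the 32-bit word. */
#define PFTYPE_SHIFT	24
#define PFTYPE_MASK	0x7u	/* three bits wide */

/* Assumes pftype_wq is already in host byte order. */
static uint32_t pf_type(uint32_t pftype_wq)
{
	/* Shifting alone would also keep bits 27..31; the mask drops them. */
	return (pftype_wq >> PFTYPE_SHIFT) & PFTYPE_MASK;
}
```

Without the mask, any bits set above bit 26 would be read as part of the page fault type.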
2018-10-10	net/mlx5: Fix memory leak when setting fpga ipsec caps	Talat Batheesh
	Allocated memory for context should be freed once finished working with it. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	net/mlx5e: Do not ignore netdevice TX/RX queues number	Feras Daoud
	The current design of mlx5e driver ignores the netdevice TX/RX queues number for netdevices that RDMA IPoIB ULP creates. Instead, the queue number is initialized to the maximum number that mlx5 thinks best for performance. As a result, ULP drivers that choose to create a netdevice with queue number that is less than the maximum channels mlx5 creates, will get a memory corruption. This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX queue number and use it when creating resources instead of the maximum channel number. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	net/mlx5e: Use non-delayed work for update stats	Saeed Mahameed
	Convert mlx5e update stats work to a normal work structure, since it is never used delayed. Add a helper function to queue update stats work on demand which checks for some conditions and reduce code duplication to have a better abstraction. Fixes: ed56c5193ad8 ("net/mlx5e: Update NIC HW stats on demand only") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	net/mlx5e: Initialize all netdev common structures in one place	Saeed Mahameed
	Move all mlx5e generic structures initializations to mlx5e_netdev_init. The common structure new initializer function will be used to initialize mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated child netdevs. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com>
2018-10-10	net/mlx5e: Always initialize update stats delayed work	Feras Daoud
	mlx5e_detach_netdev cancels update_stats work which was not initialized in ipoib netdevice profile, as a result, the following assert occurs: ODEBUG: assert_init not available (active state 0) object type: timer_list hint:(null) This change moves the update stats work to be initialized for all mlx5e netdevices. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	net/mlx5e: Gather common netdev init/cleanup functionality in one place	Feras Daoud
	Introduce a helper init/cleanup function that initializes mlx5e generic netdev private structure, and use them from all profiles init/cleanup callbacks. This patch will also be helpful to initialize/cleanup netdevs that are not created by mlx5 driver, e.g: accelerated ipoib child netdevs. Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	RDMA/netdev: Fix netlink support in IPoIB	Denis Drozdov
	IPoIB netlink support was broken by the below commit since integrating the rdma_netdev support relies on an allocation flow for netdevs that was controlled by the ipoib driver while netdev's rtnl_newlink implementation assumes that the netdev will be allocated by netlink. Such situation leads to crash in __ipoib_device_add, once trying to reuse netlink device. This patch fixes the kernel oops for both mlx4 and mlx5 devices triggered by the following command: Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	RDMA/netdev: Hoist alloc_netdev_mqs out of the driver	Denis Drozdov
	netdev has several interfaces that expect to call alloc_netdev_mqs from the core code, with the driver only providing the arguments. This is incompatible with the rdma_netdev interface that returns the netdev directly. Thus re-organize the API used by ipoib so that the verbs core code calls alloc_netdev_mqs for the driver. This is done by allowing the drivers to provide the allocation parameters via a 'get_params' callback and then initializing an allocated netdev as a second step. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10	Merge tag 'for-4.19/dm-fixes-3' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Mike writes: "device mapper fixes for 4.19 final - Fix a DM cache module init error path bug that doesn't properly cleanup a KMEM_CACHE if target registration fails. - Two stable@ fixes for DM zoned target; 4.20 will have changes that eliminate this code entirely but <= 4.19 needs these changes." * tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled dm: fix report zone remapping to account for partition offset dm cache: destroy migration_cache if cache target registration failed
2018-10-10	Merge tag 'trace-v4.19-rc5' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Steven writes: "vsprintf fix: It was reported that trace_printk() was not properly reporting values that came after a dereference pointer. trace_printk() utilizes vbin_printf() and bstr_printf() to keep the overhead of tracing down. vbin_printf() does not do any conversions and just stores the string format and the raw arguments into the buffer. bstr_printf() is used to read the buffer and does the conversions to complete the printf() output. This can be troublesome with dereferenced pointers because the reference may be different from the time vbin_printf() is called to the time bstr_printf() is called. To fix this, a prior commit changed vbin_printf() to convert dereferenced pointers into strings and load the converted string into the buffer. But the change to bstr_printf() had an off-by-one error and didn't account for the nul character at the end of the string and this corrupted the rest of the values in the format that came after a dereferenced pointer." * tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
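The class of bug described here can be shown with a toy packed buffer. This is not the kernel's binary printf format: `pack()` and `unpack_value()` are hypothetical helpers, and the layout (a nul-terminated string followed directly by a 32-bit value) is an assumption for illustration. A reader that advances by `strlen()` alone starts the next field on the terminator byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writer: store the string including its nul terminator, then the value. */
static size_t pack(char *buf, const char *s, int32_t v)
{
	size_t n = strlen(s) + 1;	/* +1 for the terminating nul */

	memcpy(buf, s, n);
	memcpy(buf + n, &v, sizeof(v));
	return n + sizeof(v);
}

/* Reader: must skip the string and its nul before reading the value. */
static int32_t unpack_value(const char *buf)
{
	size_t skip = strlen(buf) + 1;	/* omitting the +1 is the off-by-one */
	int32_t v;

	memcpy(&v, buf + skip, sizeof(v));
	return v;
}
```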
2018-10-10	Merge tag 'devicetree-fixes-for-4.19-3' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Rob writes: "Devicetree fixes for 4.19, part 3: - Fix DT unittest on Oldworld MAC systems" * tag 'devicetree-fixes-for-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: unittest: Disable interrupt node tests for old world MAC systems
2018-10-10	IB/mlx5: Unmap DMA addr from HCA before IOMMU	Valentine Fatiev
	The function that puts back the MR in cache also removes the DMA address from the HCA. Therefore we need to call this function before we remove the DMA mapping from MMU. Otherwise the HCA may access a memory that is no longer DMA mapped. Call trace: NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0. CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012 RIP: 0010:intel_idle+0x73/0x120 Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02 RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046 RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000 RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9 R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000 R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d FS: 0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0 Call Trace: cpuidle_enter_state+0x7e/0x2e0 do_idle+0x1ed/0x290 cpu_startup_entry+0x6f/0x80 start_kernel+0x524/0x544 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa4/0xb0 DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set Fixes: f3f134f5260a ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory") Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-10	net: make skb_partial_csum_set() more robust against overflows	Eric Dumazet
	syzbot managed to crash in skb_checksum_help() [1] : BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb)); Root cause is the following check in skb_partial_csum_set() if (unlikely(start > skb_headlen(skb)) || unlikely((int)start + off > skb_headlen(skb) - 2)) return false; If skb_headlen(skb) is 1, then (skb_headlen(skb) - 2) becomes 0xffffffff and the check fails to detect that ((int)start + off) is off the limit, since the compare is unsigned. When we fix that, then the first condition (start > skb_headlen(skb)) becomes obsolete. Then we should also check that (skb_headroom(skb) + start) won't overflow the 16-bit field. [1] kernel BUG at net/core/dev.c:2880! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7330 Comm: syz-executor4 Not tainted 4.19.0-rc6+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_checksum_help+0x9e3/0xbb0 net/core/dev.c:2880 Code: 85 00 ff ff ff 48 c1 e8 03 42 80 3c 28 00 0f 84 09 fb ff ff 48 8b bd 00 ff ff ff e8 97 a8 b9 fb e9 f8 fa ff ff e8 2d 09 76 fb <0f> 0b 48 8b bd 28 ff ff ff e8 1f a8 b9 fb e9 b1 f6 ff ff 48 89 cf RSP: 0018:ffff8801d83a6f60 EFLAGS: 00010293 RAX: ffff8801b9834380 RBX: ffff8801b9f8d8c0 RCX: ffffffff8608c6d7 RDX: 0000000000000000 RSI: ffffffff8608cc63 RDI: 0000000000000006 RBP: ffff8801d83a7068 R08: ffff8801b9834380 R09: 0000000000000000 R10: ffff8801d83a76d8 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000010001 R14: 000000000000ffff R15: 00000000000000a8 FS: 00007f1a66db5700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7d77f091b0 CR3: 00000001ba252000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_csum_hwoffload_help+0x8f/0xe0 net/core/dev.c:3269 validate_xmit_skb+0xa2a/0xf30 net/core/dev.c:3312 __dev_queue_xmit+0xc2f/0x3950 net/core/dev.c:3797 dev_queue_xmit+0x17/0x20 net/core/dev.c:3838 packet_snd net/packet/af_packet.c:2928 [inline] packet_sendmsg+0x422d/0x64c0 net/packet/af_packet.c:2953 Fixes: 5ff8dda3035d ("net: Ensure partial checksum offset is inside the skb head") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
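The unsigned wrap described in this commit message can be reproduced in a few lines of standalone C. The two helpers below are a sketch, not the kernel's `skb_partial_csum_set()`: they take the head length and headroom as plain parameters (an assumption made to avoid needing an skb), and the casts are written out so the implicit conversions in the original check are visible.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Shape of the old check. headlen is unsigned, so when headlen < 2 the
 * expression (headlen - 2u) wraps to a huge value and the comparison
 * can never reject the range.
 */
static bool csum_range_ok_buggy(unsigned int headlen, uint16_t start,
				uint16_t off)
{
	if (start > headlen ||
	    (unsigned int)((int)start + off) > headlen - 2u)
		return false;
	return true;
}

/*
 * Sketch of a wrap-free check: compute the end of the 2-byte checksum
 * field by addition only, then compare against the head length. Also
 * verify that headroom + start still fits a 16-bit field.
 */
static bool csum_range_ok_fixed(unsigned int headroom, unsigned int headlen,
				uint16_t start, uint16_t off)
{
	uint32_t csum_end = (uint32_t)start + (uint32_t)off +
			    (uint32_t)sizeof(uint16_t);
	uint32_t csum_start = headroom + (uint32_t)start;

	if (csum_start > UINT16_MAX || csum_end > headlen)
		return false;
	return true;
}
```

With a head length of 1, the first helper accepts the range while the second rejects it, which is the case syzbot hit.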
2018-10-10	Merge branch 'devlink-param-type-string-fixes'	David S. Miller
	Moshe Shemesh says: ==================== devlink param type string fixes This patchset fixes devlink param infrastructure for string param type. The devlink param infrastructure doesn't handle copying the string data correctly. The first two patches fix it and the third patch adds helper function to safely copy string value without exceeding DEVLINK_PARAM_MAX_STRING_VALUE. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	devlink: Add helper function for safely copy string param	Moshe Shemesh
	Devlink string param buffer is allocated at the size of DEVLINK_PARAM_MAX_STRING_VALUE. Add helper function which makes sure this size is not exceeded. Renamed DEVLINK_PARAM_MAX_STRING_VALUE to __DEVLINK_PARAM_MAX_STRING_VALUE to emphasize that it should be used by devlink only. The driver should use the helper function instead to verify it doesn't exceed the allowed length. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
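A bounded copy helper of the kind described might look like the sketch below. The size constant, the function name, and the return convention are assumptions for illustration and are not devlink's actual API; the sketch only shows the idea of never writing past a fixed-size destination and always terminating it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit standing in for the devlink-internal maximum. */
#define PARAM_MAX_STRING_VALUE 32

/*
 * Copy src into a fixed-size destination buffer.
 * Returns 0 if the whole string fit, -1 if it had to be truncated.
 * The destination is nul-terminated in both cases.
 */
static int param_str_fill(char dst[PARAM_MAX_STRING_VALUE], const char *src)
{
	size_t len = strlen(src);
	int truncated = len >= PARAM_MAX_STRING_VALUE;

	if (truncated)
		len = PARAM_MAX_STRING_VALUE - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return truncated ? -1 : 0;
}
```

Routing every copy through one helper means callers never need to know the buffer size, which fits the commit's point that the size constant should stay internal.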
2018-10-10	devlink: Fix param cmode driverinit for string type	Moshe Shemesh
	Driverinit configuration mode value is held by devlink to enable the driver to fetch the value after reload command. In case the param type is string devlink should copy the value from driver string buffer to devlink string buffer on devlink_param_driverinit_value_set() and vice-versa on devlink_param_driverinit_value_get(). Fixes: ec01aeb1803e ("devlink: Add support for get/set driverinit value") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	devlink: Fix param set handling for string type	Moshe Shemesh
	In case devlink param type is string, it needs to copy the string value it got from the input to devlink_param_value. Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11	samples: disable CONFIG_SAMPLES for UML	Masahiro Yamada
	Some samples require headers installation, so commit 3fca1700c4c3 ("kbuild: make samples really depend on headers_install") added such dependency in the top Makefile. However, UML fails to build with CONFIG_SAMPLES=y because UML does not support headers_install. Fixes: 3fca1700c4c3 ("kbuild: make samples really depend on headers_install") Reported-by: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-10-10	Merge branch 'octeontx2-af-Add-RVU-Admin-Function-driver'	David S. Miller
	Sunil Goutham says: ==================== octeontx2-af: Add RVU Admin Function driver Resource virtualization unit (RVU) on Marvell's OcteonTX2 SOC maps HW resources from the network, crypto and other functional blocks into PCI-compatible physical and virtual functions. Each functional block again has multiple local functions (LFs) for provisioning to PCI devices. RVU supports multiple PCIe SRIOV physical functions (PFs) and virtual functions (VFs). PF0 is called the administrative / admin function (AF) and has privileges to provision RVU functional block's LFs to each of the PF/VF. RVU managed networking functional blocks - Network pool allocator (NPA) - Network interface controller (NIX) - Network parser CAM (NPC) - Schedule/Synchronize/Order unit (SSO) RVU managed non-networking functional blocks - Crypto accelerator (CPT) - Scheduled timers unit (TIM) - Schedule/Synchronize/Order unit (SSO) Used for both networking and non networking usecases - Compression (upcoming in future variants of the silicons) Resource provisioning examples - A PF/VF with NIX-LF & NPA-LF resources works as a pure network device - A PF/VF with CPT-LF resource works as a pure crypto offload device. This admin function driver neither receives any data nor processes it i.e no I/O, a configuration only driver. PF/VFs communicate with AF via a shared memory region (mailbox). Upon receiving requests from PF/VF, AF does resource provisioning and other HW configuration. AF is always attached to host, but PF/VFs may be used by host kernel itself, or attached to VMs or to userspace applications like DPDK etc. So AF has to handle provisioning/configuration requests sent by any device from any domain. This patch series adds logic for the following - RVU AF driver with functional blocks provisioning support. - Mailbox infrastructure for communication between AF and PFs. - CGX (MAC controller) driver which communicates with firmware for managing physical ethernet interfaces. AF collects info from this driver and forwards the same to the PF/VFs using these interfaces. This is the first set of patches out of 80+ patches. Changes from v8: 1 Removed unnecessary typecasts in entire series - Suggested by David Miller 2 Added COMPILE_TEST to AF driver - Suggested by Arnd Bergmann 3 Changed udelay() to usleep_range() in rvu_poll_reg - Suggested by Arnd Bergmann 4 MSIX vector base IOMMU mapping is done using dma_map_resource() API instead of dma_map_single() as it accepts physical address. - Issue pointed by Arnd Bergmann Changes from v7: 1 Removed unnecessary typecasts in mbox infra code. - Suggested by David Miller 2 Fixed MAINTAINERS patch - Suggested by Joe Perches Changes from v6: Fixed ordering of local variables from longest to shortest line. - Suggested by David Miller Changes from v5: Modified bitfield based command structures to bitmasks for communication with firmware, to address endianness issues. - Suggested by Arnd Bergmann Changes from v4: 1 Removed module author/version/description from CGX driver as it's now merged with AF driver module. - Suggested by Arnd Bergmann 2 Added big-endian bitfields for CGX's kernel <=> firmware communication command structures. - Suggested by Arnd Bergmann Changes from v3: Moved driver from drivers/soc to drivers/net/ethernet - Suggested by Arnd Bergmann https://patchwork.kernel.org/cover/10587635/ Changes from v2: No changes, submitted again with netdev mailing list in loop. - Suggested by Arnd Bergmann and Andrew Lunn Changes from v1: 1 Merged RVU admin function and CGX drivers into a single module - Suggested by Arnd Bergmann 2 Pulled mbox communication APIs into a separate module to remove admin function driver dependency in a VM where AF is not attached. - Suggested by Arnd Bergmann ==================== Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	MAINTAINERS: Add entry for Marvell OcteonTX2 Admin Function driver	Sunil Goutham
	Added maintainers entry for Marvell OcteonTX2 SOC's RVU admin function driver. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	octeontx2-af: Register for CGX lmac events	Linu Cherian
	Added support in RVU AF driver to register for CGX LMAC link status change events from firmware and manage them. Processing part will be added in followup patches. - Introduced an event queue for posting events from CGX LMAC. The queueing mechanism ensures that events can be posted and firmware can be acked immediately, so event reception and processing are decoupled. - Events get added to the queue by the notification callback. The notification callback is expected to be atomic, since it is called from interrupt context. - Events are dequeued and processed in a worker thread. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
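The split between posting and processing described in this commit can be illustrated with a minimal user-space sketch. It is not the driver's code: every name here (`struct link_event`, `evq_post`, `evq_drain`, `count_link_up`) is hypothetical, and the locking and atomic allocation that a real interrupt-context callback would need are only noted in comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical event record; the driver's real layout is not shown in this log. */
struct link_event {
	int lmac_id;
	int link_up;
};

struct event_node {
	struct link_event ev;
	struct event_node *next;
};

/* Simple FIFO: producers append at the tail, the consumer removes from the head. */
struct event_queue {
	struct event_node *head;
	struct event_node *tail;
	size_t len;
};

static void evq_init(struct event_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
	q->len = 0;
}

/*
 * Producer side, standing in for the notification callback. It only
 * records the event and returns, so the sender can be acked right away.
 * Assumption: in a kernel this would run under a spinlock and use an
 * atomic allocation; plain malloc() is used here for illustration.
 * Returns 0 on success, -1 if allocation fails.
 */
static int evq_post(struct event_queue *q, int lmac_id, int link_up)
{
	struct event_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->ev.lmac_id = lmac_id;
	n->ev.link_up = link_up;
	n->next = NULL;
	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	q->len++;
	return 0;
}

/*
 * Consumer side, standing in for the worker thread. Drains the queue in
 * FIFO order, calls handler() on each event, and returns how many events
 * were processed.
 */
static size_t evq_drain(struct event_queue *q,
			void (*handler)(const struct link_event *, void *),
			void *ctx)
{
	size_t processed = 0;

	while (q->head) {
		struct event_node *node = q->head;

		q->head = node->next;
		if (!q->head)
			q->tail = NULL;
		q->len--;
		handler(&node->ev, ctx);
		free(node);
		processed++;
	}
	return processed;
}

/* Example handler: counts link-up events into the int that ctx points to. */
static void count_link_up(const struct link_event *ev, void *ctx)
{
	if (ev->link_up)
		++*(int *)ctx;
}
```

A caller would initialise the queue with `evq_init()`, call `evq_post()` from the notification path, and call `evq_drain()` with a handler such as `count_link_up` from the deferred context.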
2018-10-10	octeontx2-af: Add support for CGX link management	Linu Cherian
	CGX LMAC initialization, link status polling etc. are done by low level secure firmware. For link management this patch adds an interface or communication mechanism between firmware and this kernel CGX driver. - Firmware interface specification is defined in cgx_fw_if.h. - Support to send/receive commands/events to/from firmware. - events/commands implemented * link up * link down * reading firmware version Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Nithya Mani <nmani@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	octeontx2-af: Set RVU PFs to CGX LMACs mapping	Linu Cherian
	Each enabled CGX LMAC is considered a physical interface and RVU PFs are mapped to these. VFs of these SRIOV PFs will be virtual interfaces and share the CGX LMAC along with the PF. This mapping info will be used later on for Rx/Tx pkt steering. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	octeontx2-af: Add Marvell OcteonTX2 CGX driver	Sunil Goutham
	This patch adds a basic template for Marvell OcteonTX2's CGX ethernet interface driver. Just the probe. RVU AF driver will use APIs exported by this driver for various things like PF to physical interface mapping, loopback mode, interface stats etc. Hence merged both drivers into a single module. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10	octeontx2-af: Reconfig MSIX base with IOVA	Geetha sowjanya
	HW interprets RVU_AF_MSIXTR_BASE address as an IOVA, hence create an IOMMU mapping for the physical address configured by firmware and reconfig RVU_AF_MSIXTR_BASE with IOVA. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>