summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-27net/mlx5e: Remove rq_headroom field from paramsTariq Toukan
It can be derived from other params, calculate it via the dedicated function when needed. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Remove RQ MPWQE fields from paramsTariq Toukan
Introduce functions to calculate them when needed. They can be derived from other params. This will simplify transition between RQ configurations. In general, any parameter that is not explicitly set or controlled, but derived from other parameters, should not have a control-path field itself, but a getter function. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Use no-offset function in skb header copyTariq Toukan
In copying skb header to skb->data, replace the call to skb_copy_to_linear_data_offset() with a zero offset with the call to the no-offset function skb_copy_to_linear_data(). Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Separate dma base address and offset in dma_sync callTariq Toukan
Pass the base dma address and offset to dma_sync_single_range_for_cpu(), instead of doing the pre-calculation. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Remove unused define MLX5_MPWRQ_STRIDES_PER_PAGETariq Toukan
Clean it up as it's not in use. Fixes: d9d9f156f380 ("net/mlx5e: Expand WQE stride when CQE compression is enabled") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Disable Striding RQ when PCI is slower than linkTariq Toukan
We turn the feature off for servers with PCI BW bounded by a threshold (16G) and lower than MAX LINK BW. This improves the effectiveness of CQE compression feature, that is defaulted to ON for the same case. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27net/mlx5e: Unify slow PCI heuristicTariq Toukan
Get the link/pci speed query and logic into a single function. Unify the heuristics and use a single PCI threshold (16G) for all. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-27Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that give the wrong return to Read Capacity and cause a huge log spew. The remaining five patches all try to fix commit 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs") which broke the non-mq I/O path" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled scsi: sd: Remember that READ CAPACITY(16) succeeded scsi: ibmvfc: Avoid unnecessary port relogin scsi: virtio_scsi: unify scsi_host_template scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity scsi: core: introduce force_blk_mq scsi: megaraid_sas: fix selection of reply queue scsi: hpsa: fix selection of reply queue
2018-03-27rxrpc: Trace call completionDavid Howells
Add a tracepoint to track rxrpc calls moving into the completed state and to log the completion type and the recorded error value and abort code. Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-27rxrpc, afs: Use debug_ids rather than pointers in tracesDavid Howells
In rxrpc and afs, use the debug_ids that are monotonically allocated to various objects as they're allocated rather than pointers as kernel pointers are now hashed making them less useful. Further, the debug ids aren't reused anywhere nearly as quickly. In addition, allow kernel services that use rxrpc, such as afs, to take numbers from the rxrpc counter, assign them to their own call struct and pass them in to rxrpc for both client and service calls so that the trace lines for each will have the same ID tag. Signed-off-by: David Howells <dhowells@redhat.com>
2018-03-27rxrpc: Trace resendDavid Howells
Add a tracepoint to trace packet resend events and to dump the Tx annotation buffer for added illumination. Signed-off-by: David Howells <dhowells@rdhat.com>
2018-03-27RDMA/hns: ensure for-loop actually iterates and free's buffersColin Ian King
The current for-loop zeros variable i and only loops once, hence not all the buffers are free'd. Fix this by setting i correctly. Detected by CoverityScan, CID#1463415 ("Operands don't affect result") Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27bpf: follow idr code conventionShaohua Li
Generally we do a preload before doing idr allocation. This also help improve the allocation success rate in memory pressure. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-27RDMA/ucma: Check that device exists prior to accessing itLeon Romanovsky
Ensure that device exists prior to accessing its properties. Reported-by: <syzbot+71655d44855ac3e76366@syzkaller.appspotmail.com> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Check that device is connected prior to access itLeon Romanovsky
Add missing check that device is connected prior to access it. [ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0 [ 55.359389] Read of size 8 at addr 00000000000000b0 by task qp/618 [ 55.360255] [ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b8b88 #91 [ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 55.363264] Call Trace: [ 55.363833] dump_stack+0x5c/0x77 [ 55.364215] kasan_report+0x163/0x380 [ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0 [ 55.365238] rdma_init_qp_attr+0x4a/0x2c0 [ 55.366410] ucma_init_qp_attr+0x111/0x200 [ 55.366846] ? ucma_notify+0xf0/0xf0 [ 55.367405] ? _get_random_bytes+0xea/0x1b0 [ 55.367846] ? urandom_read+0x2f0/0x2f0 [ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0 [ 55.369104] ? refcount_inc_not_zero+0x9/0x60 [ 55.369583] ? refcount_inc+0x5/0x30 [ 55.370155] ? rdma_create_id+0x215/0x240 [ 55.370937] ? _copy_to_user+0x4f/0x60 [ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290 [ 55.372127] ? _copy_from_user+0x5e/0x90 [ 55.372720] ucma_write+0x174/0x1f0 [ 55.373090] ? ucma_close_id+0x40/0x40 [ 55.373805] ? __lru_cache_add+0xa8/0xd0 [ 55.374403] __vfs_write+0xc4/0x350 [ 55.374774] ? kernel_read+0xa0/0xa0 [ 55.375173] ? fsnotify+0x899/0x8f0 [ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170 [ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 55.377522] ? handle_mm_fault+0x174/0x320 [ 55.378169] vfs_write+0xf7/0x280 [ 55.378864] SyS_write+0xa1/0x120 [ 55.379270] ? SyS_read+0x120/0x120 [ 55.379643] ? mm_fault_error+0x180/0x180 [ 55.380071] ? task_work_run+0x7d/0xd0 [ 55.380910] ? __task_pid_nr_ns+0x120/0x140 [ 55.381366] ? SyS_read+0x120/0x120 [ 55.381739] do_syscall_64+0xeb/0x250 [ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 55.382841] RIP: 0033:0x7fc2ef803e99 [ 55.383227] RSP: 002b:00007fffcc5f3be8 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 [ 55.384173] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc2ef803e99 [ 55.386145] RDX: 0000000000000057 RSI: 0000000020000080 RDI: 0000000000000003 [ 55.388418] RBP: 00007fffcc5f3c00 R08: 0000000000000000 R09: 0000000000000000 [ 55.390542] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400480 [ 55.392916] R13: 00007fffcc5f3cf0 R14: 0000000000000000 R15: 0000000000000000 [ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49 8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08 48 89 04 24 e8 3a 4f 1e ff 48 [ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP: ffff8801e2c2f9d8 [ 55.532648] CR2: 00000000000000b0 [ 55.534396] ---[ end trace 70cee64090251c0b ]--- Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Fixes: d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user") Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/rdma_cm: Fix use after free race with process_one_reqJason Gunthorpe
process_one_req() can race with rdma_addr_cancel(): CPU0 CPU1 ==== ==== process_one_work() debug_work_deactivate(work); process_one_req() rdma_addr_cancel() mutex_lock(&lock); set_timeout(&req->work,..); __queue_work() debug_work_activate(work); mutex_unlock(&lock); mutex_lock(&lock); [..] list_del(&req->list); mutex_unlock(&lock); [..] // ODEBUG explodes since the work is still queued. kfree(req); Causing ODEBUG to detect the use after free: ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165 WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288 kvm: emulating exchange as write Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: ib_addr process_one_req Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x1f4/0x2b0 lib/bug.c:186 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986 RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288 RSP: 0000:ffff8801d966f210 EFLAGS: 00010086 RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815acd6e RDX: 0000000000000000 RSI: 1ffff1003b2cddf2 RDI: 0000000000000000 RBP: ffff8801d966f250 R08: 0000000000000000 R09: 1ffff1003b2cddc8 R10: ffffed003b2cde71 R11: ffffffff86f39a98 R12: 0000000000000001 R13: ffffffff86f15540 R14: ffffffff86408700 R15: ffffffff8147c0a0 __debug_check_no_obj_freed lib/debugobjects.c:745 [inline] debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774 kfree+0xc7/0x260 mm/slab.c:3799 process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592 process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 worker_thread+0x223/0x1990 kernel/workqueue.c:2247 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC") Reported-by: <syzbot+3b4acab09b6463472d0a@syzkaller.appspotmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27Merge branch 'sfc-filter-locking'David S. Miller
Edward Cree says: ==================== sfc: rework locking around filter management The use of a spinlock to protect filter state combined with the need for a sleeping operation (MCDI) to apply that state to the NIC (on EF10) led to unfixable race conditions, around the handling of filter restoration after an MC reboot. So, this patch series removes the requirement to be able to modify the SW filter table from atomic context, by using a workqueue to request asynchronous filter operations (which are needed for ARFS). Then, the filter table locks are changed to mutexes, replacing the dance of spinlocks and 'busy' flags. Also, a mutex is added to protect the RSS context state, since otherwise a similar race is possible around restoring that after an MC reboot. While we're at it, fix a couple of other related bugs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: fix flow type handling for RSS filtersEdward Cree
The FLOW_RSS flag was causing us to insert UDP filters when TCP was wanted. Fixes: 42356d9a137b ("sfc: support RSS spreading of ethtool ntuple filters") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: protect list of RSS contexts under a mutexEdward Cree
Otherwise races are possible between ethtool ops and efx_ef10_rx_restore_rss_contexts(). Also, don't try to perform the restore on every reset, only after an MC reboot, otherwise we'll leak RSS contexts on the NIC. Fixes: 42356d9a137b ("sfc: support RSS spreading of ethtool ntuple filters") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: return a better error if filter insertion collides with MC rebootEdward Cree
If some other operation gets the MCDI lock ahead of us and performs an MC reboot, then our attempt to insert the filter will fail with EINVAL, because the destination VI (spec->dmaq_id, MC_CMD_FILTER_OP_IN_RX_QUEUE) does not exist. But the caller's request (which might e.g. be an ethtool ntuple request from userland) isn't invalid, it just got unlucky; so return EAGAIN. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: use a semaphore to lock farch filters tooEdward Cree
With this change, the spinlock efx->filter_lock is no longer used and is thus removed. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: give ef10 its own rwsem in the filter table instead of filter_lockEdward Cree
efx->filter_lock remains in place for use on farch, but EF10 now ignores it. EFX_EF10_FILTER_FLAG_BUSY is no longer needed, hence it is removed. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27sfc: replace asynchronous filter operationsEdward Cree
Instead of having an efx->type->filter_rfs_insert() method, just use workitems with a worker function that calls efx->type->filter_insert(). The only user of this is efx_filter_rfs(), which now queues a call to efx_filter_rfs_work(). Similarly, efx_filter_rfs_expire() is now a worker function called on a new channel->filter_work work_struct, so the method efx->type->filter_rfs_expire_one() is no longer called in atomic context. We also add a new mutex efx->rps_mutex to protect the RPS state (efx-> rps_expire_channel, efx->rps_expire_index, and channel->rps_flow_id) so that the taking of efx->filter_lock can be moved to efx->type->filter_rfs_expire_one(). Thus, all filter table functions are now called in a sleepable context, allowing them to use sleeping locks in a future patch. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27Merge branch 'pernet-all-async'David S. Miller
Kirill Tkhai says: ==================== Make pernet_operations always read locked All the pernet_operations are converted, and the last one is in this patchset (nfsd_net_ops acked by J. Bruce Fields). So, it's the time to kill pernet_operations::async field, and make setup_net() and cleanup_net() always require the rwsem only read locked. All further pernet_operations have to be developed to fit this rule. Some of previous patches added a comment to struct pernet_operations about that. Also, this patchset renames net_sem to pernet_ops_rwsem to make the target area of the rwsem is more clear visible, and adds more comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Add more commentsKirill Tkhai
This adds comments to different places to improve readability. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Rename net_sem to pernet_ops_rwsemKirill Tkhai
net_sem is some undefined area name, so it will be better to make the area more defined. Rename it to pernet_ops_rwsem for better readability and better intelligibility. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Drop pernet_operations::asyncKirill Tkhai
Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Reflect all pernet_operations are convertedKirill Tkhai
All pernet_operations are reviewed and converted, hooray! Reflect this in core code: setup_net() and cleanup_net() will take down_read() always. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Convert nfsd_net_opsKirill Tkhai
These pernet_operations look similar to rpcsec_gss_net_ops, they just create and destroy another caches. So, they also can be async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: mvpp2: Use relaxed I/O in data pathYan Markman
Use relaxed I/O on the hot path. This achieves significant performance improvements. On a 10G link, this makes a basic iperf TCP test go from an average of 4.5 Gbits/sec to about 9.40 Gbits/sec. Signed-off-by: Yan Markman <ymarkman@marvell.com> [Maxime: Commit message, cosmetic changes] Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27qede: Fix barrier usage after tx doorbell write.Manish Chopra
Since commit c5ad119fb6c09b0297446be05bd66602fa564758 ("net: sched: pfifo_fast use skb_array") driver is exposed to an issue where it is hitting NULL skbs while handling TX completions. Driver uses mmiowb() to flush the writes to the doorbell bar which is a write-combined bar, however on x86 mmiowb() does not flush the write combined buffer. This patch fixes this problem by replacing mmiowb() with wmb() after the write combined doorbell write so that writes are flushed and synchronized from more than one processor. V1->V2: ------- This patch was marked as "superseded" in patchwork. (Not really sure for what reason).Resending it as v2. Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27vhost: correctly remove wait queue during poll failureJason Wang
We tried to remove vq poll from wait queue, but do not check whether or not it was in a list before. This will lead double free. Fixing this by switching to use vhost_poll_stop() which zeros poll->wqh after removing poll from waitqueue to make sure it won't be freed twice. Cc: Darren Kenny <darren.kenny@oracle.com> Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com Fixes: 2b8b328b61c79 ("vhost_net: handle polling errors when setting backend") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-28kbuild: rpm-pkg: Support GNU tar >= 1.29Jason Gunthorpe
There is a change in how command line parsing is done in this version. Excludes and includes are now ordered with the file list. Since the spec file puts the file list before the exclude list it means newer tar ignores the excludes and packs all the build output into the kernel-devel RPM resulting in a huge package. Simple argument re-ordering fixes the problem. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-28builddeb: Fix header package regarding dtc source linksJan Kiszka
Since d5d332d3f7e8, a couple of links in scripts/dtc/include-prefixes are additionally required in order to build device trees with the header package. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Riku Voipio <riku.voipio@linaro.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-27Merge branch 'mlx4-misc-fixes-for-4.16'David S. Miller
Tariq Toukan says: ==================== mlx4 misc fixes for 4.16 This patchset contains misc bug fixes from the team to the mlx4 Core and Eth drivers. Patch 1 by Eran fixes a control mix of PFC and Global pauses, please queue it to -stable for >= v4.8. Patch 2 by Moshe fixes a resource leak in slave's delete flow, please queue it to -stable for >= v4.5. Series generated against net commit: 3c82b372a9f4 net: dsa: mt7530: fix module autoloading for OF platform drivers ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/mlx4_core: Fix memory leak while delete slave's resourcesMoshe Shemesh
mlx4_delete_all_resources_for_slave in resource tracker should free all memory allocated for a slave. While releasing memory of fs_rule, it misses releasing memory of fs_rule->mirr_mbox. Fixes: 78efed275117 ('net/mlx4_core: Support mirroring VF DMFS rules on both ports') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/mlx4_en: Fix mixed PFC and Global pause user control requestsEran Ben Elisha
Global pause and PFC configuration should be mutually exclusive (i.e. only one of them at most can be set). However, once PFC was turned off, driver automatically turned Global pause on. This is a bug. Fix the driver behaviour to turn off PFC/Global once the user turned the other on. This also fixed a weird behaviour that at a current time, the profile had both PFC and global pause configuration turned on, which is Hardware-wise impossible and caused returning false positive indication to query tools. In addition, fix error code when setting global pause or PFC to change metadata only upon successful change. Also, removed useless debug print. Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands") Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/smc: use announced length in sock_recvmsg()Ursula Braun
Not every CLC proposal message needs the maximum buffer length. Due to the MSG_WAITALL flag, it is important to use the peeked real length when receiving the message. Fixes: d63d271ce2b5ce ("smc: switch to sock_recvmsg()") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27llc: properly handle dev_queue_xmit() return valueCong Wang
llc_conn_send_pdu() pushes the skb into write queue and calls llc_conn_send_pdus() to flush them out. However, the status of dev_queue_xmit() is not returned to caller, in this case, llc_conn_state_process(). llc_conn_state_process() needs hold the skb no matter success or failure, because it still uses it after that, therefore we should hold skb before dev_queue_xmit() when that skb is the one being processed by llc_conn_state_process(). For other callers, they can just pass NULL and ignore the return value as they are. Reported-by: Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27Merge tag 'mlx5-fixes-2018-03-23' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-03-23 The following series includes fixes for mlx5 netdev and eswitch. v1->v2: - Fixed commit message quotation marks in patch #7 For -stable v4.12 ('net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path') ('net/mlx5e: Fix traffic being dropped on VF representor') For -stable v4.13 ('net/mlx5e: Fix memory usage issues in offloading TC flows') ('net/mlx5e: Verify coalescing parameters in range') For -stable v4.14 ('net/mlx5e: Don't override vport admin link state in switchdev mode') For -stable v4.15 ('108b2b6d5c02 net/mlx5e: Sync netdev vxlan ports at open') Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27Merge tag 'mlx5-updates-2018-03-22' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-03-22 (Misc updates) This series includes misc updates for mlx5 core and netdev dirver, Highlights: From Inbar, three patches to add support for PFC stall prevention statistics and enable/disable through new ethtool tunable, as requested from previous submission. From Moshe, four patches, added more drop counters: - drop counter for netdev steering miss - drop counter for when VF logical link is down - drop counter for when netdev logical link is down. From Or, three patches to support vlan push/pop offload via tc HW action, for newer HW (Connectx-5 and onward) via HW steering flow actions rather than the emulated path for the older HW brands. And five more misc small trivial patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27liquidio: Removed duplicate Tx queue status checkIntiyaz Basha
Napi is checking Tx queue status and waking the Tx queue if required. Same operation is being done while freeing every Tx buffer. So removed the duplicate operation of checking Tx queue status from the Tx buffer free functions. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27strparser: Fix sign of err codesDave Watson
strp_parser_err is called with a negative code everywhere, which then calls abort_parser with a negative code. strp_msg_timeout calls abort_parser directly with a positive code. Negate ETIMEDOUT to match signed-ness of other calls. The default abort_parser callback, strp_abort_strp, sets sk->sk_err to err. Also negate the error here so sk_err always holds a positive value, as the rest of the net code expects. Currently a negative sk_err can result in endless loops, or user code that thinks it actually sent/received err bytes. Found while testing net/tls_sw recv path. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net sched actions: fix dumping which requires several messages to user spaceCraig Dillabaugh
Fixes a bug in the tcf_dump_walker function that can cause some actions to not be reported when dumping a large number of actions. This issue became more aggrevated when cookies feature was added. In particular this issue is manifest when large cookie values are assigned to the actions and when enough actions are created that the resulting table must be dumped in multiple batches. The number of actions returned in each batch is limited by the total number of actions and the memory buffer size. With small cookies the numeric limit is reached before the buffer size limit, which avoids the code path triggering this bug. When large cookies are used buffer fills before the numeric limit, and the erroneous code path is hit. For example after creating 32 csum actions with the cookie aaaabbbbccccdddd $ tc actions ls action csum total acts 26 action order 0: csum (tcp) action continue index 1 ref 1 bind 0 cookie aaaabbbbccccdddd ..... action order 25: csum (tcp) action continue index 26 ref 1 bind 0 cookie aaaabbbbccccdddd total acts 6 action order 0: csum (tcp) action continue index 28 ref 1 bind 0 cookie aaaabbbbccccdddd ...... action order 5: csum (tcp) action continue index 32 ref 1 bind 0 cookie aaaabbbbccccdddd Note that the action with index 27 is omitted from the report. Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")" Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27r8169: fix setting driver_data after register_netdevHeiner Kallweit
pci_set_drvdata() is called only after registering the net_device, therefore we could run into a NPE if one of the functions using driver_data is called before it's set. Fix this by calling pci_set_drvdata() before registering the net_device. This fix is a candidate for stable. As far as I can see the bug has been there in kernel version 3.2 already, therefore I can't provide a reference which commit is fixed by it. The fix may need small adjustments per kernel version because due to other changes the label which is jumped to if register_netdev() fails has changed over time. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27ipv6: addrconf: Use normal debugging styleJoe Perches
Remove local ADBG macro and use netdev_dbg/pr_debug Miscellanea: o Remove unnecessary debug message after allocation failure as there already is a dump_stack() on the failure paths o Leave the allocation failure message on snmp6_alloc_dev as there is one code path that does not do a dump_stack() Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: fix possible out-of-bound read in skb_network_protocol()Eric Dumazet
skb mac header is not necessarily set at the time skb_network_protocol() is called. Use skb->data instead. BUG: KASAN: slab-out-of-bounds in skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 Read of size 2 at addr ffff8801b3097a0b by task syz-executor5/14242 CPU: 1 PID: 14242 Comm: syz-executor5 Not tainted 4.16.0-rc6+ #280 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23c/0x360 mm/kasan/report.c:412 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:443 skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 harmonize_features net/core/dev.c:2924 [inline] netif_skb_features+0x509/0x9b0 net/core/dev.c:3011 validate_xmit_skb+0x81/0xb00 net/core/dev.c:3084 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3142 packet_direct_xmit+0x117/0x790 net/packet/af_packet.c:256 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xca/0x110 net/socket.c:639 ___sys_sendmsg+0x767/0x8b0 net/socket.c:2047 __sys_sendmsg+0xe5/0x210 net/socket.c:2081 Fixes: 19acc327258a ("gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@ovn.org> Reported-by: Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27tc-testing: Correct compound statements for namespace executionLucas Bates
If tdc is executing test cases inside a namespace, only the first command in a compound statement will be executed inside the namespace by tdc. As a result, the subsequent commands are not executed inside the namespace and the test will fail. Example: for i in {x..y}; do args="foo"; done && tc actions add $args The namespace execution feature will prepend 'ip netns exec' to the command: ip netns exec tcut for i in {x..y}; do args="foo"; done && \ tc actions add $args So the actual tc command is not parsed by the shell as being part of the namespace execution. Enclosing these compound statements inside a bash invocation with proper escape characters resolves the problem by creating a subshell inside the namespace. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net-usb: add qmi_wwan if on lte modem wistron neweb d18q1Giuseppe Lippolis
This modem is embedded on dlink dwr-921 router. The oem configuration states: T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1435 ProdID=0918 Rev= 2.32 S: Manufacturer=Android S: Product=Android S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none) E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=125us Tested on openwrt distribution Signed-off-by: Giuseppe Lippolis <giu.lippolis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27tipc: tipc_node_create() can be staticWei Yongjun
Fixes the following sparse warning: net/tipc/node.c:336:18: warning: symbol 'tipc_node_create' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>