summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-28mm/memcontrol.c: fix parameter description mismatchHonglei Wang
There are a couple of places where parameter description and function name do not match the actual code. Fix it. Link: http://lkml.kernel.org/r/1520843448-17347-1-git-send-email-honglei.wang@oracle.com Signed-off-by: Honglei Wang <honglei.wang@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-28mm/vmstat.c: fix vmstat_update() preemption BUGSteven J. Hill
Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can cause vmstat_update() to report a BUG due to preemption not being disabled around smp_processor_id(). Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor. BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:1/269 caller is vmstat_update+0x50/0xa0 CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted 4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1 Workqueue: mm_percpu_wq vmstat_update Call Trace: show_stack+0x94/0x128 dump_stack+0xa4/0xe0 check_preemption_disabled+0x118/0x120 vmstat_update+0x50/0xa0 process_one_work+0x144/0x348 worker_thread+0x150/0x4b8 kthread+0x110/0x140 ret_from_kernel_thread+0x14/0x1c Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com Signed-off-by: Steven J. Hill <steven.hill@cavium.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-28mm/page_owner: fix recursion bug after changing skip entriesManinder Singh
This patch fixes commit 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries"). Because if we skip first two entries then logic of checking count value as 2 for recursion is broken and code will go in one depth recursion. so we need to check only one call of _RET_IP(__set_page_owner) while checking for recursion. Current Backtrace while checking for recursion:- (save_stack) from (__set_page_owner) // (But recursion returns true here) (__set_page_owner) from (get_page_from_freelist) (get_page_from_freelist) from (__alloc_pages_nodemask) (__alloc_pages_nodemask) from (depot_save_stack) (depot_save_stack) from (save_stack) // recursion should return true here (save_stack) from (__set_page_owner) (__set_page_owner) from (get_page_from_freelist) (get_page_from_freelist) from (__alloc_pages_nodemask+) (__alloc_pages_nodemask) from (depot_save_stack) (depot_save_stack) from (save_stack) (save_stack) from (__set_page_owner) (__set_page_owner) from (get_page_from_freelist) Correct Backtrace with fix: (save_stack) from (__set_page_owner) // recursion returned true here (__set_page_owner) from (get_page_from_freelist) (get_page_from_freelist) from (__alloc_pages_nodemask+) (__alloc_pages_nodemask) from (depot_save_stack) (depot_save_stack) from (save_stack) (save_stack) from (__set_page_owner) (__set_page_owner) from (get_page_from_freelist) Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries") Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Signed-off-by: Vaneet Narang <v.narang@samsung.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@techadventures.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ayush Mittal <ayush.m@samsung.com> Cc: Prakash Gupta <guptap@codeaurora.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: Vasyl Gomonovych <gomonovych@gmail.com> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Cc: <pankaj.m@samsung.com> Cc: Vaneet Narang <v.narang@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-28ipc/shm.c: add split function to shm_vm_opsMike Kravetz
If System V shmget/shmat operations are used to create a hugetlbfs backed mapping, it is possible to munmap part of the mapping and split the underlying vma such that it is not huge page aligned. This will untimately result in the following BUG: kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310! Oops: Exception in kernel mode, sig: 5 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E 4.15.0-10-generic #11-Ubuntu NIP: c00000000036e764 LR: c00000000036ee48 CTR: 0000000000000009 REGS: c000003fbcdcf810 TRAP: 0700 Tainted: G C E (4.15.0-10-generic) MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002222 XER: 20040000 CFAR: c00000000036ee44 SOFTE: 1 NIP __unmap_hugepage_range+0xa4/0x760 LR __unmap_hugepage_range_final+0x28/0x50 Call Trace: 0x7115e4e00000 (unreliable) __unmap_hugepage_range_final+0x28/0x50 unmap_single_vma+0x11c/0x190 unmap_vmas+0x94/0x140 exit_mmap+0x9c/0x1d0 mmput+0xa8/0x1d0 do_exit+0x360/0xc80 do_group_exit+0x60/0x100 SyS_exit_group+0x24/0x30 system_call+0x58/0x6c ---[ end trace ee88f958a1c62605 ]--- This bug was introduced by commit 31383c6865a5 ("mm, hugetlbfs: introduce ->split() to vm_operations_struct"). A split function was added to vm_operations_struct to determine if a mapping can be split. This was mostly for device-dax and hugetlbfs mappings which have specific alignment constraints. Mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops. shm_vm_ops functions will call back to the original vm_ops if needed. Add such a split function to shm_vm_ops. Link: http://lkml.kernel.org/r/20180321161314.7711-1-mike.kravetz@oracle.com Fixes: 31383c6865a5 ("mm, hugetlbfs: introduce ->split() to vm_operations_struct") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-28mm, slab: memcg_link the SLAB's kmem_cacheShakeel Butt
All the root caches are linked into slab_root_caches which was introduced by the commit 510ded33e075 ("slab: implement slab_root_caches list") but it missed to add the SLAB's kmem_cache. While experimenting with opt-in/opt-out kmem accounting, I noticed system crashes due to NULL dereference inside cache_from_memcg_idx() while deferencing kmem_cache.memcg_params.memcg_caches. The upstream clean kernel will not see these crashes but SLAB should be consistent with SLUB which does linked its boot caches (kmem_cache_node and kmem_cache) into slab_root_caches. Link: http://lkml.kernel.org/r/20180319210020.60289-1-shakeelb@google.com Fixes: 510ded33e075c ("slab: implement slab_root_caches list") Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-28RDMA/ucma: Introduce safer rdma_addr_size() variantsRoland Dreier
There are several places in the ucma ABI where userspace can pass in a sockaddr but set the address family to AF_IB. When that happens, rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6, and the ucma kernel code might end up copying past the end of a buffer not sized for a struct sockaddr_ib. Fix this by introducing new variants int rdma_addr_size_in6(struct sockaddr_in6 *addr); int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr); that are type-safe for the types used in the ucma ABI and return 0 if the size computed is bigger than the size of the type passed in. We can use these new variants to check what size userspace has passed in before copying any addresses. Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-28x86/platform/UV: Fix critical UV MMR address errormike.travis@hpe.com
A critical error was found testing the fixed UV4 HUB in that an MMR address was found to be incorrect. This causes the virtual address space for accessing the MMIOH1 region to be allocated with the incorrect size. Fixes: 673aa20c55a1 ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes") Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Andrew Banman <andrew.banman@hpe.com> Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com
2018-03-28perf/hwbp: Simplify the perf-hwbp code, fix documentationLinus Torvalds
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the modification of a breakpoint - simplify it and remove the pointless local variables. Also update the stale Docbook while at it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-28drm/tegra: dc: Using NULL instead of plain integerWei Yongjun
Fixes the following sparse warnings: drivers/gpu/drm/tegra/dc.c:2181:69: warning: Using plain integer as NULL pointer Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-28KVM: x86: Fix pv tlb flush dependenciesWanpeng Li
PV TLB FLUSH can only be turned on when steal time is enabled. The condition got reversed during conflict resolution. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Fixes: 4f2f61fc5071 ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled") [Rebased on top of kvm/master and reworded the commit message. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28x86/platform/uv/BAU: Add APIC idt entryAndrew Banman
BAU uses the old alloc_initr_gate90 method to setup its interrupt. This fails silently as the BAU vector is in the range of APIC vectors that are registered to the spurious interrupt handler. As a consequence BAU broadcasts are not handled, and the broadcast source CPU hangs. Update BAU to use new idt structure. Fixes: dc20b2d52653 ("x86/idt: Move interrupt gate initialization to IDT code") Signed-off-by: Andrew Banman <abanman@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: Dimitri Sivanich <sivanich@hpe.com> Cc: Russ Anderson <rja@hpe.com> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com
2018-03-27Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "A small number of small fixes for ARM, mostly for some build issues. One fix for a regression caused by the cpu hotplug conversion from a few kernel versions ago" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8750/1: deflate_xip_data.sh: minor fixes ARM: 8748/1: mm: Define vdso_start, vdso_end as array ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
2018-03-27Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that give the wrong return to Read Capacity and cause a huge log spew. The remaining five patches all try to fix commit 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs") which broke the non-mq I/O path" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled scsi: sd: Remember that READ CAPACITY(16) succeeded scsi: ibmvfc: Avoid unnecessary port relogin scsi: virtio_scsi: unify scsi_host_template scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity scsi: core: introduce force_blk_mq scsi: megaraid_sas: fix selection of reply queue scsi: hpsa: fix selection of reply queue
2018-03-27RDMA/hns: ensure for-loop actually iterates and free's buffersColin Ian King
The current for-loop zeros variable i and only loops once, hence not all the buffers are free'd. Fix this by setting i correctly. Detected by CoverityScan, CID#1463415 ("Operands don't affect result") Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Check that device exists prior to accessing itLeon Romanovsky
Ensure that device exists prior to accessing its properties. Reported-by: <syzbot+71655d44855ac3e76366@syzkaller.appspotmail.com> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/ucma: Check that device is connected prior to access itLeon Romanovsky
Add missing check that device is connected prior to access it. [ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0 [ 55.359389] Read of size 8 at addr 00000000000000b0 by task qp/618 [ 55.360255] [ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b8b88 #91 [ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 55.363264] Call Trace: [ 55.363833] dump_stack+0x5c/0x77 [ 55.364215] kasan_report+0x163/0x380 [ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0 [ 55.365238] rdma_init_qp_attr+0x4a/0x2c0 [ 55.366410] ucma_init_qp_attr+0x111/0x200 [ 55.366846] ? ucma_notify+0xf0/0xf0 [ 55.367405] ? _get_random_bytes+0xea/0x1b0 [ 55.367846] ? urandom_read+0x2f0/0x2f0 [ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0 [ 55.369104] ? refcount_inc_not_zero+0x9/0x60 [ 55.369583] ? refcount_inc+0x5/0x30 [ 55.370155] ? rdma_create_id+0x215/0x240 [ 55.370937] ? _copy_to_user+0x4f/0x60 [ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290 [ 55.372127] ? _copy_from_user+0x5e/0x90 [ 55.372720] ucma_write+0x174/0x1f0 [ 55.373090] ? ucma_close_id+0x40/0x40 [ 55.373805] ? __lru_cache_add+0xa8/0xd0 [ 55.374403] __vfs_write+0xc4/0x350 [ 55.374774] ? kernel_read+0xa0/0xa0 [ 55.375173] ? fsnotify+0x899/0x8f0 [ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170 [ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30 [ 55.377522] ? handle_mm_fault+0x174/0x320 [ 55.378169] vfs_write+0xf7/0x280 [ 55.378864] SyS_write+0xa1/0x120 [ 55.379270] ? SyS_read+0x120/0x120 [ 55.379643] ? mm_fault_error+0x180/0x180 [ 55.380071] ? task_work_run+0x7d/0xd0 [ 55.380910] ? __task_pid_nr_ns+0x120/0x140 [ 55.381366] ? SyS_read+0x120/0x120 [ 55.381739] do_syscall_64+0xeb/0x250 [ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 55.382841] RIP: 0033:0x7fc2ef803e99 [ 55.383227] RSP: 002b:00007fffcc5f3be8 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 [ 55.384173] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc2ef803e99 [ 55.386145] RDX: 0000000000000057 RSI: 0000000020000080 RDI: 0000000000000003 [ 55.388418] RBP: 00007fffcc5f3c00 R08: 0000000000000000 R09: 0000000000000000 [ 55.390542] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400480 [ 55.392916] R13: 00007fffcc5f3cf0 R14: 0000000000000000 R15: 0000000000000000 [ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49 8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08 48 89 04 24 e8 3a 4f 1e ff 48 [ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP: ffff8801e2c2f9d8 [ 55.532648] CR2: 00000000000000b0 [ 55.534396] ---[ end trace 70cee64090251c0b ]--- Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Fixes: d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user") Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27RDMA/rdma_cm: Fix use after free race with process_one_reqJason Gunthorpe
process_one_req() can race with rdma_addr_cancel(): CPU0 CPU1 ==== ==== process_one_work() debug_work_deactivate(work); process_one_req() rdma_addr_cancel() mutex_lock(&lock); set_timeout(&req->work,..); __queue_work() debug_work_activate(work); mutex_unlock(&lock); mutex_lock(&lock); [..] list_del(&req->list); mutex_unlock(&lock); [..] // ODEBUG explodes since the work is still queued. kfree(req); Causing ODEBUG to detect the use after free: ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165 WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288 kvm: emulating exchange as write Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: ib_addr process_one_req Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x1f4/0x2b0 lib/bug.c:186 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986 RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288 RSP: 0000:ffff8801d966f210 EFLAGS: 00010086 RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815acd6e RDX: 0000000000000000 RSI: 1ffff1003b2cddf2 RDI: 0000000000000000 RBP: ffff8801d966f250 R08: 0000000000000000 R09: 1ffff1003b2cddc8 R10: ffffed003b2cde71 R11: ffffffff86f39a98 R12: 0000000000000001 R13: ffffffff86f15540 R14: ffffffff86408700 R15: ffffffff8147c0a0 __debug_check_no_obj_freed lib/debugobjects.c:745 [inline] debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774 kfree+0xc7/0x260 mm/slab.c:3799 process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592 process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 worker_thread+0x223/0x1990 kernel/workqueue.c:2247 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC") Reported-by: <syzbot+3b4acab09b6463472d0a@syzkaller.appspotmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27qede: Fix barrier usage after tx doorbell write.Manish Chopra
Since commit c5ad119fb6c09b0297446be05bd66602fa564758 ("net: sched: pfifo_fast use skb_array") driver is exposed to an issue where it is hitting NULL skbs while handling TX completions. Driver uses mmiowb() to flush the writes to the doorbell bar which is a write-combined bar, however on x86 mmiowb() does not flush the write combined buffer. This patch fixes this problem by replacing mmiowb() with wmb() after the write combined doorbell write so that writes are flushed and synchronized from more than one processor. V1->V2: ------- This patch was marked as "superseded" in patchwork. (Not really sure for what reason).Resending it as v2. Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27vhost: correctly remove wait queue during poll failureJason Wang
We tried to remove vq poll from wait queue, but do not check whether or not it was in a list before. This will lead double free. Fixing this by switching to use vhost_poll_stop() which zeros poll->wqh after removing poll from waitqueue to make sure it won't be freed twice. Cc: Darren Kenny <darren.kenny@oracle.com> Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com Fixes: 2b8b328b61c79 ("vhost_net: handle polling errors when setting backend") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-28kbuild: rpm-pkg: Support GNU tar >= 1.29Jason Gunthorpe
There is a change in how command line parsing is done in this version. Excludes and includes are now ordered with the file list. Since the spec file puts the file list before the exclude list it means newer tar ignores the excludes and packs all the build output into the kernel-devel RPM resulting in a huge package. Simple argument re-ordering fixes the problem. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-28builddeb: Fix header package regarding dtc source linksJan Kiszka
Since d5d332d3f7e8, a couple of links in scripts/dtc/include-prefixes are additionally required in order to build device trees with the header package. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Riku Voipio <riku.voipio@linaro.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-27Merge branch 'mlx4-misc-fixes-for-4.16'David S. Miller
Tariq Toukan says: ==================== mlx4 misc fixes for 4.16 This patchset contains misc bug fixes from the team to the mlx4 Core and Eth drivers. Patch 1 by Eran fixes a control mix of PFC and Global pauses, please queue it to -stable for >= v4.8. Patch 2 by Moshe fixes a resource leak in slave's delete flow, please queue it to -stable for >= v4.5. Series generated against net commit: 3c82b372a9f4 net: dsa: mt7530: fix module autoloading for OF platform drivers ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/mlx4_core: Fix memory leak while delete slave's resourcesMoshe Shemesh
mlx4_delete_all_resources_for_slave in resource tracker should free all memory allocated for a slave. While releasing memory of fs_rule, it misses releasing memory of fs_rule->mirr_mbox. Fixes: 78efed275117 ('net/mlx4_core: Support mirroring VF DMFS rules on both ports') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/mlx4_en: Fix mixed PFC and Global pause user control requestsEran Ben Elisha
Global pause and PFC configuration should be mutually exclusive (i.e. only one of them at most can be set). However, once PFC was turned off, driver automatically turned Global pause on. This is a bug. Fix the driver behaviour to turn off PFC/Global once the user turned the other on. This also fixed a weird behaviour that at a current time, the profile had both PFC and global pause configuration turned on, which is Hardware-wise impossible and caused returning false positive indication to query tools. In addition, fix error code when setting global pause or PFC to change metadata only upon successful change. Also, removed useless debug print. Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands") Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net/smc: use announced length in sock_recvmsg()Ursula Braun
Not every CLC proposal message needs the maximum buffer length. Due to the MSG_WAITALL flag, it is important to use the peeked real length when receiving the message. Fixes: d63d271ce2b5ce ("smc: switch to sock_recvmsg()") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27llc: properly handle dev_queue_xmit() return valueCong Wang
llc_conn_send_pdu() pushes the skb into write queue and calls llc_conn_send_pdus() to flush them out. However, the status of dev_queue_xmit() is not returned to caller, in this case, llc_conn_state_process(). llc_conn_state_process() needs hold the skb no matter success or failure, because it still uses it after that, therefore we should hold skb before dev_queue_xmit() when that skb is the one being processed by llc_conn_state_process(). For other callers, they can just pass NULL and ignore the return value as they are. Reported-by: Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27Merge tag 'mlx5-fixes-2018-03-23' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-03-23 The following series includes fixes for mlx5 netdev and eswitch. v1->v2: - Fixed commit message quotation marks in patch #7 For -stable v4.12 ('net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path') ('net/mlx5e: Fix traffic being dropped on VF representor') For -stable v4.13 ('net/mlx5e: Fix memory usage issues in offloading TC flows') ('net/mlx5e: Verify coalescing parameters in range') For -stable v4.14 ('net/mlx5e: Don't override vport admin link state in switchdev mode') For -stable v4.15 ('108b2b6d5c02 net/mlx5e: Sync netdev vxlan ports at open') Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27strparser: Fix sign of err codesDave Watson
strp_parser_err is called with a negative code everywhere, which then calls abort_parser with a negative code. strp_msg_timeout calls abort_parser directly with a positive code. Negate ETIMEDOUT to match signed-ness of other calls. The default abort_parser callback, strp_abort_strp, sets sk->sk_err to err. Also negate the error here so sk_err always holds a positive value, as the rest of the net code expects. Currently a negative sk_err can result in endless loops, or user code that thinks it actually sent/received err bytes. Found while testing net/tls_sw recv path. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net sched actions: fix dumping which requires several messages to user spaceCraig Dillabaugh
Fixes a bug in the tcf_dump_walker function that can cause some actions to not be reported when dumping a large number of actions. This issue became more aggrevated when cookies feature was added. In particular this issue is manifest when large cookie values are assigned to the actions and when enough actions are created that the resulting table must be dumped in multiple batches. The number of actions returned in each batch is limited by the total number of actions and the memory buffer size. With small cookies the numeric limit is reached before the buffer size limit, which avoids the code path triggering this bug. When large cookies are used buffer fills before the numeric limit, and the erroneous code path is hit. For example after creating 32 csum actions with the cookie aaaabbbbccccdddd $ tc actions ls action csum total acts 26 action order 0: csum (tcp) action continue index 1 ref 1 bind 0 cookie aaaabbbbccccdddd ..... action order 25: csum (tcp) action continue index 26 ref 1 bind 0 cookie aaaabbbbccccdddd total acts 6 action order 0: csum (tcp) action continue index 28 ref 1 bind 0 cookie aaaabbbbccccdddd ...... action order 5: csum (tcp) action continue index 32 ref 1 bind 0 cookie aaaabbbbccccdddd Note that the action with index 27 is omitted from the report. Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")" Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27r8169: fix setting driver_data after register_netdevHeiner Kallweit
pci_set_drvdata() is called only after registering the net_device, therefore we could run into a NPE if one of the functions using driver_data is called before it's set. Fix this by calling pci_set_drvdata() before registering the net_device. This fix is a candidate for stable. As far as I can see the bug has been there in kernel version 3.2 already, therefore I can't provide a reference which commit is fixed by it. The fix may need small adjustments per kernel version because due to other changes the label which is jumped to if register_netdev() fails has changed over time. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: fix possible out-of-bound read in skb_network_protocol()Eric Dumazet
skb mac header is not necessarily set at the time skb_network_protocol() is called. Use skb->data instead. BUG: KASAN: slab-out-of-bounds in skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 Read of size 2 at addr ffff8801b3097a0b by task syz-executor5/14242 CPU: 1 PID: 14242 Comm: syz-executor5 Not tainted 4.16.0-rc6+ #280 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23c/0x360 mm/kasan/report.c:412 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:443 skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 harmonize_features net/core/dev.c:2924 [inline] netif_skb_features+0x509/0x9b0 net/core/dev.c:3011 validate_xmit_skb+0x81/0xb00 net/core/dev.c:3084 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3142 packet_direct_xmit+0x117/0x790 net/packet/af_packet.c:256 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xca/0x110 net/socket.c:639 ___sys_sendmsg+0x767/0x8b0 net/socket.c:2047 __sys_sendmsg+0xe5/0x210 net/socket.c:2081 Fixes: 19acc327258a ("gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@ovn.org> Reported-by: Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net-usb: add qmi_wwan if on lte modem wistron neweb d18q1Giuseppe Lippolis
This modem is embedded on dlink dwr-921 router. The oem configuration states: T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1435 ProdID=0918 Rev= 2.32 S: Manufacturer=Android S: Product=Android S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none) E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=125us Tested on openwrt distribution Signed-off-by: Giuseppe Lippolis <giu.lippolis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27Merge tag 'batadv-net-for-davem-20180326' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - fix multicast-via-unicast transmissions for AP isolation and gateway extension, by Linus Luessing (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27ALSA: pcm: potential uninitialized return valuesDan Carpenter
Smatch complains that "tmp" can be uninitialized if we do a zero size write. Fixes: 02a5d6925cd3 ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-27Merge tag 'sunxi-fixes-for-4.16' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Pull "Allwinner Fixes for 4.16" from Maxime Ripard: The first and second patches fix the regulator support for the Bananapi M2 board. The last one updates my email address in MAINTAINERS. * tag 'sunxi-fixes-for-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: MAINTAINERS: update email address for Maxime Ripard ARM: dts: sun6i: a31s: bpi-m2: add missing regulators ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
2018-03-27Merge tag 'omap-for-v4.16/sram-fix-signed' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Pull "Two fixes for omap variants for v4.16-rc cycle" from Tony Lindgren: Fix insecure W+X mapping warning for SRAM for omaps that don't yet use drivers/misc/*sram*.c code. An earlier attempt at fixing this turned out to cause problems with PM on omap3, this version works with PM on omap3. Also fix dmtimer probe for omap16xx devices that was noticed with the pending dmtimer move to drivers. It seems this has been broken for a while and is a non-critical for booting. It is needed for PM on omap16xx though. * tag 'omap-for-v4.16/sram-fix-signed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP: Fix SRAM W+X mapping ARM: OMAP: Fix dmtimer init for omap1
2018-03-27Merge tag 'tegra-for-4.17-misc' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tegra/linux into fixes Pull "ARM: tegra: Miscellaneous changes for v4.17-rc1" from Thierry Reding: This contains a single patch to update the MAINTAINERS entry for the Tegra SMMU driver. * tag 'tegra-for-4.17-misc' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tegra/linux: MAINTAINERS: Update Tegra IOMMU maintainer
2018-03-27x86/alternatives: Fixup alternative_call_2Alexey Dobriyan
The following pattern fails to compile while the same pattern with alternative_call() does: if (...) alternative_call_2(...); else alternative_call_2(...); as it expands into if (...) { }; <=== else { }; Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2
2018-03-27Merge tag 'drm-amdkfd-fixes-2018-03-25' of ↵Dave Airlie
git://people.freedesktop.org/~gabbayo/linux into drm-fixes - Programming VMID correctly for scratch memory with HWS - deallocating SDMA queues correctly in various situations * tag 'drm-amdkfd-fixes-2018-03-25' of git://people.freedesktop.org/~gabbayo/linux: drm/amdkfd: Deallocate SDMA queues correctly drm/amdkfd: Fix scratch memory with HWS enabled
2018-03-27perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUsStephane Eranian
this patch fix a bug in how the pebs->real_ip is handled in the PEBS handler. real_ip only exists in Haswell and later processor. It is actually the eventing IP, i.e., where the event occurred. As opposed to the pebs->ip which is the PEBS interrupt IP which is always off by one. The problem is that the real_ip just like the IP needs to be fixed up because PEBS does not record all the machine state registers, and in particular the code segement (cs). This is why we have the set_linear_ip() function. The problem was that set_linear_ip() was only used on the pebs->ip and not the pebs->real_ip. We have profiles which ran into invalid callstacks because of this. Here is an example: ..... 0: ffffffffffffff80 recent entry, marker kernel v ..... 1: 000000000040044d <= user address in kernel space! ..... 2: fffffffffffffe00 marker enter user v ..... 3: 000000000040044d ..... 4: 00000000004004b6 oldest entry Debugging output in get_perf_callchain(): [ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0 The problem is that the kernel entry in 1: points to a user level address. How can that be? The reason is that with PEBS sampling the instruction that caused the event to occur and the instruction where the CPU was when the interrupt was posted may be far apart. And sometime during that time window, the privilege level may change. This happens, for instance, when the PEBS sample is taken close to a kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level instruction. But by the time the PMU interrupt fired, the processor had already entered kernel space. This is why the debug output shows a user address with user_mode() false. The problem comes from PEBS not recording the code segment (cs) register. The register is used in x86_64 to determine if executing in kernel vs user space. This is okay because the kernel has a software workaround called set_linear_ip(). But the issue in setup_pebs_sample_data() is that set_linear_ip() is never called on the real_ip value when it is available (Haswell and later) and precise_ip > 1. This patch fixes this problem and eliminates the callchain discrepancy. The patch restructures the code around set_linear_ip() to minimize the number of times the IP has to be set. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-27objtool: Add Clang supportJosh Poimboeuf
Since the ORC unwinder was made the default on x86_64, Clang-built defconfig kernels have triggered some new objtool warnings: drivers/gpu/drm/i915/i915_gpu_error.o: warning: objtool: i915_error_printf()+0x6c: return with modified stack frame drivers/gpu/drm/i915/intel_display.o: warning: objtool: pipe_config_err()+0xa6: return with modified stack frame The problem is that objtool has never seen clang-built binaries before. Shockingly enough, objtool is apparently able to follow the code flow mostly fine, except for one instruction sequence. Instead of a LEAVE instruction, clang restores RSP and RBP the long way: 67c: 48 89 ec mov %rbp,%rsp 67f: 5d pop %rbp Teach objtool about this new code sequence. Reported-and-test-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/fce88ce81c356eedcae7f00ed349cfaddb3363cc.1521741586.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-26net/mlx5e: Sync netdev vxlan ports at openShahar Klein
When mlx5_core is loaded it is expected to sync ports with all vxlan devices so it can support vxlan encap/decap. This is done via udp_tunnel_get_rx_info(). Currently this call is set in mlx5e_nic_enable() and if the netdev is not in NETREG_REGISTERED state it will not be called. Normally on load the netdev state is not NETREG_REGISTERED so udp_tunnel_get_rx_info() will not be called. Moving udp_tunnel_get_rx_info() to mlx5e_open() so it will be called on netdev UP event and allow encap/decap. Fixes: 610e89e05c3f ("net/mlx5e: Don't sync netdev state when not registered") Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update pathOr Gerlitz
Currently we use the global ipv6_stub var to access the ipv6 global nd table. This practice gets us to troubles when the stub is only partially set e.g when ipv6 is loaded under the disabled policy. In this case, as of commit 343d60aada5a ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument") the stub is not null, but stub->nd_tbl is and we crash. As we can access the ipv6 nd_tbl directly, the fix is just to avoid the reference through the stub. There is one place in the code where we issue ipv6 route lookup and keep doing it through the stub, but that mentioned commit makes sure we get -EAFNOSUPPORT from the stack. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Fix memory usage issues in offloading TC flowsJianbo Liu
For NIC flows, the parsed attributes are not freed when we exit successfully from mlx5e_configure_flower(). There is possible double free for eswitch flows. If error is returned from rhashtable_insert_fast(), the parse attrs will be freed in mlx5e_tc_del_flow(), but they will be freed again before exiting mlx5e_configure_flower(). To fix both issues we do the following: (1) change the condition that determines if to issue the free call to check if this flow is NIC flow, or it does not have encap action. (2) reorder the code such that that the check and free calls are done before we attempt to add into the hash table. Fixes: 232c001398ae ('net/mlx5e: Add support to neighbour update flow') Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Fix traffic being dropped on VF representorRoi Dayan
Increase representor netdev RQ size to avoid dropped packets. The current size (two) is just too small to keep up with conventional slow path traffic patterns. Also match the SQ size to the RQ size. Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Verify coalescing parameters in rangeMoshe Shemesh
Add check of coalescing parameters received through ethtool are within range of values supported by the HW. Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the users through ethtool. The ethtool support up to 32 bit value for each. However, mlx5 modify cq limits the coalescing time parameter to 12 bit and coalescing frames parameters to 16 bits. Return out of range error if user tries to set these parameters to higher values. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5: Make eswitch support to depend on switchdevOr Gerlitz
Add dependancy for switchdev to be congfigured as any user-space control plane SW is expected to use the HW switchdev ID to locate the representors related to VFs of a certain PF and apply SW/offloaded switching on them. Fixes: e80541ecabd5 ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Use 32 bits to store VF representor SQ numberOr Gerlitz
SQs are 32 and not 16 bits, hence it's wrong to use only 16 bits to store the sq number for which are going to set steering rule, fix that. Fixes: cb67b832921c ('net/mlx5e: Introduce SRIOV VF representors') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5e: Don't override vport admin link state in switchdev modeJianbo Liu
The vport admin original link state will be re-applied after returning back to legacy mode, it is not right to change the admin link state value when in switchdev mode. Use direct vport commands to alter logical vport state in netdev representor open/close flows rather than the administrative eswitch API. Fixes: 20a1ea674783 ('net/mlx5e: Support VF vport link state control for SRIOV switchdev mode') Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net: dsa: mt7530: fix module autoloading for OF platform driversSean Wang
It's required to create a modules.alias via MODULE_DEVICE_TABLE helper for the OF platform driver. Otherwise, module autoloading cannot work. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>