summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-09-01bridge: switchdev: Clear forward mark when transmitting packetIdo Schimmel
Commit 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices") added the 'offload_fwd_mark' bit to the skb in order to allow drivers to indicate to the bridge driver that they already forwarded the packet in L2. In case the bit is set, before transmitting the packet from each port, the port's mark is compared with the mark stored in the skb's control block. If both marks are equal, we know the packet arrived from a switch device that already forwarded the packet and it's not re-transmitted. However, if the packet is transmitted from the bridge device itself (e.g., br0), we should clear the 'offload_fwd_mark' bit as the mark stored in the skb's control block isn't valid. This scenario can happen in rare cases where a packet was trapped during L3 forwarding and forwarded by the kernel to a bridge device. Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Yotam Gigi <yotamg@mellanox.com> Tested-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01mlxsw: spectrum: Forbid linking to devices that have uppersIdo Schimmel
The mlxsw driver relies on NETDEV_CHANGEUPPER events to configure the device in case a port is enslaved to a master netdev such as bridge or bond. Since the driver ignores events unrelated to its ports and their uppers, it's possible to engineer situations in which the device's data path differs from the kernel's. One example to such a situation is when a port is enslaved to a bond that is already enslaved to a bridge. When the bond was enslaved the driver ignored the event - as the bond wasn't one of its uppers - and therefore a bridge port instance isn't created in the device. Until such configurations are supported forbid them by checking that the upper device doesn't have uppers of its own. Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Nogah Frankel <nogahf@mellanox.com> Tested-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01Fix warning messages when mounting to older serversSteve French
When mounting to older servers, such as Windows XP (or even Windows 7), the limited error messages that can be passed back to user space can get confusing since the default dialect has changed from SMB1 (CIFS) to more secure SMB3 dialect. Log additional information when the user chooses to use the default dialects and when the server does not support the dialect requested. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-08-31Merge tag 'cifs-fixes-for-4.13-rc7-and-stable' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Two cifs bug fixes for stable" * tag 'cifs-fixes-for-4.13-rc7-and-stable' of git://git.samba.org/sfrench/cifs-2.6: CIFS: remove endian related sparse warning CIFS: Fix maximum SMB2 header size
2017-08-31Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Unfortunately a few issues that warrant sending another pull request, even if I had hoped to avoid it. This contains: - A fix for multiqueue xen-blkback, on tear down / disconnect. - A few fixups for NVMe, including a wrong bit definition, fix for host memory buffers, and an nvme rdma page size fix" * 'for-linus' of git://git.kernel.dk/linux-block: nvme: fix the definition of the doorbell buffer config support bit nvme-pci: use dma memory for the host memory buffer descriptors nvme-rdma: default MR page size to 4k xen-blkback: stop blkback thread of every queue in xen_blkif_disconnect
2017-08-31Merge tag 'for-4.13/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - A couple fixes for bugs introduced as part of the blk_status_t block layer changes during the 4.13 merge window - A printk throttling fix to use discrete rate limiting state for each DM log level - A stable@ fix for DM multipath that delays request requeueing to avoid CPU lockup if/when the request queue is "dying" * tag 'for-4.13/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm mpath: do not lock up a CPU with requeuing activity dm: fix printk() rate limiting code dm mpath: retry BLK_STS_RESOURCE errors dm: fix the second dec_pending() argument in __split_and_process_bio()
2017-08-31Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more fixes from Andrew Morton: "6 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: scripts/dtc: fix '%zx' warning include/linux/compiler.h: don't perform compiletime_assert with -O0 mm, madvise: ensure poisoned pages are removed from per-cpu lists mm, uprobes: fix multiple free of ->uprobes_state.xol_area kernel/kthread.c: kthread_worker: don't hog the cpu mm,page_alloc: don't call __node_reclaim() with oom_lock held.
2017-08-31Merge branch 'mmu_notifier_fixes'Linus Torvalds
Merge mmu_notifier fixes from Jérôme Glisse: "The invalidate_page callback suffered from 2 pitfalls. First it used to happen after page table lock was release and thus a new page might have been setup for the virtual address before the call to invalidate_page(). This is in a weird way fixed by commit c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()") which moved the callback under the page table lock. Which also broke several existing user of the mmu_notifier API that assumed they could sleep inside this callback. The second pitfall was invalidate_page being the only callback not taking a range of address in respect to invalidation but was giving an address and a page. Lot of the callback implementer assumed this could never be THP and thus failed to invalidate the appropriate range for THP pages. By killing this callback we unify the mmu_notifier callback API to always take a virtual address range as input. There is now two clear API (I am not mentioning the youngess API which is seldomly used): - invalidate_range_start()/end() callback (which allow you to sleep) - invalidate_range() where you can not sleep but happen right after page table update under page table lock Note that a lot of existing user feels broken in respect to range_start/ range_end. Many user only have range_start() callback but there is nothing preventing them to undo what was invalidated in their range_start() callback after it returns but before any CPU page table update take place. The code pattern use in kvm or umem odp is an example on how to properly avoid such race. In a nutshell use some kind of sequence number and active range invalidation counter to block anything that might undo what the range_start() callback did. If you do not care about keeping fully in sync with CPU page table (ie you can live with CPU page table pointing to new different page for a given virtual address) then you can take a reference on the pages inside the range_start callback and drop it in range_end or when your driver is done with those pages. Last alternative is to use invalidate_range() if you can do invalidation without sleeping as invalidate_range() callback happens under the CPU page table spinlock right after the page table is updated. The first two patches convert existing mmu_notifier_invalidate_page() calls to mmu_notifier_invalidate_range() and bracket those call with call to mmu_notifier_invalidate_range_start()/end(). The next ten patches remove existing invalidate_page() callback as it can no longer happen. Finally the last page remove the invalidate_page() callback completely so it can RIP. Changes since v1: - remove more dead code in kvm (no testing impact) - more accurate end address computation (patch 2) in page_mkclean_one and try_to_unmap_one - added tested-by/reviewed-by gotten so far" * emailed patches from Jérôme Glisse <jglisse@redhat.com>: mm/mmu_notifier: kill invalidate_page KVM: update to new mmu_notifier semantic v2 xen/gntdev: update to new mmu_notifier semantic sgi-gru: update to new mmu_notifier semantic misc/mic/scif: update to new mmu_notifier semantic iommu/intel: update to new mmu_notifier semantic iommu/amd: update to new mmu_notifier semantic IB/hfi1: update to new mmu_notifier semantic IB/umem: update to new mmu_notifier semantic drm/amdgpu: update to new mmu_notifier semantic powerpc/powernv: update to new mmu_notifier semantic mm/rmap: update to new mmu_notifier semantic v2 dax: update to new mmu_notifier semantic
2017-08-31jfs should use MAX_LFS_FILESIZE when calculating s_maxbytesDave Kleikamp
jfs had previously avoided the use of MAX_LFS_FILESIZE because it hadn't accounted for the whole 32-bit index range on 32-bit systems. That has been fixed by commit 0cc3b0ec23ce ("Clarify (and fix) MAX_LFS_FILESIZE macros"), so we can simplify the code now. Suggested by Andreas Dilger. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: jfs-discussion@lists.sourceforge.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31scripts/dtc: fix '%zx' warningRussell King
dtc uses an incorrect format specifier for printing a uint64_t value. uint64_t may be either 'unsigned long' or 'unsigned long long' depending on the host architecture. Fix this by using %llx and casting to unsigned long long, which ensures that we always have a wide enough variable to print 64 bits of hex. HOSTCC scripts/dtc/checks.o scripts/dtc/checks.c: In function 'check_simple_bus_reg': scripts/dtc/checks.c:876:2: warning: format '%zx' expects argument of type 'size_t', but argument 4 has type 'uint64_t' [-Wformat=] snprintf(unit_addr, sizeof(unit_addr), "%zx", reg); ^ scripts/dtc/checks.c:876:2: warning: format '%zx' expects argument of type 'size_t', but argument 4 has type 'uint64_t' [-Wformat=] Link: http://lkml.kernel.org/r/20170829222034.GJ20805@n2100.armlinux.org.uk Fixes: 828d4cdd012c ("dtc: check.c fix compile error") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31include/linux/compiler.h: don't perform compiletime_assert with -O0Joe Stringer
Commit c7acec713d14 ("kernel.h: handle pointers to arrays better in container_of()") made use of __compiletime_assert() from container_of() thus increasing the usage of this macro, allowing developers to notice type conflicts in usage of container_of() at compile time. However, the implementation of __compiletime_assert relies on compiler optimizations to report an error. This means that if a developer uses "-O0" with any code that performs container_of(), the compiler will always report an error regardless of whether there is an actual problem in the code. This patch disables compile_time_assert when optimizations are disabled to allow such code to compile with CFLAGS="-O0". Example compilation failure: ./include/linux/compiler.h:547:38: error: call to `__compiletime_assert_94' declared with attribute error: pointer type mismatch in container_of() _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ ./include/linux/compiler.h:530:4: note: in definition of macro `__compiletime_assert' prefix ## suffix(); \ ^~~~~~ ./include/linux/compiler.h:547:2: note: in expansion of macro `_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^~~~~~~~~~~~~~~~~~~ ./include/linux/build_bug.h:46:37: note: in expansion of macro `compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ ./include/linux/kernel.h:860:2: note: in expansion of macro `BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \ ^~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: use do{}while(0), per Michal] Link: http://lkml.kernel.org/r/20170829230114.11662-1-joe@ovn.org Fixes: c7acec713d14c6c ("kernel.h: handle pointers to arrays better in container_of()") Signed-off-by: Joe Stringer <joe@ovn.org> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31mm, madvise: ensure poisoned pages are removed from per-cpu listsMel Gorman
Wendy Wang reported off-list that a RAS HWPOISON-SOFT test case failed and bisected it to the commit 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP"). The problem is that a page that was poisoned with madvise() is reused. The commit removed a check that would trigger if DEBUG_VM was enabled but re-enabling the check only fixes the problem as a side-effect by printing a bad_page warning and recovering. The root of the problem is that an madvise() can leave a poisoned page on the per-cpu list. This patch drains all per-cpu lists after pages are poisoned so that they will not be reused. Wendy reports that the test case in question passes with this patch applied. While this could be done in a targeted fashion, it is over-complicated for such a rare operation. Link: http://lkml.kernel.org/r/20170828133414.7qro57jbepdcyz5x@techsingularity.net Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Wang, Wendy <wendy.wang@intel.com> Tested-by: Wang, Wendy <wendy.wang@intel.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Hansen, Dave" <dave.hansen@intel.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31mm, uprobes: fix multiple free of ->uprobes_state.xol_areaEric Biggers
Commit 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write killable") made it possible to kill a forking task while it is waiting to acquire its ->mmap_sem for write, in dup_mmap(). However, it was overlooked that this introduced an new error path before the new mm_struct's ->uprobes_state.xol_area has been set to NULL after being copied from the old mm_struct by the memcpy in dup_mm(). For a task that has previously hit a uprobe tracepoint, this resulted in the 'struct xol_area' being freed multiple times if the task was killed at just the right time while forking. Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather than in uprobe_dup_mmap(). With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C program given by commit 2b7e8665b4ff ("fork: fix incorrect fput of ->exe_file causing use-after-free"), provided that a uprobe tracepoint has been set on the fork_thread() function. For example: $ gcc reproducer.c -o reproducer -lpthread $ nm reproducer | grep fork_thread 0000000000400719 t fork_thread $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable $ ./reproducer Here is the use-after-free reported by KASAN: BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200 Read of size 8 at addr ffff8800320a8b88 by task reproducer/198 CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 #255 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 Call Trace: dump_stack+0xdb/0x185 print_address_description+0x7e/0x290 kasan_report+0x23b/0x350 __asan_report_load8_noabort+0x19/0x20 uprobe_clear_state+0x1c4/0x200 mmput+0xd6/0x360 do_exit+0x740/0x1670 do_group_exit+0x13f/0x380 get_signal+0x597/0x17d0 do_signal+0x99/0x1df0 exit_to_usermode_loop+0x166/0x1e0 syscall_return_slowpath+0x258/0x2c0 entry_SYSCALL_64_fastpath+0xbc/0xbe ... Allocated by task 199: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xfc/0x180 kmem_cache_alloc_trace+0xf3/0x330 __create_xol_area+0x10f/0x780 uprobe_notify_resume+0x1674/0x2210 exit_to_usermode_loop+0x150/0x1e0 prepare_exit_to_usermode+0x14b/0x180 retint_user+0x8/0x20 Freed by task 199: save_stack_trace+0x1b/0x20 kasan_slab_free+0xa8/0x1a0 kfree+0xba/0x210 uprobe_clear_state+0x151/0x200 mmput+0xd6/0x360 copy_process.part.8+0x605f/0x65d0 _do_fork+0x1a5/0xbd0 SyS_clone+0x19/0x20 do_syscall_64+0x22f/0x660 return_from_SYSCALL_64+0x0/0x7a Note: without KASAN, you may instead see a "Bad page state" message, or simply a general protection fault. Link: http://lkml.kernel.org/r/20170830033303.17927-1-ebiggers3@gmail.com Fixes: 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write killable") Signed-off-by: Eric Biggers <ebiggers@google.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.7+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31kernel/kthread.c: kthread_worker: don't hog the cpuShaohua Li
If the worker thread continues getting work, it will hog the cpu and rcu stall complains. Make it a good citizen. This is triggered in a loop block device test. Link: http://lkml.kernel.org/r/5de0a179b3184e1a2183fc503448b0269f24d75b.1503697127.git.shli@fb.com Signed-off-by: Shaohua Li <shli@fb.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31mm,page_alloc: don't call __node_reclaim() with oom_lock held.Tetsuo Handa
We are doing a last second memory allocation attempt before calling out_of_memory(). But since slab shrinker functions might indirectly wait for other thread's __GFP_DIRECT_RECLAIM && !__GFP_NORETRY memory allocations via sleeping locks, calling slab shrinker functions from node_reclaim() from get_page_from_freelist() with oom_lock held has possibility of deadlock. Therefore, make sure that last second memory allocation attempt does not call slab shrinker functions. Link: http://lkml.kernel.org/r/1503577106-9196-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31mm/mmu_notifier: kill invalidate_pageJérôme Glisse
The invalidate_page callback suffered from two pitfalls. First it used to happen after the page table lock was release and thus a new page might have setup before the call to invalidate_page() happened. This is in a weird way fixed by commit c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()") that moved the callback under the page table lock but this also broke several existing users of the mmu_notifier API that assumed they could sleep inside this callback. The second pitfall was invalidate_page() being the only callback not taking a range of address in respect to invalidation but was giving an address and a page. Lots of the callback implementers assumed this could never be THP and thus failed to invalidate the appropriate range for THP. By killing this callback we unify the mmu_notifier callback API to always take a virtual address range as input. Finally this also simplifies the end user life as there is now two clear choices: - invalidate_range_start()/end() callback (which allow you to sleep) - invalidate_range() where you can not sleep but happen right after page table update under page table lock Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31KVM: update to new mmu_notifier semantic v2Jérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Changed since v1 (Linus Torvalds) - remove now useless kvm_arch_mmu_notifier_invalidate_page() Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Mike Galbraith <efault@gmx.de> Tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31xen/gntdev: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: xen-devel@lists.xenproject.org (moderated for non-subscribers) Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31sgi-gru: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31misc/mic/scif: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31iommu/intel: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31iommu/amd: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31IB/hfi1: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: linux-rdma@vger.kernel.org Cc: Dean Luick <dean.luick@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31IB/umem: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Leon Romanovsky <leonro@mellanox.com> Cc: linux-rdma@vger.kernel.org Cc: Artemy Kovalyov <artemyko@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31drm/amdgpu: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31powerpc/powernv: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and now are bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Alistair Popple <alistair@popple.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31mm/rmap: update to new mmu_notifier semantic v2Jérôme Glisse
Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range() and make sure it is bracketed by calls to *_invalidate_range_start()/end(). Note that because we can not presume the pmd value or pte value we have to assume the worst and unconditionaly report an invalidation as happening. Changed since v2: - try_to_unmap_one() only one call to mmu_notifier_invalidate_range() - compute end with PAGE_SIZE << compound_order(page) - fix PageHuge() case in try_to_unmap_one() Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31dax: update to new mmu_notifier semanticJérôme Glisse
Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range() and make sure it is bracketed by calls to *_invalidate_range_start()/end(). Note that because we can not presume the pmd value or pte value we have to assume the worst and unconditionaly report an invalidation as happening. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-01ceph: fix readpage from fscacheYan, Zheng
ceph_readpage() unlocks page prematurely prematurely in the case that page is reading from fscache. Caller of readpage expects that page is uptodate when it get unlocked. So page shoule get locked by completion callback of fscache_read_or_alloc_pages() Cc: stable@vger.kernel.org # 4.1+, needs backporting for < 4.7 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-08-31wl1251: add a missing spin_lock_init()Cong Wang
wl1251: add a missing spin_lock_init() This fixes the following kernel warning: [ 5668.771453] BUG: spinlock bad magic on CPU#0, kworker/u2:3/9745 [ 5668.771850] lock: 0xce63ef20, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 [ 5668.772277] CPU: 0 PID: 9745 Comm: kworker/u2:3 Tainted: G W 4.12.0-03002-gec979a4-dirty #40 [ 5668.772796] Hardware name: Nokia RX-51 board [ 5668.773071] Workqueue: phy1 wl1251_irq_work [ 5668.773345] [<c010c9e4>] (unwind_backtrace) from [<c010a274>] (show_stack+0x10/0x14) [ 5668.773803] [<c010a274>] (show_stack) from [<c01545a4>] (do_raw_spin_lock+0x6c/0xa0) [ 5668.774230] [<c01545a4>] (do_raw_spin_lock) from [<c06ca578>] (_raw_spin_lock_irqsave+0x10/0x18) [ 5668.774658] [<c06ca578>] (_raw_spin_lock_irqsave) from [<c048c010>] (wl1251_op_tx+0x38/0x5c) [ 5668.775115] [<c048c010>] (wl1251_op_tx) from [<c06a12e8>] (ieee80211_tx_frags+0x188/0x1c0) [ 5668.775543] [<c06a12e8>] (ieee80211_tx_frags) from [<c06a138c>] (__ieee80211_tx+0x6c/0x130) [ 5668.775970] [<c06a138c>] (__ieee80211_tx) from [<c06a3dbc>] (ieee80211_tx+0xdc/0x104) [ 5668.776367] [<c06a3dbc>] (ieee80211_tx) from [<c06a4af0>] (__ieee80211_subif_start_xmit+0x454/0x8c8) [ 5668.776824] [<c06a4af0>] (__ieee80211_subif_start_xmit) from [<c06a4f94>] (ieee80211_subif_start_xmit+0x30/0x2fc) [ 5668.777343] [<c06a4f94>] (ieee80211_subif_start_xmit) from [<c0578848>] (dev_hard_start_xmit+0x80/0x118) ... by adding the missing spin_lock_init(). Reported-by: Pavel Machek <pavel@ucw.cz> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Pavel Machek <pavel@ucw.cz> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-31Input: xpad - fix PowerA init quirk for some gamepad modelsCameron Gutman
The PowerA gamepad initialization quirk worked with the PowerA wired gamepad I had around (0x24c6:0x543a), but a user reported [0] that it didn't work for him, even though our gamepads shared the same vendor and product IDs. When I initially implemented the PowerA quirk, I wanted to avoid actually triggering the rumble action during init. My tests showed that my gamepad would work correctly even if it received a rumble of 0 intensity, so that's what I went with. Unfortunately, this apparently isn't true for all models (perhaps a firmware difference?). This non-working gamepad seems to require the real magic rumble packet that the Microsoft driver sends, which actually vibrates the gamepad. To counteract this effect, I still send the old zero-rumble PowerA quirk packet which cancels the rumble effect before the motors can spin up enough to vibrate. [0]: https://github.com/paroj/xpad/issues/48#issuecomment-313904867 Reported-by: Kyle Beauchamp <kyleabeauchamp@gmail.com> Tested-by: Kyle Beauchamp <kyleabeauchamp@gmail.com> Fixes: 81093c9848a7 ("Input: xpad - support some quirky Xbox One pads") Cc: stable@vger.kernel.org # v4.12 Signed-off-by: Cameron Gutman <aicommander@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-31i2c: designware: Round down ACPI provided clk to nearest supported clkHans de Goede
The Lenovo Miix2 8 DSDT contains an i2c clk / bus speed of 1700000 Hz for one if its devices, which is not supported. This is the second DSDT to show up with an unsupported clk in a short time, remove the hardcoded fix for DSDTs with a 1 MiHz clock and simply always round down the clk to the nearest supported value. Reported-by: russianneuromancer@ya.ru Fixes: 682c6c2188 ("i2c: designware: Some broken DSTDs use 1MiHz ...") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-31Merge tag 'asoc-fix-v4.13-rc7' of ↵Takashi Iwai
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v4.13 A couple of fixes, one for a regression in simple-card introduced during the merge window that was only reported this week and another for a regression in registration of ACPI GPIOs.
2017-08-31Merge tag 'vfio-ccw-20170724' of ↵Martin Schwidefsky
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into fixes Pull vfio-ccw fix from Cornelia Huck: "A bugfix in the ccw translation code."
2017-08-31s390/mm: fix BUG_ON in crst_table_upgradeMartin Schwidefsky
A 31-bit compat process can force a BUG_ON in crst_table_upgrade with specific, invalid mmap calls, e.g. mmap((void*) 0x7fff8000, 0x10000, 3, 32, -1, 0) The arch_get_unmapped_area[_topdown] functions miss an if condition in the decision to do a page table upgrade. Fixes: 9b11c7912d00 ("s390/mm: simplify arch_get_unmapped_area[_topdown]") Cc: <stable@vger.kernel.org> # v4.12+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-31s390/mm: fork vs. 5 level page tabelMartin Schwidefsky
The mm->context.asce field of a new process is not set up correctly in case of a fork with a 5 level page table. Add the missing case to init_new_context(). Fixes: 1aea9b3f9210 ("s390/mm: implement 5 level pages tables") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-31Merge remote-tracking branch 'asoc/fix/rt5670' into asoc-fixesMark Brown
2017-08-30Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"Florian Fainelli
This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()") because it is creating the possibility for a NULL pointer dereference. David Daney provide the following call trace and diagram of events: When ndo_stop() is called we call: phy_disconnect() +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL; +---> phy_stop_machine() | +---> phy_state_machine() | +----> queue_delayed_work(): Work queued. +--->phy_detach() implies: phydev->attached_dev = NULL; Now at a later time the queued work does: phy_state_machine() +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL: CPU 12 Unable to handle kernel paging request at virtual address 0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c Oops[#1]: CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1 Workqueue: events_power_efficient phy_state_machine task: 80000004021ed100 task.stack: 8000000409d70000 $ 0 : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004 $ 4 : 0000000000000000 0000000000000001 0000000000000004 0000000000000000 $ 8 : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000 $12 : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b $16 : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800 $20 : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008 $24 : 0000000000000061 ffffffff808637b0 $28 : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c Hi : 000000000000002a Lo : 000000000000003f epc : ffffffff80de37ec netif_carrier_off+0xc/0x58 ra : ffffffff80c7804c phy_state_machine+0x48c/0x4f8 Status: 14009ce3 KX SX UX KERNEL EXL IE Cause : 00800008 (ExcCode 02) BadVA : 0000000000000048 PrId : 000d9501 (Cavium Octeon III) Modules linked in: Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000, task=80000004021ed100, tls=0000000000000000) Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00 0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300 80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001 ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000 8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000 ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240 0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00 8000000409a54000 00000000810a1d00 0000000000000000 8000000409d73de8 8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08 8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080 ... Call Trace: [<ffffffff80de37ec>] netif_carrier_off+0xc/0x58 [<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8 [<ffffffff808a1708>] process_one_work+0x158/0x368 [<ffffffff808a1ac0>] worker_thread+0x150/0x4c0 [<ffffffff808a8598>] kthread+0xc8/0xe0 [<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c The original motivation for this change originated from Marc Gonzales indicating that his network driver did not have its adjust_link callback executing with phydev->link = 0 while he was expecting it. PHYLIB has never made any such guarantees ever because phy_stop() merely just tells the workqueue to move into PHY_HALTED state which will happen asynchronously. Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Reported-by: David Daney <ddaney.cavm@gmail.com> Fixes: 7ad813f20853 ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30Merge tag 'mlx5-fixes-2017-08-30' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-08-30 This series contains some misc fixes to the mlx5 driver. Please pull and let me know if there's any problem. For -stable: Kernels >= 4.12 net/mlx5e: Fix CQ moderation mode not set properly net/mlx5e: Don't override user RSS upon set channels Kernels >= 4.11 net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address Kernels >= 4.10 net/mlx5e: Fix DCB_CAP_ATTR_DCBX capability for DCBNL getcap net/mlx5e: Check for qos capability in dcbnl_initialize Kernels >= 4.9 net/mlx5e: Fix dangling page pointer on DMA mapping error Kernels >= 4.8 net/mlx5e: Fix inline header size for small packets net/mlx5: E-Switch, Unload the representors in the correct order net/mlx5: Fix arm SRQ command for ISSI version 0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30net: dsa: bcm_sf2: Fix number of CFP entries for BCM7278Florian Fainelli
BCM7278 has only 128 entries while BCM7445 has the full 256 entries set, fix that. Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30kcm: do not attach PF_KCM sockets to avoid deadlockEric Dumazet
syzkaller had no problem to trigger a deadlock, attaching a KCM socket to another one (or itself). (original syzkaller report was a very confusing lockdep splat during a sendmsg()) It seems KCM claims to only support TCP, but no enforcement is done, so we might need to add additional checks. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Dan Williams: "A single patch removing some structure definitions from a uapi header file. These payloads are never processed directly by the kernel they are simply passed through an ioctl as opaque blobs to the ACPI _DSM (Device Specific Method) interface. Userspace should not be depending on the kernel to define these payloads. We will instead provide these definitions via the existing libndctl (https://github.com/pmem/ndctl) project that has NVDIMM command helpers and other definitions" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm: clean up command definitions
2017-08-30Merge branch 'net-sched-init-failure-fixes'David S. Miller
Nikolay Aleksandrov says: ==================== net/sched: init failure fixes I went over all qdiscs' init, destroy and reset callbacks and found the issues fixed in each patch. Mostly they are null pointer dereferences due to uninitialized timer (qdisc watchdog) or double frees due to ->destroy cleaning up a second time. There's more information in each patch. I've tested these by either sending wrong attributes from user-spaces, no attributes or by simulating memory alloc failure where applicable. Also tried all of the qdiscs as a default qdisc. Most of these bugs were present before commit 87b60cfacf9f, I've tried to include proper fixes tags in each patch. I haven't included individual patch acks in the set, I'd appreciate it if you take another look and resend them. ==================== Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_tbf: fix two null pointer dereferences on init failureNikolay Aleksandrov
sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy callbacks but it may fail before the timer is initialized due to missing options (either not supplied by user-space or set as a default qdisc), also q->qdisc is used by ->reset and ->destroy so we need it initialized. Reproduce: $ sysctl net.core.default_qdisc=tbf $ ip l set ethX up Crash log: [ 959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 959.160323] IP: qdisc_reset+0xa/0x5c [ 959.160400] PGD 59cdb067 [ 959.160401] P4D 59cdb067 [ 959.160466] PUD 59ccb067 [ 959.160532] PMD 0 [ 959.160597] [ 959.160706] Oops: 0000 [#1] SMP [ 959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem [ 959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62 [ 959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000 [ 959.161263] RIP: 0010:qdisc_reset+0xa/0x5c [ 959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286 [ 959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000 [ 959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000 [ 959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff [ 959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0 [ 959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001 [ 959.162546] FS: 00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000 [ 959.162844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0 [ 959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 959.163638] Call Trace: [ 959.163788] tbf_reset+0x19/0x64 [sch_tbf] [ 959.163957] qdisc_destroy+0x8b/0xe5 [ 959.164119] qdisc_create_dflt+0x86/0x94 [ 959.164284] ? dev_activate+0x129/0x129 [ 959.164449] attach_one_default_qdisc+0x36/0x63 [ 959.164623] netdev_for_each_tx_queue+0x3d/0x48 [ 959.164795] dev_activate+0x4b/0x129 [ 959.164957] __dev_open+0xe7/0x104 [ 959.165118] __dev_change_flags+0xc6/0x15c [ 959.165287] dev_change_flags+0x25/0x59 [ 959.165451] do_setlink+0x30c/0xb3f [ 959.165613] ? check_chain_key+0xb0/0xfd [ 959.165782] rtnl_newlink+0x3a4/0x729 [ 959.165947] ? rtnl_newlink+0x117/0x729 [ 959.166121] ? ns_capable_common+0xd/0xb1 [ 959.166288] ? ns_capable+0x13/0x15 [ 959.166450] rtnetlink_rcv_msg+0x188/0x197 [ 959.166617] ? rcu_read_unlock+0x3e/0x5f [ 959.166783] ? rtnl_newlink+0x729/0x729 [ 959.166948] netlink_rcv_skb+0x6c/0xce [ 959.167113] rtnetlink_rcv+0x23/0x2a [ 959.167273] netlink_unicast+0x103/0x181 [ 959.167439] netlink_sendmsg+0x326/0x337 [ 959.167607] sock_sendmsg_nosec+0x14/0x3f [ 959.167772] sock_sendmsg+0x29/0x2e [ 959.167932] ___sys_sendmsg+0x209/0x28b [ 959.168098] ? do_raw_spin_unlock+0xcd/0xf8 [ 959.168267] ? _raw_spin_unlock+0x27/0x31 [ 959.168432] ? __handle_mm_fault+0x651/0xdb1 [ 959.168602] ? check_chain_key+0xb0/0xfd [ 959.168773] __sys_sendmsg+0x45/0x63 [ 959.168934] ? __sys_sendmsg+0x45/0x63 [ 959.169100] SyS_sendmsg+0x19/0x1b [ 959.169260] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 959.169432] RIP: 0033:0x7fcc5097e690 [ 959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690 [ 959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003 [ 959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003 [ 959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006 [ 959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000 [ 959.170900] ? trace_hardirqs_off_caller+0xa7/0xcf [ 959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24 98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89 e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb [ 959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610 [ 959.171821] CR2: 0000000000000018 Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_sfq: fix null pointer dereference on init failureNikolay Aleksandrov
Currently only a memory allocation failure can lead to this, so let's initialize the timer first. Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_netem: avoid null pointer deref on init failureNikolay Aleksandrov
netem can fail in ->init due to missing options (either not supplied by user-space or used as a default qdisc) causing a timer->base null pointer deref in its ->destroy() and ->reset() callbacks. Reproduce: $ sysctl net.core.default_qdisc=netem $ ip l set ethX up Crash log: [ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1814.847181] IP: hrtimer_active+0x17/0x8a [ 1814.847270] PGD 59c34067 [ 1814.847271] P4D 59c34067 [ 1814.847337] PUD 37374067 [ 1814.847403] PMD 0 [ 1814.847468] [ 1814.847582] Oops: 0000 [#1] SMP [ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O) [ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G O 4.13.0-rc6+ #62 [ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000 [ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a [ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246 [ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000 [ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8 [ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff [ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000 [ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001 [ 1814.849616] FS: 00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 1814.849919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0 [ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1814.850723] Call Trace: [ 1814.850875] hrtimer_try_to_cancel+0x1a/0x93 [ 1814.851047] hrtimer_cancel+0x15/0x20 [ 1814.851211] qdisc_watchdog_cancel+0x12/0x14 [ 1814.851383] netem_reset+0xe6/0xed [sch_netem] [ 1814.851561] qdisc_destroy+0x8b/0xe5 [ 1814.851723] qdisc_create_dflt+0x86/0x94 [ 1814.851890] ? dev_activate+0x129/0x129 [ 1814.852057] attach_one_default_qdisc+0x36/0x63 [ 1814.852232] netdev_for_each_tx_queue+0x3d/0x48 [ 1814.852406] dev_activate+0x4b/0x129 [ 1814.852569] __dev_open+0xe7/0x104 [ 1814.852730] __dev_change_flags+0xc6/0x15c [ 1814.852899] dev_change_flags+0x25/0x59 [ 1814.853064] do_setlink+0x30c/0xb3f [ 1814.853228] ? check_chain_key+0xb0/0xfd [ 1814.853396] ? check_chain_key+0xb0/0xfd [ 1814.853565] rtnl_newlink+0x3a4/0x729 [ 1814.853728] ? rtnl_newlink+0x117/0x729 [ 1814.853905] ? ns_capable_common+0xd/0xb1 [ 1814.854072] ? ns_capable+0x13/0x15 [ 1814.854234] rtnetlink_rcv_msg+0x188/0x197 [ 1814.854404] ? rcu_read_unlock+0x3e/0x5f [ 1814.854572] ? rtnl_newlink+0x729/0x729 [ 1814.854737] netlink_rcv_skb+0x6c/0xce [ 1814.854902] rtnetlink_rcv+0x23/0x2a [ 1814.855064] netlink_unicast+0x103/0x181 [ 1814.855230] netlink_sendmsg+0x326/0x337 [ 1814.855398] sock_sendmsg_nosec+0x14/0x3f [ 1814.855584] sock_sendmsg+0x29/0x2e [ 1814.855747] ___sys_sendmsg+0x209/0x28b [ 1814.855912] ? do_raw_spin_unlock+0xcd/0xf8 [ 1814.856082] ? _raw_spin_unlock+0x27/0x31 [ 1814.856251] ? __handle_mm_fault+0x651/0xdb1 [ 1814.856421] ? check_chain_key+0xb0/0xfd [ 1814.856592] __sys_sendmsg+0x45/0x63 [ 1814.856755] ? __sys_sendmsg+0x45/0x63 [ 1814.856923] SyS_sendmsg+0x19/0x1b [ 1814.857083] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 1814.857256] RIP: 0033:0x7f733b2dd690 [ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690 [ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003 [ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003 [ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002 [ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000 [ 1814.859267] ? trace_hardirqs_off_caller+0xa7/0xcf [ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3 31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b 45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89 [ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590 [ 1814.860214] CR2: 0000000000000000 Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_fq_codel: avoid double free on init failureNikolay Aleksandrov
It is very unlikely to happen but the backlogs memory allocation could fail and will free q->flows, but then ->destroy() will free q->flows too. For correctness remove the first free and let ->destroy clean up. Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_cbq: fix null pointer dereferences on init failureNikolay Aleksandrov
CBQ can fail on ->init by wrong nl attributes or simply for missing any, f.e. if it's set as a default qdisc then TCA_OPTIONS (opt) will be NULL when it is activated. The first thing init does is parse opt but it will dereference a null pointer if used as a default qdisc, also since init failure at default qdisc invokes ->reset() which cancels all timers then we'll also dereference two more null pointers (timer->base) as they were never initialized. To reproduce: $ sysctl net.core.default_qdisc=cbq $ ip l set ethX up Crash log of the first null ptr deref: [44727.907454] BUG: unable to handle kernel NULL pointer dereference at (null) [44727.907600] IP: cbq_init+0x27/0x205 [44727.907676] PGD 59ff4067 [44727.907677] P4D 59ff4067 [44727.907742] PUD 59c70067 [44727.907807] PMD 0 [44727.907873] [44727.907982] Oops: 0000 [#1] SMP [44727.908054] Modules linked in: [44727.908126] CPU: 1 PID: 21312 Comm: ip Not tainted 4.13.0-rc6+ #60 [44727.908235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [44727.908477] task: ffff88005ad42700 task.stack: ffff880037214000 [44727.908672] RIP: 0010:cbq_init+0x27/0x205 [44727.908838] RSP: 0018:ffff8800372175f0 EFLAGS: 00010286 [44727.909018] RAX: ffffffff816c3852 RBX: ffff880058c53800 RCX: 0000000000000000 [44727.909222] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8800372175f8 [44727.909427] RBP: ffff880037217650 R08: ffffffff81b0f380 R09: 0000000000000000 [44727.909631] R10: ffff880037217660 R11: 0000000000000020 R12: ffffffff822a44c0 [44727.909835] R13: ffff880058b92000 R14: 00000000ffffffff R15: 0000000000000001 [44727.910040] FS: 00007ff8bc583740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000 [44727.910339] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [44727.910525] CR2: 0000000000000000 CR3: 00000000371e5000 CR4: 00000000000406e0 [44727.910731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [44727.910936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [44727.911141] Call Trace: [44727.911291] ? lockdep_init_map+0xb6/0x1ba [44727.911461] ? qdisc_alloc+0x14e/0x187 [44727.911626] qdisc_create_dflt+0x7a/0x94 [44727.911794] ? dev_activate+0x129/0x129 [44727.911959] attach_one_default_qdisc+0x36/0x63 [44727.912132] netdev_for_each_tx_queue+0x3d/0x48 [44727.912305] dev_activate+0x4b/0x129 [44727.912468] __dev_open+0xe7/0x104 [44727.912631] __dev_change_flags+0xc6/0x15c [44727.912799] dev_change_flags+0x25/0x59 [44727.912966] do_setlink+0x30c/0xb3f [44727.913129] ? check_chain_key+0xb0/0xfd [44727.913294] ? check_chain_key+0xb0/0xfd [44727.913463] rtnl_newlink+0x3a4/0x729 [44727.913626] ? rtnl_newlink+0x117/0x729 [44727.913801] ? ns_capable_common+0xd/0xb1 [44727.913968] ? ns_capable+0x13/0x15 [44727.914131] rtnetlink_rcv_msg+0x188/0x197 [44727.914300] ? rcu_read_unlock+0x3e/0x5f [44727.914465] ? rtnl_newlink+0x729/0x729 [44727.914630] netlink_rcv_skb+0x6c/0xce [44727.914796] rtnetlink_rcv+0x23/0x2a [44727.914956] netlink_unicast+0x103/0x181 [44727.915122] netlink_sendmsg+0x326/0x337 [44727.915291] sock_sendmsg_nosec+0x14/0x3f [44727.915459] sock_sendmsg+0x29/0x2e [44727.915619] ___sys_sendmsg+0x209/0x28b [44727.915784] ? do_raw_spin_unlock+0xcd/0xf8 [44727.915954] ? _raw_spin_unlock+0x27/0x31 [44727.916121] ? __handle_mm_fault+0x651/0xdb1 [44727.916290] ? check_chain_key+0xb0/0xfd [44727.916461] __sys_sendmsg+0x45/0x63 [44727.916626] ? __sys_sendmsg+0x45/0x63 [44727.916792] SyS_sendmsg+0x19/0x1b [44727.916950] entry_SYSCALL_64_fastpath+0x23/0xc2 [44727.917125] RIP: 0033:0x7ff8bbc96690 [44727.917286] RSP: 002b:00007ffc360991e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [44727.917579] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007ff8bbc96690 [44727.917783] RDX: 0000000000000000 RSI: 00007ffc36099230 RDI: 0000000000000003 [44727.917987] RBP: ffff880037217f98 R08: 0000000000000001 R09: 0000000000000003 [44727.918190] R10: 00007ffc36098fb0 R11: 0000000000000246 R12: 0000000000000006 [44727.918393] R13: 000000000066f1a0 R14: 00007ffc360a12e0 R15: 0000000000000000 [44727.918597] ? trace_hardirqs_off_caller+0xa7/0xcf [44727.918774] Code: 41 5f 5d c3 66 66 66 66 90 55 48 8d 56 04 45 31 c9 49 c7 c0 80 f3 b0 81 48 89 e5 41 55 41 54 53 48 89 fb 48 8d 7d a8 48 83 ec 48 <0f> b7 0e be 07 00 00 00 83 e9 04 e8 e6 f7 d8 ff 85 c0 0f 88 bb [44727.919332] RIP: cbq_init+0x27/0x205 RSP: ffff8800372175f0 [44727.919516] CR2: 0000000000000000 Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_hfsc: fix null pointer deref and double free on init failureNikolay Aleksandrov
Depending on where ->init fails we can get a null pointer deref due to uninitialized hires timer (watchdog) or a double free of the qdisc hash because it is already freed by ->destroy(). Fixes: 8d5537387505 ("net/sched/hfsc: allocate tcf block for hfsc root class") Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30sch_hhf: fix null pointer dereference on init failureNikolay Aleksandrov
If sch_hhf fails in its ->init() function (either due to wrong user-space arguments as below or memory alloc failure of hh_flows) it will do a null pointer deref of q->hh_flows in its ->destroy() function. To reproduce the crash: $ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000 Crash log: [ 690.654882] BUG: unable to handle kernel NULL pointer dereference at (null) [ 690.655565] IP: hhf_destroy+0x48/0xbc [ 690.655944] PGD 37345067 [ 690.655948] P4D 37345067 [ 690.656252] PUD 58402067 [ 690.656554] PMD 0 [ 690.656857] [ 690.657362] Oops: 0000 [#1] SMP [ 690.657696] Modules linked in: [ 690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57 [ 690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 690.659255] task: ffff880058578000 task.stack: ffff88005acbc000 [ 690.659747] RIP: 0010:hhf_destroy+0x48/0xbc [ 690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246 [ 690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000 [ 690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0 [ 690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000 [ 690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea [ 690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000 [ 690.663769] FS: 00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 690.667069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0 [ 690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 690.671003] Call Trace: [ 690.671743] qdisc_create+0x377/0x3fd [ 690.672534] tc_modify_qdisc+0x4d2/0x4fd [ 690.673324] rtnetlink_rcv_msg+0x188/0x197 [ 690.674204] ? rcu_read_unlock+0x3e/0x5f [ 690.675091] ? rtnl_newlink+0x729/0x729 [ 690.675877] netlink_rcv_skb+0x6c/0xce [ 690.676648] rtnetlink_rcv+0x23/0x2a [ 690.677405] netlink_unicast+0x103/0x181 [ 690.678179] netlink_sendmsg+0x326/0x337 [ 690.678958] sock_sendmsg_nosec+0x14/0x3f [ 690.679743] sock_sendmsg+0x29/0x2e [ 690.680506] ___sys_sendmsg+0x209/0x28b [ 690.681283] ? __handle_mm_fault+0xc7d/0xdb1 [ 690.681915] ? check_chain_key+0xb0/0xfd [ 690.682449] __sys_sendmsg+0x45/0x63 [ 690.682954] ? __sys_sendmsg+0x45/0x63 [ 690.683471] SyS_sendmsg+0x19/0x1b [ 690.683974] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 690.684516] RIP: 0033:0x7f8ae529d690 [ 690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690 [ 690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003 [ 690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000 [ 690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002 [ 690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000 [ 690.688475] ? trace_hardirqs_off_caller+0xa7/0xcf [ 690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83 c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02 00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1 [ 690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0 [ 690.690636] CR2: 0000000000000000 Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>