linux.git - Linus' kernel tree

Age	Commit message	Author
2020-04-19	mm: Fix MREMAP_DONTUNMAP accounting on VMA merge	Brian Geffon
	When remapping a mapping where a portion of a VMA is remapped into another portion of the VMA it can cause the VMA to become split. During the copy_vma operation the VMA can actually be remerged if it's an anonymous VMA whose pages have not yet been faulted. This isn't normally a problem because at the end of the remap the original portion is unmapped causing it to become split again. However, MREMAP_DONTUNMAP leaves that original portion in place which means that the VMA which was split and then remerged is not actually split at the end of the mremap. This patch fixes a bug where we don't detect that the VMAs got remerged and we end up putting back VM_ACCOUNT on the next mapping which is completely unrelated. When that next mapping is unmapped it results in incorrectly unaccounting for the memory which was never accounted, and eventually we will underflow on the memory commitment. There is also another issue which is similar, we're currently accounting for the number of pages in the new_vma but that's wrong. We need to account for the length of the remap operation as that's all that is being added. If there was a mapping already at that location its commitment would have been adjusted as part of the munmap at the start of the mremap. A really simple repro can be seen in: https://gist.github.com/bgaff/e101ce99da7d9a8c60acc641d07f312c Fixes: e346b3813067 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Brian Geffon <bgeffon@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
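The second accounting issue described above can be illustrated with a toy model. This is a user-space sketch only, not the kernel code: `struct toy_vma`, `charge_by_vma()` and `charge_by_len()` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a VMA: a half-open address range. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/* Behaviour described as wrong: charge every page of the resulting VMA. */
static unsigned long charge_by_vma(const struct toy_vma *new_vma)
{
	return (new_vma->end - new_vma->start) / TOY_PAGE_SIZE;
}

/* Behaviour described as correct: charge only the length being remapped. */
static unsigned long charge_by_len(unsigned long new_len)
{
	return new_len / TOY_PAGE_SIZE;
}
```

If the destination VMA was remerged into a three-page VMA while only one page was remapped, `charge_by_vma()` charges three pages and `charge_by_len()` charges one, which is the over-accounting the commit message describes.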
2020-04-19	Merge tag 'clk-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two build fixes for a couple clk drivers and a fix for the Unisoc serial clk where we want to keep it on for earlycon" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sprd: don't gate uart console clock clk: mmp2: fix link error without mmp2 clk: asm9260: fix __clk_hw_register_fixed_rate_with_accuracy typo
2020-04-19	io_uring: only restore req->work for req that needs do completion	Xiaoguang Wang
	When testing io_uring IORING_FEAT_FAST_POLL feature, I got below panic: BUG: kernel NULL pointer dereference, address: 0000000000000030 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 5 PID: 2154 Comm: io_uring_echo_s Not tainted 5.6.0+ #359 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:io_wq_submit_work+0xf/0xa0 Code: ff ff ff be 02 00 00 00 e8 ae c9 19 00 e9 58 ff ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 2f <8b> 45 30 48 8d 9d 48 ff ff ff 25 01 01 00 00 83 f8 01 75 07 eb 2a RSP: 0018:ffffbef543e93d58 EFLAGS: 00010286 RAX: ffffffff84364f50 RBX: ffffa3eb50f046b8 RCX: 0000000000000000 RDX: ffffa3eb0efc1840 RSI: 0000000000000006 RDI: ffffa3eb50f046b8 RBP: 0000000000000000 R08: 00000000fffd070d R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffa3eb50f046b8 R13: ffffa3eb0efc2088 R14: ffffffff85b69be0 R15: ffffa3eb0effa4b8 FS: 00007fe9f69cc4c0(0000) GS:ffffa3eb5ef40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000020410000 CR4: 00000000000006e0 Call Trace: task_work_run+0x6d/0xa0 do_exit+0x39a/0xb80 ? get_signal+0xfe/0xbc0 do_group_exit+0x47/0xb0 get_signal+0x14b/0xbc0 ? __x64_sys_io_uring_enter+0x1b7/0x450 do_signal+0x2c/0x260 ? __x64_sys_io_uring_enter+0x228/0x450 exit_to_usermode_loop+0x87/0xf0 do_syscall_64+0x209/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x7fe9f64f8df9 Code: Bad RIP value. task_work_run calls io_wq_submit_work unexpectedly, it's obvious that struct callback_head's func member has been changed. After looking into codes, I found this issue is still due to the union definition: union { /* * Only commands that never go async can use the below fields, * obviously. Right now only IORING_OP_POLL_ADD uses them, and * async armed poll handlers for regular commands. The latter * restore the work, if needed. 
*/ struct { struct callback_head task_work; struct hlist_node hash_node; struct async_poll apoll; }; struct io_wq_work work; }; When task_work_run has multiple work items to execute, the one that calls io_poll_remove_all() always restores req->work for non-poll requests. But if a non-poll request has already been added to a new callback_head, a subsequent callback will call io_async_task_func() to handle it, so req->work must not be restored for such a request. Meanwhile, in io_async_task_func(), we should drop the submit ref when the req has been canceled. Fix both issues. Fixes: b1f573bd15fd ("io_uring: restore req->work when canceling poll request") Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Use io_double_put_req() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-19	igc: Add debug messages to MAC filter code	Andre Guedes
	This patch adds log messages to functions related to the MAC address filtering code to ease debugging. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Refactor igc_del_mac_filter()	Andre Guedes
	This patch does a code refactoring in igc_del_mac_filter() so it uses the new helper igc_find_mac_filter() and improves the comment about the special handling when deleting the default filter. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Refactor igc_mac_entry_can_be_used()	Andre Guedes
	The implementation of the helper igc_mac_entry_can_be_used() is a bit convoluted since it does two different things: find a not-in-use slot in mac_table or find an in-use slot where the address and address type match. This patch refactors the code and breaks it up into two helper functions. With this patch we might traverse mac_table twice in some situations, but this is not harmful performance-wise (mac_table has only 16 entries and adding mac filters is not hot-path), and it improves igc_add_mac_filter() readability considerably. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
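The split into two lookups can be sketched in user-space C. This is an illustration under stated assumptions, not the driver code: all `toy_` names are hypothetical, the address type is left out for brevity, and only the table size of 16 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAC_TABLE_SIZE 16 /* the commit notes mac_table has 16 entries */
#define TOY_ETH_ALEN 6

struct toy_mac_entry {
	unsigned char addr[TOY_ETH_ALEN];
	bool in_use;
};

/* Return the index of the in-use entry matching addr, or -1. */
static int toy_find_mac_filter(const struct toy_mac_entry *table,
			       const unsigned char *addr)
{
	for (int i = 0; i < TOY_MAC_TABLE_SIZE; i++) {
		if (table[i].in_use &&
		    memcmp(table[i].addr, addr, TOY_ETH_ALEN) == 0)
			return i;
	}
	return -1;
}

/* Return the index of the first free entry, or -1 if the table is full. */
static int toy_get_free_slot(const struct toy_mac_entry *table)
{
	for (int i = 0; i < TOY_MAC_TABLE_SIZE; i++) {
		if (!table[i].in_use)
			return i;
	}
	return -1;
}

/* Reuse a matching entry if present, else take a free slot. */
static int toy_add_mac_filter(struct toy_mac_entry *table,
			      const unsigned char *addr)
{
	int index = toy_find_mac_filter(table, addr);

	if (index < 0)
		index = toy_get_free_slot(table); /* second traversal */
	if (index < 0)
		return -1; /* table full */

	memcpy(table[index].addr, addr, TOY_ETH_ALEN);
	table[index].in_use = true;
	return index;
}
```

The add path may walk the table twice, which is the trade-off the commit message accepts in exchange for readability.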
2020-04-19	igc: Remove igc_*_mac_steering_filter() wrappers	Andre Guedes
	With the previous two patches, igc_add_mac_steering_filter() and igc_del_mac_steering_filter() became pointless wrappers of igc_add_mac_filter() and igc_del_mac_filter(). This patch removes these wrappers and updates callers to call igc_add_mac_filter() and igc_del_mac_filter() directly. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Remove IGC_MAC_STATE_QUEUE_STEERING	Andre Guedes
	The IGC_MAC_STATE_QUEUE_STEERING bit in mac_table[i].state is utilized to indicate that frames matching the filter are assigned to mac_table[i].queue. This bit is not strictly necessary since we can convey the same information as follows: queue == -1 means queue assignment is disabled, otherwise it is enabled. In addition to making the code simpler, this change fixes some awkward situations where we pass a completely misleading 'queue' value such as in igc_uc_sync(). So this patch removes IGC_MAC_STATE_QUEUE_STEERING and also takes the opportunity to improve the igc_add_mac_filter documentation. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
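The "queue == -1 means disabled" convention replaces a separate state bit with a sentinel value. A minimal sketch, assuming a reduced, hypothetical `struct toy_filter` rather than the driver's real table entry:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced filter entry with no separate steering state bit. */
struct toy_filter {
	int queue; /* -1: queue assignment disabled; otherwise the target queue */
};

/* Queue assignment is enabled for any value other than the -1 sentinel. */
static bool toy_queue_assignment_enabled(const struct toy_filter *f)
{
	return f->queue != -1;
}
```

With a single field carrying both pieces of information, a caller cannot pass a queue number that contradicts the enable flag.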
2020-04-19	igc: Remove 'queue' check in igc_del_mac_filter()	Andre Guedes
	igc_add_mac_filter() doesn't allow us to have more than one entry with the same address and address type in adapter->mac_table so checking if 'queue' matches in igc_del_mac_filter() isn't necessary. This patch removes that check. This patch also takes the opportunity to improve the igc_del_mac_filter documentation and remove a comment that is not applicable to this I225 controller. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Improve address check in igc_del_mac_filter()	Andre Guedes
	igc_add_mac_filter() doesn't allow filters with invalid MAC address to be added to adapter->mac_table so, in igc_del_mac_filter(), we can early return if MAC address is invalid. No need to traverse the table. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	Merge tag 'x86-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 and objtool fixes from Thomas Gleixner: "A set of fixes for x86 and objtool: objtool: - Ignore the double UD2 which is emitted in BUG() when CONFIG_UBSAN_TRAP is enabled. - Support clang non-section symbols in objtool ORC dump - Fix switch table detection in .text.unlikely - Make the BP scratch register warning more robust. x86: - Increase microcode maximum patch size for AMD to cope with new CPUs which have a larger patch size. - Fix a crash in the resource control filesystem when the removal of the default resource group is attempted. - Preserve Code and Data Prioritization enabled state across CPU hotplug. - Update split lock cpu matching to use the new X86_MATCH macros. - Change the split lock enumeration as Intel finally decided that the IA32_CORE_CAPABILITIES bits are not architectural contrary to what the SDM claims. !@#%$^! - Add Tremont CPU models to the split lock detection cpu match. - Add a missing static attribute to make sparse happy" * tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/split_lock: Add Tremont family CPU models x86/split_lock: Bits in IA32_CORE_CAPABILITIES are not architectural x86/resctrl: Preserve CDP enable over CPU hotplug x86/resctrl: Fix invalid attempt at removing the default resource group x86/split_lock: Update to use X86_MATCH_INTEL_FAM6_MODEL() x86/umip: Make umip_insns static x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE objtool: Make BP scratch register warning more robust objtool: Fix switch table detection in .text.unlikely objtool: Support Clang non-section symbols in ORC generation objtool: Support Clang non-section symbols in ORC dump objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings
2020-04-19	igc: Refactor igc_rar_set_index()	Andre Guedes
	Current igc_rar_set_index() implementation is a bit convoluted so this patch does some code refactoring to improve it. The helper igc_rar_set_index() is about writing MAC filter settings into hardware registers. Logic such as address validation belongs to functions upper in the call chain such as igc_set_mac() and igc_add_mac_filter(). So this patch moves the is_valid_ether_addr() call to igc_add_mac_filter(). No need to touch igc_set_mac() since it already checks it. The variables 'rar_low' and 'rar_high' represent the value in registers RAL and RAH so we rename them to 'ral' and 'rah', respectively, to match the registers names. To make it explicit, filter settings are passed as arguments to the function instead of reading them from adapter->mac_table "under the hood". Also, the function was renamed to igc_set_mac_filter_hw to make it more clear what it does. Finally, the patch removes some wrfl() calls and comments not needed. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Fix igc_uc_unsync()	Andre Guedes
	In case igc_del_mac_filter() returns error, that error is masked since the functions always return 0 (success). This patch fixes igc_uc_unsync() so it returns whatever value igc_del_mac_filter() returns (0 on success, negative number on error). Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Change igc_add_mac_filter() returning value	Andre Guedes
	In case of success, igc_add_mac_filter() returns the index in adapter->mac_table where the requested filter was added. This information, however, is not used by any caller of that function. In fact, callers have extra code just to handle this returning index as 0 (success). So this patch changes the function to return 0 on success instead, and cleans up the extra code. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	Merge tag 'timers-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull time namespace fix from Thomas Gleixner: "An update for the proc interface of time namespaces: Use symbolic names instead of clockid numbers. The usability nuisance of numbers was noticed by Michael when polishing the man page" * tag 'timers-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: proc, time/namespace: Show clock symbolic names in /proc/pid/timens_offsets
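Replacing clockid numbers with symbolic names amounts to a small lookup. The sketch below is hypothetical user-space code, not the kernel implementation: the numeric values are the Linux values of CLOCK_MONOTONIC and CLOCK_BOOTTIME, while the exact strings "monotonic" and "boottime" are an assumption about what the interface prints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CLOCK_MONOTONIC 1 /* Linux value of CLOCK_MONOTONIC */
#define TOY_CLOCK_BOOTTIME  7 /* Linux value of CLOCK_BOOTTIME */

/* Map a clock id to a symbolic name; NULL for clocks not covered here. */
static const char *toy_clock_name(int clockid)
{
	switch (clockid) {
	case TOY_CLOCK_MONOTONIC:
		return "monotonic"; /* assumed spelling */
	case TOY_CLOCK_BOOTTIME:
		return "boottime"; /* assumed spelling */
	default:
		return NULL;
	}
}
```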
2020-04-19	igc: Check unsupported flag in igc_add_mac_filter()	Andre Guedes
	The IGC_MAC_STATE_SRC_ADDR flag is not supported by igc_add_mac_filter() so this patch adds a check for it and returns -ENOTSUPP in case it is set. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Remove duplicate code in MAC filtering logic	Andre Guedes
	This patch does a code refactoring in the MAC address filtering logic to get rid of some duplicate code. IGC driver has two functions to add MAC address filters that are pretty much the same: igc_add_mac_filter() and igc_add_mac_filter_flags(). The only difference is that the latter allows the caller to specify the 'flags' parameter while the former has it hard coded as zero. The same rationale applies to filter deletion counterparts. So this patch refactors igc_add_mac_filter() and igc_del_mac_filter() so they handle the 'flags' parameters, removes the _flags() functions, and fixes callers accordingly. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	e1000e: fix S0ix flows for cable connected case	Vitaly Lifshits
	Fix the S0ix entry and exit flows for TGP and above MAC types for the case when the Ethernet cable is connected and the link is up. With this fix, the system is able to reach SLP_S0 when going to the freeze power state. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	igc: Add new device IDs for i225 part	Sasha Neftin
	Add new device IDs for the next step of i225 Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-04-19	Merge tag 'perf-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes and updates from Thomas Gleixner: - Fix the header line of perf stat output for '--metric-only --per-socket' - Fix the python build with clang - The usual tools UAPI header synchronization * tag 'perf-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools headers: Synchronize linux/bits.h with the kernel sources tools headers: Adopt verbatim copy of compiletime_assert() from kernel sources tools headers: Update x86's syscall_64.tbl with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers UAPI: Update tools's copy of drm.h headers tools headers kvm: Sync linux/kvm.h with the kernel sources tools headers UAPI: Sync linux/fscrypt.h with the kernel sources tools include UAPI: Sync linux/vhost.h with the kernel sources tools arch x86: Sync asm/cpufeatures.h with the kernel sources tools headers UAPI: Sync linux/mman.h with the kernel tools headers UAPI: Sync sched.h with the kernel tools headers: Update linux/vdso.h and grab a copy of vdso/const.h perf stat: Fix no metric header if --per-socket and --metric-only set perf python: Check if clang supports -fno-semantic-interposition tools arch x86: Sync the msr-index.h copy with the kernel sources
2020-04-19	Merge tag 'irq-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes/updates for the interrupt subsystem: - Remove setup_irq() and remove_irq(). All users have been converted so remove them before new users surface. - A set of bugfixes for various interrupt chip drivers - Add a few missing static attributes to address sparse warnings" * tag 'irq-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-bcm7038-l1: Make bcm7038_l1_of_init() static irqchip/irq-mvebu-icu: Make legacy_bindings static irqchip/meson-gpio: Fix HARDIRQ-safe -> HARDIRQ-unsafe lock order irqchip/sifive-plic: Fix maximum priority threshold value irqchip/ti-sci-inta: Fix processing of masked irqs irqchip/mbigen: Free msi_desc on device teardown irqchip/gic-v4.1: Update effective affinity of virtual SGIs irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling genirq: Remove setup_irq() and remove_irq()
2020-04-19	Merge tag 'sched-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Two fixes for the scheduler: - Work around an uninitialized variable warning where GCC can't figure it out. - Allow 'isolcpus=' to skip unknown subparameters so that older kernels work with the commandline of a newer kernel. Improve the error output while at it" * tag 'sched-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/vtime: Work around an unitialized variable warning sched/isolation: Allow "isolcpus=" to skip unknown sub-parameters
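Skipping unknown sub-parameters can be sketched as a small prefix parser. This is a hypothetical user-space illustration, not the kernel's parser: the function name, the flag bit values, and the rule that a flag token consists of letters or underscores followed by a comma are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define TOY_FLAG_NOHZ   0x1u
#define TOY_FLAG_DOMAIN 0x2u

/*
 * Consume leading "flag," tokens from an isolcpus-style value.
 * Known flags set bits in *flags; unknown flag-like tokens are skipped.
 * Returns a pointer to the remainder, which would be the CPU list.
 */
static const char *toy_parse_isolcpus_flags(const char *str,
					    unsigned int *flags)
{
	*flags = 0;

	for (;;) {
		const char *p = str;

		if (strncmp(str, "nohz,", 5) == 0) {
			*flags |= TOY_FLAG_NOHZ;
			str += 5;
			continue;
		}
		if (strncmp(str, "domain,", 7) == 0) {
			*flags |= TOY_FLAG_DOMAIN;
			str += 7;
			continue;
		}

		/* Unknown token: skip it only if it looks like "name,". */
		while (isalpha((unsigned char)*p) || *p == '_')
			p++;
		if (p != str && *p == ',') {
			str = p + 1;
			continue;
		}
		return str; /* not a flag: the CPU list starts here */
	}
}
```

A command line written for a newer kernel, with a flag this parser does not know, still yields the CPU list instead of failing outright.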
2020-04-19	Merge tag 'core-urgent-2020-04-19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Thomas Gleixner: "A single bugfix for RCU to prevent taking a lock in NMI context" * tag 'core-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Don't acquire lock in NMI handler in rcu_nmi_enter_common()
2020-04-19	Merge tag 'ext4_for_linus_stable' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Miscellaneous bug fixes and cleanups for ext4, including a fix for generic/388 in data=journal mode, removing some BUG_ON's, and cleaning up some compiler warnings" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: convert BUG_ON's to WARN_ON's in mballoc.c ext4: increase wait time needed before reuse of deleted inode numbers ext4: remove set but not used variable 'es' in ext4_jbd2.c ext4: remove set but not used variable 'es' ext4: do not zeroout extents beyond i_disksize ext4: fix return-value types in several function comments ext4: use non-movable memory for superblock readahead ext4: use matching invalidatepage in ext4_writepage
2020-04-19	Merge tag '5.7-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds
	Pull cifs fixes from Steve French: "Three small smb3 fixes: two debug related (helping network tracing for SMB2 mounts, and the other removing an unintended debug line on signing failures), and one fixing a performance problem with 64K pages" * tag '5.7-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: remove overly noisy debug line in signing errors cifs: improve read performance for page size 64KB & cache=strict & vers=2.1+ cifs: dump the session id and keys also for SMB2 sessions
2020-04-19	Merge tag 'flexible-array-member-5.7-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flexible-array member conversion from Gustavo Silva: "The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. Notice that all of these patches have been baking in linux-next for quite a while now, and 238 more of these patches have already been merged into 5.7-rc1.
There are a couple hundred more of these issues waiting to be addressed in the whole codebase" [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") * tag 'flexible-array-member-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (28 commits) xattr.h: Replace zero-length array with flexible-array member uapi: linux: fiemap.h: Replace zero-length array with flexible-array member uapi: linux: dlm_device.h: Replace zero-length array with flexible-array member tpm_eventlog.h: Replace zero-length array with flexible-array member ti_wilink_st.h: Replace zero-length array with flexible-array member swap.h: Replace zero-length array with flexible-array member skbuff.h: Replace zero-length array with flexible-array member sched: topology.h: Replace zero-length array with flexible-array member rslib.h: Replace zero-length array with flexible-array member rio.h: Replace zero-length array with flexible-array member posix_acl.h: Replace zero-length array with flexible-array member platform_data: wilco-ec.h: Replace zero-length array with flexible-array member memcontrol.h: Replace zero-length array with flexible-array member list_lru.h: Replace zero-length array with flexible-array member lib: cpu_rmap: Replace zero-length array with flexible-array member irq.h: Replace zero-length array with flexible-array member ihex.h: Replace zero-length array with flexible-array member igmp.h: Replace zero-length array with flexible-array member genalloc.h: Replace zero-length array with flexible-array member ethtool.h: Replace zero-length array with flexible-array member ...
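The allocation pattern for a flexible array member can be sketched in plain C99. `struct foo` and `struct boo` follow the shape quoted in the pull request; the contents of `struct boo` and the helper `foo_alloc()` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct boo {
	int value; /* hypothetical payload */
};

struct foo {
	int stuff;
	struct boo array[]; /* flexible array member: must be the last member */
};

/* Allocate a struct foo with room for n trailing elements; caller frees. */
static struct foo *foo_alloc(size_t n)
{
	/* sizeof(*f) covers the header only; the elements are added on top. */
	struct foo *f = malloc(sizeof(*f) + n * sizeof(f->array[0]));

	if (!f)
		return NULL;
	f->stuff = (int)n;
	for (size_t i = 0; i < n; i++)
		f->array[i].value = (int)i;
	return f;
}
```

The size computation uses `sizeof` on the struct and on one element, never on the flexible array itself, which is why the conversion leaves dynamic allocations unchanged.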
2020-04-19	netfilter: nat: fix error handling upon registering inet hook	Hillf Danton
	A case of warning was reported by syzbot. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 19934 at net/netfilter/nf_nat_core.c:1106 nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 19934 Comm: syz-executor.5 Not tainted 5.6.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x35 kernel/panic.c:582 report_bug+0x27b/0x2f0 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:175 [inline] fixup_bug arch/x86/kernel/traps.c:170 [inline] do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267 do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106 Code: ff df 48 c1 ea 03 80 3c 02 00 75 75 48 8b 44 24 10 4c 89 ef 48 c7 00 00 00 00 00 e8 e8 f8 53 fb e9 4d fe ff ff e8 ee 9c 16 fb <0f> 0b e9 41 fe ff ff e8 e2 45 54 fb e9 b5 fd ff ff 48 8b 7c 24 20 RSP: 0018:ffffc90005487208 EFLAGS: 00010246 RAX: 0000000000040000 RBX: 0000000000000004 RCX: ffffc9001444a000 RDX: 0000000000040000 RSI: ffffffff865c94a2 RDI: 0000000000000005 RBP: ffff88808b5cf000 R08: ffff8880a2620140 R09: fffffbfff14bcd79 R10: ffffc90005487208 R11: fffffbfff14bcd78 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000 nf_nat_ipv6_unregister_fn net/netfilter/nf_nat_proto.c:1017 [inline] nf_nat_inet_register_fn net/netfilter/nf_nat_proto.c:1038 [inline] nf_nat_inet_register_fn+0xfc/0x140 net/netfilter/nf_nat_proto.c:1023 nf_tables_register_hook net/netfilter/nf_tables_api.c:224 [inline] nf_tables_addchain.constprop.0+0x82e/0x13c0 net/netfilter/nf_tables_api.c:1981 nf_tables_newchain+0xf68/0x16a0 net/netfilter/nf_tables_api.c:2235 nfnetlink_rcv_batch+0x83a/0x1610 
net/netfilter/nfnetlink.c:433 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:543 [inline] nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:561 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 and to quiesce it, unregister NFPROTO_IPV6 hook instead of NFPROTO_INET in case of failing to register NFPROTO_IPV4 hook. Reported-by: syzbot <syzbot+33e06702fd6cffc24c40@syzkaller.appspotmail.com> Fixes: d164385ec572 ("netfilter: nat: add inet family nat support") Cc: Florian Westphal <fw@strlen.de> Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
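The fix follows the usual rule for error unwinding: undo exactly what was registered, under the same family it was registered with. A toy sketch of that rule, where the registry, the failure switch, and all `toy_` names are hypothetical and unrelated to the real netfilter API:

```c
#include <assert.h>
#include <stdbool.h>

enum toy_family { TOY_IPV4, TOY_IPV6, TOY_FAMILY_MAX };

/* Hypothetical registry: one flag per family, plus a failure switch. */
struct toy_registry {
	bool registered[TOY_FAMILY_MAX];
	bool fail_ipv4; /* test knob: make IPv4 registration fail */
};

static int toy_register(struct toy_registry *r, enum toy_family fam)
{
	if (fam == TOY_IPV4 && r->fail_ipv4)
		return -1;
	r->registered[fam] = true;
	return 0;
}

static void toy_unregister(struct toy_registry *r, enum toy_family fam)
{
	r->registered[fam] = false;
}

/* Register both families for an "inet" hook; undo IPv6 if IPv4 fails. */
static int toy_inet_register(struct toy_registry *r)
{
	int ret = toy_register(r, TOY_IPV6);

	if (ret)
		return ret;

	ret = toy_register(r, TOY_IPV4);
	if (ret)
		toy_unregister(r, TOY_IPV6); /* only IPv6 was registered */
	return ret;
}
```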
2020-04-18	Merge branch 'r8169-series-with-improvements'	David S. Miller
	Heiner Kallweit says: ==================== r8169: series with improvements Again a series with few improvements. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: add workaround for RTL8168evl TSO hw issues	Heiner Kallweit
	Add workaround for hw issues with TSO on RTL8168evl. This workaround is based on information I got from Realtek, and should allow TSO to be safely enabled on this chip version. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: improve rtl8169_tso_csum_v2	Heiner Kallweit
	Simplify the code and avoid the overhead of calling vlan_get_protocol(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: use rtl8169_set_features in rtl8169_init_one	Heiner Kallweit
	At that place in rtl_init_one() we can safely use rtl8169_set_features() to configure the chip according to the default features. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: preserve VLAN setting on RTL8125 in rtl_init_rxcfg	Heiner Kallweit
	So far we set RX_VLAN_8125 unconditionally, even if NETIF_F_HW_VLAN_CTAG_RX may not be set. Don't touch these bits, and let only rtl8169_set_features() control them. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: remove NETIF_F_HIGHDMA from vlan_features	Heiner Kallweit
	NETIF_F_HIGHDMA is added to vlan_features by register_netdev(), therefore we can omit this here. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	r8169: move setting OCP base to generic init code	Heiner Kallweit
	Move setting the ocp_base to rtl_init_one(). Where supported the value is always the same, and if not supported it doesn't hurt. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	enetc: permit configuration of rx-vlan-filter with ethtool	Vladimir Oltean
	Each ENETC station interface (SI) has a VLAN filter list and a port flag (PSIPVMR) by which it can be put in "VLAN promiscuous" mode, which enables the reception of VLAN-tagged traffic even if it is not in the VLAN filtering list. Currently the handling of this setting works like this: the port starts off as VLAN promiscuous, then it switches to enabling VLAN filtering as soon as the first VLAN is installed in its filter via .ndo_vlan_rx_add_vid. In practice that does not work out very well, because more often than not, the first VLAN to be installed is out of the control of the user: the 8021q module, if loaded, adds its rule for 802.1p (VID 0) traffic upon bringing the interface up. What the user is currently seeing in ethtool is this: ethtool -k eno2 rx-vlan-filter: on [fixed] which doesn't match the intention of the code, but the practical reality of having the 8021q module install its VID which has the side-effect of turning on VLAN filtering in this driver. All in all, a slightly confusing experience. So instead of letting this driver switch the VLAN filtering state by itself, just wire it up with the rx-vlan-filter feature from ethtool, and let it be user-configurable just through that knob, except for one case, see below. In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
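The resulting policy reduces to one boolean expression. A minimal sketch, assuming a hypothetical helper name and treating the ethtool rx-vlan-filter feature and the interface's promiscuous state as plain booleans:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: should the port's "VLAN promiscuous" flag be set?
 * rx_vlan_filter: state of the ethtool rx-vlan-filter feature.
 * promisc: whether the interface is currently in promiscuous mode.
 */
static bool toy_vlan_promisc(bool rx_vlan_filter, bool promisc)
{
	/* Promiscuous mode temporarily overrides the ethtool setting. */
	return promisc || !rx_vlan_filter;
}
```

Because the value is recomputed from both inputs, leaving promiscuous mode restores whatever the ethtool knob last requested.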
2020-04-18	net: mscc: ocelot: deal with problematic MAC_ETYPE VCAP IS2 rules	Vladimir Oltean
	By default, the VCAP IS2 will produce a single match for each frame, on the most specific classification. Example: a ping packet (ICMP over IPv4 over Ethernet) sent from an IP address of 10.0.0.1 and a MAC address of 96:18:82:00:04:01 will match this rule: tc filter add dev swp0 ingress protocol ipv4 \ flower skip_sw src_ip 10.0.0.1 action drop but not this one: tc filter add dev swp0 ingress \ flower skip_sw src_mac 96:18:82:00:04:01 action drop Currently the driver does not really warn the user in any way about this, and the behavior is rather strange anyway. The current patch is a workaround to force matches on MAC_ETYPE keys (DMAC and SMAC) for all packets irrespective of higher layer protocol. The setting is made at the port level. Of course this breaks all other non-src_mac and non-dst_mac matches, so rule exclusivity checks have been added to the driver, in order to never have rules of both types on any ingress port. The bits that discard higher-level protocol information are set only once a MAC_ETYPE rule is added to a filter block, and only for the ports that are bound to that filter block. Then all further non-MAC_ETYPE rules added to that filter block should be denied by the ports bound to it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	Merge branch '1GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2020-04-17 This series contains updates to e1000e and igc only. Sasha adds partial generic segmentation offload (GSO partial) support to the igc driver. Also added support for translating taprio schedules into i225 cycles in igc. Did clean up of dead code or unused defines in the igc driver. Refactored the code to avoid forward declarations where possible. Enables the NETIF_F_HW_TC flag for igc by default. Vinicius adds support for ETF offloading using the similar approach that taprio offload used. Kees Cook fixes a clang warning in the e1000e driver by moving the declared variable either into the switch case that uses the variable or lift them up into the main function body, to help the compiler. Andre fixed some register overwriting when dumping registers via ethtool for igc driver. Also fixed support for ethtool Network Flow Classification (NFC) queue redirection by adding the missing code needed to enable the queue selection feature from Receive Address High (RAH) register. Cleans up code to remove the code bits designed to support tc-flower filters, since this client part does not support it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	net: phy: broadcom: Add support for BCM53125 internal PHYs	Florian Fainelli
	BCM53125 has internal Gigabit PHYs which support interrupts as well as statistics; make it possible to configure both of those features with a PHY driver entry. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	net: phy: mdio-bcm-iproc: Do not show kernel pointer	Florian Fainelli
	Displaying the virtual address at which the MDIO base register address has been mapped is not useful and is not visible with pointer hashing in place. Replace the message with something indicating successful registration instead. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	net: dsa: b53: per-port interrupts are optional	Florian Fainelli
	Make use of platform_get_irq_byname_optional() to avoid printing messages on the kernel console that interrupts cannot be found. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	tcp: cache line align MAX_TCP_HEADER	Eric Dumazet
	TCP stack is dumb in how it cooks its output packets. Depending on MAX_HEADER value, we might choose a bad ending point for the headers. If we align the end of TCP headers to cache line boundary, we make sure to always use the smallest number of cache lines, which always helps. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
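The effect described above can be illustrated with a small userspace sketch. It assumes a 64-byte cache line and a 100-byte header stack, both picked only for illustration; the helper names are hypothetical and this is not the kernel code. Headers are modeled as ending at a given offset in the buffer, and the sketch counts how many cache lines they touch.

```c
#include <stddef.h>

/* Assumed cache line size, for illustration only; real values are per-architecture. */
#define CACHE_LINE_BYTES_SKETCH 64

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up_sketch(size_t x, size_t a)
{
    return (x + (a - 1)) & ~(a - 1);
}

/*
 * Count the cache lines touched by hdr_len bytes of headers that
 * end at byte offset `end` in the buffer (so they start at end - hdr_len).
 */
static size_t lines_touched_sketch(size_t end, size_t hdr_len)
{
    size_t start = end - hdr_len;
    size_t first = start / CACHE_LINE_BYTES_SKETCH;
    size_t last = (end - 1) / CACHE_LINE_BYTES_SKETCH;

    return last - first + 1;
}
```

With these assumptions, 100 bytes of headers ending at offset 160 straddle three cache lines, while the same headers ending at the aligned offset 192 touch only two, which is the minimum for 100 bytes.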
2020-04-18	net: phy: at803x: add support for AR8032 PHY	David Bauer
	This adds support for the Qualcomm Atheros AR8032 Fast Ethernet PHY. It shares many similarities with the already supported AR8030 PHY but additionally supports MII connection to the MAC. Signed-off-by: David Bauer <mail@david-bauer.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	Merge branch 'mptcp-fixes'	David S. Miller
	Florian Westphal says: ==================== mptcp: fix 'attempt to release socket in state...' splats These two patches fix error handling corner-cases where inet_sock_destruct gets called for a mptcp_sk that is not in TCP_CLOSE state. This results in unwanted error printks from the network stack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	mptcp: fix 'Attempt to release TCP socket in state' warnings	Florian Westphal
	We need to set sk_state to CLOSED, else we will get following: IPv4: Attempt to release TCP socket in state 3 00000000b95f109e IPv4: Attempt to release TCP socket in state 10 00000000b95f109e First one is from inet_sock_destruct(), second one from mptcp_sk_clone failure handling. Setting sk_state to CLOSED isn't enough, we also need to orphan sk so it has DEAD flag set. Otherwise, a very similar warning is printed from inet_sock_destruct(). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	mptcp: fix splat when incoming connection is never accepted before exit/close	Florian Westphal
	Following snippet (replicated from syzkaller reproducer) generates warning: "IPv4: Attempt to release TCP socket in state 1". int main(void) { struct sockaddr_in sin1 = { .sin_family = 2, .sin_port = 0x4e20, .sin_addr.s_addr = 0x010000e0, }; struct sockaddr_in sin2 = { .sin_family = 2, .sin_addr.s_addr = 0x0100007f, }; struct sockaddr_in sin3 = { .sin_family = 2, .sin_port = 0x4e20, .sin_addr.s_addr = 0x0100007f, }; int r0 = socket(0x2, 0x1, 0x106); int r1 = socket(0x2, 0x1, 0x106); bind(r1, (void *)&sin1, sizeof(sin1)); connect(r1, (void *)&sin2, sizeof(sin2)); listen(r1, 3); return connect(r0, (void *)&sin3, 0x4d); } Reason is that the newly generated mptcp socket is closed via the ulp release of the tcp listener socket when its accept backlog gets purged. To fix this, delay setting the ESTABLISHED state until after userspace calls accept and via mptcp specific destructor. Fixes: 58b09919626bf ("mptcp: create msk early") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/9 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	net/mlx4_en: avoid indirect call in TX completion	Eric Dumazet
	Commit 9ecc2d86171a ("net/mlx4_en: add xdp forwarding and data write support") brought another indirect call in fast path. Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call when/if CONFIG_RETPOLINE=y Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
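The idea behind INDIRECT_CALL_2() can be sketched in userspace: compare the function pointer against the two most likely targets and call them directly, falling back to the indirect call otherwise. The macro below is a simplified approximation, not the kernel definition, and every name in it (SKETCH_INDIRECT_CALL_2, the three handlers, complete_tx_sketch) is hypothetical.

```c
/*
 * Simplified approximation of the pattern: if f matches one of the two
 * expected targets, emit a direct call; otherwise call through the pointer.
 * With retpolines enabled, the direct calls skip the indirect-branch cost.
 */
#define SKETCH_INDIRECT_CALL_2(f, f2, f1, ...)        \
    ((f) == (f2) ? f2(__VA_ARGS__) :                  \
     (f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

typedef int (*tx_free_fn_sketch)(int nr_descs);

/* Hypothetical completion handlers standing in for a driver's real ones. */
static int free_desc_sketch(int nr_descs)    { return nr_descs; }
static int recycle_desc_sketch(int nr_descs) { return nr_descs * 2; }
static int other_desc_sketch(int nr_descs)   { return -nr_descs; }

/* Dispatch through the helper instead of calling handler(nr_descs) directly. */
static int complete_tx_sketch(tx_free_fn_sketch handler, int nr_descs)
{
    return SKETCH_INDIRECT_CALL_2(handler, free_desc_sketch,
                                  recycle_desc_sketch, nr_descs);
}
```

The result is the same whichever branch is taken; only the kind of call instruction changes, so the pattern pays off when one or two targets dominate in the fast path.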
2020-04-18	ipv6: rpl: fix full address compression	Alexander Aring
	This patch prevents cmpri or cmpre from being set to 16, which cannot be represented because these are 4-bit values. We currently run into an overflow when assigning the value 16 to them. According to the standard, a value of 16 can be interpreted as a fully elided address, which cannot be set as a compression value. One reason it cannot be set is that the current IPv6 header destination address should never show up inside the segments of the rpl header. In this case we run into an overflow and the address ends up with no compression at all, meaning cmpri or cmpre is set to 0. As we sometimes handle cmpri and cmpre as unsigned char and sometimes as 4-bit values inside the rpl header, the current behaviour ends in an invalid header format. This patch simply uses the best compression method if we ever run into the case where the destination address shows up inside the rpl segments. We avoid the overflow handling and the rpl header is still valid, even when the destination address is inside the rpl segments. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
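The overflow described above is easy to reproduce with a 4-bit bitfield. The sketch below is illustrative only: the struct and function names are hypothetical, and clamping to 15 is an assumed reading of "best compression method", not a copy of the patch.

```c
/* Hypothetical stand-in for the two 4-bit compression fields of the header. */
struct cmpr_fields_sketch {
    unsigned int cmpri : 4;
    unsigned int cmpre : 4;
};

/* Store the value unchecked: a 4-bit unsigned field keeps only value mod 16. */
static unsigned int store_unchecked_sketch(unsigned int elided_bytes)
{
    struct cmpr_fields_sketch f = { 0, 0 };

    f.cmpri = elided_bytes;   /* 16 wraps around to 0: no compression at all */
    return f.cmpri;
}

/* Assumed fix: cap at 15, the largest value a 4-bit field can hold. */
static unsigned int store_clamped_sketch(unsigned int elided_bytes)
{
    struct cmpr_fields_sketch f = { 0, 0 };

    f.cmpri = elided_bytes > 15 ? 15 : elided_bytes;
    return f.cmpri;
}
```

Storing 16 unchecked yields 0, which matches the "no compression at all" outcome the message describes, while the clamped variant keeps the field at its maximum.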
2020-04-18	net: stmmac: Fix sub-second increment	Julien Beraud
	In fine adjustment mode, which is the current default, the sub-second increment register is the number of nanoseconds that will be added to the clock when the accumulator overflows. At each clock cycle, the value of the addend register is added to the accumulator. Currently, we use 20ns = 1e09ns / 50MHz as this value whatever the frequency of the ptp clock actually is. The adjustment is then done on the addend register, only incrementing every X clock cycles, X being the ratio between 50MHz and ptp_clock_rate (addend = 2^32 * 50MHz/ptp_clock_rate). This causes the following issues: - In case the frequency of the ptp clock is lower than or equal to 50MHz, the addend value calculation will overflow and the default addend value will be set to 0, causing the clock to not work at all. (For instance, for ptp_clock_rate = 50MHz, addend = 2^32). - The resolution of the timestamping clock is limited to 20ns while it is not needed, thus limiting the accuracy of the timestamping to 20ns. Fix this by setting sub-second increment to 2e09ns / ptp_clock_rate. This makes it possible to reach the minimum possible frequency for ptp_clk_ref, which is 5MHz for GMII 1000Mbps Full-Duplex, by setting the sub-second-increment to a higher value. For instance, for 25MHz, it gives ssinc = 80ns and default_addend = 2^31. It also allows a lower value for sub-second-increment, thus improving the timestamping accuracy with frequencies higher than 100MHz, for instance, for 200MHz, ssinc = 10ns and default_addend = 2^31. v1->v2: - Remove modifications to the calculation of default addend, which broke compatibility with clock frequencies for which 2000000000 / ptp_clk_freq is not an integer. - Modify description according to discussions. Signed-off-by: Julien Beraud <julien.beraud@orolia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
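The arithmetic in this message can be checked with a short sketch. It follows the formulas as stated in the text (old addend = 2^32 * 50MHz / ptp_clock_rate, new ssinc = 2e9 / ptp_clock_rate) and assumes the addend is then derived as 2^32 * (1e9 / ssinc) / ptp_clock_rate, truncated to 32 bits. Function names are hypothetical and register width limits on the sub-second increment are not modeled.

```c
#include <stdint.h>

/* Old scheme: fixed 20ns increment, addend = 2^32 * 50MHz / ptp_clock_rate. */
static uint32_t old_addend_sketch(uint64_t ptp_hz)
{
    uint64_t addend = (50000000ULL << 32) / ptp_hz;

    return (uint32_t)addend;   /* a 32-bit register: 2^32 truncates to 0 */
}

/* New scheme: sub-second increment in ns, 2e9 / ptp_clock_rate. */
static uint32_t new_ssinc_sketch(uint64_t ptp_hz)
{
    return (uint32_t)(2000000000ULL / ptp_hz);
}

/* Addend for the new scheme: 2^32 * (1e9 / ssinc) / ptp_clock_rate. */
static uint32_t new_addend_sketch(uint64_t ptp_hz)
{
    uint64_t freq = 1000000000ULL / new_ssinc_sketch(ptp_hz);

    return (uint32_t)((freq << 32) / ptp_hz);
}
```

For a 50MHz clock the old formula produces 2^32, which truncates to 0 and stops the clock; with the new increment, 25MHz gives ssinc = 80ns and 200MHz gives ssinc = 10ns, both with an addend of 2^31, matching the figures quoted in the message.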
2020-04-18	net: stmmac: fix enabling socfpga's ptp_ref_clock	Julien Beraud
	There are 2 registers to write to enable a ptp ref clock coming from the fpga. One that enables the usage of the clock from the fpga for emac0 and emac1 as a ptp ref clock, and the other to allow signals from the fpga to reach emac0 and emac1. Currently, if the dwmac-socfpga has phymode set to PHY_INTERFACE_MODE_MII, PHY_INTERFACE_MODE_GMII, or PHY_INTERFACE_MODE_SGMII, both registers will be written and the ptp ref clock will be set as coming from the fpga. Separate the 2 register writes to only enable signals from the fpga to reach emac0 or emac1 when ptp ref clock is not coming from the fpga. Signed-off-by: Julien Beraud <julien.beraud@orolia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18	wimax/i2400m: Fix potential urb refcnt leak	Xiyu Yang
	i2400mu_bus_bm_wait_for_ack() invokes usb_get_urb(), which increases the refcount of the "notif_urb". When i2400mu_bus_bm_wait_for_ack() returns, local variable "notif_urb" becomes invalid, so the refcount should be decreased to keep refcount balanced. The issue happens in all paths of i2400mu_bus_bm_wait_for_ack(), which forget to decrease the refcnt increased by usb_get_urb(), causing a refcnt leak. Fix this issue by calling usb_put_urb() before the i2400mu_bus_bm_wait_for_ack() returns. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
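The get/put balance described above can be sketched with a toy reference counter. Nothing here is the USB API: toy_urb, toy_get(), toy_put() and wait_for_ack_sketch() are hypothetical stand-ins showing one common way to guarantee the put on every return path, namely a single exit label.

```c
/* Toy object with a reference count, standing in for a real URB. */
struct toy_urb {
    int refcount;
};

static struct toy_urb *toy_get(struct toy_urb *urb)
{
    if (urb)
        urb->refcount++;
    return urb;
}

static void toy_put(struct toy_urb *urb)
{
    if (urb)
        urb->refcount--;
}

/*
 * Take a reference for the duration of the call and drop it on every
 * path out, including the error path, by funnelling returns through `out`.
 */
static int wait_for_ack_sketch(struct toy_urb *notif_urb, int fail_early)
{
    int result;

    toy_get(notif_urb);
    if (fail_early) {
        result = -1;
        goto out;
    }
    result = 0;
out:
    toy_put(notif_urb);
    return result;
}
```

Whether the function succeeds or fails, the caller's reference count is the same after the call as before it, which is the invariant the fix restores.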