linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-12-02	parisc: Enable -ffunction-sections for modules on 32-bit kernel	Helge Deller
	Frank Schreiner reported, that since kernel 4.18 he faces sysfs-warnings when loading modules on a 32-bit kernel. Here is one such example: sysfs: cannot create duplicate filename '/module/nfs/sections/.text' CPU: 0 PID: 98 Comm: modprobe Not tainted 4.18.0-2-parisc #1 Debian 4.18.10-2 Backtrace: [<1017ce2c>] show_stack+0x3c/0x50 [<107a7210>] dump_stack+0x28/0x38 [<103f900c>] sysfs_warn_dup+0x88/0xac [<103f8b1c>] sysfs_add_file_mode_ns+0x164/0x1d0 [<103f9e70>] internal_create_group+0x11c/0x304 [<103fa0a0>] sysfs_create_group+0x48/0x60 [<1022abe8>] load_module.constprop.35+0x1f9c/0x23b8 [<1022b278>] sys_finit_module+0xd0/0x11c [<101831dc>] syscall_exit+0x0/0x14 This warning gets triggered by the fact, that due to commit 24b6c22504a2 ("parisc: Build kernel without -ffunction-sections") we now get multiple .text sections in the kernel modules for which sysfs_create_group() can't create multiple virtual files. This patch works around the problem by re-enabling the -ffunction-sections compiler option for modules, while keeping it disabled for the non-module kernel code. Reported-by: Frank Scheiner <frank.scheiner@web.de> Fixes: 24b6c22504a2 ("parisc: Build kernel without -ffunction-sections") Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-12-02	bpf: Fix memleak in aux->func_info and aux->btf	Martin KaFai Lau
	The aux->func_info and aux->btf are leaked in the error out cases during bpf_prog_load(). This patch fixes it. Fixes: ba64e7d85252 ("bpf: btf: support proper non-jit func info") Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-02	SUNRPC: Fix a potential race in xprt_connect()	Trond Myklebust
	If an asynchronous connection attempt completes while another task is in xprt_connect(), then the call to rpc_sleep_on() could end up racing with the call to xprt_wake_pending_tasks(). So add a second test of the connection state after we've put the task to sleep and set the XPRT_CONNECTING flag, when we know that there can be no asynchronous connection attempts still in progress. Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	SUNRPC: Fix a memory leak in call_encode()	Trond Myklebust
	If we retransmit an RPC request, we currently end up clobbering the value of req->rq_rcv_buf.bvec that was allocated by the initial call to xprt_request_prepare(req). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	SUNRPC: Fix leak of krb5p encode pages	Chuck Lever
	call_encode can be invoked more than once per RPC call. Ensure that each call to gss_wrap_req_priv does not overwrite pointers to previously allocated memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	SUNRPC: call_connect_status() must handle tasks that got transmitted	Trond Myklebust
	If a task failed to get the write lock in the call to xprt_connect(), then it will be queued on xprt->sending. In that case, it is possible for it to get transmitted before the call to call_connect_status(), in which case it needs to be handled by call_transmit_status() instead. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	nfs: don't dirty kernel pages read by direct-io	Dave Kleikamp
	When we use direct_IO with an NFS backing store, we can trigger a WARNING in __set_page_dirty(), as below, since we're dirtying the page unnecessarily in nfs_direct_read_completion(). To fix, replicate the logic in commit 53cbf3b157a0 ("fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read"). Other filesystems that implement direct_IO handle this; most use blockdev_direct_IO(). ceph and cifs have similar logic. mount 127.0.0.1:/export /nfs dd if=/dev/zero of=/nfs/image bs=1M count=200 losetup --direct-io=on -f /nfs/image mkfs.btrfs /dev/loop0 mount -t btrfs /dev/loop0 /mnt/ kernel: WARNING: CPU: 0 PID: 8067 at fs/buffer.c:580 __set_page_dirty+0xaf/0xd0 kernel: Modules linked in: loop(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) tun(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_ kernel: snd_seq(E) snd_seq_device(E) snd_pcm(E) video(E) snd_timer(E) snd(E) soundcore(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) pata_acpi(E) crc32c_intel(E) ahci(E) li kernel: CPU: 0 PID: 8067 Comm: kworker/0:2 Tainted: G E 4.20.0-rc1.master.20181111.ol7.x86_64 #1 kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 kernel: Workqueue: nfsiod rpc_async_release [sunrpc] kernel: RIP: 0010:__set_page_dirty+0xaf/0xd0 kernel: Code: c3 48 8b 02 f6 c4 04 74 d4 48 89 df e8 ba 05 f7 ff 48 89 c6 eb cb 48 8b 43 08 a8 01 75 1f 48 89 d8 48 8b 00 a8 04 74 02 eb 87 <0f> 0b eb 83 48 83 e8 01 eb 9f 48 83 ea 01 0f 1f 00 eb 8b 48 83 e8 kernel: RSP: 0000:ffffc1c8825b7d78 EFLAGS: 00013046 kernel: RAX: 000fffffc0020089 RBX: fffff2b603308b80 RCX: 0000000000000001 kernel: RDX: 0000000000000001 RSI: ffff9d11478115c8 RDI: ffff9d11478115d0 kernel: RBP: ffffc1c8825b7da0 R08: 0000646f6973666e R09: 8080808080808080 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d11478115d0 kernel: R13: ffff9d11478115c8 R14: 0000000000003246 R15: 0000000000000001 kernel: FS: 0000000000000000(0000) GS:ffff9d115ba00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007f408686f640 CR3: 0000000104d8e004 CR4: 00000000000606f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: __set_page_dirty_buffers+0xb6/0x110 kernel: set_page_dirty+0x52/0xb0 kernel: nfs_direct_read_completion+0xc4/0x120 [nfs] kernel: nfs_pgio_release+0x10/0x20 [nfs] kernel: rpc_free_task+0x30/0x70 [sunrpc] kernel: rpc_async_release+0x12/0x20 [sunrpc] kernel: process_one_work+0x174/0x390 kernel: worker_thread+0x4f/0x3e0 kernel: kthread+0x102/0x140 kernel: ? drain_workqueue+0x130/0x130 kernel: ? kthread_stop+0x110/0x110 kernel: ret_from_fork+0x35/0x40 kernel: ---[ end trace 01341980905412c9 ]--- Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [forward-ported to v4.20] Signed-off-by: Calum Mackay <calum.mackay@oracle.com> Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	flexfiles: enforce per-mirror stateid only for v4 DSes	Tigran Mkrtchyan
	Since commit bb21ce0ad227 we always enforce per-mirror stateid. However, this makes sense only for v4+ servers. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-02	ieee802154: hwsim: fix off-by-one in parse nested	Alexander Aring
	This patch fixes a off-by-one mistake in nla_parse_nested() functions of mac802154_hwsim driver. I had to enabled stack protector so I was able to reproduce it. Reference: https://github.com/linux-wpan/wpan-tools/issues/17 Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-12-01	Merge branch 'x86-pti-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull STIBP fallout fixes from Thomas Gleixner: "The performance destruction department finally got it's act together and came up with a cure for the STIPB regression: - Provide a command line option to control the spectre v2 user space mitigations. Default is either seccomp or prctl (if seccomp is disabled in Kconfig). prctl allows mitigation opt-in, seccomp enables the migitation for sandboxed processes. - Rework the code to handle the conditional STIBP/IBPB control and remove the now unused ptrace_may_access_sched() optimization attempt - Disable STIBP automatically when SMT is disabled - Optimize the switch_to() logic to avoid MSR writes and invocations of __switch_to_xtra(). - Make the asynchronous speculation TIF updates synchronous to prevent stale mitigation state. As a general cleanup this also makes retpoline directly depend on compiler support and removes the 'minimal retpoline' option which just pretended to provide some form of security while providing none" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/speculation: Provide IBPB always command line options x86/speculation: Add seccomp Spectre v2 user space protection mode x86/speculation: Enable prctl mode for spectre_v2_user x86/speculation: Add prctl() control for indirect branch speculation x86/speculation: Prepare arch_smt_update() for PRCTL mode x86/speculation: Prevent stale SPEC_CTRL msr content x86/speculation: Split out TIF update ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS x86/speculation: Prepare for conditional IBPB in switch_mm() x86/speculation: Avoid __switch_to_xtra() calls x86/process: Consolidate and simplify switch_to_xtra() code x86/speculation: Prepare for per task indirect branch speculation control x86/speculation: Add command line control for indirect branch speculation x86/speculation: Unify conditional spectre v2 print functions x86/speculataion: Mark command line parser data __initdata x86/speculation: Mark string arrays const correctly x86/speculation: Reorder the spec_v2 code x86/l1tf: Show actual SMT state x86/speculation: Rework SMT state change sched/smt: Expose sched_smt_present static key ...
2018-12-01	bpf: refactor bpf_test_run() to separate own failures and test program result	Roman Gushchin
	After commit f42ee093be29 ("bpf/test_run: support cgroup local storage") the bpf_test_run() function may fail with -ENOMEM, if it's not possible to allocate memory for a cgroup local storage. This error shouldn't be mixed with the return value of the testing program. Let's add an additional argument with a pointer where to store the testing program's result; and make bpf_test_run() return either 0 or -ENOMEM. Fixes: f42ee093be29 ("bpf/test_run: support cgroup local storage") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-01	Merge tag 'for-linus-20181201' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block layer fixes from Jens Axboe: - Single range elevator discard merge fix, that caused crashes (Ming) - Fix for a regression in O_DIRECT, where we could potentially lose the error value (Maximilian Heyne) - NVMe pull request from Christoph, with little fixes all over the map for NVMe. * tag 'for-linus-20181201' of git://git.kernel.dk/linux-block: block: fix single range discard merge nvme-rdma: fix double freeing of async event data nvme: flush namespace scanning work just before removing namespaces nvme: warn when finding multi-port subsystems without multipathing enabled fs: fix lost error code in dio_complete nvme-pci: fix surprise removal nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() nvme: Free ctrl device name on init failure
2018-12-01	Merge tag 'pci-v4.20-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix a link speed checking interface that broke PCIe gen3 cards in gen1 slots (Mikulas Patocka) - Fix an imx6 link training error (Trent Piepho) - Fix a layerscape outbound window accessor calling error (Hou Zhiqiang) - Fix a DesignWare endpoint MSI-X address calculation error (Gustavo Pimentel) * tag 'pci-v4.20-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix incorrect value returned from pcie_get_speed_cap() PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check
2018-12-01	netfilter: nat: remove l4 protocol port rovers	Florian Westphal
	This is a leftover from days where single-cpu systems were common: Store last port used to resolve a clash to use it as a starting point when the next conflict needs to be resolved. When we have parallel attempt to connect to same address:port pair, its likely that both cores end up computing the same "available" port, as both use same starting port, and newly used ports won't become visible to other cores until the conntrack gets confirmed later. One of the cores then has to drop the packet at insertion time because the chosen new tuple turns out to be in use after all. Lets simplify this: remove port rover and use a pseudo-random starting point. Note that this doesn't make netfilter default to 'fully random' mode; the 'rover' was only used if NAT could not reuse source port as-is. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-01	netfilter: remove NFC_* cache bits	Pablo Neira Ayuso
	These are very very (for long time unused) caching infrastructure definition, remove then. They have nothing to do with the NFC subsystem. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-01	netfilter: Replace call_rcu_bh(), rcu_barrier_bh(), and synchronize_rcu_bh()	Paul E. McKenney
	Now that call_rcu()'s callback is not invoked until after bh-disable regions of code have completed (in addition to explicitly marked RCU read-side critical sections), call_rcu() can be used in place of call_rcu_bh(). Similarly, rcu_barrier() can be used in place of rcu_barrier_bh() and synchronize_rcu() in place of synchronize_rcu_bh(). This commit therefore makes these changes. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-30	Merge branch 'xdp1-improvements'	Alexei Starovoitov
	Matteo Croce says: ==================== Small improvements to improve the readability and easiness to use of the xdp1 sample. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	samples: bpf: get ifindex from ifname	Matteo Croce
	Find the ifindex with if_nametoindex() instead of requiring the numeric ifindex. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	samples: bpf: improve xdp1 example	Matteo Croce
	Store only the total packet count for every protocol, instead of the whole per-cpu array. Use bpf_map_get_next_key() to iterate the map, instead of looking up all the protocols. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	Merge remote-tracking branch 'lorenzo/pci/controller-fixes' into for-linus	Bjorn Helgaas
	- Fix DesignWare endpoint MSI-X address calculation bug (Gustavo Pimentel) - Fix Layerscape outbound window disable usage (Hou Zhiqiang) - Fix imx6 link up detection (Trent Piepho) * lorenzo/pci/controller-fixes: PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check
2018-11-30	PCI: Fix incorrect value returned from pcie_get_speed_cap()	Mikulas Patocka
	The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks. We must mask the register and compare it against them. This fixes errors like this: amdgpu: [powerplay] failed to send message 261 ret is 0 when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is being incorrectly reported as PCIe-v3 capable. 6cf57be0f78e, which appeared in v4.17, added pcie_get_speed_cap() with the incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask. 5d9a63304032, which appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the amdgpu bug reports below are regressions in v4.19. Fixes: 6cf57be0f78e ("PCI: Add pcie_get_speed_cap() to find max supported link speed") Fixes: 5d9a63304032 ("drm/amdgpu: use pcie functions for link width and speed") Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704 Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2, remove test of PCI_EXP_LNKCAP for zero, since that register is required] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # v4.17+
2018-11-30	Merge branch 'improve-test-coverage-sparc'	Alexei Starovoitov
	David Miller says: ==================== On sparc64 a ton of test cases in test_verifier.c fail because the memory accesses in the test case are unaligned (or cannot be proven to be aligned by the verifier). Perhaps we can eventually try to (carefully) modify each test case which has this problem to not use unaligned accesses but: 1) That is delicate work. 2) The changes might not fully respect the original intention of the testcase. 3) In some cases, such a transformation might not even be feasible at all. So add an "any alignment" flag to tell the verifier to forcefully disable it's alignment checks completely. test_verifier.c is then annotated to use this flag when necessary. The presence of the flag in each test case is good documentation to anyone who wants to actually tackle the job of eliminating the unaligned memory accesses in the test cases. I've also seen several weird things in test cases, like trying to access __skb->mark in a packet buffer. This gets rid of 104 test_verifier.c failures on sparc64. Changes since v1: 1) Explain the new BPF_PROG_LOAD flag in easier to understand terms. Suggested by Alexei. 2) Make bpf_verify_program() just take a __u32 prog_flags instead of just accumulating boolean arguments over and over. Also suggested by Alexei. Changes since RFC: 1) Only the admin can allow the relaxation of alignment restrictions on inefficient unaligned access architectures. 2) Use F_NEEDS_EFFICIENT_UNALIGNED_ACCESS instead of making a new flag. 3) Annotate in the output, when we have a test case that the verifier accepted but we did not try to execute because we are on an inefficient unaligned access platform. Maybe with some arch machinery we can avoid this in the future. Signed-off-by: David S. Miller <davem@davemloft.net> ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Apply F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to more ACCEPT test cases.	David Miller
	If a testcase has alignment problems but is expected to be ACCEPT, verify it using F_NEEDS_EFFICIENT_UNALIGNED_ACCESS too. Maybe in the future if we add some architecture specific code to elide the unaligned memory access warnings during the test, we can execute these as well. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Make more use of 'any' alignment in test_verifier.c	David Miller
	Use F_NEEDS_EFFICIENT_UNALIGNED_ACCESS in more tests where the expected result is REJECT. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Adjust F_NEEDS_EFFICIENT_UNALIGNED_ACCESS handling in test_verifier.c	David Miller
	Make it set the flag argument to bpf_verify_program() which will relax the alignment restrictions. Now all such test cases will go properly through the verifier even on inefficient unaligned access architectures. On inefficient unaligned access architectures do not try to run such programs, instead mark the test case as passing but annotate the result similarly to how it is done now in the presence of this flag. So, we get complete full coverage for all REJECT test cases, and at least verifier level coverage for ACCEPT test cases. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Add BPF_F_ANY_ALIGNMENT.	David Miller
	Often we want to write tests cases that check things like bad context offset accesses. And one way to do this is to use an odd offset on, for example, a 32-bit load. This unfortunately triggers the alignment checks first on platforms that do not set CONFIG_EFFICIENT_UNALIGNED_ACCESS. So the test case see the alignment failure rather than what it was testing for. It is often not completely possible to respect the original intention of the test, or even test the same exact thing, while solving the alignment issue. Another option could have been to check the alignment after the context and other validations are performed by the verifier, but that is a non-trivial change to the verifier. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Fix verifier log string check for bad alignment.	David Miller
	The message got changed a lot time ago. This was responsible for 36 test case failures on sparc64. Fixes: f1174f77b50c ("bpf/verifier: rework value tracking") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-01	powerpc: Look for "stdout-path" when setting up legacy consoles	Benjamin Herrenschmidt
	Commit 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") broke the default console on a number of embedded PowerPC systems, because it failed to also update the code in arch/powerpc/kernel/legacy_serial.c to look for that property in addition to the old one. This fixes it. Fixes: 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-30	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "31 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (31 commits) ocfs2: fix potential use after free mm/khugepaged: fix the xas_create_range() error path mm/khugepaged: collapse_shmem() do not crash on Compound mm/khugepaged: collapse_shmem() without freezing new_page mm/khugepaged: minor reorderings in collapse_shmem() mm/khugepaged: collapse_shmem() remember to clear holes mm/khugepaged: fix crashes due to misaccounted holes mm/khugepaged: collapse_shmem() stop if punched or truncated mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() mm/huge_memory: splitting set mapping+index before unfreeze mm/huge_memory: rename freeze_page() to unmap_page() initramfs: clean old path before creating a hardlink kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace psi: make disabling/enabling easier for vendor kernels proc: fixup map_files test on arm debugobjects: avoid recursive calls with kmemleak userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set userfaultfd: shmem: add i_size checks userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem ...
2018-11-30	Merge tag 'mips_fixes_4.20_4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull few more MIPS fixes from Paul Burton: - Fix mips_get_syscall_arg() to operate on the task specified when detecting o32 tasks running on MIPS64 kernels. - Fix some incorrect GPIO pin muxing for the MT7620 SoC. - Update the linux-mips mailing list address. * tag 'mips_fixes_4.20_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MAINTAINERS: Update linux-mips mailing list address MIPS: ralink: Fix mt7620 nd_sd pinmux mips: fix mips_get_syscall_arg o32 check
2018-11-30	Merge tag 'arm64-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Cortex-A76 erratum workaround - ftrace fix to enable syscall events on arm64 - Fix uninitialised pointer in iort_get_platform_device_domain() * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value arm64: ftrace: Fix to enable syscall events on arm64 arm64: Add workaround for Cortex-A76 erratum 1286807
2018-11-30	Merge tag 'gcc-plugins-v4.20-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull stackleak plugin fix from Kees Cook: "Fix crash by not allowing kprobing of stackleak_erase() (Alexander Popov)" * tag 'gcc-plugins-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: stackleak: Disable function tracing and kprobes for stackleak_erase()
2018-11-30	Merge tag 'fscache-fixes-20181130' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache and cachefiles fixes from David Howells: "Misc fixes: - Fix an assertion failure at fs/cachefiles/xattr.c:138 caused by a race between a cache object lookup failing and someone attempting to reenable that object, thereby triggering an update of the object's attributes. - Fix an assertion failure at fs/fscache/operation.c:449 caused by a split atomic subtract and atomic read that allows a race to happen. - Fix a leak of backing pages when simultaneously reading the same page from the same object from two or more threads. - Fix a hang due to a race between a cache object being discarded and the corresponding cookie being reenabled. There are also some minor cleanups: - Cast an enum value to a different enum type to prevent clang from generating a warning. This shouldn't cause any sort of change in the emitted code. - Use ktime_get_real_seconds() instead of get_seconds(). This is just used to uniquify a filename for an object to be placed in the graveyard. Objects placed there are deleted by cachfilesd in userspace immediately thereafter. - Remove an initialised, but otherwise unused variable. This should have been entirely optimised away anyway" * tag 'fscache-fixes-20181130' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fscache, cachefiles: remove redundant variable 'cache' cachefiles: avoid deprecated get_seconds() cachefiles: Explicitly cast enumerated type in put_object fscache: fix race between enablement and dropping of object cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active fscache: Fix race in fscache_op_complete() due to split atomic_sub & read cachefiles: Fix an assertion failure when trying to update a failed object
2018-11-30	tun: forbid iface creation with rtnl ops	Nicolas Dichtel
	It's not supported right now (the goal of the initial patch was to support 'ip link del' only). Before the patch: $ ip link add foo type tun [ 239.632660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [snip] [ 239.636410] RIP: 0010:register_netdevice+0x8e/0x3a0 This panic occurs because dev->netdev_ops is not set by tun_setup(). But to have something usable, it will require more than just setting netdev_ops. Fixes: f019a7a594d9 ("tun: Implement ip link del tunXXX") CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	net: usb: aqc111: Initialize wol_cfg with memset in aqc111_suspend	Nathan Chancellor
	Clang warns: drivers/net/usb/aqc111.c:1326:37: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct aqc111_wol_cfg wol_cfg = { 0 }; ^ {} 1 warning generated. Use memset to initialize the object to take compiler instrumentation out of the equation. Fixes: e58ba4544c77 ("net: usb: aqc111: Add support for wake on LAN by MAGIC packet") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	virtio-net: keep vnet header zeroed after processing XDP	Jason Wang
	We copy vnet header unconditionally in page_to_skb() this is wrong since XDP may modify the packet data. So let's keep a zeroed vnet header for not confusing the conversion between vnet header and skb metadata. In the future, we should able to detect whether or not the packet was modified and keep using the vnet header when packet was not touched. Fixes: f600b6905015 ("virtio_net: Add XDP support") Reported-by: Pavel Popa <pashinho1990@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	net: qualcomm: rmnet: Remove set but not used variable 'cmd'	YueHaibing
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c: In function 'rmnet_map_do_flow_control': drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:23:36: warning: variable 'cmd' set but not used [-Wunused-but-set-variable] struct rmnet_map_control_command *cmd; 'cmd' not used anymore now, should also be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	Merge branch 'tcp-fixes-in-timeout-and-retransmission-accounting'	David S. Miller
	Yuchung Cheng says: ==================== tcp: fixes in timeout and retransmission accounting This patch set has assorted fixes of minor accounting issues in timeout, window probe, and retransmission stats. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	tcp: fix SNMP TCP timeout under-estimation	Yuchung Cheng
	Previously the SNMP TCPTIMEOUTS counter has inconsistent accounting: 1. It counts all SYN and SYN-ACK timeouts 2. It counts timeouts in other states except recurring timeouts and timeouts after fast recovery or disorder state. Such selective accounting makes analysis difficult and complicated. For example the monitoring system needs to collect many other SNMP counters to infer the total amount of timeout events. This patch makes TCPTIMEOUTS counter simply counts all the retransmit timeout (SYN or data or FIN). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	tcp: fix SNMP under-estimation on failed retransmission	Yuchung Cheng
	Previously the SNMP counter LINUX_MIB_TCPRETRANSFAIL is not counting the TSO/GSO properly on failed retransmission. This patch fixes that. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	tcp: fix off-by-one bug on aborting window-probing socket	Yuchung Cheng
	Previously there is an off-by-one bug on determining when to abort a stalled window-probing socket. This patch fixes that so it is consistent with tcp_write_timeout(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	liquidio: read sc->iq_no before release sc	Pan Bian
	The function lio_vf_rep_packet_sent_callback releases the occupation of sc via octeon_free_soft_command. sc should not be used after that. Unfortunately, sc->iq_no is read. To fix this, the patch stores sc->iq_no into a local variable before releasing sc and then uses the local variable instead of sc->iq_no. Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	mlx5: fix get_ip_proto()	Cong Wang
	IP header is not necessarily located right after struct ethhdr, there could be multiple 802.1Q headers in between, this is why we call __vlan_get_protocol(). Fixes: fe1dc069990c ("net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets") Cc: Alaa Hleihel <alaa@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	net: dsa: Fix tagging attribute location	Florian Fainelli
	While introducing the DSA tagging protocol attribute, it was added to the DSA slave network devices, but those actually see untagged traffic (that is their whole purpose). Correct this mistake by putting the tagging sysfs attribute under the DSA master network device where this is the information that we need. While at it, also correct the sysfs documentation mistake that missed the "dsa/" directory component of the attribute. Fixes: 98cdb4807123 ("net: dsa: Expose tagging protocol to user-space") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	bpf: Improve socket lookup reuseport documentation	Joe Stringer
	Improve the wording around socket lookup for reuseport sockets, and ensure that both bpf.h headers are in sync. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	bpf: Support sk lookup in netns with id 0	Joe Stringer
	David Ahern and Nicolas Dichtel report that the handling of the netns id 0 is incorrect for the BPF socket lookup helpers: rather than finding the netns with id 0, it is resolving to the current netns. This renders the netns_id 0 inaccessible. To fix this, adjust the API for the netns to treat all negative s32 values as a lookup in the current netns (including u64 values which when truncated to s32 become negative), while any values with a positive value in the signed 32-bit integer space would result in a lookup for a socket in the netns corresponding to that id. As before, if the netns with that ID does not exist, no socket will be found. Any netns outside of these ranges will fail to find a corresponding socket, as those values are reserved for future usage. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30	tun: implement carrier change	Nicolas Dichtel
	The userspace may need to control the carrier state. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	net/sched: act_police: fix memory leak in case of invalid control action	Davide Caratti
	when users set an invalid control action, kmemleak complains as follows: # echo clear >/sys/kernel/debug/kmemleak # ./tdc.py -e b48b Test b48b: Add police action with exceed goto chain control action All test results: 1..1 ok 1 - b48b # Add police action with exceed goto chain control action about to flush the tap output if tests need to be skipped done flushing skipped test tap output # echo scan >/sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffffa0fafbc3dde0 (size 96): comm "tc", pid 2358, jiffies 4294922738 (age 17.022s) hex dump (first 32 bytes): 2a 00 00 20 00 00 00 00 00 00 7d 00 00 00 00 00 *.. ......}..... f8 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000648803d2>] tcf_action_init_1+0x384/0x4c0 [<00000000cb69382e>] tcf_action_init+0x12b/0x1a0 [<00000000847ef0d4>] tcf_action_add+0x73/0x170 [<0000000093656e14>] tc_ctl_action+0x122/0x160 [<0000000023c98e32>] rtnetlink_rcv_msg+0x263/0x2d0 [<000000003493ae9c>] netlink_rcv_skb+0x4d/0x130 [<00000000de63f8ba>] netlink_unicast+0x209/0x2d0 [<00000000c3da0ebe>] netlink_sendmsg+0x2c1/0x3c0 [<000000007a9e0753>] sock_sendmsg+0x33/0x40 [<00000000457c6d2e>] ___sys_sendmsg+0x2a0/0x2f0 [<00000000c5c6a086>] __sys_sendmsg+0x5e/0xa0 [<00000000446eafce>] do_syscall_64+0x5b/0x180 [<000000004aa871f2>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<00000000450c38ef>] 0xffffffffffffffff change tcf_police_init() to avoid leaking 'new' in case TCA_POLICE_RESULT contains TC_ACT_GOTO_CHAIN extended action. Fixes: c08f5ed5d625 ("net/sched: act_police: disallow 'goto chain' on fallback control action") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	net: reorder flowi_common fields to avoid holes	Paolo Abeni
	the flowi* structures are used and memsetted by server functions in critical path. Currently flowi_common has a couple of holes that we can eliminate reordering the struct fields. As a side effect, both flowi4 and flowi6 shrink by 8 bytes. Before: pahole -EC flowi_common struct flowi_common { // ... /* size: 40, cachelines: 1, members: 10 / / sum members: 32, holes: 1, sum holes: 4 / / padding: 4 / / last cacheline: 40 bytes / }; pahole -EC flowi6 struct flowi6 { // ... / size: 88, cachelines: 2, members: 6 / / padding: 4 / / last cacheline: 24 bytes / }; pahole -EC flowi4 struct flowi4 { // ... / size: 56, cachelines: 1, members: 4 / / padding: 4 / / last cacheline: 56 bytes / }; After: struct flowi_common { // ... / size: 32, cachelines: 1, members: 10 / / last cacheline: 32 bytes / }; struct flowi6 { // ... / size: 80, cachelines: 2, members: 6 / / padding: 4 / / last cacheline: 16 bytes / }; struct flowi4 { // ... / size: 48, cachelines: 1, members: 4 / / padding: 4 / / last cacheline: 48 bytes */ }; Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30	Merge branch 'mlxsw-Add-VxLAN-support-with-VLAN-aware-bridges'	David S. Miller
	Ido Schimmel says: ==================== mlxsw: Add VxLAN support with VLAN-aware bridges Commit 53e50a6ec24d ("Merge branch 'mlxsw-Add-VxLAN-support'") added mlxsw support for VxLAN when the VxLAN device was enslaved to VLAN-unaware bridges. This patchset extends mlxsw to also support VxLAN with VLAN-aware bridges. With VLAN-aware bridges, the VxLAN device's VNI is mapped to the VLAN that is configured as 'pvid untagged' on the corresponding bridge port. To prevent ambiguity, mlxsw forbids configurations in which the same VLAN is configured as 'pvid untagged' on multiple VxLAN devices. Patches #1-#2 add the necessary APIs in mlxsw and the bridge driver. Patches #3-#4 perform small refactoring in order to prepare mlxsw for VLAN-aware support. Patch #5 finally enables the enslavement of VxLAN devices to a VLAN-aware bridge. Among other things, it extends mlxsw to handle switchdev notifications about VLAN add / delete on a VxLAN device enslaved to an offloaded VLAN-aware bridge. Patches #6-#8 add selftests to test the new functionality. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>