summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-12-18KVM: X86: Fix NULL deref in vcpu_scan_ioapicWanpeng Li
Reported by syzkaller: CPU: 1 PID: 5962 Comm: syz-executor118 Not tainted 4.20.0-rc6+ #374 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline] RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline] RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline] RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline] RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074 Call Trace: kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT14 msr and triggers scan ioapic logic to load synic vectors into EOI exit bitmap. However, irqchip is not initialized by this simple testcase, ioapic/apic objects should not be accessed. This patch fixes it by also considering whether or not apic is present. Reported-by: syzbot+39810e6c400efadfef71@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-18KVM: Fix UAF in nested posted interrupt processingCfir Cohen
nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It caches the kmap()ed page object and pointer, however, it doesn't handle errors correctly: it's possible to cache a valid pointer, then release the page and later dereference the dangling pointer. I was able to reproduce with the following steps: 1. Call vmlaunch with valid posted_intr_desc_addr but an invalid MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed pi_desc_page and pi_desc. Later the invalid EFER value fails check_vmentry_postreqs() which fails the first vmlaunch. 2. Call vmlanuch with a valid EFER but an invalid posted_intr_desc_addr (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages pi_desc_page is unmapped and released and pi_desc_page is set to NULL (the "shouldn't happen" clause). Due to the invalid posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and nested_get_vmcs12_pages() returns. It doesn't return an error value so vmlaunch proceeds. Note that at this time we have a dangling pointer in vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs. 3. Issue an IPI in L2 guest code. This triggers a call to vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which dereferences the dangling pointer. Vulnerable code requires nested and enable_apicv variables to be set to true. The host CPU must also support posted interrupts. Fixes: 5e2f30b756a37 "KVM: nVMX: get rid of nested_get_page()" Cc: stable@vger.kernel.org Reviewed-by: Andy Honig <ahonig@google.com> Signed-off-by: Cfir Cohen <cfir@google.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-18KVM: fix unregistering coalesced mmio zone from wrong busEric Biggers
If you register a kvm_coalesced_mmio_zone with '.pio = 0' but then unregister it with '.pio = 1', KVM_UNREGISTER_COALESCED_MMIO will try to unregister it from KVM_PIO_BUS rather than KVM_MMIO_BUS, which is a no-op. But it frees the kvm_coalesced_mmio_dev anyway, causing a use-after-free. Fix it by only unregistering and freeing the zone if the correct value of 'pio' is provided. Reported-by: syzbot+f87f60bb6f13f39b54e3@syzkaller.appspotmail.com Fixes: 0804c849f1df ("kvm/x86 : add coalesced pio support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-18Merge branch 'hns3-next'David S. Miller
Peng Li says: ==================== net: hns3: code optimizations & bugfixes for HNS3 driver This patchset includes bugfixes and code optimizations for the HNS3 ethernet controller driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: fix a SSU buffer checking bugYunsheng Lin
When caculating the SSU buffer, it first allocate tx and rx private buffer, then the remaining buffer is for rx shared buffer. The remaining buffer size should be at least bigger than or equal to the shared_std, which is the minimum shared buffer size required by the driver, but currently if the remaining buffer size is equal to the shared_std, it returns failure, which causes SSU buffer allocation failure problem. This patch fixes this problem by rounding up shared_std before checking the the remaining buffer size bigger than or equal to the shared_std. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: aligning buffer size in SSU to 256 bytesYunsheng Lin
The hardware expects the buffer size set to SSU is aligned to 256 bytes, this patch aligns the buffer size to 256 byte using roundup or rounddown function. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: getting tx and dv buffer size through firmwareYunsheng Lin
This patch adds support of getting tx and dv buffer size through firmware, because different version of hardware requires different size of tx and dv buffer. This patch also add dv_buf_size to tc' private buffer size even if pfc is not enable for the tc. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: synchronize speed and duplex from phy when phy link upPeng Li
Driver calls phy_connect_direct and registers hclge_mac_adjust_link to synchronize mac speed and duplex from phy. It is better to synchronize mac speed and duplex from phy when phy link up. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: remove 1000M/half support of phyFuyun Liang
Our phy does not support 1000M/half, this patch removes 1000M/half from PHY_SUPPORTED_FEATURES. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: update coalesce param per secondPeng Li
coalesce param updates every 100 napi times, it may update a little late if ping test after a high rate flow, may over napi poll is called 100 times as ping test sends packets every second. This patch updates coalesce param every second, instead with every 100 napi times. It can not update the param 100% in time, but the lag time is very short. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: fix incomplete uninitialization of IRQ in the ↵Huazhong Tan
hns3_nic_uninit_vector_data() In the hns3_nic_uninit_vector_data(), the procedure of uninitializing the tqp_vector's IRQ has not set affinity_notify to NULL and changes its init flag. This patch fixes it. And for simplificaton, local variable tqp_vector is used instead of priv->tqp_vector[i]. Fixes: 424eb834a9be ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: remove unnecessary configuration recapture while resettingHuazhong Tan
When doing reset, it is unnecessary to get the hardware's default configuration again, otherwise, the user's configuration will be overwritten. Fixes: 4ed340ab8f49 ("net: hns3: Add reset process in hclge_main") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: update some variables while hclge_reset()/hclgevf_reset() doneHuazhong Tan
When hclge_reset() completes successfully, it should update the last_reset_time, set reset_fail_cnt to 0, and set reset_type of hnae3_ae_dev to HNAE3_NONE_RESET. Also when hclgevf_reset() completes successfully, it should update the last_reset_time, and set reset_type of hnae3_ae_dev to HNAE3_NONE_RESET. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: fix napi_disable not return problemHuazhong Tan
While doing DOWN, the calling of napi_disable() may not return, since the napi_complete() in the hns3_nic_common_poll() will never be called when HNS3_NIC_STATE_DOWN is set. So we need to call napi_complete() before checking HNS3_NIC_STETE_DOWN. Fixes: ff0699e04b97 ("net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: uninitialize pci in the hclgevf_uninitHuazhong Tan
In the hclgevf_pci_reset(), it only uninitialize and initialize the msi, so if the initialization fails, hclgevf_uninit_hdev() does not need to uninitialize the msi, but needs to uninitialize the pci, otherwise it will cause pci resource not free. Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18net: hns3: fix error handling int the hns3_get_vector_ring_chainHuazhong Tan
When hns3_get_vector_ring_chain() failed in the hns3_nic_init_vector_data(), it should do the error handling instead of return directly. Also, cur_chain should be freed instead of chain and head->next should be set to NULL in error handling of hns3_get_vector_ring_chain. This patch fixes them. Fixes: 73b907a083b8 ("net: hns3: bugfix for buffer not free problem during resetting") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18VSOCK: Send reset control packet when socket is partially boundJorgen Hansen
If a server side socket is bound to an address, but not in the listening state yet, incoming connection requests should receive a reset control packet in response. However, the function used to send the reset silently drops the reset packet if the sending socket isn't bound to a remote address (as is the case for a bound socket not yet in the listening state). This change fixes this by using the src of the incoming packet as destination for the reset packet in this case. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-12-18 1) Fix error return code in xfrm_output_one() when no dst_entry is attached to the skb. From Wei Yongjun. 2) The xfrm state hash bucket count reported to userspace is off by one. Fix from Benjamin Poirier. 3) Fix NULL pointer dereference in xfrm_input when skb_dst_force clears the dst_entry. 4) Fix freeing of xfrm states on acquire. We use a dedicated slab cache for the xfrm states now, so free it properly with kmem_cache_free. From Mathias Krause. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18tools uapi asm: Update asm-generic/unistd.h copyArnaldo Carvalho de Melo
To get the change in: b7d624ab4312 ("asm-generic: unistd.h: fixup broken macro include.") That doesn't imply in any changes in the tools. This silences the following perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Guo Ren <ren_guo@c-sky.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-2e7xwm5i2qcc88jp2lyawdyd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf symbols: Relax checks on perf-PID.map ownershipArnaldo Carvalho de Melo
Those are simple enough, and usually not produced by root, instead by whatever user is running java, rust, Node.js JIT code that end up generating those /tmp/perf-PID.map for resolution of symbols in the anonymous executable maps. Having to use --force to resolve symbols in 'perf top' is a distraction, as recently I experienced when node.js symbols were not being resolved by 'perf top'. Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hítalo Silva <hitalos@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-tk2jgo2v4v2yjuj28axbpppo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Wire up the fadvise 'advice' table generatorArnaldo Carvalho de Melo
That ends up generating this: $ cat /tmp/build/perf/trace/beauty/generated/fadvise_advice_array.c static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zwbslubagram8a8zdc003u8h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf beauty: Add generator for fadvise64's 'advice' arg constantsArnaldo Carvalho de Melo
$ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ This has a hack wrt the s390 difference. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-tb7jguv01u8p570piq13eioh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18tools headers uapi: Grab a copy of fadvise.hArnaldo Carvalho de Melo
Will be used to generate the string table for fadvise64's 'advice' argument. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-muswpnft8q9krktv052yrgsc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf beauty mmap: Print mmap's 'offset' arg in hexadecimalArnaldo Carvalho de Melo
Also to make it match 'strace' output, for regression testing. Both now produce this option, when 'perf trace' uses a .perfconfig asking for the strace like output: mmap(0x7faf66e6a000, 1363968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7faf66e6a000 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-27qhouo1kaac2iyl85nfnsf5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf beauty mmap: Print PROT_READ before PROT_EXEC to match strace outputArnaldo Carvalho de Melo
Helps with comparing 'strace' and 'perf trace' output, for mutual regression testing. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-va0qe95xbhep5hy52aq5qe0v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace beauty: Beautify arch_prctl()'s argumentsArnaldo Carvalho de Melo
This actually so far, AFAIK is available only in x86, so the code was put in place with x86 prefixes, in arches where it is not available it will just not be called, so no further mechanisms are needed at this time. Later, when other arches wire this up, we'll just look at the uname (live sessions) or perf_env data in the perf.data header to auto-wire the right beautifier. With this the output is the same as produced by 'strace' when used with the following ~/.perfconfig: # cat ~/.perfconfig [llvm] dump-obj = true [trace] add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o show_zeros = yes show_duration = no no_inherit = yes show_timestamp = no show_arg_names = no args_alignment = -40 show_prefix = yes # And, on fedora 29, since the string tables are generated from the kernel sources, we don't know about 0x3001, just like strace: --- /tmp/strace 2018-12-17 11:22:08.707586721 -0300 +++ /tmp/trace 2018-12-18 11:11:32.037512729 -0300 @@ -1,49 +1,49 @@ -arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument) +arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe4eb93ae0) = -1 EINVAL (Invalid argument) -arch_prctl(ARCH_SET_FS, 0x7faf6700f540) = 0 +arch_prctl(ARCH_SET_FS, 0x7fb507364540) = 0 And that seems to be related to the CET/Shadow Stack feature, that userland in Fedora 29 (glibc 2.28) are querying the kernel about, that 0x3001 seems to be ARCH_CET_STATUS, I'll check the situation and test with a fedora 29 kernel to see if the other codes are used. A diff that ignores the different pointers for different runs needs to be put in place in the upcoming regression tests comparing 'perf trace's output to strace's. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-73a9prs8ktkrt97trtdmdjs8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: When showing string prefixes show prefix + ??? for unknown entriesArnaldo Carvalho de Melo
To match 'strace' output, like in: arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-kx59j2dk5l1x04ou57mt99ck@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Move strarrays to beauty.h for further reuseArnaldo Carvalho de Melo
We'll use it in the upcoming arch_prctl() 'code' arg beautifier. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6e4tj2fjen8qa73gy4u49vav@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf beauty: Wire up the x86_arch prctl code table generatorArnaldo Carvalho de Melo
$ cat /tmp/build/perf/trace/beauty/generated/x86_arch_prctl_code_array.c #define x86_arch_prctl_codes_1_offset 0x1001 static const char *x86_arch_prctl_codes_1[] = { [0x1001 - 0x1001]= "SET_GS", [0x1002 - 0x1001]= "SET_FS", [0x1003 - 0x1001]= "GET_FS", [0x1004 - 0x1001]= "GET_GS", [0x1011 - 0x1001]= "GET_CPUID", [0x1012 - 0x1001]= "SET_CPUID", }; #define x86_arch_prctl_codes_2_offset 0x2001 static const char *x86_arch_prctl_codes_2[] = { [0x2001 - 0x2001]= "MAP_VDSO_X32", [0x2002 - 0x2001]= "MAP_VDSO_32", [0x2003 - 0x2001]= "MAP_VDSO_64", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-3r9blij6n8wdlsyd5dujx86r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf beauty: Add a string table generator for x86's 'arch_prctl' codesArnaldo Carvalho de Melo
$ tools/perf/trace/beauty/x86_arch_prctl.sh #define x86_arch_prctl_codes_1_offset 0x1001 static const char *x86_arch_prctl_codes_1[] = { [0x1001 - 0x1001]= "SET_GS", [0x1002 - 0x1001]= "SET_FS", [0x1003 - 0x1001]= "GET_FS", [0x1004 - 0x1001]= "GET_GS", [0x1011 - 0x1001]= "GET_CPUID", [0x1012 - 0x1001]= "SET_CPUID", }; #define x86_arch_prctl_codes_2_offset 0x2001 static const char *x86_arch_prctl_codes_2[] = { [0x2001 - 0x2001]= "MAP_VDSO_X32", [0x2002 - 0x2001]= "MAP_VDSO_32", [0x2003 - 0x2001]= "MAP_VDSO_64", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-w0fux1psivphhx6rve8kn3vq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18tools include arch: Grab a copy of x86's prctl.hArnaldo Carvalho de Melo
We need it to generate the tables for the 'code' arch_prctl's syscall argument. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-vu890pi18fpd4eyz61cazckj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Show NULL when syscall pointer args are 0Arnaldo Carvalho de Melo
Matching strace's output format. The 'format' file for the syscall tracepoints have an indication if the arg is a pointer, with some exceptions like 'mmap' that has its first arg as an 'unsigned long', so use a heuristic using the argument name, i.e. if it contains the 'addr' substring, format it with the pointer formatter. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ddghemr8qrm6i0sb8awznbze@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Enclose the errno strings with ()Arnaldo Carvalho de Melo
To match strace, now both emit the same line for calls like: access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-krxl6klsqc9qyktoaxyih942@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf augmented_raw_syscalls: Copy 'access' arg as wellArnaldo Carvalho de Melo
This will all come from userspace, but to test the changes to make 'perf trace' output similar to strace's, do this one more now manually. To update the precompiled augmented_raw_syscalls.o binary I just run: # perf record -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1 LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB perf.data ] # Because to have augmented_raw_syscalls to be always used and a fast startup and remove the need to have the llvm toolchain installed, I'm using: # perf config | grep add_events trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o # So when doing changes to augmented_raw_syscals.c one needs to rebuild the .o file. This will be done automagically later, i.e. have a 'make' behaviour of recompiling when the .c gets changed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-lw3i2atyq8549fpqwmszn3qp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Add alignment spaces after the closing parensArnaldo Carvalho de Melo
To use strace's style, helping in comparing the output of 'perf trace' with the one from 'strace', to help in upcoming regression tests. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mw6peotz4n84rga0fk78buff@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace beauty: Print O_RDONLY when (flags & O_ACCMODE) == 0Arnaldo Carvalho de Melo
And there are more flags, to match strace's output. openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 Also to help with regression tests. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ofovpmvdli3bwch30936xn7t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Allow asking for not suppressing common string prefixesArnaldo Carvalho de Melo
So far we've been suppressing common stuff such as "MAP_" in the mmap flags, showing "SHARED" instead of "MAP_SHARED", allow for those prefixes (and a few suffixes) to be shown: # trace -e *map,open*,*seek sleep 1 openat("/etc/ld.so.cache", CLOEXEC) = 3 mmap(0, 109093, READ, PRIVATE, 3, 0) = 0x7ff61c695000 openat("/lib64/libc.so.6", CLOEXEC) = 3 lseek(3, 792, SET) = 792 mmap(0, 8192, READ|WRITE, PRIVATE|ANONYMOUS) = 0x7ff61c693000 lseek(3, 792, SET) = 792 lseek(3, 864, SET) = 864 mmap(0, 1857568, READ, PRIVATE|DENYWRITE, 3, 0) = 0x7ff61c4cd000 mmap(0x7ff61c4ef000, 1363968, EXEC|READ, PRIVATE|FIXED|DENYWRITE, 3, 139264) = 0x7ff61c4ef000 mmap(0x7ff61c63c000, 311296, READ, PRIVATE|FIXED|DENYWRITE, 3, 1503232) = 0x7ff61c63c000 mmap(0x7ff61c689000, 24576, READ|WRITE, PRIVATE|FIXED|DENYWRITE, 3, 1814528) = 0x7ff61c689000 mmap(0x7ff61c68f000, 14368, READ|WRITE, PRIVATE|FIXED|ANONYMOUS) = 0x7ff61c68f000 munmap(0x7ff61c695000, 109093) = 0 openat("/usr/lib/locale/locale-archive", CLOEXEC) = 3 mmap(0, 217749968, READ, PRIVATE, 3, 0) = 0x7ff60f523000 # # vim ~/.perfconfig # # perf config llvm.dump-obj=true trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o trace.show_zeros=yes trace.show_duration=no trace.no_inherit=yes trace.show_timestamp=no trace.show_arg_names=no trace.args_alignment=0 trace.string_quote=" trace.show_prefix=yes # # # trace -e *map,open*,*seek sleep 1 openat(AT_FDCWD, "/etc/ld.so.cache", O_CLOEXEC) = 3 mmap(0, 109093, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7ebbe59000 openat(AT_FDCWD, "/lib64/libc.so.6", O_CLOEXEC) = 3 lseek(3, 792, SEEK_SET) = 792 mmap(0, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7ebbe57000 lseek(3, 792, SEEK_SET) = 792 lseek(3, 864, SEEK_SET) = 864 mmap(0, 1857568, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7ebbc91000 mmap(0x7f7ebbcb3000, 1363968, PROT_EXEC|PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 139264) = 0x7f7ebbcb3000 mmap(0x7f7ebbe00000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1503232) = 0x7f7ebbe00000 mmap(0x7f7ebbe4d000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1814528) = 0x7f7ebbe4d000 mmap(0x7f7ebbe53000, 14368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f7ebbe53000 munmap(0x7f7ebbe59000, 109093) = 0 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_CLOEXEC) = 3 mmap(0, 217749968, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7eaece7000 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mtn1i4rjowjl72trtnbmvjd4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Add a prefix member to the strarray classArnaldo Carvalho de Melo
So that the user, in an upcoming patch, can select printing it to get the full string as used in the source code, not one with a common prefix chopped off so as to make the output more compact. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zypczc88gzbmeqx7b372s138@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Enclose strings with double quotesArnaldo Carvalho de Melo
To match 'strace' output, helping with upcoming regression tests comparing both outputs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-jab52t1dcuh6vlztqle9g7u9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18perf trace: Make the alignment of the syscall args be configurableArnaldo Carvalho de Melo
Since the start 'perf trace' aligns the parens enclosing the list of syscall args to align the syscall results, allow this to be configurable, keeping the default of 70. Using: # perf config llvm.dump-obj=true trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o trace.show_zeros=yes trace.show_duration=no trace.no_inherit=yes trace.show_timestamp=no trace.show_arg_names=no trace.args_alignment=0 # trace -e open*,close,*sleep sleep 1 openat(CWD, /etc/ld.so.cache, CLOEXEC) = 3 close(3) = 0 openat(CWD, /lib64/libc.so.6, CLOEXEC) = 3 close(3) = 0 openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC) = 3 close(3) = 0 nanosleep(0x7ffc00de66f0, 0) = 0 close(1) = 0 close(2) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r8cbhoz1lr5npq9tutpvoigr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18Merge tag 'spi-nor/for-4.21' of git://git.infradead.org/linux-mtd into mtd/nextBoris Brezillon
Core changes: - Parse the 4BAIT SFDP section - Add a bunch of SPI NOR entries to the flash_info table - Add the concept of SFDP fixups and use it to fix a bug on MX25L25635F - A bunch of minor cleanups/comestic changes
2018-12-18Merge tag 'nand/for-4.21' of git://git.infradead.org/linux-mtd into mtd/nextBoris Brezillon
NAND core changes: - kernel-doc miscellaneous fixes. - Third batch of fixes/cleanup to the raw NAND core impacting various controller drivers (ams-delta, marvell, fsmc, denali, tegra, vf610): * Stopping to pass mtd_info objects to internal functions * Reorganizing code to avoid forward declarations * Dropping useless test in nand_legacy_set_defaults() * Moving nand_exec_op() to internal.h * Adding nand_[de]select_target() helpers * Passing the CS line to be selected in struct nand_operation * Making ->select_chip() optional when ->exec_op() is implemented * Deprecating the ->select_chip() hook * Moving the ->exec_op() method to nand_controller_ops * Moving ->setup_data_interface() to nand_controller_ops * Deprecating the dummy_controller field * Fixing JEDEC detection * Providing a helper for polling GPIO R/B pin Raw NAND chip drivers changes: - Macronix: * Flagging 1.8V AC chips with a broken GET_FEATURES(TIMINGS) Raw NAND controllers drivers changes: - Ams-delta: * Fixing the error path * SPDX tag added * May be compiled with COMPILE_TEST=y * Conversion to ->exec_op() interface * Dropping .IOADDR_R/W use * Use GPIO API for data I/O - Denali: * Removing denali_reset_banks() * Removing ->dev_ready() hook * Including <linux/bits.h> instead of <linux/bitops.h> * Changes to comply with the above fixes/cleanup done in the core. - FSMC: * Adding an SPDX tag to replace the license text * Making conversion from chip to fsmc consistent * Fixing unchecked return value in fsmc_read_page_hwecc * Changes to comply with the above fixes/cleanup done in the core. - Marvell: * Preventing timeouts on a loaded machine (fix) * Changes to comply with the above fixes/cleanup done in the core. - OMAP2: * Pass the parent of pdev to dma_request_chan() (fix) - R852: * Use generic DMA API - sh_flctl: * Converting to SPDX identifiers - Sunxi: * Write pageprog related opcodes to the right register: WCMD_SET (fix) - Tegra: * Stop implementing ->select_chip() - VF610: * Adding an SPDX tag to replace the license text * Changes to comply with the above fixes/cleanup done in the core. - Various trivial/spelling/coding style fixes. SPI-NAND drivers changes: - Removing the depreacated mt29f_spinand driver from staging. - Adding support for: * Toshiba TC58CVG2S0H * GigaDevice GD5FxGQ4xA * Winbond W25N01GV
2018-12-18Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: The t10-pi one is a regression from the 4.19 release, the qla2xxx one is a 4.20 merge window regression and the bnx2fc is a very old bug" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: t10-pi: Return correct ref tag when queue has no integrity profile scsi: bnx2fc: Fix NULL dereference in error handling Revert "scsi: qla2xxx: Fix NVMe Target discovery"
2018-12-18Merge tag 'irqchip-4.21' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer) - Updates for new (and old) platforms (i.MX8MQ, F1C100s) - A number of SPDX cleanups - A workaround for a very broken GICv3 implementation - A platform-msi fix - Various cleanups
2018-12-18ALSA: xen-front: Use Xen common shared buffer implementationOleksandr Andrushchenko
Use page directory based shared buffer implementation now available as common code for Xen frontend drivers. Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-12-18Merge branch 'mlxsw-VXLAN-and-firmware-flashing-fixes'David S. Miller
Ido Schimmel says: ==================== mlxsw: VXLAN and firmware flashing fixes Patch #1 fixes firmware flashing failures by increasing the time period after which the driver fails the transaction with the firmware. The problem is explained in detail in the commit message. Patch #2 adds a missing trap for decapsulated ARP packets. It is necessary for VXLAN routing to work. Patch #3 fixes a memory leak during driver reload caused by NULLing a pointer before kfree(). Please consider patch #1 for 4.19.y ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18mlxsw: spectrum_nve: Fix memory leak upon driver reloadIdo Schimmel
The pointer was NULLed before freeing the memory, resulting in a memory leak. Trace from kmemleak: unreferenced object 0xffff88820ae36528 (size 512): comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330 [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0 [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690 [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260 [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360 [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220 [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0 [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150 [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180 [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0 [<00000000efc4eae8>] genl_rcv+0x2d/0x40 [<00000000ea645603>] netlink_unicast+0x52f/0x740 [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50 [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120 [<00000000d85795a9>] __sys_sendto+0x397/0x620 [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0 Fixes: 6e6030bd5412 ("mlxsw: spectrum_nve: Implement common NVE core") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18mlxsw: spectrum: Add trap for decapsulated ARP packetsIdo Schimmel
After a packet was decapsulated it is classified to the relevant FID based on its VNI and undergoes L2 forwarding. Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap decapsulated ARP packets during L2 forwarding and instead can only trap such packets in the underlay router during decapsulation. Add this missing packet trap, which is required for VXLAN routing when the MAC of the target host is not known. Fixes: b02597d513a9 ("mlxsw: spectrum: Add NVE packet traps") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18mlxsw: core: Increase timeout during firmware flash processShalom Toledo
During the firmware flash process, some of the EMADs get timed out, which causes the driver to send them again with a limit of 5 retries. There are some situations in which 5 retries is not enough and the EMAD access fails. If the failed EMAD was related to the flashing process, the driver fails the flashing. The reason for these timeouts during firmware flashing is cache misses in the CPU running the firmware. In case the CPU needs to fetch instructions from the flash when a firmware is flashed, it needs to wait for the flashing to complete. Since flashing takes time, it is possible for pending EMADs to timeout. Fix by increasing EMADs' timeout while flashing firmware. Fixes: ce6ef68f433f ("mlxsw: spectrum: Implement the ethtool flash_device callback") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18drm/xen-front: Use Xen common shared buffer implementationOleksandr Andrushchenko
Use page directory based shared buffer implementation now available as common code for Xen frontend drivers. Remove flushing of shared buffer on page flip as this workaround needs a proper fix. Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>