summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-07-31Merge tag 'wireless-drivers-for-davem-2018-07-31' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.18 Last set of fixes before 4.18 is released iwlwifi * add new IDs for cards already available on the market brcmfmac * fix a regression introduced in v4.17 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31ASoC: wcd9335: add CLASS-H Controller supportSrinivas Kandagatla
CLASS-H controller/Amplifier is common accorss Qualcomm WCD codec series. This patchset adds basic CLASS-H controller apis for WCD codecs after wcd9335 to use. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-31ASoC: wcd9335: add support to wcd9335 codecSrinivas Kandagatla
Qualcomm WCD9335 Codec is a standalone Hi-Fi audio codec IC, It supports both I2S/I2C and SLIMbus audio interfaces. On slimbus interface it supports two data lanes; 16 Tx ports and 8 Rx ports. It has Seven DACs and nine dedicated interpolators, Seven (six audio ADCs, and one VBAT ADC), Multibutton headset control (MBHC), Active noise cancellation and Sidetone paths and processing. This patchset adds very basic support for playback and capture via the 9 interpolators and ADC respectively. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-31ASoC: dt-bindings: add dt bindings for wcd9335 audio codecSrinivas Kandagatla
This patch adds bindings for wcd9335 audio codec which can support both SLIMbus and I2S/I2C interface. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-31xen/gntdev: don't dereference a null gntdev_dmabuf on allocation failureColin Ian King
Currently when the allocation of gntdev_dmabuf fails, the error exit path will call dmabuf_imp_free_storage and causes a null pointer dereference on gntdev_dmabuf. Fix this by adding an error exit path that won't free gntdev_dmabuf. Detected by CoverityScan, CID#1472124 ("Dereference after null check") Fixes: bf8dc55b1358 ("xen/gntdev: Implement dma-buf import functionality") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-07-31Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine fixes, five in the qla2xxx driver, the most serious of which is the uninitialized list head crash which can be observed in most systems under a sufficiently loaded low memory environment. The two sg fixes are minor but obvious and two target ones which seem reasonable but not high impact" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Return error when TMF returns scsi: qla2xxx: Fix ISP recovery on unload scsi: qla2xxx: Fix driver unload by shutting down chip scsi: qla2xxx: Fix NPIV deletion by calling wait_for_sess_deletion scsi: qla2xxx: Fix unintialized List head crash scsi: sg: update comment for blk_get_request() scsi: sg: fix minor memory leak in error path scsi: libiscsi: fix possible NULL pointer dereference in case of TMF scsi: target: iscsi: cxgbit: fix max iso npdu calculation
2018-07-31Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Some bugfixes that seem important and safe enough to merge at the last minute" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: fix another race between migration and ballooning tools/virtio: add kmalloc_array stub tools/virtio: add dma barrier stubs
2018-07-31Merge tag 'acpi-urgent-4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a recent ACPICA regression affecting control method execution at the table level and an earlier hibernation regression in the ACPI driver for Intel SoCs (LPSS) that was missed by a previous fix in this cycle. Specifics: - Fix a recent ACPICA regression introduced by a previous fix that caused control method execution at the table level to be mishandled by mistake (Erik Schmauss). - Fix a hibernation regression from the 4.15 cycle in the ACPI driver for Intel SoCs (LPSS) that caused the platform firmware to be confused during resume from hibernation by the driver's PM quirks which was fixed for system-wide suspend/resume (ACPI S3) earlier in this cycle, but that previous fix missed the hibernation (ACPI S4) case (Rafael Wysocki)" * tag 'acpi-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: AML Parser: ignore control method status in module-level code ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
2018-07-31PCI: Fix is_added/is_busmaster race conditionHari Vyas
When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283 Signed-off-by: Hari Vyas <hari.vyas@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-31s390/kdump: Fix elfcorehdr size calculationPhilipp Rudo
Before the memory for the elfcorehdr is allocated the required size is estimated with alloc_size = 0x1000 + get_cpu_cnt() * 0x4a0 + mem_chunk_cnt * sizeof(Elf64_Phdr); Where 0x4a0 is used as size for the ELF notes to store the register contend. This size is 8 bytes too small. Usually this does not immediately cause a problem because the page reserved for overhead (Elf_Ehdr, vmcoreinfo, etc.) is pretty generous. So usually there is enough spare memory to counter the mis-calculated per cpu size. However, with growing overhead and/or a huge cpu count the allocated size gets too small for the elfcorehdr. Ultimately a BUG_ON is triggered causing the crash kernel to panic. Fix this by properly calculating the required size instead of relying on magic numbers. Fixes: a62bc07392539 ("s390/kdump: add support for vector extension") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-31t10-pi: provide empty t10_pi_complete() for !CONFIG_BLK_DEV_INTEGRITYJens Axboe
Fixes a link failure whtn BLK_DEV_INTEGRITY isn't defined. Fixes: 10c41ddd6132 ("block: move dif_prepare/dif_complete functions to block layer") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-31perf bpf: Show better message when failing to load an objectArnaldo Carvalho de Melo
Before: libbpf: license of tools/perf/examples/bpf/etcsnoop.c is GPL libbpf: section(6) version, size 4, link 0, flags 3, type=1 libbpf: kernel version of tools/perf/examples/bpf/etcsnoop.c is 41200 libbpf: section(7) .symtab, size 120, link 1, flags 0, type=2 bpf: config program 'syscalls:sys_enter_openat' libbpf: load bpf program failed: Operation not permitted libbpf: failed to load program 'syscalls:sys_enter_openat' libbpf: failed to load object 'tools/perf/examples/bpf/etcsnoop.c' bpf: load objects failed After: (just the last line changes) bpf: load objects failed: err=-4009: (Incorrect kernel version) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-wi44iid0yjfht3lcvplc75fm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31spi: dw: document Microsemi integrationAlexandre Belloni
The integration of the Designware SPI controller on Microsemi SoCs requires an extra register set to be able to give the IP control of the SPI interface. Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-31perf list: Unify metric group description format with PMU event descriptionMichael Petlan
PMU event descriptions use 7 spaces + '[' or 8 spaces as indentation. Metric groups used a tab + '['. This patch unifies it to the way PMU event descriptions are indented. BEFORE: $ perf list [...] Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] [...] AFTER: $ perf list [...] Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] [...] Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> LPU-Reference: 771439042.22924766.1532986504631.JavaMail.zimbra@redhat.com Link: https://lkml.kernel.org/n/tip-mlo850517m6u1rbjndvd1bwr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf vendor events arm64: Update ThunderX2 implementation defined pmu core ↵Ganapatrao Kulkarni
events Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ganapatrao Kulkarni <gklkml16@gmail.com> Cc: Jan Glauber <jan.glauber@cavium.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Vadim Lomovtsev <vadim.lomovtsev@cavium.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180731100251.23575-1-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packetLeo Yan
CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, this patch is to add branch sample for CS_ETM_TRACE_ON packet if it is inserted in the middle of CS_ETM_RANGE packets; as result we can have hint for the trace discontinuity. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-7-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packetLeo Yan
If one CS_ETM_TRACE_ON packet is inserted, we miss to generate branch sample for the previous CS_ETM_RANGE packet. This patch is to generate branch sample when receiving a CS_ETM_TRACE_ON packet, so this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-6-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packetLeo Yan
For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value, but the dummy value is pointless for branch sample when we use 'perf script' command to check program flow. This patch is a preparation to support CS_ETM_TRACE_ON packet for branch sample, it converts the dummy address value to zero for more readable; this is accomplished by cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-5-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf cs-etm: Fix start tracing packet handlingLeo Yan
Usually the start tracing packet is a CS_ETM_TRACE_ON packet, this packet is passed to cs_etm__flush(); cs_etm__flush() will check the condition 'prev_packet->sample_type == CS_ETM_RANGE' but 'prev_packet' is allocated by zalloc() so 'prev_packet->sample_type' is zero in initialization and this condition is false. So cs_etm__flush() will directly bail out without handling the start tracing packet. This patch is to introduce a new sample type CS_ETM_EMPTY, which is used to indicate the packet is an empty packet. cs_etm__flush() will swap packets when it finds the previous packet is empty, so this can record the start tracing packet into 'etmq->prev_packet'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-4-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf build: Fix installation directory for eBPFThomas Richter
The perf tool build and install is controlled via a Makefile. The 'install' rule creates directories and copies files. Among them are header files installed in /usr/lib/include/perf/bpf/. However all listed examples are installing its header files in /usr/lib/<tool-name>/...[/include]/header.h and not in /usr/lib/include/<tool-name>/.../header.h. Background information: Building the Fedora 28 glibc RPM on s390x and s390 fails on s390 (gcc -m31) as gcc is not able to find header-files like stdbool.h. In the glibc.spec file, you can see that glibc is configured with "--with-headers". In this case, first -nostdinc is added to the CFLAGS and then further include paths are added via -isystem. One of those paths should contain header files like stdbool.h. In order to get this path, gcc is invoked with: - on Fedora 28 (with 4.18 kernel): $ gcc -print-file-name=include /usr/lib/gcc/s390x-redhat-linux/8/include $ gcc -m31 -print-file-name=include /usr/lib/gcc/s390x-redhat-linux/8/../../../../lib/include => If perf is installed, this is: /usr/lib/include On my machine this directory is only containing the directory "perf". If perf is not installed gcc returns: /usr/lib/gcc/s390x-redhat-linux/8/include - on Ubuntu 18.04 (with 4.15 kernel): $ gcc -print-file-name=include /usr/lib/gcc/s390x-linux-gnu/7/include $ gcc -m31 -print-file-name=include /usr/lib/gcc/s390x-linux-gnu/7/include => gcc returns the correct path even if perf is installed. In each case, the introduction of the subdirectory /usr/lib/include leads to the regression that one can not build the glibc RPM for s390 anymore as gcc can not find headers like stdbool.h. To remedy this install bpf.h to /usr/lib/perf/include/bpf/bpf.h Output before using the command 'perf test -Fv 40': echo '...[bpf-program-source]...' | /usr/bin/clang ... \ -I/root/lib/include/perf/bpf ... ^^^^^^^^^^^^ ... [root@p23lp27 perf]# perf test -F 40 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok [root@p23lp27 perf]# Output after using command 'perf test -Fv 40': echo '...[bpf-program-source]...' | /usr/bin/clang ... \ -I/root/lib/perf/include/bpf ... ^^^^^^^^^^^^ ... [root@p23lp27 perf]# perf test -F 40 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok [root@p23lp27 perf]# Committer testing: While the above 'perf test -F 40' (or 'perf test bpf') will allow us to see that the correct path is now added via -I, to actually test this we better try to use a bpf script that includes files in the changed directory. We have the files that now reside in /root/lib/perf/examples/bpf/ to do just that: # tail -8 /root/lib/perf/examples/bpf/5sec.c #include <bpf.h> int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec) { return sec == 5; } license(GPL); # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 4 0.333 (4000.086 ms): sleep/9248 nanosleep(rqtp: 0x7ffc155f3300) = 0 # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5 0.287 ( ): sleep/9659 nanosleep(rqtp: 0x7ffeafe38200) ... 0.290 ( ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5 0.287 (5000.059 ms): sleep/9659 ... [continued]: nanosleep()) = 0 # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 6 0.247 (5999.951 ms): sleep/10068 nanosleep(rqtp: 0x7fff2086d900) = 0 # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5.987 0.293 ( ): sleep/10489 nanosleep(rqtp: 0x7ffdd4fc10e0) ... 0.296 ( ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5 0.293 (5986.912 ms): sleep/10489 ... [continued]: nanosleep()) = 0 # Suggested-by: Stefan Liebler <stli@linux.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: 1b16fffa389d ("perf llvm-utils: Add bpf include path to clang command line") Link: http://lkml.kernel.org/r/20180731073254.91090-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf c2c report: Fix crash for empty browserJiri Olsa
'perf c2c' scans read/write accesses and tries to find false sharing cases, so when the events it wants were not asked for or ended up not taking place, we get no histograms. So do not try to display entry details if there's not any. Currently this ends up in crash: $ perf c2c report # then press 'd' perf: Segmentation fault $ Committer testing: Before: Record a perf.data file without events of interest to 'perf c2c report', then call it and press 'd': # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (6 samples) ] # perf c2c report perf: Segmentation fault -------- backtrace -------- perf[0x5b1d2a] /lib64/libc.so.6(+0x346df)[0x7fcb566e36df] perf[0x46fcae] perf[0x4a9f1e] perf[0x4aa220] perf(main+0x301)[0x42c561] /lib64/libc.so.6(__libc_start_main+0xe9)[0x7fcb566cff29] perf(_start+0x29)[0x42c999] # After the patch the segfault doesn't take place, a follow up patch to tell the user why nothing changes when 'd' is pressed would be good. Reported-by: rodia@autistici.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: f1c5fd4d0bb9 ("perf c2c report: Add TUI cacheline browser") Link: http://lkml.kernel.org/r/20180724062008.26126-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf tests: Fix indexing when invoking subtestsSandipan Das
Recently, the subtest numbering was changed to start from 1. While it is fine for displaying results, this should not be the case when the subtests are actually invoked. Typically, the subtests are stored in zero-indexed arrays and invoked based on the index passed to the main test function. Since the index now starts from 1, the second subtest in the array (index 1) gets invoked instead of the first (index 0). This applies to all of the following subtests but for the last one, the subtest always fails because it does not meet the boundary condition of the subtest index being lesser than the number of subtests. This can be observed on powerpc64 and x86_64 systems running Fedora 28 as shown below. Before: # perf test "builtin clang support" 55: builtin clang support : 55.1: builtin clang compile C source to IR : Ok 55.2: builtin clang compile C source to ELF object : FAILED! # perf test "LLVM search and compile" 38: LLVM search and compile : 38.1: Basic BPF llvm compile : Ok 38.2: kbuild searching : Ok 38.3: Compile source for BPF prologue generation : Ok 38.4: Compile source for BPF relocation : FAILED! # perf test "BPF filter" 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : FAILED! After: # perf test "builtin clang support" 55: builtin clang support : 55.1: builtin clang compile C source to IR : Ok 55.2: builtin clang compile C source to ELF object : Ok # perf test "LLVM search and compile" 38: LLVM search and compile : 38.1: Basic BPF llvm compile : Ok 38.2: kbuild searching : Ok 38.3: Compile source for BPF prologue generation : Ok 38.4: Compile source for BPF relocation : Ok # perf test "BPF filter" 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Fixes: 9ef0112442bd ("perf test: Fix subtest number when showing results") Link: http://lkml.kernel.org/r/20180726171733.33208-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' argsArnaldo Carvalho de Melo
For instance: $ trace -e socket* ssh sandy 0.000 ( 0.031 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 0.052 ( 0.015 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 1.568 ( 0.020 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 1.603 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 1.699 ( 0.014 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 1.724 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK ) = 3 1.804 ( 0.020 ms): ssh/19919 socket(family: INET, type: STREAM, protocol: TCP ) = 3 17.549 ( 0.098 ms): ssh/19919 socket(family: LOCAL, type: STREAM ) = 4 acme@sandy's password: Just like with other syscall args, the common bits are supressed so that the output is more compact, i.e. we use "TCP" instead of "IPPROTO_TCP", but we can make this show the original constant names if we like it by using some command line knob or ~/.perfconfig "[trace]" section variable. Also needed is to make perf's event parser accept things like: $ perf trace -e socket*/protocol=TCP/ By using both the tracefs event 'format' files and these tables built from the kernel sources. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-l39jz1vnyda0b6jsufuc8bz7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf trace beauty: Add beautifiers for 'socket''s 'protocol' argArnaldo Carvalho de Melo
It'll be wired to 'perf trace' in the next cset. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2i9vkvm1ik8yu4hgjmxhsyjv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf trace beauty: Do not print NULL strarray entriesArnaldo Carvalho de Melo
We may have string tables where not all slots have values, in those cases its better to print the numeric value, for instance: In the table below we would show "protocol: (null)" for socket_ipproto[3] Where it would be better to show "protocol: 3". $ tools/perf/trace/beauty/socket_ipproto.sh static const char *socket_ipproto[] = { [0] = "IP", [103] = "PIM", [108] = "COMP", [12] = "PUP", [132] = "SCTP", [136] = "UDPLITE", [137] = "MPLS", [17] = "UDP", [1] = "ICMP", [22] = "IDP", [255] = "RAW", [29] = "TP", [2] = "IGMP", [33] = "DCCP", [41] = "IPV6", [46] = "RSVP", [47] = "GRE", [4] = "IPIP", [50] = "ESP", [51] = "AH", [6] = "TCP", [8] = "EGP", [92] = "MTP", [94] = "BEETPH", [98] = "ENCAP", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-7djfak94eb3b9ltr79cpn3ti@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf beauty: Add a generator for IPPROTO_ socket's protocol constantsArnaldo Carvalho de Melo
It'll use tools/include copy of linux/in.h to generate a table to be used by tools, initially by the 'socket' and 'socketpair' beautifiers in 'perf trace', but that could also be used to translate from a string constant to the integer value to be used in a eBPF or tracefs tracepoint filter. When used without any args it produces: $ tools/perf/trace/beauty/socket_ipproto.sh static const char *socket_ipproto[] = { [0] = "IP", [103] = "PIM", [108] = "COMP", [12] = "PUP", [132] = "SCTP", [136] = "UDPLITE", [137] = "MPLS", [17] = "UDP", [1] = "ICMP", [22] = "IDP", [255] = "RAW", [29] = "TP", [2] = "IGMP", [33] = "DCCP", [41] = "IPV6", [46] = "RSVP", [47] = "GRE", [4] = "IPIP", [50] = "ESP", [51] = "AH", [6] = "TCP", [8] = "EGP", [92] = "MTP", [94] = "BEETPH", [98] = "ENCAP", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v9rafqh3qn6b9kp9vfvj9f8s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31tools include uapi: Grab a copy of linux/in.hArnaldo Carvalho de Melo
We'll use it to create tables for the 'protocol' argument to the socket syscall when the 'family' arg is one of AF_INET or AF_INET6. Add it to check_headers.sh so that when a new protocol gets added we get a notification during the build process. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2amnveu1ns4emjn70xuavpje@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf tests: Fix complex event name parsingSandipan Das
The 'umask' event parameter is unsupported on some architectures like powerpc64. This can be observed on a powerpc64le system running Fedora 27 as shown below. # perf test "Parse event definition strings" -v 6: Parse event definition strings : --- start --- test child forked, pid 45915 ... running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp'Invalid event/parameter 'umask' Invalid event/parameter 'umask' failed to parse event 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp', err 1, str 'unknown term' event syntax error: '..,event=0x2,umask=0x3/ukp' \___ unknown term valid terms: event,mark,pmc,cache_sel,pmcxsel,unit,thresh_stop,thresh_start,combine,thresh_sel,thresh_cmp,sample_mode,config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,no-overwrite,overwrite,driver-config mem_access -> cpu/event=0x10401e0/ running test 0 'config=10,config1,config2=3,umask=1' test child finished with 1 ---- end ---- Parse event definition strings: FAILED! Committer testing: After applying the patch these test passes and in verbose mode we get: # perf test -v "event definition" 6: Parse event definition strings: --- start --- test child forked, pid 11061 running test 0 'syscalls:sys_enter_openat'Using CPUID GenuineIntel-6-9E <SNIP> running test 53 'cycles/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'/Duk' running test 0 'cpu/config=10,config1,config2=3,period=1000/u' running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u' running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/' running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp' <SNIP> test child finished with 0 ---- end ---- Parse event definition strings: Ok # Suggested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Fixes: 06dc5bf21f3f ("perf tests: Check that complex event name is parsed correctly") Link: http://lkml.kernel.org/r/20180726105502.31670-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31ALSA: usb-audio: Operate UAC3 Power Domains in PCM callbacksJorge Sanjuan
Make use of UAC3 Power Domains associated to an Audio Streaming path within the PCM's logic. This means, when there is no audio being transferred (pcm is closed), the host will set the Power Domain associated to that substream to state D1. When audio is being transferred (from hw_params onwards), the Power Domain will be set to D0 state. This is the way the host lets the device know which Terminal is going to be actively used and it is for the device to manage its own internal resources on that UAC3 Power Domain. Note the resume method now sets the Power Domain to D1 state as resuming the device doesn't mean audio streaming will occur. Signed-off-by: Jorge Sanjuan <jorge.sanjuan@codethink.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-31ALSA: usb-audio: Add UAC3 Power Domains to suspend/resumeJorge Sanjuan
Set the UAC3 Power Domain state for an Audio Streaming interface to D2 state before suspending the device (usb_driver callback). This lets the device know there is no intention to use any of the Units in the Audio Function and that the host is not going to even listen for wake-up events (interrupts) on the units. When the usb_driver gets resumed, the state D0 (fully powered) will be set. This ties up the UAC3 Power Domains to the runtime PM. Signed-off-by: Jorge Sanjuan <jorge.sanjuan@codethink.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-31ALSA: usb-audio: AudioStreaming Power Domain parsingJorge Sanjuan
Power Domains in the UAC3 spec are mainly intended to be associated to an Input or Output Terminal so the host changes the power state of the entire capture or playback path within the topology. This patch adds support for finding Power Domains associated to an Audio Streaming Interface (bTerminalLink) and adds a reference to them in the usb audio substreams (snd_usb_substream). Signed-off-by: Jorge Sanjuan <jorge.sanjuan@codethink.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-31ALSA: usb-audio: Initial Power Domain supportJorge Sanjuan
Thee USB Audio Class 3 (UAC3) introduces Power Domains as a new feature to let a host turn individual parts of an audio function to different power states via USB requests. This lets the device get to know a bit amore about what the host is up to in order to optimize power consumption efficiently. The Power Domains are optional for UAC3 configuration but all UAC3 devices shall include at least one BADD configuration where the support for Power Domains is compulsory. This patch adds a set of features/helpers to parse these power domains and change their status. Signed-off-by: Jorge Sanjuan <jorge.sanjuan@codethink.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-31perf evlist: Fix error out while applying initial delay and LBRKan Liang
'perf record' will error out if both --delay and LBR are applied. For example: # perf record -D 1000 -a -e cycles -j any -- sleep 2 Error: dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' # A dummy event is added implicitly for initial delay, which has the same configurations as real sampling events. The dummy event is a software event. If LBR is configured, perf must error out. The dummy event will only be used to track PERF_RECORD_MMAP while perf waits for the initial delay to enable the real events. The BRANCH_STACK bit can be safely cleared for the dummy event. After applying the patch: # perf record -D 1000 -a -e cycles -j any -- sleep 2 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.054 MB perf.data (828 samples) ] # Reported-by: Sunil K Pandey <sunil.k.pandey@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1531145722-16404-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31perf trace beauty: Default header_dir to cwd to work without parmsArnaldo Carvalho de Melo
Useful when checking the effects of header synchs for the files it uses as a input to generate string tables, in retrospect this is how it should've been done from day 1, not requiring the header_dir to be set on the Makefile, will change everything later, so that the only parm, common to all generators will be $(srctree) and $(beauty_outdir). So, to see what it generates, just call it without any parameters: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh static const char *vhost_virtio_ioctl_cmds[] = { [0x00] = "SET_FEATURES", [0x01] = "SET_OWNER", [0x02] = "RESET_OWNER", [0x03] = "SET_MEM_TABLE", [0x04] = "SET_LOG_BASE", [0x07] = "SET_LOG_FD", [0x10] = "SET_VRING_NUM", [0x11] = "SET_VRING_ADDR", [0x12] = "SET_VRING_BASE", [0x13] = "SET_VRING_ENDIAN", [0x14] = "GET_VRING_ENDIAN", [0x20] = "SET_VRING_KICK", [0x21] = "SET_VRING_CALL", [0x22] = "SET_VRING_ERR", [0x23] = "SET_VRING_BUSYLOOP_TIMEOUT", [0x24] = "GET_VRING_BUSYLOOP_TIMEOUT", [0x30] = "NET_SET_BACKEND", [0x40] = "SCSI_SET_ENDPOINT", [0x41] = "SCSI_CLEAR_ENDPOINT", [0x42] = "SCSI_GET_ABI_VERSION", [0x43] = "SCSI_SET_EVENTS_MISSED", [0x44] = "SCSI_GET_EVENTS_MISSED", [0x60] = "VSOCK_SET_GUEST_CID", [0x61] = "VSOCK_SET_RUNNING", }; static const char *vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", [0x12] = "GET_VRING_BASE", }; $ Or: $ tools/perf/trace/beauty/sndrv_pcm_ioctl.sh static const char *sndrv_pcm_ioctl_cmds[] = { [0x00] = "PVERSION", [0x01] = "INFO", [0x02] = "TSTAMP", [0x03] = "TTSTAMP", [0x04] = "USER_PVERSION", [0x10] = "HW_REFINE", [0x11] = "HW_PARAMS", [0x12] = "HW_FREE", [0x13] = "SW_PARAMS", [0x20] = "STATUS", [0x21] = "DELAY", [0x22] = "HWSYNC", [0x23] = "SYNC_PTR", [0x24] = "STATUS_EXT", [0x32] = "CHANNEL_INFO", [0x40] = "PREPARE", [0x41] = "RESET", [0x42] = "START", [0x43] = "DROP", [0x44] = "DRAIN", [0x45] = "PAUSE", [0x46] = "REWIND", [0x47] = "RESUME", [0x48] = "XRUN", [0x49] = "FORWARD", [0x50] = "WRITEI_FRAMES", [0x51] = "READI_FRAMES", [0x52] = "WRITEN_FRAMES", [0x53] = "READN_FRAMES", [0x60] = "LINK", [0x61] = "UNLINK", }; $ Etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-90am4vm8hh1osms894dp2otr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pairArd Biesheuvel
Calling pmull_gcm_encrypt_block() requires kernel_neon_begin() and kernel_neon_end() to be used since the routine touches the NEON register file. Add the missing calls. Also, since NEON register contents are not preserved outside of a kernel mode NEON region, pass the key schedule array again. Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: kexec: Add comment to explain use of __flush_icache_range()Will Deacon
Now that we understand the deadlock arising from flush_icache_range() on the kexec crash kernel path, add a comment to justify the use of __flush_icache_range() here. Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: sdei: Mark sdei stack helper functions as staticWill Deacon
The SDEI stack helper functions are only used by _on_sdei_stack() and refer to symbols (e.g. sdei_stack_normal_ptr) that are only defined if CONFIG_VMAP_STACK=y. Mark these functions as static, so we don't run into errors at link-time due to references to undefined symbols. Stick all the parameters onto the same line whilst we're passing through. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64, kaslr: export offset in VMCOREINFO ELF notesBhupesh Sharma
Include KASLR offset in arm64 VMCOREINFO ELF notes to assist in debugging. vmcore parsing in user-space already expects this value in the notes and we are providing it for portability of those existing tools with x86. Ideally we would like core code to do this (so that way this information won't be missed when an architecture adds KASLR support), but mips has CONFIG_RANDOMIZE_BASE, and doesn't provide kaslr_offset(), so I am not sure if this is needed for mips (and other such similar arch cases in future). So, lets keep this architecture specific for now. As an example of a user-space use-case, consider the makedumpfile user-space utility which will need fixup to use this KASLR offset to work with cases where we need to find a way to translate symbol address from vmlinux to kernel run time address in case of KASLR boot on arm64. I have already submitted the makedumpfile user-space patch upstream and the maintainer has suggested to wait for the kernel changes to be included (see [0]). I tested this on my qualcomm amberwing board both for KASLR and non-KASLR boot cases: Without this patch: # cat > scrub.conf << EOF [vmlinux] erase jiffies erase init_task.utime for tsk in init_task.tasks.next within task_struct:tasks erase tsk.utime endfor EOF # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3} readpage_elf: Attempt to read non-existent page at 0xffffa8a5bf180000. readmem: type_addr: 1, addr:ffffa8a5bf180000, size:8 vaddr_to_paddr_arm64: Can't read pgd readmem: Can't convert a virtual address(ffff0000092a542c) to physical address. readmem: type_addr: 0, addr:ffff0000092a542c, size:390 check_release: Can't get the address of system_utsname After this patch check_release() is ok, and also we are able to erase symbol from vmcore (I checked this with kernel 4.18.0-rc4+): # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3} The kernel version is not supported. The makedumpfile operation may be incomplete. Checking for memory holes : [100.0 %] \ Checking for memory holes : [100.0 %] | Checking foExcluding unnecessary pages : [100.0 %] \ Excluding unnecessary pages : [100.0 %] \ The dumpfiles are saved to dumpfile_1, dumpfile_2, and dumpfile_3. makedumpfile Completed. [0] https://www.spinics.net/lists/kexec/msg21195.html Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: perf: Add cap_user_time aarch64Michael O'Farrell
It is useful to get the running time of a thread. Doing so in an efficient manner can be important for performance of user applications. Avoiding system calls in `clock_gettime` when handling CLOCK_THREAD_CPUTIME_ID is important. Other clocks are handled in the VDSO, but CLOCK_THREAD_CPUTIME_ID falls back on the system call. CLOCK_THREAD_CPUTIME_ID is not handled in the VDSO since it would have costs associated with maintaining updated user space accessible time offsets. These offsets have to be updated everytime the a thread is scheduled/descheduled. However, for programs regularly checking the running time of a thread, this is a performance improvement. This patch takes a middle ground, and adds support for cap_user_time an optional feature of the perf_event API. This way costs are only incurred when the perf_event api is enabled. This is done the same way as it is in x86. Ultimately this allows calculating the thread running time in userspace on aarch64 as follows (adapted from perf_event_open manpage): u32 seq, time_mult, time_shift; u64 running, count, time_offset, quot, rem, delta; struct perf_event_mmap_page *pc; pc = buf; // buf is the perf event mmaped page as documented in the API. if (pc->cap_usr_time) { do { seq = pc->lock; barrier(); running = pc->time_running; count = readCNTVCT_EL0(); // Read ARM hardware clock. time_offset = pc->time_offset; time_mult = pc->time_mult; time_shift = pc->time_shift; barrier(); } while (pc->lock != seq); quot = (count >> time_shift); rem = count & (((u64)1 << time_shift) - 1); delta = time_offset + quot * time_mult + ((rem * time_mult) >> time_shift); running += delta; // running now has the current nanosecond level thread time. } Summary of changes in the patch: For aarch64 systems, make arch_perf_update_userpage update the timing information stored in the perf_event page. Requiring the following calculations: - Calculate the appropriate time_mult, and time_shift factors to convert ticks to nano seconds for the current clock frequency. - Adjust the mult and shift factors to avoid shift factors of 32 bits. (possibly unnecessary) - The time_offset userspace should apply when doing calculations: negative the current sched time (now), because time_running and time_enabled fields of the perf_event page have just been updated. Toggle bits to appropriate values: - Enable cap_user_time Signed-off-by: Michael O'Farrell <micpof@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31efi/libstub: Only disable stackleak plugin for arm64Laura Abbott
arm64 uses the full KBUILD_CFLAGS for building libstub as opposed to x86 which doesn't. This means that x86 doesn't pick up the gcc-plugins. We need to disable the stackleak plugin but doing this unconditionally breaks x86 build since it doesn't have any plugins. Switch to disabling the stackleak plugin for arm64 only. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: drop unused kernel_neon_begin_partial() macroArd Biesheuvel
When kernel mode NEON was first introduced to the arm64 kernel, every call to kernel_neon_begin()/_end() stacked resp. unstacked the entire NEON register file, making it worthwile to reduce the number of used NEON registers to a bare minimum, and only stack those. kernel_neon_begin_partial() was introduced for this purpose, but after the refactoring for SVE and other changes, it no longer exists and was simply #define'd to kernel_neon_begin() directly. In the mean time, all users have been updated, so let's remove the fallback macro. Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31s390/cpum_sf: save TOD clock base in SDBs for time conversionHendrik Brueckner
Processing the samples in the AUX-area by perf requires the computation of respective time stamps. The time stamps used by perf are based on the monotonic clock. To convert the TOD clock value contained in an SDB to a monotonic clock value, the TOD clock base is required. Hence, also save the TOD clock base in the SDB. Suggested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-31cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platformsSrinivas Pandruvada
Dynamic boosting of HWP performance on IO wake showed significant improvement to IO workloads. This series was intended for Skylake Xeon platforms only and feature was enabled by default based on CPU model number. But some Xeon platforms reused the Skylake desktop CPU model number. This caused some undesirable side effects to some graphics workloads. Since they are heavily IO bound, the increase in CPU performance decreased the power available for GPU to do its computing and hence decrease in graphics benchmark performance. For example on a Skylake desktop, GpuTest benchmark showed average FPS reduction from 529 to 506. This change makes sure that HWP boost feature is only enabled for Skylake server platforms by using ACPI FADT preferred PM Profile. If some desktop users wants to get benefit of boost, they can still enable boost from intel_pstate sysfs attribute "hwp_dynamic_boost". Fixes: 41ab43c9c89e (cpufreq: intel_pstate: enable boost for Skylake Xeon) Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410 Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-31Merge branch 'acpi-soc'Rafael J. Wysocki
Merge a fix for hibernation regression in the ACPI driver for Intel SoCs (LPSS). * acpi-soc: ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
2018-07-31mtd: rawnand: allocate model parameter dynamicallyMiquel Raynal
Thanks to the migration of all drivers to use nand_scan() and the related nand_controller_ops, we can now allocate data during the detection phase. Let's do it first for the NAND model parameter which is allocated in nand_detect(). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-31mtd: rawnand: do not export nand_scan_[ident|tail]() anymoreMiquel Raynal
Both nand_scan_ident() and nand_scan_tail() helpers used to be called directly from controller drivers that needed to tweak some ECC-related parameters before nand_scan_tail(). This separation prevented dynamic allocations during the phase of NAND identification, which was inconvenient. All controller drivers have been moved to use nand_scan(), in conjunction with the chip->ecc.[attach|detach]_chip() hooks that actually do the required tweaking sequence between both ident/tail calls, allowing programmers to use dynamic allocation as they need all across the scanning sequence. Declare nand_scan_[ident|tail]() statically now. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-31mtd: rawnand: txx9ndfmc: convert driver to nand_scan()Miquel Raynal
Two helpers have been added to the core to do all kind of controller side configuration/initialization between the detection phase and the final NAND scan. Implement these hooks so that we can convert the driver to just use nand_scan() instead of the nand_scan_ident() + nand_scan_tail() pair. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-31mtd: rawnand: txx9ndfmc: clarify ECC parameters assignationMiquel Raynal
A comment in the probe declares that values are assigned to ecc.size and ecc.bytes, but these values will be overwritten. This is not entirely right as they are overwritten only if mtd->writesize >= 512. Let's clarify this by moving these assignations to txx9ndfmc_nand_scan(). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-31mtd: rawnand: tegra: convert driver to nand_scan()Miquel Raynal
Two helpers have been added to the core to do all kind of controller side configuration/initialization between the detection phase and the final NAND scan. Implement these hooks so that we can convert the driver to just use nand_scan() instead of the nand_scan_ident() + nand_scan_tail() pair. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>