summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-30arc: remove redundant GCC version checksMasahiro Yamada
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures. With GCC >= 4.6 assumed, 'upto_gcc44' is empty, 'atleast_gcc44' is y. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-08-31Merge branch 'pm-core'Rafael J. Wysocki
Merge a generic clock management fix for 4.19-rc2. * pm-core: PM / clk: signedness bug in of_pm_clk_add_clks()
2018-08-30clk: x86: Set default parent to 48MhzAkshu Agrawal
System clk provided in ST soc can be set to: 48Mhz, non-spread 25Mhz, spread To get accurate rate, we need it to set it at non-spread option which is 48Mhz. Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com> Reviewed-by: Daniel Kurtz <djkurtz@chromium.org> Fixes: 421bf6a1f061 ("clk: x86: Add ST oscout platform clock") Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-08-30i2c: sh_mobile: fix leak when using DMA bounce bufferWolfram Sang
We only freed the bounce buffer after successful DMA, missing the cases where DMA setup may have gone wrong. Use a better location which always gets called after each message and use 'stop_after_dma' as a flag for a successful transfer. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30i2c: sh_mobile: define start_ch() void as it only returns 0 anyhowWolfram Sang
After various refactoring over the years, start_ch() doesn't return errno anymore, so make the function return void. This saves the error handling when calling it which in turn eases cleanup of resources of a future patch. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30i2c: refactor function to release a DMA safe bufferWolfram Sang
a) rename to 'put' instead of 'release' to match 'get' when obtaining the buffer b) change the argument order to have the buffer as first argument c) add a new argument telling the function if the message was transferred. This allows the function to be used also in cases where setting up DMA failed, so the buffer needs to be freed without syncing to the message buffer. Also convert the only user. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30i2c: algos: bit: make the error messages grepableJan Kundrát
Yep, I went looking for one of these, and I wasn't able to find it easily. That's worse than a line which is 82-chars long, IMHO. Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30i2c: designware: Re-init controllers with pm_disabled set on resumeHans de Goede
On Bay Trail and Cherry Trail devices we set the pm_disabled flag for I2C busses which the OS shares with the PUNIT as these need special handling. Until now we called dev_pm_syscore_device(dev, true) for I2C controllers with this flag set to keep these I2C controllers always on. After commit 12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation"), this no longer works. This commit modifies lpss_iosf_exit_d3_state() to only run if lpss_iosf_enter_d3_state() has ran before it, so that it does not run on a resume from hibernate (or from S3). On these systems the conditions for lpss_iosf_enter_d3_state() to run never become true, so lpss_iosf_exit_d3_state() never gets called and the 2 LPSS DMA controllers never get forced into D0 mode, instead they are left in their default automatic power-on when needed mode. The not forcing of D0 mode for the DMA controllers enables these systems to properly enter S0ix modes, which is a good thing. But after entering S0ix modes the I2C controller connected to the PMIC no longer works, leading to e.g. broken battery monitoring. The _PS3 method for this I2C controller looks like this: Method (_PS3, 0, NotSerialized) // _PS3: Power State 3 { If ((((PMID == 0x04) || (PMID == 0x05)) || (PMID == 0x06))) { Return (Zero) } PSAT |= 0x03 Local0 = PSAT /* \_SB_.I2C5.PSAT */ } Where PMID = 0x05, so we enter the Return (Zero) path on these systems. So even if we were to not call dev_pm_syscore_device(dev, true) the I2C controller will be left in D0 rather then be switched to D3. Yet on other Bay and Cherry Trail devices S0ix is not entered unless *all* I2C controllers are in D3 mode. This combined with the I2C controller no longer working now that we reach S0ix states on these systems leads to me believing that the PUNIT itself puts the I2C controller in D3 when all other conditions for entering S0ix states are true. Since now the I2C controller is put in D3 over a suspend/resume we must re-initialize it afterwards and that does indeed fix it no longer working. This commit implements this fix by: 1) Making the suspend_late callback a no-op if pm_disabled is set and making the resume_early callback skip the clock re-enable (since it now was not disabled) while still doing the necessary I2C controller re-init. 2) Removing the dev_pm_syscore_device(dev, true) call, so that the suspend and resume callbacks are actually called. Normally this would cause the ACPI pm code to call _PS3 putting the I2C controller in D3, wreaking havoc since it is shared with the PUNIT, but in this special case the _PS3 method is a no-op so we can safely allow a "fake" suspend / resume. Fixes: 12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and resume ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=200861 Cc: 4.15+ <stable@vger.kernel.org> # 4.15+ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBusMika Westerberg
Commit 7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR") made it possible for AML code to access SMBus I/O ports by installing custom SystemIO OpRegion handler and blocking i80i driver access upon first AML read/write to this OpRegion. However, while ThinkPad T560 does have SystemIO OpRegion declared under the SMBus device, it does not access any of the SMBus registers: Device (SMBU) { ... OperationRegion (SMBP, PCI_Config, 0x50, 0x04) Field (SMBP, DWordAcc, NoLock, Preserve) { , 5, TCOB, 11, Offset (0x04) } Name (TCBV, 0x00) Method (TCBS, 0, NotSerialized) { If ((TCBV == 0x00)) { TCBV = (\_SB.PCI0.SMBU.TCOB << 0x05) } Return (TCBV) /* \_SB_.PCI0.SMBU.TCBV */ } OperationRegion (TCBA, SystemIO, TCBS (), 0x10) Field (TCBA, ByteAcc, NoLock, Preserve) { Offset (0x04), , 9, CPSC, 1 } } Problem with the current approach is that it blocks all I/O port access and because this system has touchpad connected to the SMBus controller after first AML access (happens during suspend/resume cycle) the touchpad fails to work anymore. Fix this so that we allow ACPI AML I/O port access if it does not touch the region reserved for the SMBus. Fixes: 7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR") Link: https://bugzilla.kernel.org/show_bug.cgi?id=200737 Reported-by: Yussuf Khalil <dev@pp3345.net> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-08-30Merge tag 'for-linus-20180830' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Small collection of fixes that should go into this series. This pull contains: - NVMe pull request with three small fixes (via Christoph) - Kill useless NULL check before kmem_cache_destroy (Chengguang Xu) - Xen block driver pull request with persistent grant flushing fixes (Juergen Gross) - Final wbt fixes, wrapping up the changes for this series. These have been heavily tested (me) - cdrom info leak fix (Scott Bauer) - ATA dma quirk for SQ201 (Linus Walleij) - Straight forward bsg refcount_t conversion (John Pittman)" * tag 'for-linus-20180830' of git://git.kernel.dk/linux-block: cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status nvmet: free workqueue object if module init fails nvme-fcloop: Fix dropped LS's to removed target port nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event block: bsg: move atomic_t ref_count variable to refcount API block: remove unnecessary condition check ata: ftide010: Add a quirk for SQ201 blk-wbt: remove dead code blk-wbt: improve waking of tasks blk-wbt: abstract out end IO completion handler xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring xen/blkback: move persistent grants flags to bool xen/blkfront: reorder tests in xlblk_init() xen/blkfront: cleanup stale persistent grants xen/blkback: don't keep persistent grants too long
2018-08-30ipmi: kcs_bmc: don't change device nameBenjamin Fair
kcs_bmc_alloc(...) calls dev_set_name(...) which is incorrect as most bus driver frameworks, platform_driver in particular, assume that they are able to set the device name themselves. Signed-off-by: Benjamin Fair <benjaminfair@google.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2018-08-30of: add node name compare helper functionsRob Herring
In preparation to remove device_node.name pointer, add helper functions for node name comparisons which are a common pattern throughout the kernel. Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
2018-08-30perf annotate: Fix parsing aarch64 branch instructions after objdump updateKim Phillips
Starting with binutils 2.28, aarch64 objdump adds comments to the disassembly output to show the alternative names of a condition code [1]. It is assumed that commas in objdump comments could occur in other arches now or in the future, so this fix is arch-independent. The fix could have been done with arm64 specific jump__parse and jump__scnprintf functions, but the jump__scnprintf instruction would have to have its comment character be a literal, since the scnprintf functions cannot receive a struct arch easily. This inconvenience also applies to the generic jump__scnprintf, which is why we add a raw_comment pointer to struct ins_operands, so the __parse function assigns it to be re-used by its corresponding __scnprintf function. Example differences in 'perf annotate --stdio2' output on an aarch64 perf.data file: BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b AFTER : ↓ b.cs 18c BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b AFTER : ↓ b.cc 31c The branch target labels 18c and 31c also now appear in the output: BEFORE: add x26, x29, #0x80 AFTER : 18c: add x26, x29, #0x80 BEFORE: add x21, x21, #0x8 AFTER : 31c: add x21, x21, #0x8 The Fixes: tag below is added so stable branches will get the update; it doesn't necessarily mean that commit was broken at the time, rather it didn't withstand the aarch64 objdump update. Tested no difference in output for sample x86_64, power arch perf.data files. [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730 Signed-off-by: Kim Phillips <kim.phillips@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anton Blanchard <anton@samba.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: b13bbeee5ee6 ("perf annotate: Fix branch instruction with multiple operands") Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf probe powerpc: Ignore SyS symbols irrespective of endiannessSandipan Das
This makes sure that the SyS symbols are ignored for any powerpc system, not just the big endian ones. Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: fb6d59423115 ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc") Link: http://lkml.kernel.org/r/20180828090848.1914-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30vfs: implement readahead(2) using POSIX_FADV_WILLNEEDAmir Goldstein
The implementation of readahead(2) syscall is identical to that of fadvise64(POSIX_FADV_WILLNEED) with a few exceptions: 1. readahead(2) returns -EINVAL for !mapping->a_ops and fadvise64() ignores the request and returns 0. 2. fadvise64() checks for integer overflow corner case 3. fadvise64() calls the optional filesystem fadvise() file operation Unite the two implementations by calling vfs_fadvise() from readahead(2) syscall. Check the !mapping->a_ops in readahead(2) syscall to preserve documented syscall ABI behaviour. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: d1d04ef8572b ("ovl: stack file ops") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-08-30perf event-parse: Use fixed size string for commsChris Phlipot
Some implementations of libc do not support the 'm' width modifier as part of the scanf string format specifier. This can cause the parsing to fail. Since the parser never checks if the scanf parsing was successesful, this can result in a crash. Change the comm string to be allocated as a fixed size instead of dynamically using 'm' scanf width modifier. This can be safely done since comm size is limited to 16 bytes by TASK_COMM_LEN within the kernel. This change prevents perf from crashing when linked against bionic as well as reduces the total number of heap allocations and frees invoked while accomplishing the same task. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830021950.15563-1-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf util: Fix bad memory access in trace info.Chris Phlipot
In the write to the output_fd in the error condition of record_saved_cmdline(), we are writing 8 bytes from a memory location on the stack that contains a primitive that is only 4 bytes in size. Change the primitive to 8 bytes in size to match the size of the write in order to avoid reading unknown memory from the stack. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180829061954.18871-1-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf tools: Streamline bpf examples and headers installationArnaldo Carvalho de Melo
We were emitting 4 lines, two of them misleading: make: Entering directory '/home/acme/git/perf/tools/perf' <SNIP> INSTALL lib INSTALL include/bpf INSTALL lib INSTALL examples/bpf <SNIP> make: Leaving directory '/home/acme/git/perf/tools/perf' Make it more compact by showing just two lines: make: Entering directory '/home/acme/git/perf/tools/perf' INSTALL bpf-headers INSTALL bpf-examples make: Leaving directory '/home/acme/git/perf/tools/perf' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0nvkyciqdkrgy829lony5925@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx()Hisao Tanabe
If evsel is NULL, we should return NULL to avoid a NULL pointer dereference a bit later in the code. Signed-off-by: Hisao Tanabe <xtanabe@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 03e0a7df3efd ("perf tools: Introduce bpf-output event") LPU-Reference: 20180824154556.23428-1-xtanabe@gmail.com Link: https://lkml.kernel.org/n/tip-e5plzjhx6595a5yjaf22jss3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf arm64: Fix include path for asm-generic/unistd.hKim Phillips
The new syscall table support for arm64 mistakenly used the system's asm-generic/unistd.h file when processing the tools/arch/arm64/include/uapi/asm/unistd.h file's include directive: #include <asm-generic/unistd.h> See "Committer notes" section of commit 2b5882435606 "perf arm64: Generate system call table from asm/unistd.h" for more details. This patch removes the committer's temporary workaround, and instructs the host compiler to search the build tree's include path for the right copy of the unistd.h file, instead of the one on the system's /usr/include path. It thus fixes the committer's test that cross-builds an arm64 perf on an x86 platform running Ubuntu 14.04.5 LTS with an old toolchain: $ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc `pwd`/tools tools/arch/arm64/include/uapi/asm/unistd.h | grep bpf [280] = "bpf", Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Fixes: 2b5882435606 ("perf arm64: Generate system call table from asm/unistd.h") Link: http://lkml.kernel.org/r/20180806172800.bbcec3cfcc51e2facc978bf2@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf/hw_breakpoint: Simplify breakpoint enable in perf_event_modify_breakpointJiri Olsa
We can safely enable the breakpoint back for both the fail and success paths by checking only the bp->attr.disabled, which either holds the new 'requested' disabled state or the original breakpoint state. Committer testing: At the end of the series, the 'perf test' entry introduced as the first patch now runs to completion without finding the fixed issues: # perf test "bp modify" 62: x86 bp modify : Ok # In verbose mode: # perf test -v "bp modify" 62: x86 bp modify : --- start --- test child forked, pid 5161 rip 5950a0, bp_1 0x5950a0 in bp_1 rip 5950a0, bp_1 0x5950a0 in bp_1 test child finished with 0 ---- end ---- x86 bp modify: Ok Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf/hw_breakpoint: Enable breakpoint in modify_user_hw_breakpointJiri Olsa
Currently we enable the breakpoint back only if the breakpoint modification was successful. If it fails we can leave the breakpoint in disabled state with attr->disabled == 0. We can safely enable the breakpoint back for both the fail and success paths by checking the bp->attr.disabled, which either holds the new 'requested' disabled state or the original breakpoint state. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf/hw_breakpoint: Remove superfluous bp->attr.disabled = 0Jiri Olsa
Once the breakpoint was succesfully modified, the attr->disabled value is in bp->attr.disabled. So there's no reason to set it again, removing that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf/hw_breakpoint: Modify breakpoint even if the new attr has disabled setJiri Olsa
We need to change the breakpoint even if the attr with new fields has disabled set to true. Current code prevents following user code to change the breakpoint address: ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[0]), addr_1) ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[0]), addr_2) ptrace(PTRACE_POKEUSER, child, offsetof(struct user, u_debugreg[7]), dr7) The first PTRACE_POKEUSER creates the breakpoint with attr.disabled set to true: ptrace_set_breakpoint_addr(nr = 0) struct perf_event *bp = t->ptrace_bps[nr]; ptrace_register_breakpoint(..., disabled = true) ptrace_fill_bp_fields(..., disabled) register_user_hw_breakpoint So the second PTRACE_POKEUSER will be omitted: ptrace_set_breakpoint_addr(nr = 0) struct perf_event *bp = t->ptrace_bps[nr]; struct perf_event_attr attr = bp->attr; modify_user_hw_breakpoint(bp, &attr) if (!attr->disabled) modify_user_hw_breakpoint_check Reported-by: Milind Chabbi <chabbi.milind@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf tests: Add breakpoint modify testsJiri Olsa
Adding to tests that aims on kernel breakpoint modification bugs. First test creates HW breakpoint, tries to change it and checks it was properly changed. It aims on kernel issue that prevents HW breakpoint to be changed via ptrace interface. The first test forks, the child sets itself as ptrace tracee and waits in signal for parent to trace it, then it calls bp_1 and quits. The parent does following steps: - creates a new breakpoint (id 0) for bp_2 function - changes that breakpoint to bp_1 function - waits for the breakpoint to hit and checks it has proper rip of bp_1 function This test aims on an issue in kernel preventing to change disabled breakpoints Second test mimics the first one except for few steps in the parent: - creates a new breakpoint (id 0) for bp_1 function - changes that breakpoint to bogus (-1) address - waits for the breakpoint to hit and checks it has proper rip of bp_1 function This test aims on an issue in kernel disabling enabled breakpoint after unsuccesful change. Committer testing: # uname -a Linux jouet 4.18.0-rc8-00002-g1236568ee3cb #12 SMP Tue Aug 7 14:08:26 -03 2018 x86_64 x86_64 x86_64 GNU/Linux # perf test -v "bp modify" 62: x86 bp modify : --- start --- test child forked, pid 25671 in bp_1 tracee exited prematurely 2 FAILED arch/x86/tests/bp-modify.c:209 modify test 1 failed test child finished with -1 ---- end ---- x86 bp modify: FAILED! # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf annotate: Properly interpret indirect callMartin Liška
The patch changes the parsing of: callq *0x8(%rbx) from: 0.26 │ → callq *8 to: 0.26 │ → callq *0x8(%rbx) in this case an address is followed by a register, thus one can't parse only the address. Committer testing: 1) run 'perf record sleep 10' 2) before applying the patch, run: perf annotate --stdio2 > /tmp/before 3) after applying the patch, run: perf annotate --stdio2 > /tmp/after 4) diff /tmp/before /tmp/after: --- /tmp/before 2018-08-28 11:16:03.238384143 -0300 +++ /tmp/after 2018-08-28 11:15:39.335341042 -0300 @@ -13274,7 +13274,7 @@ ↓ jle 128 hash_value = hash_table->hash_func (key); mov 0x8(%rsp),%rdi - 0.91 → callq *30 + 0.91 → callq *0x30(%r12) mov $0x2,%r8d cmp $0x2,%eax node_hash = hash_table->hashes[node_index]; @@ -13848,7 +13848,7 @@ mov %r14,%rdi sub %rbx,%r13 mov %r13,%rdx - → callq *38 + → callq *0x38(%r15) cmp %rax,%r13 1.91 ↓ je 240 1b4: mov $0xffffffff,%r13d @@ -14026,7 +14026,7 @@ mov %rcx,-0x500(%rbp) mov %r15,%rsi mov %r14,%rdi - → callq *38 + → callq *0x38(%rax) mov -0x500(%rbp),%rcx cmp %rax,%rcx ↓ jne 9b0 <SNIP tons of other such cases> Signed-off-by: Martin Liška <mliska@suse.cz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/bd1f3932-be2b-85f9-7582-111ee0a43b07@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30Merge tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull mtd fixes from Boris Brezillon: "Raw NAND fixes: - denali: Fix a regression caused by the nand_scan() rework - docg4: Fix a build error when gcc decides to not iniline some functions (can be reproduced with gcc 4.1.2): * tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd: mtd: rawnand: denali: do not pass zero maxchips to nand_scan() mtd: rawnand: docg4: Remove wrong __init annotations
2018-08-30Merge tag 'mmc-v4.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix unsupported parallel dispatch of requests MMC host: - atmel-mci/android-goldfish: Fixup logic of sg_copy_{from,to}_buffer - renesas_sdhi_internal_dmac: Prevent IRQ-storm due of DMAC IRQs - renesas_sdhi_internal_dmac: Fixup bad register offset" * tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: renesas_sdhi_internal_dmac: mask DMAC interrupts mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS mmc: block: Fix unsupported parallel dispatch of requests mmc: android-goldfish: fix bad logic of sg_copy_{from,to}_buffer conversion mmc: atmel-mci: fix bad logic of sg_copy_{from,to}_buffer conversion
2018-08-30tools/kvm_stat: re-animate display of dead guestsStefan Raspl
When filtering by guest (interactive commands 'p'/'g'), and the respective guest was destroyed, detect when the guest is up again through the guest name if possible. I.e. when displaying events for a specific guest, it is not necessary anymore to restart kvm_stat in case the guest is restarted. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: indicate dead guests as suchStefan Raspl
For destroyed guests, kvm_stat essentially freezes with the last data displayed. This is acceptable for users, in case they want to inspect the final data. But it looks a bit irritating. Therefore, detect this situation and display a respective indicator in the header. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: handle guest removals more gracefullyStefan Raspl
When running with the DebugFS provider, removal of a guest can result in a negative CurAvg/s, which looks rather confusing. If so, suppress the body refresh and print a message instead. To reproduce, have at least one guest A completely booted. Then start another guest B (which generates a huge amount of events), then destroy B. On the next refresh, kvm_stat should display a whole lot of negative values in the CurAvg/s column. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: don't reset stats when setting PID filter for debugfsStefan Raspl
When setting a PID filter in debugfs, we unnecessarily reset the statistics, although there is no reason to do so. This behavior was merely introduced with commit 9f114a03c6854f "tools/kvm_stat: add interactive command 'r'", most likely to mimic the behavior of the tracepoints provider in this respect. However, there are plenty of differences between the two providers, so there is no reason not to take advantage of the possibility to filter by PID without resetting the statistics. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: fix updates for dead guestsStefan Raspl
With pid filtering active, when a guest is removed e.g. via virsh shutdown, successive updates produce garbage. Therefore, we add code to detect this case and prevent further body updates. Note that when displaying the help dialog via 'h' in this case, once we exit we're stuck with the 'Collecting data...' message till we remove the filter. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: fix handling of invalid paths in debugfs providerStefan Raspl
When filtering by guest, kvm_stat displays garbage when the guest is destroyed - see sample output below. We add code to remove the invalid paths from the providers, so at least no more garbage is displayed. Here's a sample output to illustrate: kvm statistics - pid 13986 (foo) Event Total %Total CurAvg/s diagnose_258 -2 0.0 0 deliver_program_interruption -3 0.0 0 diagnose_308 -4 0.0 0 halt_poll_invalid -91 0.0 -6 deliver_service_signal -244 0.0 -16 halt_successful_poll -250 0.1 -17 exit_pei -285 0.1 -19 exit_external_request -312 0.1 -21 diagnose_9c -328 0.1 -22 userspace_handled -713 0.1 -47 halt_attempted_poll -939 0.2 -62 deliver_emergency_signal -3126 0.6 -208 halt_wakeup -7199 1.5 -481 exit_wait_state -7379 1.5 -493 diagnose_500 -56499 11.5 -3757 exit_null -85491 17.4 -5685 diagnose_44 -133300 27.1 -8874 exit_instruction -195898 39.8 -13037 Total -492063 Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30tools/kvm_stat: fix python3 issuesStefan Raspl
Python3 returns a float for a regular division - switch to a division operator that returns an integer. Furthermore, filters return a generator object instead of the actual list - wrap result in yet another list, which makes it still work in both, Python2 and 3. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30vfs: add the fadvise() file operationAmir Goldstein
This is going to be used by overlayfs and possibly useful for other filesystems. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-08-30Documentation/filesystems: update documentation of file_operationsAmir Goldstein
...to kernel 4.18. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-08-30ovl: fix GPF in swapfile_activate of file from overlayfs over xfsAmir Goldstein
Since overlayfs implements stacked file operations, the underlying filesystems are not supposed to be exposed to the overlayfs file, whose f_inode is an overlayfs inode. Assigning an overlayfs file to swap_file results in an attempt of xfs code to dereference an xfs_inode struct from an ovl_inode pointer: CPU: 0 PID: 2462 Comm: swapon Not tainted 4.18.0-xfstests-12721-g33e17876ea4e #3402 RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f Call Trace: xfs_iomap_swapfile_activate+0x1f/0x43 __se_sys_swapon+0xb1a/0xee9 Fix this by not assigning the real inode mapping to f_mapping, which will cause swapon() to return an error (-EINVAL). Although it makes sense not to allow setting swpafile on an overlayfs file, some users may depend on it, so we may need to fix this up in the future. Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT open to fail. Fix this by installing ovl_aops with noop_direct_IO in overlay inode mapping. Keeping f_mapping pointing to overlay inode mapping will cause other a_ops related operations to fail (e.g. readahead()). Those will be fixed by follow up patches. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: f7c72396d0de ("ovl: add O_DIRECT support") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-08-30ovl: respect FIEMAP_FLAG_SYNC flagAmir Goldstein
Stacked overlayfs fiemap operation broke xfstests that test delayed allocation (with "_test_generic_punch -d"), because ovl_fiemap() failed to write dirty pages when requested. Fixes: 9e142c4102db ("ovl: add ovl_fiemap()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-08-30KVM: x86: Unexport x86_emulate_instruction()Sean Christopherson
Allowing x86_emulate_instruction() to be called directly has led to subtle bugs being introduced, e.g. not setting EMULTYPE_NO_REEXECUTE in the emulation type. While most of the blame lies on re-execute being opt-out, exporting x86_emulate_instruction() also exposes its cr2 parameter, which may have contributed to commit d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") using x86_emulate_instruction() instead of emulate_instruction() because "hey, I have a cr2!", which in turn introduced its EMULTYPE_NO_REEXECUTE bug. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()Sean Christopherson
Lack of the kvm_ prefix gives the impression that it's a VMX or SVM specific function, and there's no conflict that prevents adding the kvm_ prefix. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: Do not re-{try,execute} after failed emulation in L2Sean Christopherson
Commit a6f177efaa58 ("KVM: Reenter guest after emulation failure if due to access to non-mmio address") added reexecute_instruction() to handle the scenario where two (or more) vCPUS race to write a shadowed page, i.e. reexecute_instruction() is intended to return true if and only if the instruction being emulated was accessing a shadowed page. As L0 is only explicitly shadowing L1 tables, an emulation failure of a nested VM instruction cannot be due to a race to write a shadowed page and so should never be re-executed. This fixes an issue where an "MMIO" emulation failure[1] in L2 is all but guaranteed to result in an infinite loop when TDP is enabled. Because "cr2" is actually an L2 GPA when TDP is enabled, calling kvm_mmu_gva_to_gpa_write() to translate cr2 in the non-direct mapped case (L2 is never direct mapped) will almost always yield UNMAPPED_GVA and cause reexecute_instruction() to immediately return true. The !mmio_info_in_cache() check in kvm_mmu_page_fault() doesn't catch this case because mmio_info_in_cache() returns false for a nested MMU (the MMIO caching currently handles L1 only, e.g. to cache nested guests' GPAs we'd have to manually flush the cache when switching between VMs and when L1 updated its page tables controlling the nested guest). Way back when, commit 68be0803456b ("KVM: x86: never re-execute instruction with enabled tdp") changed reexecute_instruction() to always return false when using TDP under the assumption that KVM would only get into the emulator for MMIO. Commit 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") effectively reverted that behavior in order to handle the scenario where emulation failed due to an access from L1 to the shadow page tables for L2, but it didn't account for the case where emulation failed in L2 with TDP enabled. All of the above logic also applies to retry_instruction(), added by commit 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions"). An indefinite loop in retry_instruction() should be impossible as it protects against retrying the same instruction over and over, but it's still correct to not retry an L2 instruction in the first place. Fix the immediate issue by adding a check for a nested guest when determining whether or not to allow retry in kvm_mmu_page_fault(). In addition to fixing the immediate bug, add WARN_ON_ONCE in the retry functions since they are not designed to handle nested cases, i.e. they need to be modified even if there is some scenario in the future where we want to allow retrying a nested guest. [1] This issue was encountered after commit 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2") changed the page fault path to return KVM_PFN_NOSLOT when translating an L2 access to a prive memslot. Returning KVM_PFN_NOSLOT is semantically correct when we want to hide a memslot from L2, i.e. there effectively is no defined memory region for L2, but it has the unfortunate side effect of making KVM think the GFN is a MMIO page, thus triggering emulation. The failure occurred with in-development code that deliberately exposed a private memslot to L2, which L2 accessed with an instruction that is not emulated by KVM. Fixes: 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") Fixes: 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Jim Mattson <jmattson@google.com> Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com> Cc: Xiao Guangrong <xiaoguangrong@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_faultSean Christopherson
Effectively force kvm_mmu_page_fault() to opt-in to allowing retry to make it more obvious when and why it allows emulation to be retried. Previously this approach was less convenient due to retry and re-execute behavior being controlled by separate flags that were also inverted in their implementations (opt-in versus opt-out). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: Merge EMULTYPE_RETRY and EMULTYPE_ALLOW_REEXECUTESean Christopherson
retry_instruction() and reexecute_instruction() are a package deal, i.e. there is no scenario where one is allowed and the other is not. Merge their controlling emulation type flags to enforce this in code. Name the combined flag EMULTYPE_ALLOW_RETRY to make it abundantly clear that we are allowing re{try,execute} to occur, as opposed to explicitly requesting retry of a previously failed instruction. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: Invert emulation re-execute behavior to make it opt-inSean Christopherson
Re-execution of an instruction after emulation decode failure is intended to be used only when emulating shadow page accesses. Invert the flag to make allowing re-execution opt-in since that behavior is by far in the minority. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: x86: SVM: Set EMULTYPE_NO_REEXECUTE for RSM emulationSean Christopherson
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of RSM emulation. Add a new helper, kvm_emulate_instruction_from_buffer(), to support emulating from a pre-defined buffer. This eliminates the last direct call to x86_emulate_instruction() outside of kvm_mmu_page_fault(), which means x86_emulate_instruction() can be unexported in a future patch. Fixes: 7607b7174405 ("KVM: SVM: install RSM intercept") Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instrSean Christopherson
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of MMIO emulation. As handle_ept_misconfig() is only used for MMIO emulation, it should pass EMULTYPE_NO_REEXECUTE when using the emulator to skip an instr in the fast-MMIO case where VM_EXIT_INSTRUCTION_LEN is invalid. And because the cr2 value passed to x86_emulate_instruction() is only destined for use when retrying or reexecuting, we can simply call emulate_instruction(). Fixes: d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: SVM: remove unused variable dst_vaddr_endColin Ian King
Variable dst_vaddr_end is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: variable 'dst_vaddr_end' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30KVM: nVMX: avoid redundant double assignment of nested_run_pendingVitaly Kuznetsov
nested_run_pending is set 20 lines above and check_vmentry_prereqs()/ check_vmentry_postreqs() don't seem to be resetting it (the later, however, checks it). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Eduardo Valentin <eduval@amazon.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-08-30ALSA: hda - Fix cancel_work_sync() stall from jackpoll workTakashi Iwai
On AMD/ATI controllers, the HD-audio controller driver allows a bus reset upon the error recovery, and its procedure includes the cancellation of pending jack polling work as found in snd_hda_bus_codec_reset(). This works usually fine, but it becomes a problem when the reset happens from the jack poll work itself; then calling cancel_work_sync() from the work being processed tries to wait the finish endlessly. As a workaround, this patch adds the check of current_work() and applies the cancel_work_sync() only when it's not from the jackpoll_work. This doesn't fix the root cause of the reported error below, but at least, it eases the unexpected stall of the whole system. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200937 Cc: <stable@vger.kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>