linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-04-16	bpf: Move sanitize_val_alu out of op switch	Daniel Borkmann
	Add a small sanitize_needed() helper function and move sanitize_val_alu() out of the main opcode switch. In upcoming work, we'll move sanitize_ptr_alu() as well out of its opcode switch so this helps to streamline both. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Refactor and streamline bounds check into helper	Daniel Borkmann
	Move the bounds check in adjust_ptr_min_max_vals() into a small helper named sanitize_check_bounds() in order to simplify the former a bit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Improve verifier error messages for users	Daniel Borkmann
	Consolidate all error handling and provide more user-friendly error messages from sanitize_ptr_alu() and sanitize_val_alu(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Rework ptr_limit into alu_limit and add common error path	Daniel Borkmann
	Small refactor with no semantic changes in order to consolidate the max ptr_limit boundary check. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Ensure off_reg has no mixed signed bounds for all types	Daniel Borkmann
	The mixed signed bounds check really belongs into retrieve_ptr_limit() instead of outside of it in adjust_ptr_min_max_vals(). The reason is that this check is not tied to PTR_TO_MAP_VALUE only, but to all pointer types that we handle in retrieve_ptr_limit() and given errors from the latter propagate back to adjust_ptr_min_max_vals() and lead to rejection of the program, it's a better place to reside to avoid anything slipping through for future types. The reason why we must reject such off_reg is that we otherwise would not be able to derive a mask, see details in 9d7eceede769 ("bpf: restrict unknown scalars of mixed signed bounds for unprivileged"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Move off_reg into sanitize_ptr_alu	Daniel Borkmann
	Small refactor to drag off_reg into sanitize_ptr_alu(), so we later on can use off_reg for generalizing some of the checks for all pointer types. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	bpf: Use correct permission flag for mixed signed bounds arithmetic	Daniel Borkmann
	We forbid adding unknown scalars with mixed signed bounds due to the spectre v1 masking mitigation. Hence this also needs bypass_spec_v1 flag instead of allow_ptr_leaks. Fixes: 2c78ee898d8f ("bpf: Implement CAP_BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-04-16	blk-mq: Fix spurious debugfs directory creation during initialization	Saravanan D
	blk_mq_debugfs_register_sched_hctx() called from device_add_disk()->elevator_init_mq()->blk_mq_init_sched() initialization sequence does not have relevant parent directory setup and thus spuriously attempts "sched" directory creation from root mount of debugfs for every hw queue detected on the block device dmesg ... debugfs: Directory 'sched' with parent '/' already present! debugfs: Directory 'sched' with parent '/' already present! . . debugfs: Directory 'sched' with parent '/' already present! ... The parent debugfs directory for hw queues get properly setup device_add_disk()->blk_register_queue()->blk_mq_debugfs_register() ->blk_mq_debugfs_register_hctx() later in the block device initialization sequence. A simple check for debugfs_dir has been added to thwart premature debugfs directory/file creation attempts. Signed-off-by: Saravanan D <saravanand@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-16	cgroup: use tsk->in_iowait instead of delayacct_is_task_waiting_on_io()	Chunguang Xu
	If delayacct is disabled, then delayacct_is_task_waiting_on_io() always returns false, which causes the statistical value to be wrong. Perhaps tsk->in_iowait is better. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2021-04-16	tick/broadcast: Allow late registered device to enter oneshot mode	Jindong Yue
	The broadcast device is switched to oneshot mode when the system switches to oneshot mode. If a broadcast clock event device is registered after the system switched to oneshot mode, it will stay in periodic mode forever. Ensure that a late registered device which is selected as broadcast device is initialized in oneshot mode when the system already uses oneshot mode. [ tglx: Massage changelog ] Signed-off-by: Jindong Yue <jindong.yue@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210331083318.21794-1-jindong.yue@nxp.com
2021-04-16	tick: Use tick_check_replacement() instead of open coding it	Wang Wensheng
	The function tick_check_replacement() is the combination of tick_check_percpu() and tick_check_preferred(), but tick_check_new_device() has the same logic open coded. Use the helper to simplify the code. [ tglx: Massage changelog ] Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210326022328.3266-1-wangwensheng4@huawei.com
2021-04-16	time/timecounter: Mark 1st argument of timecounter_cyc2time() as const	Marc Kleine-Budde
	The timecounter is not modified in this function. Mark it as const. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210303103544.994855-1-mkl@pengutronix.de
2021-04-16	mtd: core: Constify buf in mtd_write_user_prot_reg()	Tudor Ambarus
	The write buffer comes from user and should be const. Constify write buffer in mtd core and across all _write_user_prot_reg() users. cfi_cmdset_{0001, 0002} and onenand_base will pay the cost of an explicit cast to discard the const qualifier since the beginning, since they are using an otp_op_t function prototype that is used for both reads and writes. mtd_dataflash and SPI NOR will benefit of the const buffer because they are using different paths for writes and reads. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210403060931.7119-1-tudor.ambarus@microchip.com
2021-04-16	Merge tag 'riscv-for-linus-5.12-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "A handful of fixes: - a fix to properly select SPARSEMEM_STATIC on rv32 - a few fixes to kprobes I don't generally like sending stuff this late, but these all seem pretty safe" * tag 'riscv-for-linus-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: keep interrupts disabled for BREAKPOINT exception riscv: kprobes/ftrace: Add recursion protection to the ftrace callback riscv: add do_page_fault and do_trap_break into the kprobes blacklist riscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"
2021-04-16	perf/amd/uncore: Fix sysfs type mismatch	Nathan Chancellor
	dev_attr_show() calls the __uncore_*_show() functions via an indirect call but their type does not currently match the type of the show() member in 'struct device_attribute', resulting in a Control Flow Integrity violation. $ cat /sys/devices/amd_l3/format/umask config:8-15 $ dmesg \| grep "CFI failure" [ 1258.174653] CFI failure (target: __uncore_umask_show...): Update the type in the DEFINE_UNCORE_FORMAT_ATTR macro to match 'struct device_attribute' so that there is no more CFI violation. Fixes: 06f2c24584f3 ("perf/amd/uncore: Prepare to scale for more attributes that vary per family") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210415001112.3024673-2-nathan@kernel.org
2021-04-16	x86/events/amd/iommu: Fix sysfs type mismatch	Nathan Chancellor
	dev_attr_show() calls _iommu_event_show() via an indirect call but _iommu_event_show()'s type does not currently match the type of the show() member in 'struct device_attribute', resulting in a Control Flow Integrity violation. $ cat /sys/devices/amd_iommu_1/events/mem_dte_hit csource=0x0a $ dmesg \| grep "CFI failure" [ 3526.735140] CFI failure (target: _iommu_event_show...): Change _iommu_event_show() and 'struct amd_iommu_event_desc' to 'struct device_attribute' so that there is no more CFI violation. Fixes: 7be6296fdd75 ("perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210415001112.3024673-1-nathan@kernel.org
2021-04-16	perf core: Add PERF_COUNT_SW_CGROUP_SWITCHES event	Namhyung Kim
	This patch adds a new software event to count context switches involving cgroup switches. So it's counted only if cgroups of previous and next tasks are different. Note that it only checks the cgroups in the perf_event subsystem. For cgroup v2, it shouldn't matter anyway. One can argue that we can do this by using existing sched_switch event with eBPF. But some systems might not have eBPF for some reason so I'd like to add this as a simple way. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210210083327.22726-2-namhyung@kernel.org
2021-04-16	perf core: Factor out __perf_sw_event_sched	Namhyung Kim
	In some cases, we need to check more than whether the software event is enabled. So split the condition check and the actual event handling. This is a preparation for the next change. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210210083327.22726-1-namhyung@kernel.org
2021-04-16	Merge tag 'arm64-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Fix kernel compilation when using the LLVM integrated assembly. A recent commit (2decad92f473, "arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically") broke the kernel build when using the LLVM integrated assembly (only noticeable with clang-12 as MTE is not supported by earlier versions and the code in question not compiled). The Fixes: tag in the commit refers to the original patch introducing subsections for the alternative code sequences" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: alternatives: Move length validation in alternative_{insn, endif}
2021-04-16	io_uring: fix merge error for async resubmit	Jens Axboe
	A hand-edit while applying this patch on top of a new base resulted in a reverted check for re-issue, resulting in spurious -EAGAIN errors. Fixes: 8c130827f417 ("io_uring: don't alter iopoll reissue fail ret code") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-16	io_uring: tie req->apoll to request lifetime	Jens Axboe
	We manage these separately right now, just tie it to the request lifetime and make it be part of the usual REQ_F_NEED_CLEANUP logic. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-16	io_uring: put flag checking for needing req cleanup in one spot	Jens Axboe
	We have this in two spots right now, which is a bit fragile. In preparation for moving REQ_F_POLLED cleanup into the same spot, move the check into a separate helper so we only have it once. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-16	drm/bridge: lt8912b: fix incorrect handling of of_* return values	Adrien Grassein
	A static analysis shows several issues in the driver code at probing time. DT parsing errors were bad handled and could lead to bugs: - Bad error detection; - Bad release of resources Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Adrien Grassein <adrien.grassein@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210415183639.1487-1-rdunlap@infradead.org Signed-off-by: Robert Foss <robert.foss@linaro.org>
2021-04-16	sched,fair: Alternative sched_slice()	Peter Zijlstra
	The current sched_slice() seems to have issues; there's two possible things that could be improved: - the 'nr_running' used for __sched_period() is daft when cgroups are considered. Using the RQ wide h_nr_running seems like a much more consistent number. - (esp) cgroups can slice it real fine, which makes for easy over-scheduling, ensure min_gran is what the name says. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.611897312@infradead.org
2021-04-16	sched: Move /proc/sched_debug to debugfs	Peter Zijlstra
	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.548833671@infradead.org
2021-04-16	sched,debug: Convert sysctl sched_domains to debugfs	Peter Zijlstra
	Stop polluting sysctl, move to debugfs for SCHED_DEBUG stuff. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/YHgB/s4KCBQ1ifdm@hirez.programming.kicks-ass.net
2021-04-16	debugfs: Implement debugfs_create_str()	Peter Zijlstra
	Implement debugfs_create_str() to easily display names and such in debugfs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.415407080@infradead.org
2021-04-16	sched,preempt: Move preempt_dynamic to debug.c	Peter Zijlstra
	Move the #ifdef SCHED_DEBUG bits to kernel/sched/debug.c in order to collect all the debugfs bits. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.353833279@infradead.org
2021-04-16	sched: Move SCHED_DEBUG sysctl to debugfs	Peter Zijlstra
	Stop polluting sysctl with undocumented knobs that really are debug only, move them all to /debug/sched/ along with the existing /debug/sched_* files that already exist. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.287610138@infradead.org
2021-04-16	sched: Don't make LATENCYTOP select SCHED_DEBUG	Peter Zijlstra
	SCHED_DEBUG is not in fact required for LATENCYTOP, don't select it. Suggested-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.224578981@infradead.org
2021-04-16	sched: Remove sched_schedstats sysctl out from under SCHED_DEBUG	Peter Zijlstra
	CONFIG_SCHEDSTATS does not depend on SCHED_DEBUG, it is inconsistent to have the sysctl depend on it. Suggested-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.161151631@infradead.org
2021-04-16	sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG	Mel Gorman
	The ability to enable/disable NUMA balancing is not a debugging feature and should not depend on CONFIG_SCHED_DEBUG. For example, machines within a HPC cluster may disable NUMA balancing temporarily for some jobs and re-enable it for other jobs without needing to reboot. This patch removes the dependency on CONFIG_SCHED_DEBUG for kernel.numa_balancing sysctl. The other numa balancing related sysctls are left as-is because if they need to be tuned then it is more likely that NUMA balancing needs to be fixed instead. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210324133916.GQ15768@suse.de
2021-04-16	sched: Use cpu_dying() to fix balance_push vs hotplug-rollback	Peter Zijlstra
	Use the new cpu_dying() state to simplify and fix the balance_push() vs CPU hotplug rollback state. Specifically, we currently rely on notifiers sched_cpu_dying() / sched_cpu_activate() to terminate balance_push, however if the cpu_down() fails when we're past sched_cpu_deactivate(), it should terminate balance_push at that point and not wait until we hit sched_cpu_activate(). Similarly, when cpu_up() fails and we're going back down, balance_push should be active, where it currently is not. So instead, make sure balance_push is enabled below SCHED_AP_ACTIVE (when !cpu_active()), and gate it's utility with cpu_dying(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/YHgAYef83VQhKdC2@hirez.programming.kicks-ass.net
2021-04-16	cpumask: Introduce DYING mask	Peter Zijlstra
	Introduce a cpumask that indicates (for each CPU) what direction the CPU hotplug is currently going. Notably, it tracks rollbacks. Eg. when an up fails and we do a roll-back down, it will accurately reflect the direction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210310150109.151441252@infradead.org
2021-04-16	cpumask: Make cpu_{online,possible,present,active}() inline	Peter Zijlstra
	Prepare for addition of another mask. Primarily a code movement to avoid having to create more #ifdef, but while there, convert everything with an argument to an inline function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210310150109.045447765@infradead.org
2021-04-16	Merge tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Daniel Vetter: "I pinged the usual suspects, only intel fixes pending" * tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm: drm/i915/display/vlv_dsi: Do not skip panel_pwr_cycle_delay when disabling the panel drm/i915: Don't zero out the Y plane's watermarks drm/i915/dpcd_bl: Don't try vesa interface unless specified by VBT
2021-04-16	dt-bindings: bcm4329-fmac: add optional brcm,ccode-map	Shawn Guo
	Add optional brcm,ccode-map property to support translation from ISO3166 country code to brcmfmac firmware country code and revision. The country revision is needed because the RF parameters that provide regulatory compliance are tweaked per platform/customer. So depending on the RF path tight to the chip, certain country revision needs to be specified. As such they could be seen as device specific calibration data which is a good fit into device tree. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/20210415104728.8471-2-shawn.guo@linaro.org Signed-off-by: Rob Herring <robh@kernel.org>
2021-04-16	perf/x86: Move cpuc->running into P4 specific code	Kan Liang
	The 'running' variable is only used in the P4 PMU. Current perf sets the variable in the critical function x86_pmu_start(), which wastes cycles for everybody not running on P4. Move cpuc->running into the P4 specific p4_pmu_enable_event(). Add a static per-CPU 'p4_running' variable to replace the 'running' variable in the struct cpu_hw_events. Saves space for the generic structure. The p4_pmu_enable_all() also invokes the p4_pmu_enable_event(), but it should not set cpuc->running. Factor out __p4_pmu_enable_event() for p4_pmu_enable_all(). Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618410990-21383-1-git-send-email-kan.liang@linux.intel.com
2021-04-16	selftests/perf_events: Add kselftest for remove_on_exec	Marco Elver
	Add kselftest to test that remove_on_exec removes inherited events from child tasks. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-9-elver@google.com
2021-04-16	selftests/perf_events: Add kselftest for process-wide sigtrap handling	Marco Elver
	Add a kselftest for testing process-wide perf events with synchronous SIGTRAP on events (using breakpoints). In particular, we want to test that changes to the event propagate to all children, and the SIGTRAPs are in fact synchronously sent to the thread where the event occurred. Note: The "signal_stress" test case is also added later in the series to perf tool's built-in tests. The test here is more elaborate in that respect, which on one hand avoids bloating the perf tool unnecessarily, but we also benefit from structured tests with TAP-compliant output that the kselftest framework provides. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-8-elver@google.com
2021-04-16	perf: Add support for SIGTRAP on perf events	Marco Elver
	Adds bit perf_event_attr::sigtrap, which can be set to cause events to send SIGTRAP (with si_code TRAP_PERF) to the task where the event occurred. The primary motivation is to support synchronous signals on perf events in the task where an event (such as breakpoints) triggered. To distinguish perf events based on the event type, the type is set in si_errno. For events that are associated with an address, si_addr is copied from perf_sample_data. The new field perf_event_attr::sig_data is copied to si_perf, which allows user space to disambiguate which event (of the same type) triggered the signal. For example, user space could encode the relevant information it cares about in sig_data. We note that the choice of an opaque u64 provides the simplest and most flexible option. Alternatives where a reference to some user space data is passed back suffer from the problem that modification of referenced data (be it the event fd, or the perf_event_attr) can race with the signal being delivered (of course, the same caveat applies if user space decides to store a pointer in sig_data, but the ABI explicitly avoids prescribing such a design). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
2021-04-16	signal: Introduce TRAP_PERF si_code and si_perf to siginfo	Marco Elver
	Introduces the TRAP_PERF si_code, and associated siginfo_t field si_perf. These will be used by the perf event subsystem to send signals (if requested) to the task where an event occurred. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Arnd Bergmann <arnd@arndb.de> # asm-generic Link: https://lkml.kernel.org/r/20210408103605.1676875-6-elver@google.com
2021-04-16	perf: Add support for event removal on exec	Marco Elver
	Adds bit perf_event_attr::remove_on_exec, to support removing an event from a task on exec. This option supports the case where an event is supposed to be process-wide only, and should not propagate beyond exec, to limit monitoring to the original process image only. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-5-elver@google.com
2021-04-16	perf: Support only inheriting events if cloned with CLONE_THREAD	Marco Elver
	Adds bit perf_event_attr::inherit_thread, to restricting inheriting events only if the child was cloned with CLONE_THREAD. This option supports the case where an event is supposed to be process-wide only (including subthreads), but should not propagate beyond the current process's shared environment. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/lkml/YBvj6eJR%2FDY2TsEB@hirez.programming.kicks-ass.net/
2021-04-16	perf: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children	Marco Elver
	As with other ioctls (such as PERF_EVENT_IOC_{ENABLE,DISABLE}), fix up handling of PERF_EVENT_IOC_MODIFY_ATTRIBUTES to also apply to children. Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lkml.kernel.org/r/20210408103605.1676875-3-elver@google.com
2021-04-16	perf: Rework perf_event_exit_event()	Peter Zijlstra
	Make perf_event_exit_event() more robust, such that we can use it from other contexts. Specifically the up and coming remove_on_exec. For this to work we need to address a few issues. Remove_on_exec will not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to disable event_function_call() and we thus have to use perf_remove_from_context(). When using perf_remove_from_context(), there's two races to consider. The first is against close(), where we can have concurrent tear-down of the event. The second is against child_list iteration, which should not find a half baked event. To address this, teach perf_remove_from_context() to special case !ctx->is_active and about DETACH_CHILD. [ elver@google.com: fix racing parent/child exit in sync_child_event(). ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com
2021-04-16	perf intel-pt: Use aux_watermark	Alexander Shishkin
	Turns out, the default setting of attr.aux_watermark to half of the total buffer size is not very useful, especially with smaller buffers. The problem is that, after half of the buffer is filled up, the kernel updates ->aux_head and sets up the next "transaction", while observing that ->aux_tail is still zero (as userspace haven't had the chance to update it), meaning that the trace will have to stop at the end of this second "transaction". This means, for example, that the second PERF_RECORD_AUX in every trace comes with TRUNCATED flag set. Setting attr.aux_watermark to quarter of the buffer gives enough space for the ->aux_tail update to be observed and prevents the data loss. The obligatory before/after showcase: > # perf_before record -e intel_pt//u -m,8 uname > Linux > [ perf record: Woken up 6 times to write data ] > Warning: > AUX data lost 4 times out of 10! > > [ perf record: Captured and wrote 0.099 MB perf.data ] > # perf record -e intel_pt//u -m,8 uname > Linux > [ perf record: Woken up 4 times to write data ] > [ perf record: Captured and wrote 0.039 MB perf.data ] The effect is still visible with large workloads and large buffers, although less pronounced. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210414154955.49603-3-alexander.shishkin@linux.intel.com
2021-04-16	perf: Cap allocation order at aux_watermark	Alexander Shishkin
	Currently, we start allocating AUX pages half the size of the total requested AUX buffer size, ignoring the attr.aux_watermark setting. This, in turn, makes intel_pt driver disregard the watermark also, as it uses page order for its SG (ToPA) configuration. Now, this can be fixed in the intel_pt PMU driver, but seeing as it's the only one currently making use of high order allocations, there is no reason not to fix the allocator instead. This way, any other driver wishing to add this support would not have to worry about this. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210414154955.49603-2-alexander.shishkin@linux.intel.com
2021-04-16	mmc: dw_mmc-rockchip: Just set default sample value for legacy mode	Shawn Lin
	.set_ios() is called from .resume() as well. For SDIO device which sets keep-power-in-suspend, nothing should be changed after resuming, as well as sample tuning value, since this value is tuned already. So we should not overwrite it with the default value. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://lore.kernel.org/r/1618539454-182170-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-04-16	spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails	Quanyang Wang
	The spi controller supports 44-bit address space on AXI in DMA mode, so set dma_addr_t width to 44-bit to avoid using a swiotlb mapping. In addition, if dma_map_single fails, it should return immediately instead of continuing doing the DMA operation which bases on invalid address. This fixes the following crash which occurs in reading a big block from flash: [ 123.633577] zynqmp-qspi ff0f0000.spi: swiotlb buffer is full (sz: 4194304 bytes), total 32768 (slots), used 0 (slots) [ 123.644230] zynqmp-qspi ff0f0000.spi: ERR:rxdma:memory not mapped [ 123.784625] Unable to handle kernel paging request at virtual address 00000000003fffc0 [ 123.792536] Mem abort info: [ 123.795313] ESR = 0x96000145 [ 123.798351] EC = 0x25: DABT (current EL), IL = 32 bits [ 123.803655] SET = 0, FnV = 0 [ 123.806693] EA = 0, S1PTW = 0 [ 123.809818] Data abort info: [ 123.812683] ISV = 0, ISS = 0x00000145 [ 123.816503] CM = 1, WnR = 1 [ 123.819455] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000805047000 [ 123.825887] [00000000003fffc0] pgd=0000000803b45003, p4d=0000000803b45003, pud=0000000000000000 [ 123.834586] Internal error: Oops: 96000145 [#1] PREEMPT SMP Fixes: 1c26372e5aa9 ("spi: spi-zynqmp-gqspi: Update driver to use spi-mem framework") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Link: https://lore.kernel.org/r/20210416004652.2975446-6-quanyang.wang@windriver.com Signed-off-by: Mark Brown <broonie@kernel.org>