summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-06-28Merge tag 'timers-nohz-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers/nohz updates from Ingo Molnar: - Micro-optimize tick_nohz_full_cpu() - Optimize idle exit tick restarts to be less eager - Optimize tick_nohz_dep_set_task() to only wake up a single CPU. This reduces IPIs and interruptions on nohz_full CPUs. - Optimize tick_nohz_dep_set_signal() in a similar fashion. - Skip IPIs in tick_nohz_kick_task() when trying to kick a non-running task. - Micro-optimize tick_nohz_task_switch() IRQ flags handling to reduce context switching costs. - Misc cleanups and fixes * tag 'timers-nohz-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add myself as context tracking maintainer tick/nohz: Call tick_nohz_task_switch() with interrupts disabled tick/nohz: Kick only _queued_ task whose tick dependency is updated tick/nohz: Change signal tick dependency to wake up CPUs of member tasks tick/nohz: Only wake up a single target cpu when kicking a task tick/nohz: Update nohz_full Kconfig help tick/nohz: Update idle_exittime on actual idle exit tick/nohz: Remove superflous check for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE tick/nohz: Conditionally restart tick on idle exit tick/nohz: Evaluate the CPU expression after the static key
2021-06-28Merge tag 'sched-core-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler udpates from Ingo Molnar: - Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There are new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. * tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/doc: Update the CPU capacity asymmetry bits sched/topology: Rework CPU capacity asymmetry detection sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag psi: Fix race between psi_trigger_create/destroy sched/fair: Introduce the burstable CFS controller sched/uclamp: Fix uclamp_tg_restrict() sched/rt: Fix Deadline utilization tracking during policy change sched/rt: Fix RT utilization tracking during policy change sched: Change task_struct::state sched,arch: Remove unused TASK_STATE offsets sched,timer: Use __set_current_state() sched: Add get_current_state() sched,perf,kvm: Fix preemption condition sched: Introduce task_is_running() sched: Unbreak wakeups sched/fair: Age the average idle time sched/cpufreq: Consider reduced CPU capacity in energy calculation sched/fair: Take thermal pressure into account while estimating energy thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure sched/fair: Return early from update_tg_cfs_load() if delta == 0 ...
2021-06-28Merge tag 'perf-core-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Platform PMU driver updates: - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers - Fix RDPMC support - Fix [extended-]PEBS-via-PT support - Fix Sapphire Rapids event constraints - Fix :ppp support on Sapphire Rapids - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU - Other heterogenous-PMU fixes - Kprobes: - Remove the unused and misguided kprobe::fault_handler callbacks. - Warn about kprobes taking a page fault. - Fix the 'nmissed' stat counter. - Misc cleanups and fixes. * tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix task context PMU for Hetero perf/x86/intel: Fix instructions:ppp support in Sapphire Rapids perf/x86/intel: Add more events requires FRONTEND MSR on Sapphire Rapids perf/x86/intel: Fix fixed counter check warning for some Alder Lake perf/x86/intel: Fix PEBS-via-PT reload base value for Extended PEBS perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task kprobes: Do not increment probe miss count in the fault handler x86,kprobes: WARN if kprobes tries to handle a fault kprobes: Remove kprobe::fault_handler uprobes: Update uprobe_write_opcode() kernel-doc comment perf/hw_breakpoint: Fix DocBook warnings in perf hw_breakpoint perf/core: Fix DocBook warnings perf/core: Make local function perf_pmu_snapshot_aux() static perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR perf/x86/intel/uncore: Generalize I/O stacks to PMON mapping procedure perf/x86/intel/uncore: Drop unnecessary NULL checks after container_of()
2021-06-28Merge tag 'locking-core-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. * tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) locking/lockdep: Correct the description error for check_redundant() futex: Provide FUTEX_LOCK_PI2 to support clock selection futex: Prepare futex_lock_pi() for runtime clock selection lockdep/selftest: Remove wait-type RCU_CALLBACK tests lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING lockdep: Fix wait-type for empty stack locking/selftests: Add a selftest for check_irq_usage() lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage() locking/lockdep: Remove the unnecessary trace saving locking/lockdep: Fix the dep path printing for backwards BFS selftests: futex: Add futex compare requeue test selftests: futex: Add futex wait test seqlock: Remove trailing semicolon in macros locking/lockdep: Reduce LOCKDEP dependency list locking/lockdep,doc: Improve readability of the block matrix locking/atomics: atomic-instrumented: simplify ifdeffery locking/atomic: delete !ARCH_ATOMIC remnants locking/atomic: xtensa: move to ARCH_ATOMIC locking/atomic: sparc: move to ARCH_ATOMIC locking/atomic: sh: move to ARCH_ATOMIC ...
2021-06-28Merge tags 'objtool-urgent-2021-06-28' and 'objtool-core-2021-06-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix and updates from Ingo Molnar: "An ELF format fix for a section flags mismatch bug that breaks kernel tooling such as kpatch-build. The biggest change in this cycle is the new code to handle and rewrite variable sized jump labels - which results in slightly tighter code generation in hot paths, through the use of short(er) NOPs. Also a number of cleanups and fixes, and a change to the generic include/linux/compiler.h to handle a s390 GCC quirk" * tag 'objtool-urgent-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Don't make .altinstructions writable * tag 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Improve reloc hash size guestimate instrumentation.h: Avoid using inline asm operand modifiers compiler.h: Avoid using inline asm operand modifiers kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n' objtool: Reflow handle_jump_alt() jump_label/x86: Remove unused JUMP_LABEL_NOP_SIZE jump_label, x86: Allow short NOPs objtool: Provide stats for jump_labels objtool: Rewrite jump_label instructions objtool: Decode jump_entry::key addend jump_label, x86: Emit short JMP jump_label: Free jump_entry::key bit1 for build use jump_label, x86: Add variable length patching support jump_label, x86: Introduce jump_entry_size() jump_label, x86: Improve error when we fail expected text jump_label, x86: Factor out the __jump_table generation jump_label, x86: Strip ASM jump_label support x86, objtool: Dont exclude arch/x86/realmode/ objtool: Rewrite hashtable sizing
2021-06-28Merge tag 'x86_cpu_for_v5.14_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Borislav Petkov: - New AMD models support - Allow MONITOR/MWAIT to be used for C1 state entry on Hygon too - Use the special RAPL CPUID bit to detect the functionality on AMD and Hygon instead of doing family matching. - Add support for new Intel microcode deprecating TSX on some models and do not enable kernel workarounds for those CPUs when TSX transactions always abort, as a result of that microcode update. * tag 'x86_cpu_for_v5.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsx: Clear CPUID bits when TSX always force aborts x86/events/intel: Do not deploy TSX force abort workaround when TSX is deprecated x86/msr: Define new bits in TSX_FORCE_ABORT MSR perf/x86/rapl: Use CPUID bit on AMD and Hygon parts x86/cstate: Allow ACPI C1 FFH MWAIT use on Hygon systems x86/amd_nb: Add AMD family 19h model 50h PCI ids x86/cpu: Fix core name for Sapphire Rapids
2021-06-28Merge tag 'hwmon-for-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon updates from Guenter Roeck: "New drivers: - Delta DPS920AB - Flex PIM4006, PIM4328 and PIM4820 - MPS MP2888 - Sensirion SHT4X Added chip support to existing drivers: - Flex BMR310, BMR456, BMR457, BMR458, BMR480, BMR490, BMR491, and BMR492 - TI TMP1075 - Renesas ZLS1003, ZLS4009 and ZL8802 Other: - Dropped explicit ACPI support for MAX31722 and LM70; the APIC IDs in those drivers do not exist. - Support set_trips() callback into thermal subsystem - Minor fixes and improvements in various drivers" * tag 'hwmon-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (49 commits) hwmon: Support set_trips() of thermal device ops hwmon: (lm90) Prevent integer underflows of temperature calculations hwmon: (lm90) Disable interrupt on suspend hwmon: (lm90) Unmask hardware interrupt hwmon: (lm90) Use hwmon_notify_event() hwmon: (lm90) Don't override interrupt trigger type hwmon: (pmbus/dps920ab) Delete some dead code hwmon: (ntc_thermistor) Drop unused headers. MAINTAINERS: Add Delta DPS920AB PSU driver dt-bindings: trivial-devices: Add Delta DPS920AB hwmon: (pmbus) Add driver for Delta DPS-920AB PSU hwmon: (pmbus/pim4328) Add documentation for the pim4328 PMBus driver hwmon: (pmbus/pim4328) Add PMBus driver for PIM4006, PIM4328 and PIM4820 hwmon: (pmbus) Allow phase function even if it's not on page hwmon: (pmbus) Add support for reading direct mode coefficients hwmon: (pmbus) Add new pmbus flag NO_WRITE_PROTECT docs: hwmon: adm1177.rst: avoid using ReSt :doc:`foo` markup hwmon: (pmbus_core) Check adapter PEC support hwmon: (ina3221) use CVRF only for single-shot conversion hwmon: (max31790) Detect and report zero fan speed ...
2021-06-28Merge tag 'spi-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "The biggest single thing in the diffstat here is a massive overhaul of the PXA2xx driver from Andy Shevchenko (the IP is still in use on modern Intel systems), though we also have quite a lot of core work as well: - Better support for mixing native and GPIO chip selects also from Andy. - Support for devices with multiple chip selects from Sebastian Reichel. - Helper for polling status registers in spi-mem from Patrice Chotard. - Support for Renesas RZ/N1 and Rockchip RV1126" * tag 'spi-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (86 commits) spi: core: add dma_map_dev for dma device spi: convert Xilinx Zynq UltraScale+ MPSoC GQSPI bindings to YAML spi: Fix self assignment issue with ancillary->mode spi: spi-sh-msiof: : use proper DMAENGINE API for termination spi: spi-rspi: : use proper DMAENGINE API for termination spi: spi-rockchip: add description for rv1126 spi: rockchip: Support SPI_CS_HIGH spi: rockchip: Support cs-gpio spi: rockchip: Wait for STB status in slave mode tx_xfer spi: rockchip: Set rx_fifo interrupt waterline base on transfer item spi: rockchip: add compatible string for rv1126 spi: spi-sun6i: Fix chipselect/clock bug spi: dt-bindings: support devices with multiple chipselects spi: add ancillary device support spi: xilinx: convert to yaml spi: convert Cadence SPI bindings to YAML spi: stm32-qspi: Remove unused qspi field of struct stm32_qspi_flash spi: add of_device_uevent_modalias support spi: meson-spicc: fix memory leak in meson_spicc_probe spi: meson-spicc: fix a wrong goto jump for avoiding memory leak. ...
2021-06-28Merge tag 'regulator-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "The main core change this release is generic support for handling of hardware errors from Matti Vaittinen, including some small updates to the reboot and thermal code so we can share support for powering off the system if things are going wrong enough. Otherwise this release we've mainly seen the addition of new drivers, including MT6359 which has pulled in some small changes from the MFD tree for build dependencies. - Support for controlling the trigger points for hardware error detection, and shared handlers for this. - Support for Maxim MAX8993, Mediatek MT6359 and MT6359P, Qualcomm PM8226 and SA8115P-ADP, and Sylergy TCS4526" * tag 'regulator-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (91 commits) regulator: bd9576: Fix uninitializes variable may_have_irqs regulator: max8893: Select REGMAP_I2C to fix build error regulator: da9052: Ensure enough delay time for .set_voltage_time_sel regulator: mt6358: Fix vdram2 .vsel_mask regulator: hi6421v600: Fix setting wrong driver_data MAINTAINERS: Add reviewer for regulator irq_helpers regulator: bd9576: Fix the driver name in id table regulator: bd9576: Support error reporting regulator: bd9576 add FET ON-resistance for OCW regulator: add property parsing and callbacks to set protection limits regulator: IRQ based event/error notification helpers regulator: move rdev_print helpers to internal.h regulator: add warning flags thermal: Use generic HW-protection shutdown API reboot: Add hardware protection power-off regulator: Add protection limit properties regulator: hi6421v600: Fix setting idle mode regulator: Add MAX8893 bindings regulator: max8893: add regulator driver regulator: hi6421: Use correct variable type for regmap api val argument ...
2021-06-28Merge tag 'regmap-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "The big thing this release is support for accessing the register maps of MDIO devices via the framework. We've also added support for 7/17 register formats on bytestream transports and inverted status registers in regmap-irq" * tag 'regmap-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: mdio: Reject invalid addresses regmap: mdio: Fix regmap_bus pointer constness regmap: mdio: Add clause-45 support regmap: mdio: Clean up invalid clause-22 addresses regmap-irq: Introduce inverted status registers support regmap: add support for 7/17 register formating regmap: mdio: Don't modify output if error happened regmap: Add MDIO bus support regmap-i2c: Set regmap max raw r/w from quirks
2021-06-28Merge tag 'mmc-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds
Pull MMC and MEMSTICK updates from Ulf Hansson: "MMC core: - Add support for Cache Ctrl for SD cards - Add support for Power Off Notification for SD cards - Add support for read/write of the SD function extension registers - Allow broken eMMC HS400 mode to be disabled via DT - Allow UHS-I voltage switch for SDSC cards if supported - Disable command queueing in the ioctl path - Enable eMMC sleep commands to use HW busy polling to minimize delay - Extend re-use of the common polling loop to standardize behaviour - Take into account MMC_CAP_NEED_RSP_BUSY for eMMC HPI commands MMC host: - jz4740: Add support for the JZ4775 variant - sdhci-acpi: Disable write protect detection on Toshiba Encore 2 WT8-B - sdhci-esdhc-imx: Advertise HS400 support through MMC caps - sdhci-esdhc-imx: Enable support for system wakeup for SDIO - sdhci-iproc: Add support for the legacy sdhci controller on the BCM7211 - vub3000: Fix control-request direction MEMSTICK: - A couple of fixes/cleanups" * tag 'mmc-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (54 commits) mmc: sdhci-iproc: Add support for the legacy sdhci controller on the BCM7211 dt-bindings: mmc: sdhci-iproc: Add brcm,bcm7211a0-sdhci mmc: JZ4740: Add support for JZ4775 dt-bindings: mmc: JZ4740: Add bindings for JZ4775 mmc: sdhci-esdhc-imx: Enable support for system wakeup for SDIO mmc: Improve function name when aborting a tuning cmd mmc: sdhci-of-aspeed: Turn down a phase correction warning mmc: debugfs: add description for module parameter mmc: via-sdmmc: add a check against NULL pointer dereference mmc: sdhci-sprd: use sdhci_sprd_writew mmc: sdhci-esdhc-imx: remove unused is_imx6q_usdhc mmc: core: Allow UHS-I voltage switch for SDSC cards if supported mmc: mmc_spi: Imply container_of() to be no-op mmc: mmc_spi: Drop duplicate 'mmc_spi' in the debug messages mmc: dw_mmc-pltfm: Remove unused <linux/clk.h> mmc: sdhci-of-aspeed: Configure the SDHCIs as specified by the devicetree. mmc: core: Add a missing SPDX license header mmc: vub3000: fix control-request direction mmc: sdhci-omap: Use pm_runtime_resume_and_get() to replace open coding mmc: sdhci_am654: Use pm_runtime_resume_and_get() to replace open coding ...
2021-06-28Merge tag 'for-5.14/libata-2021-06-27' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull libata updates from Jens Axboe: "The big change in this round is that we're finally in a position where we can sanely remove the old drivers/ide/ code, as libata covers everything we need by now. This is exciting for two reasons: 1) we delete a lot of legacy code that doesn't really meet the standards we have today, and 2) it enables us to clean up various bits in the block layer that exist only because of the old IDE code. Outside of that, just a few minor fixes here, fixups for warnings, etc" * tag 'for-5.14/libata-2021-06-27' of git://git.kernel.dk/linux-block: (29 commits) ata: rb532_cf: remove redundant codes ide: remove the legacy ide driver m68k: use libata instead of the legacy ide driver ARM: disable CONFIG_IDE in pxa_defconfig ARM: disable CONFIG_IDE in footbridge_defconfig alpha: use libata instead of the legacy ide driver pata_cypress: add a module option to disable BM-DMA ata: pata_macio: Avoid overwriting initialised field in 'pata_macio_sht' ata: pata_serverworks: Avoid overwriting initialised field in 'serverworks_osb4_sht ata: pata_sc1200: sc1200_sht'Avoid overwriting initialised field in ' ata: pata_cs5530: Avoid overwriting initialised field in 'cs5530_sht' ata: pata_cs5520: Avoid overwriting initialised field in 'cs5520_sht' ata: pata_atiixp: Avoid overwriting initialised field in 'atiixp_sht' ata: sata_nv: Do not over-write initialise fields in 'nv_adma_sht' and 'nv_swncq_sht' ata: sata_mv: Do not over-write initialise fields in 'mv6_sht' ata: sata_sil24: Do not over-write initialise fields in 'sil24_sht' ata: ahci: Ensure initialised fields are not overwritten in AHCI_SHT() ata: include: libata: Move fields commonly over-written to separate MACRO ahci: Add support for Dell S140 and later controllers ata: ahci_sunxi: Disable DIPM ...
2021-06-28Merge tag 'irqchip-5.14' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Revamped the irqdomain internals to consistently cache irqdata - Expose a new API to simplify IRQ handling involving an irqdomain by not using the IRQ number - Convert all the irqchip drivers to this new API - Allow the Qualcomm PDC driver to be compiled as a module - Fix HiSi MBIGEN compile warning when CONFIG_ACPI isn't selected - Remove a bunch of spurious printks on error paths - The obligatory couple of DT updates
2021-06-27Revert "signal: Allow tasks to cache one sigqueue struct"Linus Torvalds
This reverts commits 4bad58ebc8bc4f20d89cff95417c9b4674769709 (and 399f8dd9a866e107639eabd3c1979cd526ca3a98, which tried to fix it). I do not believe these are correct, and I'm about to release 5.13, so am reverting them out of an abundance of caution. The locking is odd, and appears broken. On the allocation side (in __sigqueue_alloc()), the locking is somewhat straightforward: it depends on sighand->siglock. Since one caller doesn't hold that lock, it further then tests 'sigqueue_flags' to avoid the case with no locks held. On the freeing side (in sigqueue_cache_or_free()), there is no locking at all, and the logic instead depends on 'current' being a single thread, and not able to race with itself. To make things more exciting, there's also the data race between freeing a signal and allocating one, which is handled by using WRITE_ONCE() and READ_ONCE(), and being mutually exclusive wrt the initial state (ie freeing will only free if the old state was NULL, while allocating will obviously only use the value if it was non-NULL, so only one or the other will actually act on the value). However, while the free->alloc paths do seem mutually exclusive thanks to just the data value dependency, it's not clear what the memory ordering constraints are on it. Could writes from the previous allocation possibly be delayed and seen by the new allocation later, causing logical inconsistencies? So it's all very exciting and unusual. And in particular, it seems that the freeing side is incorrect in depending on "current" being single-threaded. Yes, 'current' is a single thread, but in the presense of asynchronous events even a single thread can have data races. And such asynchronous events can and do happen, with interrupts causing signals to be flushed and thus free'd (for example - sending a SIGCONT/SIGSTOP can happen from interrupt context, and can flush previously queued process control signals). So regardless of all the other questions about the memory ordering and locking for this new cached allocation, the sigqueue_cache_or_free() assumptions seem to be fundamentally incorrect. It may be that people will show me the errors of my ways, and tell me why this is all safe after all. We can reinstate it if so. But my current belief is that the WRITE_ONCE() that sets the cached entry needs to be a smp_store_release(), and the READ_ONCE() that finds a cached entry needs to be a smp_load_acquire() to handle memory ordering correctly. And the sequence in sigqueue_cache_or_free() would need to either use a lock or at least be interrupt-safe some way (perhaps by using something like the percpu 'cmpxchg': it doesn't need to be SMP-safe, but like the percpu operations it needs to be interrupt-safe). Fixes: 399f8dd9a866 ("signal: Prevent sigqueue caching after task got released") Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct") Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-26mailbox: mtk-cmdq: Add struct cmdq_pkt in struct cmdq_cb_dataChun-Kuang Hu
Current client use 'struct cmdq_pkt' as callback data, so change 'void *data' to 'struct cmdq_pkt *pkt'. Keep data until client use pkt instead of data. Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Reviewed-by: Yongqiang Niu <yongqiang.niu@mediatek.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2021-06-26mailbox: mtk-cmdq: Remove cmdq_cb_statusChun-Kuang Hu
cmdq_cb_status is an error status. Use the standard error number instead of cmdq_cb_status to prevent status duplication. Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Reviewed-by: Yongqiang Niu <yongqiang.niu@mediatek.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2021-06-26net/mlx5: DR, Add support for flow sampler offloadYevgeny Kliteynik
Add SW steering support for sFlow / flow sampler action. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-25net: mdiobus: withdraw fwnode_mdbiobus_registerMarcin Wojtas
The newly implemented fwnode_mdbiobus_register turned out to be problematic - in case the fwnode_/of_/acpi_mdio are built as modules, a dependency cycle can be observed during the depmod phase of modules_install, eg.: depmod: ERROR: Cycle detected: fwnode_mdio -> of_mdio -> fwnode_mdio depmod: ERROR: Found 2 modules in dependency cycles! OR: depmod: ERROR: Cycle detected: acpi_mdio -> fwnode_mdio -> acpi_mdio depmod: ERROR: Found 2 modules in dependency cycles! A possible solution could be to rework fwnode_mdiobus_register, so that to merge the contents of acpi_mdiobus_register and of_mdiobus_register. However feasible, such change would be very intrusive and affect huge amount of the of_mdiobus_register users. Since there are currently 2 users of ACPI and MDIO (xgmac_mdio and mvmdio), withdraw the fwnode_mdbiobus_register and roll back to a simple 'if' condition in affected drivers. Fixes: 62a6ef6a996f ("net: mdiobus: Introduce fwnode_mdbiobus_register()") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "24 patches, based on 4a09d388f2ab382f217a764e6a152b3f614246f6. Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb, memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and mailmap" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits) mailmap: add Marek's other e-mail address and identity without diacritics MAINTAINERS: fix Marek's identity again mm/page_alloc: do bulk array bounds check after checking populated elements mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array mm/hwpoison: do not lock page again when me_huge_page() successfully recovers mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned mm/memory-failure: use a mutex to avoid memory_failure() races mm, futex: fix shared futex pgoff on shmem huge page kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync() kthread_worker: split code for canceling the delayed work timer mm/vmalloc: unbreak kasan vmalloc support KVM: s390: prepare for hugepage vmalloc mm/vmalloc: add vmalloc_no_huge nilfs2: fix memory leak in nilfs_sysfs_delete_device_group mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk() mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes mm: page_vma_mapped_walk(): get vma_address_end() earlier mm: page_vma_mapped_walk(): use goto instead of while (1) mm: page_vma_mapped_walk(): add a level of indentation mm: page_vma_mapped_walk(): crossing page table boundary ...
2021-06-25dev_forward_skb: do not scrub skb mark within the same name spaceNicolas Dichtel
The goal is to keep the mark during a bpf_redirect(), like it is done for legacy encapsulation / decapsulation, when there is no x-netns. This was initially done in commit 213dd74aee76 ("skbuff: Do not scrub skb mark within the same name space"). When the call to skb_scrub_packet() was added in dev_forward_skb() (commit 8b27f27797ca ("skb: allow skb_scrub_packet() to be used by tunnels")), the second argument (xnet) was set to true to force a call to skb_orphan(). At this time, the mark was always cleanned up by skb_scrub_packet(), whatever xnet value was. This call to skb_orphan() was removed later in commit 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). But this 'true' stayed here without any real reason. Let's correctly set xnet in ____dev_forward_skb(), this function has access to the previous interface and to the new interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25Merge tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Two regression fixes from the merge window: one in the auth code affecting old clusters and one in the filesystem for proper propagation of MDS request errors. Also included a locking fix for async creates, marked for stable" * tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client: libceph: set global_id as soon as we get an auth ticket libceph: don't pass result into ac->ops->handle_reply() ceph: fix error handling in ceph_atomic_open and ceph_lookup ceph: must hold snap_rwsem when filling inode for async create
2021-06-25Merge tag 'kvmarm-5.14' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes
2021-06-25Merge remote-tracking branch 'spi/for-5.14' into spi-nextMark Brown
2021-06-25HID: core: Add hid_hw_may_wakeup() functionHans de Goede
Add a hid_hw_may_wakeup() function, which is the equivalent of device_may_wakeup() for hid devices. In most cases this just returns device_may_wakeup(hdev->dev.parent), but for some ll-drivers this is not correct. E.g. usb_hid_driver instantiated hid devices have their parent set to the usb-interface to which the usb_hid_driver is bound, but the power/wakeup* sysfs attributes are part of the usb-device, which is the usb-interface's parent. For these special cases a new may_wakeup callback is added to hid_ll_driver, so that ll-drivers can override the default behavior. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-06-25spi: core: add dma_map_dev for dma deviceVinod Koul
Some controllers like qcom geni need the parent device to be used for dma mapping, so add a dma_map_dev field and let drivers fill this to be used as mapping device Signed-off-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20210625052213.32260-4-vkoul@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-24mm, futex: fix shared futex pgoff on shmem huge pageHugh Dickins
If more than one futex is placed on a shmem huge page, it can happen that waking the second wakes the first instead, and leaves the second waiting: the key's shared.pgoff is wrong. When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when generating futex_key"), the only shared huge pages came from hugetlbfs, and the code added to deal with its exceptional page->index was put into hugetlb source. Then that was missed when 4.8 added shmem huge pages. page_to_pgoff() is what others use for this nowadays: except that, as currently written, it gives the right answer on hugetlbfs head, but nonsense on hugetlbfs tails. Fix that by calling hugetlbfs-specific hugetlb_basepage_index() on PageHuge tails as well as on head. Yes, it's unconventional to declare hugetlb_basepage_index() there in pagemap.h, rather than in hugetlb.h; but I do not expect anything but page_to_pgoff() ever to need it. [akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope] Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Reported-by: Neel Natu <neelnatu@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Zhang Yi <wetpzy@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24mm/vmalloc: add vmalloc_no_hugeClaudio Imbrenda
Patch series "mm: add vmalloc_no_huge and use it", v4. Add vmalloc_no_huge() and export it, so modules can allocate memory with small pages. Use the newly added vmalloc_no_huge() in KVM on s390 to get around a hardware limitation. This patch (of 2): Commit 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") added support for hugepage vmalloc mappings, it also added the flag VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be performed with 0-order non-huge pages. This flag is not accessible when calling vmalloc, the only option is to call directly __vmalloc_node_range, which is not exported. This means that a module can't vmalloc memory with small pages. Case in point: KVM on s390x needs to vmalloc a large area, and it needs to be mapped with non-huge pages, because of a hardware limitation. This patch adds the function vmalloc_no_huge, which works like vmalloc, but it is guaranteed to always back the mapping using small pages. This new function is exported, therefore it is usable by modules. [akpm@linux-foundation.org: whitespace fixes, per Christoph] Link: https://lkml.kernel.org/r/20210614132357.10202-1-imbrenda@linux.ibm.com Link: https://lkml.kernel.org/r/20210614132357.10202-2-imbrenda@linux.ibm.com Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24blk: Fix lock inversion between ioc lock and bfqd lockJan Kara
Lockdep complains about lock inversion between ioc->lock and bfqd->lock: bfqd -> ioc: put_io_context+0x33/0x90 -> ioc->lock grabbed blk_mq_free_request+0x51/0x140 blk_put_request+0xe/0x10 blk_attempt_req_merge+0x1d/0x30 elv_attempt_insert_merge+0x56/0xa0 blk_mq_sched_try_insert_merge+0x4b/0x60 bfq_insert_requests+0x9e/0x18c0 -> bfqd->lock grabbed blk_mq_sched_insert_requests+0xd6/0x2b0 blk_mq_flush_plug_list+0x154/0x280 blk_finish_plug+0x40/0x60 ext4_writepages+0x696/0x1320 do_writepages+0x1c/0x80 __filemap_fdatawrite_range+0xd7/0x120 sync_file_range+0xac/0xf0 ioc->bfqd: bfq_exit_icq+0xa3/0xe0 -> bfqd->lock grabbed put_io_context_active+0x78/0xb0 -> ioc->lock grabbed exit_io_context+0x48/0x50 do_exit+0x7e9/0xdd0 do_group_exit+0x54/0xc0 To avoid this inversion we change blk_mq_sched_try_insert_merge() to not free the merged request but rather leave that upto the caller similarly to blk_mq_sched_try_merge(). And in bfq_insert_requests() we make sure to free all the merged requests after dropping bfqd->lock. Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler") Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210623093634.27879-3-jack@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24KVM: debugfs: Reuse binary stats descriptorsJing Zhang
To remove code duplication, use the binary stats descriptors in the implementation of the debugfs interface for statistics. This unifies the definition of statistics for the binary and debugfs interfaces. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-8-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: stats: Support binary stats retrieval for a VCPUJing Zhang
Add a VCPU ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VCPU stats header, descriptors and data. Define VCPU statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-5-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: stats: Support binary stats retrieval for a VMJing Zhang
Add a VM ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VM stats header, descriptors and data. Define VM statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-4-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24libceph: set global_id as soon as we get an auth ticketIlya Dryomov
Commit 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") delayed the setting of global_id too much. It is set only after all tickets are received, but in pre-nautilus clusters an auth ticket and the service tickets are obtained in separate steps (for a total of three MAuth replies). When the service tickets are requested, global_id is used to build an authorizer; if global_id is still 0 we never get them and fail to establish the session. Moving the setting of global_id into protocol implementations. This way global_id can be set exactly when an auth ticket is received, not sooner nor later. Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24libceph: don't pass result into ac->ops->handle_reply()Ilya Dryomov
There is no result to pass in msgr2 case because authentication failures are reported through auth_bad_method frame and in MAuth case an error is returned immediately. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24net: mdiobus: fix fwnode_mdbiobus_register() fallback caseMarcin Wojtas
The fallback case of fwnode_mdbiobus_register() (relevant for !CONFIG_FWNODE_MDIO) was defined with wrong argument name, causing a compilation error. Fix that. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24block: pass a gendisk to bdev_disk_changedChristoph Hellwig
bdev_disk_changed can only operate on whole devices. Make that clear by passing a gendisk instead of the struct block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24block: move bdev_disk_changedChristoph Hellwig
Move bdev_disk_changed to block/partitions/core.c, together with the rest of the partition scanning code. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24xdp: Add proper __rcu annotations to redirect map entriesToke Høiland-Jørgensen
XDP_REDIRECT works by a three-step process: the bpf_redirect() and bpf_redirect_map() helpers will lookup the target of the redirect and store it (along with some other metadata) in a per-CPU struct bpf_redirect_info. Next, when the program returns the XDP_REDIRECT return code, the driver will call xdp_do_redirect() which will use the information thus stored to actually enqueue the frame into a bulk queue structure (that differs slightly by map type, but shares the same principle). Finally, before exiting its NAPI poll loop, the driver will call xdp_do_flush(), which will flush all the different bulk queues, thus completing the redirect. Pointers to the map entries will be kept around for this whole sequence of steps, protected by RCU. However, there is no top-level rcu_read_lock() in the core code; instead drivers add their own rcu_read_lock() around the XDP portions of the code, but somewhat inconsistently as Martin discovered[0]. However, things still work because everything happens inside a single NAPI poll sequence, which means it's between a pair of calls to local_bh_disable()/local_bh_enable(). So Paul suggested[1] that we could document this intention by using rcu_dereference_check() with rcu_read_lock_bh_held() as a second parameter, thus allowing sparse and lockdep to verify that everything is done correctly. This patch does just that: we add an __rcu annotation to the map entry pointers and remove the various comments explaining the NAPI poll assurance strewn through devmap.c in favour of a longer explanation in filter.c. The goal is to have one coherent documentation of the entire flow, and rely on the RCU annotations as a "standard" way of communicating the flow in the map code (which can additionally be understood by sparse and lockdep). The RCU annotation replacements result in a fairly straight-forward replacement where READ_ONCE() becomes rcu_dereference_check(), WRITE_ONCE() becomes rcu_assign_pointer() and xchg() and cmpxchg() gets wrapped in the proper constructs to cast the pointer back and forth between __rcu and __kernel address space (for the benefit of sparse). The one complication is that xskmap has a few constructions where double-pointers are passed back and forth; these simply all gain __rcu annotations, and only the final reference/dereference to the inner-most pointer gets changed. With this, everything can be run through sparse without eliciting complaints, and lockdep can verify correctness even without the use of rcu_read_lock() in the drivers. Subsequent patches will clean these up from the drivers. [0] https://lore.kernel.org/bpf/20210415173551.7ma4slcbqeyiba2r@kafai-mbp.dhcp.thefacebook.com/ [1] https://lore.kernel.org/bpf/20210419165837.GA975577@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210624160609.292325-6-toke@redhat.com
2021-06-24rcu: Create an unrcu_pointer() to remove __rcu from a pointerPaul E. McKenney
The xchg() and cmpxchg() functions are sometimes used to carry out RCU updates. Unfortunately, this can result in sparse warnings for both the old-value and new-value arguments, as well as for the return value. The arguments can be dealt with using RCU_INITIALIZER(): old_p = xchg(&p, RCU_INITIALIZER(new_p)); But a sparse warning still remains due to assigning the __rcu pointer returned from xchg to the (most likely) non-__rcu pointer old_p. This commit therefore provides an unrcu_pointer() macro that strips the __rcu. This macro can be used as follows: old_p = unrcu_pointer(xchg(&p, RCU_INITIALIZER(new_p))); Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210624160609.292325-2-toke@redhat.com
2021-06-24KVM: stats: Add fd-based API to read binary stats dataJing Zhang
This commit defines the API for userspace and prepare the common functionalities to support per VM/VCPU binary stats data readings. The KVM stats now is only accessible by debugfs, which has some shortcomings this change series are supposed to fix: 1. The current debugfs stats solution in KVM could be disabled when kernel Lockdown mode is enabled, which is a potential rick for production. 2. The current debugfs stats solution in KVM is organized as "one stats per file", it is good for debugging, but not efficient for production. 3. The stats read/clear in current debugfs solution in KVM are protected by the global kvm_lock. Besides that, there are some other benefits with this change: 1. All KVM VM/VCPU stats can be read out in a bulk by one copy to userspace. 2. A schema is used to describe KVM statistics. From userspace's perspective, the KVM statistics are self-describing. 3. With the fd-based solution, a separate telemetry would be able to read KVM stats in a less privileged environment. 4. After the initial setup by reading in stats descriptors, a telemetry only needs to read the stats data itself, no more parsing or setup is needed. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-3-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: stats: Separate generic stats from architecture specific onesJing Zhang
Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24HID: input: Add support for Programmable ButtonsThomas Weißschuh
Map them to KEY_MACRO# event codes. These buttons are defined by HID as follows: "The user defines the function of these buttons to control software applications or GUI objects." This matches the semantics of the KEY_MACRO# input event codes that Linux supports. Also add support for HID "Named Array" collections. Also add hid-debug support for KEY_MACRO#. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-06-24Merge branch 'for-next/smccc' into for-next/coreWill Deacon
Add support for versions v1.2 and 1.3 of the SMC calling convention. * for-next/smccc: arm64: smccc: Support SMCCC v1.3 SVE register saving hint arm64: smccc: Add support for SMCCCv1.2 extended input/output registers
2021-06-24Merge branch 'for-next/perf' into for-next/coreWill Deacon
PMU driver cleanups for managing IRQ affinity and exposing event attributes via sysfs. * for-next/perf: (36 commits) drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe() perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number arm64: perf: Simplify EVENT ATTR macro in perf_event.c drivers/perf: Simplify EVENT ATTR macro in fsl_imx8_ddr_perf.c drivers/perf: Simplify EVENT ATTR macro in xgene_pmu.c drivers/perf: Simplify EVENT ATTR macro in qcom_l3_pmu.c drivers/perf: Simplify EVENT ATTR macro in qcom_l2_pmu.c drivers/perf: Simplify EVENT ATTR macro in SMMU PMU driver perf: Add EVENT_ATTR_ID to simplify event attributes perf/smmuv3: Don't trample existing events with global filter perf/hisi: Constify static attribute_group structs perf: qcom: Remove redundant dev_err call in qcom_l3_cache_pmu_probe() drivers/perf: hisi: Fix data source control arm64: perf: Add more support on caps under sysfs perf: qcom_l2_pmu: move to use request_irq by IRQF_NO_AUTOEN flag arm_pmu: move to use request_irq by IRQF_NO_AUTOEN flag perf: arm_spe: use DEVICE_ATTR_RO macro perf: xgene_pmu: use DEVICE_ATTR_RO macro perf: qcom: use DEVICE_ATTR_RO macro perf: arm_pmu: use DEVICE_ATTR_RO macro ...
2021-06-24sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flagBeata Michalska
Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain topology flag, to distinguish between shed_domains where any CPU capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where a full set of CPU capacities is visible to all domain members (SD_ASYM_CPUCAPACITY_FULL). With the distinction between full and partial CPU capacity asymmetry, brought in by the newly introduced flag, the scope of the original SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing behaviour when one is detected on a given sched domain, allowing misfit migrations within sched domains that do not observe full range of CPU capacities but still do have members with different capacity values. It loses though it's meaning when it comes to the lowest CPU asymmetry sched_domain level per-cpu pointer, which is to be now denoted by SD_ASYM_CPUCAPACITY_FULL flag. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210603140627.8409-2-beata.michalska@arm.com
2021-06-24crypto: api - Move crypto attr definitions out of crypto.hHerbert Xu
The definitions for crypto_attr-related types and enums are not needed by most Crypto API users. This patch moves them out of crypto.h and into algapi.h/internal.h depending on the extent of their use. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2021-06-23 The following pull-request contains BPF updates for your *net* tree. We've added 14 non-merge commits during the last 6 day(s) which contain a total of 13 files changed, 137 insertions(+), 64 deletions(-). Note that when you merge net into net-next, there is a small merge conflict between 9f2470fbc4cb ("skmsg: Improve udp_bpf_recvmsg() accuracy") from bpf with c49661aa6f70 ("skmsg: Remove unused parameters of sk_msg_wait_data()") from net-next. Resolution is to: i) net/ipv4/udp_bpf.c: take udp_msg_wait_data() and remove err parameter from the function, ii) net/ipv4/tcp_bpf.c: take tcp_msg_wait_data() and remove err parameter from the function, iii) for net/core/skmsg.c and include/linux/skmsg.h: remove the sk_msg_wait_data() implementation and its prototype in header. The main changes are: 1) Fix BPF poke descriptor adjustments after insn rewrite, from John Fastabend. 2) Fix regression when using BPF_OBJ_GET with non-O_RDWR flags, from Maciej Żenczykowski. 3) Various bug and error handling fixes for UDP-related sock_map, from Cong Wang. 4) Fix patching of vmlinux BTF IDs with correct endianness, from Tony Ambardar. 5) Two fixes for TX descriptor validation in AF_XDP, from Magnus Karlsson. 6) Fix overflow in size calculation for bpf_map_area_alloc(), from Bui Quang Minh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23Merge remote-tracking branch 'regulator/for-5.14' into regulator-nextMark Brown
2021-06-23Merge branch 'topic/ppc-kvm' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes
2021-06-23ieee80211: add defines for HE PHY cap byte 10Johannes Berg
One bit out of the previously completely reserved byte 10 in the PHY capabilities is used since 802.11ax D7.0, add a new define for it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.c026feb3873d.I380f52a05ddb4153bc77ff7f276a3484819f69b2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-23ieee80211: define timing measurement in extended capabilities IEKrishnanand Prabhu
Define the bit used for timing measurement support in extended capabilities IE, used for time synchronization. Signed-off-by: Krishnanand Prabhu <krishnanand.prabhu@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.b75f40765538.I92b50e43e29272c97d17ed5f37f216f4caf0f205@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>