summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2019-11-26Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest changes in this cycle were: - Make kcpustat vtime aware (Frederic Weisbecker) - Rework the CFS load_balance() logic (Vincent Guittot) - Misc cleanups, smaller enhancements, fixes. The load-balancing rework is the most intrusive change: it replaces the old heuristics that have become less meaningful after the introduction of the PELT metrics, with a grounds-up load-balancing algorithm. As such it's not really an iterative series, but replaces the old load-balancing logic with the new one. We hope there are no performance regressions left - but statistically it's highly probable that there *is* going to be some workload that is hurting from these chnages. If so then we'd prefer to have a look at that workload and fix its scheduling, instead of reverting the changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) rackmeter: Use vtime aware kcpustat accessor leds: Use all-in-one vtime aware kcpustat accessor cpufreq: Use vtime aware kcpustat accessors for user time procfs: Use all-in-one vtime aware kcpustat accessor sched/vtime: Bring up complete kcpustat accessor sched/cputime: Support other fields on kcpustat_field() sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util() sched/fair: Add comments for group_type and balancing at SD_NUMA level sched/fair: Fix rework of find_idlest_group() sched/uclamp: Fix overzealous type replacement sched/Kconfig: Fix spelling mistake in user-visible help text sched/core: Further clarify sched_class::set_next_task() sched/fair: Use mul_u32_u32() sched/core: Simplify sched_class::pick_next_task() sched/core: Optimize pick_next_task() sched/core: Make pick_next_task_idle() more consistent sched/fair: Better document newidle_balance() leds: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM cpufreq: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM procfs: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM ...
2019-11-26Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel side changes in this cycle were: - Various Intel-PT updates and optimizations (Alexander Shishkin) - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu) - Add support for LSM and SELinux checks to control access to the perf syscall (Joel Fernandes) - Misc other changes, optimizations, fixes and cleanups - see the shortlog for details. There were numerous tooling changes as well - 254 non-merge commits. Here are the main changes - too many to list in detail: - Enhancements to core tooling infrastructure, perf.data, libperf, libtraceevent, event parsing, vendor events, Intel PT, callchains, BPF support and instruction decoding. - There were updates to the following tools: perf annotate perf diff perf inject perf kvm perf list perf maps perf parse perf probe perf record perf report perf script perf stat perf test perf trace - And a lot of other changes: please see the shortlog and Git log for more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits) perf parse: Fix potential memory leak when handling tracepoint errors perf probe: Fix spelling mistake "addrees" -> "address" libtraceevent: Fix memory leakage in copy_filter_type libtraceevent: Fix header installation perf intel-bts: Does not support AUX area sampling perf intel-pt: Add support for decoding AUX area samples perf intel-pt: Add support for recording AUX area samples perf pmu: When using default config, record which bits of config were changed by the user perf auxtrace: Add support for queuing AUX area samples perf session: Add facility to peek at all events perf auxtrace: Add support for dumping AUX area samples perf inject: Cut AUX area samples perf record: Add aux-sample-size config term perf record: Add support for AUX area sampling perf auxtrace: Add support for AUX area sample recording perf auxtrace: Move perf_evsel__find_pmu() perf record: Add a function to test for kernel support for AUX area sampling perf tools: Add kernel AUX area sampling definitions perf/core: Make the mlock accounting simple again perf report: Jump to symbol source view from total cycles view ...
2019-11-26Merge branch 'core-stacktrace-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull stacktrace cleanup from Ingo Molnar: "A minor cleanup" * 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stacktrace: Get rid of unneeded '!!' pattern
2019-11-26sysctl: Remove the sysctl system callEric W. Biederman
This system call has been deprecated almost since it was introduced, and in a survey of the linux distributions I can no longer find any of them that enable CONFIG_SYSCTL_SYSCALL. The only indication that I can find that anyone might care is that a few of the defconfigs in the kernel enable CONFIG_SYSCTL_SYSCALL. However this appears in only 31 of 414 defconfigs in the kernel, so I suspect this symbols presence is simply because it is harmless to include rather than because it is necessary. As there appear to be no users of the sysctl system call, remove the code. As this removes one of the few uses of the internal kernel mount of proc I hope this allows for even more simplifications of the proc filesystem. Cc: Alex Smith <alex.smith@imgtec.com> Cc: Anders Berg <anders.berg@lsi.com> Cc: Apelete Seketeli <apelete@seketeli.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chee Nouk Phoon <cnphoon@altera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Harvey Hunt <harvey.hunt@imgtec.com> Cc: Helge Deller <deller@gmx.de> Cc: Hongliang Tao <taohl@lemote.com> Cc: Hua Yan <yanh@lemote.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <blogic@openwrt.org> Cc: Jonas Jensen <jonas.jensen@gmail.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Jun Nie <jun.nie@linaro.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Kevin Wells <kevin.wells@nxp.com> Cc: Kumar Gala <galak@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Ley Foon Tan <lftan@altera.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Olof Johansson <olof@lixom.net> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Roland Stigge <stigge@antcom.de> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Scott Telford <stelford@cadence.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: Tanmay Inamdar <tinamdar@apm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Wolfram Sang <w.sang@pengutronix.de> Acked-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-11-26Merge branches 'pm-sleep', 'pm-domains', 'pm-opp' and 'powercap'Rafael J. Wysocki
* pm-sleep: PM / wakeirq: remove unnecessary parentheses PM / core: Clean up some function headers in power.h PM / hibernate: memory_bm_find_bit(): Tighten node optimisation * pm-domains: PM / Domains: Convert to dev_to_genpd_safe() in genpd_syscore_switch() mmc: tmio: Avoid boilerplate code in ->runtime_suspend() PM / Domains: Implement the ->start() callback for genpd PM / Domains: Introduce dev_pm_domain_start() * pm-opp: PM / OPP: Support adjusting OPP voltages at runtime * powercap: powercap/intel_rapl: add support for Cometlake desktop powercap/intel_rapl: add support for CometLake Mobile
2019-11-26Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: cpuidle: Pass exit latency limit to cpuidle_use_deepest_state() cpuidle: Allow idle injection to apply exit latency limit cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks cpuidle: teo: Avoid code duplication in conditionals cpuidle: teo: Avoid using "early hits" incorrectly cpuidle: teo: Exclude cpuidle overhead from computations cpuidle: Use nanoseconds as the unit of time cpuidle: Consolidate disabled state checks ACPI: processor_idle: Skip dummy wait if kernel is in guest cpuidle: Do not unset the driver if it is there already cpuidle: teo: Fix "early hits" handling for disabled idle states cpuidle: teo: Consider hits and misses metrics of disabled states cpuidle: teo: Rename local variable in teo_select() cpuidle: teo: Ignore disabled idle states that are too deep
2019-11-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: "Another merge window, another pull full of stuff: 1) Support alternative names for network devices, from Jiri Pirko. 2) Introduce per-netns netdev notifiers, also from Jiri Pirko. 3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara Larsen. 4) Allow compiling out the TLS TOE code, from Jakub Kicinski. 5) Add several new tracepoints to the kTLS code, also from Jakub. 6) Support set channels ethtool callback in ena driver, from Sameeh Jubran. 7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED, SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long. 8) Add XDP support to mvneta driver, from Lorenzo Bianconi. 9) Lots of netfilter hw offload fixes, cleanups and enhancements, from Pablo Neira Ayuso. 10) PTP support for aquantia chips, from Egor Pomozov. 11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From Josh Hunt. 12) Add smart nagle to tipc, from Jon Maloy. 13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat Duvvuru. 14) Add a flow mask cache to OVS, from Tonghao Zhang. 15) Add XDP support to ice driver, from Maciej Fijalkowski. 16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak. 17) Support UDP GSO offload in atlantic driver, from Igor Russkikh. 18) Support it in stmmac driver too, from Jose Abreu. 19) Support TIPC encryption and auth, from Tuong Lien. 20) Introduce BPF trampolines, from Alexei Starovoitov. 21) Make page_pool API more numa friendly, from Saeed Mahameed. 22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni. 23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits) libbpf: Fix usage of u32 in userspace code mm: Implement no-MMU variant of vmalloc_user_node_flags slip: Fix use-after-free Read in slip_open net: dsa: sja1105: fix sja1105_parse_rgmii_delays() macvlan: schedule bc_work even if error enetc: add support Credit Based Shaper(CBS) for hardware offload net: phy: add helpers phy_(un)lock_mdio_bus mdio_bus: don't use managed reset-controller ax88179_178a: add ethtool_op_get_ts_info() mlxsw: spectrum_router: Fix use of uninitialized adjacency index mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels bpf: Simplify __bpf_arch_text_poke poke type handling bpf: Introduce BPF_TRACE_x helper for the tracing tests bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT bpf, testing: Add various tail call test cases bpf, x86: Emit patchable direct jump as tail call bpf: Constant map key tracking for prog array pokes bpf: Add poke dependency tracking for prog array maps bpf: Add initial poke descriptor table for jit images bpf: Move owner type, jited info into array auxiliary data ...
2019-11-25Merge tag 'livepatching-for-5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Petr Mladek: - New API to track system state changes done be livepatch callbacks. It helps to maintain compatibility between livepatches. - Update Kconfig help text. ORC is another reliable unwinder. - Disable generic selftest timeout. Livepatch selftests have their own per-operation fine-grained timeouts. * tag 'livepatching-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: x86/stacktrace: update kconfig help text for reliable unwinders livepatch: Selftests of the API for tracking system state changes livepatch: Documentation of the new API for tracking system state changes livepatch: Allow to distinguish different version of system state changes livepatch: Basic API to track system state changes livepatch: Keep replaced patches until post_patch callback is called selftests/livepatch: Disable the timeout
2019-11-25Merge tag 'printk-for-5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Allow to print symbolic error names via new %pe modifier. - Use pr_warn() instead of the remaining pr_warning() calls. Fix formatting of the related lines. - Add VSPRINTF entry to MAINTAINERS. * tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (32 commits) checkpatch: don't warn about new vsprintf pointer extension '%pe' MAINTAINERS: Add VSPRINTF tools lib api: Renaming pr_warning to pr_warn ASoC: samsung: Use pr_warn instead of pr_warning lib: cpu_rmap: Use pr_warn instead of pr_warning trace: Use pr_warn instead of pr_warning dma-debug: Use pr_warn instead of pr_warning vgacon: Use pr_warn instead of pr_warning fs: afs: Use pr_warn instead of pr_warning sh/intc: Use pr_warn instead of pr_warning scsi: Use pr_warn instead of pr_warning platform/x86: intel_oaktrail: Use pr_warn instead of pr_warning platform/x86: asus-laptop: Use pr_warn instead of pr_warning platform/x86: eeepc-laptop: Use pr_warn instead of pr_warning oprofile: Use pr_warn instead of pr_warning of: Use pr_warn instead of pr_warning macintosh: Use pr_warn instead of pr_warning idsn: Use pr_warn instead of pr_warning ide: Use pr_warn instead of pr_warning crypto: n2: Use pr_warn instead of pr_warning ...
2019-11-25Merge branch 'for-5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "There are several notable changes here: - Single thread migrating itself has been optimized so that it doesn't need threadgroup rwsem anymore. - Freezer optimization to avoid unnecessary frozen state changes. - cgroup ID unification so that cgroup fs ino is the only unique ID used for the cgroup and can be used to directly look up live cgroups through filehandle interface on 64bit ino archs. On 32bit archs, cgroup fs ino is still the only ID in use but it is only unique when combined with gen. - selftest and other changes" * 'for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (24 commits) writeback: fix -Wformat compilation warnings docs: cgroup: mm: Fix spelling of "list" cgroup: fix incorrect WARN_ON_ONCE() in cgroup_setup_root() cgroup: use cgrp->kn->id as the cgroup ID kernfs: use 64bit inos if ino_t is 64bit kernfs: implement custom exportfs ops and fid type kernfs: combine ino/id lookup functions into kernfs_find_and_get_node_by_id() kernfs: convert kernfs_node->id from union kernfs_node_id to u64 kernfs: kernfs_find_and_get_node_by_ino() should only look up activated nodes kernfs: use dumber locking for kernfs_find_and_get_node_by_ino() netprio: use css ID instead of cgroup ID writeback: use ino_t for inodes in tracepoints kernfs: fix ino wrap-around detection kselftests: cgroup: Avoid the reuse of fd after it is deallocated cgroup: freezer: don't change task and cgroups status unnecessarily cgroup: use cgroup->last_bstat instead of cgroup->bstat_pending for consistency cgroup: remove cgroup_enable_task_cg_lists() optimization cgroup: pids: use atomic64_t for pids->limit selftests: cgroup: Run test_core under interfering stress selftests: cgroup: Add task migration tests ...
2019-11-25Merge branch 'for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: "There have been sporadic reports of sanity checks in destroy_workqueue() failing spuriously over the years. This contains the fix and its follow-up changes / fixes. There's also a RCU annotation improvement" * 'for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Add RCU annotation for pwq list walk workqueue: Fix pwq ref leak in rescuer_thread() workqueue: more destroy_workqueue() fixes workqueue: Minor follow-ups to the rescuer destruction change workqueue: Fix missing kfree(rescuer) in destroy_workqueue() workqueue: Fix spurious sanity check failures in destroy_workqueue()
2019-11-25Merge tag 'threads-v5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread management updates from Christian Brauner: - A pidfd's fdinfo file currently contains the field "Pid:\t<pid>" where <pid> is the pid of the process in the pid namespace of the procfs instance the fdinfo file for the pidfd was opened in. The fdinfo file has now gained a new "NSpid:\t<ns-pid1>[\t<ns-pid2>[...]]" field which lists the pids of the process in all child pid namespaces provided the pid namespace of the procfs instance it is looked up under has an ancestoral relationship with the pid namespace of the process. If it does not 0 will be shown and no further pid namespaces will be listed. Tests included. (Christian Kellner) - If the process the pidfd references has already exited, print -1 for the Pid and NSpid fields in the pidfd's fdinfo file. Tests included. (me) - Add CLONE_CLEAR_SIGHAND. This lets callers clear all signal handler that are not SIG_DFL or SIG_IGN at process creation time. This originated as a feature request from glibc to improve performance and elimate races in their posix_spawn() implementation. Tests included. (me) - Add support for choosing a specific pid for a process with clone3(). This is the feature which was part of the thread update for v5.4 but after a discussion at LPC in Lisbon we decided to delay it for one more cycle in order to make the interface more generic. This has now done. It is now possible to choose a specific pid in a whole pid namespaces (sub)hierarchy instead of just one pid namespace. In order to choose a specific pid the caller must have CAP_SYS_ADMIN in all owning user namespaces of the target pid namespaces. Tests included. (Adrian Reber) - Test improvements and extensions. (Andrei Vagin, me) * tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests/clone3: skip if clone3() is ENOSYS selftests/clone3: check that all pids are released on error paths selftests/clone3: report a correct number of fails selftests/clone3: flush stdout and stderr before clone3() and _exit() selftests: add tests for clone3() with *set_tid fork: extend clone3() to support setting a PID selftests: add tests for clone3() tests: test CLONE_CLEAR_SIGHAND clone3: add CLONE_CLEAR_SIGHAND pid: use pid_has_task() in pidfd_open() exit: use pid_has_task() in do_wait() pid: use pid_has_task() in __change_pid() test: verify fdinfo for pidfd of reaped process pidfd: check pid has attached task in fdinfo pidfd: add tests for NSpid info in fdinfo pidfd: add NSpid entries to fdinfo
2019-11-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - data abort report and injection - steal time support - GICv4 performance improvements - vgic ITS emulation fixes - simplify FWB handling - enable halt polling counters - make the emulated timer PREEMPT_RT compliant s390: - small fixes and cleanups - selftest improvements - yield improvements PPC: - add capability to tell userspace whether we can single-step the guest - improve the allocation of XIVE virtual processor IDs - rewrite interrupt synthesis code to deliver interrupts in virtual mode when appropriate. - minor cleanups and improvements. x86: - XSAVES support for AMD - more accurate report of nested guest TSC to the nested hypervisor - retpoline optimizations - support for nested 5-level page tables - PMU virtualization optimizations, and improved support for nested PMU virtualization - correct latching of INITs for nested virtualization - IOAPIC optimization - TSX_CTRL virtualization for more TAA happiness - improved allocation and flushing of SEV ASIDs - many bugfixes and cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints KVM: x86: Grab KVM's srcu lock when setting nested state KVM: x86: Open code shared_msr_update() in its only caller KVM: Fix jump label out_free_* in kvm_init() KVM: x86: Remove a spurious export of a static function KVM: x86: create mmu/ subdirectory KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page KVM: x86: remove set but not used variable 'called' KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID KVM: x86: do not modify masked bits of shared MSRs KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT KVM: x86: Unexport kvm_vcpu_reload_apic_access_page() KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1 KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization ...
2019-11-25Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Apart from the arm64-specific bits (core arch and perf, new arm64 selftests), it touches the generic cow_user_page() (reviewed by Kirill) together with a macro for x86 to preserve the existing behaviour on this architecture. Summary: - On ARMv8 CPUs without hardware updates of the access flag, avoid failing cow_user_page() on PFN mappings if the pte is old. The patches introduce an arch_faults_on_old_pte() macro, defined as false on x86. When true, cow_user_page() makes the pte young before attempting __copy_from_user_inatomic(). - Covert the synchronous exception handling paths in arch/arm64/kernel/entry.S to C. - FTRACE_WITH_REGS support for arm64. - ZONE_DMA re-introduced on arm64 to support Raspberry Pi 4 - Several kselftest cases specific to arm64, together with a MAINTAINERS update for these files (moved to the ARM64 PORT entry). - Workaround for a Neoverse-N1 erratum where the CPU may fetch stale instructions under certain conditions. - Workaround for Cortex-A57 and A72 errata where the CPU may speculatively execute an AT instruction and associate a VMID with the wrong guest page tables (corrupting the TLB). - Perf updates for arm64: additional PMU topologies on HiSilicon platforms, support for CCN-512 interconnect, AXI ID filtering in the IMX8 DDR PMU, support for the CCPI2 uncore PMU in ThunderX2. - GICv3 optimisation to avoid a heavy barrier when accessing the ICC_PMR_EL1 register. - ELF HWCAP documentation updates and clean-up. - SMC calling convention conduit code clean-up. - KASLR diagnostics printed during boot - NVIDIA Carmel CPU added to the KPTI whitelist - Some arm64 mm clean-ups: use generic free_initrd_mem(), remove stale macro, simplify calculation in __create_pgd_mapping(), typos. - Kconfig clean-ups: CMDLINE_FORCE to depend on CMDLINE, choice for endinanness to help with allmodconfig" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits) arm64: Kconfig: add a choice for endianness kselftest: arm64: fix spelling mistake "contiguos" -> "contiguous" arm64: Kconfig: make CMDLINE_FORCE depend on CMDLINE MAINTAINERS: Add arm64 selftests to the ARM64 PORT entry arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 ...
2019-11-25Merge tag 'linux-kselftest-5.5-rc1-kunit' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest KUnit support gtom Shuah Khan: "This adds KUnit, a lightweight unit testing and mocking framework for the Linux kernel from Brendan Higgins. KUnit is not an end-to-end testing framework. It is currently supported on UML and sub-systems can write unit tests and run them in UML env. KUnit documentation is included in this update. In addition, this Kunit update adds 3 new kunit tests: - proc sysctl test from Iurii Zaikin - the 'list' doubly linked list test from David Gow - ext4 tests for decoding extended timestamps from Iurii Zaikin In the future KUnit will be linked to Kselftest framework to provide a way to trigger KUnit tests from user-space" * tag 'linux-kselftest-5.5-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits) lib/list-test: add a test for the 'list' doubly linked list ext4: add kunit test for decoding extended timestamps Documentation: kunit: Fix verification command kunit: Fix '--build_dir' option kunit: fix failure to build without printk MAINTAINERS: add proc sysctl KUnit test to PROC SYSCTL section kernel/sysctl-test: Add null pointer test for sysctl.c:proc_dointvec() MAINTAINERS: add entry for KUnit the unit testing framework Documentation: kunit: add documentation for KUnit kunit: defconfig: add defconfigs for building KUnit tests kunit: tool: add Python wrappers for running KUnit tests kunit: test: add tests for KUnit managed resources kunit: test: add the concept of assertions kunit: test: add tests for kunit test abort kunit: test: add support for test abort objtool: add kunit_try_catch_throw to the noreturn list kunit: test: add initial tests lib: enable building KUnit in lib/ kunit: test: add the concept of expectations kunit: test: add assertion printing library ...
2019-11-25y2038: alarm: fix half-second cut-offArnd Bergmann
Changing alarm_itimer accidentally broke the logic for arithmetic rounding of half seconds in the return code. Change it to a constant based on NSEC_PER_SEC, as suggested by Ben Hutchings. Fixes: bd40a175769d ("y2038: itimer: change implementation to timespec64") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-25Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: "A lot of stuff has been going on this cycle, with improving the support for networked IO (and hence unbounded request completion times) being one of the major themes. There's been a set of fixes done this week, I'll send those out as well once we're certain we're fully happy with them. This contains: - Unification of the "normal" submit path and the SQPOLL path (Pavel) - Support for sparse (and bigger) file sets, and updating of those file sets without needing to unregister/register again. - Independently sized CQ ring, instead of just making it always 2x the SQ ring size. This makes it more flexible for networked applications. - Support for overflowed CQ ring, never dropping events but providing backpressure on submits. - Add support for absolute timeouts, not just relative ones. - Support for generic cancellations. This divorces io_uring from workqueues as well, which additionally gets us one step closer to generic async system call support. - With cancellations, we can support grabbing the process file table as well, just like we do mm context. This allows support for system calls that create file descriptors, like accept4() support that's built on top of that. - Support for io_uring tracing (Dmitrii) - Support for linked timeouts. These abort an operation if it isn't completed by the time noted in the linke timeout. - Speedup tracking of poll requests - Various cleanups making the coder easier to follow (Jackie, Pavel, Bob, YueHaibing, me) - Update MAINTAINERS with new io_uring list" * tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits) io_uring: make POLL_ADD/POLL_REMOVE scale better io-wq: remove now redundant struct io_wq_nulls_list io_uring: Fix getting file for non-fd opcodes io_uring: introduce req_need_defer() io_uring: clean up io_uring_cancel_files() io-wq: ensure free/busy list browsing see all items io-wq: ensure we have a stable view of ->cur_work for cancellations io_wq: add get/put_work handlers to io_wq_create() io_uring: check for validity of ->rings in teardown io_uring: fix potential deadlock in io_poll_wake() io_uring: use correct "is IO worker" helper io_uring: fix -ENOENT issue with linked timer with short timeout io_uring: don't do flush cancel under inflight_lock io_uring: flag SQPOLL busy condition to userspace io_uring: make ASYNC_CANCEL work with poll and timeout io_uring: provide fallback request for OOM situations io_uring: convert accept4() -ERESTARTSYS into -EINTR io_uring: fix error clear of ->file_table in io_sqe_files_register() io_uring: separate the io_free_req and io_free_req_find_next interface io_uring: keep io_put_req only responsible for release and put req ...
2019-11-25Merge branch 'timers/urgent' into timers/core, to pick up fixIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge branch 'sched/rt' into sched/core, to pick up commitIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Remove unused 'refcount_error_report()' functionWill Deacon
'refcount_error_report()' has no callers. Remove it. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-10-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-24bpf: Simplify __bpf_arch_text_poke poke type handlingDaniel Borkmann
Given that we have BPF_MOD_NOP_TO_{CALL,JUMP}, BPF_MOD_{CALL,JUMP}_TO_NOP and BPF_MOD_{CALL,JUMP}_TO_{CALL,JUMP} poke types and that we also pass in old_addr as well as new_addr, it's a bit redundant and unnecessarily complicates __bpf_arch_text_poke() itself since we can derive the same from the *_addr that were passed in. Hence simplify and use BPF_MOD_{CALL,JUMP} as types which also allows to clean up call-sites. In addition to that, __bpf_arch_text_poke() currently verifies that text matches expected old_insn before we invoke text_poke_bp(). Also add a check on new_insn and skip rewrite if it already matches. Reason why this is rather useful is that it avoids making any special casing in prog_array_map_poke_run() when old and new prog were NULL and has the benefit that also for this case we perform a check on text whether it really matches our expectations. Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/fcb00a2b0b288d6c73de4ef58116a821c8fe8f2f.1574555798.git.daniel@iogearbox.net
2019-11-24bpf: Constant map key tracking for prog array pokesDaniel Borkmann
Add tracking of constant keys into tail call maps. The signature of bpf_tail_call_proto is that arg1 is ctx, arg2 map pointer and arg3 is a index key. The direct call approach for tail calls can be enabled if the verifier asserted that for all branches leading to the tail call helper invocation, the map pointer and index key were both constant and the same. Tracking of map pointers we already do from prior work via c93552c443eb ("bpf: properly enforce index mask to prevent out-of-bounds speculation") and 09772d92cd5a ("bpf: avoid retpoline for lookup/update/ delete calls on maps"). Given the tail call map index key is not on stack but directly in the register, we can add similar tracking approach and later in fixup_bpf_calls() add a poke descriptor to the progs poke_tab with the relevant information for the JITing phase. We internally reuse insn->imm for the rewritten BPF_JMP | BPF_TAIL_CALL instruction in order to point into the prog's poke_tab, and keep insn->imm as 0 as indicator that current indirect tail call emission must be used. Note that publishing to the tracker must happen at the end of fixup_bpf_calls() since adding elements to the poke_tab reallocates its memory, so we need to wait until its in final state. Future work can generalize and add similar approach to optimize plain array map lookups. Difference there is that we need to look into the key value that sits on stack. For clarity in bpf_insn_aux_data, map_state has been renamed into map_ptr_state, so we get map_{ptr,key}_state as trackers. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/e8db37f6b2ae60402fa40216c96738ee9b316c32.1574452833.git.daniel@iogearbox.net
2019-11-24bpf: Add poke dependency tracking for prog array mapsDaniel Borkmann
This work adds program tracking to prog array maps. This is needed such that upon prog array updates/deletions we can fix up all programs which make use of this tail call map. We add ops->map_poke_{un,}track() helpers to maps to maintain the list of programs and ops->map_poke_run() for triggering the actual update. bpf_array_aux is extended to contain the list head and poke_mutex in order to serialize program patching during updates/deletions. bpf_free_used_maps() will untrack the program shortly before dropping the reference to the map. For clearing out the prog array once all urefs are dropped we need to use schedule_work() to have a sleepable context. The prog_array_map_poke_run() is triggered during updates/deletions and walks the maintained prog list. It checks in their poke_tabs whether the map and key is matching and runs the actual bpf_arch_text_poke() for patching in the nop or new jmp location. Depending on the type of update, we use one of BPF_MOD_{NOP_TO_JUMP,JUMP_TO_NOP,JUMP_TO_JUMP}. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1fb364bb3c565b3e415d5ea348f036ff379e779d.1574452833.git.daniel@iogearbox.net
2019-11-24bpf: Add initial poke descriptor table for jit imagesDaniel Borkmann
Add initial poke table data structures and management to the BPF prog that can later be used by JITs. Also add an instance of poke specific data for tail call maps; plan for later work is to extend this also for BPF static keys. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1db285ec2ea4207ee0455b3f8e191a4fc58b9ade.1574452833.git.daniel@iogearbox.net
2019-11-24bpf: Move owner type, jited info into array auxiliary dataDaniel Borkmann
We're going to extend this with further information which is only relevant for prog array at this point. Given this info is not used in critical path, move it into its own structure such that the main array map structure can be kept on diet. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/b9ddccdb0f6f7026489ee955f16c96381e1e7238.1574452833.git.daniel@iogearbox.net
2019-11-24bpf: Move bpf_free_used_maps into sleepable sectionDaniel Borkmann
We later on are going to need a sleepable context as opposed to plain RCU callback in order to untrack programs we need to poke at runtime and tracking as well as image update is performed under mutex. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/09823b1d5262876e9b83a8e75df04cf0467357a4.1574452833.git.daniel@iogearbox.net
2019-11-24bpf: Provide better register bounds after jmp32 instructionsYonghong Song
With latest llvm (trunk https://github.com/llvm/llvm-project), test_progs, which has +alu32 enabled, failed for strobemeta.o. The verifier output looks like below with edit to replace large decimal numbers with hex ones. 193: (85) call bpf_probe_read_user_str#114 R0=inv(id=0) 194: (26) if w0 > 0x1 goto pc+4 R0_w=inv(id=0,umax_value=0xffffffff00000001) 195: (6b) *(u16 *)(r7 +80) = r0 196: (bc) w6 = w0 R6_w=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0xffffffff)) 197: (67) r6 <<= 32 R6_w=inv(id=0,smax_value=0x7fffffff00000000,umax_value=0xffffffff00000000, var_off=(0x0; 0xffffffff00000000)) 198: (77) r6 >>= 32 R6=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0xffffffff)) ... 201: (79) r8 = *(u64 *)(r10 -416) R8_w=map_value(id=0,off=40,ks=4,vs=13872,imm=0) 202: (0f) r8 += r6 R8_w=map_value(id=0,off=40,ks=4,vs=13872,umax_value=0xffffffff,var_off=(0x0; 0xffffffff)) 203: (07) r8 += 9696 R8_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=0xffffffff,var_off=(0x0; 0xffffffff)) ... 255: (bf) r1 = r8 R1_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=0xffffffff,var_off=(0x0; 0xffffffff)) ... 257: (85) call bpf_probe_read_user_str#114 R1 unbounded memory access, make sure to bounds check any array access into a map The value range for register r6 at insn 198 should be really just 0/1. The umax_value=0xffffffff caused later verification failure. After jmp instructions, the current verifier already tried to use just obtained information to get better register range. The current mechanism is for 64bit register only. This patch implemented to tighten the range for 32bit sub-registers after jmp32 instructions. With the patch, we have the below range ranges for the above code sequence: 193: (85) call bpf_probe_read_user_str#114 R0=inv(id=0) 194: (26) if w0 > 0x1 goto pc+4 R0_w=inv(id=0,smax_value=0x7fffffff00000001,umax_value=0xffffffff00000001, var_off=(0x0; 0xffffffff00000001)) 195: (6b) *(u16 *)(r7 +80) = r0 196: (bc) w6 = w0 R6_w=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0x1)) 197: (67) r6 <<= 32 R6_w=inv(id=0,umax_value=0x100000000,var_off=(0x0; 0x100000000)) 198: (77) r6 >>= 32 R6=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) ... 201: (79) r8 = *(u64 *)(r10 -416) R8_w=map_value(id=0,off=40,ks=4,vs=13872,imm=0) 202: (0f) r8 += r6 R8_w=map_value(id=0,off=40,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1)) 203: (07) r8 += 9696 R8_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1)) ... 255: (bf) r1 = r8 R1_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1)) ... 257: (85) call bpf_probe_read_user_str#114 ... At insn 194, the register R0 has better var_off.mask and smax_value. Especially, the var_off.mask ensures later lshift and rshift maintains proper value range. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191121170650.449030-1-yhs@fb.com
2019-11-24xdp: Fix cleanup on map free for devmap_hash map typeToke Høiland-Jørgensen
Tetsuo pointed out that it was not only the device unregister hook that was broken for devmap_hash types, it was also cleanup on map free. So better fix this as well. While we're at it, there's no reason to allocate the netdev_map array for DEVMAP_HASH, so skip that and adjust the cost accordingly. Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20191121133612.430414-1-toke@redhat.com
2019-11-23mm/hmm: define the pre-processor related parts of hmm.h even if disabledJason Gunthorpe
Only the function calls are stubbed out with static inlines that always fail. This is the standard way to write a header for an optional component and makes it easier for drivers that only optionally need HMM_MIRROR. Link: https://lore.kernel.org/r/20191112202231.3856-5-jgg@ziepe.ca Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23Revert "bpf: Emit audit messages upon successful prog load and unload"Jakub Kicinski
This commit reverts commit 91e6015b082b ("bpf: Emit audit messages upon successful prog load and unload") and its follow up commit 7599a896f2e4 ("audit: Move audit_log_task declaration under CONFIG_AUDITSYSCALL") as requested by Paul Moore. The change needs close review on linux-audit, tests etc. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22tracing: Use xarray for syscall trace eventsHassan Naveed
Currently, a lot of memory is wasted for architectures like MIPS when init_ftrace_syscalls() allocates the array for syscalls using kcalloc. This is because syscalls numbers start from 4000, 5000 or 6000 and array elements up to that point are unused. Fix this by using a data structure more suited to storing sparsely populated arrays. The XARRAY data structure, implemented using radix trees, is much more memory efficient for storing the syscalls in question. Link: http://lkml.kernel.org/r/20191115234314.21599-1-hnaveed@wavecomp.com Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com> Reviewed-by: Paul Burton <paulburton@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22tracing: Adding new functions for kernel access to Ftrace instancesDivya Indi
Adding 2 new functions - 1) struct trace_array *trace_array_get_by_name(const char *name); Return pointer to a trace array with given name. If it does not exist, create and return pointer to the new trace array. 2) int trace_array_set_clr_event(struct trace_array *tr, const char *system ,const char *event, bool enable); Enable/Disable events to this trace array. Additionally, - To handle reference counters, export trace_array_put() - Due to introduction of the above 2 new functions, we no longer need to export - ftrace_set_clr_event & trace_array_create APIs. Link: http://lkml.kernel.org/r/1574276919-11119-2-git-send-email-divya.indi@oracle.com Signed-off-by: Divya Indi <divya.indi@oracle.com> Reviewed-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22tracing: Fix Kconfig indentationKrzysztof Kozlowski
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Link: http://lkml.kernel.org/r/20191120133807.12741-1-krzk@kernel.org Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22ring-buffer: Fix typos in function ring_buffer_producerXianting Tian
Fix spelling and other typos Link: http://lkml.kernel.org/r/1573916755-32478-1-git-send-email-xianting_tian@126.com Signed-off-by: Xianting Tian <xianting_tian@126.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock from commit c8183f548902 ("s390/qeth: fix potential deadlock on workqueue flush"), removed the code which was removed by commit 9897d583b015 ("s390/qeth: consolidate some duplicated HW cmd code"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Validate tunnel options length in act_tunnel_key, from Xin Long. 2) Fix DMA sync bug in gve driver, from Adi Suresh. 3) TSO kills performance on some r8169 chips due to HW issues, disable by default in that case, from Corinna Vinschen. 4) Fix clock disable mismatch in fec driver, from Chubong Yuan. 5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan. 6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann. 7) Don't napi_disable() twice in r8152 driver, from Hayes Wang. 8) Fix SKB extension memory leak, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) r8152: avoid to call napi_disable twice MAINTAINERS: Add myself as maintainer of virtio-vsock udp: drop skb extensions before marking skb stateless net: rtnetlink: prevent underflows in do_setvfinfo() can: m_can_platform: remove unnecessary m_can_class_resume() call can: m_can_platform: set net_device structure as driver data hv_netvsc: Fix send_table offset in case of a host bug hv_netvsc: Fix offset usage in netvsc_send_table() net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN sfc: Only cancel the PPS workqueue if it exists nfc: port100: handle command failure cleanly net-sysfs: fix netdev_queue_add_kobject() breakage r8152: Re-order napi_disable in rtl8152_close net: qca_spi: Move reset_count to struct qcaspi net: qca_spi: fix receive buffer size check net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode Revert "net/ibmvnic: Fix EOI when running in XIVE mode" net/mlxfw: Verify FSM error code translation doesn't exceed array size net/mlx5: Update the list of the PCI supported devices net/mlx5: Fix auto group size calculation ...
2019-11-22Merge tag 'pm-5.4-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management regression fix from Rafael Wysocki: "Fix problems with switching cpufreq drivers on some x86 systems with ACPI (and with changing the operation modes of the intel_pstate driver on those systems) introduced by recent changes related to the management of frequency limits in cpufreq" * tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: QoS: Invalidate frequency QoS requests after removal
2019-11-22tracing: Remove unnecessary DEBUG_FS dependencyKusanagi Kouichi
Tracing replaced debugfs with tracefs. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20191120104350753.EWCT.12796.ppp.dion.ne.jp@dmta0009.auone-net.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-21dma-mapping: treat dev->bus_dma_mask as a DMA limitNicolas Saenz Julienne
Using a mask to represent bus DMA constraints has a set of limitations. The biggest one being it can only hold a power of two (minus one). The DMA mapping code is already aware of this and treats dev->bus_dma_mask as a limit. This quirk is already used by some architectures although still rare. With the introduction of the Raspberry Pi 4 we've found a new contender for the use of bus DMA limits, as its PCIe bus can only address the lower 3GB of memory (of a total of 4GB). This is impossible to represent with a mask. To make things worse the device-tree code rounds non power of two bus DMA limits to the next power of two, which is unacceptable in this case. In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all over the tree and treat it as such. Note that dev->bus_dma_limit should contain the higher accessible DMA address. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-11-21Merge branch 'for-next/zone-dma' of ↵Christoph Hellwig
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into dma-mapping-for-next Pull in a stable branch from the arm64 tree that adds the zone_dma_bits variable to avoid creating hard to resolve conflicts with that addition.
2019-11-21Merge branch 'kvm-tsx-ctrl' into HEADPaolo Bonzini
Conflicts: arch/x86/kvm/vmx/vmx.c
2019-11-21perf/core: Make the mlock accounting simple againAlexander Shishkin
Commit: d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()") does a lot of things to the mlock accounting arithmetics, while the only thing that actually needed to happen is subtracting the part that is charged to the mm from the part that is charged to the user, so that the former isn't charged twice. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Cc: songliubraving@fb.com Link: https://lkml.kernel.org/r/20191120170640.54123-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21sched/vtime: Bring up complete kcpustat accessorFrederic Weisbecker
Many callsites want to fetch the values of system, user, user_nice, guest or guest_nice kcpustat fields altogether or at least a pair of these. In that case calling kcpustat_field() for each requested field brings unecessary overhead when we could fetch all of them in a row. So provide kcpustat_cpu_fetch() that fetches the whole kcpustat array in a vtime safe way under the same RCU and seqcount block. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191121024430.19938-3-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21sched/cputime: Support other fields on kcpustat_field()Frederic Weisbecker
Provide support for user, nice, guest and guest_nice fields through kcpustat_field(). Whether we account the delta to a nice or not nice field is decided on top of the nice value snapshot taken at the time we call kcpustat_field(). If the nice value of the task has been changed since the last vtime update, we may have inacurrate distribution of the nice VS unnice cputime. However this is considered as a minor issue compared to the proper fix that would involve interrupting the target on nice updates, which is undesired on nohz_full CPUs. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191121024430.19938-2-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-11-20 The following pull-request contains BPF updates for your *net-next* tree. We've added 81 non-merge commits during the last 17 day(s) which contain a total of 120 files changed, 4958 insertions(+), 1081 deletions(-). There are 3 trivial conflicts, resolve it by always taking the chunk from 196e8ca74886c433: <<<<<<< HEAD ======= void *bpf_map_area_mmapable_alloc(u64 size, int numa_node); >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 <<<<<<< HEAD void *bpf_map_area_alloc(u64 size, int numa_node) ======= static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable) >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 <<<<<<< HEAD if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { ======= /* kmalloc()'ed memory can't be mmap()'ed */ if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 The main changes are: 1) Addition of BPF trampoline which works as a bridge between kernel functions, BPF programs and other BPF programs along with two new use cases: i) fentry/fexit BPF programs for tracing with practically zero overhead to call into BPF (as opposed to k[ret]probes) and ii) attachment of the former to networking related programs to see input/output of networking programs (covering xdpdump use case), from Alexei Starovoitov. 2) BPF array map mmap support and use in libbpf for global data maps; also a big batch of libbpf improvements, among others, support for reading bitfields in a relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko. 3) Extend s390x JIT with usage of relative long jumps and loads in order to lift the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich. 4) Add BPF audit support and emit messages upon successful prog load and unload in order to have a timeline of events, from Daniel Borkmann and Jiri Olsa. 5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode (XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson. 6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API call named bpf_get_link_xdp_info() for retrieving the full set of prog IDs attached to XDP, from Toke Høiland-Jørgensen. 7) Add BTF support for array of int, array of struct and multidimensional arrays and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau. 8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo. 9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid xdping to be run as standalone, from Jiri Benc. 10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song. 11) Fix a memory leak in BPF fentry test run data, from Colin Ian King. 12) Various smaller misc cleanups and improvements mostly all over BPF selftests and samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21time: Zero the upper 32-bits in __kernel_timespec on 32-bitDmitry Safonov
On compat interfaces, the high order bits of nanoseconds should be zeroed out. This is because the application code or the libc do not guarantee zeroing of these. If used without zeroing, kernel might be at risk of using timespec values incorrectly. Originally it was handled correctly, but lost during is_compat_syscall() cleanup. Revert the condition back to check CONFIG_64BIT. Fixes: 98f76206b335 ("compat: Cleanup in_compat_syscall() callers") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191121000303.126523-1-dima@arista.com
2019-11-20bpf: Switch bpf_map_{area_alloc,area_mmapable_alloc}() to u64 sizeDaniel Borkmann
Given we recently extended the original bpf_map_area_alloc() helper in commit fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY"), we need to apply the same logic as in ff1c08e1f74b ("bpf: Change size to u64 for bpf_map_{area_alloc, charge_init}()"). To avoid conflicts, extend it for bpf-next. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-11-20bpf: Emit audit messages upon successful prog load and unloadDaniel Borkmann
Allow for audit messages to be emitted upon BPF program load and unload for having a timeline of events. The load itself is in syscall context, so additional info about the process initiating the BPF prog creation can be logged and later directly correlated to the unload event. The only info really needed from BPF side is the globally unique prog ID where then audit user space tooling can query / dump all info needed about the specific BPF program right upon load event and enrich the record, thus these changes needed here can be kept small and non-intrusive to the core. Raw example output: # auditctl -D # auditctl -a always,exit -F arch=x86_64 -S bpf # ausearch --start recent -m 1334 [...] ---- time->Wed Nov 20 12:45:51 2019 type=PROCTITLE msg=audit(1574271951.590:8974): proctitle="./test_verifier" type=SYSCALL msg=audit(1574271951.590:8974): arch=c000003e syscall=321 success=yes exit=14 a0=5 a1=7ffe2d923e80 a2=78 a3=0 items=0 ppid=742 pid=949 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="test_verifier" exe="/root/bpf-next/tools/testing/selftests/bpf/test_verifier" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1334] msg=audit(1574271951.590:8974): auid=0 uid=0 gid=0 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=949 comm="test_verifier" exe="/root/bpf-next/tools/testing/selftests/bpf/test_verifier" prog-id=3260 event=LOAD ---- time->Wed Nov 20 12:45:51 2019 type=UNKNOWN[1334] msg=audit(1574271951.590:8975): prog-id=3260 event=UNLOAD ---- [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191120213816.8186-1-jolsa@kernel.org
2019-11-20dma-direct: exclude dma_direct_map_resource from the min_low_pfn checkChristoph Hellwig
The valid memory address check in dma_capable only makes sense when mapping normal memory, not when using dma_map_resource to map a device resource. Add a new boolean argument to dma_capable to exclude that check for the dma_map_resource case. Fixes: b12d66278dd6 ("dma-direct: check for overflows on 32 bit DMA addresses") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>