summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-11-13ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regsSteven Rostedt (VMware)
In preparation to have arguments of a function passed to callbacks attached to functions as default, change the default callback prototype to receive a struct ftrace_regs as the forth parameter instead of a pt_regs. For callbacks that set the FL_SAVE_REGS flag in their ftrace_ops flags, they will now need to get the pt_regs via a ftrace_get_regs() helper call. If this is called by a callback that their ftrace_ops did not have a FL_SAVE_REGS flag set, it that helper function will return NULL. This will allow the ftrace_regs to hold enough just to get the parameters and stack pointer, but without the worry that callbacks may have a pt_regs that is not completely filled. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-13bpf: Augment the set of sleepable LSM hooksKP Singh
Update the set of sleepable hooks with the ones that do not trigger a warning with might_fault() when exercised with the correct kernel config options enabled, i.e. DEBUG_ATOMIC_SLEEP=y LOCKDEP=y PROVE_LOCKING=y This means that a sleepable LSM eBPF program can be attached to these LSM hooks. A new helper method bpf_lsm_is_sleepable_hook is added and the set is maintained locally in bpf_lsm.c Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201113005930.541956-2-kpsingh@chromium.org
2020-11-13tty: serial: 8250: 8250_port: Move prototypes to shared locationLee Jones
Fixes the following W=1 kernel build warning(s): drivers/tty/serial/8250/8250_port.c:349:14: warning: no previous prototype for ‘au_serial_in’ [-Wmissing-prototypes] drivers/tty/serial/8250/8250_port.c:359:6: warning: no previous prototype for ‘au_serial_out’ [-Wmissing-prototypes] Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Mike Hudson <Exoray@isys.ca> Cc: linux-serial@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20201112105857.2078977-3-lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-13syscalls: Fix file comments for syscalls implemented in kernel/sys.cTal Zussman
The relevant syscalls were previously moved from kernel/timer.c to kernel/sys.c, but the comments weren't updated to reflect this change. Fixing these comments messes up the alphabetical ordering of syscalls by filename. This could be fixed by merging the two groups of kernel/sys.c syscalls, but that would require reordering the syscalls and renumbering them to maintain the numerical order in unistd.h. Signed-off-by: Tal Zussman <tz2294@columbia.edu> Link: https://lore.kernel.org/r/20201112215657.GA4539@charmander' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-11-12Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13bpf: Support for pointers beyond pkt_end.Alexei Starovoitov
This patch adds the verifier support to recognize inlined branch conditions. The LLVM knows that the branch evaluates to the same value, but the verifier couldn't track it. Hence causing valid programs to be rejected. The potential LLVM workaround: https://reviews.llvm.org/D87428 can have undesired side effects, since LLVM doesn't know that skb->data/data_end are being compared. LLVM has to introduce extra boolean variable and use inline_asm trick to force easier for the verifier assembly. Instead teach the verifier to recognize that r1 = skb->data; r1 += 10; r2 = skb->data_end; if (r1 > r2) { here r1 points beyond packet_end and subsequent if (r1 > r2) // always evaluates to "true". } Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20201111031213.25109-2-alexei.starovoitov@gmail.com
2020-11-12net: usb: switch to dev_get_tstats64 and remove usbnet_get_stats64 aliasHeiner Kallweit
Replace usbnet_get_stats64() with new identical core function dev_get_tstats64() in all users and remove usbnet_get_stats64(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12usbnet: switch to core handling of rx/tx byte/packet countersHeiner Kallweit
Use netdev->tstats instead of a member of usbnet for storing a pointer to the per-cpu counters. This allows us to use core functionality for statistics handling. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12Merge tag 'net-5.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Current release - regressions: - arm64: dts: fsl-ls1028a-kontron-sl28: specify in-band mode for ENETC Current release - bugs in new features: - mptcp: provide rmem[0] limit offset to fix oops Previous release - regressions: - IPv6: Set SIT tunnel hard_header_len to zero to fix path MTU calculations - lan743x: correctly handle chips with internal PHY - bpf: Don't rely on GCC __attribute__((optimize)) to disable GCSE - mlx5e: Fix VXLAN port table synchronization after function reload Previous release - always broken: - bpf: Zero-fill re-used per-cpu map element - fix out-of-order UDP packets when forwarding with UDP GSO fraglists turned on: - fix UDP header access on Fast/frag0 UDP GRO - fix IP header access and skb lookup on Fast/frag0 UDP GRO - ethtool: netlink: add missing netdev_features_change() call - net: Update window_clamp if SOCK_RCVBUF is set - igc: Fix returning wrong statistics - ch_ktls: fix multiple leaks and corner cases in Chelsio TLS offload - tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies - r8169: disable hw csum for short packets on all chip versions - vrf: Fix fast path output packet handling with async Netfilter rules" * tag 'net-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) lan743x: fix use of uninitialized variable net: udp: fix IP header access and skb lookup on Fast/frag0 UDP GRO net: udp: fix UDP header access on Fast/frag0 UDP GRO devlink: Avoid overwriting port attributes of registered port vrf: Fix fast path output packet handling with async Netfilter rules cosa: Add missing kfree in error path of cosa_write net: switch to the kernel.org patchwork instance ch_ktls: stop the txq if reaches threshold ch_ktls: tcb update fails sometimes ch_ktls/cxgb4: handle partial tag alone SKBs ch_ktls: don't free skb before sending FIN ch_ktls: packet handling prior to start marker ch_ktls: Correction in middle record handling ch_ktls: missing handling of header alone ch_ktls: Correction in trimmed_len calculation cxgb4/ch_ktls: creating skbs causes panic ch_ktls: Update cheksum information ch_ktls: Correction in finding correct length cxgb4/ch_ktls: decrypted bit is not enough net/x25: Fix null-ptr-deref in x25_connect ...
2020-11-12block: add a return value to set_capacity_revalidate_and_notifyChristoph Hellwig
Return if the function ended up sending an uevent or not. Cc: stable@vger.kernel.org # v5.9 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-12platform/chrome: cros_ec: Import Type C host commandsPrashant Malani
Import the EC_CMD_TYPEC_STATUS and EC_CMD_TYPEC_DISCOVERY Chrome OS EC host commands from the EC code base [1]. These commands can be used by the application processor to query Power Delivery (PD) discovery information concerning connected Type C peripherals. Also add the EC_FEATURE_TYPEC_CMD feature flag, which is used to determine whether these commands are supported by the EC. [1]: https://chromium.googlesource.com/chromiumos/platform/ec/+/refs/heads/master/include/ec_commands.h Signed-off-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20201029222738.482366-5-pmalani@chromium.org
2020-11-12Merge tag 'pm-5.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Make the intel_pstate driver behave as expected when it operates in the passive mode with HWP enabled and the 'powersave' governor on top of it" * tag 'pm-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Take CPUFREQ_GOV_STRICT_TARGET into account cpufreq: Add strict_target to struct cpufreq_policy cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET cpufreq: Introduce governor flags
2020-11-12spi: Introduce device-managed SPI controller allocationLukas Wunner
SPI driver probing currently comprises two steps, whereas removal comprises only one step: spi_alloc_master() spi_register_controller() spi_unregister_controller() That's because spi_unregister_controller() calls device_unregister() instead of device_del(), thereby releasing the reference on the spi_controller which was obtained by spi_alloc_master(). An SPI driver's private data is contained in the same memory allocation as the spi_controller struct. Thus, once spi_unregister_controller() has been called, the private data is inaccessible. But some drivers need to access it after spi_unregister_controller() to perform further teardown steps. Introduce devm_spi_alloc_master() and devm_spi_alloc_slave(), which release a reference on the spi_controller struct only after the driver has unbound, thereby keeping the memory allocation accessible. Change spi_unregister_controller() to not release a reference if the spi_controller was allocated by one of these new devm functions. The present commit is small enough to be backportable to stable. It allows fixing drivers which use the private data in their ->remove() hook after it's been freed. It also allows fixing drivers which neglect to release a reference on the spi_controller in the probe error path. Long-term, most SPI drivers shall be moved over to the devm functions introduced herein. The few that can't shall be changed in a treewide commit to explicitly release the last reference on the controller. That commit shall amend spi_unregister_controller() to no longer release a reference, thereby completing the migration. As a result, the behaviour will be less surprising and more consistent with subsystems such as IIO, which also includes the private data in the allocation of the generic iio_dev struct, but calls device_del() in iio_device_unregister(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/272bae2ef08abd21388c98e23729886663d19192.1605121038.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org>
2020-11-12serial: imx: Remove unused platform data supportFabio Estevam
Since 5.10-rc1 i.MX is a devicetree-only platform and the existing platform data support in this driver was only useful for old non-devicetree platforms. Get rid of the platform data support since it is no longer used. Reviewed-by: Fugang Duan <fugang.duan@nxp.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20201110214840.16768-1-festevam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-11net: evaluate net.ipv4.conf.all.proxy_arp_pvlanVincent Bernat
Introduced in 65324144b50b, the "proxy_arp_vlan" sysctl is a per-interface sysctl to tune proxy ARP support for private VLANs. While the "all" variant is exposed, it was a noop and never evaluated. We use the usual "or" logic for this kind of sysctls. Fixes: 65324144b50b ("net: RFC3069, private VLAN proxy arp support") Signed-off-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11net: evaluate net.ipvX.conf.all.ignore_routes_with_linkdownVincent Bernat
Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl ignores a route whose interface is down. It is provided as a per-interface sysctl. However, while a "all" variant is exposed, it was a noop since it was never evaluated. We use the usual "or" logic for this kind of sysctls. Tested with: ip link add type veth # veth0 + veth1 ip link add type veth # veth1 + veth2 ip link set up dev veth0 ip link set up dev veth1 # link-status paired with veth0 ip link set up dev veth2 ip link set up dev veth3 # link-status paired with veth2 # First available path ip -4 addr add 203.0.113.${uts#H}/24 dev veth0 ip -6 addr add 2001:db8:1::${uts#H}/64 dev veth0 # Second available path ip -4 addr add 192.0.2.${uts#H}/24 dev veth2 ip -6 addr add 2001:db8:2::${uts#H}/64 dev veth2 # More specific route through first path ip -4 route add 198.51.100.0/25 via 203.0.113.254 # via veth0 ip -6 route add 2001:db8:3::/56 via 2001:db8:1::ff # via veth0 # Less specific route through second path ip -4 route add 198.51.100.0/24 via 192.0.2.254 # via veth2 ip -6 route add 2001:db8:3::/48 via 2001:db8:2::ff # via veth2 # H1: enable on "all" # H2: enable on "veth0" for v in ipv4 ipv6; do case $uts in H1) sysctl -qw net.${v}.conf.all.ignore_routes_with_linkdown=1 ;; H2) sysctl -qw net.${v}.conf.veth0.ignore_routes_with_linkdown=1 ;; esac done set -xe # When veth0 is up, best route is through veth0 ip -o route get 198.51.100.1 | grep -Fw veth0 ip -o route get 2001:db8:3::1 | grep -Fw veth0 # When veth0 is down, best route should be through veth2 on H1/H2, # but on veth0 on H2 ip link set down dev veth1 # down veth0 ip route show [ $uts != H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth0 [ $uts != H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth0 [ $uts = H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth2 [ $uts = H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth2 Without this patch, the two last lines would fail on H1 (the one using the "all" sysctl). With the patch, everything succeeds as expected. Also document the sysctl in `ip-sysctl.rst`. Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") Signed-off-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11Merge branch 'stable/for-linus-5.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "Two tiny fixes for issues that make drivers under Xen unhappy under certain conditions" * 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
2020-11-11dma-buf: Document that dma-buf size is fixedJianxin Xiong
The fact that the size of dma-buf is invariant over the lifetime of the buffer is mentioned in the comment of 'dma_buf_ops.mmap', but is not documented at where the info is defined. Add the missing documentation. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1605044477-51833-7-git-send-email-jianxin.xiong@intel.com
2020-11-11spi: introduce SPI_MODE_X_MASK macroOleksij Rempel
Provide a macro to filter all SPI_MODE_0,1,2,3 mode in one run. The latest SPI framework will parse the devicetree in following call sequence: of_register_spi_device() -> of_spi_parse_dt() So, driver do not need to pars the devicetree and will get prepared flags in the probe. On one hand it is good far most drivers. On other hand some drivers need to filter flags provide by SPI framework and apply know to work flags. This drivers may use SPI_MODE_X_MASK to filter MODE flags and set own, known flags: spi->flags &= ~SPI_MODE_X_MASK; spi->flags |= SPI_MODE_0; Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20201027095724.18654-2-o.rempel@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2020-11-11thunderbolt: Add support for end-to-end flow controlMika Westerberg
USB4 spec defines end-to-end (E2E) flow control that can be used between hosts to prevent overflow of a RX ring. We previously had this partially implemented but that code was removed with commit 53f13319d131 ("thunderbolt: Get rid of E2E workaround") with the idea that we add it back properly if there ever is need. Now that we are going to add DMA traffic test driver (in subsequent patches) this can be useful. For this reason we modify tb_ring_alloc_rx/tx() so that they accept RING_FLAG_E2E and configure the hardware ring accordingly. The RX side also requires passing TX HopID (e2e_tx_hop) used in the credit grant packets. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-11thunderbolt: Create debugfs directory automatically for servicesMika Westerberg
This allows service drivers to use it as parent directory if they need to add their own debugfs entries. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-11thunderbolt: Add functions for enabling and disabling lane bonding on XDomainIsaac Hazan
These can be used by service drivers to enable and disable lane bonding as needed. Signed-off-by: Isaac Hazan <isaac.hazan@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-11thunderbolt: Add link_speed and link_width to XDomainIsaac Hazan
Link speed and link width are needed for checking expected values in case of using a loopback service. Signed-off-by: Isaac Hazan <isaac.hazan@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-10ftrace: Clean up the recursion code a bitSteven Rostedt (VMware)
In trace_test_and_set_recursion(), current->trace_recursion is placed into a variable, and that variable should be used for the processing, as there's no reason to dereference current multiple times. On trace_clear_recursion(), current->trace_recursion is modified and there's no reason to copy it over to a variable. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10fgraph: Make overruns 4 bytes in graph stack structureSteven Rostedt (VMware)
Inspecting the data structures of the function graph tracer, I found that the overrun value is unsigned long, which is 8 bytes on a 64 bit machine, and not only that, the depth is an int (4 bytes). The overrun can be simply an unsigned int (4 bytes) and pack the ftrace_graph_ret structure better. The depth is moved up next to the func, as it is used more often with func, and improves cache locality. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-10vfs: move __sb_{start,end}_write* to fs.hDarrick J. Wong
Now that we've straightened out the callers, move these three functions to fs.h since they're fairly trivial. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz>
2020-11-10vfs: separate __sb_start_write into blocking and non-blocking helpersDarrick J. Wong
Break this function into two helpers so that it's obvious that the trylock versions return a value that must be checked, and the blocking versions don't require that. While we're at it, clean up the return type mismatch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-11-10bpf: Load and verify kernel module BTFsAndrii Nakryiko
Add kernel module listener that will load/validate and unload module BTF. Module BTFs gets ID generated for them, which makes it possible to iterate them with existing BTF iteration API. They are given their respective module's names, which will get reported through GET_OBJ_INFO API. They are also marked as in-kernel BTFs for tooling to distinguish them from user-provided BTFs. Also, similarly to vmlinux BTF, kernel module BTFs are exposed through sysfs as /sys/kernel/btf/<module-name>. This is convenient for user-space tools to inspect module BTF contents and dump their types with existing tools: [vmuser@archvm bpf]$ ls -la /sys/kernel/btf total 0 drwxr-xr-x 2 root root 0 Nov 4 19:46 . drwxr-xr-x 13 root root 0 Nov 4 19:46 .. ... -r--r--r-- 1 root root 888 Nov 4 19:46 irqbypass -r--r--r-- 1 root root 100225 Nov 4 19:46 kvm -r--r--r-- 1 root root 35401 Nov 4 19:46 kvm_intel -r--r--r-- 1 root root 120 Nov 4 19:46 pcspkr -r--r--r-- 1 root root 399 Nov 4 19:46 serio_raw -r--r--r-- 1 root root 4094095 Nov 4 19:46 vmlinux Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org
2020-11-10PM: domains: Rename pm_genpd_syscore_poweroff|poweron()Ulf Hansson
To better describe what the pm_genpd_syscore_poweroff|poweron() functions actually do, let's rename them to dev_pm_genpd_suspend|resume() and update the rather few callers of them accordingly (a couple of clocksource drivers). Moreover, let's take the opportunity to add some documentation of these exported functions, as that is currently missing. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-10PM: EM: update the comments related to power scaleLukasz Luba
The Energy Model supports power values expressed in milli-Watts or in an 'abstract scale'. Update the related comments is the code to reflect that state. Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-10PM: EM: Add a flag indicating units of power values in Energy ModelLukasz Luba
There are different platforms and devices which might use different scale for the power values. Kernel sub-systems might need to check if all Energy Model (EM) devices are using the same scale. Address that issue and store the information inside EM for each device. Thanks to that they can be easily compared and proper action triggered. Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-10Merge branch 'sched/migrate-disable'Peter Zijlstra
2020-11-10sched: Fix migrate_disable() vs rt/dl balancingPeter Zijlstra
In order to minimize the interference of migrate_disable() on lower priority tasks, which can be deprived of runtime due to being stuck below a higher priority task. Teach the RT/DL balancers to push away these higher priority tasks when a lower priority task gets selected to run on a freshly demoted CPU (pull). This adds migration interference to the higher priority task, but restores bandwidth to system that would otherwise be irrevocably lost. Without this it would be possible to have all tasks on the system stuck on a single CPU, each task preempted in a migrate_disable() section with a single high priority task running. This way we can still approximate running the M highest priority tasks on the system. Migrating the top task away is (ofcourse) still subject to migrate_disable() too, which means the lower task is subject to an interference equivalent to the worst case migrate_disable() section. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102347.499155098@infradead.org
2020-11-10sched,rt: Use cpumask_any*_distribute()Peter Zijlstra
Replace a bunch of cpumask_any*() instances with cpumask_any*_distribute(), by injecting this little bit of random in cpu selection, we reduce the chance two competing balance operations working off the same lowest_mask pick the same CPU. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102347.190759694@infradead.org
2020-11-10sched: Fix migrate_disable() vs set_cpus_allowed_ptr()Peter Zijlstra
Concurrent migrate_disable() and set_cpus_allowed_ptr() has interesting features. We rely on set_cpus_allowed_ptr() to not return until the task runs inside the provided mask. This expectation is exported to userspace. This means that any set_cpus_allowed_ptr() caller must wait until migrate_enable() allows migrations. At the same time, we don't want migrate_enable() to schedule, due to patterns like: preempt_disable(); migrate_disable(); ... migrate_enable(); preempt_enable(); And: raw_spin_lock(&B); spin_unlock(&A); this means that when migrate_enable() must restore the affinity mask, it cannot wait for completion thereof. Luck will have it that that is exactly the case where there is a pending set_cpus_allowed_ptr(), so let that provide storage for the async stop machine. Much thanks to Valentin who used TLA+ most effective and found lots of 'interesting' cases. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102346.921768277@infradead.org
2020-11-10sched: Add migrate_disable()Peter Zijlstra
Add the base migrate_disable() support (under protest). While migrate_disable() is (currently) required for PREEMPT_RT, it is also one of the biggest flaws in the system. Notably this is just the base implementation, it is broken vs sched_setaffinity() and hotplug, both solved in additional patches for ease of review. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102346.818170844@infradead.org
2020-11-10sched/hotplug: Consolidate task migration on CPU unplugThomas Gleixner
With the new mechanism which kicks tasks off the outgoing CPU at the end of schedule() the situation on an outgoing CPU right before the stopper thread brings it down completely is: - All user tasks and all unbound kernel threads have either been migrated away or are not running and the next wakeup will move them to a online CPU. - All per CPU kernel threads, except cpu hotplug thread and the stopper thread have either been unbound or parked by the responsible CPU hotplug callback. That means that at the last step before the stopper thread is invoked the cpu hotplug thread is the last legitimate running task on the outgoing CPU. Add a final wait step right before the stopper thread is kicked which ensures that any still running tasks on the way to park or on the way to kick themself of the CPU are either sleeping or gone. This allows to remove the migrate_tasks() crutch in sched_cpu_dying(). If sched_cpu_dying() detects that there is still another running task aside of the stopper thread then it will explode with the appropriate fireworks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102346.547163969@infradead.org
2020-11-10stop_machine: Add function and caller debug infoPeter Zijlstra
Crashes in stop-machine are hard to connect to the calling code, add a little something to help with that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102346.116513635@infradead.org
2020-11-10cpufreq: Add strict_target to struct cpufreq_policyRafael J. Wysocki
Add a new field to be set when the CPUFREQ_GOV_STRICT_TARGET flag is set for the current governor to struct cpufreq_policy, so that the drivers needing to check CPUFREQ_GOV_STRICT_TARGET do not have to access the governor object during every frequency transition. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-11-10cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGETRafael J. Wysocki
Introduce a new governor flag, CPUFREQ_GOV_STRICT_TARGET, for the governors that want the target frequency to be set exactly to the given value without leaving any room for adjustments on the hardware side and set this flag for the powersave and performance governors. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-11-10cpufreq: Introduce governor flagsRafael J. Wysocki
A new cpufreq governor flag will be added subsequently, so replace the bool dynamic_switching fleid in struct cpufreq_governor with a flags field and introduce CPUFREQ_GOV_DYNAMIC_SWITCHING to set for the "dynamic switching" governors instead of it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-11-10Merge drm/drm-next into drm-misc-nextThomas Zimmermann
We need commit f8f6ae5d077a ("mm: always have io_remap_pfn_range() set pgprot_decrypted()") to be able to merge Jason's cleanup patch. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2020-11-10Merge v5.10-rc3 into drm-nextDaniel Vetter
We need commit f8f6ae5d077a ("mm: always have io_remap_pfn_range() set pgprot_decrypted()") to be able to merge Jason's cleanup patch. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2020-11-09net: core: add dev_get_tstats64 as a ndo_get_stats64 implementationHeiner Kallweit
It's a frequent pattern to use netdev->stats for the less frequently accessed counters and per-cpu counters for the frequently accessed counters (rx/tx bytes/packets). Add a default ndo_get_stats64() implementation for this use case. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-09Merge tag 'ext4_for_linus_cleanups' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes and cleanups from Ted Ts'o: "More fixes and cleanups for the new fast_commit features, but also a few other miscellaneous bug fixes and a cleanup for the MAINTAINERS file" * tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits) jbd2: fix up sparse warnings in checkpoint code ext4: fix sparse warnings in fast_commit code ext4: cleanup fast commit mount options jbd2: don't start fast commit on aborted journal ext4: make s_mount_flags modifications atomic ext4: issue fsdev cache flush before starting fast commit ext4: disable fast commit with data journalling ext4: fix inode dirty check in case of fast commits ext4: remove unnecessary fast commit calls from ext4_file_mmap ext4: mark buf dirty before submitting fast commit buffer ext4: fix code documentatioon ext4: dedpulicate the code to wait on inode that's being committed jbd2: don't read journal->j_commit_sequence without taking a lock jbd2: don't touch buffer state until it is filled jbd2: add todo for a fast commit performance optimization jbd2: don't pass tid to jbd2_fc_end_commit_fallback() jbd2: don't use state lock during commit path jbd2: drop jbd2_fc_init documentation ext4: clean up the JBD2 API that initializes fast commits jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs ...
2020-11-09drivers: base: fix some kernel-doc markupsMauro Carvalho Chehab
class_create is actually defined at the header. Fix the markup there and add a new one at the right place. While here, also fix some markups for functions that have different names between their prototypes and kernel-doc comments. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/2fb6efd6a1f90d69ff73bf579566079cbb051e15.1603469755.git.mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-09uio: fix some kernel-doc markupsMauro Carvalho Chehab
The definitions for (devm_)uio_register_device should be at the header file, as the macros are there. The ones inside uio.c refer, instead, to __(devm_)uio_register_device. Update them and add new kernel-doc markups for the macros. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/82ab7b68d271aeda7396e369ff8a629491b9d628.1603469755.git.mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-09kernfs: bring names in comments in line with codeWillem de Bruijn
Fix two stragglers in the comments of the below rename operation. Fixes: adc5e8b58f48 ("kernfs: drop s_ prefix from kernfs_node members") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20201015185726.1386868-1-willemdebruijn.kernel@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-09perf/arch: Remove perf_sample_data::regs_user_copyPeter Zijlstra
struct perf_sample_data lives on-stack, we should be careful about it's size. Furthermore, the pt_regs copy in there is only because x86_64 is a trainwreck, solve it differently. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20201030151955.258178461@infradead.org
2020-11-09perf: Reduce stack usage of perf_output_begin()Peter Zijlstra
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_output_begin() must already have a perf_sample_data on-stack. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org