summaryrefslogtreecommitdiff
path: root/init/Kconfig
AgeCommit message (Collapse)Author
2023-08-02sched/psi: Select KERNFS as neededRandy Dunlap
Users of KERNFS should select it to enforce its being built, so do this to prevent a build error. In file included from ../kernel/sched/build_utility.c:97: ../kernel/sched/psi.c: In function 'psi_trigger_poll': ../kernel/sched/psi.c:1479:17: error: implicit declaration of function 'kernfs_generic_poll' [-Werror=implicit-function-declaration] 1479 | kernfs_generic_poll(t->of, wait); Fixes: aff037078eca ("sched/psi: use kernfs polling functions for PSI trigger polling") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Suren Baghdasaryan <surenb@google.com> Link: lore.kernel.org/r/202307310732.r65EQFY0-lkp@intel.com
2023-06-09cachestat: implement cachestat syscallNhat Pham
There is currently no good way to query the page cache state of large file sets and directory trees. There is mincore(), but it scales poorly: the kernel writes out a lot of bitmap data that userspace has to aggregate, when the user really doesn not care about per-page information in that case. The user also needs to mmap and unmap each file as it goes along, which can be quite slow as well. Some use cases where this information could come in handy: * Allowing database to decide whether to perform an index scan or direct table queries based on the in-memory cache state of the index. * Visibility into the writeback algorithm, for performance issues diagnostic. * Workload-aware writeback pacing: estimating IO fulfilled by page cache (and IO to be done) within a range of a file, allowing for more frequent syncing when and where there is IO capacity, and batching when there is not. * Computing memory usage of large files/directory trees, analogous to the du tool for disk usage. More information about these use cases could be found in the following thread: https://lore.kernel.org/lkml/20230315170934.GA97793@cmpxchg.org/ This patch implements a new syscall that queries cache state of a file and summarizes the number of cached pages, number of dirty pages, number of pages marked for writeback, number of (recently) evicted pages, etc. in a given range. Currently, the syscall is only wired in for x86 architecture. NAME cachestat - query the page cache statistics of a file. SYNOPSIS #include <sys/mman.h> struct cachestat_range { __u64 off; __u64 len; }; struct cachestat { __u64 nr_cache; __u64 nr_dirty; __u64 nr_writeback; __u64 nr_evicted; __u64 nr_recently_evicted; }; int cachestat(unsigned int fd, struct cachestat_range *cstat_range, struct cachestat *cstat, unsigned int flags); DESCRIPTION cachestat() queries the number of cached pages, number of dirty pages, number of pages marked for writeback, number of evicted pages, number of recently evicted pages, in the bytes range given by `off` and `len`. An evicted page is a page that is previously in the page cache but has been evicted since. A page is recently evicted if its last eviction was recent enough that its reentry to the cache would indicate that it is actively being used by the system, and that there is memory pressure on the system. These values are returned in a cachestat struct, whose address is given by the `cstat` argument. The `off` and `len` arguments must be non-negative integers. If `len` > 0, the queried range is [`off`, `off` + `len`]. If `len` == 0, we will query in the range from `off` to the end of the file. The `flags` argument is unused for now, but is included for future extensibility. User should pass 0 (i.e no flag specified). Currently, hugetlbfs is not supported. Because the status of a page can change after cachestat() checks it but before it returns to the application, the returned values may contain stale information. RETURN VALUE On success, cachestat returns 0. On error, -1 is returned, and errno is set to indicate the error. ERRORS EFAULT cstat or cstat_args points to an invalid address. EINVAL invalid flags. EBADF invalid file descriptor. EOPNOTSUPP file descriptor is of a hugetlbfs file [nphamcs@gmail.com: replace rounddown logic with the existing helper] Link: https://lkml.kernel.org/r/20230504022044.3675469-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230503013608.2431726-3-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Brian Foster <bfoster@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-27Merge tag 'driver-core-6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems" * tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits) device property: make device_property functions take const device * driver core: update comments in device_rename() driver core: Don't require dynamic_debug for initcall_debug probe timing firmware_loader: rework crypto dependencies firmware_loader: Strip off \n from customized path zram: fix up permission for the hot_add sysfs file cacheinfo: Add use_arch[|_cache]_info field/function arch_topology: Remove early cacheinfo error message if -ENOENT cacheinfo: Check cache properties are present in DT cacheinfo: Check sib_leaf in cache_leaves_are_shared() cacheinfo: Allow early level detection when DT/ACPI info is missing/broken cacheinfo: Add arm64 early level initializer implementation cacheinfo: Add arch specific early level initializer tty: make tty_class a static const structure driver core: class: remove struct class_interface * from callbacks driver core: class: mark the struct class in struct class_interface constant driver core: class: make class_register() take a const * driver core: class: mark class_release() as taking a const * driver core: remove incorrect comment for device_create* MIPS: vpe-cmp: remove module owner pointer from struct class usage. ...
2023-04-25Merge tag 'slab-for-6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: "The main change is naturally the SLOB removal. Since its deprecation in 6.2 I've seen no complaints so hopefully SLUB_(TINY) works well for everyone and we can proceed. Besides the code cleanup, the main immediate benefit will be allowing kfree() family of function to work on kmem_cache_alloc() objects, which was incompatible with SLOB. This includes kfree_rcu() which had no kmem_cache_free_rcu() counterpart yet and now it shouldn't be necessary anymore. Besides that, there are several small code and comment improvements from Thomas, Thorsten and Vernon" * tag 'slab-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab: document kfree() as allowed for kmem_cache_alloc() objects mm/slob: remove slob.c mm/slab: remove CONFIG_SLOB code from slab common code mm, pagemap: remove SLOB and SLQB from comments and documentation mm, page_flags: remove PG_slob_free mm/slob: remove CONFIG_SLOB mm/slub: fix help comment of SLUB_DEBUG mm: slub: make kobj_type structure constant slab: Adjust comment after refactoring of gfp.h
2023-04-25Merge tag 'printk-for-6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Code cleanup and dead code removal * tag 'printk-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: Remove obsoleted check for non-existent "user" object lib/vsprintf: Use isodigit() for the octal number check Remove orphaned CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT
2023-04-23gcc: disable '-Warray-bounds' for gcc-13 tooLinus Torvalds
We started disabling '-Warray-bounds' for gcc-12 originally on s390, because it resulted in some warnings that weren't realistically fixable (commit 8b202ee21839: "s390: disable -Warray-bounds"). That s390-specific issue was then found to be less common elsewhere, but generic (see f0be87c42cbd: "gcc-12: disable '-Warray-bounds' universally for now"), and then later expanded the version check was expanded to gcc-11 (5a41237ad1d4: "gcc: disable -Warray-bounds for gcc-11 too"). And it turns out that I was much too optimistic in thinking that it's all going to go away, and here we are with gcc-13 showing all the same issues. So instead of expanding this one version at a time, let's just disable it for gcc-11+, and put an end limit to it only when we actually find a solution. Yes, I'm sure some of this is because the kernel just does odd things (like our "container_of()" use, but also knowingly playing games with things like linker tables and array layouts). And yes, some of the warnings are likely signs of real bugs, but when there are hundreds of false positives, that doesn't really help. Oh well. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-29mm/slob: remove CONFIG_SLOBVlastimil Babka
Remove SLOB from Kconfig and Makefile. Everything under #ifdef CONFIG_SLOB, and mm/slob.c is now dead code. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
2023-03-27Remove orphaned CONFIG_PRINTK_SAFE_LOG_BUF_SHIFTMarc Aurèle La France
After the commit 93d102f094be9beab2 ("printk: remove safe buffers"), CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT is no longer useful. Remove it. Signed-off-by: Marc Aurèle La France <tsi@tuyoix.net> Reviewed-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Petr Mladek <pmladek@suse.com> [pmladek@suse.cz: Cleaned up the commit message.] Signed-off-by: Petr Mladek <pmladek@suse.com> Fixes: 93d102f094be9beab ("printk: remove safe buffers") Link: https://lore.kernel.org/r/5c19e248-1b6b-330c-7c4c-a824688daefe@tuyoix.net
2023-03-06driver core: remove CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2Greg Kroah-Hartman
CONFIG_SYSFS_DEPRECATED was added in commit 88a22c985e35 ("CONFIG_SYSFS_DEPRECATED") in 2006 to allow systems with older versions of some tools (i.e. Fedora 3's version of udev) to boot properly. Four years later, in 2010, the option was attempted to be removed as most of userspace should have been fixed up properly by then, but some kernel developers clung to those old systems and refused to update, so we added CONFIG_SYSFS_DEPRECATED_V2 in commit e52eec13cd6b ("SYSFS: Allow boot time switching between deprecated and modern sysfs layout") to allow them to continue to boot properly, and we allowed a boot time parameter to be used to switch back to the old format if needed. Over time, the logic that was covered under these config options was slowly removed from individual driver subsystems successfully, removed, and the only thing that is now left in the kernel are some changes in the block layer's representation in sysfs where real directories are used instead of symlinks like normal. Because the original changes were done to userspace tools in 2006, and all distros that use those tools are long end-of-life, and older non-udev-based systems do not care about the block layer's sysfs representation, it is time to finally remove this old logic and the config entries from the kernel. Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: linux-block@vger.kernel.org Cc: linux-doc@vger.kernel.org Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20230223073326.2073220-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-26Merge tag 'kbuild-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Change V=1 option to print both short log and full command log - Allow V=1 and V=2 to be combined as V=12 - Make W=1 detect wrong .gitignore files - Tree-wide cleanups for unused command line arguments passed to Clang - Stop using -Qunused-arguments with Clang - Make scripts/setlocalversion handle only correct release tags instead of any arbitrary annotated tag - Create Debian and RPM source packages without cleaning the source tree - Various cleanups for packaging * tag 'kbuild-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (74 commits) kbuild: rpm-pkg: remove unneeded KERNELRELEASE from modules/headers_install docs: kbuild: remove description of KBUILD_LDS_MODULE .gitattributes: use 'dts' diff driver for *.dtso files kbuild: deb-pkg: improve the usability of source package kbuild: deb-pkg: fix binary-arch and clean in debian/rules kbuild: tar-pkg: use tar rules in scripts/Makefile.package kbuild: make perf-tar*-src-pkg work without relying on git kbuild: deb-pkg: switch over to source format 3.0 (quilt) kbuild: deb-pkg: make .orig tarball a hard link if possible kbuild: deb-pkg: hide KDEB_SOURCENAME from Makefile kbuild: srcrpm-pkg: create source package without cleaning kbuild: rpm-pkg: build binary packages from source rpm kbuild: deb-pkg: create source package without cleaning kbuild: add a tool to list files ignored by git Documentation/llvm: add Chimera Linux, Google and Meta datacenters setlocalversion: use only the correct release tag for git-describe setlocalversion: clean up the construction of version output .gitignore: ignore *.cover and *.mbx kbuild: remove --include-dir MAKEFLAG from top Makefile kbuild: fix trivial typo in comment ...
2023-02-23Merge tag 'bootconfig-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull bootconfig updates from Masami Hiramatsu: - Fix ftrace2bconf.sh tool for checking event enable status correctly - Add CONFIG_BOOT_CONFIG_FORCE to apply bootconfig without 'bootconfig' boot parameter - Enable CONFIG_BOOT_CONFIG_FORCE by default if a bootconfig is embedded in the kernel - Increase max number of nodes of bootconfig to 8192 * tag 'bootconfig-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support bootconfig: Default BOOT_CONFIG_FORCE to y if BOOT_CONFIG_EMBED Allow forcing unconditional bootconfig processing tools/bootconfig: fix single & used for logical condition
2023-02-21Merge tag 'net-next-6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use. - Introduce sysctl to set default RPS configuration for new netdevs. - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers. - Expose all net/core sysctls inside netns. - Remove 4s sleep in netpoll if carrier is instantly detected on boot. - Add configurable limit of MDB entries per port, and port-vlan. - Continue populating drop reasons throughout the stack. - Retire a handful of legacy Qdiscs and classifiers. Protocols: - Support IPv4 big TCP (TSO frames larger than 64kB). - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis. - Track and report in procfs number of MPTCP sockets used. - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager. - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4). - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986). - ICMP: add per-rate limit counters. - Add support for user scanning requests in ieee802154. - Remove static WEP support. - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting. - WiFi 7 EHT channel puncturing support (client & AP). BPF: - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type. - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata. - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata. - Improve x86 JIT's codegen for PROBE_MEM runtime error checks. - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers. - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case. - Significantly reduce the search time for module symbols by livepatch and BPF. - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals. - Add support for BPF trampoline on s390x and riscv64. - Add capability to export the XDP features supported by the NIC. - Add __bpf_kfunc tag for marking kernel functions as kfuncs. - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments. Netfilter: - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target. - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist. Driver API: - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms. - Separate C22 and C45 MDIO bus transactions more clearly. - Introduce new DCB table to control DSCP rewrite on egress. - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet. - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames. - Add support for controlling MACSec offload using netlink SET. - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling. - Add TX frame aggregation parameters (for USB drivers). - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug. - Allow offloading of UDP NEW connections via act_ct. - Add support for per action HW stats in TC. - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain). - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead. - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance. New hardware / drivers: - Ethernet: - nVidia BlueField-3 support (control traffic driver) - Ethernet support for imx93 SoCs - Motorcomm yt8531 gigabit Ethernet PHY - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA) - Microchip LAN8841 PHY (incl. cable diagnostics and PTP) - Amlogic gxl MDIO mux - WiFi: - RealTek RTL8188EU (rtl8xxxu) - Qualcomm Wi-Fi 7 devices (ath12k) - CAN: - Renesas R-Car V4H Drivers: - Bluetooth: - Set Per Platform Antenna Gain (PPAG) for Intel controllers. - Ethernet NICs: - Intel (1G, igc): - support TSN / Qbv / packet scheduling features of i226 model - Intel (100G, ice): - use GNSS subsystem instead of TTY - multi-buffer XDP support - extend support for GPIO pins to E823 devices - nVidia/Mellanox: - update the shared buffer configuration on PFC commands - implement PTP adjphase function for HW offset control - TC support for Geneve and GRE with VF tunnel offload - more efficient crypto key management method - multi-port eswitch support - Netronome/Corigine: - add DCB IEEE support - support IPsec offloading for NFP3800 - Freescale/NXP (enetc): - support XDP_REDIRECT for XDP non-linear buffers - improve reconfig, avoid link flap and waiting for idle - support MAC Merge layer - Other NICs: - sfc/ef100: add basic devlink support for ef100 - ionic: rx_push mode operation (writing descriptors via MMIO) - bnxt: use the auxiliary bus abstraction for RDMA - r8169: disable ASPM and reset bus in case of tx timeout - cpsw: support QSGMII mode for J721e CPSW9G - cpts: support pulse-per-second output - ngbe: add an mdio bus driver - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing - r8152: handle devices with FW with NCM support - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation - virtio-net: support multi buffer XDP - virtio/vsock: replace virtio_vsock_pkt with sk_buff - tsnep: XDP support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add support for latency TLV (in FW control messages) - Microchip (sparx5): - separate explicit and implicit traffic forwarding rules, make the implicit rules always active - add support for egress DSCP rewrite - IS0 VCAP support (Ingress Classification) - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.) - ES2 VCAP support (Egress Access Control) - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1) - Ethernet embedded switches: - Marvell (mv88e6xxx): - add MAB (port auth) offload support - enable PTP receive for mv88e6390 - NXP (ocelot): - support MAC Merge layer - support for the the vsc7512 internal copper phys - Microchip: - lan9303: convert to PHYLINK - lan966x: support TC flower filter statistics - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x - lan937x: support Credit Based Shaper configuration - ksz9477: support Energy Efficient Ethernet - other: - qca8k: convert to regmap read/write API, use bulk operations - rswitch: Improve TX timestamp accuracy - Intel WiFi (iwlwifi): - EHT (Wi-Fi 7) rate reporting - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware. - Qualcomm 802.11ax WiFi (ath11k): - IPQ5018 support - Fine Timing Measurement (FTM) responder role support - channel 177 support - MediaTek WiFi (mt76): - per-PHY LED support - mt7996: EHT (Wi-Fi 7) support - Wireless Ethernet Dispatch (WED) reset support - switch to using page pool allocator - RealTek WiFi (rtw89): - support new version of Bluetooth co-existance - Mobile: - rmnet: support TX aggregation" * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits) page_pool: add a comment explaining the fragment counter usage net: ethtool: fix __ethtool_dev_mm_supported() implementation ethtool: pse-pd: Fix double word in comments xsk: add linux/vmalloc.h to xsk.c sefltests: netdevsim: wait for devlink instance after netns removal selftest: fib_tests: Always cleanup before exit net/mlx5e: Align IPsec ASO result memory to be as required by hardware net/mlx5e: TC, Set CT miss to the specific ct action instance net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG net/mlx5: Refactor tc miss handling to a single function net/mlx5: Kconfig: Make tc offload depend on tc skb extension net/sched: flower: Support hardware miss to tc action net/sched: flower: Move filter handle initialization earlier net/sched: cls_api: Support hardware miss to tc action net/sched: Rename user cookie and act cookie sfc: fix builds without CONFIG_RTC_LIB sfc: clean up some inconsistent indentings net/mlx4_en: Introduce flexible array to silence overflow warning net: lan966x: Fix possible deadlock inside PTP net/ulp: Remove redundant ->clone() test in inet_clone_ulp(). ...
2023-02-22bootconfig: Default BOOT_CONFIG_FORCE to y if BOOT_CONFIG_EMBEDPaul E. McKenney
When a kernel is built with CONFIG_BOOT_CONFIG_EMBED=y, the intention will normally be to unconditionally provide the specified kernel-boot arguments to the kernel, as opposed to requiring a separately provided bootconfig parameter. Therefore, make the BOOT_CONFIG_FORCE Kconfig option default to y in kernels built with CONFIG_BOOT_CONFIG_EMBED=y. The old semantics may be obtained by manually overriding this default. Link: https://lore.kernel.org/all/20230107162202.GA4028633@paulmck-ThinkPad-P17-Gen-1/ Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-02-22Allow forcing unconditional bootconfig processingPaul E. McKenney
The BOOT_CONFIG family of Kconfig options allows a bootconfig file containing kernel boot parameters to be embedded into an initrd or into the kernel itself. This can be extremely useful when deploying kernels in cases where some of the boot parameters depend on the kernel version rather than on the server hardware, firmware, or workload. Unfortunately, the "bootconfig" kernel parameter must be specified in order to cause the kernel to look for the embedded bootconfig file, and it clearly does not help to embed this "bootconfig" kernel parameter into that file. Therefore, provide a new BOOT_CONFIG_FORCE Kconfig option that causes the kernel to act as if the "bootconfig" kernel parameter had been specified. In other words, kernels built with CONFIG_BOOT_CONFIG_FORCE=y will look for the embedded bootconfig file even when the "bootconfig" kernel parameter is omitted. This permits kernel-version-dependent kernel boot parameters to be embedded into the kernel image without the need to (for example) update large numbers of boot loaders. Link: https://lore.kernel.org/all/20230105005838.GA1772817@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <linux-doc@vger.kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-02-21Merge tag 'rcu.2023.02.10a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes, perhaps most notably: - Throttling callback invocation based on the number of callbacks that are now ready to invoke instead of on the total number of callbacks - Several patches that suppress false-positive boot-time diagnostics, for example, due to lockdep not yet being initialized - Make expedited RCU CPU stall warnings dump stacks of any tasks that are blocking the stalled grace period. (Normal RCU CPU stall warnings have done this for many years) - Lazy-callback fixes to avoid delays during boot, suspend, and resume. (Note that lazy callbacks must be explicitly enabled, so this should not (yet) affect production use cases) - Make kfree_rcu() and friends take advantage of polled grace periods, thus reducing memory footprint by almost two orders of magnitude, admittedly on a microbenchmark This also begins the transition from kfree_rcu(p) to kfree_rcu_mightsleep(p). This transition was motivated by bugs where kfree_rcu(p), which can block, was typed instead of the intended kfree_rcu(p, rh) - SRCU updates, perhaps most notably fixing a bug that causes SRCU to fail when booted on a system with a non-zero boot CPU. This surprising situation actually happens for kdump kernels on the powerpc architecture This also adds an srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side critical section to be handed off from one task to another - Clean up the now-useless SRCU Kconfig option There are a few more commits that are not yet acked or pulled into maintainer trees, and these will be in a pull request for a later merge window - RCU-tasks updates, perhaps most notably these fixes: - A strange interaction between PID-namespace unshare and the RCU-tasks grace period that results in a low-probability but very real hang - A race between an RCU tasks rude grace period on a single-CPU system and CPU-hotplug addition of the second CPU that can result in a too-short grace period - A race between shrinking RCU tasks down to a single callback list and queuing a new callback to some other CPU, but where that queuing is delayed for more than an RCU grace period. This can result in that callback being stranded on the non-boot CPU - Torture-test updates and fixes - Torture-test scripting updates and fixes - Provide additional RCU CPU stall-warning information in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute timeout limit for expedited RCU CPU stall warnings * tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep() kernel/notifier: Remove CONFIG_SRCU init: Remove "select SRCU" fs/quota: Remove "select SRCU" fs/notify: Remove "select SRCU" fs/btrfs: Remove "select SRCU" fs: Remove CONFIG_SRCU drivers/pci/controller: Remove "select SRCU" drivers/net: Remove "select SRCU" drivers/md: Remove "select SRCU" drivers/hwtracing/stm: Remove "select SRCU" drivers/dax: Remove "select SRCU" drivers/base: Remove CONFIG_SRCU rcu: Disable laziness if lazy-tracking says so rcu: Track laziness during boot and suspend rcu: Remove redundant call to rcu_boost_kthread_setaffinity() rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts rcu: Align the output of RCU CPU stall warning messages rcu: Add RCU stall diagnosis information sched: Add helper nr_context_switches_cpu() ...
2023-02-02init: Remove "select SRCU"Paul E. McKenney
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: John Ogness <john.ogness@linutronix.de>
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-28Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2023-01-28 We've added 124 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 6386 insertions(+), 1827 deletions(-). The main changes are: 1) Implement XDP hints via kfuncs with initial support for RX hash and timestamp metadata kfuncs, from Stanislav Fomichev and Toke Høiland-Jørgensen. Measurements on overhead: https://lore.kernel.org/bpf/875yellcx6.fsf@toke.dk 2) Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case, from Andrii Nakryiko. 3) Significantly reduce the search time for module symbols by livepatch and BPF, from Jiri Olsa and Zhen Lei. 4) Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals, from David Vernet. 5) Fix several issues in the dynptr processing such as stack slot liveness propagation, missing checks for PTR_TO_STACK variable offset, etc, from Kumar Kartikeya Dwivedi. 6) Various performance improvements, fixes, and introduction of more than just one XDP program to XSK selftests, from Magnus Karlsson. 7) Big batch to BPF samples to reduce deprecated functionality, from Daniel T. Lee. 8) Enable struct_ops programs to be sleepable in verifier, from David Vernet. 9) Reduce pr_warn() noise on BTF mismatches when they are expected under the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien. 10) Describe modulo and division by zero behavior of the BPF runtime in BPF's instruction specification document, from Dave Thaler. 11) Several improvements to libbpf API documentation in libbpf.h, from Grant Seltzer. 12) Improve resolve_btfids header dependencies related to subcmd and add proper support for HOSTCC, from Ian Rogers. 13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room() helper along with BPF selftests, from Ziyang Xuan. 14) Simplify the parsing logic of structure parameters for BPF trampoline in the x86-64 JIT compiler, from Pu Lehui. 15) Get BTF working for kernels with CONFIG_RUST enabled by excluding Rust compilation units with pahole, from Martin Rodriguez Reboredo. 16) Get bpf_setsockopt() working for kTLS on top of TCP sockets, from Kui-Feng Lee. 17) Disable stack protection for BPF objects in bpftool given BPF backends don't support it, from Holger Hoffstätte. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits) selftest/bpf: Make crashes more debuggable in test_progs libbpf: Add documentation to map pinning API functions libbpf: Fix malformed documentation formatting selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket. bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt(). bpf/selftests: Verify struct_ops prog sleepable behavior bpf: Pass const struct bpf_prog * to .check_member libbpf: Support sleepable struct_ops.s section bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable selftests/bpf: Fix vmtest static compilation error tools/resolve_btfids: Alter how HOSTCC is forced tools/resolve_btfids: Install subcmd headers bpf/docs: Document the nocast aliasing behavior of ___init bpf/docs: Document how nested trusted fields may be defined bpf/docs: Document cpumask kfuncs in a new file selftests/bpf: Add selftest suite for cpumask kfuncs selftests/bpf: Add nested trust selftests suite bpf: Enable cpumasks to be queried and used as kptrs bpf: Disallow NULLable pointers for trusted kfuncs ... ==================== Link: https://lore.kernel.org/r/20230128004827.21371-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26scripts: remove bin2cMasahiro Yamada
Commit 80f8be7af03f ("tomoyo: Omit use of bin2c") removed the last use of bin2c. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
2023-01-21Merge tag 'kbuild-fixes-v6.2-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Hide LDFLAGS_vmlinux from decompressor Makefiles to fix error messages when GNU Make 4.4 is used. - Fix 'make modules' build error when CONFIG_DEBUG_INFO_BTF_MODULES=y. - Fix warnings emitted by GNU Make 4.4 in scripts/kconfig/Makefile. - Support GNU Make 4.4 for scripts/jobserver-exec. - Show clearer error message when kernel/gen_kheaders.sh fails due to missing cpio. * tag 'kbuild-fixes-v6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kheaders: explicitly validate existence of cpio command scripts: support GNU make 4.4 in jobserver-exec kconfig: Update all declared targets scripts: rpm: make clear that mkspec script contains 4.13 feature init/Kconfig: fix LOCALVERSION_AUTO help text kbuild: fix 'make modules' error when CONFIG_DEBUG_INFO_BTF_MODULES=y kbuild: export top-level LDFLAGS_vmlinux only to scripts/Makefile.vmlinux init/version-timestamp.c: remove unneeded #include <linux/version.h> docs: kbuild: remove mention to dropped $(objtree) feature
2023-01-17btf, scripts: Exclude Rust CUs with paholeMartin Rodriguez Reboredo
Version 1.24 of pahole has the capability to exclude compilation units (CUs) of specific languages [1] [2]. Rust, as of writing, is not currently supported by pahole and if it's used with a build that has BTF debugging enabled it results in malformed kernel and module binaries [3]. So it's better for pahole to exclude Rust CUs until support for it arrives. Co-developed-by: Eric Curtin <ecurtin@redhat.com> Signed-off-by: Eric Curtin <ecurtin@redhat.com> Signed-off-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Eric Curtin <ecurtin@redhat.com> Reviewed-by: Neal Gompa <neal@gompa.dev> Acked-by: Miguel Ojeda <ojeda@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=49358dfe2aaae4e90b072332c3e324019826783f [1] Link: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=8ee363790b7437283c53090a85a9fec2f0b0fbc4 [2] Link: https://github.com/Rust-for-Linux/linux/issues/735 [3] Link: https://lore.kernel.org/bpf/20230111152050.559334-1-yakoyoku@gmail.com
2023-01-16Merge tag 'mm-hotfixes-stable-2023-01-16-15-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "21 hotfixes. Thirteen of these address pre-6.1 issues and hence have the cc:stable tag" * tag 'mm-hotfixes-stable-2023-01-16-15-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) init/Kconfig: fix typo (usafe -> unsafe) nommu: fix split_vma() map_count error nommu: fix do_munmap() error path nommu: fix memory leak in do_mmap() error path MAINTAINERS: update Robert Foss' email address proc: fix PIE proc-empty-vm, proc-pid-vm tests mm: update mmap_sem comments to refer to mmap_lock include/linux/mm: fix release_pages_arg kernel doc comment lib/win_minmax: use /* notation for regular comments kasan: mark kasan_kunit_executing as static nilfs2: fix general protection fault in nilfs_btree_insert() Docs/admin-guide/mm/zswap: remove zsmalloc's lack of writeback warning mm/hugetlb: pre-allocate pgtable pages for uffd wr-protects hugetlb: unshare some PMDs when splitting VMAs mm: fix vma->anon_name memory leak for anonymous shmem VMAs mm/shmem: restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection() ...
2023-01-11init/Kconfig: fix typo (usafe -> unsafe)Lizzy Fleckenstein
Fix the help text for the PRINTK_SAFE_LOG_BUF_SHIFT setting. Link: https://lkml.kernel.org/r/20230109201837.23873-1-eliasfleckenstein@web.de Signed-off-by: Lizzy Fleckenstein <eliasfleckenstein@web.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-11init/Kconfig: fix LOCALVERSION_AUTO help textRasmus Villemoes
It was never guaranteed to be exactly eight, but since commit 548b8b5168c9 ("scripts/setlocalversion: make git describe output more reliable"), it has been exactly 12. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-01-09gcc: disable -Warray-bounds for gcc-11 tooLinus Torvalds
We had already disabled this warning for gcc-12 due to bugs in the value range analysis, but it turns out we end up having some similar problems with gcc-11.3 too, so let's disable it there too. Older gcc versions end up being increasingly less relevant, and hopefully clang and newer version of gcc (ie gcc-13) end up working reliably enough that we still get the build coverage even when we disable this for some versions. Link: https://lore.kernel.org/all/20221227002941.GA2691687@roeck-us.net/ Link: https://lore.kernel.org/all/D8BDBF66-E44C-45D4-9758-BAAA4F0C1998@kernel.org/ Cc: Kees Cook <kees@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-27sched: Introduce per-memory-map concurrency IDMathieu Desnoyers
This feature allows the scheduler to expose a per-memory map concurrency ID to user-space. This concurrency ID is within the possible cpus range, and is temporarily (and uniquely) assigned while threads are actively running within a memory map. If a memory map has fewer threads than cores, or is limited to run on few cores concurrently through sched affinity or cgroup cpusets, the concurrency IDs will be values close to 0, thus allowing efficient use of user-space memory for per-cpu data structures. This feature is meant to be exposed by a new rseq thread area field. The primary purpose of this feature is to do the heavy-lifting needed by memory allocators to allow them to use per-cpu data structures efficiently in the following situations: - Single-threaded applications, - Multi-threaded applications on large systems (many cores) with limited cpu affinity mask, - Multi-threaded applications on large systems (many cores) with restricted cgroup cpuset per container. One of the key concern from scheduler maintainers is the overhead associated with additional spin locks or atomic operations in the scheduler fast-path. This is why the following optimization is implemented. On context switch between threads belonging to the same memory map, transfer the mm_cid from prev to next without any atomic ops. This takes care of use-cases involving frequent context switch between threads belonging to the same memory map. Additional optimizations can be done if the spin locks added when context switching between threads belonging to different memory maps end up being a performance bottleneck. Those are left out of this patch though. A performance impact would have to be clearly demonstrated to justify the added complexity. The credit goes to Paul Turner (Google) for the original virtual cpu id idea. This feature is implemented based on the discussions with Paul Turner and Peter Oskolkov (Google), but I took the liberty to implement scheduler fast-path optimizations and my own NUMA-awareness scheme. The rumor has it that Google have been running a rseq vcpu_id extension internally in production for a year. The tcmalloc source code indeed has comments hinting at a vcpu_id prototype extension to the rseq system call [1]. The following benchmarks do not show any significant overhead added to the scheduler context switch by this feature: * perf bench sched messaging (process) Baseline: 86.5±0.3 ms With mm_cid: 86.7±2.6 ms * perf bench sched messaging (threaded) Baseline: 84.3±3.0 ms With mm_cid: 84.7±2.6 ms * hackbench (process) Baseline: 82.9±2.7 ms With mm_cid: 82.9±2.9 ms * hackbench (threaded) Baseline: 85.2±2.6 ms With mm_cid: 84.4±2.9 ms [1] https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linux_syscall_support.h#L26 Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221122203932.231377-8-mathieu.desnoyers@efficios.com
2022-12-14Merge tag 'hardening-v6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening updates from Kees Cook: - Convert flexible array members, fix -Wstringop-overflow warnings, and fix KCFI function type mismatches that went ignored by maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook) - Remove the remaining side-effect users of ksize() by converting dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add more __alloc_size attributes, and introduce full testing of all allocator functions. Finally remove the ksize() side-effect so that each allocation-aware checker can finally behave without exceptions - Introduce oops_limit (default 10,000) and warn_limit (default off) to provide greater granularity of control for panic_on_oops and panic_on_warn (Jann Horn, Kees Cook) - Introduce overflows_type() and castable_to_type() helpers for cleaner overflow checking - Improve code generation for strscpy() and update str*() kern-doc - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests - Always use a non-NULL argument for prepare_kernel_cred() - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell) - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin Li) - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu) - Fix um vs FORTIFY warnings for always-NULL arguments * tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits) ksmbd: replace one-element arrays with flexible-array members hpet: Replace one-element array with flexible-array member um: virt-pci: Avoid GCC non-NULL warning signal: Initialize the info in ksignal lib: fortify_kunit: build without structleak plugin panic: Expose "warn_count" to sysfs panic: Introduce warn_limit panic: Consolidate open-coded panic_on_warn checks exit: Allow oops_limit to be disabled exit: Expose "oops_count" to sysfs exit: Put an upper limit on how often we can oops panic: Separate sysctl logic from CONFIG_SMP mm/pgtable: Fix multiple -Wstringop-overflow warnings mm: Make ksize() a reporting-only function kunit/fortify: Validate __alloc_size attribute results drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid() drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid() driver core: Add __alloc_size hint to devm allocators overflow: Introduce overflows_type() and castable_to_type() coredump: Proactively round up to kmalloc bucket size ...
2022-12-13Merge tag 'modules-6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull modules updates from Luis Chamberlain: "Tux gets for xmas an improvement to the average lookup performance of kallsyms_lookup_name() by 715x thanks to the work by Zhen Lei, which upgraded our old implementation from being O(n) to O(log(n)), while also retaining the old implementation support on /proc/kallsyms. The only penalty was increasing the memory footprint by 3 * kallsyms_num_syms. Folks who want to improve this further now also have a dedicated selftest facility through KALLSYMS_SELFTEST. Stephen Boyd added zstd in-kernel decompression support, but the only users of this would be folks using the load-pin LSM because otherwise we do module decompression in userspace. The only other thing with mentioning is a minor boot time optimization by Rasmus Villemoes which deferes param_sysfs_init() to late init. The rest is cleanups and minor fixes" * tag 'modules-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: livepatch: Call klp_match_callback() in klp_find_callback() to avoid code duplication module/decompress: Support zstd in-kernel decompression kallsyms: Remove unneeded semicolon kallsyms: Add self-test facility livepatch: Use kallsyms_on_each_match_symbol() to improve performance kallsyms: Add helper kallsyms_on_each_match_symbol() kallsyms: Reduce the memory occupied by kallsyms_seqs_of_names[] kallsyms: Correctly sequence symbols when CONFIG_LTO_CLANG=y kallsyms: Improve the performance of kallsyms_lookup_name() scripts/kallsyms: rename build_initial_tok_table() module: Fix NULL vs IS_ERR checking for module_get_next_page kernel/params.c: defer most of param_sysfs_init() to late_initcall time module: Remove unused macros module_addr_min/max module: remove redundant module_sysfs_initialized variable
2022-11-22init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dashAlexandre Belloni
When using dash as /bin/sh, the CC_HAS_ASM_GOTO_TIED_OUTPUT test fails with a syntax error which is not the one we are looking for: <stdin>: In function ‘foo’: <stdin>:1:29: warning: missing terminating " character <stdin>:1:29: error: missing terminating " character <stdin>:2:5: error: expected ‘:’ before ‘+’ token <stdin>:2:7: warning: missing terminating " character <stdin>:2:7: error: missing terminating " character <stdin>:2:5: error: expected declaration or statement at end of input Removing '\n' solves this. Fixes: 1aa0e8b144b6 ("Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug") Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-11-15kallsyms: Add self-test facilityZhen Lei
Added test cases for basic functions and performance of functions kallsyms_lookup_name(), kallsyms_on_each_symbol() and kallsyms_on_each_match_symbol(). It also calculates the compression rate of the kallsyms compression algorithm for the current symbol set. The basic functions test begins by testing a set of symbols whose address values are known. Then, traverse all symbol addresses and find the corresponding symbol name based on the address. It's impossible to determine whether these addresses are correct, but we can use the above three functions along with the addresses to test each other. Due to the traversal operation of kallsyms_on_each_symbol() is too slow, only 60 symbols can be tested in one second, so let it test on average once every 128 symbols. The other two functions validate all symbols. If the basic functions test is passed, print only performance test results. If the test fails, print error information, but do not perform subsequent performance tests. Start self-test automatically after system startup if CONFIG_KALLSYMS_SELFTEST=y. Example of output content: (prefix 'kallsyms_selftest:' is omitted start --------------------------------------------------------- | nr_symbols | compressed size | original size | ratio(%) | |---------------------------------------------------------| | 107543 | 1357912 | 2407433 | 56.40 | --------------------------------------------------------- kallsyms_lookup_name() looked up 107543 symbols The time spent on each symbol is (ns): min=630, max=35295, avg=7353 kallsyms_on_each_symbol() traverse all: 11782628 ns kallsyms_on_each_match_symbol() traverse all: 9261 ns finish Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-11-01kbuild: upgrade the orphan section warning to an error if CONFIG_WERROR is setXin Li
Andrew Cooper suggested upgrading the orphan section warning to a hard link error. However Nathan Chancellor said outright turning the warning into an error with no escape hatch might be too aggressive, as we have had these warnings triggered by new compiler generated sections, and suggested turning orphan sections into an error only if CONFIG_WERROR is set. Kees Cook echoed and emphasized that the mandate from Linus is that we should avoid breaking builds. It wrecks bisection, it causes problems across compiler versions, etc. Thus upgrade the orphan section warning to a hard link error only if CONFIG_WERROR is set. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Xin Li <xin3.li@intel.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221025073023.16137-2-xin3.li@intel.com
2022-10-20init: Kconfig: fix spelling mistake "satify" -> "satisfy"Colin Ian King
There is a spelling mistake in a Kconfig description. Fix it. Link: https://lkml.kernel.org/r/20221007204339.2757753-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12Merge tag 'mm-nonmm-stable-2022-10-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - hfs and hfsplus kmap API modernization (Fabio Francesco) - make crash-kexec work properly when invoked from an NMI-time panic (Valentin Schneider) - ntfs bugfixes (Hawkins Jiawei) - improve IPC msg scalability by replacing atomic_t's with percpu counters (Jiebin Sun) - nilfs2 cleanups (Minghao Chi) - lots of other single patches all over the tree! * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype proc: test how it holds up with mapping'less process mailmap: update Frank Rowand email address ia64: mca: use strscpy() is more robust and safer init/Kconfig: fix unmet direct dependencies ia64: update config files nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure fork: remove duplicate included header files init/main.c: remove unnecessary (void*) conversions proc: mark more files as permanent nilfs2: remove the unneeded result variable nilfs2: delete unnecessary checks before brelse() checkpatch: warn for non-standard fixes tag style usr/gen_init_cpio.c: remove unnecessary -1 values from int file ipc/msg: mitigate the lock contention with percpu counter percpu: add percpu_counter_add_local and percpu_counter_sub_local fs/ocfs2: fix repeated words in comments relay: use kvcalloc to alloc page array in relay_alloc_page_array proc: make config PROC_CHILDREN depend on PROC_FS fs: uninline inode_maybe_inc_iversion() ...
2022-10-11init/Kconfig: fix unmet direct dependenciesRen Zhijie
Commit 3c07bfce92a5 ("proc: make config PROC_CHILDREN depend on PROC_FS") make config PROC_CHILDREN depend on PROC_FS. When CONFIG_PROC_FS is not set and CONFIG_CHECKPOINT_RESTORE=y, make menuconfig screams like this: WARNING: unmet direct dependencies detected for PROC_CHILDREN Depends on [n]: PROC_FS [=n] Selected by [y]: - CHECKPOINT_RESTORE [=y] CHECKPOINT_RESTORE would select PROC_CHILDREN which depends on PROC_FS, so add depends on PROC_FS to CHECKPOINT_RESTORE to fix this. Link: https://lkml.kernel.org/r/20220929070057.59044-1-renzhijie2@huawei.com Fixes: 3c07bfce92a5 ("proc: make config PROC_CHILDREN depend on PROC_FS") Signed-off-by: Ren Zhijie <renzhijie2@huawei.com> Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-09Merge tag 'powerpc-6.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Remove our now never-true definitions for pgd_huge() and p4d_leaf(). - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit. - Add support for syscall wrappers. - Add support for KFENCE on 64-bit. - Update 64-bit HV KVM to use the new guest state entry/exit accounting API. - Support execute-only memory when using the Radix MMU (P9 or later). - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests. - Updates to our linker script to move more data into read-only sections. - Allow the VDSO to be randomised on 32-bit. - Many other small features and fixes. Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas, Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng Yongjun. * tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) KVM: PPC: Book3S HV: Fix stack frame regs marker powerpc: Don't add __powerpc_ prefix to syscall entry points powerpc/64s/interrupt: Fix stack frame regs marker powerpc/64: Fix msr_check_and_set/clear MSR[EE] race powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN powerpc/pseries: Add firmware details to the hardware description powerpc/powernv: Add opal details to the hardware description powerpc: Add device-tree model to the hardware description powerpc/64: Add logical PVR to the hardware description powerpc: Add PVR & CPU name to hardware description powerpc: Add hardware description string powerpc/configs: Enable PPC_UV in powernv_defconfig powerpc/configs: Update config files for removed/renamed symbols powerpc/mm: Fix UBSAN warning reported on hugetlb powerpc/mm: Always update max/min_low_pfn in mem_topology_setup() powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range powerpc: Drops STABS_DEBUG from linker scripts powerpc/64s: Remove lost/old comment powerpc/64s: Remove old STAB comment powerpc: remove orphan systbl_chk.sh ...
2022-10-03mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbolJohannes Weiner
Since 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP. Update the sites accordingly and drop the symbol. [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM, which hasn't been a user-visible symbol for over half a decade. ] Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-28Kbuild: add Rust supportMiguel Ojeda
Having most of the new files in place, we now enable Rust support in the build system, including `Kconfig` entries related to Rust, the Rust configuration printer and a few other bits. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com> Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com> Co-developed-by: Finn Behrens <me@kloenk.de> Signed-off-by: Finn Behrens <me@kloenk.de> Co-developed-by: Adam Bratschi-Kaye <ark.email@gmail.com> Signed-off-by: Adam Bratschi-Kaye <ark.email@gmail.com> Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com> Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com> Co-developed-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Co-developed-by: Boris-Chengbiao Zhou <bobo1239@web.de> Signed-off-by: Boris-Chengbiao Zhou <bobo1239@web.de> Co-developed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Co-developed-by: Douglas Su <d0u9.su@outlook.com> Signed-off-by: Douglas Su <d0u9.su@outlook.com> Co-developed-by: Dariusz Sosnowski <dsosnowski@dsosnowski.pl> Signed-off-by: Dariusz Sosnowski <dsosnowski@dsosnowski.pl> Co-developed-by: Antonio Terceiro <antonio.terceiro@linaro.org> Signed-off-by: Antonio Terceiro <antonio.terceiro@linaro.org> Co-developed-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Co-developed-by: Björn Roy Baron <bjorn3_gh@protonmail.com> Signed-off-by: Björn Roy Baron <bjorn3_gh@protonmail.com> Co-developed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-09-05powerpc/64: Remove PPC64 special case for cputime accounting defaultNicholas Piggin
Distro kernels tend to be moving to VIRT_CPU_ACCOUNTING_GEN, and there is not much reason why PPC64 should be special here. Remove the special case and make the ppc64 and pseries defconfigs use GEN accounting (others will use TICK, as-per Kconfig defaults). VIRT_CPU_ACCOUNTING_NATIVE does provide scaled vtime and stolen time apportioned between system and user time, and vtime accounting is not unconditionally enabled, and possibly other things. But it would be better at this point to extend GEN to cover important missing features rather than directing users back to a less used option. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220902085316.2071519-4-npiggin@gmail.com
2022-08-21asm goto: eradicate CC_HAS_ASM_GOTONick Desaulniers
GCC has supported asm goto since 4.5, and Clang has since version 9.0.0. The minimum supported versions of these tools for the build according to Documentation/process/changes.rst are 5.1 and 11.0.0 respectively. Remove the feature detection script, Kconfig option, and clean up some fallback code that is no longer supported. The removed script was also testing for a GCC specific bug that was fixed in the 4.7 release. Also remove workarounds for bpftrace using clang older than 9.0.0, since other BPF backend fixes are required at this point. Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/ Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637 Acked-by: Borislav Petkov <bp@suse.de> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-10Merge tag 'kbuild-v5.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the support for -O3 (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3) - Fix error of rpm-pkg cross-builds - Support riscv for checkstack tool - Re-enable -Wformwat warnings for Clang - Clean up modpost, Makefiles, and misc scripts * tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits) modpost: remove .symbol_white_list field entirely modpost: remove unneeded .symbol_white_list initializers modpost: add PATTERNS() helper macro modpost: shorten warning messages in report_sec_mismatch() Revert "Kbuild, lto, workaround: Don't warn for initcall_reference in modpost" modpost: use more reliable way to get fromsec in section_rel(a)() modpost: add array range check to sec_name() modpost: refactor get_secindex() kbuild: set EXIT trap before creating temporary directory modpost: remove unused Elf_Sword macro Makefile.extrawarn: re-enable -Wformat for clang kbuild: add dtbs_prepare target kconfig: Qt5: tell the user which packages are required modpost: use sym_get_data() to get module device_table data modpost: drop executable ELF support checkstack: add riscv support for scripts/checkstack.pl kconfig: shorten the temporary directory name for cc-option scripts: headers_install.sh: Update config leak ignore entries kbuild: error out if $(INSTALL_MOD_PATH) contains % or : kbuild: error out if $(KBUILD_EXTMOD) contains % or : ...
2022-08-08Merge tag 'modules-6.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull module updates from Luis Chamberlain: "For the 6.0 merge window the modules code shifts to cleanup and minor fixes effort. This becomes much easier to do and review now due to the code split to its own directory from effort on the last kernel release. I expect to see more of this with time and as we expand on test coverage in the future. The cleanups and fixes come from usual suspects such as Christophe Leroy and Aaron Tomlin but there are also some other contributors. One particular minor fix worth mentioning is from Helge Deller, where he spotted a *forever* incorrect natural alignment on both ELF section header tables: * .altinstructions * __bug_table sections A lot of back and forth went on in trying to determine the ill effects of this misalignment being present for years and it has been determined there should be no real ill effects unless you have a buggy exception handler. Helge actually hit one of these buggy exception handlers on parisc which is how he ended up spotting this issue. When implemented correctly these paths with incorrect misalignment would just mean a performance penalty, but given that we are dealing with alternatives on modules and with the __bug_table (where info regardign BUG()/WARN() file/line information associated with it is stored) this really shouldn't be a big deal. The only other change with mentioning is the kmap() with kmap_local_page() and my only concern with that was on what is done after preemption, but the virtual addresses are restored after preemption. This is only used on module decompression. This all has sit on linux-next for a while except the kmap stuff which has been there for 3 weeks" * tag 'modules-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: module: Replace kmap() with kmap_local_page() module: Show the last unloaded module's taint flag(s) module: Use strscpy() for last_unloaded_module module: Modify module_flags() to accept show_state argument module: Move module's Kconfig items in kernel/module/ MAINTAINERS: Update file list for module maintainers module: Use vzalloc() instead of vmalloc()/memset(0) modules: Ensure natural alignment for .altinstructions and __bug_table sections module: Increase readability of module_kallsyms_lookup_name() module: Fix ERRORs reported by checkpatch.pl module: Add support for default value for module async_probe
2022-08-03Merge tag 'cgroup-for-5.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Several core optimizations: - threadgroup_rwsem write locking is skipped when configuring controllers in empty subtrees. Combined with CLONE_INTO_CGROUP, this allows the common static usage pattern to not grab threadgroup_rwsem at all (glibc still doesn't seem ready for CLONE_INTO_CGROUP unfortunately). - threadgroup_rwsem used to be put into non-percpu mode by default due to latency concerns in specific use cases. There's no reason for everyone else to pay for it. Make the behavior optional. - psi no longer allocates memory when disabled. ... along with some code cleanups" * tag 'cgroup-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Skip subtree root in cgroup_update_dfl_csses() cgroup: remove "no" prefixed mount options cgroup: Make !percpu threadgroup_rwsem operations optional cgroup: Add "no" prefixed mount options cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree cgroup.c: remove redundant check for mixable cgroup in cgroup_migrate_vet_dst cgroup.c: add helper __cset_cgroup_from_root to cleanup duplicated codes psi: dont alloc memory for psi by default
2022-08-02Merge tag 'docs-6.0' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "This was a moderately busy cycle for documentation, but nothing all that earth-shaking: - More Chinese translations, and an update to the Italian translations. The Japanese, Korean, and traditional Chinese translations are more-or-less unmaintained at this point, instead. - Some build-system performance improvements. - The removal of the archaic submitting-drivers.rst document, with the movement of what useful material that remained into other docs. - Improvements to sphinx-pre-install to, hopefully, give more useful suggestions. - A number of build-warning fixes Plus the usual collection of typo fixes, updates, and more" * tag 'docs-6.0' of git://git.lwn.net/linux: (92 commits) docs: efi-stub: Fix paths for x86 / arm stubs Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8 Docs/zh_CN: Update the translation of pci to 5.19-rc8 Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8 Docs/zh_CN: Update the translation of usage to 5.19-rc8 Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8 Docs/zh_CN: Update the translation of sparse to 5.19-rc8 Docs/zh_CN: Update the translation of kasan to 5.19-rc8 Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8 doc:it_IT: align Italian documentation docs: Remove spurious tag from admin-guide/mm/overcommit-accounting.rst Documentation: process: Update email client instructions for Thunderbird docs: ABI: correct QEMU fw_cfg spec path doc/zh_CN: remove submitting-driver reference from docs docs: zh_TW: align to submitting-drivers removal docs: zh_CN: align to submitting-drivers removal docs: ko_KR: howto: remove reference to removed submitting-drivers docs: ja_JP: howto: remove reference to removed submitting-drivers docs: it_IT: align to submitting-drivers removal docs: process: remove outdated submitting-drivers.rst ...
2022-08-02Merge tag 'rcu.2022.07.26a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes - Callback-offload updates, perhaps most notably a new RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be offloaded at boot time, regardless of kernel boot parameters. This is useful to battery-powered systems such as ChromeOS and Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot parameter prevents offloaded callbacks from interfering with real-time workloads and with energy-efficiency mechanisms - Polled grace-period updates, perhaps most notably making these APIs account for both normal and expedited grace periods - Tasks RCU updates, perhaps most notably reducing the CPU overhead of RCU tasks trace grace periods by more than a factor of two on a system with 15,000 tasks. The reduction is expected to increase with the number of tasks, so it seems reasonable to hypothesize that a system with 150,000 tasks might see a 20-fold reduction in CPU overhead - Torture-test updates - Updates that merge RCU's dyntick-idle tracking into context tracking, thus reducing the overhead of transitioning to kernel mode from either idle or nohz_full userspace execution for kernels that track context independently of RCU. This is expected to be helpful primarily for kernels built with CONFIG_NO_HZ_FULL=y * tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits) rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings rcu: Diagnose extended sync_rcu_do_polled_gp() loops rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings rcutorture: Test polled expedited grace-period primitives rcu: Add polled expedited grace-period primitives rcutorture: Verify that polled GP API sees synchronous grace periods rcu: Make Tiny RCU grace periods visible to polled APIs rcu: Make polled grace-period API account for expedited grace periods rcu: Switch polled grace-period APIs to ->gp_seq_polled rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty rcu/nocb: Add option to opt rcuo kthreads out of RT priority rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread() rcu/nocb: Add an option to offload all CPUs on boot rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order rcu/nocb: Add/del rdp to iterate from rcuog itself rcu/tree: Add comment to describe GP-done condition in fqs loop rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs() rcu/kvfree: Remove useless monitor_todo flag rcu: Cleanup RCU urgency state for offline CPU ...
2022-08-02Merge tag 'v5.20-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Make proc files report fips module name and version Algorithms: - Move generic SHA1 code into lib/crypto - Implement Chinese Remainder Theorem for RSA - Remove blake2s - Add XCTR with x86/arm64 acceleration - Add POLYVAL with x86/arm64 acceleration - Add HCTR2 - Add ARIA Drivers: - Add support for new CCP/PSP device ID in ccp" * tag 'v5.20-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (89 commits) crypto: tcrypt - Remove the static variable initialisations to NULL crypto: arm64/poly1305 - fix a read out-of-bound crypto: hisilicon/zip - Use the bitmap API to allocate bitmaps crypto: hisilicon/sec - fix auth key size error crypto: ccree - Remove a useless dma_supported() call crypto: ccp - Add support for new CCP/PSP device ID crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq crypto: testmgr - some more fixes to RSA test vectors cyrpto: powerpc/aes - delete the rebundant word "block" in comments hwrng: via - Fix comment typo crypto: twofish - Fix comment typo crypto: rmd160 - fix Kconfig "its" grammar crypto: keembay-ocs-ecc - Drop if with an always false condition Documentation: qat: rewrite description Documentation: qat: Use code block for qat sysfs example crypto: lib - add module license to libsha1 crypto: lib - make the sha1 library optional crypto: lib - move lib/sha1.c into lib/crypto/ crypto: fips - make proc files report fips module name and version ...
2022-07-27init/Kconfig: update KALLSYMS_ALL help textBaruch Siach
CONFIG_KALLSYMS_ALL is required for kernel live patching which is a common use case that is enabled in some major distros. Update the Kconfig help text to reflect that. While at it, s/e.g./i.e./ to match the text intention. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-07-27kbuild: drop support for CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3Nick Desaulniers
The difference in most compilers between `-O3` and `-O2` is mostly down to whether loops with statically determinable trip counts are fully unrolled vs unrolled to a multiple of SIMD width. This patch is effectively a revert of commit 15f5db60a137 ("kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding ARCH_CFLAGS Ever since commit cfdbc2e16e65 ("ARC: Build system: Makefiles, Kconfig, Linker script") ARC has been built with -O3, though the reason for doing so was not specified in inline comments or the commit message. This commit does not re-add -O3 to arch/arc/Makefile. Folks looking to experiment with `-O3` (or any compiler flag for that matter) may pass them along to the command line invocation of make: $ make KCFLAGS=-O3 Code that looks to re-add an explicit Kconfig option for `-O3` should provide: 1. A rigorous and reproducible performance profile of a reasonable userspace workload that demonstrates a hot loop in the kernel that would benefit from `-O3` over `-O2`. 2. Disassembly of said loop body before and after. 3. Provides stats on terms of increase in file size. Link: https://lore.kernel.org/linux-kbuild/CA+55aFz2sNBbZyg-_i8_Ldr2e8o9dfvdSfHHuRzVtP2VMAUWPg@mail.gmail.com/ Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-07-23cgroup: Make !percpu threadgroup_rwsem operations optionalTejun Heo
3942a9bd7b58 ("locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write()") disabled percpu operations on threadgroup_rwsem because the impiled synchronize_rcu() on write locking was pushing up the latencies too much for android which constantly moves processes between cgroups. This makes the hotter paths - fork and exit - slower as they're always forced into the slow path. There is no reason to force this on everyone especially given that more common static usage pattern can now completely avoid write-locking the rwsem. Write-locking is elided when turning on and off controllers on empty sub-trees and CLONE_INTO_CGROUP enables seeding a cgroup without grabbing the rwsem. Restore the default percpu operations and introduce the mount option "favordynmods" and config option CGROUP_FAVOR_DYNMODS for users who need lower latencies for the dynamic operations. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Michal Koutn� <mkoutny@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Oleg Nesterov <oleg@redhat.com>
2022-07-15crypto: lib - make the sha1 library optionalEric Biggers
Since the Linux RNG no longer uses sha1_transform(), the SHA-1 library is no longer needed unconditionally. Make it possible to build the Linux kernel without the SHA-1 library by putting it behind a kconfig option, and selecting this new option from the kconfig options that gate the remaining users: CRYPTO_SHA1 for crypto/sha1_generic.c, BPF for kernel/bpf/core.c, and IPV6 for net/ipv6/addrconf.c. Unfortunately, since BPF is selected by NET, for now this can only make a difference for kernels built without networking support. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>