summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)Author
2024-01-09Merge tag 'pm-6.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add support for new processors (Sierra Forest, Grand Ridge and Meteor Lake) to the intel_idle driver, make intel_pstate run on Emerald Rapids without HWP support and adjust it to utilize EPP values supplied by the platform firmware, fix issues, clean up code and improve documentation. The most significant fix addresses deadlocks in the core system-wide resume code that occur if async_schedule_dev() attempts to run its argument function synchronously (for example, due to a memory allocation failure). It rearranges the code in question which may increase the system resume time in some cases, but this basically is a removal of a premature optimization. That optimization will be added back later, but properly this time. Specifics: - Add support for the Sierra Forest, Grand Ridge and Meteorlake SoCs to the intel_idle cpuidle driver (Artem Bityutskiy, Zhang Rui) - Do not enable interrupts when entering idle in the haltpoll cpuidle driver (Borislav Petkov) - Add Emerald Rapids support in no-HWP mode to the intel_pstate cpufreq driver (Zhenguo Yao) - Use EPP values programmed by the platform firmware as balanced performance ones by default in intel_pstate (Srinivas Pandruvada) - Add a missing function return value check to the SCMI cpufreq driver to avoid unexpected behavior (Alexandra Diupina) - Fix parameter type warning in the armada-8k cpufreq driver (Gregory CLEMENT) - Rework trans_stat_show() in the devfreq core code to avoid buffer overflows (Christian Marangi) - Synchronize devfreq_monitor_[start/stop] so as to prevent a timer list corruption from occurring when devfreq governors are switched frequently (Mukesh Ojha) - Fix possible deadlocks in the core system-wide PM code that occur if device-handling functions cannot be executed asynchronously during resume from system-wide suspend (Rafael J. Wysocki) - Clean up unnecessary local variable initializations in multiple places in the hibernation code (Wang chaodong, Li zeming) - Adjust core hibernation code to avoid missing wakeup events that occur after saving an image to persistent storage (Chris Feng) - Update hibernation code to enforce correct ordering during image compression and decompression (Hongchen Zhang) - Use kmap_local_page() instead of kmap_atomic() in copy_data_page() during hibernation and restore (Chen Haonan) - Adjust documentation and code comments to reflect recent tasks freezer changes (Kevin Hao) - Repair excess function parameter description warning in the hibernation image-saving code (Randy Dunlap) - Fix _set_required_opps when opp is NULL (Bryan O'Donoghue) - Use device_get_match_data() in the OPP code for TI (Rob Herring) - Clean up OPP level and other parts and call dev_pm_opp_set_opp() recursively for required OPPs (Viresh Kumar)" * tag 'pm-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (35 commits) OPP: Rename 'rate_clk_single' OPP: Pass rounded rate to _set_opp() OPP: Relocate dev_pm_opp_sync_regulators() PM: sleep: Fix possible deadlocks in core system-wide PM code OPP: Move dev_pm_opp_icc_bw to internal opp.h async: Introduce async_schedule_dev_nocall() async: Split async_schedule_node_domain() cpuidle: haltpoll: Do not enable interrupts when entering idle OPP: Fix _set_required_opps when opp is NULL OPP: The level field is always of unsigned int type PM: hibernate: Repair excess function parameter description warning PM: sleep: Remove obsolete comment from unlock_system_sleep() cpufreq: intel_pstate: Add Emerald Rapids support in no-HWP mode Documentation: PM: Adjust freezing-of-tasks.rst to the freezer changes PM: hibernate: Use kmap_local_page() in copy_data_page() intel_idle: add Sierra Forest SoC support intel_idle: add Grand Ridge SoC support PM / devfreq: Synchronize devfreq_monitor_[start/stop] cpufreq: armada-8k: Fix parameter type warning PM: hibernate: Enforce ordering during image compression/decompression ...
2024-01-09Merge tag 'thermal-6.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These add support for the D1/T113s THS controller to the sun8i driver and a DT-based mechanism for platforms to indicate a preference to reboot (instead of shutting down) on crossing a critical trip point, fix issues, make other improvements (in the IPA governor, the Intel HFI driver, the exynos driver and the thermal netlink interface among other places) and clean up code. One long-standing issue addressed here is that trip point crossing notifications sent to user space might be unreliable due to the incorrect handling of trip point hysteresis in the thermal core: multiple notifications might be sent for the same event or there might be events without any notification at all. Specifics: - Add dynamic thresholds for trip point crossing detection to prevent trip point crossing notifications from being sent at incorrect times or not at all in some cases (Rafael J. Wysocki) - Fix synchronization issues related to the resume of thermal zones during a system-wide resume and allow thermal zones to be resumed concurrently (Rafael J. Wysocki) - Modify the thermal zone unregistration to wait for the given zone to go away completely before returning to the caller and rework the sysfs interface for trip points on top of that (Rafael J. Wysocki) - Fix a possible NULL pointer dereference in thermal zone registration error path (Rafael J. Wysocki) - Clean up the IPA thermal governor and modify it (with the help of a new governor callback) to avoid allocating and freeing memory every time its throttling callback is invoked (Lukasz Luba) - Make the IPA thermal governor handle thermal instance weight changes via sysfs correctly (Lukasz Luba) - Update the thermal netlink code to avoid sending messages if there are no recipients (Stanislaw Gruszka) - Convert Mediatek Thermal to the json-schema (Rafał Miłecki) - Fix thermal DT bindings issue on Loongson (Binbin Zhou) - Fix returning NULL instead of -ENODEV during thermal probe on Loogsoon (Binbin Zhou) - Add thermal DT binding for tsens on the SM8650 platform (Neil Armstrong) - Add reboot on the critical trip point crossing option feature (Fabio Estevam) - Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal suspend/resume on AmLogic (Uwe Kleine-König) - Add D1/T113s THS controller support to the Sun8i thermal control driver (Maxim Kiselev) - Fix example in the thermal DT binding for QCom SPMI (Johan Hovold) - Fix compilation warning in the tmon utility (Florian Eckert) - Add support for interrupt-based thermal configuration on Exynos along with a set of related cleanups (Mateusz Majewski) - Make the Intel HFI thermal driver enable an HFI instance (eg. processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri) - Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal driver (Randy Dunlap) - Move the .get_temp() thermal zone callback presence check to the thermal zone registration code (Daniel Lezcano) - Use the for_each_trip() macro for trip points table walks in a few places in the thermal core (Rafael J. Wysocki) - Make all trip point updates (via sysfs as well as from the platform firmware) trigger trip change notifications (Rafael J. Wysocki) - Drop redundant code from the thermal core and make one function in it take a const pointer argument (Rafael J. Wysocki)" * tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) thermal: trip: Constify thermal zone argument of thermal_zone_trip_id() thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline thermal: intel: hfi: Enable an HFI instance from its first online CPU thermal: intel: hfi: Refactor enabling code into helper functions thermal/drivers/exynos: Use set_trips ops thermal/drivers/exynos: Use BIT wherever possible thermal/drivers/exynos: Split initialization of TMU and the thermal zone thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210 thermal/drivers/exynos: Simplify regulator (de)initialization thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts thermal/drivers/exynos: Drop id field thermal/drivers/exynos: Remove an unnecessary field description tools/thermal/tmon: Fix compilation warning for wrong format dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names thermal/drivers/sun8i: Add D1/T113s THS controller support dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions thermal: amlogic: Make amlogic_thermal_disable() return void ...
2024-01-09Merge tag 'mtd/for-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd updates from Miquel Raynal: "MTD: - Apart from preventing the mtdblk to run on top of ftl or ubiblk (which may cause security issues and has no meaning anyway), there are a few misc fixes. Raw NAND: - Two meaningful changes this time. The conversion of the brcmnand driver to the ->exec_op() API, this series brought additional changes to the core in order to help controller drivers to handle themselves the WP pin during destructive operations when relevant. - There is also a series bringing important fixes to the sequential read feature. - As always, there is as well a whole bunch of miscellaneous W=1 fixes, together with a few runtime fixes (double free, timeout value, OOB layout, missing register initialization) and the usual load of remove callbacks turned into void (which led to switch the txx9ndfmc driver to use module_platform_driver()). SPI NOR: - SPI NOR comes with die erase support for multi die flashes, with new octal protocols (1-1-8 and 1-8-8) parsed from SFDP and with an updated documentation about what the contributors shall consider when proposing flash additions or updates. - Michael Walle stepped out from the reviewer role to maintainer" * tag 'mtd/for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (39 commits) mtd: rawnand: Clarify conditions to enable continuous reads mtd: rawnand: Prevent sequential reads with on-die ECC engines mtd: rawnand: Fix core interference with sequential reads mtd: rawnand: Prevent crossing LUN boundaries during sequential reads mtd: Fix gluebi NULL pointer dereference caused by ftl notifier dt-bindings: mtd: partitions: u-boot: Fix typo mtd: rawnand: s3c2410: fix Excess struct member description kernel-doc warnings MAINTAINERS: change my mail to the kernel.org one mtd: spi-nor: sfdp: get the 1-1-8 and 1-8-8 protocol from SFDP mtd: spi-nor: drop superfluous debug prints mtd: spi-nor: sysfs: hide the flash name if not set mtd: spi-nor: mark the flash name as obsolete mtd: spi-nor: print flash ID instead of name mtd: maps: vmu-flash: Fix the (mtd core) switch to ref counters mtd: ssfdc: Remove an unused variable mtd: rawnand: diskonchip: fix a potential double free in doc_probe mtd: rawnand: rockchip: Add missing title to a kernel doc comment mtd: rawnand: rockchip: Rename a structure mtd: rawnand: pl353: Fix kernel doc mtd: spi-nor: micron-st: Add support for mt25qu01g ...
2024-01-09Merge tag 'spi-v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "A moderately busy release for SPI, the main core update was the merging of support for multiple chip selects, used in some flash configurations. There were also big overhauls for the AXI SPI Engine and PL022 drivers, plus some new device support for ST. There's a few patches for other trees, API updates to allow the multiple chip select support and one of the naming modernisations touched a controller embedded in the USB code. - Support for multiple chip selects. - A big overhaul for the AXI SPI engine driver, modernising it and adding a bunch of new features. - Modernisation of the PL022 driver, fixing some issues with submitting messages while in atomic context in the process. - Many drivers were converted to use new APIs which avoid outdated terminology for devices and controllers. - Support for ST Microelectronics STM32F7 and STM32MP25, and Renesas RZ/Five" * tag 'spi-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (83 commits) spi: stm32: add st,stm32mp25-spi compatible supporting STM32MP25 soc dt-bindings: spi: stm32: add st,stm32mp25-spi compatible spi: stm32: use dma_get_slave_caps prior to configuring dma channel spi: axi-spi-engine: fix struct member doc warnings spi: pl022: update description of internal_cs_control() spi: pl022: delete description of cur_msg spi: dw: Remove Intel Thunder Bay SOC support spi: dw: Remove Intel Thunder Bay SOC support spi: sh-msiof: Enforce fixed DTDL for R-Car H3 spi: ljca: switch to use devm_spi_alloc_host() spi: cs42l43: switch to use devm_spi_alloc_host() spi: zynqmp-gqspi: switch to use modern name spi: zynq-qspi: switch to use modern name spi: xtensa-xtfpga: switch to use modern name spi: xlp: switch to use modern name spi: xilinx: switch to use modern name spi: xcomm: switch to use modern name spi: uniphier: switch to use modern name spi: topcliff-pch: switch to use modern name spi: wpcm-fiu: switch to use devm_spi_alloc_host() ...
2024-01-09Merge tag 'regulator-v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "The main updates for this release are around monitoring of regulators, largely for error handling purposes. We allow the stream of regulator events to be seen by userspace as netlink events and allow system integrators to describe individual regulators as system critical with information on how long the system is expected to last on error. The system level error handling is very much about best effort problem mitigation rather than providing something fully robust, the initial drive was to provide a mechanism for trying to avoid initiating any new writes to flash once we notice the power going out. Otherwise it's very quiet, mainly several new Qualcomm devices. - Support for marking regulators as system critical and providing information on how long the system might last with those regulators in a failure state, hooked into the existing critical shutdown error handling. - Optional support for generating netlink events for events, there are use cases for system monitoring UIs and error handling. - A command line option to leave unused controllable regulators enabled, useful for debugging. We already only disable regulators we were explicitly given permission to control. - Support for Quacomm MP5496, PM8010 and PM8937" * tag 'regulator-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (31 commits) regulator: event: Ensure atomicity for sequence number uapi: regulator: Fix typo regulator: Reuse LINEAR_RANGE() in REGULATOR_LINEAR_RANGE() dt-bindings: regulator: qcom,usb-vbus-regulator: clean up example regulator: qcom_smd: Add LDO5 MP5496 regulator regulator: qcom-rpmh: add support for pm8010 regulators regulator: dt-bindings: qcom,rpmh: add compatible for pm8010 regulator: qcom-rpmh: extend to support multiple linear voltage ranges regulator: wm8350: Convert to platform remove callback returning void regulator: virtual: Convert to platform remove callback returning void regulator: userspace-consumer: Convert to platform remove callback returning void regulator: uniphier: Convert to platform remove callback returning void regulator: stm32-vrefbuf: Convert to platform remove callback returning void regulator: db8500-prcmu: Convert to platform remove callback returning void regulator: bd9571mwv: Convert to platform remove callback returning void regulator: arizona-ldo1: Convert to platform remove callback returning void regulator: event: Add regulator netlink event support regulator: event: Add regulator netlink event support regulator: stpmic1: Fix kernel-doc notation warnings regulator: palmas: remove redundant initialization of pointer pdata ...
2024-01-09Merge tag 'lsm-pr-20240105' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull security module updates from Paul Moore: - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). The first syscall simply lists the LSMs enabled, while the second and third get and set the current process' LSM attributes. Yes, these syscalls may provide similar functionality to what can be found under /proc or /sys, but they were designed to support multiple, simultaneaous (stacked) LSMs from the start as opposed to the current /proc based solutions which were created at a time when only one LSM was allowed to be active at a given time. We have spent considerable time discussing ways to extend the existing /proc interfaces to support multiple, simultaneaous LSMs and even our best ideas have been far too ugly to support as a kernel API; after +20 years in the kernel, I felt the LSM layer had established itself enough to justify a handful of syscalls. Support amongst the individual LSM developers has been nearly unanimous, with a single objection coming from Tetsuo (TOMOYO) as he is worried that the LSM_ID_XXX token concept will make it more difficult for out-of-tree LSMs to survive. Several members of the LSM community have demonstrated the ability for out-of-tree LSMs to continue to exist by picking high/unused LSM_ID values as well as pointing out that many kernel APIs rely on integer identifiers, e.g. syscalls (!), but unfortunately Tetsuo's objections remain. My personal opinion is that while I have no interest in penalizing out-of-tree LSMs, I'm not going to penalize in-tree development to support out-of-tree development, and I view this as a necessary step forward to support the push for expanded LSM stacking and reduce our reliance on /proc and /sys which has occassionally been problematic for some container users. Finally, we have included the linux-api folks on (all?) recent revisions of the patchset and addressed all of their concerns. - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit ioctls on 64-bit systems problem. This patch includes support for all of the existing LSMs which provide ioctl hooks, although it turns out only SELinux actually cares about the individual ioctls. It is worth noting that while Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this patch, they did both indicate they are okay with the changes. - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled at boot. While it's good that we are fixing this, I doubt this is something users are seeing in the wild as you need to both disable IPv6 and then attempt to configure IPv6 labeled networking via NetLabel/CALIPSO; that just doesn't make much sense. Normally this would go through netdev, but Jakub asked me to take this patch and of all the trees I maintain, the LSM tree seemed like the best fit. - Update the LSM MAINTAINERS entry with additional information about our process docs, patchwork, bug reporting, etc. I also noticed that the Lockdown LSM is missing a dedicated MAINTAINERS entry so I've added that to the pull request. I've been working with one of the major Lockdown authors/contributors to see if they are willing to step up and assume a Lockdown maintainer role; hopefully that will happen soon, but in the meantime I'll continue to look after it. - Add a handful of mailmap entries for Serge Hallyn and myself. * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits) lsm: new security_file_ioctl_compat() hook lsm: Add a __counted_by() annotation to lsm_ctx.ctx calipso: fix memory leak in netlbl_calipso_add_pass() selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test MAINTAINERS: add an entry for the lockdown LSM MAINTAINERS: update the LSM entry mailmap: add entries for Serge Hallyn's dead accounts mailmap: update/replace my old email addresses lsm: mark the lsm_id variables are marked as static lsm: convert security_setselfattr() to use memdup_user() lsm: align based on pointer length in lsm_fill_user_ctx() lsm: consolidate buffer size handling into lsm_fill_user_ctx() lsm: correct error codes in security_getselfattr() lsm: cleanup the size counters in security_getselfattr() lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation lsm: drop LSM_ID_IMA LSM: selftests for Linux Security Module syscalls SELinux: Add selfattr hooks AppArmor: Add selfattr hooks Smack: implement setselfattr and getselfattr hooks ...
2024-01-09Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio conversions for file paths'. - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2: Folio conversions for directory paths'. - IA64 remnant removal in Heiko Carstens's 'Remove unused code after IA-64 removal'. - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in 'Treewide: enable -Wmissing-prototypes'. This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series 'hexagon: Fix up instances of -Wmissing-prototypes'. - Nathan also addressed some s390 warnings in 's390: A couple of fixes for -Wmissing-prototypes'. - Arnd Bergmann addresses the same warnings for MIPS in his series 'mips: address -Wmissing-prototypes warnings'. - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series 'kexec_file: Load kernel at top of system RAM if required' - Baoquan He has also added the self-explanatory 'kexec_file: print out debugging message if required'. - Some checkstack maintenance work from Tiezhu Yang in the series 'Modify some code about checkstack'. - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is 'watchdog: Better handling of concurrent lockups'. - Yuntao Wang has contributed some maintenance work on the crash code in 'crash: Some cleanups and fixes'" * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits) crash_core: fix and simplify the logic of crash_exclude_mem_range() x86/crash: use SZ_1M macro instead of hardcoded value x86/crash: remove the unused image parameter from prepare_elf_headers() kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE scripts/decode_stacktrace.sh: strip unexpected CR from lines watchdog: if panicking and we dumped everything, don't re-enable dumping watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/hardlockup: adopt softlockup logic avoiding double-dumps kexec_core: fix the assignment to kimage->control_page x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init() lib/trace_readwrite.c:: replace asm-generic/io with linux/io nilfs2: cpfile: fix some kernel-doc warnings stacktrace: fix kernel-doc typo scripts/checkstack.pl: fix no space expression between sp and offset x86/kexec: fix incorrect argument passed to kexec_dprintk() x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs nilfs2: add missing set_freezable() for freezable kthread kernel: relay: remove relay_file_splice_read dead code, doesn't work docs: submit-checklist: remove all of "make namespacecheck" ...
2024-01-09Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
2024-01-09Merge tag 'slab-for-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - SLUB: delayed freezing of CPU partial slabs (Chengming Zhou) Freezing is an operation involving double_cmpxchg() that makes a slab exclusive for a particular CPU. Chengming noticed that we use it also in situations where we are not yet installing the slab as the CPU slab, because freezing also indicates that the slab is not on the shared list. This results in redundant freeze/unfreeze operation and can be avoided by marking separately the shared list presence by reusing the PG_workingset flag. This approach neatly avoids the issues described in 9b1ea29bc0d7 ("Revert "mm, slub: consider rest of partial list if acquire_slab() fails"") as we can now grab a slab from the shared list in a quick and guaranteed way without the cmpxchg_double() operation that amplifies the lock contention and can fail. As a result, lkp has reported 34.2% improvement of stress-ng.rawudp.ops_per_sec - SLAB removal and SLUB cleanups (Vlastimil Babka) The SLAB allocator has been deprecated since 6.5 and nobody has objected so far. We agreed at LSF/MM to wait until the next LTS, which is 6.6, so we should be good to go now. This doesn't yet erase all traces of SLAB outside of mm/ so some dead code, comments or documentation remain, and will be cleaned up gradually (some series are already in the works). Removing the choice of allocators has already allowed to simplify and optimize the code wiring up the kmalloc APIs to the SLUB implementation. * tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits) mm/slub: free KFENCE objects in slab_free_hook() mm/slub: handle bulk and single object freeing separately mm/slub: introduce __kmem_cache_free_bulk() without free hooks mm/slub: fix bulk alloc and free stats mm/slub: optimize free fast path code layout mm/slub: optimize alloc fastpath code layout mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers mm/slab: move kmalloc() functions from slab_common.c to slub.c mm/slab: move kmalloc_slab() to mm/slab.h mm/slab: move kfree() from slab_common.c to slub.c mm/slab: move struct kmem_cache_node from slab.h to slub.c mm/slab: move memcg related functions from slab.h to slub.c mm/slab: move pre/post-alloc hooks from slab.h to slub.c mm/slab: consolidate includes in the internal mm/slab.h mm/slab: move the rest of slub_def.h to mm/slab.h mm/slab: move struct kmem_cache_cpu declaration to slub.c mm/slab: remove mm/slab.c and slab_def.h mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs mm/slab: remove CONFIG_SLAB code from slab common code cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks ...
2024-01-08Merge tag 'cgroup-for-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Yafang Shao added task_get_cgroup1() helper to enable a similar BPF helper so that BPF progs can be more useful on cgroup1 hierarchies. While cgroup1 is mostly in maintenance mode, this addition is very small while having an outsized usefulness for users who are still on cgroup1. Yafang also optimized root cgroup list access by making it RCU protected in the process. - Waiman Long optimized rstat operation leading to substantially lower and more consistent lock hold time while flushing the hierarchical statistics. As the lock can be acquired briefly in various hot paths, this reduction has cascading benefits. - Waiman also improved the quality of isolation for cpuset's isolated partitions. CPUs which are allocated to isolated partitions are now excluded from running unbound work items and cpu_is_isolated() test which is used by vmstat and memcg to reduce interference now includes cpuset isolated CPUs. While it isn't there yet, the hope is eventually reaching parity with the isolation level provided by the `isolcpus` boot param but in a dynamic manner. * tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Move rcu_head up near the top of cgroup_root cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check cgroup: Avoid false cacheline sharing of read mostly rstat_cpu cgroup/rstat: Optimize cgroup_rstat_updated_list() cgroup: Fix documentation for cpu.idle cgroup/cpuset: Expose cpuset.cpus.isolated workqueue: Move workqueue_set_unbound_cpumask() and its helpers inside CONFIG_SYSFS cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked() cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask cgroup/cpuset: Keep track of CPUs in isolated partitions selftests/cgroup: Minor code cleanup and reorganization of test_cpuset_prs.sh workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask selftests: cgroup: Fixes a typo in a comment cgroup: Add a new helper for cgroup1 hierarchy cgroup: Add annotation for holding namespace_sem in current_cgns_cgroup_from_root() cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show() cgroup: Make operations on the cgroup root_list RCU safe cgroup: Remove unnecessary list_empty()
2024-01-08Merge tag 'sched-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Energy scheduling: - Consolidate how the max compute capacity is used in the scheduler and how we calculate the frequency for a level of utilization. - Rework interface between the scheduler and the schedutil governor - Simplify the util_est logic Deadline scheduler: - Work more towards reducing SCHED_DEADLINE starvation of low priority tasks (e.g., SCHED_OTHER) tasks when higher priority tasks monopolize CPU cycles, via the introduction of 'deadline servers' (nested/2-level scheduling). "Fair servers" to make use of this facility are not introduced yet. EEVDF: - Introduce O(1) fastpath for EEVDF task selection NUMA balancing: - Tune the NUMA-balancing vma scanning logic some more, to better distribute the probability of a particular vma getting scanned. Plus misc fixes, cleanups and updates" * tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) sched/fair: Fix tg->load when offlining a CPU sched/fair: Remove unused 'next_buddy_marked' local variable in check_preempt_wakeup_fair() sched/fair: Use all little CPUs for CPU-bound workloads sched/fair: Simplify util_est sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) arm64/amu: Use capacity_ref_freq() to set AMU ratio cpufreq/cppc: Set the frequency used for computing the capacity cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}() energy_model: Use a fixed reference frequency cpufreq/schedutil: Use a fixed reference frequency cpufreq: Use the fixed and coherent frequency for scaling capacity sched/topology: Add a new arch_scale_freq_ref() method freezer,sched: Clean saved_state when restoring it during thaw sched/fair: Update min_vruntime for reweight_entity() correctly sched/doc: Update documentation after renames and synchronize Chinese version sched/cpufreq: Rework iowait boost sched/cpufreq: Rework schedutil governor performance estimation sched/pelt: Avoid underestimation of task utilization sched/timers: Explain why idle task schedules out on remote timer enqueue sched/cpuidle: Comment about timers requirements VS idle handler ...
2024-01-08Merge tag 'perf-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: - Add branch stack counters ABI extension to better capture the growing amount of information the PMU exposes via branch stack sampling. There's matching tooling support. - Fix race when creating the nr_addr_filters sysfs file - Add Intel Sierra Forest and Grand Ridge intel/cstate PMU support - Add Intel Granite Rapids, Sierra Forest and Grand Ridge uncore PMU support - Misc cleanups & fixes * tag 'perf-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Factor out topology_gidnid_map() perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology() perf/x86/amd: Reject branch stack for IBS events perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge perf/x86/intel/uncore: Support IIO free-running counters on GNR perf/x86/intel/uncore: Support Granite Rapids perf/x86/uncore: Use u64 to replace unsigned for the uncore offsets array perf/x86/intel/uncore: Generic uncore_get_uncores and MMIO format of SPR perf: Fix the nr_addr_filters fix perf/x86/intel/cstate: Add Grand Ridge support perf/x86/intel/cstate: Add Sierra Forest support x86/smp: Export symbol cpu_clustergroup_mask() perf/x86/intel/cstate: Cleanup duplicate attr_groups perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file perf/x86/intel: Support branch counters logging perf/x86/intel: Reorganize attrs and is_visible perf: Add branch_sample_call_stack perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag perf: Add branch stack counters
2024-01-08Merge tag 'irq-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq subsystem updates from Ingo Molnar: - Add support for the IA55 interrupt controller on RZ/G3S SoC's - Update/fix the Qualcom MPM Interrupt Controller driver's register enumeration within the somewhat exotic "RPM Message RAM" MMIO-mapped shared memory region that is used for other purposes as well - Clean up the Xtensa built-in Programmable Interrupt Controller driver (xtensa-pic) a bit * tag 'irq-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-xtensa-pic: Clean up irqchip/qcom-mpm: Support passing a slice of SRAM as reg space dt-bindings: interrupt-controller: mpm: Pass MSG RAM slice through phandle dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/G3S irqchip/renesas-rzg2l: Add support for suspend to RAM irqchip/renesas-rzg2l: Add macro to retrieve TITSR register offset based on register's index irqchip/renesas-rzg2l: Implement restriction when writing ISCR register irqchip/renesas-rzg2l: Document structure members irqchip/renesas-rzg2l: Align struct member names to tabs irqchip/renesas-rzg2l: Use tabs instead of spaces
2024-01-08Merge tag 'locking-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molar: "Lock guards: - Use lock guards in the ptrace code - Introduce conditional guards to extend to conditional lock primitives like mutex_trylock()/mutex_lock_interruptible()/etc. lockdep: - Optimize 'struct lock_class' to be smaller - Update file patterns in MAINTAINERS mutexes: - Document mutex lifetime rules a bit more" * tag 'locking-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked locking/mutex: Document that mutex_unlock() is non-atomic ptrace: Convert ptrace_attach() to use lock guards locking/lockdep: Slightly reorder 'struct lock_class' to save some memory MAINTAINERS: Add include/linux/lockdep*.h cleanup: Add conditional guard support
2024-01-08Merge tag 'x86-cleanups-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - Change global variables to local - Add missing kernel-doc function parameter descriptions - Remove unused parameter from a macro - Remove obsolete Kconfig entry - Fix comments - Fix typos, mostly scripted, manually reviewed and a micro-optimization got misplaced as a cleanup: - Micro-optimize the asm code in secondary_startup_64_no_verify() * tag 'x86-cleanups-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arch/x86: Fix typos x86/head_64: Use TESTB instead of TESTL in secondary_startup_64_no_verify() x86/docs: Remove reference to syscall trampoline in PTI x86/Kconfig: Remove obsolete config X86_32_SMP x86/io: Remove the unused 'bw' parameter from the BUILDIO() macro x86/mtrr: Document missing function parameters in kernel-doc x86/setup: Make relocated_ramdisk a local variable of relocate_initrd()
2024-01-08Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "CPU features: - Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye olde Thunder-X machines - Avoid mapping KPTI trampoline when it is not required - Make CPU capability API more robust during early initialisation Early idreg overrides: - Remove dependencies on core kernel helpers from the early command-line parsing logic in preparation for moving this code before the kernel is mapped FPsimd: - Restore kernel-mode fpsimd context lazily, allowing us to run fpsimd code sequences in the kernel with pre-emption enabled KBuild: - Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y - Makefile cleanups LPA2 prep: - Preparatory work for enabling the 'LPA2' extension, which will introduce 52-bit virtual and physical addressing even with 4KiB pages (including for KVM guests). Misc: - Remove dead code and fix a typo MM: - Pass NUMA node information for IRQ stack allocations Perf: - Add perf support for the Synopsys DesignWare PCIe PMU - Add support for event counting thresholds (FEAT_PMUv3_TH) introduced in Armv8.8 - Add support for i.MX8DXL SoCs to the IMX DDR PMU driver. - Minor PMU driver fixes and optimisations RIP VPIPT: - Remove what support we had for the obsolete VPIPT I-cache policy Selftests: - Improvements to the SVE and SME selftests Stacktrace: - Refactor kernel unwind logic so that it can used by BPF unwinding and, eventually, reliable backtracing Sysregs: - Update a bunch of register definitions based on the latest XML drop from Arm" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits) kselftest/arm64: Don't probe the current VL for unsupported vector types efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad arm64: properly install vmlinuz.efi arm64/sysreg: Add missing system instruction definitions for FGT arm64/sysreg: Add missing system register definitions for FGT arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1 arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1 arm64: memory: remove duplicated include arm: perf: Fix ARCH=arm build with GCC arm64: Align boot cpucap handling with system cpucap handling arm64: Cleanup system cpucap handling MAINTAINERS: add maintainers for DesignWare PCIe PMU driver drivers/perf: add DesignWare PCIe PMU driver PCI: Move pci_clear_and_set_dword() helper to PCI header PCI: Add Alibaba Vendor ID to linux/pci_ids.h docs: perf: Add description for Synopsys DesignWare PCIe PMU driver arm64: irq: set the correct node for shadow call stack Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch ...
2024-01-08Merge tag 'powerpc-6.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add initial support to recognise the HeXin C2000 processor. - Add papr-vpd and papr-sysparm character device drivers for VPD & sysparm retrieval, so userspace tools can be adapted to avoid doing raw firmware calls from userspace. - Sched domains optimisations for shared processor partitions on P9/P10. - A series of optimisations for KVM running as a nested HV under PowerVM. - Other small features and fixes. Thanks to Aditya Gupta, Aneesh Kumar K.V, Arnd Bergmann, Christophe Leroy, Colin Ian King, Dario Binacchi, David Heidelberg, Geoff Levand, Gustavo A. R. Silva, Haoran Liu, Jordan Niethe, Kajol Jain, Kevin Hao, Kunwu Chan, Li kunyu, Li zeming, Masahiro Yamada, Michal Suchánek, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Randy Dunlap, Sathvika Vasireddy, Srikar Dronamraju, Stephen Rothwell, Vaibhav Jain, and Zhao Ke. * tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (96 commits) powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2 powerpc/86xx: Drop unused CONFIG_MPC8610 powerpc/powernv: Add error handling to opal_prd_range_is_valid selftests/powerpc: Fix spelling mistake "EACCESS" -> "EACCES" powerpc/hvcall: Reorder Nestedv2 hcall opcodes powerpc/ps3: Add missing set_freezable() for ps3_probe_thread() powerpc/mpc83xx: Use wait_event_freezable() for freezable kthread powerpc/mpc83xx: Add the missing set_freezable() for agent_thread_fn() powerpc/fsl: Fix fsl,tmu-calibration to match the schema powerpc/smp: Dynamically build Powerpc topology powerpc/smp: Avoid asym packing within thread_group of a core powerpc/smp: Add __ro_after_init attribute powerpc/smp: Disable MC domain for shared processor powerpc/smp: Enable Asym packing for cores on shared processor powerpc/sched: Cleanup vcpu_is_preempted() powerpc: add cpu_spec.cpu_features to vmcoreinfo powerpc/imc-pmu: Add a null pointer check in update_events_in_group() powerpc/powernv: Add a null pointer check in opal_powercap_init() powerpc/powernv: Add a null pointer check in opal_event_init() powerpc/powernv: Add a null pointer check to scom_debug_init_one() ...
2024-01-08Merge tag 'ras_core_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Borislav Petkov: - Convert the hw error storm handling into a finer-grained, per-bank solution which allows for more timely detection and reporting of errors - Start a documentation section which will hold down relevant RAS features description and how they should be used - Add new AMD error bank types - Slim down and remove error type descriptions from the kernel side of error decoding to rasdaemon which can be used from now on to decode hw errors on AMD - Mark pages containing uncorrectable errors as poison so that kdump can avoid them and thus not cause another panic - The usual cleanups and fixlets * tag 'ras_core_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Handle Intel threshold interrupt storms x86/mce: Add per-bank CMCI storm mitigation x86/mce: Remove old CMCI storm mitigation code Documentation: Begin a RAS section x86/MCE/AMD: Add new MA_LLC, USR_DP, and USR_CP bank types EDAC/mce_amd: Remove SMCA Extended Error code descriptions x86/mce/amd, EDAC/mce_amd: Move long names to decoder module x86/mce/inject: Clear test status value x86/mce: Remove redundant check from mce_device_create() x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08mm, treewide: introduce NR_PAGE_ORDERSKirill A. Shutemov
NR_PAGE_ORDERS defines the number of page orders supported by the page allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total. NR_PAGE_ORDERS assists in defining arrays of page orders and allows for more natural iteration over them. [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning] Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08Merge tag 'x86_misc_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - Add an informational message which gets issued when IA32 emulation has been disabled on the cmdline - Clarify in detail how /proc/cpuinfo is used on x86 - Fix a theoretical overflow in num_digits() * tag 'x86_misc_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ia32: State that IA32 emulation is disabled Documentation/x86: Document what /proc/cpuinfo is for x86/lib: Fix overflow when counting digits
2024-01-08Merge tag 'vfs-6.8.super' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs super updates from Christian Brauner: "This contains the super work for this cycle including the long-awaited series by Jan to make it possible to prevent writing to mounted block devices: - Writing to mounted devices is dangerous and can lead to filesystem corruption as well as crashes. Furthermore syzbot comes with more and more involved examples how to corrupt block device under a mounted filesystem leading to kernel crashes and reports we can do nothing about. Add tracking of writers to each block device and a kernel cmdline argument which controls whether other writeable opens to block devices open with BLK_OPEN_RESTRICT_WRITES flag are allowed. Note that this effectively only prevents modification of the particular block device's page cache by other writers. The actual device content can still be modified by other means - e.g. by issuing direct scsi commands, by doing writes through devices lower in the storage stack (e.g. in case loop devices, DM, or MD are involved) etc. But blocking direct modifications of the block device page cache is enough to give filesystems a chance to perform data validation when loading data from the underlying storage and thus prevent kernel crashes. Syzbot can use this cmdline argument option to avoid uninteresting crashes. Also users whose userspace setup does not need writing to mounted block devices can set this option for hardening. We expect that this will be interesting to quite a few workloads. Btrfs is currently opted out of this because they still haven't merged patches we require for this to work from three kernel releases ago. - Reimplement block device freezing and thawing as holder operations on the block device. This allows us to extend block device freezing to all devices associated with a superblock and not just the main device. It also allows us to remove get_active_super() and thus another function that scans the global list of superblocks. Freezing via additional block devices only works if the filesystem chooses to use @fs_holder_ops for these additional devices as well. That currently only includes ext4 and xfs. Earlier releases switched get_tree_bdev() and mount_bdev() to use @fs_holder_ops. The remaining nilfs2 open-coded version of mount_bdev() has been converted to rely on @fs_holder_ops as well. So block device freezing for the main block device will continue to work as before. There should be no regressions in functionality. The only special case is btrfs where block device freezing for the main block device never worked because sb->s_bdev isn't set. Block device freezing for btrfs can be fixed once they can switch to @fs_holder_ops but that can happen whenever they're ready" * tag 'vfs-6.8.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (27 commits) block: Fix a memory leak in bdev_open_by_dev() super: don't bother with WARN_ON_ONCE() super: massage wait event mechanism ext4: Block writes to journal device xfs: Block writes to log device fs: Block writes to mounted block devices btrfs: Do not restrict writes to btrfs devices block: Add config option to not allow writing to mounted devices block: Remove blkdev_get_by_*() functions bcachefs: Convert to bdev_open_by_path() fs: handle freezing from multiple devices fs: remove dead check nilfs2: simplify device handling fs: streamline thaw_super_locked ext4: simplify device handling xfs: simplify device handling fs: simplify setup_bdev_super() calls blkdev: comment fs_holder_ops porting: document block device freeze and thaw changes fs: remove unused helper ...
2024-01-08Merge branch 'pm-sleep'Rafael J. Wysocki
Merge system-wide power management updates for 6.8-rc1: - Fix possible deadlocks in the core system-wide PM code that occur if device-handling functions cannot be executed asynchronously during resune from system-wide suspend (Rafael J. Wysocki). - Clean up unnecessary local variable initializations in multiple places in the hibernation code (Wang chaodong, Li zeming). - Adjust core hibernation code to avoid missing wakeup events that occur after saving an image to persistent storage (Chris Feng). - Update hibernation code to enforce correct ordering during image compression and decompression (Hongchen Zhang). - Use kmap_local_page() instead of kmap_atomic() in copy_data_page() during hibernation and restore (Chen Haonan). - Adjust documentation and code comments to reflect recent task freezer changes (Kevin Hao). - Repair excess function parameter description warning in the hibernation image-saving code (Randy Dunlap). * pm-sleep: PM: sleep: Fix possible deadlocks in core system-wide PM code async: Introduce async_schedule_dev_nocall() async: Split async_schedule_node_domain() PM: hibernate: Repair excess function parameter description warning PM: sleep: Remove obsolete comment from unlock_system_sleep() Documentation: PM: Adjust freezing-of-tasks.rst to the freezer changes PM: hibernate: Use kmap_local_page() in copy_data_page() PM: hibernate: Enforce ordering during image compression/decompression PM: hibernate: Avoid missing wakeup events during hibernation PM: hibernate: Do not initialize error in snapshot_write_next() PM: hibernate: Do not initialize error in swap_write_page() PM: hibernate: Drop unnecessary local variable initialization
2024-01-08Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-devfreq'Rafael J. Wysocki
Merge cpuidle, cpufreq and devfreq updates for 6.8-rc1: - Add support for the Sierra Forest, Grand Ridge and Meteorlake SoCs to the intel_idle cpuidle driver (Artem Bityutskiy, Zhang Rui). - Do not enable interrupts when entering idle in the haltpoll cpuidle driver (Borislav Petkov). - Add Emerald Rapids support in no-HWP mode to the intel_pstate cpufreq driver (Zhenguo Yao). - Use EPP values programmed by the platform firmware as balance performance ones by default in intel_pstate (Srinivas Pandruvada). - Add a missing function return value check to the SCMI cpufreq driver to avoid unexpected behavior (Alexandra Diupina). - Fix parameter type warning in the armada-8k cpufreq driver (Gregory CLEMENT). - Rework trans_stat_show() in the devfreq core code to avoid buffer overflows (Christian Marangi). - Synchronize devfreq_monitor_[start/stop] so as to prevent a timer list corruption from occurring when devfreq governors are switched frequently (Mukesh Ojha). * pm-cpuidle: cpuidle: haltpoll: Do not enable interrupts when entering idle intel_idle: add Sierra Forest SoC support intel_idle: add Grand Ridge SoC support intel_idle: Add Meteorlake support * pm-cpufreq: cpufreq: intel_pstate: Add Emerald Rapids support in no-HWP mode cpufreq: armada-8k: Fix parameter type warning cpufreq: scmi: process the result of devm_of_clk_add_hw_provider() cpufreq: intel_pstate: Prioritize firmware-provided balance performance EPP * pm-devfreq: PM / devfreq: Synchronize devfreq_monitor_[start/stop] PM / devfreq: Convert to use sysfs_emit_at() API PM / devfreq: Fix buffer overflow in trans_stat_show
2024-01-08locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, ↵Ingo Molnar
can still use the lock object after it's unlocked Clarify the mutex lock lifetime rules a bit more. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
2024-01-06Merge tag 'i2c-for-6.7-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Improve the detection when to run atomic transfer handlers for kernels with preemption disabled. This removes some false positive splats a number of users were seeing if their driver didn't have support for atomic transfers. Also, fix a typo in the docs while we are here" * tag 'i2c-for-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: core: Fix atomic xfer check for non-preempt config Documentation/i2c: fix spelling error in i2c-address-translators
2024-01-04Merge tag 'net-6.7-rc9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and netfilter. We haven't accumulated much over the break. If it wasn't for the uninterrupted stream of fixes for Intel drivers this PR would be very slim. There was a handful of user reports, however, either they stood out because of the lower traffic or users have had more time to test over the break. The ones which are v6.7-relevant should be wrapped up. Current release - regressions: - Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required", it caused issues on networks where routers send prefixes with preferred_lft=0 - wifi: - iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock - mac80211: fix re-adding debugfs entries during reconfiguration Current release - new code bugs: - tcp: print AO/MD5 messages only if there are any keys Previous releases - regressions: - virtio_net: fix missing dma unmap for resize, prevent OOM Previous releases - always broken: - mptcp: prevent tcp diag from closing listener subflows - nf_tables: - set transport header offset for egress hook, fix IPv4 mangling - skip set commit for deleted/destroyed sets, avoid double deactivation - nat: make sure action is set for all ct states, fix openvswitch matching on ICMP packets in related state - eth: mlxbf_gige: fix receive hang under heavy traffic - eth: r8169: fix PCI error on system resume for RTL8168FP - net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling" * tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) net/tcp: Only produce AO/MD5 logs if there are any keys net: Implement missing SO_TIMESTAMPING_NEW cmsg support bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters() net: ravb: Wait for operating mode to be applied asix: Add check for usbnet_get_endpoints octeontx2-af: Re-enable MAC TX in otx2_stop processing octeontx2-af: Always configure NIX TX link credits based on max frame size net/smc: fix invalid link access in dumping SMC-R connections net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues virtio_net: fix missing dma unmap for resize igc: Fix hicredit calculation ice: fix Get link status data length i40e: Restore VF MSI-X state during PCI reset i40e: fix use-after-free in i40e_aqc_add_filters() net: Save and restore msg_namelen in sock_sendmsg netfilter: nft_immediate: drop chain reference counter on error netfilter: nf_nat: fix action not being set for all ct states net: bcmgenet: Fix FCS generation for fragmented skbuffs mptcp: prevent tcp diag from closing listener subflows MAINTAINERS: add Geliang as reviewer for MPTCP ...
2024-01-04Merge branch 'for-next/perf' into for-next/coreWill Deacon
* for-next/perf: (30 commits) arm: perf: Fix ARCH=arm build with GCC MAINTAINERS: add maintainers for DesignWare PCIe PMU driver drivers/perf: add DesignWare PCIe PMU driver PCI: Move pci_clear_and_set_dword() helper to PCI header PCI: Add Alibaba Vendor ID to linux/pci_ids.h docs: perf: Add description for Synopsys DesignWare PCIe PMU driver Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" Documentation: arm64: Document the PMU event counting threshold feature arm64: perf: Add support for event counting threshold arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h perf/arm_dmc620: Remove duplicate format attribute #defines arm: pmu: Share user ABI format mechanism with SPE arm64: perf: Include threshold control fields in PMEVTYPER mask arm: perf: Convert remaining fields to use GENMASK arm: perf: Use GENMASK for PMMIR fields arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N arm: perf: Remove inlines from arm_pmuv3.c drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax drivers/perf: Remove usage of the deprecated ida_simple_xx() API ...
2024-01-02Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required"Alex Henrie
The commit had a bug and might not have been the right approach anyway. Fixes: 629df6701c8a ("net: ipv6/addrconf: clamp preferred_lft to the minimum required") Fixes: ec575f885e3e ("Documentation: networking: explain what happens if temp_prefered_lft is too small or too large") Reported-by: Dan Moulding <dan@danm.net> Closes: https://lore.kernel.org/netdev/20231221231115.12402-1-dan@danm.net/ Link: https://lore.kernel.org/netdev/CAMMLpeTdYhd=7hhPi2Y7pwdPCgnnW5JYh-bu3hSc7im39uxnEA@mail.gmail.com/ Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231230043252.10530-1-alexhenrie24@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-02Merge tag 'v6.7-rc8' into locking/core, to pick up dependent changesIngo Molnar
Pick up these commits from Linus's tree: b106bcf0f99a ("locking/osq_lock: Clarify osq_wait_next()") 563adbfc351b ("locking/osq_lock: Clarify osq_wait_next() calling convention") 7c2230982129 ("locking/osq_lock: Move the definition of optimistic_spin_node into osq_lock.c") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-01-02dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examplesJohan Hovold
Clean up the examples by adding newline separators, moving 'reg' properties after 'compatible' and dropping unused labels. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231130174114.13122-3-johan+linaro@kernel.org
2024-01-02dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node namesJohan Hovold
The ADC Thermal Monitor is part of an SPMI PMIC, which in turn sits on an SPMI bus. Fixes: db03874b8543 ("dt-bindings: thermal: qcom: add HC variant of adc-thermal monitor bindings") Fixes: e8ffd6c0756b ("dt-bindings: thermal: qcom: add adc-thermal monitor bindings") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231130174114.13122-2-johan+linaro@kernel.org
2024-01-02dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controllerMaxim Kiselev
Add a binding for D1/T113s thermal sensor controller. Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231217210629.131486-2-bigunclemax@gmail.com
2024-01-02dt-bindings: thermal-zones: Document critical-actionFabio Estevam
Document the critical-action property to describe the thermal action the OS should perform after the critical temperature is reached. The possible values are "shutdown" and "reboot". The motivation for introducing the critical-action property is that different systems may need different thermal actions when the critical temperature is reached. For example, in a desktop PC, it is desired that a shutdown happens after the critical temperature is reached. However, in some embedded cases, such behavior does not suit well, as the board may be unattended in the field and rebooting may be a better approach. The bootloader may also benefit from this new property as it can check the SoC temperature and in case the temperature is above the critical point, it can trigger a shutdown or reboot accordingly. Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-1-festevam@gmail.com
2024-01-02dt-bindings: thermal: qcom-tsens: document the SM8650 Temperature SensorNeil Armstrong
Document the Temperature Sensor (TSENS) on the SM8650 Platform. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231128-topic-sm8650-upstream-bindings-tsens-v3-1-54179e6646d3@linaro.org
2024-01-02dt-bindings: thermal: loongson,ls2k-thermal: Fix binding check issuesBinbin Zhou
Add the missing 'thermal-sensor-cells' property which is required for every thermal sensor as it's used when using phandles. And add the thermal-sensor.yaml reference. In fact, it was a careless mistake when submitting the driver that caused it to not work properly. So the fix is necessary, although it will result in the ABI break. Fixes: 72684d99a854 ("thermal: dt-bindings: add loongson-2 thermal") Cc: Yinbo Zhu <zhuyinbo@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/6d69362632271ab0af9a5fbfa3bc46a0894f1d54.1700817227.git.zhoubinbin@loongson.cn
2024-01-02dt-bindings: thermal: convert Mediatek Thermal to the json-schemaRafał Miłecki
This helps validating DTS files. Introduced changes: 1. Improved title 2. Simplified description (dropped "This describes the device tree...") 3. Dropped undocumented "reset-names" from example Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231117052214.24554-1-zajec5@gmail.com
2023-12-29zswap: memcontrol: implement zswap writeback disablingNhat Pham
During our experiment with zswap, we sometimes observe swap IOs due to occasional zswap store failures and writebacks-to-swap. These swapping IOs prevent many users who cannot tolerate swapping from adopting zswap to save memory and improve performance where possible. This patch adds the option to disable this behavior entirely: do not writeback to backing swapping device when a zswap store attempt fail, and do not write pages in the zswap pool back to the backing swap device (both when the pool is full, and when the new zswap shrinker is called). This new behavior can be opted-in/out on a per-cgroup basis via a new cgroup file. By default, writebacks to swap device is enabled, which is the previous behavior. Initially, writeback is enabled for the root cgroup, and a newly created cgroup will inherit the current setting of its parent. Note that this is subtly different from setting memory.swap.max to 0, as it still allows for pages to be stored in the zswap pool (which itself consumes swap space in its current form). This patch should be applied on top of the zswap shrinker series: https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@gmail.com/ as it also disables the zswap shrinker, a major source of zswap writebacks. For the most part, this feature is motivated by internal parties who have already established their opinions regarding swapping - the workloads that are highly sensitive to IO, and especially those who are using servers with really slow disk performance (for instance, massive but slow HDDs). For these folks, it's impossible to convince them to even entertain zswap if swapping also comes as a packaged deal. Writeback disabling is quite a useful feature in these situations - on a mixed workloads deployment, they can disable writeback for the more IO-sensitive workloads, and enable writeback for other background workloads. For instance, on a server with HDD, I allocate memories and populate them with random values (so that zswap store will always fail), and specify memory.high low enough to trigger reclaim. The time it takes to allocate the memories and just read through it a couple of times (doing silly things like computing the values' average etc.): zswap.writeback disabled: real 0m30.537s user 0m23.687s sys 0m6.637s 0 pages swapped in 0 pages swapped out zswap.writeback enabled: real 0m45.061s user 0m24.310s sys 0m8.892s 712686 pages swapped in 461093 pages swapped out (the last two lines are from vmstat -s). [nphamcs@gmail.com: add a comment about recurring zswap store failures leading to reclaim inefficiency] Link: https://lkml.kernel.org/r/20231221005725.3446672-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20231207192406.3809579-1-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: David Heidelberg <david@ixit.cz> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Zefan Li <lizefan.x@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29docs: submit-checklist: remove all of "make namespacecheck"Tiezhu Yang
After commit 7dfbea4c468c ("scripts: remove namespace.pl"), scripts/namespace.pl has been removed from the kernel, and "make namespacecheck" has been removed from the English version of submit-checklist.rst, so also remove it in the related translations. Link: https://lkml.kernel.org/r/20231219125008.23007-6-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29lib: crc_ccitt_false() is identical to crc_itu_t()Mathis Marion
crc_ccitt_false() was introduced in commit 0d85adb5fbd33 ("lib/crc-ccitt: Add CCITT-FALSE CRC16 variant"), but it is redundant with crc_itu_t(). Since the latter is more used, it is the one being kept. Link: https://lkml.kernel.org/r/20231219131154.748577-1-Mathis.Marion@silabs.com Signed-off-by: Mathis Marion <mathis.marion@silabs.com> Cc: Andrey Smirnov <andrew.smirnov@gmail.com> Cc: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com> Cc: Jérôme Pouiller <jerome.pouiller@silabs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29mm/rmap: rename COMPOUND_MAPPED to ENTIRELY_MAPPEDDavid Hildenbrand
We removed all "bool compound" and RMAP_COMPOUND parameters. Let's remove the remaining "compound" terminology by making COMPOUND_MAPPED match the "folio->_entire_mapcount" terminology, renaming it to ENTIRELY_MAPPED. ENTIRELY_MAPPED is only used when the whole folio is mapped using a single page table entry (e.g., a single PMD mapping a PMD-sized THP). For now, we don't support mapping any THP bigger than that, so ENTIRELY_MAPPED only applies to PMD-mapped PMD-sized THP only. Link: https://lkml.kernel.org/r/20231220224504.646757-40-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29Documentation: stop referring to page_remove_rmap()David Hildenbrand
Refer to folio_remove_rmap_*() instaed. Link: https://lkml.kernel.org/r/20231220224504.646757-32-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29mm/ksm: document ksm advisor and its sysfs knobsStefan Roesch
This documents the KSM advisor and its new knobs in /sys/fs/kernel/mm. Link: https://lkml.kernel.org/r/20231218231054.1625219-5-shr@devkernel.io Signed-off-by: Stefan Roesch <shr@devkernel.io> Acked-by: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29userfaultfd: UFFDIO_MOVE uABIAndrea Arcangeli
Implement the uABI of UFFDIO_MOVE ioctl. UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application needs pages to be allocated [1]. However, with UFFDIO_MOVE, if pages are available (in userspace) for recycling, as is usually the case in heap compaction algorithms, then we can avoid the page allocation and memcpy (done by UFFDIO_COPY). Also, since the pages are recycled in the userspace, we avoid the need to release (via madvise) the pages back to the kernel [2]. We see over 40% reduction (on a Google pixel 6 device) in the compacting thread's completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was measured using a benchmark that emulates a heap compaction implementation using userfaultfd (to allow concurrent accesses by application threads). More details of the usecase are explained in [2]. Furthermore, UFFDIO_MOVE enables moving swapped-out pages without touching them within the same vma. Today, it can only be done by mremap, however it forces splitting the vma. [1] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redhat.com/ [2] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyjniNVjp0Aw@mail.gmail.com/ Update for the ioctl_userfaultfd(2) manpage: UFFDIO_MOVE (Since Linux xxx) Move a continuous memory chunk into the userfault registered range and optionally wake up the blocked thread. The source and destination addresses and the number of bytes to move are specified by the src, dst, and len fields of the uffdio_move structure pointed to by argp: struct uffdio_move { __u64 dst; /* Destination of move */ __u64 src; /* Source of move */ __u64 len; /* Number of bytes to move */ __u64 mode; /* Flags controlling behavior of move */ __s64 move; /* Number of bytes moved, or negated error */ }; The following value may be bitwise ORed in mode to change the behavior of the UFFDIO_MOVE operation: UFFDIO_MOVE_MODE_DONTWAKE Do not wake up the thread that waits for page-fault resolution UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES Allow holes in the source virtual range that is being moved. When not specified, the holes will result in ENOENT error. When specified, the holes will be accounted as successfully moved memory. This is mostly useful to move hugepage aligned virtual regions without knowing if there are transparent hugepages in the regions or not, but preventing the risk of having to split the hugepage during the operation. The move field is used by the kernel to return the number of bytes that was actually moved, or an error (a negated errno- style value). If the value returned in move doesn't match the value that was specified in len, the operation fails with the error EAGAIN. The move field is output-only; it is not read by the UFFDIO_MOVE operation. The operation may fail for various reasons. Usually, remapping of pages that are not exclusive to the given process fail; once KSM might deduplicate pages or fork() COW-shares pages during fork() with child processes, they are no longer exclusive. Further, the kernel might only perform lightweight checks for detecting whether the pages are exclusive, and return -EBUSY in case that check fails. To make the operation more likely to succeed, KSM should be disabled, fork() should be avoided or MADV_DONTFORK should be configured for the source VMA before fork(). This ioctl(2) operation returns 0 on success. In this case, the entire area was moved. On error, -1 is returned and errno is set to indicate the error. Possible errors include: EAGAIN The number of bytes moved (i.e., the value returned in the move field) does not equal the value that was specified in the len field. EINVAL Either dst or len was not a multiple of the system page size, or the range specified by src and len or dst and len was invalid. EINVAL An invalid bit was specified in the mode field. ENOENT The source virtual memory range has unmapped holes and UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES is not set. EEXIST The destination virtual memory range is fully or partially mapped. EBUSY The pages in the source virtual memory range are either pinned or not exclusive to the process. The kernel might only perform lightweight checks for detecting whether the pages are exclusive. To make the operation more likely to succeed, KSM should be disabled, fork() should be avoided or MADV_DONTFORK should be configured for the source virtual memory area before fork(). ENOMEM Allocating memory needed for the operation failed. ESRCH The target process has exited at the time of a UFFDIO_MOVE operation. Link: https://lkml.kernel.org/r/20231206103702.3873743-3-surenb@google.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-27Documentation/i2c: fix spelling error in i2c-address-translatorsAttreyee Mukherjee
Correct to "stretched" from "streched" in "keeps clock streched on bus A waiting for reply". Signed-off-by: Attreyee Mukherjee <tintinm2017@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-12-23Merge tag 'char-misc-6.7-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver fixes from Greg KH: "Here are a small number of various driver fixes for 6.7-rc7 that normally come through the char-misc tree, and one debugfs fix as well. Included in here are: - iio and hid sensor driver fixes for a number of small things - interconnect driver fixes - brcm_nvmem driver fixes - debugfs fix for previous fix - guard() definition in device.h so that many subsystems can start using it for 6.8-rc1 (requested by Dan Williams to make future merges easier) All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) debugfs: initialize cancellations earlier Revert "iio: hid-sensor-als: Add light color temperature support" Revert "iio: hid-sensor-als: Add light chromaticity support" nvmem: brcm_nvram: store a copy of NVRAM content dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp driver core: Add a guard() definition for the device_lock() interconnect: qcom: icc-rpm: Fix peak rate calculation iio: adc: MCP3564: fix hardware identification logic iio: adc: MCP3564: fix calib_bias and calib_scale range checks iio: adc: meson: add separate config for axg SoC family iio: adc: imx93: add four channels for imx93 adc iio: adc: ti_am335x_adc: Fix return value check of tiadc_request_dma() interconnect: qcom: sm8250: Enable sync_state iio: triggered-buffer: prevent possible freeing of wrong buffer iio: imu: inv_mpu6050: fix an error code problem in inv_mpu6050_read_raw iio: imu: adis16475: use bit numbers in assign_bit() iio: imu: adis16475: add spi_device_id table iio: tmag5273: fix temperature offset interconnect: Treat xlate() returning NULL node as an error iio: common: ms_sensors: ms_sensors_i2c: fix humidity conversion time table ...
2023-12-23sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true)Vincent Guittot
sched_feat(UTIL_EST_FASTUP) has been added to easily disable the feature in order to check for possibly related regressions. After 3 years, it has never been used and no regression has been reported. Let's remove it and make fast increase a permanent behavior. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Hongyan Xia <hongyan.xia2@arm.com> Reviewed-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Yanteng Si <siyanteng@loongson.cn> [for the Chinese translation] Reviewed-by: Alex Shi <alexs@kernel.org> Link: https://lore.kernel.org/r/20231201161652.1241695-2-vincent.guittot@linaro.org
2023-12-23Merge tag 'v6.7-rc6' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-12-22Merge tag 'spi-nor/for-6.8' into mtd/nextMiquel Raynal
SPI NOR comes with die erase support for multi die flashes, with new octal protocols (1-1-8 and 1-8-8) parsed from SFDP and with an updated documentation about what the contributors shall consider when proposing flash additions or updates. Michael Walle stepped out from the reviewer role to maintainer.
2023-12-22dt-bindings: mtd: partitions: u-boot: Fix typoStefan Wahren
The initial description contained a typo. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20231218130656.9020-1-wahrenst@gmx.net