summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-10-03net/mlx5: Set default grace period based on function typeMaher Sanalla
Currently, driver sets the same grace period for fw fatal health reporter to any type of function. Since the lower level functions are more vulnerable to fw fatal errors as a result of parent function closure/reload, set a smaller grace period for the lower level functions, as follows: 1. For ECPF: 180 seconds. 2. For PF: 60 seconds. 3. For VF/SF: 30 seconds. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5: Start health poll at earlier stage of driver loadMoshe Shemesh
Start health poll at earlier stage, so if fw fatal issue occurred before or during initialization commands such as init_hca or set_hca_cap the poll health can detect and indicate that the driver is already in error state. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: Expose rx_oversize_pkts_buffer counterGal Pressman
Add the rx_oversize_pkts_buffer counter to ethtool statistics. This counter exposes the number of dropped received packets due to length which arrived to RQ and exceed software buffer size allocated by the device for incoming traffic. It might imply that the device MTU is larger than the software buffers size. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Optimize for unaligned mode with 3072-byte framesMaxim Mikityanskiy
When XSK frame size is 3072 (or another power of two multiplied by 3), KLM mechanism for NIC virtual memory page mapping can be optimized by replacing it with KSM. Before this change, two KLM entries were needed to map an XSK frame that is not a power of two: one entry maps the UMEM memory up to the frame length, the other maps the rest of the stride to the garbage page. When the frame length divided by 3 is a power of two, it can be mapped using 3 KSM entries, and the fourth will map the rest of the stride to the garbage page. All 4 KSM entries are of the same size, which allows for a much faster lookup. Frame size 3072 is useful in certain use cases, because it allows packing 4 frames into 3 pages. Generally speaking, other frame sizes equal to PAGE_SIZE minus a power of two can be optimized in a similar way, but it will require many more KSMs per frame, which slows down UMRs a little bit, but more importantly may hit the limit for the maximum number of KSM entries. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Print a warning in slow configurationsMaxim Mikityanskiy
On striding RQ, when the XSK frame size doesn't match the MKey page size, KLM is used for memory mappings, which is a slower mechanism than MTT or KSM. It may happen in two cases: 1. Frame size is not a power of two (only possible in the unaligned mode of XSK). 2. Frame size is 2048 bytes, and the firmware doesn't support MKey pages smaller than 4096 bytes. Depending on the case, print a warning and recommend to disable striding RQ or upgrade the firmware. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Use KLM to protect frame overrun in unaligned modeMaxim Mikityanskiy
XSK RQs support striding RQ linear mode, but the stride size may be bigger than the XSK frame size, because: 1. The stride size must be a power of two. 2. The stride size must be equal to the UMR page size. Each XSK frame is treated as a separate page, because they aren't necessarily adjacent in physical memory, so the driver can't put more than one stride per page. 3. The minimal MTT page size is 4096 on older firmware. That means that if XSK frame size is 2048 or not a power of two, the strides may be bigger than XSK frames. Normally, it's not a problem if the hardware enforces the MTU. However, traffic between vports skips the hardware MTU check, and oversized packets may be received. If an oversized packet is bigger than the XSK frame but not bigger than the stride, it will cause overwriting of the adjacent UMEM region. If the packet takes more than one stride, they can be recycled for reuse, so it's not a problem when the XSK frame size matches the stride size. Work around the above issue by leveraging KLM to make a more fine-grained mapping. The beginning of each stride is mapped to the frame memory, and the padding up to the closest power of two is mapped to the overflow page that doesn't belong to UMEM. This way, application data corruption won't happen upon receiving packets bigger than MTU. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: Improve MTT/KSM alignmentMaxim Mikityanskiy
Make mlx5e_mpwrq_mtts_per_wqe take into account that KSM requires smaller alignment than MTT. Ensure that there is always an even amount of MTTs in a UMR WQE, so that complete octwords are formed, and no garbage is mapped. Drop extra alignment in MLX5_MTT_OCTW that may cause setting too big ucseg->xlt_octowords, also leading to mapping garbage. Generalize some calculations by introducing the MLX5_OCTWORD constant. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Use umr_mode to calculate striding RQ parametersMaxim Mikityanskiy
Instead of passing the unaligned flag, pass an enum that indicates the UMR mode. The next commit will add the third mode (KLM for certain configurations of XSK), which will be added to this enum instead of adding another bool flag everywhere. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Improve need_wakeup logicMaxim Mikityanskiy
XSK need_wakeup mechanism allows the driver to stop busy waiting for buffers when the fill ring is empty, yield to the application and signal it that the driver needs to be waken up after the application refills the fill ring. Add protection against the race condition on the RX (refill) side: if the application refills buffers after xskrq->post_wqes is called, but before mlx5e_xsk_update_rx_wakeup, NAPI will exit, skipping taking these buffers to the hardware WQ, and the application won't wake it up again. Optimize the whole need_wakeup logic, removing unneeded flows, to compensate for this new check. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Include XSK skb_from_cqe callbacks in INDIRECT_CALLMaxim Mikityanskiy
XSK is a performance-critical data path. To avoid an indirect function call with a retpoline, include XSK callbacks in the INDIRECT_CALL macro, so that they are called directly in XSK flows. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Set napi_id to support busy pollingMaxim Mikityanskiy
xdp_rxq_info_reg should get the actual napi_id, not 0, in order to support socket busy polling properly. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net/mlx5e: xsk: Flush RQ on XSK activation to save memoryMaxim Mikityanskiy
The regular RQ remains open after opening an XSK socket, in order to guarantee that closing the XSK socket never fails due to an error when reopening the regular RQ. To save memory, the regular RQ can be deactivated and flushed, releasing all pages, when an XSK socket is open. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net: mvpp2: fix mvpp2 debugfs leakRussell King (Oracle)
When mvpp2 is unloaded, the driver specific debugfs directory is not removed, which technically leads to a memory leak. However, this directory is only created when the first device is probed, so the hardware is present. Removing the module is only something a developer would to when e.g. testing out changes, so the module would be reloaded. So this memory leak is minor. The original attempt in commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()") that was labelled as a memory leak fix was not, it fixed a refcount leak, but in doing so created a problem when the module is reloaded - the directory already exists, but mvpp2_root is NULL, so we lose all debugfs entries. This fix has been reverted. This is the alternative fix, where we remove the offending directory whenever the driver is unloaded. Fixes: 21da57a23125 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net: ipa: update copyrightsAlex Elder
Some source files state copyright dates that are earlier than the last modification of the file. Change the copyright year to 2022 in all such cases. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20220930224549.3503434-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03net: ipa: update commentsAlex Elder
This patch just updates comments throughout the IPA code. Transaction state is now tracked using indexes into an array rather than linked lists, and a few comments refer to the "old way" of doing things. The description of how transactions are used was changed to refer to "operations" rather than "commands", to (hopefully) remove a possible ambiguity. IPA register offsets and fields are now handled differently as well, and the register documentation is updated to better describe the code. A few minor updates to comments were made (e.g., adding a missing word, fixing a typo or punctuation, etc.). Finally, the local macro atomic_dec_not_zero() is no longer used, so it is deleted. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20220930224527.3503404-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03r8152: Rate limit overflow messagesAndrew Gaul
My system shows almost 10 million of these messages over a 24-hour period which pollutes my logs. Signed-off-by: Andrew Gaul <gaul@google.com> Link: https://lore.kernel.org/r/20221002034128.2026653-1-gaul@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-04Merge tag 'amd-drm-next-6.1-2022-09-30' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.1-2022-09-30: amdgpu: - RLC FW code cleanup - RLC fixes for GC 11.x - SMU 13.x fixes - CP FW code cleanup - SDMA FW code cleanup - GC 11.x fixes - DCN 3.2.x fixes - DCN 3.1.4 fixes - Misc fixes - RAS fixes - SR-IOV fixes - VCN 4.x fixes amdkfd: - GC 11.x fixes - Xnack fixes - UBSAN warning fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220930162012.5823-1-alexander.deucher@amd.com
2022-10-03net: lan966x: Fix return type of lan966x_port_xmitNathan Huckleberry
The ndo_start_xmit field in net_device_ops is expected to be of type netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev). The mismatched return type breaks forward edge kCFI since the underlying function definition does not match the function hook definition. The return type of lan966x_port_xmit should be changed from int to netdev_tx_t. Reported-by: Dan Carpenter <error27@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1703 Cc: llvm@lists.linux.dev Signed-off-by: Nathan Huckleberry <nhuck@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220929182704.64438-1-nhuck@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03Merge tag 'rust-v6.1-rc1' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull Rust introductory support from Kees Cook: "The tree has a recent base, but has fundamentally been in linux-next for a year and a half[1]. It's been updated based on feedback from the Kernel Maintainer's Summit, and to gain recent Reviewed-by: tags. Miguel is the primary maintainer, with me helping where needed/wanted. Our plan is for the tree to switch to the standard non-rebasing practice once this initial infrastructure series lands. The contents are the absolute minimum to get Rust code building in the kernel, with many more interfaces[2] (and drivers - NVMe[3], 9p[4], M1 GPU[5]) on the way. The initial support of Rust-for-Linux comes in roughly 4 areas: - Kernel internals (kallsyms expansion for Rust symbols, %pA format) - Kbuild infrastructure (Rust build rules and support scripts) - Rust crates and bindings for initial minimum viable build - Rust kernel documentation and samples Rust support has been in linux-next for a year and a half now, and the short log doesn't do justice to the number of people who have contributed both to the Linux kernel side but also to the upstream Rust side to support the kernel's needs. Thanks to these 173 people, and many more, who have been involved in all kinds of ways: Miguel Ojeda, Wedson Almeida Filho, Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg, Adam Bratschi-Kaye, Benno Lossin, Maciej Falkowski, Finn Behrens, Sven Van Asbroeck, Asahi Lina, FUJITA Tomonori, John Baublitz, Wei Liu, Geoffrey Thomas, Philip Herron, Arthur Cohen, David Faust, Antoni Boucher, Philip Li, Yujie Liu, Jonathan Corbet, Greg Kroah-Hartman, Paul E. McKenney, Josh Triplett, Kent Overstreet, David Gow, Alice Ryhl, Robin Randhawa, Kees Cook, Nick Desaulniers, Matthew Wilcox, Linus Walleij, Joe Perches, Michael Ellerman, Petr Mladek, Masahiro Yamada, Arnaldo Carvalho de Melo, Andrii Nakryiko, Konstantin Shelekhin, Rasmus Villemoes, Konstantin Ryabitsev, Stephen Rothwell, Andy Shevchenko, Sergey Senozhatsky, John Paul Adrian Glaubitz, David Laight, Nathan Chancellor, Jonathan Cameron, Daniel Latypov, Shuah Khan, Brendan Higgins, Julia Lawall, Laurent Pinchart, Geert Uytterhoeven, Akira Yokosawa, Pavel Machek, David S. Miller, John Hawley, James Bottomley, Arnd Bergmann, Christian Brauner, Dan Robertson, Nicholas Piggin, Zhouyi Zhou, Elena Zannoni, Jose E. Marchesi, Leon Romanovsky, Will Deacon, Richard Weinberger, Randy Dunlap, Paolo Bonzini, Roland Dreier, Mark Brown, Sasha Levin, Ted Ts'o, Steven Rostedt, Jarkko Sakkinen, Michal Kubecek, Marco Elver, Al Viro, Keith Busch, Johannes Berg, Jan Kara, David Sterba, Connor Kuehl, Andy Lutomirski, Andrew Lunn, Alexandre Belloni, Peter Zijlstra, Russell King, Eric W. Biederman, Willy Tarreau, Christoph Hellwig, Emilio Cobos Álvarez, Christian Poveda, Mark Rousskov, John Ericson, TennyZhuang, Xuanwo, Daniel Paoliello, Manish Goregaokar, comex, Josh Stone, Stephan Sokolow, Philipp Krones, Guillaume Gomez, Joshua Nelson, Mats Larsen, Marc Poulhiès, Samantha Miller, Esteban Blanc, Martin Schmidt, Martin Rodriguez Reboredo, Daniel Xu, Viresh Kumar, Bartosz Golaszewski, Vegard Nossum, Milan Landaverde, Dariusz Sosnowski, Yuki Okushi, Matthew Bakhtiari, Wu XiangCheng, Tiago Lam, Boris-Chengbiao Zhou, Sumera Priyadarsini, Viktor Garske, Niklas Mohrin, Nándor István Krácser, Morgan Bartlett, Miguel Cano, Léo Lanteri Thauvin, Julian Merkle, Andreas Reindl, Jiapeng Chong, Fox Chen, Douglas Su, Antonio Terceiro, SeongJae Park, Sergio González Collado, Ngo Iok Ui (Wu Yu Wei), Joshua Abraham, Milan, Daniel Kolsoi, ahomescu, Manas, Luis Gerhorst, Li Hongyu, Philipp Gesang, Russell Currey, Jalil David Salamé Messina, Jon Olson, Raghvender, Angelos, Kaviraj Kanagaraj, Paul Römer, Sladyn Nunes, Mauro Baladés, Hsiang-Cheng Yang, Abhik Jain, Hongyu Li, Sean Nash, Yuheng Su, Peng Hao, Anhad Singh, Roel Kluin, Sara Saa, Geert Stappers, Garrett LeSage, IFo Hancroft, and Linus Torvalds" Link: https://lwn.net/Articles/849849/ [1] Link: https://github.com/Rust-for-Linux/linux/commits/rust [2] Link: https://github.com/metaspace/rust-linux/commit/d88c3744d6cbdf11767e08bad56cbfb67c4c96d0 [3] Link: https://github.com/wedsonaf/linux/commit/9367032607f7670de0ba1537cf09ab0f4365a338 [4] Link: https://github.com/AsahiLinux/linux/commits/gpu/rust-wip [5] * tag 'rust-v6.1-rc1' of https://github.com/Rust-for-Linux/linux: (27 commits) MAINTAINERS: Rust samples: add first Rust examples x86: enable initial Rust support docs: add Rust documentation Kbuild: add Rust support rust: add `.rustfmt.toml` scripts: add `is_rust_module.sh` scripts: add `rust_is_available.sh` scripts: add `generate_rust_target.rs` scripts: add `generate_rust_analyzer.py` scripts: decode_stacktrace: demangle Rust symbols scripts: checkpatch: enable language-independent checks for Rust scripts: checkpatch: diagnose uses of `%pA` in the C side as errors vsprintf: add new `%pA` format specifier rust: export generated symbols rust: add `kernel` crate rust: add `bindings` crate rust: add `macros` crate rust: add `compiler_builtins` crate rust: adapt `alloc` crate to the kernel ...
2022-10-04Merge tag 'drm-misc-next-2022-09-30' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.1: Core Changes: - Add dma_resv_assert_held to vmap/vunmap calls. - Add kunit tests for some format conversion calls. - Don't rewrite link config when setting phy test pattern in DP link training. Driver Changes: - Assorted small fixes in bridge/lt8192b, qxl, virtio-gpu, ast. - Fix corrupted image output in lt8912b. - Fix driver unbind in meson. - Add INX, BOE, AUO, Multi-Inno Technology panels to panel-edp. - Synchronize access to GEM bo's in simpledrm, ssd130x. - Use dev_err_probe in panel-edp and panel-simple. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/afbd505a-3799-c73b-8008-ef6e156ad7e1@linux.intel.com
2022-10-03Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf 2022-10-03 We've added 10 non-merge commits during the last 23 day(s) which contain a total of 14 files changed, 130 insertions(+), 69 deletions(-). The main changes are: 1) Fix dynptr helper API to gate behind CAP_BPF given it was not intended for unprivileged BPF programs, from Kumar Kartikeya Dwivedi. 2) Fix need_wakeup flag inheritance from umem buffer pool for shared xsk sockets, from Jalal Mostafa. 3) Fix truncated last_member_type_id in btf_struct_resolve() which had a wrong storage type, from Lorenz Bauer. 4) Fix xsk back-pressure mechanism on tx when amount of produced descriptors to CQ is lower than what was grabbed from xsk tx ring, from Maciej Fijalkowski. 5) Fix wrong cgroup attach flags being displayed to effective progs, from Pu Lehui. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: xsk: Inherit need_wakeup flag for shared sockets bpf: Gate dynptr API behind CAP_BPF selftests/bpf: Adapt cgroup effective query uapi change bpftool: Fix wrong cgroup attach flags being assigned to effective progs bpf, cgroup: Reject prog_attach_flags array when effective query bpf: Ensure correct locking around vulnerable function find_vpid() bpf: btf: fix truncated last_member_type_id in btf_struct_resolve selftests/xsk: Add missing close() on netns fd xsk: Fix backpressure mechanism on Tx MAINTAINERS: Add include/linux/tnum.h to BPF CORE ==================== Link: https://lore.kernel.org/r/20221003201957.13149-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03Merge tag 'thermal-6.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "The most significant part of this update is the thermal control DT initialization rework from Daniel Lezcano and the following conversion of drivers to use the new API introduced by it Apart from that, the maximum number of trip points in a thermal zone is increased and there are some fixes and code cleanups Specifics: - Rework the device tree initialization, convert the drivers to the new API and remove the old OF code (Daniel Lezcano) - Fix return value to -ENODEV when searching for a specific thermal zone which does not exist (Daniel Lezcano) - Fix the return value inspection in of_thermal_zone_find() (Dan Carpenter) - Fix kernel panic when KASAN is enabled as it detects use after free when unregistering a thermal zone (Daniel Lezcano) - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano) - Remove unnecessary error message as it is already shown in the underlying function (Jiapeng Chong) - Rework the monitoring path and move the locks upper in the call stack to fix some potentials race windows (Daniel Lezcano) - Fix lockdep_assert() warning introduced by the lock rework (Daniel Lezcano) - Do not lock thermal zone mutex in the user space governor (Rafael Wysocki) - Revert the Mellanox 'hotter thermal zone' feature because it is already handled in the thermal framework core code (Daniel Lezcano) - Increase maximum number of trip points in the thermal core (Sumeet Pawnikar) - Replace strlcpy() with unused retval with strscpy() in the core thermal control code (Wolfram Sang) - Use module_pci_driver() macro in the int340x processor_thermal driver (Shang XiaoJing) - Use get_cpu() instead of smp_processor_id() in the intel_powerclamp thermal driver to prevent it from crashing and remove unused accounting for IRQ wakes from it (Srinivas Pandruvada) - Consolidate priv->data_vault checks in int340x_thermal (Rafael Wysocki) - Check the policy first in cpufreq_cooling_register() (Xuewen Yan) - Drop redundant error message from da9062-thermal (zhaoxiao) - Drop of_match_ptr() from thermal_mmio (Jean Delvare)" * tag 'thermal-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits) thermal: core: Increase maximum number of trip points thermal: int340x: processor_thermal: Use module_pci_driver() macro thermal: intel_powerclamp: Remove accounting for IRQ wakes thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash thermal: int340x_thermal: Consolidate priv->data_vault checks thermal: cpufreq_cooling: Check the policy first in cpufreq_cooling_register() thermal: Drop duplicate words from comments thermal: move from strlcpy() with unused retval to strscpy() thermal: da9062-thermal: Drop redundant error message thermal/drivers/thermal_mmio: Drop of_match_ptr() thermal: gov_user_space: Do not lock thermal zone mutex Revert "mlxsw: core: Add the hottest thermal zone detection" thermal/core: Fix lockdep_assert() warning thermal/core: Move the mutex inside the thermal_zone_device_update() function thermal/core: Move the thermal zone lock out of the governors thermal/governors: Group the thermal zone lock inside the throttle function thermal/core: Rework the monitoring a bit thermal/core: Rearm the monitoring only one time thermal/drivers/qcom/spmi-adc-tm5: Remove unnecessary print function dev_err() thermal/of: Remove old OF code ...
2022-10-03init/main.c: remove unnecessary (void*) conversionsZhou jie
The void pointer object can be directly assigned to different structure objects, it does not need to be cast. Link: https://lkml.kernel.org/r/20220928014539.11046-1-zhoujie@nfschina.com Signed-off-by: Zhou jie <zhoujie@nfschina.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03proc: mark more files as permanentAlexey Dobriyan
Mark /proc/devices /proc/kpagecount /proc/kpageflags /proc/kpagecgroup /proc/loadavg /proc/meminfo /proc/softirqs /proc/uptime /proc/version as permanent /proc entries, saving alloc/free and some list/spinlock ops per use. These files are never removed by the kernel so it is OK to mark them. Link: https://lkml.kernel.org/r/Yyn527DzDMa+r0Yj@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03nilfs2: remove the unneeded result variableye xingchen
Return the value nilfs_segctor_sync() directly instead of storing it in another redundant variable. Link: https://lkml.kernel.org/r/20220831033403.302184-1-ye.xingchen@zte.com.cn Link: https://lkml.kernel.org/r/20220921034803.2476-3-konishi.ryusuke@gmail.com Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03nilfs2: delete unnecessary checks before brelse()Minghao Chi
Patch series "nilfs2 minor amendments". This patch (of 2): The brelse() inline function tests whether its argument is NULL and then returns immediately. Thus remove the tests which are not needed around the shown calls. Link: https://lkml.kernel.org/r/20220921034803.2476-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20220819081700.96279-1-chi.minghao@zte.com.cn Link: https://lkml.kernel.org/r/20220921034803.2476-2-konishi.ryusuke@gmail.com Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03checkpatch: warn for non-standard fixes tag styleNiklas Söderlund
Add a warning for fixes tags that does not follow community conventions. Link: https://lkml.kernel.org/r/20220914100255.1048460-1-niklas.soderlund@corigine.com Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Philippe Schenker <philippe.schenker@toradex.com> Acked-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03usr/gen_init_cpio.c: remove unnecessary -1 values from int fileLi zeming
The file variable is assigned first, it does not need to be initialized. Link: https://lkml.kernel.org/r/20220919014406.3242-1-zeming@nfschina.com Signed-off-by: Li zeming <zeming@nfschina.com> Cc: Li zeming <zeming@nfschina.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03ipc/msg: mitigate the lock contention with percpu counterJiebin Sun
The msg_bytes and msg_hdrs atomic counters are frequently updated when IPC msg queue is in heavy use, causing heavy cache bounce and overhead. Change them to percpu_counter greatly improve the performance. Since there is one percpu struct per namespace, additional memory cost is minimal. Reading of the count done in msgctl call, which is infrequent. So the need to sum up the counts in each CPU is infrequent. Apply the patch and test the pts/stress-ng-1.4.0 -- system v message passing (160 threads). Score gain: 3.99x CPU: ICX 8380 x 2 sockets Core number: 40 x 2 physical cores Benchmark: pts/stress-ng-1.4.0 -- system v message passing (160 threads) [akpm@linux-foundation.org: coding-style cleanups] [jiebin.sun@intel.com: avoid negative value by overflow in msginfo] Link: https://lkml.kernel.org/r/20220920150809.4014944-1-jiebin.sun@intel.com [akpm@linux-foundation.org: fix min() warnings] Link: https://lkml.kernel.org/r/20220913192538.3023708-3-jiebin.sun@intel.com Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dennis Zhou <dennis@kernel.org> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vasily Averin <vasily.averin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03percpu: add percpu_counter_add_local and percpu_counter_sub_localJiebin Sun
Patch series "/msg: mitigate the lock contention in ipc/msg", v6. Here are two patches to mitigate the lock contention in ipc/msg. The 1st patch is to add the new interface percpu_counter_add_local and percpu_counter_sub_local. The batch size in percpu_counter_add_batch should be very large in heavy writing and rare reading case. Add the "_local" version, and mostly it will do local adding, reduce the global updating and mitigate lock contention in writing. The 2nd patch is to use percpu_counter instead of atomic update in ipc/msg. The msg_bytes and msg_hdrs atomic counters are frequently updated when IPC msg queue is in heavy use, causing heavy cache bounce and overhead. Change them to percpu_counter greatly improve the performance. Since there is one percpu struct per namespace, additional memory cost is minimal. Reading of the count done in msgctl call, which is infrequent. So the need to sum up the counts in each CPU is infrequent. This patch (of 2): The batch size in percpu_counter_add_batch should be very large in heavy writing and rare reading case. Add the "_local" version, and mostly it will do local adding, reduce the global updating and mitigate lock contention in writing. Link: https://lkml.kernel.org/r/20220913192538.3023708-1-jiebin.sun@intel.com Link: https://lkml.kernel.org/r/20220913192538.3023708-2-jiebin.sun@intel.com Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vasily Averin <vasily.averin@linux.dev> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03fs/ocfs2: fix repeated words in commentswangjianli
Delete the redundant word 'to'. Link: https://lkml.kernel.org/r/20220908130036.31149-1-wangjianli@cdjrlc.com Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03relay: use kvcalloc to alloc page array in relay_alloc_page_arraywuchi
kvcalloc() is safer because it will check the integer overflows, and using it will simple the logic of allocation size. Link: https://lkml.kernel.org/r/20220909101025.82955-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03proc: make config PROC_CHILDREN depend on PROC_FSLukas Bulwahn
Commit 2e13ba54a268 ("fs, proc: introduce CONFIG_PROC_CHILDREN") introduces the config PROC_CHILDREN to configure kernels to provide the /proc/<pid>/task/<tid>/children file. When one deselects PROC_FS for kernel builds without /proc/, the config PROC_CHILDREN has no effect anymore, but is still visible in menuconfig. Add the dependency on PROC_FS to make the PROC_CHILDREN option disappear for kernel builds without /proc/. Link: https://lkml.kernel.org/r/20220909122529.1941-1-lukas.bulwahn@gmail.com Fixes: 2e13ba54a268 ("fs, proc: introduce CONFIG_PROC_CHILDREN") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Iago López Galeiras <iago@endocode.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03fs: uninline inode_maybe_inc_iversion()Andrew Morton
It has many callsites and is large. text data bss dec hex filename 91796 15984 512 108292 1a704 mm/shmem.o-before 91180 15984 512 107676 1a49c mm/shmem.o-after Acked-by: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03firmware: google: test spinlock on panic path to avoid lockupsGuilherme G. Piccoli
Currently the gsmi driver registers a panic notifier as well as reboot and die notifiers. The callbacks registered are called in atomic and very limited context - for instance, panic disables preemption and local IRQs, also all secondary CPUs (not executing the panic path) are shutdown. With that said, taking a spinlock in this scenario is a dangerous invitation for lockup scenarios. So, fix that by checking if the spinlock is free to acquire in the panic notifier callback - if not, bail-out and avoid a potential hang. Link: https://lkml.kernel.org/r/20220909200755.189679-1-gpiccoli@igalia.com Fixes: 74c5b31c6618 ("driver: Google EFI SMI") Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Evan Green <evgreen@chromium.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: David Gow <davidgow@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julius Werner <jwerner@chromium.org> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03ipc: mqueue: remove unnecessary conditionalsJingyu Wang
iput() already handles null and non-null parameters, so there is no need to use if(). Link: https://lkml.kernel.org/r/20220908185452.76590-1-jingyuwang_vip@163.com Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03init.h: fix spelling typo in commentJiangshan Yi
Fix spelling typo in comment. Link: https://lkml.kernel.org/r/20220905021034.947701-1-13667453960@163.com Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03fs/ocfs2/suballoc.h: fix spelling typo in commentJiangshan Yi
Fix spelling typo in comment. Link: https://lkml.kernel.org/r/20220905061656.1829179-1-13667453960@163.com Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Reported-by: k2ci <kernel-bot@kylinos.cn> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03ocfs2: replace zero-length arrays with DECLARE_FLEX_ARRAY() helperGustavo A. R. Silva
Zero-length arrays are deprecated and we are moving towards adopting C99 flexible-array members, instead. So, replace zero-length array declarations in a couple of structures and unions with the new DECLARE_FLEX_ARRAY() helper macro. This helper allows for a flexible-array member in a union and as only member in a structure. Also, this addresses multiple warnings reported when building with Clang-15 and -Wzero-length-array. Lastly, this will also help memcpy (in a coming hardening update) execute proper bounds-checking on variable length object i_symlink at fs/ocfs2/namei.c:1973: fs/ocfs2/namei.c: 1973 memcpy((char *) fe->id2.i_symlink, symname, l); Link: https://github.com/KSPP/linux/issues/21 Link: https://github.com/KSPP/linux/issues/193 Link: https://github.com/KSPP/linux/issues/197 Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html Link: https://lkml.kernel.org/r/YxKY6O2hmdwNh8r8@work Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03clk: allow building lan966x as a moduleClément Léger
Set the COMMON_CLK_LAN966X option as a tristate and switch from builtin_platform_driver() to module_platform_driver() to allow building and using this driver as a module. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Link: https://lore.kernel.org/r/20220617103306.489466-1-clement.leger@bootlin.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-10-03clk: clk-xgene: simplify if-if to if-elseYihao Han
Replace `if (!pclk->param.csr_reg)` with `else` for simplification and add curly brackets according to the kernel coding style: "Do not unnecessarily use braces where a single statement will do." ... "This does not apply if only one branch of a conditional statement is a single statement; in the latter case use braces in both branches" Please refer to: https://www.kernel.org/doc/html/v5.17-rc8/process/coding-style.html Signed-off-by: Yihao Han <hanyihao@vivo.com> Link: https://lore.kernel.org/r/20220408130617.14963-1-hanyihao@vivo.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-10-03clk: ast2600: BCLK comes from EPLLJoel Stanley
This correction was made in the u-boot SDK recently. There are no in-tree users of this clock so the impact is minimal. Fixes: d3d04f6c330a ("clk: Add support for AST2600 SoC") Link: https://github.com/AspeedTech-BMC/u-boot/commit/8ad54a5ae15f27fea5e894cc2539a20d90019717 Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20220421040426.171256-1-joel@jms.id.au Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-10-03mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbolJohannes Weiner
Since 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP. Update the sites accordingly and drop the symbol. [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM, which hasn't been a user-visible symbol for over half a decade. ] Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm: memcontrol: use do_memsw_account() in a few more placesJohannes Weiner
It's slightly more descriptive and consistent with other places that distinguish cgroup1's combined memory+swap accounting scheme from cgroup2's dedicated swap accounting. Link: https://lkml.kernel.org/r/20220926135704.400818-4-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm: memcontrol: deprecate swapaccounting=0 modeJohannes Weiner
The swapaccounting= commandline option already does very little today. To close a trivial containment failure case, the swap ownership tracking part of the swap controller has recently become mandatory (see commit 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control") for details), which makes up the majority of the work during swapout, swapin, and the swap slot map. The only thing left under this flag is the page_counter operations and the visibility of the swap control files in the first place, which are rather meager savings. There also aren't many scenarios, if any, where controlling the memory of a cgroup while allowing it unlimited access to a global swap space is a workable resource isolation strategy. On the other hand, there have been several bugs and confusion around the many possible swap controller states (cgroup1 vs cgroup2 behavior, memory accounting without swap accounting, memcg runtime disabled). This puts the maintenance overhead of retaining the toggle above its practical benefits. Deprecate it. Link: https://lkml.kernel.org/r/20220926135704.400818-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabledJohannes Weiner
Patch series "memcg swap fix & cleanups". This patch (of 4): Since commit 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control"), the cgroup swap arrays are used to track memory ownership at the time of swap readahead and swapoff, even if swap space *accounting* has been turned off by the user via swapaccount=0 (which sets cgroup_memory_noswap). However, the patch was overzealous: by simply dropping the cgroup_memory_noswap conditionals in the swapon, swapoff and uncharge path, it caused the cgroup arrays being allocated even when the memory controller as a whole is disabled. This is a waste of that memory. Restore mem_cgroup_disabled() checks, implied previously by cgroup_memory_noswap, in the swapon, swapoff, and swap_entry_free callbacks. Link: https://lkml.kernel.org/r/20220926135704.400818-1-hannes@cmpxchg.org Link: https://lkml.kernel.org/r/20220926135704.400818-2-hannes@cmpxchg.org Fixes: 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm/secretmem: remove reduntant return valueXiu Jianfeng
The return value @ret is always 0, so remove it and return 0 directly. Link: https://lkml.kernel.org/r/20220920012205.246217-1-xiujianfeng@huawei.com Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm/hugetlb: add available_huge_pages() funcXin Hao
In hugetlb.c there are several places which compare the values of 'h->free_huge_pages' and 'h->resv_huge_pages', it looks a bit messy, so add a new available_huge_pages() function to do these. Link: https://lkml.kernel.org/r/20220922021929.98961-1-xhao@linux.alibaba.com Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm: remove unused inline functions from include/linux/mm_inline.hGaosheng Cui
Remove the following unused inline functions from mm_inline.h: 1. All uses of add_page_to_lru_list_tail() have been removed since commit 7a3dbfe8a52b ("mm/swap: convert lru_deactivate_file to a folio_batch"), and it can be replaced by lruvec_add_folio_tail(). 2. All uses of __clear_page_lru_flags() have been removed since commit 188e8caee968 ("mm/swap: convert __page_cache_release() to use a folio"), and it can be replaced by __folio_clear_lru_flags(). They are useless, so remove them. Link: https://lkml.kernel.org/r/20220922110935.1495099-1-cuigaosheng1@huawei.com Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memoryZach O'Keefe
Add :collapse mod to userfaultfd selftest. Currently this mod is only valid for "shmem" test type, but could be used for other test types. When provided, memory allocated by ->allocate_area() will be hugepage-aligned enforced to be hugepage-sized. userfaultf_minor_test, after the UFFD-registered mapping has been populated by UUFD minor fault handler, attempt to MADV_COLLAPSE the UFFD-registered mapping to collapse the memory into a pmd-mapped THP. This test is meant to be a functional test of what occurs during UFFD-driven live migration of VMs backed by huge tmpfs where, after a hugepage-sized region has been successfully migrated (in native page-sized chunks, to avoid latency of fetched a hugepage over the network), we want to reclaim previous VM performance by remapping it at the PMD level. Link: https://lkml.kernel.org/r/20220907144521.3115321-11-zokeefe@google.com Link: https://lkml.kernel.org/r/20220922224046.1143204-11-zokeefe@google.com Signed-off-by: Zach O'Keefe <zokeefe@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Kennelly <ckennelly@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com> Cc: SeongJae Park <sj@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>