summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-04-10perf/core: Optimize perf_adjust_freq_unthr_context()Namhyung Kim
It was unnecessarily disabling and enabling PMUs for each event. It should be done at PMU level. Add pmu_ctx->nr_freq counter to check it at each PMU. As PMU context has separate active lists for pinned group and flexible group, factor out a new function to do the job. Another minor optimization is that it can skip PMUs w/ CAP_NO_INTERRUPT even if it needs to unthrottle sampling events. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Mingwei Zhang <mizhang@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20240207050545.2727923-1-namhyung@kernel.org
2024-04-09net: remove napi_frag_unrefMina Almasry
With the changes in the last patches, napi_frag_unref() is now reduandant. Remove it and use skb_page_unref directly. Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240408153000.2152844-4-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-09net: make napi_frag_unref reuse skb_page_unrefMina Almasry
The implementations of these 2 functions are almost identical. Remove the implementation of napi_frag_unref, and make it a call into skb_page_unref so we don't duplicate the implementation. Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240408153000.2152844-2-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-09ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addrJiri Benc
Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it still means hlist_for_each_entry_rcu can return an item that got removed from the list. The memory itself of such item is not freed thanks to RCU but nothing guarantees the actual content of the memory is sane. In particular, the reference count can be zero. This can happen if ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough timing, this can happen: 1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry. 2. Then, the whole ipv6_del_addr is executed for the given entry. The reference count drops to zero and kfree_rcu is scheduled. 3. ipv6_get_ifaddr continues and tries to increments the reference count (in6_ifa_hold). 4. The rcu is unlocked and the entry is freed. 5. The freed entry is returned. Prevent increasing of the reference count in such case. The name in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe. [ 41.506330] refcount_t: addition on 0; use-after-free. [ 41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130 [ 41.507413] Modules linked in: veth bridge stp llc [ 41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14 [ 41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [ 41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130 [ 41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff [ 41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282 [ 41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000 [ 41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900 [ 41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff [ 41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000 [ 41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48 [ 41.514086] FS: 00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000 [ 41.514726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0 [ 41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 41.516799] Call Trace: [ 41.517037] <TASK> [ 41.517249] ? __warn+0x7b/0x120 [ 41.517535] ? refcount_warn_saturate+0xa5/0x130 [ 41.517923] ? report_bug+0x164/0x190 [ 41.518240] ? handle_bug+0x3d/0x70 [ 41.518541] ? exc_invalid_op+0x17/0x70 [ 41.520972] ? asm_exc_invalid_op+0x1a/0x20 [ 41.521325] ? refcount_warn_saturate+0xa5/0x130 [ 41.521708] ipv6_get_ifaddr+0xda/0xe0 [ 41.522035] inet6_rtm_getaddr+0x342/0x3f0 [ 41.522376] ? __pfx_inet6_rtm_getaddr+0x10/0x10 [ 41.522758] rtnetlink_rcv_msg+0x334/0x3d0 [ 41.523102] ? netlink_unicast+0x30f/0x390 [ 41.523445] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 41.523832] netlink_rcv_skb+0x53/0x100 [ 41.524157] netlink_unicast+0x23b/0x390 [ 41.524484] netlink_sendmsg+0x1f2/0x440 [ 41.524826] __sys_sendto+0x1d8/0x1f0 [ 41.525145] __x64_sys_sendto+0x1f/0x30 [ 41.525467] do_syscall_64+0xa5/0x1b0 [ 41.525794] entry_SYSCALL_64_after_hwframe+0x72/0x7a [ 41.526213] RIP: 0033:0x7fbc4cfcea9a [ 41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 [ 41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a [ 41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005 [ 41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c [ 41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b [ 41.531573] </TASK> Fixes: 5c578aedcb21d ("IPv6: convert addrconf hash list to RCU") Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jiri Benc <jbenc@redhat.com> Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-09net: add copy_safe_from_sockptr() helperEric Dumazet
copy_from_sockptr() helper is unsafe, unless callers did the prior check against user provided optlen. Too many callers get this wrong, lets add a helper to fix them and avoid future copy/paste bugs. Instead of : if (optlen < sizeof(opt)) { err = -EINVAL; break; } if (copy_from_sockptr(&opt, optval, sizeof(opt)) { err = -EFAULT; break; } Use : err = copy_safe_from_sockptr(&opt, sizeof(opt), optval, optlen); if (err) break; Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240408082845.3957374-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-09fs: Rename SB_I_EVM_UNSUPPORTED to SB_I_EVM_HMAC_UNSUPPORTEDStefan Berger
Now that EVM supports RSA signatures for previously completely unsupported filesystems rename the flag SB_I_EVM_UNSUPPORTED to SB_I_EVM_HMAC_UNSUPPORTED to reflect that only HMAC is not supported. Suggested-by: Amir Goldstein <amir73il@gmail.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-04-09evm: Store and detect metadata inode attributes changesStefan Berger
On stacked filesystem the metadata inode may be different than the one file data inode and therefore changes to it need to be detected independently. Therefore, store the i_version, device number, and inode number associated with the file metadata inode. Implement a function to detect changes to the inode and if a change is detected reset the evm_status. This function will be called by IMA when IMA detects that the metadata inode is different from the file's inode. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-04-09ima: Move file-change detection variables into new structureStefan Berger
Move all the variables used for file change detection into a structure that can be used by IMA and EVM. Implement an inline function for storing the identification of an inode and one for detecting changes to an inode based on this new structure. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-04-09security: allow finer granularity in permitting copy-up of security xattrsStefan Berger
Copying up xattrs is solely based on the security xattr name. For finer granularity add a dentry parameter to the security_inode_copy_up_xattr hook definition, allowing decisions to be based on the xattr content as well. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM,SELinux) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-04-09io-uring: correct typo in comment for IOU_F_TWQ_LAZY_WAKEHaiyue Wang
The 'r' key is near to 't' key, that makes 'with' to be 'wirh' ? :) Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Link: https://lore.kernel.org/r/20240409173531.846714-1-haiyue.wang@intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-09bpf: Add support for certain atomics in bpf_arena to x86 JITAlexei Starovoitov
Support atomics in bpf_arena that can be JITed as a single x86 instruction. Instructions that are JITed as loops are not supported at the moment, since they require more complex extable and loop logic. JITs can choose to do smarter things with bpf_jit_supports_insn(). Like arm64 may decide to support all bpf atomics instructions when emit_lse_atomic is available and none in ll_sc mode. bpf_jit_supports_percpu_insn(), bpf_jit_supports_ptr_xchg() and other such callbacks can be replaced with bpf_jit_supports_insn() in the future. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240405231134.17274-1-alexei.starovoitov@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-04-09Merge branch 'work.iov_iter' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into read_iter * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: new helper: copy_to_iter_full()
2024-04-09compiler.h: Add missing quote in macro commentThorsten Blum
Add a missing doublequote in the __is_constexpr() macro comment. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-09ASoC: ti: davinci-i2s: Remove the unused clk_input_pin attributeBastien Curutchet
The clk_input_pin attribute of davinci_mcbsp_dev struct is not set since commit 257ade78b601 ("ASoC: davinci-i2s: Convert to use edma-pcm"). Remove the attribute. Keep the behaviour of the MCBSP_CLKR case as MCBSP_CLKR == 0. I can't test the BC_FP format so I added back the initial comment that was removed by commit ec6375533748 ("ASoC: DaVinci: Added selection of clk input pin for McBSP"). This was the last dependency to linux/platform_data/davinci_asp.h so it is not included anymore. Remove the enum mcbsp_clk_input_pin from davinci_asp.h as it is not used anywhere else. Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://msgid.link/r/20240402071213.11671-4-bastien.curutchet@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-04-09cpumask: add cpumask_any_and_but()Mark Rutland
In some cases, it's useful to be able to select a random cpu from the intersection of two masks, excluding a particular CPU. For example, in some systems an uncore PMU is shared by a subset of CPUs, and management of this PMU is assigned to some arbitrary CPU in this set. Whenever the management CPU is hotplugged out, we wish to migrate responsibility to another arbitrary CPU which is both in this set and online. Today we can use cpumask_any_and() to select an arbitrary CPU in the intersection of two masks. We can also use cpumask_any_but() to select any arbitrary cpu in a mask excluding, a particular CPU. To do both, we either need to use a temporary cpumask, which is wasteful, or use some lower-level cpumask helpers, which can be unclear. This patch adds a new cpumask_any_and_but() to cater for these cases. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-kernel@vger.kernel.org Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20240403155950.2068109-2-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09firmware: qcom: uefisecapp: Fix memory related IO errors and crashesMaximilian Luz
It turns out that while the QSEECOM APP_SEND command has specific fields for request and response buffers, uefisecapp expects them both to be in a single memory region. Failure to adhere to this has (so far) resulted in either no response being written to the response buffer (causing an EIO to be emitted down the line), the SCM call to fail with EINVAL (i.e., directly from TZ/firmware), or the device to be hard-reset. While this issue can be triggered deterministically, in the current form it seems to happen rather sporadically (which is why it has gone unnoticed during earlier testing). This is likely due to the two kzalloc() calls (for request and response) being directly after each other. Which means that those likely return consecutive regions most of the time, especially when not much else is going on in the system. Fix this by allocating a single memory region for both request and response buffers, properly aligning both structs inside it. This unfortunately also means that the qcom_scm_qseecom_app_send() interface needs to be restructured, as it should no longer map the DMA regions separately. Therefore, move the responsibility of DMA allocation (or mapping) to the caller. Fixes: 759e7a2b62eb ("firmware: Add support for Qualcomm UEFI Secure Application") Cc: stable@vger.kernel.org # 6.7 Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Konrad Dybcio <konrad.dybcio@linaro.org> # X13s Link: https://lore.kernel.org/r/20240406130125.1047436-1-luzmaximilian@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-04-09fs/proc: Skip bootloader comment if no embedded kernel parametersMasami Hiramatsu
If the "bootconfig" kernel command-line argument was specified or if the kernel was built with CONFIG_BOOT_CONFIG_FORCE, but if there are no embedded kernel parameter, omit the "# Parameters from bootloader:" comment from the /proc/bootconfig file. This will cause automation to fall back to the /proc/cmdline file, which will be identical to the comment in this no-embedded-kernel-parameters case. Link: https://lore.kernel.org/all/20240409044358.1156477-2-paulmck@kernel.org/ Fixes: 8b8ce6c75430 ("fs/proc: remove redundant comments from /proc/bootconfig") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-09PCI/DOE: Support discovery version 2Alexey Kardashevskiy
PCIe r6.1, sec 6.30.1.1 defines a "DOE Discovery Version" field in the DOE Discovery Request Data Object Contents (3rd DW) as: 15:8 DOE Discovery Version – must be 02h if the Capability Version in the Data Object Exchange Extended Capability is 02h or greater. Add support for the version on devices with the DOE v2 capability. Link: https://lore.kernel.org/r/20240307022006.3657433-1-aik@amd.com Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2024-04-09serial: max3100: Make struct plat_max3100 localAndy Shevchenko
There is no user of the struct plat_max3100 outside the driver. Inline its contents into the driver. While at it, drop outdated example in the comment. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240402195306.269276-5-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09printk: Save console options for add_preferred_console_match()Tony Lindgren
Driver subsystems may need to translate the preferred console name to the character device name used. We already do some of this in console_setup() with a few hardcoded names, but that does not scale well. The console options are parsed early in console_setup(), and the consoles are added with __add_preferred_console(). At this point we don't know much about the character device names and device drivers getting probed. To allow driver subsystems to set up a preferred console, let's save the kernel command line console options. To add a preferred console from a driver subsystem with optional character device name translation, let's add a new function add_preferred_console_match(). This allows the serial core layer to support console=DEVNAME:0.0 style hardware based addressing in addition to the current console=ttyS0 style naming. And we can start moving console_setup() character device parsing to the driver subsystem specific code. We use a separate array from the console_cmdline array as the character device name and index may be unknown at the console_setup() time. And eventually there's no need to call __add_preferred_console() until the subsystem is ready to handle the console. Adding the console name in addition to the character device name, and a flag for an added console, could be added to the struct console_cmdline. And the console_cmdline array handling could be modified accordingly. But that complicates things compared saving the console options, and then adding the consoles when the subsystems handling the consoles are ready. Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240327110021.59793-2-tony@atomide.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09tty: serial: switch from circ_buf to kfifoJiri Slaby (SUSE)
Switch from struct circ_buf to proper kfifo. kfifo provides much better API, esp. when wrap-around of the buffer needs to be taken into account. Look at pl011_dma_tx_refill() or cpm_uart_tx_pump() changes for example. Kfifo API can also fill in scatter-gather DMA structures, so it easier for that use case too. Look at lpuart_dma_tx() for example. Note that not all drivers can be converted to that (like atmel_serial), they handle DMA specially. Note that usb-serial uses kfifo for TX for ages. omap needed a bit more care as it needs to put a char into FIFO to start the DMA transfer when OMAP_DMA_TX_KICK is set. In that case, we have to do kfifo_dma_out_prepare twice: once to find out the tx_size (to find out if it is worths to do DMA at all -- size >= 4), the second time for the actual transfer. All traces of circ_buf are removed from serial_core.h (and its struct uart_state). Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Al Cooper <alcooperx@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com> Cc: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Richard Genoud <richard.genoud@gmail.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Claudiu Beznea <claudiu.beznea@tuxon.dev> Cc: Alexander Shiyan <shc_work@mail.ru> Cc: Baruch Siach <baruch@tkos.co.il> Cc: Maciej W. Rozycki <macro@orcam.me.uk> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Taichi Sugaya <sugaya.taichi@socionext.com> Cc: Takao Orito <orito.takao@socionext.com> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Pali Rohár <pali@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Laxman Dewangan <ldewangan@nvidia.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: Orson Zhai <orsonzhai@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: David S. Miller <davem@davemloft.net> Cc: Hammer Hsieh <hammerh0314@gmail.com> Cc: Peter Korsgaard <jacmet@sunsite.dk> Cc: Timur Tabi <timur@kernel.org> Cc: Michal Simek <michal.simek@amd.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20240405060826.2521-13-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09kfifo: fix typos in kernel-docJiri Slaby (SUSE)
Obviously: "This macro finish" -> "This macro finishes" and similar. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240405060826.2521-9-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09kfifo: add kfifo_dma_out_prepare_mapped()Jiri Slaby (SUSE)
When the kfifo buffer is already dma-mapped, one cannot use the kfifo API to fill in an SG list. Add kfifo_dma_in_prepare_mapped() which allows exactly this. A mapped dma_addr_t is passed and it is filled into provided sgl too. Including the dma_len. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240405060826.2521-8-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09kfifo: add kfifo_out_linear{,_ptr}()Jiri Slaby (SUSE)
These are helpers which are going to be used in the serial layer. We need a wrapper around kfifo which provides us with a tail (sometimes "tail" offset, sometimes a pointer) to the kfifo data. And which returns count of available data -- but not larger than to the end of the buffer (hence _linear in the names). I.e. something like CIRC_CNT_TO_END() in the legacy circ_buf. This patch adds such two helpers. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240405060826.2521-4-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09kfifo: introduce and use kfifo_skip_count()Jiri Slaby (SUSE)
kfifo_skip_count() is an extended version of kfifo_skip(), accepting also count. This will be useful in the serial code later. Now, it can be used to implement both kfifo_skip() and kfifo_dma_out_finish(). In the latter, 'len' only needs to be divided by 'type' size (as it was until now). And stop using statement expressions when the return value is cast to 'void'. Use classic 'do {} while (0)' instead. Note: perhaps we should skip 'count' records for the 'recsize' case, but the original (kfifo_dma_out_finish()) used to skip only one record. So this is kept unchanged and 'count' is still ignored in the recsize case. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240405060826.2521-3-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09kfifo: drop __kfifo_dma_out_finish_r()Jiri Slaby (SUSE)
It is the same as __kfifo_skip_r(), so: * drop __kfifo_dma_out_finish_r() completely, and * replace its (only) use by __kfifo_skip_r(). Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240405060826.2521-2-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09rcu-tasks: Make Tasks RCU wait idly for grace-period delaysPaul E. McKenney
Currently, all waits for grace periods sleep at TASK_UNINTERRUPTIBLE, regardless of RCU flavor. This has worked well, but there have been cases where a longer-than-average Tasks RCU grace period has triggered softlockup splats, many of them, before the Tasks RCU CPU stall warning appears. These softlockup splats unnecessarily consume console bandwidth and complicate diagnosis of the underlying problem. Plus a long but not pathologically long Tasks RCU grace period might trigger a few softlockup splats before completing normally, which generates noise for no good reason. This commit therefore causes Tasks RCU grace periods to sleep at TASK_IDLE priority. If there really is a persistent problem, the eventual Tasks RCU CPU stall warning will flag it, and without the extra noise. Reported-by: Breno Leitao <leitao@debian.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-04-09tcp: replace TCP_SKB_CB(skb)->tcp_tw_isn with a per-cpu fieldEric Dumazet
TCP can transform a TIMEWAIT socket into a SYN_RECV one from a SYN packet, and the ISN of the SYNACK packet is normally generated using TIMEWAIT tw_snd_nxt : tcp_timewait_state_process() ... u32 isn = tcptw->tw_snd_nxt + 65535 + 2; if (isn == 0) isn++; TCP_SKB_CB(skb)->tcp_tw_isn = isn; return TCP_TW_SYN; This SYN packet also bypasses normal checks against listen queue being full or not. tcp_conn_request() ... __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn; ... /* TW buckets are converted to open requests without * limitations, they conserve resources and peer is * evidently real one. */ if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) { want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name); if (!want_cookie) goto drop; } This was using TCP_SKB_CB(skb)->tcp_tw_isn field in skb. Unfortunately this field has been accidentally cleared after the call to tcp_timewait_state_process() returning TCP_TW_SYN. Using a field in TCP_SKB_CB(skb) for a temporary state is overkill. Switch instead to a per-cpu variable. As a bonus, we do not have to clear tcp_tw_isn in TCP receive fast path. It is temporarily set then cleared only in the TCP_TW_SYN dance. Fixes: 4ad19de8774e ("net: tcp6: fix double call of tcp_v6_fill_cb()") Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-09tcp: propagate tcp_tw_isn via an extra parameter to ->route_req()Eric Dumazet
tcp_v6_init_req() reads TCP_SKB_CB(skb)->tcp_tw_isn to find out if the request socket is created by a SYN hitting a TIMEWAIT socket. This has been buggy for a decade, lets directly pass the information from tcp_conn_request(). This is a preparatory patch to make the following one easier to review. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-09fs: Add FOP_HUGE_PAGESMatthew Wilcox (Oracle)
Instead of checking for specific file_operations, add a bit to file_operations which denotes a file that only contain hugetlb pages. This lets us make hugetlbfs_file_operations static, and removes is_file_shm_hugepages() completely. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240407201122.3783877-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-09Merge tag 'v6.9-rc3' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-09eth: Move IPv4/IPv6 multicast address bases to their own symbolsDiogo Ivo
As these addresses can be useful outside of checking if an address is a multicast address (for example in device drivers) make them accessible to users of etherdevice.h to avoid code duplication. Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-09Merge tag 'v6.9-rc3' into x86/cpu, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-09RDMA/hns: Support DSCPJunxian Huang
Add support for DSCP configuration. For DSCP, get dscp-prio mapping via hns3 nic driver api .get_dscp_prio() and fill the SL (in WQE for UD or in QPC for RC) with the priority value. The prio-tc mapping is configured to HW by hns3 nic driver. HW will select a corresponding TC according to SL and the prio-tc mapping. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240315093551.1650088-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-09media: v4l2-subdev: Remove non-pad dv timing callbacksPaweł Anikiel
After the conversion of dv timing calls to use a pad argument is done, remove the old callbacks. Update the subdev ioctl handlers to use the new callbacks. Signed-off-by: Paweł Anikiel <panikiel@google.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-04-09media: v4l2-subdev: Add pad versions of dv timing subdev callsPaweł Anikiel
Currently, subdev dv timing calls (i.e. g/s/query_dv_timings) are video ops without a pad argument. This is a problem if the subdevice can have different dv timings for each pad (e.g. a DisplayPort receiver with multiple virtual channels). To solve this, change these calls to include a pad argument, and put them into pad ops. Keep the old ones temporarily to make the switch easier. Signed-off-by: Paweł Anikiel <panikiel@google.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-04-08md: Fix overflow in is_mddev_idleLi Nan
UBSAN reports this problem: UBSAN: Undefined behaviour in drivers/md/md.c:8175:15 signed integer overflow: -2147483291 - 2072033152 cannot be represented in type 'int' Call trace: dump_backtrace+0x0/0x310 show_stack+0x28/0x38 dump_stack+0xec/0x15c ubsan_epilogue+0x18/0x84 handle_overflow+0x14c/0x19c __ubsan_handle_sub_overflow+0x34/0x44 is_mddev_idle+0x338/0x3d8 md_do_sync+0x1bb8/0x1cf8 md_thread+0x220/0x288 kthread+0x1d8/0x1e0 ret_from_fork+0x10/0x18 'curr_events' will overflow when stat accum or 'sync_io' is greater than INT_MAX. Fix it by changing sync_io, last_events and curr_events to 64bit. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240117031946.2324519-2-linan666@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2024-04-08Merge patch series "scsi: documentation: clean up docs and fix kernel-doc"Martin K. Petersen
Randy Dunlap <rdunlap@infradead.org> says: Clean up some SCSI doc files and fix kernel-doc in 6 header files in include/scsi/. Link: https://lore.kernel.org/r/20240408025425.18778-1-rdunlap@infradead.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: scsi_transport_srp: Fix a couple of kernel-doc warningsRandy Dunlap
Add a struct short description and a function return value to prevent kernel-doc warnings: scsi_transport_srp.h:77: warning: missing initial short description on line: * struct srp_function_template scsi_transport_srp.h:132: warning: No description found for return value of 'srp_chkready' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-9-rdunlap@infradead.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: scsi_transport_fc: Add kernel-doc for function returnRandy Dunlap
Add function return value to prevent a kernel-doc warning: scsi_transport_fc.h:780: warning: No description found for return value of 'fc_remote_port_chkready' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-8-rdunlap@infradead.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: core: Add function return kernel-doc for 2 functionsRandy Dunlap
Add missing function return values to prevent kernel-doc warnings: scsi.h:75: warning: No description found for return value of 'scsi_status_is_check_condition' scsi.h:202: warning: No description found for return value of 'scsi_status_is_good' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-7-rdunlap@infradead.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: libfcoe: Fix a slew of kernel-doc warningsRandy Dunlap
Fix all kernel-doc warnings in <scsi/libfcoe.h>: libfcoe.h:163: warning: Function parameter or struct member 'ctlr' not described in 'fcoe_ctlr_priv' libfcoe.h:163: warning: Excess function parameter 'cltr' description in 'fcoe_ctlr_priv' libfcoe.h:163: warning: No description found for return value of 'fcoe_ctlr_priv' libfcoe.h:218: warning: Function parameter or struct member 'fd_flags' not described in 'fcoe_fcf' libfcoe.h:218: warning: Excess struct member 'event' description in 'fcoe_fcf' libfcoe.h:240: warning: Function parameter or struct member 'rdata' not described in 'fcoe_rport' libfcoe.h:273: warning: No description found for return value of 'is_fip_mode' libfcoe.h:332: warning: Function parameter or struct member 'crc_eof_page' not described in 'fcoe_percpu_s' libfcoe.h:332: warning: Function parameter or struct member 'lock' not described in 'fcoe_percpu_s' libfcoe.h:332: warning: Excess struct member 'page' description in 'fcoe_percpu_s' libfcoe.h:362: warning: Function parameter or struct member 'data_src_addr' not described in 'fcoe_port' libfcoe.h:362: warning: Function parameter or struct member 'get_netdev' not described in 'fcoe_port' libfcoe.h:362: warning: Excess struct member 'data_srt_addr' description in 'fcoe_port' libfcoe.h:369: warning: No description found for return value of 'fcoe_get_netdev' libfcoe.h:386: warning: missing initial short description on line: * struct netdev_list libfcoe.h:393: warning: expecting prototype for struct netdev_list. Prototype was for struct fcoe_netdev_mapping instead Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-6-rdunlap@infradead.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: iser: Fix @read_stag kernel-doc warningRandy Dunlap
Correct kernel-doc comments for struct iser_ctrl to prevent warnings: iser.h:76: warning: Function parameter or struct member 'read_stag' not described in 'iser_ctrl' iser.h:76: warning: Excess struct member 'reaf_stag' description in 'iser_ctrl' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-5-rdunlap@infradead.org Acked-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08scsi: core: Add kernel-doc for scsi_msg_to_host_byte()Randy Dunlap
Add entries for missing documentation to prevent kernel-doc warnings: scsi_cmnd.h:365: warning: Function parameter or struct member 'cmd' not described in 'scsi_msg_to_host_byte' scsi_cmnd.h:365: warning: Function parameter or struct member 'msg' not described in 'scsi_msg_to_host_byte' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240408025425.18778-4-rdunlap@infradead.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-08drm/xe/uapi: Restore flags VM_BIND_FLAG_READONLY and VM_BIND_FLAG_IMMEDIATEFrancois Dugast
The commit 84a1ed5e6756 ("drm/xe/uapi: Remove unused flags") is partially reverted. At the time, flags not used by user space were removed during cleanup. Some flags now needed by the compute runtime are brought back in this commit: - DRM_XE_VM_BIND_FLAG_READONLY is used to write protect kernel ISA thus preventing accidental overwrites. - DRM_XE_VM_BIND_FLAG_IMMEDIATE is used to trigger mapping at the time of binding in order to prevent faulting at execution time. The changes in the compute runtime are ready and approved, see link below. v2: Include a link to the PR in the commit message (Matthew Brost) v3: Update kernel doc and improve commit message (Lucas De Marchi) Cc: Mateusz Jablonski <mateusz.jablonski@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://github.com/intel/compute-runtime/pull/717 Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240329124403.7-1-francois.dugast@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-08wifi: mac80211: extend IEEE80211_KEY_FLAG_GENERATE_MMIE to other ciphersMichael-CY Lee
Extend the flag IEEE80211_KEY_FLAG_GENERATE_MMIE to BIP-CMAC-256, BIP-GMAC-128 and BIP-GMAC-256 for the same reason and in the same way that the flag was added originally in commit a0b4496a4368 ("mac80211: add IEEE80211_KEY_FLAG_GENERATE_MMIE to ieee80211_key_flags"). Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com> Link: https://msgid.link/20240326003036.15215-1-michael-cy.lee@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-08wifi: mac80211: Add missing return value documentationJeff Johnson
kernel-doc is reporting some warnings, so fix them: % scripts/kernel-doc -Wall -Werror -none include/net/mac80211.h include/net/mac80211.h:2056: warning: No description found for return value of 'wdev_to_ieee80211_vif' include/net/mac80211.h:2066: warning: No description found for return value of 'ieee80211_vif_to_wdev' include/net/mac80211.h:5603: warning: No description found for return value of 'ieee80211_beacon_cntdwn_is_complete' include/net/mac80211.h:5968: warning: No description found for return value of 'ieee80211_gtk_rekey_add' include/net/mac80211.h:6350: warning: No description found for return value of 'ieee80211_find_sta_by_link_addrs' include/net/mac80211.h:6478: warning: No description found for return value of 'ieee80211_txq_airtime_check' include/net/mac80211.h:6981: warning: No description found for return value of 'rate_control_set_rates' include/net/mac80211.h:7142: warning: No description found for return value of 'ieee80211_tx_prepare_skb' include/net/mac80211.h:7156: warning: No description found for return value of 'ieee80211_parse_tx_radiotap' include/net/mac80211.h:7277: warning: No description found for return value of 'ieee80211_tx_dequeue' include/net/mac80211.h:7292: warning: No description found for return value of 'ieee80211_tx_dequeue_ni' include/net/mac80211.h:7324: warning: No description found for return value of 'ieee80211_next_txq' include/net/mac80211.h:7405: warning: No description found for return value of 'ieee80211_txq_may_transmit' include/net/mac80211.h:7466: warning: No description found for return value of 'ieee80211_calc_rx_airtime' include/net/mac80211.h:7480: warning: No description found for return value of 'ieee80211_calc_tx_airtime' include/net/mac80211.h:7528: warning: No description found for return value of 'ieee80211_is_tx_data' include/net/mac80211.h:7562: warning: No description found for return value of 'ieee80211_set_active_links' 17 warnings as Errors Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://msgid.link/20240329-mac80211-kdoc-retval-v1-2-5e4d1ad6c250@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-08wifi: mac80211: remove ieee80211_set_hw_80211_encap()Jeff Johnson
While fixing kernel-doc issues it was discovered that the ieee80211_set_hw_80211_encap() prototype doesn't actually have an implementation, so remove it. Note the implementation was removed in commit 6aea26ce5a4c ("mac80211: rework tx encapsulation offload API"). Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://msgid.link/20240329-mac80211-kdoc-retval-v1-1-5e4d1ad6c250@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-08wifi: mac80211: don't use rate mask for scanningJohannes Berg
The rate mask is intended for use during operation, and can be set to only have masks for the currently active band. As such, it cannot be used for scanning which can be on other bands as well. Simply ignore the rate masks during scanning to avoid warnings from incorrect settings. Reported-by: syzbot+fdc5123366fb9c3fdc6d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fdc5123366fb9c3fdc6d Co-developed-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Tested-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://msgid.link/20240326220854.9594cbb418ca.I7f86c0ba1f98cf7e27c2bacf6c2d417200ecea5c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-08cgroup/cpuset: Make cpuset hotplug processing synchronousWaiman Long
Since commit 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside get_online_cpus()"), cpuset hotplug was done asynchronously via a work function. This is to avoid recursive locking of cgroup_mutex. Since then, the cgroup locking scheme has changed quite a bit. A cpuset_mutex was introduced to protect cpuset specific operations. The cpuset_mutex is then replaced by a cpuset_rwsem. With commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on, cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes allow the hotplug code to call into cpuset core directly. The following commits were also merged due to the asynchronous nature of cpuset hotplug processing. - commit b22afcdf04c9 ("cpu/hotplug: Cure the cpusets trainwreck") - commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs") - commit 28b89b9e6f7b ("cpuset: handle race between CPU hotplug and cpuset_hotplug_work") Clean up all these bandages by making cpuset hotplug processing synchronous again with the exception that the call to cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1 cpuset, if necessary, will still be done via a work function due to the existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible to reverse that dependency, but that will require updating a number of different cgroup controllers. This special hotplug code path should be rarely taken anyway. As all the cpuset states will be updated by the end of the hotplug operation, we can revert most the above commits except commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs") which is partially reverted. Also removing some cpus_read_lock trylock attempts in the cpuset partition code as they are no longer necessary since the cpu_hotplug_lock is now held for the whole duration of the cpuset hotplug code path. Signed-off-by: Waiman Long <longman@redhat.com> Tested-by: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>