path: root/include
Age  Commit message  [Author]
2024-04-17  sched/vtime: Remove confusing arch_vtime_task_switch() declaration  [Alexander Gordeev]
Callback arch_vtime_task_switch() is only defined when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is selected. Yet, the function prototype forward declaration is present for CONFIG_VIRT_CPU_ACCOUNTING_GEN variant. Remove it. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/783d7c611864f82b0fb9edf01890b9396f3a549a.1712760275.git.agordeev@linux.ibm.com
2024-04-17  PNP: Add dev_is_pnp() macro  [Guanbing Huang]
Add dev_is_pnp() macro to determine whether the device is a PNP device. Signed-off-by: Guanbing Huang <albanhuang@tencent.com> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Bing Fan <tombinfan@tencent.com> Tested-by: Linheng Du <dylanlhdu@tencent.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/4e68f5557ad53b671ca8103e572163eca52a8f29.1713234515.git.albanhuang@tencent.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
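A helper like this usually reduces to a bus-type comparison; a minimal sketch, assuming the exported pnp_bus_type and not necessarily matching the exact upstream definition:

    /* Sketch: a device is a PNP device iff it sits on the PNP bus. */
    #define dev_is_pnp(dev)    ((dev)->bus == &pnp_bus_type)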
2024-04-16  net: pse-pd: Rectify and adapt the naming of admin_cotrol member of struct pse_control_config  [Kory Maincent (Dent Project)]
In commit 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment"), the 'pse_control_config' structure was introduced, housing a single member labeled 'admin_cotrol' responsible for maintaining the operational state of the PoDL PSE functions. A noticeable typographical error exists in the naming of this field ('cotrol' should be corrected to 'control'), which this commit aims to rectify. Furthermore, with upcoming extensions of this structure to encompass PoE functionalities, the field is being renamed to 'podl_admin_state' to distinctly indicate that this state is tailored specifically for PoDL. Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240414-feature_poe-v8-3-e4bf1e860da5@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
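For illustration only (field type taken from the ethtool PoDL definitions, the rest of the struct omitted), the rename amounts to:

    struct pse_control_config {
            /* was 'admin_cotrol'; renamed and PoDL-scoped as described above */
            enum ethtool_podl_pse_admin_state podl_admin_state;
    };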
2024-04-17  Add bridged amplifiers to cs42l43  [Mark Brown]
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>: In some cs42l43 systems a couple of cs35l56 amplifiers are attached to the cs42l43's SPI and I2S. On Windows the cs42l43 is controlled by an SDCA class driver and these two amplifiers are controlled by firmware running on the cs42l43. However, under Linux the decision was made to interact with the cs42l43 directly, affording the user greater control over the audio system. However, this has resulted in an issue where these two bridged cs35l56 amplifiers are not populated in ACPI and must be added manually. There is at least an SDCA extension unit DT entry we can key off. The process of adding this is handled using a software node: firstly, the ability to add native chip selects to software nodes must be added; secondly, an additional flag for naming the SPI devices is added, which allows the machine driver to key to the correct amplifier. Then finally, the cs42l43 SPI driver adds the two amplifiers directly onto its SPI bus. An additional series will follow soon to add the audio machine driver parts (in the sof-sdw driver), however that is fairly orthogonal to this part of the process, getting the actual amplifiers registered.
2024-04-16  mm/shmem: inline shmem_is_huge() for disabled transparent hugepages  [Sumanth Korikkar]
In order to minimize code size (CONFIG_CC_OPTIMIZE_FOR_SIZE=y), compiler might choose to make a regular function call (out-of-line) for shmem_is_huge() instead of inlining it. When transparent hugepages are disabled (CONFIG_TRANSPARENT_HUGEPAGE=n), it can cause compilation error. mm/shmem.c: In function `shmem_getattr': ./include/linux/huge_mm.h:383:27: note: in expansion of macro `BUILD_BUG' 383 | #define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; }) | ^~~~~~~~~ mm/shmem.c:1148:33: note: in expansion of macro `HPAGE_PMD_SIZE' 1148 | stat->blksize = HPAGE_PMD_SIZE; To prevent the possible error, always inline shmem_is_huge() when transparent hugepages are disabled. Link: https://lkml.kernel.org/r/20240409155407.2322714-1-sumanthk@linux.ibm.com Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
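A sketch of the approach (the parameter list is approximated, not quoted verbatim): forcing the stub inline lets the compiler prove the HPAGE_PMD_SIZE path dead and never emit the BUILD_BUG().

    #ifdef CONFIG_TRANSPARENT_HUGEPAGE
    bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
                       struct mm_struct *mm, unsigned long vm_flags);
    #else
    static __always_inline bool shmem_is_huge(struct inode *inode, pgoff_t index,
                                              bool shmem_huge_force,
                                              struct mm_struct *mm,
                                              unsigned long vm_flags)
    {
            /* Always false without THP; forced inlining lets callers'
             * HPAGE_PMD_SIZE branches be discarded at compile time. */
            return false;
    }
    #endif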
2024-04-16  mm,swapops: update check in is_pfn_swap_entry for hwpoison entries  [Oscar Salvador]
Tony reported that the Machine check recovery was broken in v6.9-rc1, as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors to DRAM. After some more digging and debugging on his side, he realized that this went back to v6.1, with the introduction of commit 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry"). That commit, among other things, introduced swp_offset_pfn(), replacing hwpoison_entry_to_pfn() in its favour. The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), but is_pfn_swap_entry() never got updated to cover hwpoison entries, which means that we would hit the VM_BUG_ON whenever we would call swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. Fix this by updating the check to cover hwpoison entries as well, and update the comment while we are at it. Link: https://lkml.kernel.org/r/20240407130537.16977-1-osalvador@suse.de Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Signed-off-by: Oscar Salvador <osalvador@suse.de> Reported-by: Tony Luck <tony.luck@intel.com> Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ Tested-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Cc: <stable@vger.kernel.org> [6.1.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
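After the fix, the predicate reads roughly as follows (sketch, not a verbatim quote of include/linux/swapops.h):

    static inline bool is_pfn_swap_entry(swp_entry_t entry)
    {
            /* hwpoison entries also encode a PFN, so cover them here */
            return is_migration_entry(entry) || is_device_private_entry(entry) ||
                   is_device_exclusive_entry(entry) || is_hwpoison_entry(entry);
    }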
2024-04-16  cgroup/rstat: add cgroup_rstat_lock helpers and tracepoints  [Jesper Dangaard Brouer]
This commit enhances the ability to troubleshoot the global cgroup_rstat_lock by introducing wrapper helper functions for the lock along with associated tracepoints. Although global, the cgroup_rstat_lock helper APIs and tracepoints take arguments such as cgroup pointer and cpu_in_loop variable. This adjustment is made because flushing occurs per cgroup despite the lock being global. Hence, when troubleshooting, it's important to identify the relevant cgroup. The cpu_in_loop variable is necessary because the global lock may be released within the main flushing loop that traverses CPUs. In the tracepoints, the cpu_in_loop value is set to -1 when acquiring the main lock; otherwise, it denotes the CPU number processed last. The new feature in this patchset is detecting when lock is contended. The tracepoints are implemented with production in mind. For minimum overhead attach to cgroup:cgroup_rstat_lock_contended, which only gets activated when trylock detects lock is contended. A quick production check for issues could be done via this perf commands: perf record -g -e cgroup:cgroup_rstat_lock_contended Next natural question would be asking how long time do lock contenders wait for obtaining the lock. This can be answered by measuring the time between cgroup:cgroup_rstat_lock_contended and cgroup:cgroup_rstat_locked when args->contended is set. Like this bpftrace script: bpftrace -e ' tracepoint:cgroup:cgroup_rstat_lock_contended {@start[tid]=nsecs} tracepoint:cgroup:cgroup_rstat_locked { if (args->contended) { @wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}} interval:s:1 {time("%H:%M:%S "); print(@wait_ns); }' Extending with time spend holding the lock will be more expensive as this also looks at all the non-contended cases. Like this bpftrace script: bpftrace -e ' tracepoint:cgroup:cgroup_rstat_lock_contended {@start[tid]=nsecs} tracepoint:cgroup:cgroup_rstat_locked { @locked[tid]=nsecs; if (args->contended) { @wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}} tracepoint:cgroup:cgroup_rstat_unlock { @locked_ns=hist(nsecs-@locked[tid]); delete(@locked[tid]);} interval:s:1 {time("%H:%M:%S "); print(@wait_ns);print(@locked_ns); }' Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
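A rough sketch of the wrapper described above, assuming the global cgroup_rstat_lock spinlock and the two tracepoints named in the commit (argument lists are illustrative, not the verbatim upstream code):

    static void __cgroup_rstat_lock(struct cgroup *cgrp, int cpu_in_loop)
    {
            bool contended = !spin_trylock_irq(&cgroup_rstat_lock);

            if (contended) {
                    /* cheap to attach to: fires only when the trylock fails */
                    trace_cgroup_rstat_lock_contended(cgrp, cpu_in_loop, contended);
                    spin_lock_irq(&cgroup_rstat_lock);
            }
            trace_cgroup_rstat_locked(cgrp, cpu_in_loop, contended);
    }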
2024-04-16  sched: Add missing memory barrier in switch_mm_cid  [Mathieu Desnoyers]
Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb() which the core scheduler code has depended upon since commit: commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid") If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can unset the actively used cid when it fails to observe active task after it sets lazy_put. There *is* a memory barrier between storing to rq->curr and _return to userspace_ (as required by membarrier), but the rseq mm_cid has stricter requirements: the barrier needs to be issued between store to rq->curr and switch_mm_cid(), which happens earlier than: - spin_unlock(), - switch_to(). So it's fine when the architecture switch_mm() happens to have that barrier already, but less so when the architecture only provides the full barrier in switch_to() or spin_unlock(). It is a bug in the rseq switch_mm_cid() implementation. All architectures that don't have memory barriers in switch_mm(), but rather have the full barrier either in finish_lock_switch() or switch_to() have them too late for the needs of switch_mm_cid(). Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the generic barrier.h header, and use it in switch_mm_cid() for scheduler transitions where switch_mm() is expected to provide a memory barrier. Architectures can override smp_mb__after_switch_mm() if their switch_mm() implementation provides an implicit memory barrier. Override it with a no-op on x86 which implicitly provide this memory barrier by writing to CR3. Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid") Reported-by: levi.yun <yeoreum.yun@arm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64 Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86 Cc: <stable@vger.kernel.org> # 6.4.x Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240415152114.59122-2-mathieu.desnoyers@efficios.com
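Sketched out, the new hook looks roughly like this (file locations indicative, not quoted from the patch):

    /* include/asm-generic/barrier.h: default to a full barrier */
    #ifndef smp_mb__after_switch_mm
    #define smp_mb__after_switch_mm()       smp_mb()
    #endif

    /* arch/x86/include/asm/barrier.h: the CR3 write in switch_mm()
     * already serializes, so the override can be a compiler barrier only */
    #define smp_mb__after_switch_mm()       barrier()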
2024-04-16  af_unix: Try not to hold unix_gc_lock during accept().  [Kuniyuki Iwashima]
Commit dcf70df2048d ("af_unix: Fix up unix_edge.successor for embryo socket.") added spin_lock(&unix_gc_lock) in accept() path, and it caused regression in a stress test as reported by kernel test robot. If the embryo socket is not part of the inflight graph, we need not hold the lock. To decide that in O(1) time and avoid the regression in the normal use case, 1. add a new stat unix_sk(sk)->scm_stat.nr_unix_fds 2. count the number of inflight AF_UNIX sockets in the receive queue under unix_state_lock() 3. move unix_update_edges() call under unix_state_lock() 4. avoid locking if nr_unix_fds is 0 in unix_update_edges() Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202404101427.92a08551-oliver.sang@intel.com Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240413021928.20946-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-16  gpio: swnode: Add ability to specify native chip selects for SPI  [Charles Keepax]
SPI devices can specify a cs-gpios property to enumerate their chip selects. Under device tree, a zero entry in this property can be used to specify that a particular chip select is using the SPI controllers native chip select, for example: cs-gpios = <&gpio1 0 0>, <0>; Here, the second chip select is native. However, when using swnodes there is currently no way to specify a native chip select. The proposal here is to register a swnode_gpio_undefined software node, that can be specified to allow the indication of a native chip select. For example: static const struct software_node_ref_args device_cs_refs[] = { { .node = &device_gpiochip_swnode, .nargs = 2, .args = { 0, GPIO_ACTIVE_LOW }, }, { .node = &swnode_gpio_undefined, .nargs = 0, }, }; Register the swnode as the gpiolib is initialised and check in swnode_get_gpio_device() if the returned node matches swnode_gpio_undefined and return -ENOENT, which matches the behaviour of the device tree system when it encounters a 0 phandle. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20240416100904.3738093-2-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-04-16  ASoC: tracing: Export SND_SOC_DAPM_DIR_OUT to its value  [Steven Rostedt]
The string SND_SOC_DAPM_DIR_OUT is printed in the snd_soc_dapm_path trace event instead of its value: (((REC->path_dir) == SND_SOC_DAPM_DIR_OUT) ? "->" : "<-") User space cannot parse this, as it has no idea what SND_SOC_DAPM_DIR_OUT is. Use TRACE_DEFINE_ENUM() to convert it to its value: (((REC->path_dir) == 1) ? "->" : "<-") So that user space tools, such as perf and trace-cmd, can parse it correctly. Reported-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Fixes: 6e588a0d839b5 ("ASoC: dapm: Consolidate path trace events") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240416000303.04670cdf@rorschach.local.home Signed-off-by: Mark Brown <broonie@kernel.org>
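The fix boils down to exporting the enum to the tracing infrastructure, roughly:

    /* Before: print fmt referenced the symbolic name, opaque to user space:
     *   (((REC->path_dir) == SND_SOC_DAPM_DIR_OUT) ? "->" : "<-")
     * After TRACE_DEFINE_ENUM(), the name is resolved to its value:
     *   (((REC->path_dir) == 1) ? "->" : "<-")
     */
    TRACE_DEFINE_ENUM(SND_SOC_DAPM_DIR_OUT);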
2024-04-16  coresight: Add helpers registering/removing both AMBA and platform drivers  [Anshuman Khandual]
This adds two different helpers i.e coresight_init_driver()/remove_driver() enabling coresight devices to register or remove AMBA and platform drivers. This changes replicator and funnel devices to use above new helpers. Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: James Clark <james.clark@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: coresight@lists.linaro.org Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240314055843.2625883-5-anshuman.khandual@arm.com
2024-04-15  vfs: export remap and write check helpers  [Darrick J. Wong]
Export these functions so that the next patch can use them to check the file ranges being passed to the XFS_IOC_EXCHANGE_RANGE operation. Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-15  Merge tag 'nfsd-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux  [Linus Torvalds]
Pull nfsd fixes from Chuck Lever: - Fix a potential tracepoint crash - Fix NFSv4 GETATTR on big-endian platforms * tag 'nfsd-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: fix endianness issue in nfsd4_encode_fattr4 SUNRPC: Fix rpcgss_context trace event acceptor field
2024-04-15  remove call_{read,write}_iter() functions  [Miklos Szeredi]
These have no clear purpose. This is effectively a revert of commit bb7462b6fd64 ("vfs: use helpers for calling f_op->{read,write}_iter()"). The patch was created with the help of a coccinelle script. Fixes: bb7462b6fd64 ("vfs: use helpers for calling f_op->{read,write}_iter()") Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-04-15  kernel_file_open(): get rid of inode argument  [Al Viro]
always equal to ->dentry->d_inode of the path argument these days. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-04-15  fd_is_open(): move to fs/file.c  [Al Viro]
no users outside that... Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-04-15  close_on_exec(): pass files_struct instead of fdtable  [Al Viro]
both callers are happier that way... Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-04-15  net: dqs: make struct dql more cache efficient  [Breno Leitao]
With the previous change, struct dql->stall_thrs will be in the hot path (at queue side), even if DQS is disabled. The other fields accessed in this function (last_obj_cnt and num_queued) are in the first cache line, so let's move this field (stall_thrs) to the very first cache line, since there is a hole there. This does not change the structure size, since it moves a short (2 bytes) into a 4-byte hole in the first cache line. This is the new structure format now: struct dql { unsigned int num_queued; unsigned int last_obj_cnt; ... short unsigned int stall_thrs; /* XXX 2 bytes hole, try to pack */ ... /* --- cacheline 1 boundary (64 bytes) --- */ ... /* Longest stall detected, reported to user */ short unsigned int stall_max; /* XXX 2 bytes hole, try to pack */ }; Also, read the stall_thrs (now in the very first cache line) earlier, together with dql->num_queued (also in the first cache line). Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240411192241.2498631-5-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-15  net: dql: Optimize stall information population  [Breno Leitao]
When Dynamic Queue Limit (DQL) is set, it always populates stall information through dql_queue_stall(). However, this information is only necessary if a stall threshold is set, stored in struct dql->stall_thrs. dql_queue_stall() is cheap, but not free, since it does have memory barriers and so forth. Do not call dql_queue_stall() if there is no stall threshold set, and save some CPU cycles. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240411192241.2498631-4-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
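A sketch of the gate (the wrapper name and the argument passed to dql_queue_stall() are illustrative, to keep the fragment self-contained):

    /* Hypothetical helper illustrating the check described above. */
    static inline void dql_maybe_queue_stall(struct dql *dql)
    {
            unsigned short stall_thrs = READ_ONCE(dql->stall_thrs);

            if (!stall_thrs)
                    return;                 /* no threshold: skip the barriers */

            dql_queue_stall(dql);           /* only now do the heavier work */
    }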
2024-04-15  net: dql: Separate queue function responsibilities  [Breno Leitao]
The dql_queued() function currently handles both queuing object counts and populating bitmaps for reporting stalls. This commit splits the bitmap population into a separate function, allowing for conditional invocation in scenarios where the feature is disabled. This refactor maintains functionality while improving code organization. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240411192241.2498631-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-15  net: dql: Avoid calling BUG() when WARN() is enough  [Breno Leitao]
If the dql_queued() function receives an invalid argument, WARN about it and continue, instead of crashing the kernel. This was raised by checkpatch while I was refactoring this code (see the following patch/commit): WARNING: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240411192241.2498631-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
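The pattern checkpatch asks for, sketched (DQL_MAX_OBJECT is the existing sanity limit; the surrounding accounting is elided):

    static inline void dql_queued(struct dql *dql, unsigned int count)
    {
            if (WARN_ON_ONCE(count > DQL_MAX_OBJECT))
                    return;                 /* warn and recover instead of BUG() */

            dql->last_obj_cnt = count;
            /* ... rest of the accounting unchanged ... */
    }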
2024-04-15  Replace macro "ARCH_HAVE_EXTRA_ELF_NOTES" with kconfig  [Vignesh Balasubramanian]
"ARCH_HAVE_EXTRA_ELF_NOTES" enables an extra note section in the core dump. Kconfig variable is preferred over ARCH_HAVE_* macro. Co-developed-by: Jini Susan George <jinisusan.george@amd.com> Signed-off-by: Jini Susan George <jinisusan.george@amd.com> Signed-off-by: Vignesh Balasubramanian <vigbalas@amd.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20240412062138.1132841-2-vigbalas@amd.com Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-15  rcu: Add a trace event for synchronize_rcu_normal()  [Uladzislau Rezki (Sony)]
Add an rcu_sr_normal() trace event. It takes three arguments: the first is the name of the RCU flavour, the second is a user id which triggers synchronize_rcu_normal(), and the last is an event. There are two traces in synchronize_rcu_normal(): one on entry, when a new request is registered, and one at the exit point, when the request is completed. Please note that CONFIG_RCU_TRACE=y is required to activate the traces. Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-04-15  rcu: Mollify sparse with RCU guard  [Johannes Berg]
When using "guard(rcu)();" sparse will complain, because even though it now understands the cleanup attribute, it doesn't evaluate the calls from it at function exit, and thus doesn't count the context correctly. Given that there's a conditional in the resulting code: static inline void class_rcu_destructor(class_rcu_t *_T) { if (_T->lock) { rcu_read_unlock(); } } it seems that even trying to teach sparse to evalulate the cleanup attribute function it'd still be difficult to really make it understand the full context here. Suppress the sparse warning by just releasing the context in the acquisition part of the function, after all we know it's safe with the guard, that's the whole point of it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-04-15  dma-buf: Do not build debugfs related code when !CONFIG_DEBUG_FS  [Tvrtko Ursulin]
There is no point in compiling in the list and mutex operations which are only used from the dma-buf debugfs code, if debugfs is not compiled in. Put the code in question behind some kconfig guards and so save some text and maybe even a pointer per object at runtime when not enabled. Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian König <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Cc: kernel-dev@igalia.com Reviewed-by: T.J. Mercier <tjmercier@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328145323.68872-1-tursulin@igalia.com
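The shape of the change is the usual kconfig guard; a sketch with an illustrative member name, not the actual dma-buf layout:

    struct dma_buf {
            /* ... */
    #ifdef CONFIG_DEBUG_FS
            /* only needed to enumerate buffers in the debugfs dump */
            struct list_head list_node;
    #endif
    };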
2024-04-15  drm/fb_dma: s/drm_panic_gem_get_scanout_buffer/drm_fb_dma_get_scanout_buffer  [Maíra Canal]
On version 11 of the "drm/panic: Add a drm panic handler" series [1], Thomas suggested changing the name of the function `drm_panic_gem_get_scanout_buffer` to `drm_fb_dma_get_scanout_buffer`, and this request was applied on version 12, which is the version that landed [2]. Although the name of the function changed in the C file, it didn't change in the header file, leading to a compilation error as such: drivers/gpu/drm/imx/ipuv3/ipuv3-plane.c:780:24: error: use of undeclared identifier 'drm_fb_dma_get_scanout_buffer'; did you mean 'drm_panic_gem_get_scanout_buffer'? 780 | .get_scanout_buffer = drm_fb_dma_get_scanout_buffer, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | drm_panic_gem_get_scanout_buffer ./include/drm/drm_fb_dma_helper.h:23:5: note: 'drm_panic_gem_get_scanout_buffer' declared here 23 | int drm_panic_gem_get_scanout_buffer(struct drm_plane *plane, | ^ 1 error generated. Fix the compilation error by changing `drm_panic_gem_get_scanout_buffer` to `drm_fb_dma_get_scanout_buffer` in the header file. Link: https://lore.kernel.org/dri-devel/20240328120638.468738-1-jfalempe@redhat.com/ [1] Link: https://lore.kernel.org/dri-devel/aea2aa01-7f03-453b-8b30-8f4d90b1b47f@redhat.com/ [2] Fixes: 879b3b6511fe ("drm/fb_dma: Add generic get_scanout_buffer() for drm_panic") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415151013.3210278-1-mcanal@igalia.com
2024-04-15  rcu: Remove redundant CONFIG_PROVE_RCU #if condition  [Paul E. McKenney]
The #if condition controlling the rcu_preempt_sleep_check() definition has a redundant check for CONFIG_PREEMPT_RCU, which is already checked for by an enclosing #ifndef. This commit therefore removes this redundant condition from the inner #if. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-04-15  vfs, swap: compile out IS_SWAPFILE() on swapless configs  [Alexey Dobriyan]
No swap support -- no swapfiles possible. Signed-off-by: Alexey Dobriyan (Yandex) <adobriyan@gmail.com> Link: https://lore.kernel.org/r/2391c7f5-0f83-4188-ae56-4ec7ccbf2576@p183 Signed-off-by: Christian Brauner <brauner@kernel.org>
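A minimal sketch of the compile-out (the CONFIG_SWAP branch matches the long-standing definition; the exact swapless stub in the patch may differ, the point is that it folds to constant false while still evaluating its argument):

    #ifdef CONFIG_SWAP
    #define IS_SWAPFILE(inode)      ((inode)->i_flags & S_SWAPFILE)
    #else
    #define IS_SWAPFILE(inode)      ((void)(inode), 0U)
    #endif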
2024-04-15  drm/fb_dma: Add generic get_scanout_buffer() for drm_panic  [Jocelyn Falempe]
This was initially done for imx6, but should work on most drivers using drm_fb_dma_helper. v8: * Replace get_scanout_buffer() logic with drm_panic_set_buffer() (Thomas Zimmermann) v9: * go back to get_scanout_buffer() * move get_scanout_buffer() to plane helper functions v12: * Rename drm_panic_gem_get_scanout_buffer to drm_fb_dma_get_scanout_buffer (Thomas Zimmermann) * Remove the #ifdef CONFIG_DRM_PANIC, and build it unconditionally, as it's a small function. (Thomas Zimmermann) Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-6-jfalempe@redhat.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-04-15  drm/panic: Add a drm panic handler  [Jocelyn Falempe]
This module displays a user friendly message when a kernel panic occurs. It currently doesn't contain any debug information, but that can be added later. v2 * Use get_scanout_buffer() instead of the drm client API. (Thomas Zimmermann) * Add the panic reason to the panic message (Nerdopolis) * Add an exclamation mark (Nerdopolis) v3 * Rework the drawing functions, to write the pixels line by line and to use the drm conversion helper to support other formats. (Thomas Zimmermann) v4 * Use drm_fb_r1_to_32bit for fonts (Thomas Zimmermann) * Remove the default y to DRM_PANIC config option (Thomas Zimmermann) * Add foreground/background color config option * Fix the bottom lines not painted if the framebuffer height is not a multiple of the font height. * Automatically register the device to drm_panic, if the function get_scanout_buffer exists. (Thomas Zimmermann) v5 * Change the drawing API, use drm_fb_blit_from_r1() to draw the font. * Also add drm_fb_fill() to fill area with background color. * Add draw_pixel_xy() API for drivers that can't provide a linear buffer. * Add a flush() callback for drivers that needs to synchronize the buffer. * Add a void *private field, so drivers can pass private data to draw_pixel_xy() and flush(). v6 * Fix sparse warning for panic_msg and logo. v7 * Add select DRM_KMS_HELPER for the color conversion functions. v8 * Register directly each plane to the panic notifier (Sima) * Add raw_spinlock to properly handle concurrency (Sima) * Register plane instead of device, to avoid looping through plane list, and simplify code. * Replace get_scanout_buffer() logic with drm_panic_set_buffer() (Thomas Zimmermann) * Removed the draw_pixel_xy() API, will see later if it can be added back. v9 * Revert to using get_scanout_buffer() (Sima) * Move get_scanout_buffer() and panic_flush() to the plane helper functions (Thomas Zimmermann) * Register all planes with get_scanout_buffer() to the panic notifier * Use drm_panic_lock() to protect against race (Sima) v10 * Move blit and fill functions back in drm_panic (Thomas Zimmermann). * Simplify the text drawing functions. * Use kmsg_dumper instead of panic_notifier (Sima). v12 * Use array for map and pitch in struct drm_scanout_buffer to support multi-planar format later. (Thomas Zimmermann) * Better indent struct drm_scanout_buffer declaration. (Thomas Zimmermann) Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-3-jfalempe@redhat.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-04-15  drm/panic: Add drm panic locking  [Daniel Vetter]
Rough sketch for the locking of drm panic printing code. The upshot of this approach is that we can pretty much entirely rely on the atomic commit flow, with the pair of raw_spin_lock/unlock providing any barriers we need, without having to create really big critical sections in code. This also avoids the need that drivers must explicitly update the panic handler state, which they might forget to do, or not do consistently, and then we blow up in the worst possible times. It is somewhat racy against a concurrent atomic update, and we might write into a buffer which the hardware will never display. But there's fundamentally no way to avoid that - if we do the panic state update explicitly after writing to the hardware, we might instead write to an old buffer that the user will barely ever see. Note that an rcu protected deference of plane->state would give us the the same guarantees, but it has the downside that we then need to protect the plane state freeing functions with call_rcu too. Which would very widely impact a lot of code and therefore doesn't seem worth the complexity compared to a raw spinlock with very tiny critical sections. Plus rcu cannot be used to protect access to peek/poke registers anyway, so we'd still need it for those cases. Peek/poke registers for vram access (or a gart pte reserved just for panic code) are also the reason I've gone with a per-device and not per-plane spinlock, since usually these things are global for the entire display. Going with per-plane locks would mean drivers for such hardware would need additional locks, which we don't want, since it deviates from the per-console takeoverlocks design. Longer term it might be useful if the panic notifiers grow a bit more structure than just the absolute bare EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console drivers with proper register/unregister interfaces we could perhaps reuse the very fancy console lock with all it's check and takeover semantics that John Ogness is developing to fix the console_lock mess. But for the initial cut of a drm panic printing support I don't think we need that, because the critical sections are extremely small and only happen once per display refresh. So generally just 60 tiny locked sections per second, which is nothing compared to a serial console running a 115kbaud doing really slow mmio writes for each byte. So for now the raw spintrylock in drm panic notifier callback should be good enough. Another benefit of making panic notifiers more like full blown consoles (that are used in panics only) would be that we get the two stage design, where first all the safe outputs are used. And then the dangerous takeover tricks are deployed (where for display drivers we also might try to intercept any in-flight display buffer flips, which if we race and misprogram fifos and watermarks can hang the memory controller on some hw). For context the actual implementation on the drm side is by Jocelyn and this patch is meant to be combined with the overall approach in v7 (v8 is a bit less flexible, which I think is the wrong direction): https://lore.kernel.org/dri-devel/20240104160301.185915-1-jfalempe@redhat.com/ Note that the locking is very much not correct there, hence this separate rfc. Starting from v10, I (Jocelyn) have included this patch in the drm_panic series, and done the corresponding changes. 
v2: - fix authorship, this was all my typing - some typo oopsies - link to the drm panic work by Jocelyn for context v10: - Use spinlock_irqsave/restore (John Ogness) v11: - Use macro instead of inline functions for drm_panic_lock/unlock (John Ogness) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240409163432.352518-2-jfalempe@redhat.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-04-15  io_uring: separate header for exported net bits  [Pavel Begunkov]
We're exporting some io_uring bits to networking, e.g. for implementing a net callback for io_uring cmds, but we don't want to expose more than needed. Add a separate header for networking. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20240409210554.1878789-1-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring: remove async request cache  [Pavel Begunkov]
io_req_complete_post() was the sole user of ->locked_free_list, but since we just gutted the function, the cache is not used anymore and can be removed. ->locked_free_list served as an asynchronous counterpart of the main request (i.e. struct io_kiocb) cache for all unlocked cases like io-wq. Now they're all forced to be completed into the main cache directly, off of the normal completion path or via io_free_req(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7bffccd213e370abd4de480e739d8b08ab6c1326.1712331455.git.asml.silence@gmail.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring/kbuf: use vm_insert_pages() for mmap'ed pbuf ring  [Jens Axboe]
Rather than use remap_pfn_range() for this and manually free later, switch to using vm_insert_page() and have it Just Work. This requires a bit of effort on the mmap lookup side, as the ctx uring_lock isn't held, which otherwise protects buffer_lists from being torn down, and it's not safe to grab from mmap context as that would introduce an ABBA deadlock between the mmap lock and the ctx uring_lock. Instead, look up the buffer_list under RCU, as the list is RCU freed already. Use the existing reference count to determine whether it's possible to safely grab a reference to it (eg if it's not zero already), and drop that reference when done with the mapping. If the mmap reference is the last one, the buffer_list and the associated memory can go away, since the vma insertion has references to the inserted pages at that point. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring: Avoid anonymous enums in io_uring uapi  [Gabriel Krisman Bertazi]
While valid C, anonymous enums confuse Cython (Python to C translator), as reported by Ritesh (YoSTEALTH) [1] . Since people rely on it when building against liburing and we want to keep this header in sync with the library version, let's name the existing enums in the uapi header. [1] https://github.com/cython/cython/issues/3240 Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20240328210935.25640-1-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
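An illustrative before/after (the enum tag and constant here are hypothetical; values stay the same, so the ABI is unchanged):

    /* before: anonymous, which Cython cannot translate */
    enum {
            IORING_EXAMPLE_FLAG     = 1,
    };

    /* after: the same constants under a named tag */
    enum io_uring_example_flags {
            IORING_EXAMPLE_FLAG     = 1,
    };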
2024-04-15  io_uring/alloc_cache: switch to array based caching  [Jens Axboe]
Currently lists are being used to manage this, but best practice is usually to have these in an array instead, as that is cheaper to manage. Outside of that detail, games are also played with KASAN as the list is inside the cached entry itself. Finally, all users of this need a struct io_cache_entry embedded in their struct, which is union'ized with something else in there that isn't used across the free -> realloc cycle. Get rid of all of that, and simply have it be an array. This will not change the memory used, as we're just trading an 8-byte member entry for the per-elem array size. This reduces the overhead of the recycled allocations, and it reduces the amount of code needed to support recycling to about half of what it currently is. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring/uring_cmd: switch to always allocating async data  [Jens Axboe]
Basic conversion ensuring async_data is allocated off the prep path. Adds a basic alloc cache as well, as passthrough IO can be quite high in rate. Tested-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring/rw: always setup io_async_rw for read/write requests  [Jens Axboe]
read/write requests try to put everything on the stack, and then alloc and copy if a retry is needed. This necessitates a bunch of nasty code that deals with intermediate state. Get rid of this, and have the prep side setup everything that is needed upfront, which greatly simplifies the opcode handlers. This includes adding an alloc cache for io_async_rw, to make it cheap to handle. In terms of cost, this should be basically free and transparent. For the worst case of {READ,WRITE}_FIXED which didn't need it before, performance is unaffected in the normal peak workload that is being used to test that. Still runs at 122M IOPS. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring: get rid of intermediate aux cqe caches  [Pavel Begunkov]
io_post_aux_cqe(), which is used for multishot requests, delays completions by putting CQEs into a temporary array for the purpose of completion lock/flush batching. DEFER_TASKRUN doesn't need any locking, so for it we can put completions directly into the CQ and defer post completion handling with a flag. That leaves !DEFER_TASKRUN, which is not that interesting / hot for multishot requests, so have conditional locking with deferred flush for them. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Tested-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/b1d05a81fd27aaa2a07f9860af13059e7ad7a890.1710799188.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring: remove struct io_tw_state::locked  [Pavel Begunkov]
ctx is always locked for task_work now, so get rid of struct io_tw_state::locked. Note I'm stopping one step before removing io_tw_state altogether, which is not empty, because it still serves the purpose of indicating which function is a tw callback and forcing users not to invoke them carelessly out of a wrong context. The removal can always be done later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Tested-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/e95e1ea116d0bfa54b656076e6a977bc221392a4.1710799188.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  nvme/io_uring: use helper for polled completions  [Jens Axboe]
NVMe is making up issue_flags, which is a no-no in general, and to make matters worse, they are completely the wrong ones. For a pure polled request, which it does check for, we're already inside the ctx->uring_lock when the completions are run off io_do_iopoll(). Hence the correct flag would be '0' rather than IO_URING_F_UNLOCKED. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  io_uring/cmd: document some uring_cmd related helpers  [Pavel Begunkov]
Add comments warning users that they're only allowed to pass issue_flags that were given from io_uring. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/82ff8a45f2c3eb5f3a04a33f0692e5e4a1320455.1710799188.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-15  ACPI: platform-profile: add platform_profile_cycle()  [Gergo Koteles]
Some laptops have a key to switch platform profiles. Add a platform_profile_cycle() function to cycle between the enabled profiles. Signed-off-by: Gergo Koteles <soyer@irl.hu> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/5a97deddf72aa5e764d881eb39a7ba35c01a903e.1712597199.git.soyer@irl.hu Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-15  dt-bindings: leds: Add LED_FUNCTION_FNLOCK  [Gergo Koteles]
Newer laptops have FnLock LED. Add a define for this very common function. Signed-off-by: Gergo Koteles <soyer@irl.hu> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Lee Jones <lee@kernel.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/8ac95e85a53dc0b8cce1e27fc1cab6d19221543b.1712063200.git.soyer@irl.hu Signed-off-by: Hans de Goede <hdegoede@redhat.com>
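Following the existing LED_FUNCTION_* string defines in include/dt-bindings/leds/common.h, the addition is expected to look like (string value assumed):

    #define LED_FUNCTION_FNLOCK     "fnlock"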
2024-04-15  gpiolib: acpi: Pass con_id instead of property into acpi_dev_gpio_irq_get_by()  [Andy Shevchenko]
Pass the con_id instead of property so that callers won't repeat the GPIO suffixes to try. Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-04-15  drm/edid: add drm_edid_print_product_id()  [Jani Nikula]
Add a function to print a decoded EDID vendor and product id to a drm printer, optionally with the raw data. v2: - refactor date printing - use seq_buf to avoid kasprintf() (Ville) - handle week == 0 (Ville) - use be16_to_cpu() on manufacturer_name Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Melissa Wen <mwen@igalia.com> # v1 Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/32bbc83ee6557809ef6d7a5edb1bc8ef4d56d10f.1712655867.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-04-15  drm/edid: add drm_edid_get_product_id()  [Jani Nikula]
Add a struct drm_edid based function to get the vendor and product ID from an EDID. Add a separate struct for defining this part of the EDID, with defined byte order for manufacturer name, product code and serial number. v2: Define manufacturer_name as __be16 instead of u8[2] (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/df0e7dedbf7f2c190039d6e6eae3e126eba113c9.1712655867.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-04-15  iommu: account IOMMU allocated memory  [Pasha Tatashin]
In order to be able to limit the amount of memory that is allocated by the IOMMU subsystem, the memory must be accounted. Account IOMMU as part of the secondary pagetables, as was discussed at LPC. The value of SecPageTables now contains memory allocated by both IOMMU and KVM. There is a difference between GFP_ACCOUNT and what NR_IOMMU_PAGES shows: GFP_ACCOUNT is set only where it makes sense to charge to user processes, i.e. IOMMU page tables, but there is more IOMMU shared data that should not really be charged to a specific process. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-12-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15  iommu: observability of the IOMMU allocations  [Pasha Tatashin]
Add NR_IOMMU_PAGES to node_stat_item; it counts the number of pages that are allocated by the IOMMU subsystem. The allocations can be viewed per node via: /sys/devices/system/node/nodeN/vmstat. For example: $ grep iommu /sys/devices/system/node/node*/vmstat /sys/devices/system/node/node0/vmstat:nr_iommu_pages 106025 /sys/devices/system/node/node1/vmstat:nr_iommu_pages 3464 The value is in page-count; therefore, in the above example the iommu allocations amount to ~428M. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-11-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
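A sketch of the accounting (enum placement abbreviated, helper name hypothetical): the IOMMU page-table allocation paths bump a per-node counter that vmstat then exposes as nr_iommu_pages.

    enum node_stat_item {
            /* ... existing items ... */
            NR_IOMMU_PAGES,         /* pages allocated by the IOMMU subsystem */
            NR_VM_NODE_STAT_ITEMS,
    };

    /* Hypothetical helper showing where the counter would be updated. */
    static inline void iommu_account_pages(struct page *page, long npages)
    {
            mod_node_page_state(page_pgdat(page), NR_IOMMU_PAGES, npages);
    }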