summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-18maple_tree: make maple state reusable after mas_empty_area_rev()Liam R. Howlett
Stop using maple state min/max for the range by passing through pointers for those values. This will allow the maple state to be reused without resetting. Also add some logic to fail out early on searching with invalid arguments. Link: https://lkml.kernel.org/r/20230414145728.4067069-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18mm: kmsan: handle alloc failures in kmsan_ioremap_page_range()Alexander Potapenko
Similarly to kmsan_vmap_pages_range_noflush(), kmsan_ioremap_page_range() must also properly handle allocation/mapping failures. In the case of such, it must clean up the already created metadata mappings and return an error code, so that the error can be propagated to ioremap_page_range(). Without doing so, KMSAN may silently fail to bring the metadata for the page range into a consistent state, which will result in user-visible crashes when trying to access them. Link: https://lkml.kernel.org/r/20230413131223.4135168-2-glider@google.com Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Link: https://lore.kernel.org/linux-mm/CANX2M5ZRrRA64k0hOif02TjmY9kbbO2aCBPyq79es34RXZ=cAw@mail.gmail.com/ Reviewed-by: Marco Elver <elver@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush()Alexander Potapenko
As reported by Dipanjan Das, when KMSAN is used together with kernel fault injection (or, generally, even without the latter), calls to kcalloc() or __vmap_pages_range_noflush() may fail, leaving the metadata mappings for the virtual mapping in an inconsistent state. When these metadata mappings are accessed later, the kernel crashes. To address the problem, we return a non-zero error code from kmsan_vmap_pages_range_noflush() in the case of any allocation/mapping failure inside it, and make vmap_pages_range_noflush() return an error if KMSAN fails to allocate the metadata. This patch also removes KMSAN_WARN_ON() from vmap_pages_range_noflush(), as these allocation failures are not fatal anymore. Link: https://lkml.kernel.org/r/20230413131223.4135168-1-glider@google.com Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Link: https://lore.kernel.org/linux-mm/CANX2M5ZRrRA64k0hOif02TjmY9kbbO2aCBPyq79es34RXZ=cAw@mail.gmail.com/ Reviewed-by: Marco Elver <elver@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18tools/Makefile: do missed s/vm/mm/SeongJae Park
Commit 799fb82aa132 ("tools/vm: rename tools/vm to tools/mm") missed renaming 'vm' in 'tools/Makefile' to 'mm'. As a result, 'make clean' under 'tools/' directory fails as below: $ make -C tools clean DESCEND vm make[1]: Entering directory '/linux/tools/vm' make[1]: *** No rule to make target 'clean'. Stop. make[1]: Leaving directory '/linux/tools/vm' make: *** [Makefile:173: vm_clean] Error 2 make: Leaving directory '/linux/tools' Do the missed rename. Link: https://lkml.kernel.org/r/20230415203110.13858-1-sj@kernel.org Fixes: 799fb82aa132 ("tools/vm: rename tools/vm to tools/mm") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Ricardo Pardini <ricardo@pardini.net> Link: https://lore.kernel.org/linux-mm/20230415202454.13558-1-sj@kernel.org/ Tested-by: Ricardo Pardini <ricardo@pardini.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18mm: fix memory leak on mm_init error handlingMathieu Desnoyers
commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter") introduces a memory leak by missing a call to destroy_context() when a percpu_counter fails to allocate. Before introducing the per-cpu counter allocations, init_new_context() was the last call that could fail in mm_init(), and thus there was no need to ever invoke destroy_context() in the error paths. Adding the following percpu counter allocations adds error paths after init_new_context(), which means its associated destroy_context() needs to be called when percpu counters fail to allocate. Link: https://lkml.kernel.org/r/20230330133822.66271-1-mathieu.desnoyers@efficios.com Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlockTetsuo Handa
syzbot is reporting circular locking dependency which involves zonelist_update_seq seqlock [1], for this lock is checked by memory allocation requests which do not need to be retried. One deadlock scenario is kmalloc(GFP_ATOMIC) from an interrupt handler. CPU0 ---- __build_all_zonelists() { write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd // e.g. timer interrupt handler runs at this moment some_timer_func() { kmalloc(GFP_ATOMIC) { __alloc_pages_slowpath() { read_seqbegin(&zonelist_update_seq) { // spins forever because zonelist_update_seq.seqcount is odd } } } } // e.g. timer interrupt handler finishes write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even } This deadlock scenario can be easily eliminated by not calling read_seqbegin(&zonelist_update_seq) from !__GFP_DIRECT_RECLAIM allocation requests, for retry is applicable to only __GFP_DIRECT_RECLAIM allocation requests. But Michal Hocko does not know whether we should go with this approach. Another deadlock scenario which syzbot is reporting is a race between kmalloc(GFP_ATOMIC) from tty_insert_flip_string_and_push_buffer() with port->lock held and printk() from __build_all_zonelists() with zonelist_update_seq held. CPU0 CPU1 ---- ---- pty_write() { tty_insert_flip_string_and_push_buffer() { __build_all_zonelists() { write_seqlock(&zonelist_update_seq); build_zonelists() { printk() { vprintk() { vprintk_default() { vprintk_emit() { console_unlock() { console_flush_all() { console_emit_next_record() { con->write() = serial8250_console_write() { spin_lock_irqsave(&port->lock, flags); tty_insert_flip_string() { tty_insert_flip_string_fixed_flag() { __tty_buffer_request_room() { tty_buffer_alloc() { kmalloc(GFP_ATOMIC | __GFP_NOWARN) { __alloc_pages_slowpath() { zonelist_iter_begin() { read_seqbegin(&zonelist_update_seq); // spins forever because zonelist_update_seq.seqcount is odd spin_lock_irqsave(&port->lock, flags); // spins forever because port->lock is held } } } } } } } } spin_unlock_irqrestore(&port->lock, flags); // message is printed to console spin_unlock_irqrestore(&port->lock, flags); } } } } } } } } } write_sequnlock(&zonelist_update_seq); } } } This deadlock scenario can be eliminated by preventing interrupt context from calling kmalloc(GFP_ATOMIC) and preventing printk() from calling console_flush_all() while zonelist_update_seq.seqcount is odd. Since Petr Mladek thinks that __build_all_zonelists() can become a candidate for deferring printk() [2], let's address this problem by disabling local interrupts in order to avoid kmalloc(GFP_ATOMIC) and disabling synchronous printk() in order to avoid console_flush_all() . As a side effect of minimizing duration of zonelist_update_seq.seqcount being odd by disabling synchronous printk(), latency at read_seqbegin(&zonelist_update_seq) for both !__GFP_DIRECT_RECLAIM and __GFP_DIRECT_RECLAIM allocation requests will be reduced. Although, from lockdep perspective, not calling read_seqbegin(&zonelist_update_seq) (i.e. do not record unnecessary locking dependency) from interrupt context is still preferable, even if we don't allow calling kmalloc(GFP_ATOMIC) inside write_seqlock(&zonelist_update_seq)/write_sequnlock(&zonelist_update_seq) section... Link: https://lkml.kernel.org/r/8796b95c-3da3-5885-fddd-6ef55f30e4d3@I-love.SAKURA.ne.jp Fixes: 3d36424b3b58 ("mm/page_alloc: fix race condition between build_all_zonelists and page allocation") Link: https://lkml.kernel.org/r/ZCrs+1cDqPWTDFNM@alley [2] Reported-by: syzbot <syzbot+223c7461c58c58a4cb10@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=223c7461c58c58a4cb10 [1] Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Petr Mladek <pmladek@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Patrick Daly <quic_pdaly@quicinc.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18kernel/sys.c: fix and improve control flow in __sys_setres[ug]id()Ondrej Mosnacek
Linux Security Modules (LSMs) that implement the "capable" hook will usually emit an access denial message to the audit log whenever they "block" the current task from using the given capability based on their security policy. The occurrence of a denial is used as an indication that the given task has attempted an operation that requires the given access permission, so the callers of functions that perform LSM permission checks must take care to avoid calling them too early (before it is decided if the permission is actually needed to perform the requested operation). The __sys_setres[ug]id() functions violate this convention by first calling ns_capable_setid() and only then checking if the operation requires the capability or not. It means that any caller that has the capability granted by DAC (task's capability set) but not by MAC (LSMs) will generate a "denied" audit record, even if is doing an operation for which the capability is not required. Fix this by reordering the checks such that ns_capable_setid() is checked last and -EPERM is returned immediately if it returns false. While there, also do two small optimizations: * move the capability check before prepare_creds() and * bail out early in case of a no-op. Link: https://lkml.kernel.org/r/20230217162154.837549-1-omosnace@redhat.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18drm/amd/display: fix a divided-by-zero errorAlex Hung
[Why & How] timing.dsc_cfg.num_slices_v can be zero and it is necessary to check before using it. This fixes the error "divide error: 0000 [#1] PREEMPT SMP NOPTI". Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-04-18drm/amd/display: limit timing for single dimm memoryDaniel Miess
[Why] 1. It could hit bandwidth limitdation under single dimm memory when connecting 8K external monitor. 2. IsSupportedVidPn got validation failed with 2K240Hz eDP + 8K24Hz external monitor. 3. It's better to filter out such combination in EnumVidPnCofuncModality 4. For short term, filter out in dc bandwidth validation. [How] Force 2K@240Hz+8K@24Hz timing validation false in dc. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Daniel Miess <Daniel.Miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-04-18drm/amd/display: set dcn315 lb bpp to 48Dmytro Laktyushkin
[Why & How] Fix a typo for dcn315 line buffer bpp. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-04-18drm/amdgpu: Fix desktop freezed after gpu-resetAlan Liu
[Why] After gpu-reset, sometimes the driver fails to enable vblank irq, causing flip_done timed out and the desktop freezed. During gpu-reset, we disable and enable vblank irq in dm_suspend() and dm_resume(). Later on in amdgpu_irq_gpu_reset_resume_helper(), we check irqs' refcount and decide to enable or disable the irqs again. However, we have 2 sets of API for controling vblank irq, one is dm_vblank_get/put() and another is amdgpu_irq_get/put(). Each API has its own refcount and flag to store the state of vblank irq, and they are not synchronized. In drm we use the first API to control vblank irq but in amdgpu_irq_gpu_reset_resume_helper() we use the second set of API. The failure happens when vblank irq was enabled by dm_vblank_get() before gpu-reset, we have vblank->enabled true. However, during gpu-reset, in amdgpu_irq_gpu_reset_resume_helper() vblank irq's state checked from amdgpu_irq_update() is DISABLED. So finally it disables vblank irq again. After gpu-reset, if there is a cursor plane commit, the driver will try to enable vblank irq by calling drm_vblank_enable(), but the vblank->enabled is still true, so it fails to turn on vblank irq and causes flip_done can't be completed in vblank irq handler and desktop become freezed. [How] Combining the 2 vblank control APIs by letting drm's API finally calls amdgpu_irq's API, so the irq's refcount and state of both APIs can be synchronized. Also add a check to prevent refcount from being less then 0 in amdgpu_irq_put(). v2: - Add warning in amdgpu_irq_enable() if the irq is already disabled. - Call dc_interrupt_set() in dm_set_vblank() to avoid refcount change if it is in gpu-reset. v3: - Improve commit message and code comments. Signed-off-by: Alan Liu <HaoPing.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-04-18Merge branch 'Provide bpf_for() and bpf_for_each() by libbpf'Alexei Starovoitov
Andrii Nakryiko says: ==================== This patch set moves bpf_for(), bpf_for_each(), and bpf_repeat() macros from selftests-internal bpf_misc.h header to libbpf-provided bpf_helpers.h header. To do this in a way to allow users to feature-detect and guard such bpf_for()/bpf_for_each() uses on old kernels we also extend libbpf to improve unresolved kfunc calls handling and reporting. This lets us mark bpf_iter_num_{new,next,destroy}() declarations as __weak, and thus not fail program loading outright if such kfuncs are missing on the host kernel. Patches #1 and #2 do some simple clean ups and logging improvements. Patch #3 adds kfunc call poisoning and log fixup logic and is the hear of this patch set, effectively. Patch #4 adds selftest for this logic. Patches #4 and #5 move bpf_for()/bpf_for_each()/bpf_repeat() into bpf_helpers.h header and mark kfuncs as __weak to allow users to feature-detect and guard their uses. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18libbpf: mark bpf_iter_num_{new,next,destroy} as __weakAndrii Nakryiko
Mark bpf_iter_num_{new,next,destroy}() kfuncs declared for bpf_for()/bpf_repeat() macros as __weak to allow users to feature-detect their presence and guard bpf_for()/bpf_repeat() loops accordingly for backwards compatibility with old kernels. Now that libbpf supports kfunc calls poisoning and better reporting of unresolved (but called) kfuncs, declaring number iterator kfuncs in bpf_helpers.h won't degrade user experience and won't cause unnecessary kernel feature dependencies. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18libbpf: move bpf_for(), bpf_for_each(), and bpf_repeat() into bpf_helpers.hAndrii Nakryiko
To make it easier for bleeding-edge BPF applications, such as sched_ext, to utilize open-coded iterators, move bpf_for(), bpf_for_each(), and bpf_repeat() macros from selftests/bpf-internal bpf_misc.h helper, to libbpf-provided bpf_helpers.h header. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18selftests/bpf: add missing __weak kfunc log fixup testAndrii Nakryiko
Add test validating that libbpf correctly poisons and reports __weak unresolved kfuncs in post-processed verifier log. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18libbpf: improve handling of unresolved kfuncsAndrii Nakryiko
Currently, libbpf leaves `call #0` instruction for __weak unresolved kfuncs, which might lead to a confusing verifier log situations, where invalid `call #0` will be treated as successfully validated. We can do better. Libbpf already has an established mechanism of poisoning instructions that failed some form of resolution (e.g., CO-RE relocation and BPF map set to not be auto-created). Libbpf doesn't fail them outright to allow users to guard them through other means, and as long as BPF verifier can prove that such poisoned instructions cannot be ever reached, this doesn't consistute an invalid BPF program. If user didn't guard such code, libbpf will extract few pieces of information to tie such poisoned instructions back to additional information about what entitity wasn't resolved (e.g., BPF map name, or CO-RE relocation information). __weak unresolved kfuncs fit this model well, so this patch extends libbpf with poisioning and log fixup logic for kfunc calls. Note, this poisoning is done only for kfunc *calls*, not kfunc address resolution (ldimm64 instructions). The former cannot be ever valid, if reached, so it's safe to poison them. The latter is a valid mechanism to check if __weak kfunc ksym was resolved, and do necessary guarding and work arounds based on this result, supported in most recent kernels. As such, libbpf keeps such ldimm64 instructions as loading zero, never poisoning them. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18libbpf: report vmlinux vs module name when dealing with ksymsAndrii Nakryiko
Currently libbpf always reports "kernel" as a source of ksym BTF type, which is ambiguous given ksym's BTF can come from either vmlinux or kernel module BTFs. Make this explicit and log module name, if used BTF is from kernel module. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18libbpf: misc internal libbpf clean ups around log fixupAndrii Nakryiko
Normalize internal constants, field names, and comments related to log fixup. Also add explicit `ext_idx` alias for relocation where relocation is pointing to extern description for additional information. No functional changes, just a clean up before subsequent additions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features ↵Lorenzo Bianconi
flag For veth pairs, NETDEV_XDP_ACT_NDO_XMIT is supported by the current device if the peer one is running a XDP program or if it has GRO enabled. Fix the xdp_features flags reporting considering peer device and not current one for NETDEV_XDP_ACT_NDO_XMIT. Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/4f1ca6f6f6b42ae125bfdb5c7782217c83968b2e.1681767806.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-18Merge tag 'mmc-v6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC host: - sdhci_am654: Fix support for UHS-I SDR12 and SDR25 speed modes MEMSTICK: - Fix memory leak if card device never gets registered" * tag 'mmc-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: memstick: fix memory leak if card device is never registered mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25
2023-04-18KVM: arm64: Make vcpu flag updates non-preemptibleMarc Zyngier
Per-vcpu flags are updated using a non-atomic RMW operation. Which means it is possible to get preempted between the read and write operations. Another interesting thing to note is that preemption also updates flags, as we have some flag manipulation in both the load and put operations. It is thus possible to lose information communicated by either load or put, as the preempted flag update will overwrite the flags when the thread is resumed. This is specially critical if either load or put has stored information which depends on the physical CPU the vcpu runs on. This results in really elusive bugs, and kudos must be given to Mostafa for the long hours of debugging, and finally spotting the problem. Fix it by disabling preemption during the RMW operation, which ensures that the state stays consistent. Also upgrade vcpu_get_flag path to use READ_ONCE() to make sure the field is always atomically accessed. Fixes: e87abb73e594 ("KVM: arm64: Add helpers to manipulate vcpu flags among a set") Reported-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230418125737.2327972-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-04-18irqchip/gic-v3: Add Rockchip 3588001 erratum workaroundSebastian Reichel
Rockchip RK3588/RK3588s GIC600 integration does not support the sharability feature. Rockchip assigned Erratum ID #3588001 for this issue. Note, that the 0x0201743b ID is not Rockchip specific and thus there is an extra of_machine_is_compatible() check. The flags are named FORCE_NON_SHAREABLE to be vendor agnostic, since apparently similar integration design errors exist in other platforms and they can reuse the same flag. Co-developed-by: XiaoDong Huang <derrick.huang@rock-chips.com> Signed-off-by: XiaoDong Huang <derrick.huang@rock-chips.com> Co-developed-by: Kever Yang <kever.yang@rock-chips.com> Signed-off-by: Kever Yang <kever.yang@rock-chips.com> Co-developed-by: Lucas Tanure <lucas.tanure@collabora.com> Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230418142109.49762-2-sebastian.reichel@collabora.com
2023-04-18f2fs: add has_enough_free_secs()Yangtao Li
Replace !has_not_enough_free_secs w/ has_enough_free_secs. BTW avoid nested 'if' statements in f2fs_balance_fs(). Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-18f2fs: relax sanity check if checkpoint is corruptedJaegeuk Kim
1. extent_cache - let's drop the largest extent_cache 2. invalidate_block - don't show the warnings Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-18f2fs: refactor f2fs_gc to call checkpoint in urgent conditionJaegeuk Kim
The major change is to call checkpoint, if there's not enough space while having some prefree segments in FG_GC case. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-18cpufreq: use correct unit when verify cur freqSanjay Chandrashekara
cpufreq_verify_current_freq checks() if the frequency returned by the hardware has a slight delta with the valid frequency value last set and returns "policy->cur" if the delta is within "1 MHz". In the comparison, "policy->cur" is in "kHz" but it's compared against HZ_PER_MHZ. So, the comparison range becomes "1 GHz". Fix this by comparing against KHZ_PER_MHZ instead of HZ_PER_MHZ. Fixes: f55ae08c8987 ("cpufreq: Avoid unnecessary frequency updates due to mismatch") Signed-off-by: Sanjay Chandrashekara <sanjayc@nvidia.com> [ sumit gupta: Commit message update ] Signed-off-by: Sumit Gupta <sumitg@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-18ACPI: LPSS: Add 80862289 ACPI _HID for second PWM controller on Cherry TrailHans de Goede
On some Cherry Trail devices the second PWM controller uses 80862289 as ACPI _HID, rather then using 80862288 as is done for both controllers on most models. Add the missing 80862289 ACPI _HID, note this uses its own lpss_device_desc, without ".setup = bsw_pwm_setup" so that the pwm_lookup is not added for it. On devices where both controllers use the 80862288 _HID bsw_pwm_setup() does a UID check to avoid registering the lookup for the second controller but that will not work here. Adding the missing id fixes the second PWM controller no longer working after the entire LPSS1 island has been in D3 at least once, which causes the contents of the LPSS private registers to get lost. Adding the _HID makes acpi_lpss restore these when the controller moves from D3 to D0. Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-18ACPI: bus: Ensure that notify handlers are not running after removalRafael J. Wysocki
Currently, acpi_device_remove_notify_handler() may return while the notify handler being removed is still running which may allow the module holding that handler to be torn down prematurely. Address this issue by making acpi_device_remove_notify_handler() wait for the handling of all the ACPI events in progress to complete before returning. Fixes: 5894b0c46e49 ("ACPI / scan: Move bus operations and notification routines to bus.c") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-18ACPI: bus: Add missing braces to acpi_sb_notify()Rafael J. Wysocki
As per the kernel coding style. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-18dt-bindings: remoteproc: Drop unneeded quotesRob Herring
Cleanup bindings dropping unneeded quotes. Once all these are fixed, checking for this can be enabled in yamllint. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230327170114.4102315-1-robh@kernel.org
2023-04-18Merge tag 'arm-fixes-6.3-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are a number of updates for devicetree files for Qualcomm, Rockchips, and NXP i.MX platforms, addressing mistakes in the DT contents: - Wrong GPIO polarity on some boards - Lower SD card interface speed for better stability - Incorrect power supply, clock, pmic, cache properties - Disable broken hbr3 on sc7280-herobrine - Devicetree warning fixes The only other changes are: - A regression fix for the Amlogic performance monitoring unit driver, along with two related DT changes. - imx_v6_v7_defconfig enables PCI support again. - Trivial fixes for tee, optee and psci firmware drivers, addressing compiler warning and error output" * tag 'arm-fixes-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (32 commits) firmware/psci: demote suspend-mode warning to info level arm64: dts: qcom: sc7280: remove hbr3 support on herobrine boards ARM: imx_v6_v7_defconfig: Fix unintentional disablement of PCI arm64: dts: rockchip: correct panel supplies on some rk3326 boards arm64: dts: rockchip: use just "port" in panel on RockPro64 arm64: dts: rockchip: use just "port" in panel on Pinebook Pro ARM: dts: imx6ull-colibri: Remove unnecessary #address-cells/#size-cells ARM: dts: imx7d-remarkable2: Remove unnecessary #address-cells/#size-cells arm64: dts: imx8mp-verdin: correct off-on-delay arm64: dts: imx8mm-verdin: correct off-on-delay arm64: dts: imx8mm-evk: correct pmic clock source arm64: dts: qcom: sc8280xp-pmics: fix pon compatible and registers arm64: dts: rockchip: Remove non-existing pwm-delay-us property arm64: dts: rockchip: Add clk_rtc_32k to Anbernic xx3 Devices tee: Pass a pointer to virt_to_page() perf/amlogic: adjust register offsets arm64: dts: meson-g12-common: resolve conflict between canvas & pmu arm64: dts: meson-g12-common: specify full DMC range arm64: dts: imx8mp: fix address length for LCDIF2 riscv: dts: canaan: drop invalid spi-max-frequency ...
2023-04-18Merge tag 'mvebu-arm64-6.4-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/arm mvebu arm64 for 6.4 (part 1) turris-mox-rwtm firmware: - prevent modification at runtime of the kobj_type struct * tag 'mvebu-arm64-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: firmware: turris-mox-rwtm: make kobj_type structure constant Link: https://lore.kernel.org/r/878repzfbp.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18ARM: mv78xx0: fix entries for gpios, buttons and usb portsJeremy J. Peper
Original code was largely copy-pasted from the reference board code, correct values to reflect the hardware actually present in the TS-WXL. Signed-off-by: Jeremy J. Peper <jeremy@jeremypeper.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18ARM: mv78xx0: add code to enable XOR and CRYPTO engines on mv78xx0Jeremy J. Peper
Adding missing code/values required to enable the XOR and CESA engines for this SoC Signed-off-by: Jeremy J. Peper <jeremy@jeremypeper.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18ARM: mv78xx0: set the correct driver for the i2c RTCJeremy J. Peper
Original code was largely copy-pasted from the reference board code, adjust to use the actual RTC chip present on the TS-WXL. Signed-off-by: Jeremy J. Peper <jeremy@jeremypeper.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18ARM: mv78xx0: adjust init logic for ts-wxl to reflect single core devJeremy J. Peper
Original code was largely copy-pasted from the reference board code, adjust pcie initialiazation to reflect the TS-WXL using the single-core variant of this SoC. Correct pcie_port_size to be a power of 2 as required. Signed-off-by: Jeremy J. Peper <jeremy@jeremypeper.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime ↵Frederic Weisbecker
monotonicity The first field of /proc/uptime relies on the CLOCK_BOOTTIME clock which can also be fetched from clock_gettime() API. Improve the test coverage while verifying the monotonicity of CLOCK_BOOTTIME accross both interfaces. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230222144649.624380-9-frederic@kernel.org
2023-04-18selftests/proc: Remove idle time monotonicity assertionsFrederic Weisbecker
Due to broken iowait task counting design (cf: comments above get_cpu_idle_time_us() and nr_iowait()), it is not possible to provide the guarantee that /proc/stat or /proc/uptime display monotonic idle time values. Remove the assertions that verify the related wrong assumption so that testers and maintainers don't spend more time on that. Reported-by: Yu Liao <liaoyu15@huawei.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230222144649.624380-8-frederic@kernel.org
2023-04-18MAINTAINERS: Remove stale email addressFrederic Weisbecker
Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-7-frederic@kernel.org
2023-04-18timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()Frederic Weisbecker
There is no need for the __tick_nohz_idle_stop_tick() function between tick_nohz_idle_stop_tick() and its implementation. Remove that unnecessary step. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-6-frederic@kernel.org
2023-04-18timers/nohz: Add a comment about broken iowait counter update raceFrederic Weisbecker
The per-cpu iowait task counter is incremented locally upon sleeping. But since the task can be woken to (and by) another CPU, the counter may then be decremented remotely. This is the source of a race involving readers VS writer of idle/iowait sleeptime. The following scenario shows an example where a /proc/stat reader observes a pending sleep time as IO whereas that pending sleep time later eventually gets accounted as non-IO. CPU 0 CPU 1 CPU 2 ----- ----- ------ //io_schedule() TASK A current->in_iowait = 1 rq(0)->nr_iowait++ //switch to idle // READ /proc/stat // See nr_iowait_cpu(0) == 1 return ts->iowait_sleeptime + ktime_sub(ktime_get(), ts->idle_entrytime) //try_to_wake_up(TASK A) rq(0)->nr_iowait-- //idle exit // See nr_iowait_cpu(0) == 0 ts->idle_sleeptime += ktime_sub(ktime_get(), ts->idle_entrytime) As a result subsequent reads on /proc/stat may expose backward progress. This is unfortunately hardly fixable. Just add a comment about that condition. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-5-frederic@kernel.org
2023-04-18timers/nohz: Protect idle/iowait sleep time under seqcountFrederic Weisbecker
Reading idle/IO sleep time (eg: from /proc/stat) can race with idle exit updates because the state machine handling the stats is not atomic and requires a coherent read batch. As a result reading the sleep time may report irrelevant or backward values. Fix this with protecting the simple state machine within a seqcount. This is expected to be cheap enough not to add measurable performance impact on the idle path. Note this only fixes reader VS writer condition partitially. A race remains that involves remote updates of the CPU iowait task counter. It can hardly be fixed. Reported-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-4-frederic@kernel.org
2023-04-18timers/nohz: Only ever update sleeptime from idle exitFrederic Weisbecker
The idle and IO sleeptime statistics appearing in /proc/stat can be currently updated from two sites: locally on idle exit and remotely by cpufreq. However there is no synchronization mechanism protecting concurrent updates. It is therefore possible to account the sleeptime twice, among all the other possible broken scenarios. To prevent from breaking the sleeptime accounting source, restrict the sleeptime updates to the local idle exit site. If there is a delta to add since the last update, IO/Idle sleep time readers will now only compute the delta without actually writing it back to the internal idle statistic fields. This fixes a writer VS writer race. Note there are still two known reader VS writer races to handle. A subsequent patch will fix one. Reported-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-3-frederic@kernel.org
2023-04-18timers/nohz: Restructure and reshuffle struct tick_schedFrederic Weisbecker
Restructure and group fields by access in order to optimize cache layout. While at it, also add missing kernel doc for two fields: @last_jiffies and @idle_expires. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230222144649.624380-2-frederic@kernel.org
2023-04-18Merge tag 'mvebu-dt64-6.4-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt64 for 6.4 (part 1) Enlarge PCI memory window on Machiatobin (Armada 7040 based) Add supoport for the GL.iNet GL-MV1000 (Armada 3700 based) Add missing phy-mode on the cn9310 Align thermal node names with bindings * tag 'mvebu-dt64-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: ARM64: dts: marvell: cn9310: Add missing phy-mode arm64: dts: marvell: add DTS for GL.iNet GL-MV1000 arm64: dts: marvell: align thermal node names with bindings arm64: dts: marvell: mochabin: enlarge PCI memory window Link: https://lore.kernel.org/r/87bkjlzfcw.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18Merge tag 'mvebu-dt-6.4-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt for 6.4 (part 1) Add missing phy-mode and fixed links for kirkwood, orion5 and Armada (370, XP, 38x) SoCs * tag 'mvebu-dt-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: ARM: dts: armada: Add missing phy-mode and fixed links ARM: dts: orion5: Add missing phy-mode and fixed links ARM: dts: kirkwood: Add missing phy-mode and fixed links Link: https://lore.kernel.org/r/87edohzfeg.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18Merge tag 'v6.4-rockchip-dts64-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt On the Rock5b a fix the newly added rtc node and cpu-regulators for the big cluster. Volume-keys (via adc) for the Pinephone Pro, display support for the Anbernic RG353. As well as gpio-ranges for rk356x and fixes for the audio-codec node-names on two boards. * tag 'v6.4-rockchip-dts64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: Add support for volume keys to rk3399-pinephone-pro arm64: dts: rockchip: Add vdd_cpu_big regulators to rk3588-rock-5b arm64: dts: rockchip: Use generic name for es8316 on Pinebook Pro and Rock 5B arm64: dts: rockchip: Drop RTC clock-frequency on rk3588-rock-5b arm64: dts: rockchip: Add pinctrl gpio-ranges for rk356x arm64: dts: rockchip: add panel to Anbernic RG353 series Link: https://lore.kernel.org/r/5144826.MHq7AAxBmi@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18ARM: config: Update Vexpress defconfigLinus Walleij
The Versatile Express should conform to standard contemporary kernel features: add NO_HZ_FULL and HIGH_RES_TIMERS. Also add the AFS flash partitions as these are used on the platform. The removed SCHED_DEBUG is due to Kconfig changes. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230418082427.186677-1-linus.walleij@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-18Merge tag 'opp-updates-6.4' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp Pull OPP updates for 6.4 from Viresh Kumar: "- Use of_property_present() for testing DT property presence (Rob Herring). - Add set_required_opps() callback to the 'struct opp_table', to make the code paths cleaner (Viresh Kumar)." * tag 'opp-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: OPP: Move required opps configuration to specialized callback OPP: Handle all genpd cases together in _set_required_opps() opp: Use of_property_present() for testing DT property presence
2023-04-18thermal: intel: int340x: Add DLVR support for RFIM controlSrinivas Pandruvada
Add support for DLVR (Digital Linear Voltage Regulator) attributes, which can be used to control RFIM. Here instead of "fivr" another directory "dlvr" is created with DLVR attributes: /sys/bus/pci/devices/0000:00:04.0/dlvr ├── dlvr_freq_mhz ├── dlvr_freq_select ├── dlvr_hardware_rev ├── dlvr_pll_busy ├── dlvr_rfim_enable └── dlvr_spread_spectrum_pct └── dlvr_control_mode └── dlvr_control_lock Attributes dlvr_freq_mhz (RO): Current DLVR PLL frequency in MHz. dlvr_freq_select (RW): Sets DLVR PLL clock frequency. dlvr_hardware_rev (RO): DLVR hardware revision. dlvr_pll_busy (RO): PLL can't accept frequency change when set. dlvr_rfim_enable (RW): 0: Disable RF frequency hopping, 1: Enable RF frequency hopping. dlvr_control_mode (RW): Specifies how frequencies are spread. 0: Down spread, 1: Spread in Center. dlvr_control_lock (RW): 1: future writes are ignored. dlvr_spread_spectrum_pct (RW) A write to this register updates the DLVR spread spectrum percent value. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>