linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-02-20	mm/swap: fix race when skipping swapcache	Kairui Song
	When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads swapin the same entry at the same time, they get different pages (A, B). Before one thread (T0) finishes the swapin and installs page (A) to the PTE, another thread (T1) could finish swapin of page (B), swap_free the entry, then swap out the possibly modified page reusing the same entry. It breaks the pte_same check in (T0) because PTE value is unchanged, causing ABA problem. Thread (T0) will install a stalled page (A) into the PTE and cause data corruption. One possible callstack is like this: CPU0 CPU1 ---- ---- do_swap_page() do_swap_page() with same entry <direct swapin path> <direct swapin path> <alloc page A> <alloc page B> swap_read_folio() <- read to page A swap_read_folio() <- read to page B <slow on later locks or interrupt> <finished swapin first> ... set_pte_at() swap_free() <- entry is free <write to page B, now page A stalled> <swap out page B to same swap entry> pte_same() <- Check pass, PTE seems unchanged, but page A is stalled! swap_free() <- page B content lost! set_pte_at() <- staled page A installed! And besides, for ZRAM, swap_free() allows the swap device to discard the entry content, so even if page (B) is not modified, if swap_read_folio() on CPU0 happens later than swap_free() on CPU1, it may also cause data loss. To fix this, reuse swapcache_prepare which will pin the swap entry using the cache flag, and allow only one thread to swap it in, also prevent any parallel code from putting the entry in the cache. Release the pin after PT unlocked. Racers just loop and wait since it's a rare and very short event. A schedule_timeout_uninterruptible(1) call is added to avoid repeated page faults wasting too much CPU, causing livelock or adding too much noise to perf statistics. A similar livelock issue was described in commit 029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead") Reproducer: This race issue can be triggered easily using a well constructed reproducer and patched brd (with a delay in read path) [1]: With latest 6.8 mainline, race caused data loss can be observed easily: $ gcc -g -lpthread test-thread-swap-race.c && ./a.out Polulating 32MB of memory region... Keep swapping out... Starting round 0... Spawning 65536 workers... 32746 workers spawned, wait for done... Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss! Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss! Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss! Round 0 Failed, 15 data loss! This reproducer spawns multiple threads sharing the same memory region using a small swap device. Every two threads updates mapped pages one by one in opposite direction trying to create a race, with one dedicated thread keep swapping out the data out using madvise. The reproducer created a reproduce rate of about once every 5 minutes, so the race should be totally possible in production. After this patch, I ran the reproducer for over a few hundred rounds and no data loss observed. Performance overhead is minimal, microbenchmark swapin 10G from 32G zram: Before: 10934698 us After: 11157121 us Cached: 13155355 us (Dropping SWP_SYNCHRONOUS_IO flag) [kasong@tencent.com: v4] Link: https://lkml.kernel.org/r/20240219082040.7495-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20240206182559.32264-1-ryncsn@gmail.com Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device") Reported-by: "Huang, Ying" <ying.huang@intel.com> Closes: https://lore.kernel.org/lkml/87bk92gqpx.fsf_-_@yhuang6-desk2.ccr.corp.intel.com/ Link: https://github.com/ryncsn/emm-test-project/tree/master/swap-stress-race [1] Signed-off-by: Kairui Song <kasong@tencent.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Yu Zhao <yuzhao@google.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Barry Song <21cnbao@gmail.com> Cc: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-20	dm bufio: Support IO priority	Hongyu Jin
	Some IO will dispatch from kworker with different io_context settings than the submitting task, we may need to specify a priority to avoid losing priority. Add dm_bufio_read_with_ioprio() and dm_bufio_prefetch_with_ioprio() for use by bufio users to pass an ioprio other than IOPRIO_DEFAULT. Co-developed-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> [snitzer: introduced _with_ioprio() wrappers to reduce churn] Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20	dm io: Support IO priority	Hongyu Jin
	Some IO will dispatch from kworker with different io_context settings than the submitting task, we may need to specify a priority to avoid losing priority. Add IO priority parameter to dm_io() and update all callers. Co-developed-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20	f2fs: deprecate io_bits	Jaegeuk Kim
	Let's deprecate an unused io_bits feature to save CPU cycles and memory. Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-20	KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA	Paul Durrant
	Some pfncache pages may actually be overlays on guest memory that have a fixed HVA within the VMM. It's pointless to invalidate such cached mappings if the overlay is moved so allow a cache to be activated directly with the HVA to cater for such cases. A subsequent patch will make use of this facility. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-10-paul@xen.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-20	KVM: s390: Refactor kvm_is_error_gpa() into kvm_is_gpa_in_memslot()	Sean Christopherson
	Rename kvm_is_error_gpa() to kvm_is_gpa_in_memslot() and invert the polarity accordingly in order to (a) free up kvm_is_error_gpa() to match with kvm_is_error_{hva,page}(), and (b) to make it more obvious that the helper is doing a memslot lookup, i.e. not simply checking for INVALID_GPA. No functional change intended. Link: https://lore.kernel.org/r/20240215152916.1158-9-paul@xen.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-20	KVM: pfncache: remove KVM_GUEST_USES_PFN usage	Paul Durrant
	As noted in [1] the KVM_GUEST_USES_PFN usage flag is never set by any callers of kvm_gpc_init(), and for good reason: the implementation is incomplete/broken. And it's not clear that there will ever be a user of KVM_GUEST_USES_PFN, as coordinating vCPUs with mmu_notifier events is non-trivial. Remove KVM_GUEST_USES_PFN and all related code, e.g. dropping KVM_GUEST_USES_PFN also makes the 'vcpu' argument redundant, to avoid having to reason about broken code as __kvm_gpc_refresh() evolves. Moreover, all existing callers specify KVM_HOST_USES_PFN so the usage check in hva_to_pfn_retry() and hence the 'usage' argument to kvm_gpc_init() are also redundant. [1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-6-paul@xen.org [sean: explicitly call out that guest usage is incomplete] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-20	KVM: pfncache: add a mark-dirty helper	Paul Durrant
	At the moment pages are marked dirty by open-coded calls to mark_page_dirty_in_slot(), directly deferefencing the gpa and memslot from the cache. After a subsequent patch these may not always be set so add a helper now so that caller will protected from the need to know about this detail. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-5-paul@xen.org [sean: decrease indentation, use gpa_to_gfn()] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-20	tc: make tc_bus_type const	Ricardo B. Marliere
	Since commit d492cc2573a0 ("driver core: device.h: make struct bus_type a const *"), the driver core can properly handle constant struct bus_type, move the tc_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Acked-by: Maciej W. Rozycki <macro@orcam.me.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-02-20	firmware: arm_scmi: Report frequencies in the perf notifications	Cristian Marussi
	Extend the perf notification report to include pre-calculated frequencies corresponding to the reported limits/levels event; such frequencies are properly computed based on the stored known OPPs information taking into consideration if the current operating mode is level indexed or not. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20240212123233.1230090-12-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2024-02-20	net: skbuff: add overflow debug check to pull/push helpers	Florian Westphal
	syzbot managed to trigger following splat: BUG: KASAN: use-after-free in __skb_flow_dissect+0x4a3b/0x5e50 Read of size 1 at addr ffff888208a4000e by task a.out/2313 [..] __skb_flow_dissect+0x4a3b/0x5e50 __skb_get_hash+0xb4/0x400 ip_tunnel_xmit+0x77e/0x26f0 ipip_tunnel_xmit+0x298/0x410 .. Analysis shows that the skb has a valid ->head, but bogus ->data pointer. skb->data gets its bogus value via the neigh layer, which does: 1556 __skb_pull(skb, skb_network_offset(skb)); ... and the skb was already dodgy at this point: skb_network_offset(skb) returns a negative value due to an earlier overflow of skb->network_header (u16). __skb_pull thus "adjusts" skb->data by a huge offset, pointing outside skb->head area. Allow debug builds to splat when we try to pull/push more than INT_MAX bytes. After this, the syzkaller reproducer yields a more precise splat before the flow dissector attempts to read off skb->data memory: WARNING: CPU: 5 PID: 2313 at include/linux/skbuff.h:2653 neigh_connected_output+0x28e/0x400 ip_finish_output2+0xb25/0xed0 iptunnel_xmit+0x4ff/0x870 ipgre_xmit+0x78e/0xbb0 Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240216113700.23013-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-20	locking/atomic: scripts: Clarify ordering of conditional atomics	Mark Rutland
	Conditional atomic operations (e.g. cmpxchg()) only provide ordering when the condition holds; when the condition does not hold, the location is not modified and relaxed ordering is provided. Where ordering is needed for failed conditional atomics, it is necessary to use smp_mb__before_atomic() and/or smp_mb__after_atomic(). This is explained tersely in memory-barriers.txt, and is implied but not explicitly stated in the kerneldoc comments for the conditional operations. The lack of an explicit statement has lead to some off-list queries about the ordering semantics of failing conditional operations, so evidently this is confusing. Update the kerneldoc comments to explicitly describe the lack of ordering for failed conditional atomic operations. For most conditional atomic operations, this is written as: \| If (${condition}), atomically updates @v to (${new}) with ${desc_order} ordering. \| Otherwise, @v is not modified and relaxed ordering is provided. For the try_cmpxchg() operations, this is written as: \| If (${condition}), atomically updates @v to @new with ${desc_order} ordering. \| Otherwise, @v is not modified, @old is updated to the current value of @v, \| and relaxed ordering is provided. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Link: https://lore.kernel.org/r/20240209124010.2096198-1-mark.rutland@arm.com
2024-02-20	fs/select: rework stack allocation hack for clang	Arnd Bergmann
	A while ago, we changed the way that select() and poll() preallocate a temporary buffer just under the size of the static warning limit of 1024 bytes, as clang was frequently going slightly above that limit. The warnings have recently returned and I took another look. As it turns out, clang is not actually inherently worse at reserving stack space, it just happens to inline do_select() into core_sys_select(), while gcc never inlines it. Annotate do_select() to never be inlined and in turn remove the special case for the allocation size. This should give the same behavior for both clang and gcc all the time and once more avoids those warnings. Fixes: ad312f95d41c ("fs/select: avoid clang stack usage warning") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240216202352.2492798-1-arnd@kernel.org Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-20	net: add netmem to skb_frag_t	Mina Almasry
	Use struct netmem* instead of page in skb_frag_t. Currently struct netmem* is always a struct page underneath, but the abstraction allows efforts to add support for skb frags not backed by pages. There is unfortunately 1 instance where the skb_frag_t is assumed to be a exactly a bio_vec in kcm. For this case, WARN_ON_ONCE and return error before doing a cast. Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so that the API can be used to create netmem skbs. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-20	firmware: arm_ffa: Make ffa_bus_type const	Ricardo B. Marliere
	Now that the driver core can properly handle constant struct bus_type, move the ffa_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240211-bus_cleanup-firmware2-v1-1-1851c92c7be7@marliere.net Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2024-02-20	firmware: arm_scmi: Implement clock get permissions	Peng Fan
	ARM SCMI v3.2 introduces clock get permission command. To implement the same let us stash the values of those permissions in the scmi_clock_info. They indicate if the operation is forbidden or not. If the CLOCK_GET_PERMISSIONS command is not supported, the default permissions are set to allow the operations, otherwise they will be set according to the response of CLOCK_GET_PERMISSIONS from the SCMI platform firmware. Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20240121110901.1414856-1-peng.fan@oss.nxp.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2024-02-19	block: pass a queue_limits argument to blk_alloc_disk	Christoph Hellwig
	Pass a queue_limits to blk_alloc_disk and apply it if non-NULL. This will allow allocating queues with valid queue limits instead of setting the values one at a time later. Also change blk_alloc_disk to return an ERR_PTR instead of just NULL which can't distinguish errors. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20240215071055.2201424-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-19	Merge tag 'v6.8-rc5' into timers/core, to resolve conflict	Ingo Molnar
	There's a conflict between this recent upstream fix: dad6a09f3148 ("hrtimer: Report offline hrtimer enqueue") and a pending commit in the timers tree: 1a4729ecafc2 ("hrtimers: Move hrtimer base related definitions into hrtimer_defs.h") Resolve it by applying the upstream fix to the new <linux/hrtimer_defs.h> header. Conflict: include/linux/hrtimer.h Semantic conflict: include/linux/hrtimer_defs.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-19	iio: adc: adi-axi-adc: move to backend framework	Nuno Sa
	Move to the IIO backend framework. Devices supported by adi-axi-adc now register themselves as backend devices. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240210-iio-backend-v11-7-f5242a5fb42a@analog.com Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-19	iio: add the IIO backend framework	Nuno Sa
	This is a Framework to handle complex IIO aggregate devices. The typical architecture is to have one device as the frontend device which can be "linked" against one or multiple backend devices. All the IIO and userspace interface is expected to be registers/managed by the frontend device which will callback into the backends when needed (to get/set some configuration that it does not directly control). The basic framework interface is pretty simple: - Backends should register themselves with @devm_iio_backend_register() - Frontend devices should get backends with @devm_iio_backend_get() Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240210-iio-backend-v11-5-f5242a5fb42a@analog.com Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-19	iio: buffer-dmaengine: export buffer alloc and free functions	Nuno Sa
	Export iio_dmaengine_buffer_free() and iio_dmaengine_buffer_alloc(). This is in preparation of introducing IIO backends support. This will allow us to allocate a buffer and control it's lifetime from a device different from the one holding the DMA firmware properties. Effectively, in this case the struct device holding the firmware information about the DMA channels is not the same as iio_dev->dev.parent (typical case). While at it, namespace the buffer-dmaengine exports and update the current user of these buffers. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240210-iio-backend-v11-4-f5242a5fb42a@analog.com Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-19	x86/resctrl: Separate arch and fs resctrl locks	James Morse
	resctrl has one mutex that is taken by the architecture-specific code, and the filesystem parts. The two interact via cpuhp, where the architecture code updates the domain list. Filesystem handlers that walk the domains list should not run concurrently with the cpuhp callback modifying the list. Exposing a lock from the filesystem code means the interface is not cleanly defined, and creates the possibility of cross-architecture lock ordering headaches. The interaction only exists so that certain filesystem paths are serialised against CPU hotplug. The CPU hotplug code already has a mechanism to do this using cpus_read_lock(). MPAM's monitors have an overflow interrupt, so it needs to be possible to walk the domains list in irq context. RCU is ideal for this, but some paths need to be able to sleep to allocate memory. Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp callback, cpus_read_lock() must always be taken first. rdtgroup_schemata_write() already does this. Most of the filesystem code's domain list walkers are currently protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are rdt_bit_usage_show() and the mon_config helpers which take the lock directly. Make the domain list protected by RCU. An architecture-specific lock prevents concurrent writers. rdt_bit_usage_show() could walk the domain list using RCU, but to keep all the filesystem operations the same, this is changed to call cpus_read_lock(). The mon_config helpers send multiple IPIs, take the cpus_read_lock() in these cases. The other filesystem list walkers need to be able to sleep. Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't be invoked when file system operations are occurring. Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live() call isn't obvious. Resctrl's domain online/offline calls now need to take the rdtgroup_mutex themselves. [ bp: Fold in a build fix: https://lore.kernel.org/r/87zfvwieli.ffs@tglx ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-25-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-19	smp: Make __smp_processor_id() 0-argument macro	Alexey Dobriyan
	smp_processor_id family of macros never accepted any arguments. #define __smp_processor_id(x) works by accident (see C99 6.10.3 §4). __smp_processor_id() gets 1 (empty) argument and passes it down to raw_smp_processor_id() which doesn't accept arguments. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/0037d1f2-8153-4b33-b43e-f4b6ecd710ac@p183
2024-02-19	locking: Add rwsem_assert_held() and rwsem_assert_held_write()	Matthew Wilcox (Oracle)
	Modelled after lockdep_assert_held() and lockdep_assert_held_write(), but are always active, even when lockdep is disabled. Of course, they don't test that _this_ thread is the owner, but it's sufficient to catch many bugs and doesn't incur the same performance penalty as lockdep. Acked-by: "Peter Zijlstra (Intel)" <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Acked-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-19	pwm: lpss-*: Make use of devm_pwmchip_alloc() function	Uwe Kleine-König
	This prepares the pwm-lpss drivers to further changes of the pwm core outlined in the commit introducing devm_pwmchip_alloc(). There is no intended semantical change and the driver should behave as before. Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/b567ab5dd992e361eb884fa6c2cac11be9c7dde3.1707900770.git.u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
2024-02-19	jiffies: Transform comment about time_* functions into DOC block	Anna-Maria Behnsen
	This general note about time_* functions is also useful to be available in kernel documentation. Therefore transform it into a kernel-doc DOC block with proper formatting. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-6-anna-maria@linutronix.de
2024-02-19	hrtimers: Update formatting of documentation	Anna-Maria Behnsen
	Documentation of functions lacks the annotations which are used by kernel-doc and *.rst to make appearance in rendered documents more user-friendly. Use those annotations to improve user-friendliness. While at it prevent duplication of comments and use a reference instead. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-3-anna-maria@linutronix.de
2024-02-19	hrtimers: Move hrtimer base related definitions into hrtimer_defs.h	Anna-Maria Behnsen
	hrtimer base related struct definitions are part of hrtimers.h as it is required there. With this, also the struct documentation which is for core code internal use, is exposed into the general api. To prevent this, move all core internal definitions and the related includes into hrtimer_defs.h. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-2-anna-maria@linutronix.de
2024-02-19	Merge 6.8-rc5 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19	Merge 6.8-rc5 into tty-next	Greg Kroah-Hartman
	We need the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19	sysfs: Introduce a mechanism to hide static attribute_groups	Dan Williams
	Add a mechanism for named attribute_groups to hide their directory at sysfs_update_group() time, or otherwise skip emitting the group directory when the group is first registered. It piggybacks on is_visible() in a similar manner as SYSFS_PREALLOC, i.e. special flags in the upper bits of the returned mode. To use it, specify a symbol prefix to DEFINE_SYSFS_GROUP_VISIBLE(), and then pass that same prefix to SYSFS_GROUP_VISIBLE() when assigning the @is_visible() callback: DEFINE_SYSFS_GROUP_VISIBLE($prefix) struct attribute_group $prefix_group = { .name = $name, .is_visible = SYSFS_GROUP_VISIBLE($prefix), }; SYSFS_GROUP_VISIBLE() expects a definition of $prefix_group_visible() and $prefix_attr_visible(), where $prefix_group_visible() just returns true / false and $prefix_attr_visible() behaves as normal. The motivation for this capability is to centralize PCI device authentication in the PCI core with a named sysfs group while keeping that group hidden for devices and platforms that do not meet the requirements. In a PCI topology, most devices will not support authentication, a small subset will support just PCI CMA (Component Measurement and Authentication), a smaller subset will support PCI CMA + PCIe IDE (Link Integrity and Encryption), and only next generation server hosts will start to include a platform TSM (TEE Security Manager). Without this capability the alternatives are: * Check if all attributes are invisible and if so, hide the directory. Beyond trouble getting this to work [1], this is an ABI change for scenarios if userspace happens to depend on group visibility absent any attributes. I.e. this new capability avoids regression since it does not retroactively apply to existing cases. * Publish an empty /sys/bus/pci/devices/$pdev/tsm/ directory for all PCI devices (i.e. for the case when TSM platform support is present, but device support is absent). Unfortunate that this will be a vestigial empty directory in the vast majority of cases. * Reintroduce usage of runtime calls to sysfs_{create,remove}_group() in the PCI core. Bjorn has already indicated that he does not want to see any growth of pci_sysfs_init() [2]. * Drop the named group and simulate a directory by prefixing all TSM-related attributes with "tsm_". Unfortunate to not use the naming capability of a sysfs group as intended. In comparison, there is a small potential for regression if for some reason an @is_visible() callback had dependencies on how many times it was called. Additionally, it is no longer an error to update a group that does not have its directory already present, and it is no longer a WARN() to remove a group that was never visible. Link: https://lore.kernel.org/all/2024012321-envious-procedure-4a58@gregkh/ [1] Link: https://lore.kernel.org/linux-pci/20231019200110.GA1410324@bhelgaas/ [2] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/2024013028-deflator-flaring-ec62@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19	Merge 6.8-rc5 into driver-core-next	Greg Kroah-Hartman
	We need the driver core changes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-18	tty: Don't include tty_buffer.h in tty.h	Ilpo Järvinen
	There's no need to include linux/tty_buffer.h in linux/tty.h. Move the include into tty_buffer.c that is actually using it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240215111538.1920-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17	gpio: constify opaque pointer "data" in gpio_device_find()	Krzysztof Kozlowski
	The opaque pointer "data" in each match function used by gpio_device_find() is a pointer to const, thus the same argument passed to gpio_device_find() can adjusted similarly. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-02-17	net: phy: add PHY_EEE_CAP2_FEATURES	Heiner Kallweit
	As a prerequisite for adding EEE CAP2 register support, complement PHY_EEE_CAP1_FEATURES with PHY_EEE_CAP2_FEATURES. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17	net: mdio: add helpers for accessing the EEE CAP2 registers	Heiner Kallweit
	This adds helpers for accessing the EEE CAP2 registers. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17	Merge tag 'char-misc-6.8-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / miscdriver fixes from Greg KH: "Here is a small set of char/misc and IIO driver fixes for 6.8-rc5. Included in here are: - lots of iio driver fixes for reported issues - nvmem device naming fixup for reported problem - interconnect driver fixes for reported issues All of these have been in linux-next for a while with no reported the issues (the nvmem patch was included in a different branch in linux-next before sent to me for inclusion here)" * tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) nvmem: include bit index in cell sysfs file name iio: adc: ad4130: only set GPIO_CTRL if pin is unused iio: adc: ad4130: zero-initialize clock init data interconnect: qcom: x1e80100: Add missing ACV enable_mask interconnect: qcom: sm8650: Use correct ACV enable_mask iio: accel: bma400: Fix a compilation problem iio: commom: st_sensors: ensure proper DMA alignment iio: hid-sensor-als: Return 0 for HID_USAGE_SENSOR_TIME_TIMESTAMP iio: move LIGHT_UVA and LIGHT_UVB to the end of iio_modifier staging: iio: ad5933: fix type mismatch regression iio: humidity: hdc3020: fix temperature offset iio: adc: ad7091r8: Fix error code in ad7091r8_gpio_setup() iio: adc: ad_sigma_delta: ensure proper DMA alignment iio: imu: adis: ensure proper DMA alignment iio: humidity: hdc3020: Add Makefile, Kconfig and MAINTAINERS entry iio: imu: bno055: serdev requires REGMAP iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC iio: pressure: bmp280: Add missing bmp085 to SPI id table iio: core: fix memleak in iio_device_register_sysfs interconnect: qcom: sm8550: Enable sync_state ...
2024-02-17	Merge tag 'tty-6.8-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are three small tty and serial driver fixes for 6.8-rc5: - revert a 8250_pci1xxxx off-by-one change that was incorrect - two changes to fix the transmit path of the mxs-auart driver, fixing a regression in the 6.2 release All of these have been in linux-next for over a week with no reported issues" * tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: mxs-auart: fix tx serial: core: introduce uart_port_tx_flags() serial: 8250_pci1xxxx: partially revert off by one patch
2024-02-17	Merge tag 'usb-6.8-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are two small fixes for 6.8-rc5: - thunderbolt to fix a reported issue on many platforms - dwc3 driver revert of a commit that caused problems in -rc1 Both of these changes have been in linux-next for over a week with no reported issues" * tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "usb: dwc3: Support EBC feature of DWC_usb31" thunderbolt: Fix setting the CNS bit in ROUTER_CS_5
2024-02-17	iio: hid-sensor-als: Add light chromaticity support	Basavaraj Natikar
	On some platforms, ambient color sensors also support the x and y light colors, which represent the coordinates on the CIE 1931 chromaticity diagram. Add light chromaticity x and y. Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240205185926.3030521-5-srinivas.pandruvada@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17	iio: hid-sensor-als: Add light color temperature support	Basavaraj Natikar
	On some platforms, ambient color sensors also support light color temperature. Add support of light color temperature. Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240205185926.3030521-4-srinivas.pandruvada@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17	iio: core: make iio_bus_type const	Ricardo B. Marliere
	Now that the driver core can properly handle constant struct bus_type, move the iio_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Acked-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240208-bus_cleanup-iio-v1-1-4a167c3b5fb3@marliere.net Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17	iio: locking: introduce __cleanup() based direct mode claiming infrastructure	Jonathan Cameron
	Allows use of: iio_device_claim_direct_scoped(return -EBUSY, indio_dev) { } to automatically call iio_device_release_direct_mode() based on scope. Typically seen in combination with local device specific locks which are already have automated cleanup options via guard(mutex)(&st->lock) and scoped_guard(). Using both together allows most error handling to be automated. Reviewed-by: Nuno Sa <nuno.a@analog.com> Link: https://lore.kernel.org/r/20240128150537.44592-2-jic23@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17	usb: gadget: Support already-mapped DMA SGs	Paul Cercueil
	Add a new 'sg_was_mapped' field to the struct usb_request. This field can be used to indicate that the scatterlist associated to the USB transfer has already been mapped into the DMA space, and it does not have to be done internally. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20240130122340.54813-2-paul@crapouillou.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17	kobject: make uevent_seqnum atomic	Eric Dumazet
	We will soon no longer acquire uevent_sock_mutex for most kobject_uevent_net_broadcast() calls, and also while calling uevent_net_broadcast(). Make uevent_seqnum an atomic64_t to get its own protection. This fixes a race while reading /sys/kernel/uevent_seqnum. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christian Brauner <brauner@kernel.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20240214084829.684541-2-edumazet@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-16	x86/numa: Fix the address overlap check in numa_fill_memblks()	Alison Schofield
	numa_fill_memblks() fills in the gaps in numa_meminfo memblks over a physical address range. To do so, it first creates a list of existing memblks that overlap that address range. The issue is that it is off by one when comparing to the end of the address range, so memblks that do not overlap are selected. The impact of selecting a memblk that does not actually overlap is that an existing memblk may be filled when the expected action is to do nothing and return NUMA_NO_MEMBLK to the caller. The caller can then add a new NUMA node and memblk. Replace the broken open-coded search for address overlap with the memblock helper memblock_addrs_overlap(). Update the kernel doc and in code comments. Suggested by: "Huang, Ying" <ying.huang@intel.com> Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/10a3e6109c34c21a8dd4c513cf63df63481a2b07.1705085543.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-17	crypto: hisilicon/qm - change function type to void	Weili Qian
	The function qm_stop_qp_nolock() always return zero, so function type is changed to void. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17	crypto: hisilicon/qm - obtain stop queue status	Weili Qian
	The debugfs files 'dev_state' and 'dev_timeout' are added. Users can query the current queue stop status through these two files. And set the waiting timeout when the queue is released. dev_state: if dev_timeout is set, dev_state indicates the status of stopping the queue. 0 indicates that the queue is stopped successfully. Other values indicate that the queue stops fail. If dev_timeout is not set, the value of dev_state is 0; dev_timeout: if the queue fails to stop, the queue is released after waiting dev_timeout * 20ms. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17	crypto: hisilicon/qm - add stop function by hardware	Weili Qian
	Hardware V3 could be able to drain function by sending mailbox to hardware which will trigger tasks in device to be flushed out. When the function is reset, the function can be stopped by this way. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17	firmware: coreboot: Generate aliases for coreboot modules	Nícolas F. R. A. Prado
	Generate aliases for coreboot modules to allow automatic module probing. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240212-coreboot-mod-defconfig-v4-2-d14172676f6d@collabora.com Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>