summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-10-19block: Add bdev atomic write limits helpersJohn Garry
Add helpers to get atomic write limits for a bdev, so that we don't access request_queue helpers outside the block layer. We check if the bdev can actually atomic write in these helpers, so we can avoid users missing using this check. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241019125113.369994-4-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-19fs/block: Check for IOCB_DIRECT in generic_atomic_write_valid()John Garry
Currently FMODE_CAN_ATOMIC_WRITE is set if the bdev can atomic write and the file is open for direct IO. This does not work if the file is not opened for direct IO, yet fcntl(O_DIRECT) is used on the fd later. Change to check for direct IO on a per-IO basis in generic_atomic_write_valid(). Since we want to report -EOPNOTSUPP for non-direct IO for an atomic write, change to return an error code. Relocate the block fops atomic write checks to the common write path, as to catch non-direct IO. Fixes: c34fc6f26ab8 ("fs: Initial atomic write support") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241019125113.369994-3-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-19block/fs: Pass an iocb to generic_atomic_write_valid()John Garry
Darrick and Hannes both thought it better that generic_atomic_write_valid() should be passed a struct iocb, and not just the member of that struct which is referenced; see [0] and [1]. I think that makes a more generic and clean API, so make that change. [0] https://lore.kernel.org/linux-block/680ce641-729b-4150-b875-531a98657682@suse.de/ [1] https://lore.kernel.org/linux-xfs/20240620212401.GA3058325@frogsfrogsfrogs/ Fixes: c34fc6f26ab8 ("fs: Initial atomic write support") Suggested-by: Darrick J. Wong <djwong@kernel.org> Suggested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241019125113.369994-2-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-19fs: add file_refChristian Brauner
As atomic_inc_not_zero() is implemented with a try_cmpxchg() loop it has O(N^2) behaviour under contention with N concurrent operations and it is in a hot path in __fget_files_rcu(). The rcuref infrastructures remedies this problem by using an unconditional increment relying on safe- and dead zones to make this work and requiring rcu protection for the data structure in question. This not just scales better it also introduces overflow protection. However, in contrast to generic rcuref, files require a memory barrier and thus cannot rely on *_relaxed() atomic operations and also require to be built on atomic_long_t as having massive amounts of reference isn't unheard of even if it is just an attack. As suggested by Linus, add a file specific variant instead of making this a generic library. Files are SLAB_TYPESAFE_BY_RCU and thus don't have "regular" rcu protection. In short, freeing of files isn't delayed until a grace period has elapsed. Instead, they are freed immediately and thus can be reused (multiple times) within the same grace period. So when picking a file from the file descriptor table via its file descriptor number it is thus possible to see an elevated reference count on file->f_count even though the file has already been recycled possibly multiple times by another task. To guard against this the vfs will pick the file from the file descriptor table twice. Once before the refcount increment and once after to compare the pointers (grossly simplified). If they match then the file is still valid. If not the caller needs to fput() it. The unconditional increment makes the following race possible as illustrated by rcuref: > Deconstruction race > =================== > > The release operation must be protected by prohibiting a grace period in > order to prevent a possible use after free: > > T1 T2 > put() get() > // ref->refcnt = ONEREF > if (!atomic_add_negative(-1, &ref->refcnt)) > return false; <- Not taken > > // ref->refcnt == NOREF > --> preemption > // Elevates ref->refcnt to ONEREF > if (!atomic_add_negative(1, &ref->refcnt)) > return true; <- taken > > if (put(&p->ref)) { <-- Succeeds > remove_pointer(p); > kfree_rcu(p, rcu); > } > > RCU grace period ends, object is freed > > atomic_cmpxchg(&ref->refcnt, NOREF, DEAD); <- UAF > > [...] it prevents the grace period which keeps the object alive until > all put() operations complete. Having files by SLAB_TYPESAFE_BY_RCU shouldn't cause any problems for this deconstruction race. Afaict, the only interesting case would be someone freeing the file and someone immediately recycling it within the same grace period and reinitializing file->f_count to ONEREF while a concurrent fput() is doing atomic_cmpxchg(&ref->refcnt, NOREF, DEAD) as in the race above. But this is safe from SLAB_TYPESAFE_BY_RCU's perspective and it should be safe from rcuref's perspective. T1 T2 T3 fput() fget() // f_count->refcnt = ONEREF if (!atomic_add_negative(-1, &f_count->refcnt)) return false; <- Not taken // f_count->refcnt == NOREF --> preemption // Elevates f_count->refcnt to ONEREF if (!atomic_add_negative(1, &f_count->refcnt)) return true; <- taken if (put(&f_count)) { <-- Succeeds remove_pointer(p); /* * Cache is SLAB_TYPESAFE_BY_RCU * so this is freed without a grace period. */ kmem_cache_free(p); } kmem_cache_alloc() init_file() { // Sets f_count->refcnt to ONEREF rcuref_long_init(&f->f_count, 1); } Object has been reused within the same grace period via kmem_cache_alloc()'s SLAB_TYPESAFE_BY_RCU. /* * With SLAB_TYPESAFE_BY_RCU this would be a safe UAF access and * it would work correctly because the atomic_cmpxchg() * will fail because the refcount has been reset to ONEREF by T3. */ atomic_cmpxchg(&ref->refcnt, NOREF, DEAD); <- UAF However, there are other cases to consider: (1) Benign race due to multiple atomic_long_read() CPU1 CPU2 file_ref_put() // last reference // => count goes negative/FILE_REF_NOREF atomic_long_add_negative_release(-1, &ref->refcnt) -> __file_ref_put() file_ref_get() // goes back from negative/FILE_REF_NOREF to 0 // and file_ref_get() succeeds atomic_long_add_negative(1, &ref->refcnt) // This is immediately followed by file_ref_put() // managing to set FILE_REF_DEAD file_ref_put() // __file_ref_put() continues and sees // cnt > FILE_REF_RELEASED // and splats with // "imbalanced put on file reference count" cnt = atomic_long_read(&ref->refcnt); The race however is benign and the problem is the atomic_long_read(). Instead of performing a separate read this uses atomic_long_dec_return() and pass the value to __file_ref_put(). Thanks to Linus for pointing out that braino. (2) SLAB_TYPESAFE_BY_RCU may cause recycled files to be marked dead When a file is recycled the following race exists: CPU1 CPU2 // @file is already dead and thus // cnt >= FILE_REF_RELEASED. file_ref_get(file) atomic_long_add_negative(1, &ref->refcnt) // We thus call into __file_ref_get() -> __file_ref_get() // which sees cnt >= FILE_REF_RELEASED cnt = atomic_long_read(&ref->refcnt); // In the meantime @file gets freed kmem_cache_free() // and is immediately recycled file = kmem_cache_zalloc() // and the reference count is reinitialized // and the file alive again in someone // else's file descriptor table file_ref_init(&ref->refcnt, 1); // the __file_ref_get() slowpath now continues // and as it saw earlier that cnt >= FILE_REF_RELEASED // it wants to ensure that we're staying in the middle // of the deadzone and unconditionally sets // FILE_REF_DEAD. // This marks @file dead for CPU2... atomic_long_set(&ref->refcnt, FILE_REF_DEAD); // Caller issues a close() system call to close @file close(fd) file = file_close_fd_locked() filp_flush() // The caller sees that cnt >= FILE_REF_RELEASED // and warns the first time... CHECK_DATA_CORRUPTION(file_count(file) == 0) // and then splats a second time because // __file_ref_put() sees cnt >= FILE_REF_RELEASED file_ref_put(&ref->refcnt); -> __file_ref_put() My initial inclination was to replace the unconditional atomic_long_set() with an atomic_long_try_cmpxchg() but Linus pointed out that: > I think we should just make file_ref_get() do a simple > > return !atomic_long_add_negative(1, &ref->refcnt)); > > and nothing else. Yes, multiple CPU's can race, and you can increment > more than once, but the gap - even on 32-bit - between DEAD and > becoming close to REF_RELEASED is so big that we simply don't care. > That's the point of having a gap. I've been testing this with will-it-scale using fstat() on a machine that Jens gave me access (thank you very much!): processor : 511 vendor_id : AuthenticAMD cpu family : 25 model : 160 model name : AMD EPYC 9754 128-Core Processor and I consistently get a 3-5% improvement on 256+ threads. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202410151043.5d224a27-oliver.sang@intel.com Closes: https://lore.kernel.org/all/202410151611.f4cd71f2-oliver.sang@intel.com Link: https://lore.kernel.org/r/20241007-brauner-file-rcuref-v2-2-387e24dc9163@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-19Merge tag 'iio-for-6.13a-take2' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: IIO: 1st set of new device support, features and cleanup for 6.13 Changes in take2: - Additional patch to drop the accidentally added empty file. - Tidy up of a duplicate include. Two merges of other trees. 6.12-rc2: To pull in the move of unaligned.h from include/asm to include/linux This was to resolve issues in linux-next. pwm/duty_offset-for6.13-rc1@ PWM infrastructure that is use di the AD7625 ADC driver. New device support ================== adi,ad7173 - Add support for the AD4113 8 channel ADC. adi,ad7606 - Add support for the AD7606C-16 and AD7606C-18 which have higher precision and support bipolar and differential channels. A lot of driver rework was needed to add the additional flexibility needed to support these parts. adi,ad7625 - New driver supporting AD7625, AD7626, AD7960 and AD7961 LVDS connected ADCs. Interface uses a combination of a PWM control and an IIO backend (currently a custom FPGA IP). adi,ad8460 - New driver for the ad8460 Waveform DAC. This high speed device is driven by a custom IP via a DMAEngine buffer. bosch,bmi270 - New driver for this 6-axis IMU. A later patch adds SPI support. gehc,pmc-adc - New driver for this GE Healthcare ADC 16-channel 16 bit ADC. invensense,mpu6050 - Add support for IAM-20680HT and IAM-20680HP variants of the IAM-20680 IMU that have better specifications in various ways including larger FIFO sizes. vishay,vl6030 - Support the veml7700, a stripped down veml6030 ambient light sensor. - Support the veml6035 ambient light sensor. Features ======== liteon,ltr390 - Allow configuration of sampling frequency - Support suspend and resume - Add interrupt support including threshold events + control over event reporting persistence. st,vl53l0x - Add support for continuous mode via IIO buffer support and a dataready trigger. ti,tmp0006 - Add triggered buffer support using data ready interrupt. vishay,vl6030 - Add regulator control support. vishay,vl6070 - Add regulator control support. vishay,vl6180 - Allow configuration of waiting between continuous samples. - Use the interrupt, if available, for single shot captures - Support continuous mode via the IIO triggered buffer interfaces. Cleanups and minor fixes ======================== tools/event monitor - Free dev_dir_name. treewide - Introduce aligned_s64 type for timestamps to replace s64 __aligned(8). Initial use in a few drivers - many others to follow. - Use dev_get_platform_data() instead of open-coding the access. - InvenSense email address and maintainer updates to reflect move to the tdk domain after acquisition. - Switch platform drivers from remove_new() back to remove() now all rework this was enabling is done. - More use of device_for_each_child_node_scoped() to remove need for manual caling of fwnode_handle_put() in early exits from the loop. - Use IIO_MAP() macro to replace some open-coded versions. - Constify struct iio_map arrays. - Use irq_get_trigger_type(irq) rather than irqd_get_trigger_type(irq_get_irq_data(irq); core - unsigned to unsigned int. adi,ad3552r - Fix to low a limit on max SPI clock speed. No rush to upstream this one as the binding that uses the higher speed will merge via this tree once ready. adi,ad5770r - Use get_unaligned_le16() instead of open-coding. Note this caused a merge issue in linux-next as unaligned.h has moved. adi,ad7606 - Use read_avail() callback to handle the various _available attributes. - Wrap up all data related to channel scaling into a single structure as the use of this gets more complex for the -16 and -18 parts. adi,adf4371 - Use chip_info structures and spi_get_device_match_data() - Drop spi_set_drvdata() as unused. - Reduce scope of struct clock as only touched in probe(). - Use dev_err_probe() where appropriate. adi,axi-dac - Improve register naming to make field and register association clearer. - Fix a wrong register bit. amlogic,meson8-saradc - Allow the meson8-saradc to have the amlogic,hhi-sysctrl property. bosch,bmp280 - Use u8 for the DMA buffer to avoid implication of other types given it can contain be24 data. - Use unsigned types to store raw values to make it clear they cannot be negative. - Drop unnecessary check for errors after IIR filter update. - Support soft reset to get device to know state on driver load. - Use bulk reads to retrieve the humidity calibration data. dynaimage,al3010 - Make sure to powerdown device in error paths. invensense,icm42600 - Add missing i2c_device-id tables. kionix,kmx61 - Drop ACPI IDs that are not associated with valid ACPI vendor IDs and for which there is no evidence they are in use in real devices. liteon,ltrf261a - Document a bad compatible that we are supporting because it is in the wild in the Valve Steam Deck via ACPI PRP0001. maxim,max1363 - Use get_unaligned_be16() instead of open-coding. Note this caused a merge issue in linux-next as unaligned.h has moved. mediatek,mt6360 - Use get_unaligned_be16() instead of open-coding. Note this caused a merge issue in linux-next as unaligned.h has moved. microchip,pac1921 - Drop various unnecessary type casts that were suggested by Wconversion compiler warnings which we do not use in IIO. Remove them because they hurt readability in cases where it is clear not overflow can occur. rohm,rpr0521 - Use iio_poll_func_store_time() rather than open-coding a similar solution. semtech,sx9324 - Make sx_common_get_raw_register() local to the only code that uses it. - Drop unnecessary acpi.h include. st,vl53l0x - Check the part ID register and print an info message if it is not what is expected. vishay,veml6030 - Fix DT binding file name to include vishay - Use regmap_set_bits() for case where all bits are set. - Use dev_err_probe() where appropriate. - Add missing delay after powering up in resume path. - Drop a processed accessor for the white channel as there is no public information to allow a specific scale to be established. - Use read_avail() to replace explicit custom _available attributes. vishay,veml6070 - Use guard() to allow early returns. - Add a devm callback to unregister the i2c device. - Use devm_iio_device_register() to simplify removal code. Various other minor improvements not called out explicitly. * tag 'iio-for-6.13a-take2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (161 commits) iio: imu: bmi270: Remove duplicated include in bmi270_i2c.c iio: adc: ad7606: Drop spurious empty file. iio: light: rpr0521: Use generic iio_pollfunc_store_time() iio: light: vl6180: Add support for Continuous Mode iio: light: vl6180: Added Interrupt support for single shot access iio: light: vl6180: Add configurable inter-measurement period support MAINTAINERS: add entry for VEML6030 ambient light sensor driver iio: light: veml6030: add support for veml7700 dt-bindings: iio: light: veml6030: add veml7700 iio: light: veml6035: fix read_avail in no_irq case for veml6035 iio: dac: adi-axi-dac: update register names iio: dac: adi-axi-dac: fix wrong register bitfield docs: iio: new docs for ad7625 driver iio: adc: ad7625: add driver dt-bindings: iio: adc: add AD762x/AD796x ADCs iio: Convert unsigned to unsigned int iio: pressure: bmp280: Fix uninitialized variable iio: Switch back to struct platform_driver::remove() iio: imu: bmi323: remove redundant register definition iio: frequency: adf4371: make use of dev_err_probe() ...
2024-10-18PCI: Move struct pci_bus_resource into bus.cIlpo Järvinen
The struct pci_bus_resource is only used in bus.c, so move it there. Link: https://lore.kernel.org/r/20241017141111.44612-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-18PCI: Remove unused PCI_SUBTRACTIVE_DECODEIlpo Järvinen
2fe2abf896c1 ("PCI: augment bus resource table with a list") added PCI_SUBTRACTIVE_DECODE which is put into the struct pci_bus_resource flags field but is never read. There seems to never have been users for it. Remove both PCI_SUBTRACTIVE_DECODE and the flags field from the struct pci_bus_resource. Link: https://lore.kernel.org/r/20241017141111.44612-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-18Merge tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: - Fix integer overflow in xrep_bmap - Fix stale dealloc punching for COW IO * tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: punch delalloc extents from the COW fork for COW writes xfs: set IOMAP_F_SHARED for all COW fork allocations xfs: share more code in xfs_buffered_write_iomap_begin xfs: support the COW fork in xfs_bmap_punch_delalloc_range xfs: IOMAP_ZERO and IOMAP_UNSHARE already hold invalidate_lock xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof xfs: factor out a xfs_file_write_zero_eof helper iomap: move locking out of iomap_write_delalloc_release iomap: remove iomap_file_buffered_write_punch_delalloc iomap: factor out a iomap_last_written_block helper xfs: fix integer overflow in xrep_bmap
2024-10-18Merge tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly fixes, msm and xe are the two main ones, with a bunch of scattered fixes including a largish revert in mgag200, then amdgpu, vmwgfx and scattering of other minor ones. All seems pretty regular. msm: - Display: - move CRTC resource assignment to atomic_check otherwise to make consecutive calls to atomic_check() consistent - fix rounding / sign-extension issues with pclk calculation in case of DSC - cleanups to drop incorrect null checks in dpu snapshots - fix to use kvzalloc in dpu snapshot to avoid allocation issues in heavily loaded system cases - Fix to not program merge_3d block if dual LM is not being used - Fix to not flush merge_3d block if its not enabled otherwise this leads to false timeouts - GPU: - a7xx: add a fence wait before SMMU table update xe: - New workaround to Xe2 (Aradhya) - Fix unbalanced rpm put (Matthew Auld) - Remove fragile lock optimization (Matthew Brost) - Fix job release, delegating it to the drm scheduler (Matthew Brost) - Fix timestamp bit width for Xe2 (Lucas) - Fix external BO's dma-resv usag (Matthew Brost) - Fix returning success for timeout in wait_token (Nirmoy) - Initialize fence to avoid it being detected as signaled (Matthew Auld) - Improve cache flush for BMG (Matthew Auld) - Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka) amdgpu: - SR-IOV fix - CS chunk handling fix - MES fixes - SMU13 fixes amdkfd: - VRAM usage reporting fix radeon: - Fix possible_clones handling i915: - Two DP bandwidth related MST fixes ast: - Clear EDID on unplugged connectors host1x: - Fix boot on Tegra186 - Set DMA parameters mgag200: - Revert VBLANK support panel: - himax-hx83192: Adjust power and gamma qaic: - Sgtable loop fixes vmwgfx: - Limit display layout allocatino size - Handle allocation errors in connector checks - Clean up KMS code for 2d-only setup - Report surface-check errors correctly - Remove NULL test around kvfree()" * tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel: (45 commits) drm/ast: vga: Clear EDID if no display is connected drm/ast: sil164: Clear EDID if no display is connected Revert "drm/mgag200: Add vblank support" drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater drm/xe/bmg: improve cache flushing behaviour drm/xe/xe_sync: initialise ufence.signalled drm/xe/ufence: ufence can be signaled right after wait_woken drm/xe: Use bookkeep slots for external BO's in exec IOCTL drm/xe/query: Increase timestamp width drm/xe: Don't free job in TDR drm/xe: Take job list lock in xe_sched_add_pending_job drm/xe: fix unbalanced rpm put() with declare_wedged() drm/xe: fix unbalanced rpm put() with fence_fini() drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation drm/msm/a6xx+: Insert a fence wait before SMMU table update drm/msm/dpu: don't always program merge_3d block drm/msm/dpu: Don't always set merge_3d pending flush ...
2024-10-17locking/lockdep: Avoid creating new name string literals in ↵Ahmed Ehab
lockdep_set_subclass() Syzbot reports a problem that a warning will be triggered while searching a lock class in look_up_lock_class(). The cause of the issue is that a new name is created and used by lockdep_set_subclass() instead of using the existing one. This results in a lock instance has a different name pointer than previous registered one stored in lock class, and WARN_ONCE() is triggered because of that in look_up_lock_class(). To fix this, change lockdep_set_subclass() to use the existing name instead of a new one. Hence, no new name will be created by lockdep_set_subclass(). Hence, the warning is avoided. [boqun: Reword the commit log to state the correct issue] Reported-by: <syzbot+7f4a6f7f7051474e40ad@syzkaller.appspotmail.com> Fixes: de8f5e4f2dc1f ("lockdep: Introduce wait-type checks") Cc: stable@vger.kernel.org Signed-off-by: Ahmed Ehab <bottaawesome633@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/lkml/20240824221031.7751-1-bottaawesome633@gmail.com/
2024-10-17lockdep: Add lockdep_cleanup_dead_cpu()David Woodhouse
Add a function to check that an offline CPU has left the tracing infrastructure in a sane state. Commit 9bb69ba4c177 ("ACPI: processor_idle: use raw_safe_halt() in acpi_idle_play_dead()") fixed an issue where the acpi_idle_play_dead() function called safe_halt() instead of raw_safe_halt(), which had the side-effect of setting the hardirqs_enabled flag for the offline CPU. On x86 this triggered warnings from lockdep_assert_irqs_disabled() when the CPU was brought back online again later. These warnings were too early for the exception to be handled correctly, leading to a triple-fault. Add lockdep_cleanup_dead_cpu() to check for this kind of failure mode, print the events leading up to it, and correct it so that the CPU can come online again correctly. Re-introducing the original bug now merely results in this warning instead: [ 61.556652] smpboot: CPU 1 is now offline [ 61.556769] CPU 1 left hardirqs enabled! [ 61.556915] irq event stamp: 128149 [ 61.556965] hardirqs last enabled at (128149): [<ffffffff81720a36>] acpi_idle_play_dead+0x46/0x70 [ 61.557055] hardirqs last disabled at (128148): [<ffffffff81124d50>] do_idle+0x90/0xe0 [ 61.557117] softirqs last enabled at (128078): [<ffffffff81cec74c>] __do_softirq+0x31c/0x423 [ 61.557199] softirqs last disabled at (128065): [<ffffffff810baae1>] __irq_exit_rcu+0x91/0x100 [boqun: Capitalize the title and reword the message a bit] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/f7bd2b3b999051bb3ef4be34526a9262008285f5.camel@infradead.org
2024-10-17Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "28 hotfixes. 13 are cc:stable. 23 are MM. It is the usual shower of unrelated singletons - please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits) maple_tree: add regression test for spanning store bug maple_tree: correct tree corruption on spanning store mm/mglru: only clear kswapd_failures if reclaimable mm/swapfile: skip HugeTLB pages for unuse_vma selftests: mm: fix the incorrect usage() info of khugepaged MAINTAINERS: add Jann as memory mapping/VMA reviewer mm: swap: prevent possible data-race in __try_to_reclaim_swap mm: khugepaged: fix the incorrect statistics when collapsing large file folios MAINTAINERS: kasan, kcov: add bugzilla links mm: don't install PMD mappings when THPs are disabled by the hw/process/vma mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw() Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs Docs/damon/maintainer-profile: add missing '_' suffixes for external web links maple_tree: check for MA_STATE_BULK on setting wr_rebalance mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets() mm: remove unused stub for can_swapin_thp() mailmap: add an entry for Andy Chiu MAINTAINERS: add memory mapping/VMA co-maintainers fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization ...
2024-10-18Merge tag 'drm-misc-fixes-2024-10-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ast: - Clear EDID on unplugged connectors host1x: - Fix boot on Tegra186 - Set DMA parameters mgag200: - Revert VBLANK support panel: - himax-hx83192: Adjust power and gamma qaic: - Sgtable loop fixes vmwgfx: - Limit display layout allocatino size - Handle allocation errors in connector checks - Clean up KMS code for 2d-only setup - Report surface-check errors correctly - Remove NULL test around kvfree() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241017115516.GA196624@linux.fritz.box
2024-10-17power: supply: hwmon: move interface to private headerThomas Weißschuh
The interface of power_supply_hwmon.c is only meant to be used by the psy core. Remove it from the public header file and use the private one instead. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241017-power-supply-cleanups-v2-1-cb0f5deab088@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-17clk: divider: Introduce CLK_DIVIDER_EVEN_INTEGERS flagThéo Lebrun
Add CLK_DIVIDER_EVEN_INTEGERS flag to support divisor of 2, 4, 6, etc. The same divisor can be done using a table, which would be big and wasteful for a clock dividor of width 8 (256 entries). Require increasing flags size from u8 to u16 because CLK_DIVIDER_EVEN_INTEGERS is the eighth flag. u16 is used inside struct clk_divider; `unsigned long` is used for function arguments. Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Link: https://lore.kernel.org/r/20241007-mbly-clk-v5-3-e9d8994269cb@bootlin.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-10-17ASoC: SDCA: add initial modulePierre-Louis Bossart
Add new module for SDCA (SoundWire Device Class for Audio) support. For now just add a parser to identify the SDCA revision and the function mask. Note that the SDCA definitions and related MIPI DisCo properties are defined only for ACPI platforms and extracted with _DSD helpers. There is currently no support for Device Tree in the specification, the 'depends on ACPI' reflects this design limitation. This might change in a future revision of the specification but for SDCA 1.0 ACPI is the only supported type of platform firmware. The SDCA library is defined with static inline fallbacks, which will allow for unconditional addition of SDCA support in common parts of the code. The design follows a four-step process: 1) Basic information related to Functions is extracted from MIPI DisCo tables and stored in the 'struct sdw_slave'. Devm_ based memory allocation is not allowed at this point prior to a driver probe, so we only store the function node, address and type. 2) When a codec driver probes, it will register subdevices for each Function identified in phase 1) 3) a driver will probe for each subdevice and addition parsing/memory allocation takes place at this level. devm_ based allocation is highly encouraged to make error handling manageable. 4) Before the peripheral device becomes physically attached, register access is not permitted and the regmaps are cache-only. When peripheral device is enumerated, the bus level uses the 'update_status' notification; after optional device-level initialization, the codec driver will notify each of the subdevices so that they can start interacting with the hardware. Note that the context extracted in 1) should be arguably be handled completely in the codec driver probe. That would however make it difficult to use the ACPI information for machine quirks, and e.g. select different machine driver and topologies as done for the RT712_VB handling later in the series. To make the implementation of quirks simpler, this patchset extracts a minimal amount of context (interface revision and number/type of Functions) before the codec driver probe, and stores this context in the scope of the 'struct sdw_slave'. The SDCA library can also be used in a vendor-specific driver without creating subdevices, e.g. to retrieve the 'initialization-table' values to write platform-specific values as needed. For more technical details, the SDCA specification is available for public downloads at https://www.mipi.org/mipi-sdca-v1-0-download Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016102333.294448-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-17ASoC/soundwire: remove sdw_slave_extended_idPierre-Louis Bossart
This structure is used to copy information from the 'sdw_slave' structures, it's better to create a flexible array of 'sdw_slave' pointers and directly access the information. This will also help access additional information stored in the 'sdw_slave' structure, such as an SDCA context. This patch does not add new functionality, it only modified how the information is retrieved. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016102333.294448-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-17soundwire: sdw_intel: include linux/acpi.hBard Liao
For the acpi_handle stuff. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Link: https://patch.msgid.link/20241016102333.294448-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-17Merge tag 'sound-6.12-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, nothing really stands out: - Usual HD-audio quirks / device-specific fixes - Kconfig dependency fix for UM - A series of minor fixes for SoundWire - Updates of USB-audio LINE6 contact address" * tag 'sound-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2 ALSA/hda: intel-sdw-acpi: add support for sdw-manager-list property read ALSA/hda: intel-sdw-acpi: simplify sdw-master-count property read ALSA/hda: intel-sdw-acpi: fetch fwnode once in sdw_intel_scan_controller() ALSA/hda: intel-sdw-acpi: cleanup sdw_intel_scan_controller ALSA: hda/tas2781: Add new quirk for Lenovo, ASUS, Dell projects ALSA: scarlett2: Add error check after retrieving PEQ filter values ALSA: hda/cs8409: Fix possible NULL dereference sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML ALSA: line6: update contact information ALSA: usb-audio: Fix NULL pointer deref in snd_usb_power_domain_set() ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2 ALSA: hda: Sound support for HP Spectre x360 16 inch model 2024
2024-10-17Merge tag 'net-6.12-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Current release - new code bugs: - eth: mlx5: HWS, don't destroy more bwc queue locks than allocated Previous releases - regressions: - ipv4: give an IPv4 dev to blackhole_netdev - udp: compute L4 checksum as usual when not segmenting the skb - tcp/dccp: don't use timer_pending() in reqsk_queue_unlink(). - eth: mlx5e: don't call cleanup on profile rollback failure - eth: microchip: vcap api: fix memory leaks in vcap_api_encode_rule_test() - eth: enetc: disable Tx BD rings after they are empty - eth: macb: avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY Previous releases - always broken: - posix-clock: fix missing timespec64 check in pc_clock_settime() - genetlink: hold RCU in genlmsg_mcast() - mptcp: prevent MPC handshake on port-based signal endpoints - eth: vmxnet3: fix packet corruption in vmxnet3_xdp_xmit_frame - eth: stmmac: dwmac-tegra: fix link bring-up sequence - eth: bcmasp: fix potential memory leak in bcmasp_xmit() Misc: - add Andrew Lunn as a co-maintainer of all networking drivers" * tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net/mlx5e: Don't call cleanup on profile rollback failure net/mlx5: Unregister notifier on eswitch init failure net/mlx5: Fix command bitmask initialization net/mlx5: Check for invalid vector index on EQ creation net/mlx5: HWS, use lock classes for bwc locks net/mlx5: HWS, don't destroy more bwc queue locks than allocated net/mlx5: HWS, fixed double free in error flow of definer layout net/mlx5: HWS, removed wrong access to a number of rules variable mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame net: dsa: vsc73xx: fix reception from VLAN-unaware bridges net: ravb: Only advertise Rx/Tx timestamps if hardware supports it net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test() net: phy: mdio-bcm-unimac: Add BCM6846 support dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio udp: Compute L4 checksum as usual when not segmenting the skb genetlink: hold RCU in genlmsg_mcast() net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361 tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink(). ...
2024-10-17Merge branch 'linus' into sched/urgent, to resolve conflictIngo Molnar
Conflicts: kernel/sched/ext.c There's a context conflict between this upstream commit: 3fdb9ebcec10 sched_ext: Start schedulers with consistent p->scx.slice values ... and this fix in sched/urgent: 98442f0ccd82 sched: Fix delayed_dequeue vs switched_from_fair() Resolve it. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-10-17mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()Kefeng Wang
Patch series "mm: don't install PMD mappings when THPs are disabled by the hw/process/vma". During testing, it was found that we can get PMD mappings in processes where THP (and more precisely, PMD mappings) are supposed to be disabled. While it works as expected for anon+shmem, the pagecache is the problematic bit. For s390 KVM this currently means that a VM backed by a file located on filesystem with large folio support can crash when KVM tries accessing the problematic page, because the readahead logic might decide to use a PMD-sized THP and faulting it into the page tables will install a PMD mapping, something that s390 KVM cannot tolerate. This might also be a problem with HW that does not support PMD mappings, but I did not try reproducing it. Fix it by respecting the ways to disable THPs when deciding whether we can install a PMD mapping. khugepaged should already be taking care of not collapsing if THPs are effectively disabled for the hw/process/vma. This patch (of 2): Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by shmem_allowable_huge_orders() and __thp_vma_allowable_orders(). [david@redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ] Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Leo Fu <bfu@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Cc: Boqiao Fu <bfu@redhat.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on certain builds.Sebastian Andrzej Siewior
Arnd reported a build failure due to the BUILD_BUG_ON() statement in alloc_kmem_cache_cpus(). The test PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu) The factors that increase the right side of the equation: - PAGE_SIZE > 4KiB increases KMALLOC_SHIFT_HIGH - For the local_lock_t in kmem_cache_cpu: - PREEMPT_RT adds an actual lock. - LOCKDEP increases the size of the lock. - LOCK_STAT adds additional bytes plus padding to the lockdep structure. The net difference with and without PREEMPT_RT is 88 bytes for the lock_lock_t, 96 bytes for kmem_cache_cpu due to additional padding. This is enough to exceed the 80KiB limit with 16KiB page size - the 8KiB page size is fine. Increase PERCPU_DYNAMIC_SIZE_SHIFT to 13 on configs with PAGE_SIZE larger than 4KiB and LOCKDEP enabled. Link: https://lkml.kernel.org/r/20241007143049.gyMpEu89@linutronix.de Fixes: d8fccd9ca5f9 ("arm64: Allow to enable PREEMPT_RT.") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410020326.iaZIteIx-lkp@intel.com/ Reported-by: Arnd Bergmann <arnd@kernel.org> Closes: https://lore.kernel.org/20241004095702.637528-1-arnd@kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17usb: typec: Add attribute file showing the USB Modes of the partnerHeikki Krogerus
This attribute file shows the supported USB modes (USB 2.0, USB 3.0 and USB4) of the partner, and the currently active mode. The active mode is determined primarily by checking the speed of the enumerated USB device. When USB Power Delivery is supported, the active USB mode should be always the mode that was used with the Enter_USB Message, regardless of the result of the USB enumeration. The port drivers can separately assign the mode with a dedicated API. If USB Power Delivery Identity is supplied for the partner device, the supported modes are extracted from it. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241016131834.898599-3-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17usb: typec: Add attribute file showing the supported USB modes of the portHeikki Krogerus
This attribute file, named "usb_capability", will show the supported USB modes, which are USB 2.0, USB 3.2 and USB4. These modes are defined in the USB Type-C (R2.0) and USB Power Delivery (R3.0 V2.0) Specifications. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241016131834.898599-2-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16power: supply: core: unexport power_supply_property_is_writeable()Thomas Weißschuh
Since commit ("power: supply: Drop use_cnt check from power_supply_property_is_writeable()"), this function does not check use_cnt anymore, making it unsuitable for general usage. As it is only used by the psy core anyways, remove it from the public header and unexport it to avoid misusage. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241005-power-supply-cleanups-v1-2-45303b2d0a4d@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-16genirq: Unexport nr_irqsBart Van Assche
Unexport nr_irqs and declare it static now that all code that reads or modifies nr_irqs has been converted to number_of_interrupts() / set_number_of_interrupts(). Change the type of 'nr_irqs' from 'int' into 'unsigned int' to match the return type and argument type of the irq_get_nr_iqs() / irq_set_nr_irqs() functions. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241015190953.1266194-23-bvanassche@acm.org
2024-10-16genirq: Switch to irq_get_nr_irqs()Bart Van Assche
Use the irq_get_nr_irqs() function instead of the global variable 'nr_irqs'. Cache the result of this function in a local variable in order not to rely on CSE (common subexpression elimination). Prepare for changing 'nr_irqs' from an exported global variable into a variable with file scope. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241015190953.1266194-22-bvanassche@acm.org
2024-10-16genirq: Introduce irq_get_nr_irqs() and irq_set_nr_irqs()Bart Van Assche
Prepare for changing 'nr_irqs' from an exported global variable into a variable with file scope. This will prevent accidental changes of assignments to a local variable 'nr_irqs' into assignments to the global 'nr_irqs' variable. Suppose that a patch would be submitted for review that removes a declaration of a local variable with the name 'nr_irqs' and that that patch does not remove all assignments to that local variable. Such a patch converts an assignment to a local variable into an assignment into a global variable. If the 'nr_irqs' assignment is more than three lines away from other changes, the assignment won't be included in the diff context lines and hence won't be visible without inspecting the modified file. With these abstraction series applied, such accidental conversions from assignments to a local variable into an assignment to a global variable are converted into a compilation error. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241015190953.1266194-2-bvanassche@acm.org
2024-10-16PCI: endpoint: Introduce pci_epc_mem_map()/unmap()Damien Le Moal
Some endpoint controllers have requirements on the alignment of the controller physical memory address that must be used to map a RC PCI address region. For instance, the endpoint controller of the RK3399 SoC uses at most the lower 20 bits of a physical memory address region as the lower bits of a RC PCI address region. For mapping a PCI address region of size bytes starting from pci_addr, the exact number of address bits used is the number of address bits changing in the address range [pci_addr..pci_addr + size - 1]. For this example, this creates the following constraints: 1) The offset into the controller physical memory allocated for a mapping depends on the mapping size *and* the starting PCI address for the mapping. 2) A mapping size cannot exceed the controller windows size (1MB) minus the offset needed into the allocated physical memory, which can end up being a smaller size than the desired mapping size. Handling these constraints independently of the controller being used in an endpoint function driver is not possible with the current EPC API as only the ->align field in struct pci_epc_features is provided but used for BAR (inbound ATU mappings) mapping only. A new API is needed for function drivers to discover mapping constraints and handle non-static requirements based on the RC PCI address range to access. Introduce the endpoint controller operation ->align_addr() to allow the EPC core functions to obtain the size and the offset into a controller address region that must be allocated and mapped to access a RC PCI address region. The size of the mapping provided by the align_addr() operation can then be used as the size argument for the function pci_epc_mem_alloc_addr() and the offset into the allocated controller memory provided can be used to correctly handle data transfers. For endpoint controllers that have PCI address alignment constraints, the align_addr() operation may indicate upon return an effective PCI address mapping size that is smaller (but not 0) than the requested PCI address region size. The controller ->align_addr() operation is optional: controllers that do not have any alignment constraints for mapping RC PCI address regions do not need to implement this operation. For such controllers, it is always assumed that the mapping size is equal to the requested size of the PCI region and that the mapping offset is 0. The function pci_epc_mem_map() is introduced to use this new controller operation (if it is defined) to handle controller memory allocation and mapping to a RC PCI address region in endpoint function drivers. This function first uses the ->align_addr() controller operation to determine the controller memory address size (and offset into) needed for mapping an RC PCI address region. The result of this operation is used to allocate a controller physical memory region using pci_epc_mem_alloc_addr() and then to map that memory to the RC PCI address space with pci_epc_map_addr(). Since ->align_addr() () may indicate that not all of a RC PCI address region can be mapped, pci_epc_mem_map() may only partially map the RC PCI address region specified. It is the responsibility of the caller (an endpoint function driver) to handle such smaller mapping by repeatedly using pci_epc_mem_map() over the desried PCI address range. The counterpart of pci_epc_mem_map() to unmap and free a mapped controller memory address region is pci_epc_mem_unmap(). Both functions operate using the new struct pci_epc_map data structure. This new structure represents a mapping PCI address, mapping effective size, the size of the controller memory needed for the mapping as well as the physical and virtual CPU addresses of the mapping (phys_base and virt_base fields). For convenience, the physical and virtual CPU addresses within that mapping to use to access the target RC PCI address region are also provided (phys_addr and virt_addr fields). Endpoint function drivers can use struct pci_epc_map to access the mapped RC PCI address region using the ->virt_addr and ->pci_size fields. Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241012113246.95634-4-dlemoal@kernel.org [mani: squashed the patch that changed phy_addr_t to u64] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-10-16bpf: Prevent tailcall infinite loop caused by freplaceLeon Hwang
There is a potential infinite loop issue that can occur when using a combination of tail calls and freplace. In an upcoming selftest, the attach target for entry_freplace of tailcall_freplace.c is subprog_tc of tc_bpf2bpf.c, while the tail call in entry_freplace leads to entry_tc. This results in an infinite loop: entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc. The problem arises because the tail_call_cnt in entry_freplace resets to zero each time entry_freplace is executed, causing the tail call mechanism to never terminate, eventually leading to a kernel panic. To fix this issue, the solution is twofold: 1. Prevent updating a program extended by an freplace program to a prog_array map. 2. Prevent extending a program that is already part of a prog_array map with an freplace program. This ensures that: * If a program or its subprogram has been extended by an freplace program, it can no longer be updated to a prog_array map. * If a program has been added to a prog_array map, neither it nor its subprograms can be extended by an freplace program. Moreover, an extension program should not be tailcalled. As such, return -EINVAL if the program has a type of BPF_PROG_TYPE_EXT when adding it to a prog_array map. Additionally, fix a minor code style issue by replacing eight spaces with a tab for proper formatting. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20241015150207.70264-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-16fanotify: allow reporting errors on failure to open fdAmir Goldstein
When working in "fd mode", fanotify_read() needs to open an fd from a dentry to report event->fd to userspace. Opening an fd from dentry can fail for several reasons. For example, when tasks are gone and we try to open their /proc files or we try to open a WRONLY file like in sysfs or when trying to open a file that was deleted on the remote network server. Add a new flag FAN_REPORT_FD_ERROR for fanotify_init(). For a group with FAN_REPORT_FD_ERROR, we will send the event with the error instead of the open fd, otherwise userspace may not get the error at all. For an overflow event, we report -EBADF to avoid confusing FAN_NOFD with -EPERM. Similarly for pidfd open errors we report either -ESRCH or the open error instead of FAN_NOPIDFD and FAN_EPIDFD. In any case, userspace will not know which file failed to open, so add a debug print for further investigation. Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com> Link: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241003142922.111539-1-amir73il@gmail.com
2024-10-16fs: pass offset and result to backing_file end_write() callbackAmir Goldstein
This is needed for extending fuse inode size after fuse passthrough write. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Link: https://lore.kernel.org/linux-fsdevel/CAJfpegs=cvZ_NYy6Q_D42XhYS=Sjj5poM1b5TzXzOVvX=R36aA@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-10-16platform/chrome: Update EC feature flagsPavan Holla
Define EC_FEATURE_UCSI_PPM to enable usage of the cros_ec_ucsi driver. Also, add any feature flags that are implemented by the EC but are missing in the kernel header. Signed-off-by: Pavan Holla <pholla@chromium.org> Signed-off-by: Łukasz Bartosik <ukaszb@chromium.org> Acked-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20240910101527.603452-3-ukaszb@chromium.org Signed-off-by: Lee Jones <lee@kernel.org>
2024-10-16mfd: sec-core: Add support for the Samsung s2dos05Dzmitry Sankouski
S2DOS05 is a panel/touchscreen PMIC, often found in Samsung phones. We define regulator sub-device for which driver will be added in subsequent patch. The device also has ADC for power and current measurements. Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20240617-starqltechn_integration_upstream-v5-2-ea1109029ba5@gmail.com Signed-off-by: Lee Jones <lee@kernel.org>
2024-10-16mfd: max77693: Remove unused max77693_irq_source declarationsDzmitry Sankouski
Remove `enum max77693_irq_source` declaration because unused. Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20240617-starqltechn_integration_upstream-v5-1-125d9228d751@gmail.com Signed-off-by: Lee Jones <lee@kernel.org>
2024-10-16mfd: palmas: Constify strings with regulator namesKrzysztof Kozlowski
The names of regulators are static const strings, so pointers can be made as pointers to const for code safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240909134941.121847-1-krzysztof.kozlowski@linaro.org Signed-off-by: Lee Jones <lee@kernel.org>
2024-10-16iopoll/regmap/phy/snd: Fix comment referencing outdated timer documentationAnna-Maria Behnsen
Function descriptions in iopoll.h, regmap.h, phy.h and sound/soc/sof/ops.h copied all the same outdated documentation about sleep/delay function limitations. In those comments, the generic (and still outdated) timer documentation file is referenced. As proper function descriptions for used delay and sleep functions are in place, simply update the descriptions to reference to them. While at it fix missing colon after "Returns" in function description and move return value description to the end of the function description. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> # for phy.h Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-12-dc8b907cb62f@linutronix.de
2024-10-16timers: Adjust flseep() to reflect realityAnna-Maria Behnsen
fsleep() simply implements the recommendations of the outdated documentation in "Documentation/timers/timers-howto.rst". This should be a user friendly interface to choose always the best timeout function approach: - udelay() for very short sleep durations shorter than 10 microseconds - usleep_range() for sleep durations until 20 milliseconds - msleep() for the others The actual implementation has several problems: - It does not take into account that HZ resolution also has an impact on granularity of jiffies and has also an impact on the granularity of the buckets of timer wheel levels. This means that accuracy for the timeout does not have an upper limit. When executing fsleep(20000) on a HZ=100 system, the possible additional slack will be 50% as the granularity of the buckets in the lowest level is 10 milliseconds. - The upper limit of usleep_range() is twice the requested timeout. When no other interrupts occur in this range, the maximum value is used. This means that the requested sleep length has then an additional delay of 100%. Change the thresholds for the decisions in fsleep() to make sure the maximum slack which is added to the sleep duration is 25%. Note: Outdated documentation will be updated in a followup patch. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-7-dc8b907cb62f@linutronix.de
2024-10-16timers: Update function descriptions of sleep/delay related functionsAnna-Maria Behnsen
A lot of commonly used functions for inserting a sleep or delay lack a proper function description. Add function descriptions to all of them to have important information in a central place close to the code. No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-5-dc8b907cb62f@linutronix.de
2024-10-16timers: Rename usleep_idle_range() to usleep_range_idle()Anna-Maria Behnsen
usleep_idle_range() is a variant of usleep_range(). Both are using usleep_range_state() as a base. To be able to find all the related functions in one go, rename it usleep_idle_range() to usleep_range_idle(). No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: SeongJae Park <sj@kernel.org> Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-4-dc8b907cb62f@linutronix.de
2024-10-15ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_valueMasami Hiramatsu (Google)
Rename ftrace_regs_return_value to ftrace_regs_get_return_value as same as other ftrace_regs_get/set_* APIs. arm64 and riscv are already using this new name. Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Florent Revest <revest@chromium.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/172895573350.107311.7564634260652361511.stgit@devnote2 Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macrosMasami Hiramatsu (Google)
Since the arch_ftrace_get_regs(fregs) is only valid when the FL_SAVE_REGS is set, we need to use `&arch_ftrace_regs()->regs` for ftrace_regs_*() APIs because those APIs are for ftrace_regs, not complete pt_regs. Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Florent Revest <revest@chromium.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/172895572290.107311.16057631001860177198.stgit@devnote2 Fixes: e4cf33ca4812 ("ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15power: supply: core: remove {,devm_}power_supply_register_no_ws()Thomas Weißschuh
The same functionality is available through power_supply_config::no_wakeup_source, which is more idiomatic. All users of the old API have been converted. Also remove the argument "ws" from __power_supply_register(), as it is now always "true". Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20241005-power-supply-no-wakeup-source-v1-8-1d62bf9bcb1d@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15power: supply: core: add wakeup source inhibit by power_supply_configThomas Weißschuh
To inhibit wakeup users currently have to use dedicated functions to register the power supply: {,devm_}power_supply_register_no_ws(). This is inconsistent to other runtime settings which can be configured through struct power_supply_config. It's also not obvious what _no_ws() is meant to mean. Extend power_supply_config to also be able to inhibit the wakeup source. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20241005-power-supply-no-wakeup-source-v1-1-1d62bf9bcb1d@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15power: supply: core: constify power_supply_battery_info::ocv_tableThomas Weißschuh
The power supply core never modifies the ocv table. Reflect this in the API, so drivers can mark their static tables as const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20241005-power-supply-battery-const-v1-5-c1f721927048@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15power: supply: core: constify power_supply_battery_info::resist_tableThomas Weißschuh
The power supply core never modifies the resist table. Reflect this in the API, so drivers can mark their static tables as const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20241005-power-supply-battery-const-v1-1-c1f721927048@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15tpm: fix unsigned/signed mismatch errors related to __calc_tpm2_event_sizeGregory Price
__calc_tpm2_event_size returns 0 or a positive length, but return values are often interpreted as ints. Convert everything over to u32 to avoid signed/unsigned logic errors. Signed-off-by: Gregory Price <gourry@gourry.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-10-15Merge tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: - New metadata version inode_has_child_snapshots This fixes bugs with handling of unlinked inodes + snapshots, in particular when an inode is reattached after taking a snapshot; deleted inodes now get correctly cleaned up across snapshots. - Disk accounting rewrite fixes - validation fixes for when a device has been removed - fix journal replay failing with "journal_reclaim_would_deadlock" - Some more small fixes for erasure coding + device removal - Assorted small syzbot fixes * tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs: (27 commits) bcachefs: Fix sysfs warning in fstests generic/730,731 bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev bcachefs: Fix kasan splat in new_stripe_alloc_buckets() bcachefs: Add missing validation for bch_stripe.csum_granularity_bits bcachefs: Fix missing bounds checks in bch2_alloc_read() bcachefs: fix uaf in bch2_dio_write_done() bcachefs: Improve check_snapshot_exists() bcachefs: Fix bkey_nocow_lock() bcachefs: Fix accounting replay flags bcachefs: Fix invalid shift in member_to_text() bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry bcachefs: Check if stuck in journal_res_get() closures: Add closure_wait_event_timeout() bcachefs: Fix state lock involved deadlock bcachefs: Fix NULL pointer dereference in bch2_opt_to_text bcachefs: Release transaction before wake up bcachefs: add check for btree id against max in try read node bcachefs: Disk accounting device validation fixes bcachefs: bch2_inode_or_descendents_is_open() ...
2024-10-15gpu: host1x: Set up device DMA parametersThierry Reding
In order to store device DMA parameters, the DMA framework depends on the device's dma_parms field to point at a valid memory location. Add backing storage for this in struct host1x_memory_context and point to it. Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240916133320.368620-1-thierry.reding@gmail.com (cherry picked from commit b4ad4ef374d66cc8df3188bb1ddb65bce5fc9e50) Signed-off-by: Thierry Reding <treding@nvidia.com>