summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-02-19iio: add the IIO backend frameworkNuno Sa
This is a Framework to handle complex IIO aggregate devices. The typical architecture is to have one device as the frontend device which can be "linked" against one or multiple backend devices. All the IIO and userspace interface is expected to be registers/managed by the frontend device which will callback into the backends when needed (to get/set some configuration that it does not directly control). The basic framework interface is pretty simple: - Backends should register themselves with @devm_iio_backend_register() - Frontend devices should get backends with @devm_iio_backend_get() Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240210-iio-backend-v11-5-f5242a5fb42a@analog.com Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-19iio: buffer-dmaengine: export buffer alloc and free functionsNuno Sa
Export iio_dmaengine_buffer_free() and iio_dmaengine_buffer_alloc(). This is in preparation of introducing IIO backends support. This will allow us to allocate a buffer and control it's lifetime from a device different from the one holding the DMA firmware properties. Effectively, in this case the struct device holding the firmware information about the DMA channels is not the same as iio_dev->dev.parent (typical case). While at it, namespace the buffer-dmaengine exports and update the current user of these buffers. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240210-iio-backend-v11-4-f5242a5fb42a@analog.com Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-19x86/resctrl: Separate arch and fs resctrl locksJames Morse
resctrl has one mutex that is taken by the architecture-specific code, and the filesystem parts. The two interact via cpuhp, where the architecture code updates the domain list. Filesystem handlers that walk the domains list should not run concurrently with the cpuhp callback modifying the list. Exposing a lock from the filesystem code means the interface is not cleanly defined, and creates the possibility of cross-architecture lock ordering headaches. The interaction only exists so that certain filesystem paths are serialised against CPU hotplug. The CPU hotplug code already has a mechanism to do this using cpus_read_lock(). MPAM's monitors have an overflow interrupt, so it needs to be possible to walk the domains list in irq context. RCU is ideal for this, but some paths need to be able to sleep to allocate memory. Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp callback, cpus_read_lock() must always be taken first. rdtgroup_schemata_write() already does this. Most of the filesystem code's domain list walkers are currently protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are rdt_bit_usage_show() and the mon_config helpers which take the lock directly. Make the domain list protected by RCU. An architecture-specific lock prevents concurrent writers. rdt_bit_usage_show() could walk the domain list using RCU, but to keep all the filesystem operations the same, this is changed to call cpus_read_lock(). The mon_config helpers send multiple IPIs, take the cpus_read_lock() in these cases. The other filesystem list walkers need to be able to sleep. Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't be invoked when file system operations are occurring. Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live() call isn't obvious. Resctrl's domain online/offline calls now need to take the rdtgroup_mutex themselves. [ bp: Fold in a build fix: https://lore.kernel.org/r/87zfvwieli.ffs@tglx ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-25-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-19smp: Make __smp_processor_id() 0-argument macroAlexey Dobriyan
smp_processor_id family of macros never accepted any arguments. #define __smp_processor_id(x) works by accident (see C99 6.10.3 §4). __smp_processor_id() gets 1 (empty) argument and passes it down to raw_smp_processor_id() which doesn't accept arguments. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/0037d1f2-8153-4b33-b43e-f4b6ecd710ac@p183
2024-02-19locking: Add rwsem_assert_held() and rwsem_assert_held_write()Matthew Wilcox (Oracle)
Modelled after lockdep_assert_held() and lockdep_assert_held_write(), but are always active, even when lockdep is disabled. Of course, they don't test that _this_ thread is the owner, but it's sufficient to catch many bugs and doesn't incur the same performance penalty as lockdep. Acked-by: "Peter Zijlstra (Intel)" <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Acked-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-19pwm: lpss-*: Make use of devm_pwmchip_alloc() functionUwe Kleine-König
This prepares the pwm-lpss drivers to further changes of the pwm core outlined in the commit introducing devm_pwmchip_alloc(). There is no intended semantical change and the driver should behave as before. Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/b567ab5dd992e361eb884fa6c2cac11be9c7dde3.1707900770.git.u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
2024-02-19jiffies: Transform comment about time_* functions into DOC blockAnna-Maria Behnsen
This general note about time_* functions is also useful to be available in kernel documentation. Therefore transform it into a kernel-doc DOC block with proper formatting. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-6-anna-maria@linutronix.de
2024-02-19hrtimers: Update formatting of documentationAnna-Maria Behnsen
Documentation of functions lacks the annotations which are used by kernel-doc and *.rst to make appearance in rendered documents more user-friendly. Use those annotations to improve user-friendliness. While at it prevent duplication of comments and use a reference instead. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-3-anna-maria@linutronix.de
2024-02-19hrtimers: Move hrtimer base related definitions into hrtimer_defs.hAnna-Maria Behnsen
hrtimer base related struct definitions are part of hrtimers.h as it is required there. With this, also the struct documentation which is for core code internal use, is exposed into the general api. To prevent this, move all core internal definitions and the related includes into hrtimer_defs.h. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240123164702.55612-2-anna-maria@linutronix.de
2024-02-19Merge 6.8-rc5 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19Merge 6.8-rc5 into tty-nextGreg Kroah-Hartman
We need the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19sysfs: Introduce a mechanism to hide static attribute_groupsDan Williams
Add a mechanism for named attribute_groups to hide their directory at sysfs_update_group() time, or otherwise skip emitting the group directory when the group is first registered. It piggybacks on is_visible() in a similar manner as SYSFS_PREALLOC, i.e. special flags in the upper bits of the returned mode. To use it, specify a symbol prefix to DEFINE_SYSFS_GROUP_VISIBLE(), and then pass that same prefix to SYSFS_GROUP_VISIBLE() when assigning the @is_visible() callback: DEFINE_SYSFS_GROUP_VISIBLE($prefix) struct attribute_group $prefix_group = { .name = $name, .is_visible = SYSFS_GROUP_VISIBLE($prefix), }; SYSFS_GROUP_VISIBLE() expects a definition of $prefix_group_visible() and $prefix_attr_visible(), where $prefix_group_visible() just returns true / false and $prefix_attr_visible() behaves as normal. The motivation for this capability is to centralize PCI device authentication in the PCI core with a named sysfs group while keeping that group hidden for devices and platforms that do not meet the requirements. In a PCI topology, most devices will not support authentication, a small subset will support just PCI CMA (Component Measurement and Authentication), a smaller subset will support PCI CMA + PCIe IDE (Link Integrity and Encryption), and only next generation server hosts will start to include a platform TSM (TEE Security Manager). Without this capability the alternatives are: * Check if all attributes are invisible and if so, hide the directory. Beyond trouble getting this to work [1], this is an ABI change for scenarios if userspace happens to depend on group visibility absent any attributes. I.e. this new capability avoids regression since it does not retroactively apply to existing cases. * Publish an empty /sys/bus/pci/devices/$pdev/tsm/ directory for all PCI devices (i.e. for the case when TSM platform support is present, but device support is absent). Unfortunate that this will be a vestigial empty directory in the vast majority of cases. * Reintroduce usage of runtime calls to sysfs_{create,remove}_group() in the PCI core. Bjorn has already indicated that he does not want to see any growth of pci_sysfs_init() [2]. * Drop the named group and simulate a directory by prefixing all TSM-related attributes with "tsm_". Unfortunate to not use the naming capability of a sysfs group as intended. In comparison, there is a small potential for regression if for some reason an @is_visible() callback had dependencies on how many times it was called. Additionally, it is no longer an error to update a group that does not have its directory already present, and it is no longer a WARN() to remove a group that was never visible. Link: https://lore.kernel.org/all/2024012321-envious-procedure-4a58@gregkh/ [1] Link: https://lore.kernel.org/linux-pci/20231019200110.GA1410324@bhelgaas/ [2] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/2024013028-deflator-flaring-ec62@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19Merge 6.8-rc5 into driver-core-nextGreg Kroah-Hartman
We need the driver core changes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-18tty: Don't include tty_buffer.h in tty.hIlpo Järvinen
There's no need to include linux/tty_buffer.h in linux/tty.h. Move the include into tty_buffer.c that is actually using it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240215111538.1920-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17gpio: constify opaque pointer "data" in gpio_device_find()Krzysztof Kozlowski
The opaque pointer "data" in each match function used by gpio_device_find() is a pointer to const, thus the same argument passed to gpio_device_find() can adjusted similarly. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-02-17net: phy: add PHY_EEE_CAP2_FEATURESHeiner Kallweit
As a prerequisite for adding EEE CAP2 register support, complement PHY_EEE_CAP1_FEATURES with PHY_EEE_CAP2_FEATURES. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17net: mdio: add helpers for accessing the EEE CAP2 registersHeiner Kallweit
This adds helpers for accessing the EEE CAP2 registers. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17Merge tag 'char-misc-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / miscdriver fixes from Greg KH: "Here is a small set of char/misc and IIO driver fixes for 6.8-rc5. Included in here are: - lots of iio driver fixes for reported issues - nvmem device naming fixup for reported problem - interconnect driver fixes for reported issues All of these have been in linux-next for a while with no reported the issues (the nvmem patch was included in a different branch in linux-next before sent to me for inclusion here)" * tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) nvmem: include bit index in cell sysfs file name iio: adc: ad4130: only set GPIO_CTRL if pin is unused iio: adc: ad4130: zero-initialize clock init data interconnect: qcom: x1e80100: Add missing ACV enable_mask interconnect: qcom: sm8650: Use correct ACV enable_mask iio: accel: bma400: Fix a compilation problem iio: commom: st_sensors: ensure proper DMA alignment iio: hid-sensor-als: Return 0 for HID_USAGE_SENSOR_TIME_TIMESTAMP iio: move LIGHT_UVA and LIGHT_UVB to the end of iio_modifier staging: iio: ad5933: fix type mismatch regression iio: humidity: hdc3020: fix temperature offset iio: adc: ad7091r8: Fix error code in ad7091r8_gpio_setup() iio: adc: ad_sigma_delta: ensure proper DMA alignment iio: imu: adis: ensure proper DMA alignment iio: humidity: hdc3020: Add Makefile, Kconfig and MAINTAINERS entry iio: imu: bno055: serdev requires REGMAP iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC iio: pressure: bmp280: Add missing bmp085 to SPI id table iio: core: fix memleak in iio_device_register_sysfs interconnect: qcom: sm8550: Enable sync_state ...
2024-02-17Merge tag 'tty-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are three small tty and serial driver fixes for 6.8-rc5: - revert a 8250_pci1xxxx off-by-one change that was incorrect - two changes to fix the transmit path of the mxs-auart driver, fixing a regression in the 6.2 release All of these have been in linux-next for over a week with no reported issues" * tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: mxs-auart: fix tx serial: core: introduce uart_port_tx_flags() serial: 8250_pci1xxxx: partially revert off by one patch
2024-02-17Merge tag 'usb-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are two small fixes for 6.8-rc5: - thunderbolt to fix a reported issue on many platforms - dwc3 driver revert of a commit that caused problems in -rc1 Both of these changes have been in linux-next for over a week with no reported issues" * tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "usb: dwc3: Support EBC feature of DWC_usb31" thunderbolt: Fix setting the CNS bit in ROUTER_CS_5
2024-02-17iio: hid-sensor-als: Add light chromaticity supportBasavaraj Natikar
On some platforms, ambient color sensors also support the x and y light colors, which represent the coordinates on the CIE 1931 chromaticity diagram. Add light chromaticity x and y. Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240205185926.3030521-5-srinivas.pandruvada@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17iio: hid-sensor-als: Add light color temperature supportBasavaraj Natikar
On some platforms, ambient color sensors also support light color temperature. Add support of light color temperature. Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240205185926.3030521-4-srinivas.pandruvada@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17iio: core: make iio_bus_type constRicardo B. Marliere
Now that the driver core can properly handle constant struct bus_type, move the iio_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Acked-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240208-bus_cleanup-iio-v1-1-4a167c3b5fb3@marliere.net Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17iio: locking: introduce __cleanup() based direct mode claiming infrastructureJonathan Cameron
Allows use of: iio_device_claim_direct_scoped(return -EBUSY, indio_dev) { } to automatically call iio_device_release_direct_mode() based on scope. Typically seen in combination with local device specific locks which are already have automated cleanup options via guard(mutex)(&st->lock) and scoped_guard(). Using both together allows most error handling to be automated. Reviewed-by: Nuno Sa <nuno.a@analog.com> Link: https://lore.kernel.org/r/20240128150537.44592-2-jic23@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-17usb: gadget: Support already-mapped DMA SGsPaul Cercueil
Add a new 'sg_was_mapped' field to the struct usb_request. This field can be used to indicate that the scatterlist associated to the USB transfer has already been mapped into the DMA space, and it does not have to be done internally. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20240130122340.54813-2-paul@crapouillou.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17kobject: make uevent_seqnum atomicEric Dumazet
We will soon no longer acquire uevent_sock_mutex for most kobject_uevent_net_broadcast() calls, and also while calling uevent_net_broadcast(). Make uevent_seqnum an atomic64_t to get its own protection. This fixes a race while reading /sys/kernel/uevent_seqnum. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christian Brauner <brauner@kernel.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20240214084829.684541-2-edumazet@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-16x86/numa: Fix the address overlap check in numa_fill_memblks()Alison Schofield
numa_fill_memblks() fills in the gaps in numa_meminfo memblks over a physical address range. To do so, it first creates a list of existing memblks that overlap that address range. The issue is that it is off by one when comparing to the end of the address range, so memblks that do not overlap are selected. The impact of selecting a memblk that does not actually overlap is that an existing memblk may be filled when the expected action is to do nothing and return NUMA_NO_MEMBLK to the caller. The caller can then add a new NUMA node and memblk. Replace the broken open-coded search for address overlap with the memblock helper memblock_addrs_overlap(). Update the kernel doc and in code comments. Suggested by: "Huang, Ying" <ying.huang@intel.com> Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/10a3e6109c34c21a8dd4c513cf63df63481a2b07.1705085543.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-17crypto: hisilicon/qm - change function type to voidWeili Qian
The function qm_stop_qp_nolock() always return zero, so function type is changed to void. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17crypto: hisilicon/qm - obtain stop queue statusWeili Qian
The debugfs files 'dev_state' and 'dev_timeout' are added. Users can query the current queue stop status through these two files. And set the waiting timeout when the queue is released. dev_state: if dev_timeout is set, dev_state indicates the status of stopping the queue. 0 indicates that the queue is stopped successfully. Other values indicate that the queue stops fail. If dev_timeout is not set, the value of dev_state is 0; dev_timeout: if the queue fails to stop, the queue is released after waiting dev_timeout * 20ms. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17crypto: hisilicon/qm - add stop function by hardwareWeili Qian
Hardware V3 could be able to drain function by sending mailbox to hardware which will trigger tasks in device to be flushed out. When the function is reset, the function can be stopped by this way. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17firmware: coreboot: Generate aliases for coreboot modulesNícolas F. R. A. Prado
Generate aliases for coreboot modules to allow automatic module probing. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240212-coreboot-mod-defconfig-v4-2-d14172676f6d@collabora.com Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
2024-02-16Merge tag 'block-6.8-2024-02-16' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Just an nvme pull request via Keith: - Fabrics connection error handling (Chaitanya) - Use relaxed effects to reduce unnecessary queue freezes (Keith)" * tag 'block-6.8-2024-02-16' of git://git.kernel.dk/linux: nvmet: remove superfluous initialization nvme: implement support for relaxed effects nvme-fabrics: fix I/O connect error handling
2024-02-16Merge tag 'trace-v6.8-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix the #ifndef that didn't have the 'CONFIG_' prefix on HAVE_DYNAMIC_FTRACE_WITH_REGS The fix to have dynamic trampolines work with x86 broke arm64 as the config used in the #ifdef was HAVE_DYNAMIC_FTRACE_WITH_REGS and not CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which removed the fix that the previous fix was to fix. - Fix tracing_on state The code to test if "tracing_on" is set incorrectly used ring_buffer_record_is_on() which returns false if the ring buffer isn't able to be written to. But the ring buffer disable has several bits that disable it. One is internal disabling which is used for resizing and other modifications of the ring buffer. But the "tracing_on" user space visible flag should only report if tracing is actually on and not internally disabled, as this can cause confusion as writing "1" when it is disabled will not enable it. Instead use ring_buffer_record_is_set_on() which shows the user space visible settings. - Fix a false positive kmemleak on saved cmdlines Now that the saved_cmdlines structure is allocated via alloc_page() and not via kmalloc() it has become invisible to kmemleak. The allocation done to one of its pointers was flagged as a dangling allocation leak. Make kmemleak aware of this allocation and free. - Fix synthetic event dynamic strings An update that cleaned up the synthetic event code removed the return value of trace_string(), and had it return zero instead of the length, causing dynamic strings in the synthetic event to always have zero size. - Clean up documentation and header files for seq_buf * tag 'trace-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: seq_buf: Fix kernel documentation seq_buf: Don't use "proxy" headers tracing/synthetic: Fix trace_string() return value tracing: Inform kmemleak of saved_cmdlines allocation tracing: Use ring_buffer_record_is_set_on() in tracer_tracing_is_on() tracing: Fix HAVE_DYNAMIC_FTRACE_WITH_REGS ifdef
2024-02-16x86/resctrl: Add CPU offline callback for resctrl workJames Morse
The resctrl architecture specific code may need to free a domain when a CPU goes offline, it also needs to reset the CPUs PQR_ASSOC register. Amongst other things, the resctrl filesystem code needs to clear this CPU from the cpu_mask of any control and monitor groups. Currently, this is all done in core.c and called from resctrl_offline_cpu(), making the split between architecture and filesystem code unclear. Move the filesystem work to remove the CPU from the control and monitor groups into a filesystem helper called resctrl_offline_cpu(), and rename the one in core.c resctrl_arch_offline_cpu(). Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-23-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPUJames Morse
When a CPU is taken offline resctrl may need to move the overflow or limbo handlers to run on a different CPU. Once the offline callbacks have been split, cqm_setup_limbo_handler() will be called while the CPU that is going offline is still present in the CPU mask. Pass the CPU to exclude to cqm_setup_limbo_handler() and mbm_setup_overflow_handler(). These functions can use a variant of cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need excluding. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-22-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16x86/resctrl: Add CPU online callback for resctrl workJames Morse
The resctrl architecture specific code may need to create a domain when a CPU comes online, it also needs to reset the CPUs PQR_ASSOC register. The resctrl filesystem code needs to update the rdtgroup_default CPU mask when CPUs are brought online. Currently, this is all done in one function, resctrl_online_cpu(). It will need to be split into architecture and filesystem parts before resctrl can be moved to /fs/. Pull the rdtgroup_default update work out as a filesystem specific cpu_online helper. resctrl_online_cpu() is the obvious name for this, which means the version in core.c needs renaming. resctrl_online_cpu() is called by the arch code once it has done the work to add the new CPU to any domains. In future patches, resctrl_online_cpu() will take the rdtgroup_mutex itself. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-21-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()James Morse
Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter value. Allocating a contended resource may involve sleeping. __check_limbo() and mon_event_count() each make multiple calls to resctrl_arch_rmid_read(), to avoid extra work on contended systems, the allocation should be valid for multiple invocations of resctrl_arch_rmid_read(). The memory or hardware allocated is not specific to a domain. Add arch hooks for this allocation, which need calling before resctrl_arch_rmid_read(). The allocated monitor is passed to resctrl_arch_rmid_read(), then freed again afterwards. The helper can be called on any CPU, and can sleep. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-16-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16x86/resctrl: Allow resctrl_arch_rmid_read() to sleepJames Morse
MPAM's cache occupancy counters can take a little while to settle once the monitor has been configured. The maximum settling time is described to the driver via a firmware table. The value could be large enough that it makes sense to sleep. To avoid exposing this to resctrl, it should be hidden behind MPAM's resctrl_arch_rmid_read(). resctrl_arch_rmid_read() may be called via IPI meaning it is unable to sleep. In this case, it should return an error if it needs to sleep. This will only affect MPAM platforms where the cache occupancy counter isn't available immediately, nohz_full is in use, and there are no housekeeping CPUs in the necessary domain. There are three callers of resctrl_arch_rmid_read(): __mon_event_count() and __check_limbo() are both called from a non-migrateable context. mon_event_read() invokes __mon_event_count() using smp_call_on_cpu(), which adds work to the target CPUs workqueue. rdtgroup_mutex() is held, meaning this cannot race with the resctrl cpuhp callback. __check_limbo() is invoked via schedule_delayed_work_on() also adds work to a per-cpu workqueue. The remaining call is add_rmid_to_limbo() which is called in response to a user-space syscall that frees an RMID. This opportunistically reads the LLC occupancy counter on the current domain to see if the RMID is over the dirty threshold. This has to disable preemption to avoid reading the wrong domain's value. Disabling preemption here prevents resctrl_arch_rmid_read() from sleeping. add_rmid_to_limbo() walks each domain, but only reads the counter on one domain. If the system has more than one domain, the RMID will always be added to the limbo list. If the RMIDs usage was not over the threshold, it will be removed from the list when __check_limbo() runs. Make this the default behaviour. Free RMIDs are always added to the limbo list for each domain. The user visible effect of this is that a clean RMID is not available for re-allocation immediately after 'rmdir()' completes. This behaviour was never portable as it never happened on a machine with multiple domains. Removing this path allows resctrl_arch_rmid_read() to sleep if its called with interrupts unmasked. Document this is the expected behaviour, and add a might_sleep() annotation to catch changes that won't work on arm64. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-15-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16x86/resctrl: Track the closid with the rmidJames Morse
x86's RMID are independent of the CLOSID. An RMID can be allocated, used and freed without considering the CLOSID. MPAM's equivalent feature is PMG, which is not an independent number, it extends the CLOSID/PARTID space. For MPAM, only PMG-bits worth of 'RMID' can be allocated for a single CLOSID. i.e. if there is 1 bit of PMG space, then each CLOSID can have two monitor groups. To allow resctrl to disambiguate RMID values for different CLOSID, everything in resctrl that keeps an RMID value needs to know the CLOSID too. This will always be ignored on x86. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-6-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdefJames Morse
tick_nohz_full_mask lists the CPUs that are nohz_full. This is only needed when CONFIG_NO_HZ_FULL is defined. tick_nohz_full_cpu() allows a specific CPU to be tested against the mask, and evaluates to false when CONFIG_NO_HZ_FULL is not defined. The resctrl code needs to pick a CPU to run some work on, a new helper prefers housekeeping CPUs by examining the tick_nohz_full_mask. Hiding the declaration behind #ifdef CONFIG_NO_HZ_FULL forces all the users to be behind an #ifdef too. Move the tick_nohz_full_mask declaration, this lets callers drop the #ifdef, and guard access to tick_nohz_full_mask with IS_ENABLED() or something like tick_nohz_full_cpu(). The definition does not need to be moved as any callers should be removed at compile time unless CONFIG_NO_HZ_FULL is defined. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> # for resctrl dependency Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Link: https://lore.kernel.org/r/20240213184438.16675-2-james.morse@arm.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2024-02-16Merge tag 'gpio-fixes-for-v6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - add missing stubs for functions that are not built with GPIOLIB disabled * tag 'gpio-fixes-for-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: add gpio_device_get_label() stub for !GPIOLIB gpiolib: add gpio_device_get_base() stub for !GPIOLIB gpiolib: add gpiod_to_gpio_device() stub for !GPIOLIB
2024-02-16PCI: endpoint: Drop only_64bit on reserved BARsNiklas Cassel
The definition of a reserved BAR is that EPF drivers should not touch them. The definition of only_64bit is that the EPF driver must configure this BAR as 64-bit. (An EPF driver is not allowed to choose if this BAR should be configured as 32-bit or 64-bit.) Thus, it does not make sense to put only_64bit of a BAR that EPF drivers are not allow to touch. Drop the only_64bit property from hardware descriptions that are of type reserved BAR. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Kishon Vijay Abraham I <kishon@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240216134524.1142149-3-cassel@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-02-16PCI: endpoint: Clean up hardware description for BARsNiklas Cassel
The hardware description for BARs is scattered in many different variables in pci_epc_features. Some of these things are mutually exclusive, so it can create confusion over which variable that has precedence over another. Improve the situation by creating a struct pci_epc_bar_desc, and a new enum pci_epc_bar_type, and convert the endpoint controller drivers to use this more well defined format. Additionally, some endpoint controller drivers mark the BAR succeeding a "64-bit only BAR" as reserved, while some do not. By definition, a 64-bit BAR uses the succeeding BAR for the upper 32-bits, so an EPF driver cannot use a BAR succeeding a 64-bit BAR. Ensure that all endpoint controller drivers are uniform, and actually describe a reserved BAR as reserved. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Kishon Vijay Abraham I <kishon@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240216134524.1142149-2-cassel@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-02-16iommu: Make iommu_report_device_fault() return voidLu Baolu
As the iommu_report_device_fault() has been converted to auto-respond a page fault if it fails to enqueue it, there's no need to return a code in any case. Make it return void. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240212012227.119381-17-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Make iopf_group_response() return voidLu Baolu
The iopf_group_response() should return void, as nothing can do anything with the failure. This implies that ops->page_response() must also return void; this is consistent with what the drivers do. The failure paths, which are all integrity validations of the fault, should be WARN_ON'd, not return codes. If the iommu core fails to enqueue the fault, it should respond the fault directly by calling ops->page_response() instead of returning an error number and relying on the iommu drivers to do so. Consolidate the error fault handling code in the core. Co-developed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240212012227.119381-16-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Track iopf group instead of last faultLu Baolu
Previously, before a group of page faults was passed to the domain's iopf handler, the last page fault of the group was kept in the list of iommu_fault_param::faults. In the page fault response path, the group's last page fault was used to look up the list, and the page faults were responded to device only if there was a matched fault. The previous approach seems unnecessarily complex and not performance friendly. Put the page fault group itself to the outstanding fault list. It can be removed in the page fault response path or in the iopf_queue_remove_device() path. The pending list is protected by iommu_fault_param::lock. To allow checking for the group's presence in the list using list_empty(), the iopf group should be removed from the list with list_del_init(). IOMMU_PAGE_RESP_PASID_VALID is set in the code but not used anywhere. Remove it to make the code clean. IOMMU_PAGE_RESP_PASID_VALID is set in the response message indicating that the response message includes a valid PASID value. Actually, we should keep this hardware detail in the individual driver. When the page fault handling framework in IOMMU and IOMMUFD subsystems includes a valid PASID in the fault message, the response message should always contain the same PASID value. Individual drivers should be responsible for deciding whether to include the PASID in the messages they provide for the hardware. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20240212012227.119381-15-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Improve iopf_queue_remove_device()Lu Baolu
Convert iopf_queue_remove_device() to return void instead of an error code, as the return value is never used. This removal helper is designed to be never-failed, so there's no need for error handling. Ack all outstanding page requests from the device with the response code of IOMMU_PAGE_RESP_INVALID, indicating device should not attempt any retry. Add comments to this helper explaining the steps involved in removing a device from the iopf queue and disabling its PRI. The individual drivers are expected to be adjusted accordingly. Here we just define the expected behaviors of the individual iommu driver from the core's perspective. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20240212012227.119381-14-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Use refcount for fault data accessLu Baolu
The per-device fault data structure stores information about faults occurring on a device. Its lifetime spans from IOPF enablement to disablement. Multiple paths, including IOPF reporting, handling, and responding, may access it concurrently. Previously, a mutex protected the fault data from use after free. But this is not performance friendly due to the critical nature of IOPF handling paths. Refine this with a refcount-based approach. The fault data pointer is obtained within an RCU read region with a refcount. The fault data pointer is returned for usage only when the pointer is valid and a refcount is successfully obtained. The fault data is freed with kfree_rcu(), ensuring data is only freed after all RCU critical regions complete. An iopf handling work starts once an iopf group is created. The handling work continues until iommu_page_response() is called to respond to the iopf and the iopf group is freed. During this time, the device fault parameter should always be available. Add a pointer to the device fault parameter in the iopf_group structure and hold the reference until the iopf_group is freed. Make iommu_page_response() static as it is only used in io-pgfault.c. Co-developed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20240212012227.119381-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Separate SVA and IOPFLu Baolu
Add CONFIG_IOMMU_IOPF for page fault handling framework and select it from its real consumer. Move iopf function declaration from iommu-sva.h to iommu.h and remove iommu-sva.h as it's empty now. Consolidate all SVA related code into iommu-sva.c: - Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c. - Move sva iopf handling code from io-pgfault.c to iommu-sva.c. Consolidate iommu_report_device_fault() and iommu_page_response() into io-pgfault.c. Export iopf_free_group() and iopf_group_response() for iopf handlers implemented in modules. Some functions are renamed with more meaningful names. No other intentional functionality changes. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16iommu: Make iommu_queue_iopf() more genericLu Baolu
Make iommu_queue_iopf() more generic by making the iopf_group a minimal set of iopf's that an iopf handler of domain should handle and respond to. Add domain parameter to struct iopf_group so that the handler can retrieve and use it directly. Change iommu_queue_iopf() to forward groups of iopf's to the domain's iopf handler. This is also a necessary step to decouple the sva iopf handling code from this interface. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20240212012227.119381-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>