Age | Commit message (Collapse) | Author |
|
mkey->size is already stored in ibmr->length, no need to store it here.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
iova is already stored in ibmr->iova, no need to store it here.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
As another qdisc is linked to the TBF, the latter should issue an event to
give drivers a chance to react to the grafting. In other qdiscs, this event
is called GRAFT, so follow suit with TBF as well.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The last user of this struct gone a couple of releases ago.
Besides that there were not so many users of this API for
more than 10 years:
1/ The one is Intel Merrifield, that had been added 2016-08-31
by the commit 3976b0380b31 ("x86/platform/intel-mid: Enable
SD card detection on Merrifield") and removed 2021-02-11 by
the commit 4590d98f5a4f ("sfi: Remove framework for deprecated
firmware").
2/ The other is Intel Sunrisepoint related, that had been added
2015-02-06 by the commit e1bfad6d936d ("mmc: sdhci-pci: Add
support for drive strength selection for SPT") and removed
2017-03-20 by the commit 51ced59cc02e ("mmc: sdhci-pci: Use
ACPI DSM to get driver strength for some Intel devices").
Effectively this is a revert of the commit 52c506f0bc72 ("mmc:
sdhci-pci: add platform data").
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20211014132613.27861-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
https://git.linaro.org/people/daniel.lezcano/linux into for-next/8.6-timers
Pull Arm architected timer driver rework from Marc (via Daniel) so that
we can add the Armv8.6 support on top.
Link: https://lore.kernel.org/r/d0c55386-2f7f-a940-45bb-d80ae5e0f378@linaro.org
* 'timers/drivers/armv8.6_arch_timer' of https://git.linaro.org/people/daniel.lezcano/linux:
clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
First set of IIO new device and feature support for the 5.16 cycle
Counter subsystem changes now sent separately.
This has been a busy cycle, so lots here and a few more stragglers to
come next week.
Big new feature in this cycle is probably output buffer support.
This has been in the works for a very long time so it's great to see
Mihail pick up the challenge and build upon his predecessors work to finally
bring this feature to mainline.
New device support
------------------
* adi,adxl313
- New driver and dt bindings for this low power accelerometer.
* adi,adxl355
- New driver and dt bindings for this accelerometer.
- Later series adds buffer support.
* asahi-kasei,ak8975
- Minor additions to driver to support ak09916
* aspeed,aspeed-adc
- Substantial rework plus feature additions to add support for the
ast2600 including a new dt bindings doc.
* atmel,at91_sama5d2
- Rework and support introduced for the sama7g5 parts.
* maxim,max31865
- New driver and bindings for this RTD temperature sensor chip.
* nxp,imx8qxp
- New driver and bindings for the ADC found on the i.MX 8QuadXPlus Soc.
* senseair,sunrise
- New driver and bindings for this family of carbon dioxide gas sensors.
* sensiron,scd4x
- New driver and bindings for this carbon dioxide gas sensor.
New features
------------
* Output buffer support. Works in a similar fashion to input buffers, but
in this case userspace pushes data into the kfifo which is then drained
to the device when a trigger occurs. Support added to the ad5766 DAC
driver.
* Core, devm_iio_map_array_register() to avoid need for
devm_add_action_or_reset() based cleanup in fully managed allocation
drivers.
* Core iio_push_to_buffers_with_ts_unaligned() function to safely handle a
few drivers where it really hard to ensure the correct data alignment in
an iio_push_to_buffers_with_timestamp() call. Note this uses a bounce
buffer so should be avoided whenever possible. Used in the ti,adc108s102,
invense,mpu3050 and adi,adis16400. This closes the last known set
of drivers with alignment issues at this interface.
* maxim,max1027
- Substantial rework to this driver main target of which was supporting
use of other triggers than it's own EOC interrupt.
- Transfer optimization.
* nxp,fxls8962af
- Threshold even support including using it as a wakeup source.
Cleanups, minor fixes etc
-------------------------
Chances of a common type to multiple drivers:
* devm_ conversion and drop of .remove() callbacks in:
- adi,ad5064
- adi,ad7291
- adi,ad7303
- adi,ad7746
- adi,ad9832
- adi,adis16080
- dialog,da9150-gpadc
- intel,mrfld_adc
- marvell,berlin2
- maxim,max1363
- maxim,max44000
- nuvoton,nau7802
- st_sensors (includes a lot of rework!)
- ti,ads8344
- ti,lp8788
* devm_platform_ioremap_resource() used to reduce boilerplate
- cirrus,ep93xx
- rockchip,saradc
- stm,stm32-dac
* Use dev_err_probe() in more places to both not print on deferred probe and
ensure a reason for the deferral is available for debug purposes.
- adi,ad8801
- capella,cm36651
- linear,ltc1660
- maxim,ds4424
- maxim,max5821
- microchip,mcp4922
- nxp,lpc18xx
- onnn,noa1305
- st,lsm9ds0
- st,st_sensors
- st,stm32-dac
- ti,afe4403
- ti,afe4404
- ti,dac7311
* Drop error returns in SPI and I2C remove() functions as they are ignored and
long term plan is to change these all over to returning void. In some cases
these patches just make it 'obvious' they always return 0 where it was the
case before but not easy to tell.
- adi,ad5380
- adi,ad5446
- adi,ad5686
- adi,ad5592r
- bosch,bma400
- bosch,bmc150
- fsl,mma7455
- honeywell,hmc5843
- kionix,kxsd9
- maxim,max5487
- meas,ms5611
- ti,afe4403
Driver specific changes
* adi,ad5770r
- Bring driver inline with documented bindings.
* adi,ad7746
- Trivial style fix
* adi,ad7949
- Express some magic values as the underlying parts via new #defines.
- Make it work with SPI controllers that don't support 14 or 16 bit messages
- Support selection of voltage reference from dt including expanding the
dt-bindings to cover this new functionality.
* adi,ad799x
- Implement selection of external reference voltage on AD7991, AD7995 and
AD7999.
- Add missing dt-bindings doc for devices supported by this driver.
* adi,adislib
- Move interrupt startup to better location in startup flow.
- Handle devices that cannot mask/unmask the drdy pin and must instead mask
at the interrupt controller. Applies to the adis16460 and adis16475 from
which we then drop equivalent code.
* adi,ltc2983
- Add support for optional reset pin.
- Fail to probe if no channels specified in dt binding.
* asahi-kasei,ak8975
- dt-binding additions of missing vid-supply regulator.
* aspeed,aspeed-adc
- Typo fix.
* fsl,mma7660
- Mark acpi_device_id table __maybe_unused to avoid build warning.
* fsl,imx25-gcq
- Avoid initializing regulators that aren't used.
* invensense,mpu3050
- Drop a dead protection against a clash with the old input driver.
* invensense,mpu6050
- Rework code to not use strcpy() and hence avoid possibility of wrong sized
buffers. Note this wasn't a bug, but the new code is a lot more readable.
- Mark acpi_device_id table __maybe_unused to avoid build warning.
* kionix,kxcjk1013
- dt-binding addition to note it supports interrupts.
* marvell,berlin2-adc
- Enable COMPILE_TEST building.
* maxim,max1027
- Avoid returning success in an error path.
* nxp,imx8qxp
- Fix warning when runtime pm not enabled via __maybe_unused.
* ricoh,rn5t618
- Use the new devm_iio_map_array_register() instead of open coding the same.
* samsung,exynos_adc
- Improve kconfig help text.
* st,lsm6dsx
- Move max_fifo_size into the fifo_ops structure where the other configuration
parameters are found.
* st,st_sensors:
- Reorder to ensure we turn the power off after removing userspace interfaces.
* senseair,sunrise
- Add missing I2C dependency.
* ti,twl6030
- Small code tidy up.
* tag 'iio-for-5.16a-split-take4' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (148 commits)
iio: imx8qxp-adc: mark PM functions as __maybe_unused
iio: pressure: ms5611: Make ms5611_remove() return void
iio: potentiometer: max5487: Don't return an error in .remove()
iio: magn: hmc5843: Make hmc5843_common_remove() return void
iio: health: afe4403: Don't return an error in .remove()
iio: dac: ad5686: Make ad5686_remove() return void
iio: dac: ad5592r: Make ad5592r_remove() return void
iio: dac: ad5446: Make ad5446_remove() return void
iio: dac: ad5380: Make ad5380_remove() return void
iio: accel: mma7455: Make mma7455_core_remove() return void
iio: accel: kxsd9: Make kxsd9_common_remove() return void
iio: accel: bmi088: Make bmi088_accel_core_remove() return void
iio: accel: bmc150: Make bmc150_accel_core_remove() return void
iio: accel: bma400: Make bma400_remove() return void
drivers:iio:dac:ad5766.c: Add trigger buffer
iio: triggered-buffer: extend support to configure output buffers
iio: kfifo-buffer: Add output buffer support
iio: Add output buffer support
iio: documentation: Document scd4x calibration use
drivers: iio: chemical: Add support for Sensirion SCD4x CO2 sensor
...
|
|
This removes the chrdev_lock from the counter subsystem. This was
intended to prevent opening the chrdev more than once. However, this
doesn't work in practice since userspace can duplicate file descriptors
and pass file descriptors to other processes. Since this protection
can't be relied on, it is best to just remove it.
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: David Lechner <david@lechnology.com>
Link: https://lore.kernel.org/r/20211017185521.3468640-1-david@lechnology.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
No one use the following functions, kill them.
amba_aphb_device_add()
amba_apb_device_add()
amba_apb_device_add_res()
amba_ahb_device_add()
amba_ahb_device_add_res()
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Extend structure 'mlxreg_core_data' with the field "secured". The
purpose of this field is to restrict access to some attributes, if
kernel is configured with security options, like:
LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY.
Access to some attributes, which for example, allow burning of some
hardware components, like FPGA, CPLD, SPI, etcetera can break the
system. In case user does not want to allow such access, it can disable
it by setting security options.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20211002093238.3771419-7-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Extend the structure 'mlxreg_hotplug_device" with platform device field
to allow transition of the register map and system interrupt line number
to underlying hotplug devices, sharing the same register map and
same interrupt line with 'mlxreg-hotplug' driver.
Extend logic for hotplug devices creation and removing according to
the action associated with the hotplug device description. Previously
hotplug driver was capable to attach / de-attach upon hotplug events
only I2C devices handled by simple I2C drivers. Now it should be able
to attach also devices handled by the platform drivers.
The motivation is to allow transition of platform data like:
- system interrupt line number, sharing with 'mlxreg-hotplug' to
underlying hotplug devices.
- shared register map of programmable devices on main board to
underlying hotplug devices.
Additioanlly the number of 'sysfs' attributes is increased, since
modular system defines more 'sysfs' attributes.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20211002093238.3771419-4-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Add new types for the Nvidia modular systems MSN4800 which could
be equipped with the different types of replaceable line cards.
Add new type to specify the kind of hotplug events for the line cards.
The line card events are generated by the programmable device located
on the main board. This device implements interrupt controller logic.
Line card interrupts are associated with different line cards states
during its initialization: insertion, security signature validation,
power good state, security validation, hardware-firmware
synchronization state, line card PHYs readiness state, firmware
availability for line card ports. Also under some circumstances
hardware can generate thermal shutdown for particular line card.
Add new type specifying the action, which should be performed when
particular hotplug event is received. This action defines in which way
hotplug event should be handled by hotplug driver. There are the next
actions types:
- Connect I2C device with empty 'platform_data' field according to the
platform topology, if device is configured (for example, power unit
micro-controller driver, when power unit is connected to power source
(this is what is currently supported).
- Connect device with 'platform_data' field set according to the
platform topology. The purpose is to pass 'platform_data' through
hotplug driver to underlying device (for example line card driver).
- No device is associated with hotplug event - just send "udev" event
(this is what is currently supported).
Extend structure 'mlxreg_hotplug_device' with hotplug action field.
Extend structure 'mlxreg_core_data' with:
- Registers for line card power and enabling control.
- Slot number field, to indicate at which physical slot replaceable
line card device is located.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20211002093238.3771419-2-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
The branch is a stable branch shared with ARM maintainers for the
first 13th patches of the series:
It is based on v5.14-rc3.
As stated by the changelog:
" [... ] enabling ARMv8.6 support for timer subsystem, and was prompted by a
discussion with Oliver around the fact that an ARMv8.6 implementation
must have a 1GHz counter, which leads to a number of things to break
in the timer code:
- the counter rollover can come pretty quickly as we only advertise a
56bit counter,
- the maximum timer delta can be remarkably small, as we use the
countdown interface which is limited to 32bit...
Thankfully, there is a way out: we can compute the minimal width of
the counter based on the guarantees that the architecture gives us,
and we can use the 64bit comparator interface instead of the countdown
to program the timer.
Finally, we start making use of the ARMv8.6 ECV features by switching
accesses to the counters to a self-synchronising register, removing
the need for an ISB. Hopefully, implementations will *not* just stick
an invisible ISB there...
A side effect of the switch to CVAL is that XGene-1 breaks. I have
added a workaround to keep it alive.
I have added Oliver's original patch[0] to the series and tweaked a
couple of things. Blame me if I broke anything.
The whole things has been tested on Juno (sysreg + MMIO timers),
XGene-1 (broken sysreg timers), FVP (FEAT_ECV, CNT*CTSS_EL0).
"
Link: https://lore.kernel.org/r/20211017124225.3018098-1-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next
Oded writes:
This tag contains habanalabs driver changes for v5.16:
- Add a new uAPI (under the memory ioctl) to request from the driver
to export a DMA-BUF object that represents a memory region on
the device's DRAM. This is needed to enable peer-to-peer over PCIe
between habana device and an RDMA adapter (e.g. mlnx5 or efa
rdma adapter).
- Add debugfs node to dynamically configure CS timeout. Up until now,
it was only configurable through kernel module parameter.
- Fetch more comprehensive power information from the firmware.
- Always take timestamp when waiting for user interrupt, as the user
needs that information to optimize the graph runtime compilation.
- Modify user interrupt to look on 64-bit user value as fence, instead
of 32-bit.
- Bypass reset in case of repeated h/w error event after device reset.
This is to prevent endless loop of resets to the device.
- Fix several bugs in multi CS completion code.
- Fix race condition in fd close/open.
- Update to latest firmware headers
- Add select CRC32 in kconfig
- Small fixes, cosmetics
* tag 'misc-habanalabs-next-2021-10-18' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (25 commits)
habanalabs: refactor fence handling in hl_cs_poll_fences
habanalabs: context cleanup cosmetics
habanalabs: simplify wait for interrupt with timestamp flow
habanalabs: initialize hpriv fields before adding new node
habanalabs: Unify frequency set/get functionality
habanalabs: select CRC32
habanalabs: add support for dma-buf exporter
habanalabs: define uAPI to export FD for DMA-BUF
habanalabs: fix NULL pointer dereference
habanalabs: fix race condition in multi CS completion
habanalabs: use only u32
habanalabs: update firmware files
habanalabs: bypass reset for continuous h/w error event
habanalabs: take timestamp on wait for interrupt
habanalabs: prevent race between fd close/open
habanalabs: refactor reset log message
habanalabs: define soft-reset as inference op
habanalabs: fix debugfs device memory MMU VA translation
habanalabs: add support for a long interrupt target value
habanalabs: remove redundant cs validity checks
...
|
|
Now that output (kfifo) buffers are supported, we need to extend the
{devm_}iio_triggered_buffer_setup_ext() parameter list to take a direction
parameter.
This allows us to attach an output triggered buffer to a DAC device.
Unfortunately it's a bit difficult to add another macro to avoid changing 5
drivers where {devm_}iio_triggered_buffer_setup_ext() is used.
Well, it's doable, but may not be worth the trouble vs just updating all
these 5 drivers.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Mihail Chindris <mihail.chindris@analog.com>
Link: https://lore.kernel.org/r/20211007080035.2531-4-mihail.chindris@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Currently IIO only supports buffer mode for capture devices like ADCs. Add
support for buffered mode for output devices like DACs.
The output buffer implementation is analogous to the input buffer
implementation. Instead of using read() to get data from the buffer write()
is used to copy data into the buffer.
poll() with POLLOUT will wakeup if there is space available.
Drivers can remove data from a buffer using iio_pop_from_buffer(), the
function can e.g. called from a trigger handler to write the data to
hardware.
A buffer can only be either a output buffer or an input, but not both. So,
for a device that has an ADC and DAC path, this will mean 2 IIO buffers
(one for each direction).
The direction of the buffer is decided by the new direction field of the
iio_buffer struct and should be set after allocating and before registering
it.
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Co-developed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Mihail Chindris <mihail.chindris@analog.com>
Link: https://lore.kernel.org/r/20211007080035.2531-2-mihail.chindris@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Whilst it is almost always possible to arrange for scan data to be
read directly into a buffer that is suitable for passing to
iio_push_to_buffers_with_timestamp(), there are a few places where
leading data needs to be skipped over.
For these cases introduce a function that will allocate an appropriate
sized and aligned bounce buffer (if not already allocated) and copy
the unaligned data into that before calling
iio_push_to_buffers_with_timestamp() on the bounce buffer.
We tie the lifespace of this buffer to that of the iio_dev.dev
which should ensure no memory leaks occur.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210613151039.569883-2-jic23@kernel.org
|
|
Some devices can't mask/unmask the data ready pin and in those cases
each driver was just calling '{dis}enable_irq()' to control the trigger
state. This change, moves that handling into the library by introducing
a new boolean in the data structure that tells the library that the
device cannot unmask the pin.
On top of controlling the trigger state, we can also use this flag to
automatically request the IRQ with 'IRQF_NO_AUTOEN' in case it is set.
So far, all users of the library want to start operation with IRQs/DRDY
pin disabled so it should be fairly safe to do this inside the library.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210903141423.517028-3-nuno.sa@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
This change introduces a device-managed variant to the
iio_map_array_register() function. It's a simple implementation of calling
iio_map_array_register() and registering a callback to
iio_map_array_unregister() with the devm_add_action_or_reset().
The function uses an explicit 'dev' parameter to bind the unwinding to. It
could have been implemented to implicitly use the parent of the IIO device,
however it shouldn't be too expensive to callers to just specify to which
device object to bind this unwind call.
It would make the API a bit more flexible.
Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
Link: https://lore.kernel.org/r/20210903072917.45769-2-aardelean@deviqon.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
First set of counter subsystem new feature support for the 5.16 cycle
Most interesting element this time is the new chrdev based interface
for the counter subsystem. Affects all drivers. Some minor precursor
patches.
Major parts:
* Bring all the sysfs attribute setup into the counter core rather than
leaving it to individual drivers. Docs updates accompany these changes.
* Move various definitions to a uapi header as now needed from userspace.
* Add the chardev interface + extensive documentation and example tool
* Add new ABI needed to identify indexes needed for chrdev interface
* Implement new interface for the 104-quad-8
* Follow up deals with wrong path for documentation build
* Various trivial cleanups and missing feature additions related to this
series
* tag 'counter-for-5.16a-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
docs: counter: Include counter-chrdev kernel-doc to generic-counter.rst
counter: fix docum. build problems after filename change
counter: microchip-tcb-capture: Tidy up a false kernel-doc /** marking.
counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8
counter: 104-quad-8: Replace mutex with spinlock
counter: Implement events_queue_size sysfs attribute
counter: Implement *_component_id sysfs attributes
counter: Implement signalZ_action_component_id sysfs attribute
tools/counter: Create Counter tools
docs: counter: Document character device interface
counter: Add character device interface
counter: Move counter enums to uapi header
docs: counter: Update to reflect sysfs internalization
counter: Update counter.h comments to reflect sysfs internalization
counter: Internalize sysfs interface code
counter: stm32-timer-cnt: Provide defines for slave mode selection
counter: stm32-lptimer-cnt: Provide defines for clock polarities
|
|
Check for a NULL page->mapping before dereferencing the mapping in
page_is_secretmem(), as the page's mapping can be nullified while gup()
is running, e.g. by reclaim or truncation.
BUG: kernel NULL pointer dereference, address: 0000000000000068
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 6 PID: 4173897 Comm: CPU 3/KVM Tainted: G W
RIP: 0010:internal_get_user_pages_fast+0x621/0x9d0
Code: <48> 81 7a 68 80 08 04 bc 0f 85 21 ff ff 8 89 c7 be
RSP: 0018:ffffaa90087679b0 EFLAGS: 00010046
RAX: ffffe3f37905b900 RBX: 00007f2dd561e000 RCX: ffffe3f37905b934
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe3f37905b900
...
CR2: 0000000000000068 CR3: 00000004c5898003 CR4: 00000000001726e0
Call Trace:
get_user_pages_fast_only+0x13/0x20
hva_to_pfn+0xa9/0x3e0
try_async_pf+0xa1/0x270
direct_page_fault+0x113/0xad0
kvm_mmu_page_fault+0x69/0x680
vmx_handle_exit+0xe1/0x5d0
kvm_arch_vcpu_ioctl_run+0xd81/0x1c70
kvm_vcpu_ioctl+0x267/0x670
__x64_sys_ioctl+0x83/0xa0
do_syscall_64+0x56/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Link: https://lkml.kernel.org/r/20211007231502.3552715-1-seanjc@google.com
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Stephen <stephenackerman16@gmail.com>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 6e7b64b9dd6d ("elfcore: fix building with clang") introduces
special handling for two architectures, ia64 and User Mode Linux.
However, the wrong name, i.e., CONFIG_UM, for the intended Kconfig
symbol for User-Mode Linux was used.
Although the directory for User Mode Linux is ./arch/um; the Kconfig
symbol for this architecture is called CONFIG_UML.
Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:
UM
Referencing files: include/linux/elfcore.h
Similar symbols: UML, NUMA
Correct the name of the config to the intended one.
[akpm@linux-foundation.org: fix um/x86_64, per Catalin]
Link: https://lkml.kernel.org/r/20211006181119.2851441-1-catalin.marinas@arm.com
Link: https://lkml.kernel.org/r/YV6pejGzLy5ppEpt@arm.com
Link: https://lkml.kernel.org/r/20211006082209.417-1-lukas.bulwahn@gmail.com
Fixes: 6e7b64b9dd6d ("elfcore: fix building with clang")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Barret Rhoden <brho@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The node demotion order needs to be updated during CPU hotplug. Because
whether a NUMA node has CPU may influence the demotion order. The
update function should be called during CPU online/offline after the
node_states[N_CPU] has been updated. That is done in
CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD during
CPU offline. But in commit 884a6e5d1f93 ("mm/migrate: update node
demotion order on hotplug events"), the function to update node demotion
order is called in CPUHP_AP_ONLINE_DYN during CPU online/offline. This
doesn't satisfy the order requirement.
For example, there are 4 CPUs (P0, P1, P2, P3) in 2 sockets (P0, P1 in S0
and P2, P3 in S1), the demotion order is
- S0 -> NUMA_NO_NODE
- S1 -> NUMA_NO_NODE
After P2 and P3 is offlined, because S1 has no CPU now, the demotion
order should have been changed to
- S0 -> S1
- S1 -> NO_NODE
but it isn't changed, because the order updating callback for CPU
hotplug doesn't see the new nodemask. After that, if P1 is offlined,
the demotion order is changed to the expected order as above.
So in this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
CPUHP_MM_DEMOTION_DEAD to be called after CPUHP_AP_ONLINE_DYN and
CPUHP_MM_VMSTAT_DEAD during CPU online and offline, and register the
update function on them.
Link: https://lkml.kernel.org/r/20210929060351.7293-1-ying.huang@intel.com
Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Once upon a time, the node demotion updates were driven solely by memory
hotplug events. But now, there are handlers for both CPU and memory
hotplug.
However, the #ifdef around the code checks only memory hotplug. A
system that has HOTPLUG_CPU=y but MEMORY_HOTPLUG=n would miss CPU
hotplug events.
Update the #ifdef around the common code. Add memory and CPU-specific
#ifdefs for their handlers. These memory/CPU #ifdefs avoid unused
function warnings when their Kconfig option is off.
[arnd@arndb.de: rework hotplug_memory_notifier() stub]
Link: https://lkml.kernel.org/r/20211013144029.2154629-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20210924161255.E5FE8F7E@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
snd_dma_buffer_sync() is declared twice, and the one outside the ifdef
CONFIG_HAS_DMA could lead to a build error when CONFIG_HAS_DMA=n.
As it's an overlooked leftover after rebase, drop this line.
Fixes: a25684a95646 ("ALSA: memalloc: Support for non-contiguous page allocation")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20211019165402.4fa82c38@canb.auug.org.au
Link: https://lore.kernel.org/r/20211019060536.26089-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The uplink destination type should be used in rules to steer the
packet to the uplink when the device is in steering based LAG mode.
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Introduce new APIs to create and destroy flow matcher
for given format id.
Flow match definer object is used for defining the fields and
mask used for the hash calculation. User should mask the desired
fields like done in the match criteria.
This object is assigned to flow group of type hash. In this flow
group type, packets lookup is done based on the hash result.
This patch also adds the required bits to create such flow group.
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add new port selection flow steering namespace. Flow steering rules in
this namespaceare are used to determine the physical port for egress
packets.
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
We are only holding the lun_tg_pt_gp_lock in target_alua_state_check() to
make sure tg_pt_gp is not freed from under us while we copy the state,
delay, ID values. We can instead use RCU here to access the tg_pt_gp.
With this patch IOPs can increase up to 10% for jobs like:
fio --filename=/dev/sdX --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=64 --numjobs=N
when there are multiple sessions (running that fio command to each /dev/sdX
or using multipath and there are over 8 paths), or more than 8 queues for
the loop or vhost with multiple threads case and numjobs > 8.
Link: https://lore.kernel.org/r/20210930020422.92578-5-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This patch fixes the following bugs:
1. If there are multiple ordered cmds queued and multiple simple cmds
completing, target_restart_delayed_cmds() could be called on different
CPUs and each instance could start a ordered cmd. They could then run in
different orders than they were queued.
2. target_restart_delayed_cmds() and target_handle_task_attr() can race
where:
1. target_handle_task_attr() has passed the simple_cmds == 0 check.
2. transport_complete_task_attr() then decrements simple_cmds to 0.
3. transport_complete_task_attr() runs target_restart_delayed_cmds() and
it does not see any cmds on the delayed_cmd_list.
4. target_handle_task_attr() adds the cmd to the delayed_cmd_list.
The cmd will then end up timing out.
3. If we are sent > 1 ordered cmds and simple_cmds == 0, we can execute
them out of order, because target_handle_task_attr() will hit that
simple_cmds check first and return false for all ordered cmds sent.
4. We run target_restart_delayed_cmds() after every cmd completion, so if
there is more than 1 simple cmd running, we start executing ordered cmds
after that first cmd instead of waiting for all of them to complete.
5. Ordered cmds are not supposed to start until HEAD OF QUEUE and all older
cmds have completed, and not just simple.
6. It's not a bug but it doesn't make sense to take the delayed_cmd_lock
for every cmd completion when ordered cmds are almost never used. Just
replacing that lock with an atomic increases IOPs by up to 10% when
completions are spread over multiple CPUs and there are multiple
sessions/ mqs/thread accessing the same device.
This patch moves the queued delayed handling to a per device work to
serialze the cmd executions for each device and adds a new counter to track
HEAD_OF_QUEUE and SIMPLE cmds. We can then check the new counter to
determine when to run the work on the completion path.
Link: https://lore.kernel.org/r/20210930020422.92578-3-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.
After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).
This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].
[1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
[2] https://reviews.llvm.org/D111588
[3] https://reviews.llvm.org/D111199
Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com
|
|
While writing an email explaining the "bit = 0" logic for a discussion on
making ftrace_test_recursion_trylock() disable preemption, I discovered a
path that makes the "not do the logic if bit is zero" unsafe.
The recursion logic is done in hot paths like the function tracer. Thus,
any code executed causes noticeable overhead. Thus, tricks are done to try
to limit the amount of code executed. This included the recursion testing
logic.
Having recursion testing is important, as there are many paths that can
end up in an infinite recursion cycle when tracing every function in the
kernel. Thus protection is needed to prevent that from happening.
Because it is OK to recurse due to different running context levels (e.g.
an interrupt preempts a trace, and then a trace occurs in the interrupt
handler), a set of bits are used to know which context one is in (normal,
softirq, irq and NMI). If a recursion occurs in the same level, it is
prevented*.
Then there are infrastructure levels of recursion as well. When more than
one callback is attached to the same function to trace, it calls a loop
function to iterate over all the callbacks. Both the callbacks and the
loop function have recursion protection. The callbacks use the
"ftrace_test_recursion_trylock()" which has a "function" set of context
bits to test, and the loop function calls the internal
trace_test_and_set_recursion() directly, with an "internal" set of bits.
If an architecture does not implement all the features supported by ftrace
then the callbacks are never called directly, and the loop function is
called instead, which will implement the features of ftrace.
Since both the loop function and the callbacks do recursion protection, it
was seemed unnecessary to do it in both locations. Thus, a trick was made
to have the internal set of recursion bits at a more significant bit
location than the function bits. Then, if any of the higher bits were set,
the logic of the function bits could be skipped, as any new recursion
would first have to go through the loop function.
This is true for architectures that do not support all the ftrace
features, because all functions being traced must first go through the
loop function before going to the callbacks. But this is not true for
architectures that support all the ftrace features. That's because the
loop function could be called due to two callbacks attached to the same
function, but then a recursion function inside the callback could be
called that does not share any other callback, and it will be called
directly.
i.e.
traced_function_1: [ more than one callback tracing it ]
call loop_func
loop_func:
trace_recursion set internal bit
call callback
callback:
trace_recursion [ skipped because internal bit is set, return 0 ]
call traced_function_2
traced_function_2: [ only traced by above callback ]
call callback
callback:
trace_recursion [ skipped because internal bit is set, return 0 ]
call traced_function_2
[ wash, rinse, repeat, BOOM! out of shampoo! ]
Thus, the "bit == 0 skip" trick is not safe, unless the loop function is
call for all functions.
Since we want to encourage architectures to implement all ftrace features,
having them slow down due to this extra logic may encourage the
maintainers to update to the latest ftrace features. And because this
logic is only safe for them, remove it completely.
[*] There is on layer of recursion that is allowed, and that is to allow
for the transition between interrupt context (normal -> softirq ->
irq -> NMI), because a trace may occur before the context update is
visible to the trace recursion logic.
Link: https://lore.kernel.org/all/609b565a-ed6e-a1da-f025-166691b5d994@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211018154412.09fcad3c@gandalf.local.home
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jisheng Zhang <jszhang@kernel.org>
Cc: =?utf-8?b?546L6LSH?= <yun.wang@linux.alibaba.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
In commit fda31c50292a ("signal: avoid double atomic counter
increments for user accounting") Linus made a clever optimization to
how rlimits and the struct user_struct. Unfortunately that
optimization does not work in the obvious way when moved to nested
rlimits. The problem is that the last decrement of the per user
namespace per user sigpending counter might also be the last decrement
of the sigpending counter in the parent user namespace as well. Which
means that simply freeing the leaf ucount in __free_sigqueue is not
enough.
Maintain the optimization and handle the tricky cases by introducing
inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.
By moving the entire optimization into functions that perform all of
the work it becomes possible to ensure that every level is handled
properly.
The new function inc_rlimit_get_ucounts returns 0 on failure to
increment the ucount. This is different than inc_rlimit_ucounts which
increments the ucounts and returns LONG_MAX if the ucount counter has
exceeded it's maximum or it wrapped (to indicate the counter needs to
decremented).
I wish we had a single user to account all pending signals to across
all of the threads of a process so this complexity was not necessary
Cc: stable@vger.kernel.org
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
v1: https://lkml.kernel.org/r/87mtnavszx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87fssytizw.fsf_-_@disp2133
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Tested-by: Rune Kleveland <rune.kleveland@infomedia.dk>
Tested-by: Yu Zhao <yuzhao@google.com>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Reading the inode size brings in a new cacheline for IO submit, and
it's in the hot path being checked for every single IO. When doing
millions of IOs per core per second, this is noticeable overhead.
Cache the nr_sectors in the bdev itself.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a helper to return the size of sb->s_bdev in sb->s_blocksize_bits
based unites.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211018101130.1838532-26-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a helper to query the size of a block device in bytes. This
will be used to remove open coded access to ->bd_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211018101130.1838532-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Ensure these are always available for inlines in the various block layer
headers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211018101130.1838532-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead of calling blk_mq_end_request() on a single request, add a helper
that takes the new struct io_comp_batch and completes any request stored
in there.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
sbitmap currently only supports clearing tags one-by-one, add a helper
that allows the caller to pass in an array of tags to clear.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
struct io_comp_batch contains a list head and a completion handler, which
will allow completions to more effciently completed batches of IO.
For now, no functional changes in this patch, we just define the
io_comp_batch structure and add the argument to the file_operations iopoll
handler.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead of open-coding the list additions, traversal, and removal,
provide a basic set of helpers.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Just like the blk_mq_ctx counterparts, we've got a bunch of counters
in here that are only for debugfs and are of questionnable value. They
are:
- dispatched, index of how many requests were dispatched in one go
- poll_{considered,invoked,success}, which track poll sucess rates. We're
confident in the iopoll implementation at this point, don't bother
tracking these.
As a bonus, this shrinks each hardware queue from 576 bytes to 512 bytes,
dropping a whole cacheline.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The 0-element arrays that are used as memcpy() destinations are actually
flexible arrays. Adjust their structures accordingly so that memcpy()
can better reason able their destination size (i.e. they need to be seen
as "unknown" length rather than "zero").
In some cases, use of the DECLARE_FLEX_ARRAY() helper is needed when a
flexible array is alone in a struct.
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Manish Rangankar <mrangankar@marvell.com>
Cc: GR-QLogic-Storage-Upstream@marvell.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Florian Schilhabel <florian.c.schilhabel@googlemail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Cc: Ross Schmidt <ross.schm.dev@gmail.com>
Cc: Marco Cesati <marcocesati@gmail.com>
Cc: ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-staging@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
In support of enabling -Warray-bounds and -Wzero-length-bounds and
correctly handling run-time memcpy() bounds checking, replace all
open-coded flexible arrays (i.e. 0-element arrays) in unions with the
DECLARE_FLEX_ARRAY() helper macro.
This fixes warnings such as:
fs/hpfs/anode.c: In function 'hpfs_add_sector_to_btree':
fs/hpfs/anode.c:209:27: warning: array subscript 0 is outside the bounds of an interior zero-length array 'struct bplus_internal_node[0]' [-Wzero-length-bounds]
209 | anode->btree.u.internal[0].down = cpu_to_le32(a);
| ~~~~~~~~~~~~~~~~~~~~~~~^~~
In file included from fs/hpfs/hpfs_fn.h:26,
from fs/hpfs/anode.c:10:
fs/hpfs/hpfs.h:412:32: note: while referencing 'internal'
412 | struct bplus_internal_node internal[0]; /* (internal) 2-word entries giving
| ^~~~~~~~
drivers/net/can/usb/etas_es58x/es58x_fd.c: In function 'es58x_fd_tx_can_msg':
drivers/net/can/usb/etas_es58x/es58x_fd.c:360:35: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[]'} [-Wzero-length-bounds]
360 | tx_can_msg = (typeof(tx_can_msg))&es58x_fd_urb_cmd->raw_msg[msg_len];
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/net/can/usb/etas_es58x/es58x_core.h:22,
from drivers/net/can/usb/etas_es58x/es58x_fd.c:17:
drivers/net/can/usb/etas_es58x/es58x_fd.h:231:6: note: while referencing 'raw_msg'
231 | u8 raw_msg[0];
| ^~~~~~~
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Cc: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Cc: Rohit Maheshwari <rohitm@chelsio.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Mordechay Goodstein <mordechay.goodstein@intel.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arunachalam Santhanam <arunachalam.santhanam@in.bosch.com>
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: linux-crypto@vger.kernel.org
Cc: ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-can@vger.kernel.org
Cc: bpf@vger.kernel.org
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # drivers/net/can/usb/etas_es58x/*
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
There are many places where kernel code wants to have several different
typed trailing flexible arrays. This would normally be done with multiple
flexible arrays in a union, but since GCC and Clang don't (on the surface)
allow this, there have been many open-coded workarounds, usually involving
neighboring 0-element arrays at the end of a structure. For example,
instead of something like this:
struct thing {
...
union {
struct type1 foo[];
struct type2 bar[];
};
};
code works around the compiler with:
struct thing {
...
struct type1 foo[0];
struct type2 bar[];
};
Another case is when a flexible array is wanted as the single member
within a struct (which itself is usually in a union). For example, this
would be worked around as:
union many {
...
struct {
struct type3 baz[0];
};
};
These kinds of work-arounds cause problems with size checks against such
zero-element arrays (for example when building with -Warray-bounds and
-Wzero-length-bounds, and with the coming FORTIFY_SOURCE improvements),
so they must all be converted to "real" flexible arrays, avoiding warnings
like this:
fs/hpfs/anode.c: In function 'hpfs_add_sector_to_btree':
fs/hpfs/anode.c:209:27: warning: array subscript 0 is outside the bounds of an interior zero-length array 'struct bplus_internal_node[0]' [-Wzero-length-bounds]
209 | anode->btree.u.internal[0].down = cpu_to_le32(a);
| ~~~~~~~~~~~~~~~~~~~~~~~^~~
In file included from fs/hpfs/hpfs_fn.h:26,
from fs/hpfs/anode.c:10:
fs/hpfs/hpfs.h:412:32: note: while referencing 'internal'
412 | struct bplus_internal_node internal[0]; /* (internal) 2-word entries giving
| ^~~~~~~~
drivers/net/can/usb/etas_es58x/es58x_fd.c: In function 'es58x_fd_tx_can_msg':
drivers/net/can/usb/etas_es58x/es58x_fd.c:360:35: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[]'} [-Wzero-length-bounds]
360 | tx_can_msg = (typeof(tx_can_msg))&es58x_fd_urb_cmd->raw_msg[msg_len];
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/net/can/usb/etas_es58x/es58x_core.h:22,
from drivers/net/can/usb/etas_es58x/es58x_fd.c:17:
drivers/net/can/usb/etas_es58x/es58x_fd.h:231:6: note: while referencing 'raw_msg'
231 | u8 raw_msg[0];
| ^~~~~~~
However, it _is_ entirely possible to have one or more flexible arrays
in a struct or union: it just has to be in another struct. And since it
cannot be alone in a struct, such a struct must have at least 1 other
named member -- but that member can be zero sized. Wrap all this nonsense
into the new DECLARE_FLEX_ARRAY() in support of having flexible arrays
in unions (or alone in a struct).
As with struct_group(), since this is needed in UAPI headers as well,
implement the core there, with a non-UAPI wrapper.
Additionally update kernel-doc to understand its existence.
https://github.com/KSPP/linux/issues/137
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A common idiom in kernel code is to wipe the contents of a structure
starting from a given member. These open-coded cases are usually difficult
to read and very sensitive to struct layout changes. Like memset_after(),
introduce a new helper, memset_startat() that takes the target struct
instance, the byte to write, and the member name where zeroing should
start.
Note that this doesn't zero padding preceding the target member. For
those cases, memset_after() should be used on the preceding member.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Francis Laniel <laniel_francis@privacyrequired.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A common idiom in kernel code is to wipe the contents of a structure
after a given member. This is especially useful in places where there is
trailing padding. These open-coded cases are usually difficult to read
and very sensitive to struct layout changes. Introduce a new helper,
memset_after() that takes the target struct instance, the byte to write,
and the member name after which the zeroing should start.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Francis Laniel <laniel_francis@privacyrequired.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Handling the migration of TSCs correctly is difficult, in part because
Linux does not provide userspace with the ability to retrieve a (TSC,
realtime) clock pair for a single instant in time. In lieu of a more
convenient facility, KVM can report similar information in the kvm_clock
structure.
Provide userspace with a host TSC & realtime pair iff the realtime clock
is based on the TSC. If userspace provides KVM_SET_CLOCK with a valid
realtime value, advance the KVM clock by the amount of elapsed time. Do
not step the KVM clock backwards, though, as it is a monotonic
oscillator.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210916181538.968978-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The pci_pool users have been converted to dma_pool. Remove the unused
pci_pool wrappers.
Link: https://lore.kernel.org/r/20211018124110.214-1-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
BSM or Backup Switch Mode is a common feature on RTCs, allowing to select
how the RTC will decide when to switch from its primary power supply to the
backup power supply. It is necessary to be able to set it from userspace as
there are uses cases where it has to be done dynamically.
Supported values are:
RTC_BSM_DISABLED: disabled
RTC_BSM_DIRECT: switching will happen as soon as Vbackup > Vdd
RTC_BSM_LEVEL: switching will happen around a threshold, usually with an
hysteresis
RTC_BSM_STANDBY: switching will not happen until Vdd > Vbackup, this is
useful to ensure the RTC doesn't draw any power until the device is first
powered on.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20211018151933.76865-6-alexandre.belloni@bootlin.com
|
|
Add a new parameter allowing the get and set the correction using ioctls
instead of just sysfs.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20211018151933.76865-5-alexandre.belloni@bootlin.com
|