Age | Commit message | Author |
|
In the last kernel release we introduced CONFIG_BLK_DEV_WRITE_MOUNTED.
This option is set by default. When it is set, the long-standing behavior
of being able to write to mounted block devices is enabled.
But in order to guard against unintended corruption by writing to the
block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned
off. In that case it isn't possible to write to mounted block devices
anymore.
A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES
which disallows concurrent BLK_OPEN_WRITE access. When we still had the
bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because
the mode was passed around. Since we managed to get rid of the bdev
handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based
on whether the file was opened writable and writes to that block device
are blocked. That logic doesn't work because we do allow
BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.
Fix the detection logic and use an FMODE_* bit. We could've also abused
O_EXCL as an indicator that BLK_OPEN_RESTRICT_WRITES has been requested.
For userspace open paths O_EXCL will never be retained but for internal
opens where we open files that are never installed into a file
descriptor table this is fine. But it would be a gamble that this
doesn't cause bugs. Note that BLK_OPEN_RESTRICT_WRITES is an internal
only flag that cannot directly be raised by userspace. It is implicitly
raised during mounting.
Passes xfstests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
unset.
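For illustration, a minimal sketch of the fixed detection (simplified
signatures; FMODE_WRITE_RESTRICTED names the dedicated file-mode bit and
is an assumption about how the fix is wired up):

  /* At open time, record the restriction in the file's mode bits,
   * independent of whether BLK_OPEN_WRITE was requested. */
  static void bdev_open(struct block_device *bdev, blk_mode_t mode,
                        struct file *bdev_file)
  {
          if (mode & BLK_OPEN_RESTRICT_WRITES)
                  bdev_file->f_mode |= FMODE_WRITE_RESTRICTED;
  }

  /* When yielding access, only holders that restricted writes need
   * to unblock them again. */
  static void bdev_yield_write_access(struct file *bdev_file)
  {
          if (bdev_file->f_mode & FMODE_WRITE_RESTRICTED)
                  bdev_unblock_writes(file_bdev(bdev_file));
  }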
Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@casper.infradead.org
Link: https://lore.kernel.org/r/20240323-zielbereich-mittragen-6fdf14876c3e@brauner
Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The "platform_notify" and "platform_notify_remove" hooks have been unused
since 00ba9357d189 ("ARM: ixp4xx: Drop custom DMA coherency and bouncing").
Remove "platform_notify" and "platform_notify_remove". No functional
change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20240325221409.1457036-1-helgaas@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch prepares the onboard_hub driver to support non-hub devices by
renaming the driver files and their content, the headers and their references.
The comments and descriptions have been slightly modified to keep
coherence and account for the specific cases that only affect onboard
hubs (e.g. peer-hub).
The "hub" variables in functions where "dev" (and similar names) variables
already exist have been renamed to onboard_dev for clarity, which adds a
few lines in cases where more than 80 characters are used.
No new functionality has been added.
Acked-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Javier Carrasco <javier.carrasco@wolfvision.net>
Link: https://lore.kernel.org/r/20240325-onboard_xvf3500-v8-2-29e3f9222922@wolfvision.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduced a workqueue to release the consumer and supplier devices used
in the devlink.
In the queued job, devices are released and, in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing provides any synchronisation with this workqueue to ensure that
all ongoing release operations are done before other operations are
started.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
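For reference, the helper is essentially a flush of that local workqueue
(sketch; device_link_wq is the assumed name of the new local queue):

  /* Dedicated queue so it can be flushed without draining unrelated
   * system-wide work. */
  static struct workqueue_struct *device_link_wq;

  /* Wait until all queued devlink release jobs have completed. */
  void device_link_wait_removal(void)
  {
          flush_workqueue(device_link_wq);
  }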
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Keep the PXA*_SSP types together in enum pxa_ssp_type
for better maintenance.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://msgid.link/r/20240326181027.1418989-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Patch series "ARM: prctl: Reject PR_SET_MDWE where not supported".
I noticed after a recent kernel update that my ARM926 system started
segfaulting on any execve() after calling prctl(PR_SET_MDWE). After some
investigation it appears that ARMv5 is incapable of providing the
appropriate protections for MDWE, since any readable memory is also
implicitly executable.
The prctl_set_mdwe() function already had some special-case logic added
disabling it on PARISC (commit 793838138c15, "prctl: Disable
prctl(PR_SET_MDWE) on parisc"); this patch series (1) generalizes that
check to use an arch_*() function, and (2) adds a corresponding override
for ARM to disable MDWE on pre-ARMv6 CPUs.
With the series applied, prctl(PR_SET_MDWE) is rejected on ARMv5 and
subsequent execve() calls (as well as mmap(PROT_READ|PROT_WRITE)) can
succeed instead of unconditionally failing; on ARMv6 the prctl works as it
did previously.
[0] https://lore.kernel.org/all/2023112456-linked-nape-bf19@gregkh/
This patch (of 2):
There exist systems other than PARISC where MDWE may not be feasible to
support; rather than cluttering up the generic code with additional
arch-specific logic let's add a generic function for checking MDWE support
and allow each arch to override it as needed.
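A sketch of that shape, assuming the helper is a macro-overridable inline
in a shared header (names illustrative):

  /* Generic default: assume the architecture can enforce MDWE. */
  #ifndef arch_memory_deny_write_execute_supported
  static inline bool arch_memory_deny_write_execute_supported(void)
  {
          return true;
  }
  #define arch_memory_deny_write_execute_supported \
          arch_memory_deny_write_execute_supported
  #endif

  /* In prctl_set_mdwe(): reject the request where unsupported. */
  if (!arch_memory_deny_write_execute_supported())
          return -EINVAL;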
Link: https://lkml.kernel.org/r/20240227013546.15769-4-zev@bewilderbeest.net
Link: https://lkml.kernel.org/r/20240227013546.15769-5-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Russell King (Oracle) <linux@armlinux.org.uk>
Cc: Sam James <sam@gentoo.org>
Cc: Stefan Roesch <shr@devkernel.io>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: <stable@vger.kernel.org> [6.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On a 104-thread, 2-socket Skylake system, Intel reported a 4.7% performance
reduction with will-it-scale page_fault2. This was due to reducing the
size of the batch from 32 to 15. Increasing the folio batch size from 15
to 31 gives a performance increase of 12.5% relative to the original, or
17.2% relative to the reduced performance commit.
The penalty of this commit is an additional 128 bytes of stack usage. Six
folio_batches are also allocated from percpu memory in cpu_fbatches so
that will be an additional 768 bytes of percpu memory (per CPU). Tim Chen
originally submitted a patch like this in 2020:
https://lore.kernel.org/linux-mm/d1cc9f12a8ad6c2a52cb600d93b06b064f2bbc57.1593205965.git.tim.c.chen@linux.intel.com/
Link: https://lkml.kernel.org/r/20240315140823.2478146-1-willy@infradead.org
Fixes: 99fbb6bfc16f ("mm: make folios_put() the basis of release_pages()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202403151058.7048f6a8-oliver.sang@intel.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Last user of skb_free_datagram_locked() went away in 2016
with commit 850cbaddb52d ("udp: use it's own memory
accounting schema").
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240325134155.620531-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
Fix the kernel documentation and an inclusion block, and drop the size
of num_chipselect.
|
|
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
A couple of cleanups against linux/spi/pxa2xx_spi.h.
I'm sending this as v3 to land in the SPI subsystem. Meanwhile I'm
preparing an update to make linux/spi/pxa2xx_spi.h private to the
subsystem (PXA2xx driver). But the second part will be presented later
on (likely after v6.9-rc1). That said, this can be routed either via
SoC tree or SPI, up to respective maintainers.
|
|
The gpio_device_find_by_() functions do not have stubs, which means that if
they are referenced from code with an optional dependency on gpiolib then
the code will fail to link. Add stubs for lookups via fwnode and label. I
have not added a stub for plain gpio_device_find() since it seems harder to
see a use case for that which does not depend on gpiolib.
With the addition of the GPIO reset controller (which lacks a gpiolib
dependency) to the arm64 defconfig this is causing build breaks for arm64
virtconfig in -next:
aarch64-linux-gnu-ld: drivers/reset/core.o: in function `__reset_add_reset_gpio_lookup':
/build/stage/linux/drivers/reset/core.c:861:(.text+0xccc): undefined reference to `gpio_device_find_by_fwnode'
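The stubs amount to returning NULL when gpiolib is compiled out, roughly:

  #ifdef CONFIG_GPIOLIB
  struct gpio_device *gpio_device_find_by_label(const char *label);
  struct gpio_device *
  gpio_device_find_by_fwnode(const struct fwnode_handle *fwnode);
  #else
  /* Lookups trivially fail without gpiolib; callers handle NULL. */
  static inline struct gpio_device *
  gpio_device_find_by_label(const char *label)
  {
          return NULL;
  }
  static inline struct gpio_device *
  gpio_device_find_by_fwnode(const struct fwnode_handle *fwnode)
  {
          return NULL;
  }
  #endif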
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Unregistering SCMI notifications using the managed devres interface can
be done by providing as a reference simply the previously registered
notification block, since it can have been registered on only one kernel
notification chain: drop any reference to SCMI protocol, events and
sources.
The devres internal helpers can search for the provided notification
block reference and, once found, the associated devres object already
provides the above SCMI references for the event.
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20240325204620.1437237-5-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
|
|
The defer_list is a per-CPU list which is used to free skbs outside of
the socket lock and on the CPU on which they have been allocated.
The list is processed during NAPI callbacks so ideally the list is
cleaned up.
Should the number of skbs on the list exceed a certain watermark then
the softirq is triggered remotely on the target CPU by invoking a remote
function call. Raising the softirqs via a remote function call leads to
waking the ksoftirqd on PREEMPT_RT, which is undesired.
The backlog-NAPI threads already provide the infrastructure which can be
utilized to perform the cleanup of the defer_list.
The NAPI state is updated with the input_pkt_queue.lock acquired. In
order not to break the state, it is needed to also wake the backlog-NAPI
thread with the lock held. This requires taking the lock in
rps_lock_irq*() if the backlog-NAPI threads are used even with RPS
disabled.
Move the logic of remotely starting softirqs to clean up the defer_list
into kick_defer_list_purge(). Make sure a lock is held in
rps_lock_irq*() if backlog-NAPI threads are used. Schedule backlog-NAPI
for defer_list cleanup if backlog-NAPI is available.
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Since both ext4 and overlayfs define the same macro to specify string
parameters that may allow empty values, define it in a header file so
that this helper can be shared.
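The shared helper presumably ends up as a one-liner in fs_parser.h, along
the lines of:

  /* String parameter that may be empty, e.g. "opt=". */
  #define fsparam_string_empty(NAME, OPT) \
          __fsparam(fs_param_is_string, NAME, OPT, \
                    fs_param_can_be_empty, NULL)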
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Link: https://lore.kernel.org/r/20240312104757.27333-1-luis.henriques@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a new statx field for (sub)volume identifiers, as implemented by
btrfs and bcachefs.
This includes bcachefs support; we'll definitely want btrfs support as
well.
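From userspace the new field would be queried like this (sketch; assumes
uapi headers that define STATX_SUBVOL and stx_subvol):

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <fcntl.h>
  #include <sys/stat.h>

  int main(int argc, char **argv)
  {
          struct statx stx;

          if (argc < 2 || statx(AT_FDCWD, argv[1], 0, STATX_SUBVOL, &stx))
                  return 1;
          /* The filesystem sets the bit only if it filled the field in. */
          if (stx.stx_mask & STATX_SUBVOL)
                  printf("subvolume id: %llu\n",
                         (unsigned long long)stx.stx_subvol);
          return 0;
  }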
Link: https://lore.kernel.org/linux-fsdevel/2uvhm6gweyl7iyyp2xpfryvcu2g3padagaeqcbiavjyiis6prl@yjm725bizncq/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20240308022914.196982-1-kent.overstreet@linux.dev
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
There is a problem when a driver requests a shared interrupt line to run a
threaded handler on it without IRQF_ONESHOT set if that flag has been set
already for the IRQ in question by somebody else. Namely, the request
fails which usually leads to a probe failure even though the driver might
have worked just fine with IRQF_ONESHOT, but it does not want to use it by
default. Currently, the only way to handle this is to try to request the
IRQ without IRQF_ONESHOT, but with IRQF_PROBE_SHARED set and if this fails,
try again with IRQF_ONESHOT set. However, this is a bit cumbersome and not
very clean.
When commit 7a36b901a6eb ("ACPI: OSL: Use a threaded interrupt handler for
SCI") switched the ACPI subsystem over to using a threaded interrupt
handler for the SCI, it had to use IRQF_ONESHOT for it because that's
required due to the way the SCI handler works (it needs to walk all of the
enabled GPEs before the interrupt line can be unmasked). The SCI interrupt
line is not shared with other users very often due to the SCI handling
overhead, but on some systems it is shared and when the other user of it
attempts to install a threaded handler, a flags mismatch related to
IRQF_ONESHOT may occur.
As it turned out, that happened to the pinctrl-amd driver and so commit
4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
attempted to address the issue by adding IRQF_ONESHOT to the interrupt
flags in that driver, but this is now causing an IRQF_ONESHOT-related
mismatch to occur on another system which cannot boot as a result of it.
Clearly, pinctrl-amd can work with IRQF_ONESHOT if need be, but it should
not set that flag by default, so it needs a way to indicate that to the
interrupt subsystem.
To that end, introduce a new interrupt flag, IRQF_COND_ONESHOT, which will
only have effect when the IRQ line is shared and IRQF_ONESHOT has been set
for it already, in which case it will be promoted to the latter.
This is sufficient for drivers sharing the interrupt line with the SCI as
it is requested by the ACPI subsystem before any drivers are probed, so
they will always see IRQF_ONESHOT set for the interrupt in question.
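A driver sharing such a line then requests it along these lines (sketch,
with pinctrl-amd-style names assumed):

  /* Tolerate, but do not demand, IRQF_ONESHOT: it is applied only if
   * the already-installed shared handler (e.g. the SCI) required it. */
  ret = devm_request_irq(&pdev->dev, gpio_dev->irq, amd_gpio_irq_handler,
                         IRQF_SHARED | IRQF_COND_ONESHOT,
                         KBUILD_MODNAME, gpio_dev);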
Fixes: 4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
Reported-by: Francisco Ayala Le Brun <francisco@videowindow.eu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Closes: https://lore.kernel.org/lkml/CAN-StX1HqWqi+YW=t+V52-38Mfp5fAz7YHx4aH-CQjgyNiKx3g@mail.gmail.com/
Link: https://lore.kernel.org/r/12417336.O9o76ZdvQC@kreacher
|
|
There are no more users of the deprecated is_dma_mapped in struct
spi_message so it can be removed.
References in documentation and comments are also removed.
A few similar checks if xfer->tx_dma or xfer->rx_dma are not NULL are
also removed since these are now guaranteed to be NULL because they
were previously set only if is_dma_mapped was true.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://msgid.link/r/20240325-spi-remove-is_dma_mapped-v2-1-d08d62b61f1c@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add an optional irq_num attribute to the ad_sigma_delta_info structure for
selecting the interrupt line used to signal the ADC's conversion completion.
Signed-off-by: Dumitru Ceclan <mitrutzceclan@gmail.com>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20240228110622.25114-2-mitrutzceclan@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Export libata NCQ Priority configuration helpers to be reused for libsas
managed SATA devices.
Switch locking from spin_lock_irq() to spin_lock_irqsave(). In the
future someone might call these helper functions when interrupts are
disabled. spin_unlock_irq() could lead to a premature re-enabling of
interrupts, whereas spin_unlock_irqrestore() restores the interrupt state
to its condition prior to the spin_lock_irqsave() call.
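The pattern difference, sketched with libata's ap->lock:

  unsigned long flags;

  /* spin_lock_irq() would unconditionally re-enable interrupts on
   * unlock; irqsave/irqrestore preserves the caller's IRQ state. */
  spin_lock_irqsave(ap->lock, flags);
  /* ... update NCQ priority configuration ... */
  spin_unlock_irqrestore(ap->lock, flags);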
Acked-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Link: https://lore.kernel.org/r/20240307214418.3812290-2-ipylypiv@google.com
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Similar to the recently proposed for_each_child_of_node_scoped(), this
new version of the loop macro instantiates a new local
struct fwnode_handle * that uses the __free(fwnode_handle) auto
cleanup handling so that if a reference to a node is held on early
exit from the loop the reference will be released. If the loop
runs to completion, the child pointer will be NULL and no action will
be taken.
The reason this is useful is that it removes the need for
fwnode_handle_put() on early loop exits. If there is a need
to retain the reference, then return_ptr(child) or no_free_ptr(child)
may be used to safely disable the auto cleanup.
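Usage then reads like this (sketch; assumes the macro is spelled
device_for_each_child_node_scoped() as elsewhere in the series, and
parse_channel() is a hypothetical per-child helper):

  device_for_each_child_node_scoped(dev, child) {
          ret = parse_channel(dev, child);
          if (ret)
                  return ret;   /* child reference dropped automatically */
  }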
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20240217164249.921878-5-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Useful where the fwnode_handle was obtained from a call such as
fwnode_find_reference() as it will safely do nothing if IS_ERR() is true
and will automatically release the reference on the variable leaving
scope.
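For example (sketch; the "backlight" reference name is hypothetical):

  struct fwnode_handle *fwnode __free(fwnode_handle) =
          fwnode_find_reference(dev_fwnode(dev), "backlight", 0);

  if (IS_ERR(fwnode))
          return PTR_ERR(fwnode); /* cleanup safely skips IS_ERR values */
  /* ... use fwnode; the reference is put when it goes out of scope ... */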
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20240217164249.921878-3-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
By having this function as a static inline in the header, the compiler
is able to see whether it can optimize the call out when
IS_ERR_OR_NULL(fwnode) is true.
This will allow a simpler DEFINE_FREE() call in the following patch.
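A sketch of the resulting header version, with the check visible to the
compiler (the explicit early return stands in for whatever form the real
inline body takes):

  static inline void fwnode_handle_put(struct fwnode_handle *fwnode)
  {
          if (IS_ERR_OR_NULL(fwnode))
                  return;         /* provably dead for constant inputs */
          fwnode_call_void_op(fwnode, put);
  }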
Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240217164249.921878-2-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Commit 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop
management") introduced the manage_system_start_stop scsi_device flag to
allow libata to indicate to the SCSI disk driver that nothing should be
done when resuming a disk on system resume. This change turned the
execution of sd_resume() into a no-op for ATA devices on system
resume. While this solved deadlock issues during device resume, this change
also wrongly removed the execution of opal_unlock_from_suspend(). As a
result, devices with TCG OPAL locking enabled remain locked and
inaccessible after a system resume from sleep.
To fix this issue, introduce the SCSI driver resume method and implement it
with the sd_resume() function calling opal_unlock_from_suspend(). The
former sd_resume() function is renamed to sd_resume_common() and modified
to call the new sd_resume() function. For non-ATA devices, this results in
no functional changes.
In order for libata to explicitly execute sd_resume() when a device is
resumed during system restart, the function scsi_resume_device() is
introduced. libata calls this function from the revalidation work executed
on device resume, a state that is indicated with the new device flag
ATA_DFLAG_RESUMING. Doing so, locked TCG OPAL enabled devices are unlocked
on resume, allowing normal operation.
Fixes: 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218538
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240319071209.1179257-1-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The enable_and_queue_work() function is introduced to streamline
the process of enabling and queuing a work item on a specific
workqueue. This function combines the functionalities of
enable_work() and queue_work() in a single call, providing a
concise and convenient API for enabling and queuing work items.
The function accepts a target workqueue and a work item as parameters.
It first attempts to enable the work item using enable_work(). A successful
enable operation means that the work item was previously disabled
and is now marked as eligible for execution. If the enable operation
is successful, the work item is then queued on the specified workqueue
using queue_work(). The function returns true if the work item was
successfully enabled and queued, and false otherwise.
Note: This function may lead to unnecessary spurious wake-ups in cases
where the work item is expected to be dormant but enable/disable are called
frequently. Spurious wake-ups refer to the condition where worker threads
are woken up without actual work to be done. Callers should be aware of
this behavior and may need to employ additional synchronization mechanisms
to avoid these overheads if such wake-ups are not desired.
This addition aims to enhance code readability and maintainability by
providing a unified interface for the common use case of enabling and
queuing work items on a workqueue.
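The helper is essentially (sketch):

  static inline bool enable_and_queue_work(struct workqueue_struct *wq,
                                           struct work_struct *work)
  {
          if (enable_work(work)) {
                  queue_work(wq, work);   /* just became eligible */
                  return true;
          }
          return false;   /* still disabled, or was not disabled at all */
  }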
tj: Made the function comment more compact.
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Allow more worker pools to be shared when attrs->affn_strict is set.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a regression that broke iwd as well as a divide by zero in
iaa"
* tag 'v6.9-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: iaa - Fix nr_cpus < nr_iaa case
Revert "crypto: pkcs7 - remove sha1 support"
|
|
Add an off-queue flag, WORK_OFFQ_BH, that indicates whether the last
workqueue the work item was on was a BH one. This will be used to test
whether a work item is BH in cancel_sync path to implement atomic
cancel_sync'ing for BH work items.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
|
|
cancel[_delayed]_work_sync() guarantees that it can shut down
self-requeueing work items. To achieve that, it grabs and then holds the
WORK_STRUCT_PENDING bit while flushing the currently executing instance.
As the PENDING bit is set, all queueing attempts including the
self-requeueing ones fail and once the currently executing instance is
flushed, the work item should be idle as long as someone else isn't actively
queueing it.
This means that the cancel_work_sync path may hold the PENDING bit set while
flushing the target work item. This isn't a problem for the queueing path -
it can just fail which is the desired effect. It doesn't affect flush. It
doesn't matter to cancel_work either as it can just report that the work
item has successfully canceled. However, if there's another cancel_work_sync
attempt on the work item, it can't simply fail or report success and that
would breach the guarantee that it should provide. cancel_work_sync has to
wait for and grab that PENDING bit and go through the motions.
WORK_OFFQ_CANCELING and wq_cancel_waitq are what implement this
cancel_work_sync to cancel_work_sync wait mechanism. When a work item is
being canceled, WORK_OFFQ_CANCELING is also set on it and other
cancel_work_sync attempts wait on the bit to be cleared using the wait
queue.
While this works, it's an isolated wart which doesn't jive with the rest of
flush and cancel mechanisms and forces enable_work() and disable_work() to
require a sleepable context, which hampers their usability.
Now that a work item can be disabled, we can use that to block queueing
while cancel_work_sync is in progress. Instead of holding the PENDING bit,
it can temporarily disable the work item, flush and then re-enable it as
that'd achieve the same end result of blocking queueings while canceling and
thus enable canceling of self-requeueing work items.
- WORK_OFFQ_CANCELING and the surrounding mechanisms are removed.
- work_grab_pending() is now simpler, no longer has to wait for a blocking
operation and thus can be called from any context.
- With work_grab_pending() simplified, no need to use try_to_grab_pending()
directly. All users are converted to use work_grab_pending().
- __cancel_work_sync() is updated to __cancel_work() with
WORK_CANCEL_DISABLE to cancel and plug racing queueing attempts. It then
flushes and re-enables the work item if necessary.
- These changes allow disable_work() and enable_work() to be called from any
context.
v2: Lai pointed out that mod_delayed_work_on() needs to check the disable
count before queueing the delayed work item. Added
clear_pending_if_disabled() call.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
|
|
While (delayed) work items could be flushed and canceled, there was no way
to prevent them from being queued in the future. While this didn't lead to
functional deficiencies, it sometimes required a bit more effort from the
workqueue users to e.g. sequence shutdown steps with more care.
Workqueue is currently in the process of replacing tasklet which does
support disabling and enabling. The feature is used relatively widely to,
for example, temporarily suppress main path while a control plane operation
(reset or config change) is in progress.
To enable easy conversion of tasklet users and as it seems like an
inherently useful feature, this patch implements disabling and enabling
of work items.
- A work item carries 16bit disable count in work->data while not queued.
The access to the count is synchronized by the PENDING bit like all other
parts of work->data.
- If the count is non-zero, the work item cannot be queued. Any attempt to
queue the work item fails and returns %false.
- disable_work[_sync](), enable_work(), disable_delayed_work[_sync]() and
enable_delayed_work() are added.
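A typical conversion then looks like this (sketch with hypothetical
driver fields):

  /* Block further queueing and wait for a running instance to finish. */
  disable_work_sync(&priv->event_work);
  /* ... reconfigure hardware while the work item cannot run ... */
  enable_work(&priv->event_work);         /* queueing succeeds again */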
v3: enable_work() was using local_irq_enable() instead of
local_irq_restore() to undo IRQ-disable by work_grab_pending(). This is
awkward now and will become incorrect as enable_work() will later be
used from IRQ context too. (Lai)
v2: Lai noticed that queue_work_node() wasn't checking the disable count.
Fixed. queue_rcu_work() is updated to trigger warning if the inner work
item is disabled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
|
|
The cancel[_sync] paths acquire and release WORK_STRUCT_PENDING, and
manipulate WORK_OFFQ_CANCELING. However, they assume that all the OFFQ bit
values except for the pool ID are statically known and don't preserve them,
which is not wrong in the current code as the pool ID and CANCELING are the
only information carried. However, the planned disable/enable support will
add more fields and need them to be preserved.
This patch updates work data handling so that only the bits which need
updating are updated.
- struct work_offq_data is added along with work_offqd_unpack() and
work_offqd_pack_flags() to help manipulating multiple fields contained in
work->data. Note that the helpers look a bit silly right now as there
isn't that much to pack. The next patch will add more.
- mark_work_canceling() which is used only by __cancel_work_sync() is
replaced by open-coded usage of work_offq_data and
set_work_pool_and_keep_pending() in __cancel_work_sync().
- __cancel_work[_sync]() uses offq_data helpers to preserve other OFFQ bits
when clearing WORK_STRUCT_PENDING and WORK_OFFQ_CANCELING at the end.
- This removes all users of get_work_pool_id() which is dropped. Note that
get_work_pool_id() could handle both WORK_STRUCT_PWQ and !WORK_STRUCT_PWQ
cases; however, it was only being called after try_to_grab_pending()
succeeded, in which case WORK_STRUCT_PWQ is never set and thus it's safe
to use work_offqd_unpack() instead.
No behavior changes intended.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
|
|
There are no in-kernel users of struct rspi_plat_data. If required,
software nodes should be used for such users. For now,
just get rid of it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://msgid.link/r/20240325142118.3210915-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The RISC-V advanced interrupt architecture (AIA) specification defines
advanced platform-level interrupt controller (APLIC) which has two modes
of operation: 1) Direct mode and 2) MSI mode.
(For more details, refer https://github.com/riscv/riscv-aia)
In APLIC direct-mode, wired interrupts are forwarded to CPUs (or HARTs)
as a local external interrupt.
Add a platform irqchip driver for the RISC-V APLIC direct-mode to
support RISC-V platforms having only wired interrupts.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240307140307.646078-7-apatel@ventanamicro.com
|
|
The RISC-V advanced interrupt architecture (AIA) specification
defines a new MSI controller called incoming message signalled
interrupt controller (IMSIC) which manages MSI on per-HART (or
per-CPU) basis. It also supports IPIs as software injected MSIs.
(For more details refer https://github.com/riscv/riscv-aia)
Add an early irqchip driver for RISC-V IMSIC which sets up the
IMSIC state and provides IPIs.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240307140307.646078-3-apatel@ventanamicro.com
|
|
Implement a helper function to check if an instruction is
addr_space_cast from as(0) to as(1). Use this helper in the x86 JIT.
Other JITs can use this helper when they add support for this instruction.
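The helper reduces to a pattern match on the instruction encoding
(sketch; the imm layout of dst/src address spaces is an assumption here):

  /* True for addr_space_cast dst, src from as(0) to as(1), where imm
   * encodes the destination address space in its upper 16 bits. */
  static inline bool insn_is_cast_user(const struct bpf_insn *insn)
  {
          return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
                 insn->off == BPF_ADDR_SPACE_CAST &&
                 insn->imm == 1U << 16;
  }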
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240324183226.29674-1-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Update power_type in bss_conf based on Indoor AFC and LPI power types
received in HE 6 GHz operation element on assoc success.
Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240318184907.89c25dae34ff.Ifd8b2983f400623ac03dc032fc9a20025c9ca365@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If there was a possibility of an MLE basic STA profile without
subelements, we might reject it because we account for the one
octet for sta_info_len twice (it's part of itself, and in the
fixed portion). Like in ieee80211_mle_reconf_sta_prof_size_ok,
subtract 1 to adjust that.
When reading the elements we did take this into account, and
since there are always elements, this never really mattered.
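In other words, the size check becomes (hedged sketch, names illustrative):

  /* sta_info_len counts itself, and the fixed portion already includes
   * that octet, so subtract 1 to avoid counting it twice. */
  if (len < fixed + prof->sta_info_len - 1)
          return false;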
Fixes: 7b6f08771bf6 ("wifi: ieee80211: Support validating ML station profile length")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240318184907.00bb0b20ed60.I8c41dd6fc14c4b187ab901dea15ade73c79fb98c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
For simplicity, we may want to pass a NULL element, and
while we should then pass also a zero length, just be a
bit more careful here.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240318184907.4d983653cb8d.Ic3ea99b60c61ac2f7d38cb9fd202a03c97a05601@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If a misfit task is affined to a subset of the possible CPUs, we need to
verify that one of these CPUs can fit it. Otherwise the load balancer
code will continuously trigger needlessly, leading the balance_interval
to increase in return, and we eventually end up in a situation where
real imbalances take a long time to address because of this impossible
imbalance situation.
This can happen in Android world where it's common for background tasks
to be restricted to little cores.
Similarly if we can't fit the biggest core, triggering misfit is
pointless as it is the best we can ever get on this system.
To be able to detect that, we use asym_cap_list to iterate through
capacities in the system to see if the task is able to run at a higher
capacity level based on its p->cpus_ptr. We do that when the affinity
changes, a fair task is forked, or a task switches to the fair policy.
We store the max_allowed_capacity in task_struct to allow for cheap
comparison in the fast path.
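Conceptually, the cached value is refreshed by walking asym_cap_list
(sketch; assumes the list is kept sorted by descending capacity so the
first intersecting entry is the maximum):

  static void set_task_max_allowed_capacity(struct task_struct *p)
  {
          struct asym_cap_data *entry;

          rcu_read_lock();
          list_for_each_entry_rcu(entry, &asym_cap_list, link) {
                  cpumask_t *cpumask = cpu_capacity_span(entry);

                  if (!cpumask_intersects(p->cpus_ptr, cpumask))
                          continue;

                  p->max_allowed_capacity = entry->capacity;
                  break;
          }
          rcu_read_unlock();
  }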
Improve check_misfit_status() function by removing redundant checks.
misfit_task_load will be 0 if the task can't move to a bigger CPU. And
nohz_balancer_kick() already checks for cpu_check_capacity() before
calling check_misfit_status().
Test:
=====
Add
trace_printk("balance_interval = %lu\n", interval)
in get_sd_balance_interval().
run
if [ "$MASK" != "0" ]; then
adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
fi
sleep 10
// parse ftrace buffer counting the occurrence of each value
Where MASK is either:
* 0: no busy task running
* 1: busy task is pinned to 1 cpu; handled today to not cause
misfit
* f: busy task pinned to little cores, simulates busy background
task, demonstrates the problem to be fixed
Results:
========
Note how occurrence of balance_interval = 128 overshoots for MASK = f.
BEFORE
------
MASK=0
1 balance_interval = 175
120 balance_interval = 128
846 balance_interval = 64
55 balance_interval = 63
215 balance_interval = 32
2 balance_interval = 31
2 balance_interval = 16
4 balance_interval = 8
1870 balance_interval = 4
65 balance_interval = 2
MASK=1
27 balance_interval = 175
37 balance_interval = 127
840 balance_interval = 64
167 balance_interval = 63
449 balance_interval = 32
84 balance_interval = 31
304 balance_interval = 16
1156 balance_interval = 8
2781 balance_interval = 4
428 balance_interval = 2
MASK=f
1 balance_interval = 175
1328 balance_interval = 128
44 balance_interval = 64
101 balance_interval = 63
25 balance_interval = 32
5 balance_interval = 31
23 balance_interval = 16
23 balance_interval = 8
4306 balance_interval = 4
177 balance_interval = 2
AFTER
-----
Note how the high values almost disappear for all MASK values. The
system has background tasks that could trigger the problem without
simulating it, even with MASK=0.
MASK=0
103 balance_interval = 63
19 balance_interval = 31
194 balance_interval = 8
4827 balance_interval = 4
179 balance_interval = 2
MASK=1
131 balance_interval = 63
1 balance_interval = 31
87 balance_interval = 8
3600 balance_interval = 4
7 balance_interval = 2
MASK=f
8 balance_interval = 127
182 balance_interval = 63
3 balance_interval = 31
9 balance_interval = 16
415 balance_interval = 8
3415 balance_interval = 4
21 balance_interval = 2
Signed-off-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20240324004552.999936-3-qyousef@layalina.io
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There is no use for a whole 16 bits for the number of chip select pins.
Drop it to 8 bits.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://msgid.link/r/20240307195056.4059864-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
There is only one user of pxa2xx_set_spi_info(). Unexport it
and inline it into the actual user.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://msgid.link/r/20240307195056.4059864-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
There is no use for a whole 16 bits for the number of chip select pins.
Drop it to 8 bits and reshuffle the data structure layout to avoid
unnecessary padding.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Michal Simek <michal.simek@amd.com>
Link: https://msgid.link/r/20240308162920.46816-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
xilinx_spi.h is missing an inclusion and a forward declaration; add them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Michal Simek <michal.simek@amd.com>
Link: https://msgid.link/r/20240308162920.46816-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
While updating the data structure layout the kernel documentation
became outdated. Synchronize kernel documentation with the actual
data structure layout.
Fixes: 1dd46599f83a ("spi: xilinx: add force_irq for QSPI mode")
Fixes: 082339bc63cc ("spi: spi-xilinx: Add run run-time endian detection")
Reviewed-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://msgid.link/r/20240308162920.46816-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add support for the new PMIC PCA9451A.
Signed-off-by: Joy Zou <joy.zou@nxp.com>
Link: https://msgid.link/r/20240318095633.4079027-3-joy.zou@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
- Handle errors in mark_rodata_ro() and mark_initmem_nx()
- Make struct crash_mem available without CONFIG_CRASH_DUMP
Thanks to Christophe Leroy and Hari Bathini.
* tag 'powerpc-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency
powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
kexec/kdump: make struct crash_mem available without CONFIG_CRASH_DUMP
powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more hardening updates from Kees Cook:
- CONFIG_MEMCPY_SLOW_KUNIT_TEST is no longer needed (Guenter Roeck)
- Fix needless UTF-8 character in arch/Kconfig (Liu Song)
- Improve __counted_by warning message in LKDTM (Nathan Chancellor)
- Refactor DEFINE_FLEX() for default use of __counted_by
- Disable signed integer overflow sanitizer on GCC < 8
* tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lkdtm/bugs: Improve warning message for compilers without counted_by support
overflow: Change DEFINE_FLEX to take __counted_by member
Revert "kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST"
arch/Kconfig: eliminate needless UTF-8 character in Kconfig help
ubsan: Disable signed integer overflow sanitizer on GCC < 8
|
|
The norm should be flexible array structures with __counted_by
annotations, so DEFINE_FLEX() is updated to expect that. Rename
the non-annotated version to DEFINE_RAW_FLEX(), and update the
few existing users. Additionally add selftests for the macros.
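Usage sketch (hypothetical struct):

  struct frames {
          u16 len;
          u32 data[] __counted_by(len);
  };

  /* On-stack flexible structure; the macro also sets ->len = 8. */
  DEFINE_FLEX(struct frames, fr, data, len, 8);

  for (int i = 0; i < fr->len; i++)
          fr->data[i] = i;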
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20240306235128.it.933-kees@kernel.org
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for various vector-accelerated crypto routines
- Hibernation is now enabled for portable kernel builds
- mmap_rnd_bits_max is larger on systems with larger VAs
- Support for fast GUP
- Support for membarrier-based instruction cache synchronization
- Support for the Andes hart-level interrupt controller and PMU
- Some cleanups around unaligned access speed probing and Kconfig
settings
- Support for ACPI LPI and CPPC
- Various cleanups related to barriers
- A handful of fixes
* tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
riscv: Fix syscall wrapper for >word-size arguments
crypto: riscv - add vector crypto accelerated AES-CBC-CTS
crypto: riscv - parallelize AES-CBC decryption
riscv: Only flush the mm icache when setting an exec pte
riscv: Use kcalloc() instead of kzalloc()
riscv/barrier: Add missing space after ','
riscv/barrier: Consolidate fence definitions
riscv/barrier: Define RISCV_FULL_BARRIER
riscv/barrier: Define __{mb,rmb,wmb}
RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
cpufreq: Move CPPC configs to common Kconfig and add RISC-V
ACPI: RISC-V: Add CPPC driver
ACPI: Enable ACPI_PROCESSOR for RISC-V
ACPI: RISC-V: Add LPI driver
cpuidle: RISC-V: Move few functions to arch/riscv
riscv: Introduce set_compat_task() in asm/compat.h
riscv: Introduce is_compat_thread() into compat.h
riscv: add compile-time test into is_compat_task()
riscv: Replace direct thread flag check with is_compat_task()
riscv: Improve arch_get_mmap_end() macro
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Add objtool support for LoongArch
- Add ORC stack unwinder support for LoongArch
- Add kernel livepatching support for LoongArch
- Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
- Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
- Some bug fixes and other small changes
* tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch/crypto: Clean up useless assignment operations
LoongArch: Define the __io_aw() hook as mmiowb()
LoongArch: Remove superfluous flush_dcache_page() definition
LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h
LoongArch: Change __my_cpu_offset definition to avoid mis-optimization
LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
LoongArch: Add kernel livepatching support
LoongArch: Add ORC stack unwinder support
objtool: Check local label in read_unwind_hints()
objtool: Check local label in add_dead_ends()
objtool/LoongArch: Enable orc to be built
objtool/x86: Separate arch-specific and generic parts
objtool/LoongArch: Implement instruction decoder
objtool/LoongArch: Enable objtool to be built
|