This reverts commit 464eb03c4a7cfb32cb3324249193cf6bb5b35152.
Once we have a per-netns RTNL, we won't use guard(rtnl).
Also, there are no users for now.
$ grep -rnI "guard(rtnl" || true
$
Suggested-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/CANn89i+KoYzUH+VPLdGmLABYf5y4TW0hrM4UAeQQJ9AREty0iw@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The omap4 camera driver has seen no progress since forever, and
now OMAP4 support has also been dropped from u-boot (1). So it is
time to retire this driver.
(1): https://lists.denx.de/pipermail/u-boot/2024-July/558846.html
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
|
|
Darrick J. Wong <djwong@kernel.org> says:
This patchset fixes multiple data corruption bugs in the fallocate
unshare range implementation for fsdax.
* patches from https://lore.kernel.org/r/172796813251.1131942.12184885574609980777.stgit@frogsfrogsfrogs:
fsdax: dax_unshare_iter needs to copy entire blocks
fsdax: remove zeroing code from dax_unshare_iter
iomap: share iomap_unshare_iter predicate code with fsdax
xfs: don't allocate COW extents when unsharing a hole
Link: https://lore.kernel.org/r/172796813251.1131942.12184885574609980777.stgit@frogsfrogsfrogs
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Introduce support for configuring the master/slave role of PHYs based on
the `timing-role` property in the device tree. While this functionality
is necessary for Single Pair Ethernet (SPE) PHYs (1000/100/10Base-T1)
where hardware strap pins may be unavailable or incorrectly set, it
works for any PHY type.
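As a rough illustration of the binding, the sketch below maps the
`timing-role` strings from the ethernet-phy binding onto the
MASTER_SLAVE_CFG_* uapi constants; the helper name is hypothetical, and
only the constants, the property name, and fwnode_property_read_string()
are existing kernel interfaces.

  #include <linux/ethtool.h>
  #include <linux/property.h>
  #include <linux/string.h>

  /* Hypothetical helper, for illustration only */
  static int phy_get_timing_role_sketch(struct fwnode_handle *fwnode)
  {
  	const char *role;

  	if (fwnode_property_read_string(fwnode, "timing-role", &role))
  		return MASTER_SLAVE_CFG_UNKNOWN;	/* property absent */

  	if (!strcmp(role, "forced-master"))
  		return MASTER_SLAVE_CFG_MASTER_FORCE;
  	if (!strcmp(role, "forced-slave"))
  		return MASTER_SLAVE_CFG_SLAVE_FORCE;
  	if (!strcmp(role, "preferred-master"))
  		return MASTER_SLAVE_CFG_MASTER_PREFERRED;
  	if (!strcmp(role, "preferred-slave"))
  		return MASTER_SLAVE_CFG_SLAVE_PREFERRED;

  	return -EINVAL;		/* unrecognized string */
  }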
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Giving userspace the opportunity to associate a free-form name with a
drm_file struct is helpful for tracking and debugging.
This is similar to the existing DMA_BUF_SET_NAME ioctl.
Access to client_name is protected by a mutex, and the 'clients' debugfs
file has been updated to print it.
Userspace MR to use this ioctl:
https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1428
If the string passed by userspace contains characters that would mess up
the output when printed (in dmesg, fdinfo, etc.), -EINVAL is returned.
A 0-length string is a valid use, and clears the existing name.
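A userspace sketch of invoking the new ioctl; the struct layout and
ioctl request number below are assumptions for illustration, not the
authoritative uapi (see include/uapi/drm/drm.h once merged):

  #include <stdint.h>
  #include <string.h>
  #include <sys/ioctl.h>

  struct drm_set_client_name {		/* assumed layout */
  	uint64_t name_len;		/* string length, excluding NUL */
  	uint64_t name;			/* user pointer to the string */
  };

  /* assumed request number; 'd' is the DRM ioctl base */
  #define DRM_IOCTL_SET_CLIENT_NAME \
  	_IOWR('d', 0xD1, struct drm_set_client_name)

  static int set_client_name(int drm_fd, const char *name)
  {
  	struct drm_set_client_name args = {
  		.name_len = strlen(name),	/* 0 clears the name */
  		.name = (uint64_t)(uintptr_t)name,
  	};

  	return ioctl(drm_fd, DRM_IOCTL_SET_CLIENT_NAME, &args);
  }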
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241003124506.470931-2-pierre-eric.pelloux-prayer@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
This drm printer wrapper can be used to increase the robustness of the
captured output generated by any other drm_printer: by adding line
numbers to each output line, it makes sure we didn't lose any
intermediate lines of the output. Helpful for capturing some crash data.
v2: Extended short int counters to full int (JohnH)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-8-John.C.Harrison@Intel.com
|
|
Jordan reported that when running Cilium with netkit in per-endpoint-routes
mode, network policy misclassifies traffic. In this direct routing mode
of Cilium which is used in case of GKE/EKS/AKS, the Pod's BPF program to
enforce policy sits on the netkit primary device's egress side.
The issue here is that netkit's netkit_prep_forward() clears metadata
such as skb->mark and skb->priority before executing the BPF program.
Thus, identity data stored in there from earlier BPF programs
(e.g. from tcx ingress on the physical device) gets cleared instead of
being made available for the primary's program to process. While for traffic
egressing the Pod via the peer device this might be desired, this is
different for the primary one where compared to tcx egress on the host
veth this information would be available.
To address this, add a new parameter for the device orchestration to
allow control of skb->mark and skb->priority scrubbing, to make the two
accessible from BPF (and eventually leave it up to the program to scrub).
By default, the current behavior is retained. For netkit peer this also
enables the use case where applications could cooperate/signal intent to
the BPF program.
Note that struct netkit has a 4 byte hole between policy and bundle which
is used here, in other words, struct netkit's first cacheline content used
in fast-path does not get moved around.
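A sketch of the resulting knob; the NETKIT_SCRUB_* names are assumptions
based on the description above (see include/uapi/linux/if_link.h for the
authoritative definitions):

  enum netkit_scrub {
  	NETKIT_SCRUB_NONE,	/* keep skb->mark/priority for BPF */
  	NETKIT_SCRUB_DEFAULT,	/* current behavior: clear both */
  };

  /* rough shape of the conditional scrub in netkit_prep_forward() */
  static void netkit_scrub_meta_sketch(struct sk_buff *skb,
  				     enum netkit_scrub scrub)
  {
  	if (scrub == NETKIT_SCRUB_DEFAULT) {
  		skb->mark = 0;
  		skb->priority = 0;
  	}
  	/* NETKIT_SCRUB_NONE: the BPF program sees the original values
  	 * and may scrub them itself
  	 */
  }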
Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
Reported-by: Jordan Rife <jrife@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://github.com/cilium/cilium/issues/34042
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20241004101335.117711-1-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Upcoming per-netns RTNL conversion needs to get rid
of shared hash tables.
fib_info_devhash[] is one of them.
It is unclear why a hash table was used in the first place;
a single hlist_head per net device is cheaper and more scalable.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No driver now uses the DSA switch phylink members, so we can now remove
the method pointers, but we need to leave empty shim functions to allow
those drivers that do not provide phylink MAC operations structure to
continue functioning.
Signed-off-by: Russell King (oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # sja1105, felix, dsa_loop
Link: https://patch.msgid.link/E1swKNV-0060oN-1b@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add new SMI event to report the dropped event count.
When the event kfifo is full, when the drop count is already non-zero, or
when there is not enough space left to store the event message, increase
the drop count.
After reading an event out of the kfifo, if an event was dropped
(drop_count is non-zero), generate a dropped event record and reset the
drop count to zero.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If page migration failed, also output a migrate end event to match the
migrate start event, with the failure error_code appended to the end of
the migrate message macro. This will not break the uAPI because
applications using the old message macro with sscanf simply drop and
ignore the error_code.
Also output the GPU page fault restore end event if migration failed.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We never had callers for __close_range() except for close_range(2)
itself. Nothing of that sort has appeared in four years and if any users
do show up, we can always separate those suckers again.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Once upon a time, predecessors of those used to do file lookup
without bumping a refcount, provided that caller held rcu_read_lock()
across the lookup and whatever it wanted to read from the struct
file found. When struct file allocation switched to SLAB_TYPESAFE_BY_RCU,
that stopped being feasible and these primitives started to bump the
file refcount for lookup result, requiring the caller to call fput()
afterwards.
But that turned them pointless - e.g.
rcu_read_lock();
file = lookup_fdget_rcu(fd);
rcu_read_unlock();
is equivalent to
file = fget_raw(fd);
and all callers of lookup_fdget_rcu() are of that form. Similarly,
task_lookup_fdget_rcu() calls can be replaced with calling fget_task().
task_lookup_next_fdget_rcu() doesn't have direct counterparts, but
its callers would be happier if we replaced it with an analogue that
deals with RCU internally.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Merge spk_rtd_init for the multifunction SDCA codecs rt712/rt722.
Signed-off-by: Naveen Manohar <naveen.m@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20241007075955.12575-3-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Fixes build issues with the KVM selftests.
|
|
Pulls in a build fix for the KVM selftests.
|
|
The predicate code that iomap_unshare_iter uses to decide if it really
needs to unshare a file range mapping should be shared with the fsdax
version, because right now they're open-coded and inconsistent.
Note that we simplify the predicate logic a bit -- we no longer allow
unsharing of inline data mappings, but there aren't any filesystems that
allow shared inline data currently.
This is a fix in the sense that it should have been ported to fsdax.
Fixes: b53fdb215d13 ("iomap: improve shared block detection in iomap_unshare_iter")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/172796813294.1131942.15762084021076932620.stgit@frogsfrogsfrogs
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
netfslib currently defers dropping the ref on the folios it obtains during
readahead to after it has started I/O on the basis that we can do it whilst
we wait for the I/O to complete, but this runs the risk of the I/O
collection racing with this in future.
Furthermore, Matthew Wilcox strongly suggests that the refs should be
dropped immediately, as readahead_folio() does (netfslib is using
__readahead_batch() which doesn't drop the refs).
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/3771538.1728052438@warthog.procyon.org.uk
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Matthew Wilcox (Oracle) <willy@infradead.org> says:
A few minor fixes; nothing earth-shattering.
Matthew Wilcox (Oracle) (3):
netfs: Remove call to folio_index()
netfs: Fix a few minor bugs in netfs_page_mkwrite()
netfs: Remove unnecessary references to pages
Link: https://lore.kernel.org/r/20241005182307.3190401-1-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Calling folio_index() is pointless overhead; directly dereferencing
folio->index is fine.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20241005182307.3190401-2-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot of metadata updates, down to around 1
per jiffy, even when a file is under heavy writes.
Unfortunately, this has always been an issue when we're exporting via
NFSv3, which relies on timestamps to validate caches. A lot of changes
can happen in a jiffy, so timestamps aren't sufficient to help the
client decide when to invalidate the cache. Even with NFSv4, a lot of
exported filesystems don't properly support a change attribute and are
subject to the same problems with timestamp granularity. Other
applications have similar issues with timestamps (e.g backup
applications).
If fine-grained timestamps were always used, that would improve the
situation, but that becomes rather expensive, as the underlying
filesystem would have to log a lot more metadata updates.
What is needed is a way to only use fine-grained timestamps when they
are being actively queried. Use the (unused) top bit in
inode->i_ctime_nsec as a flag that indicates whether the current
timestamps have been queried via stat() or the like. When it's set,
allow the update to use a fine-grained timestamp iff it's necessary to
make the ctime show a different value.
If it has been queried, then first see whether the current coarse time
is later than the existing ctime. If it is, accept that value. If it
isn't, then get a fine-grained timestamp and attempt to stamp the inode
ctime with that value. If that races with another concurrent stamp, then
abandon the update and take the new value without retrying.
Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same floor
value as multigrain filesystems).
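A conceptual sketch of the scheme (the flag and helper names are
illustrative; the real implementation lives in fs/inode.c):

  #include <linux/fs.h>
  #include <linux/time64.h>

  #define I_CTIME_QUERIED_SKETCH (1U << 31)	/* top bit of i_ctime_nsec */

  static bool ctime_needs_fine_grained(struct inode *inode,
  				     struct timespec64 *coarse)
  {
  	u32 nsec = READ_ONCE(inode->i_ctime_nsec);
  	struct timespec64 cur = {
  		.tv_sec  = inode->i_ctime_sec,
  		.tv_nsec = nsec & ~I_CTIME_QUERIED_SKETCH,
  	};

  	if (!(nsec & I_CTIME_QUERIED_SKETCH))
  		return false;	/* never queried: coarse is fine */

  	/* Queried, and the coarse clock doesn't advance the ctime:
  	 * only a fine-grained stamp can show a distinct value.
  	 */
  	return timespec64_compare(coarse, &cur) <= 0;
  }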
Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20241002-mgtime-v10-3-d1c4717f5284@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Update header inclusions to follow IWYU (Include What You Use)
principle.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://lore.kernel.org/r/20241003154819.1075141-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
bit_wait_io_timeout has been unused since 2016's
commit 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Remove it.
Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: https://lore.kernel.org/r/20241001234016.231696-1-linux@treblig.org
|
|
__HAVE_THREAD_FUNCTIONS could be defined by architectures wishing to
provide their own task_thread_info(), task_stack_page(),
setup_thread_stack() and end_of_stack() hooks.
Commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture")
removed the last upstream consumer of __HAVE_THREAD_FUNCTIONS, so change
the remaining !CONFIG_THREAD_INFO_IN_TASK && !__HAVE_THREAD_FUNCTIONS
conditionals to only check for the former case.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20240930050945.30304-2-ddiss@suse.de
|
|
It is not currently possible to use wait_var_event() for an io_schedule()
style wait. This patch adds wait_var_event_io() for that purpose.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-7-neilb@suse.de
|
|
Sometimes we need to wait for a condition to become true which must be
tested while holding a lock. Correspondingly, the condition is made
true while holding the lock and the wake-up is sent under the lock.
This patch provides wake and wait interfaces which can be used for this
situation when the lock is a mutex or a spinlock, or any other lock for
which there are foo_lock() and foo_unlock() functions.
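A usage sketch, assuming the wait side is named wait_var_event_mutex()
and the wake side is a plain wake_up_var() issued under the lock:

  #include <linux/mutex.h>
  #include <linux/wait_bit.h>

  static DEFINE_MUTEX(my_lock);
  static int my_count;		/* protected by my_lock */

  static void producer_done(void)
  {
  	mutex_lock(&my_lock);
  	my_count--;
  	wake_up_var(&my_count);	/* wake-up sent under the lock */
  	mutex_unlock(&my_lock);
  }

  static void wait_for_drain(void)
  {
  	mutex_lock(&my_lock);
  	/* drops and re-takes my_lock around each sleep */
  	wait_var_event_mutex(&my_count, my_count == 0, &my_lock);
  	mutex_unlock(&my_lock);
  }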
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-6-neilb@suse.de
|
|
There are common patterns in the kernel of using test_and_clear_bit()
before wake_up_bit(), and atomic_dec_and_test() before wake_up_var().
These combinations don't need extra barriers but sometimes include them
unnecessarily.
To help avoid the unnecessary barriers and to help discourage the
general use of wake_up_bit/var (which is a fragile interface) introduce
two combined functions which implement these patterns.
Also add store_release_wake_up() which supports the task of simply
setting a non-atomic variable and sending a wakeup. This pattern
requires barriers which are often omitted.
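A before/after sketch for the non-atomic variable case;
store_release_wake_up() is named in the text above, and the open-coded
variant is what it replaces:

  #include <linux/wait_bit.h>

  static int state;	/* non-atomic flag, watched by waiters */

  static void open_coded(void)
  {
  	state = 1;
  	smp_mb();	/* barrier that is easy to omit by mistake */
  	wake_up_var(&state);
  }

  static void combined(void)
  {
  	/* equivalent, with the required barrier folded in */
  	store_release_wake_up(&state, 1);
  }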
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-5-neilb@suse.de
|
|
wake_up_var(), wait_var_event() and related interfaces are not
documented but have important ordering requirements. This patch adds
documentation and makes these requirements explicit.
The return values for those wait_var_event_* functions which return a
value are documented. Note that these are, perhaps surprisingly,
sometimes different from comparable wait_on_bit() functions.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-4-neilb@suse.de
|
|
This patch revises the documentation for wake_up_bit(),
clear_and_wake_up_bit(), and all the wait_on_bit() family of functions.
The new documentation places less emphasis on the pool of waitqueues
used (an implementation detail) and focuses instead on details of how
the functions behave.
The barriers included in the wait functions and clear_and_wake_up_bit()
and those required for wake_up_bit() are spelled out more clearly.
The error statuses returned are given explicitly.
The fact that the wait_on_bit_lock() function sets the bit is made more
obvious.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-3-neilb@suse.de
|
|
wake_up_bit() currently allows a "void *". While this isn't strictly a
problem as the address is never dereferenced, it is inconsistent with
the corresponding wait_on_bit() which requires "unsigned long *" and
does dereference the pointer.
Any code that needs to wait for a change in something other than an
unsigned long would be better served by wake_up_var()/wait_var_event().
This patch changes all related "void *" to "unsigned long *".
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-2-neilb@suse.de
|
|
Provide a new API rdma_user_mmap_disassociate() for drivers to
disassociate mmap pages for a device.
Since drivers can now disassociate mmaps by calling this API,
introduce a new disassociation_lock to specifically prevent
races between this disassociation process and new mmaps; the
old hw_destroy_rwsem is thus not needed in this API.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20240927103323.1897094-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Pick up the VFS specific interfaces so further timekeeping changes can be
based on them.
|
|
The mgtime_floor value is a global variable for tracking the latest
fine-grained timestamp handed out. Because it's a global, track the
number of times that a new floor value is assigned.
Add a new percpu counter to the timekeeping code to track the number of
floor swap events that have occurred. A later patch will add a debugfs
file to display this counter alongside other stats involving multigrain
timestamps.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits
Link: https://lore.kernel.org/all/20241002-mgtime-v10-2-d1c4717f5284@kernel.org
|
|
Multigrain timestamps allow the kernel to use fine-grained timestamps when
an inode's attributes are being actively observed via ->getattr(). With
this support, it's possible for a file to get a fine-grained timestamp, and
another modified after it to get a coarse-grained stamp that is earlier
than the fine-grained time. If this happens then the files can appear to
have been modified in reverse order, which breaks VFS ordering guarantees
[1].
To prevent this, maintain a floor value for multigrain timestamps.
Whenever a fine-grained timestamp is handed out, record it, and when later
coarse-grained stamps are handed out, ensure they are not earlier than that
value. If the coarse-grained timestamp is earlier than the fine-grained
floor, return the floor value instead.
Add a static singleton atomic64_t into timekeeper.c that is used to keep
track of the latest fine-grained time ever handed out. This is tracked as a
monotonic ktime_t value to ensure that it isn't affected by clock
jumps. Because it is updated at different times than the rest of the
timekeeper object, the floor value is managed independently of the
timekeeper via a cmpxchg() operation, and sits on its own cacheline.
Add two new public interfaces:
- ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
coarse-grained clock and the floor time
- ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
to swap it into the floor. A timespec64 is filled with the result.
The floor value is global and updated via a single try_cmpxchg(). If
that fails then the operation raced with a concurrent update. Any
concurrent update must be later than the existing floor value, so any
racing tasks can accept any resulting floor value without retrying.
[1]: POSIX requires that files be stamped with realtime clock values, and
makes no provision for dealing with backward clock jumps. If a backward
realtime clock jump occurs, then files can appear to have been modified
in reverse order.
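A conceptual sketch of the floor update (names are illustrative; the
real code lives in kernel/time/timekeeping.c):

  #include <linux/atomic.h>

  static atomic64_t mg_floor_sketch;	/* latest fine-grained mono time */

  static void floor_swap_sketch(s64 fine_mono)
  {
  	s64 old = atomic64_read(&mg_floor_sketch);

  	/* If the try_cmpxchg fails, a concurrent update installed a
  	 * floor that is necessarily later than ours; accept it as-is.
  	 */
  	if (fine_mono > old)
  		atomic64_try_cmpxchg(&mg_floor_sketch, &old, fine_mono);
  }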
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20241002-mgtime-v10-1-d1c4717f5284@kernel.org
|
|
The GE HealthCare PMC Analog to Digital Converter (ADC) is a 16-Channel
(voltage and current), 16-Bit ADC with an I2C Interface.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Tested-by: Ian Ray <ian.ray@gehealthcare.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20241003114641.672086-3-herve.codina@bootlin.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
On most modern Qualcomm SoCs, the configuration necessary to enable the
Tag/Data RAM related irqs being propagated to the SoC irq controller is
already done in firmware (in DSF, or 'DDR System Firmware').
On some, like the x1e80100, these registers aren't even accessible to the
kernel, causing a crash when the EDAC device is probed.
Hence, make the irq configuration optional in the driver and mark x1e80100
as the SoC on which this should be avoided.
Fixes: af16b00578a7 ("arm64: dts: qcom: Add base X1E80100 dtsi and the QCP dts")
Reported-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Rajendra Nayak <quic_rjendra@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240903101510.3452734-1-quic_rjendra@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Add new entry to the SM8450 dt-bindings and add SM8475-specific clocks
to SM8450 GCC header file.
Signed-off-by: Danila Tikhonov <danila@jiaxyga.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240818204348.197788-2-danila@jiaxyga.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
The "struct batadv_tvlv_tt_data" uses a dynamically sized set of
trailing elements. Specifically, it uses an array of structures of type
"batadv_tvlv_tt_vlan_data". So, use the preferred way in the kernel
declaring a flexible array [1].
At the same time, prepare for the coming implementation by GCC and Clang
of the __counted_by attribute. Flexible array members annotated with
__counted_by can have their accesses bounds-checked at run-time via
CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for
strcpy/memcpy-family functions). In this case, it is important to note
that the attribute used is specifically __counted_by_be since variable
"num_vlan" is of type __be16.
The following change to the "batadv_tt_tvlv_ogm_handler_v1" function:
- tt_vlan = (struct batadv_tvlv_tt_vlan_data *)(tt_data + 1);
- tt_change = (struct batadv_tvlv_tt_change *)(tt_vlan + num_vlan);
+ tt_change = (struct batadv_tvlv_tt_change *)((void *)tt_data
+ + flex_size);
is intended to prevent the compiler from generating an "out-of-bounds"
notification due to the __counted_by attribute. The compiler can do a
pointer calculation using the vlan_data flexible array memory, or in
other words, this may be calculated as an array offset, since it is the
same as:
&tt_data->vlan_data[num_vlan]
Therefore, we go past the end of the array. In other "multiple trailing
flexible array" situations, this has been solved by addressing from the
base pointer, since the compiler either knows the full allocation size
or it knows nothing about it (this case, since it came from a "void *"
function argument).
The order in which the structure batadv_tvlv_tt_data and the structure
batadv_tvlv_tt_vlan_data are defined must be swapped to avoid an
incomplete type error.
Also, avoid the open-coded arithmetic in memory allocator functions [2]
using the "struct_size" macro and use the "flex_array_size" helper to
clarify some calculations, when possible.
Moreover, the new structure member also allows us to avoid the open-coded
arithmetic on pointers in some situations. Take advantage of this.
This code was detected with the help of Coccinelle, and audited and
modified manually.
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1]
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2]
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Erick Archer <erick.archer@outlook.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
When using the function_graph tracer to analyze the flow of kernel function
execution, it is often necessary to quickly locate the exact line of code
where the call occurs. While this may be easy at times, it can be more
time-consuming when some functions are inlined or the flow is too long.
This feature aims to simplify the process by recording the return address
of traced functions and printing it when outputting trace logs.
To enhance human readability, the prefix 'ret=' is used for the kernel return
value, while '<-' serves as the prefix for the return address in trace logs to
make it look more like the function tracer.
A new trace option named 'funcgraph-retaddr' has been introduced, and the
existing option 'sym-addr' can be used to control the format of the return
address.
See below logs with both funcgraph-retval and funcgraph-retaddr enabled.
0) | load_elf_binary() { /* <-bprm_execve+0x249/0x600 */
0) | load_elf_phdrs() { /* <-load_elf_binary+0x84/0x1730 */
0) | __kmalloc_noprof() { /* <-load_elf_phdrs+0x4a/0xb0 */
0) 3.657 us | __cond_resched(); /* <-__kmalloc_noprof+0x28c/0x390 ret=0x0 */
0) + 24.335 us | } /* __kmalloc_noprof ret=0xffff8882007f3000 */
0) | kernel_read() { /* <-load_elf_phdrs+0x6c/0xb0 */
0) | rw_verify_area() { /* <-kernel_read+0x2b/0x50 */
0) | security_file_permission() { /* <-kernel_read+0x2b/0x50 */
0) | selinux_file_permission() { /* <-security_file_permission+0x26/0x40 */
0) | __inode_security_revalidate() { /* <-selinux_file_permission+0x6d/0x140 */
0) 2.034 us | __cond_resched(); /* <-__inode_security_revalidate+0x5f/0x80 ret=0x0 */
0) 6.602 us | } /* __inode_security_revalidate ret=0x0 */
0) 2.214 us | avc_policy_seqno(); /* <-selinux_file_permission+0x107/0x140 ret=0x0 */
0) + 16.670 us | } /* selinux_file_permission ret=0x0 */
0) + 20.809 us | } /* security_file_permission ret=0x0 */
0) + 25.217 us | } /* rw_verify_area ret=0x0 */
0) | __kernel_read() { /* <-load_elf_phdrs+0x6c/0xb0 */
0) | ext4_file_read_iter() { /* <-__kernel_read+0x160/0x2e0 */
Then, we can use faddr2line to locate the source code, for example:
$ ./scripts/faddr2line ./vmlinux load_elf_phdrs+0x6c/0xb0
load_elf_phdrs+0x6c/0xb0:
elf_read at fs/binfmt_elf.c:471
(inlined by) load_elf_phdrs at fs/binfmt_elf.c:531
Link: https://lore.kernel.org/20240915032912.1118397-1-dolinux.peng@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409150605.HgUmU8ea-lkp@intel.com/
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
[ Rebased to handle text_delta offsets ]
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Ensure that the inline function contains no more than 10 lines:
move q_num_set() from hisi_acc_qm.h to qm.c.
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Alternatively to the X9.62 encoding of ecdsa signatures, which uses
ASN.1 and is already supported by the kernel, there's another common
encoding called P1363. It stores r and s as the concatenation of two
big endian, unsigned integers. The name originates from IEEE P1363.
Add a P1363 template in support of the forthcoming SPDM library
(Security Protocol and Data Model) for PCI device authentication.
P1363 is prescribed by SPDM 1.2.1 margin no 44:
"For ECDSA signatures, excluding SM2, in SPDM, the signature shall be
the concatenation of r and s. The size of r shall be the size of
the selected curve. Likewise, the size of s shall be the size of
the selected curve. See BaseAsymAlgo in NEGOTIATE_ALGORITHMS for
the size of r and s. The byte order for r and s shall be in big
endian order. When placing ECDSA signatures into an SPDM signature
field, r shall come first followed by s."
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.2.1.pdf
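A usage sketch of the new template via the kernel sig API (the template
name follows this commit; error handling is trimmed, and the ECDSA curve
choice is arbitrary):

  #include <crypto/sig.h>
  #include <linux/err.h>

  static int verify_p1363(const void *sig, unsigned int sig_len,
  			const void *digest, unsigned int digest_len,
  			const void *key, unsigned int key_len)
  {
  	struct crypto_sig *tfm;
  	int err;

  	tfm = crypto_alloc_sig("p1363(ecdsa-nist-p256)", 0, 0);
  	if (IS_ERR(tfm))
  		return PTR_ERR(tfm);

  	err = crypto_sig_set_pubkey(tfm, key, key_len);
  	if (!err)
  		err = crypto_sig_verify(tfm, sig, sig_len,
  					digest, digest_len);

  	crypto_free_sig(tfm);
  	return err;
  }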
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
software_key_query() returns the maximum signature and digest size for a
given key to user space. When it only supported RSA keys, calculating
those sizes was trivial as they were always equivalent to the key size.
However when ECDSA was added, the function grew somewhat complicated
calculations which take the ASN.1 encoding and curve into account.
This doesn't scale well and adjusting the calculations is easily
forgotten when adding support for new encodings or curves. In fact,
when NIST P521 support was recently added, the function was initially
not amended:
https://lore.kernel.org/all/b749d5ee-c3b8-4cbd-b252-7773e4536e07@linux.ibm.com/
Introduce a ->max_size() callback to struct sig_alg and take advantage
of it to move the signature size calculations to ecdsa-x962.c.
Introduce a ->digest_size() callback to struct sig_alg and move the
maximum ECDSA digest size to ecdsa.c. It is common across ecdsa-x962.c
and the upcoming ecdsa-p1363.c and thus inherited by both of them.
For all other algorithms, continue using the key size as maximum
signature and digest size.
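The two new callbacks, sketched (other members elided; see
include/crypto/sig.h for the full struct sig_alg definition):

  struct sig_alg_sketch {
  	/* maximum signature size for the key set on the transform */
  	unsigned int (*max_size)(struct crypto_sig *tfm);
  	/* maximum digest size the algorithm accepts */
  	unsigned int (*digest_size)(struct crypto_sig *tfm);
  	/* ... sign, verify, set_pub_key, set_priv_key, ... */
  };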
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
crypto_sig_maxsize() is a bit of a misnomer as it doesn't return the
maximum signature size, but rather the key size.
Rename it as well as all implementations of the ->max_size callback.
A subsequent commit introduces a crypto_sig_maxsize() function which
returns the actual maximum signature size.
While at it, change the return type of crypto_sig_keysize() from int to
unsigned int for consistency with crypto_akcipher_maxsize(). None of
the callers checks for a negative return value and an error condition
can always be indicated by returning zero.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Unlike the rsa driver, which separates signature decoding and
signature verification into two steps, the ecdsa driver does both in one.
This restricts users to the one signature format currently supported
(X9.62) and prevents addition of others such as P1363, which is needed
by the forthcoming SPDM library (Security Protocol and Data Model) for
PCI device authentication.
Per Herbert's suggestion, change ecdsa to use a "raw" signature encoding
and then implement X9.62 and P1363 as templates which convert their
respective encodings to the raw one. One may then specify
"x962(ecdsa-nist-XXX)" or "p1363(ecdsa-nist-XXX)" to pick the encoding.
The present commit moves X9.62 decoding to a template. A separate
commit is going to introduce another template for P1363 decoding.
The ecdsa driver internally represents a signature as two u64 arrays of
size ECC_MAX_BYTES. This appears to be the most natural choice for the
raw format as it can directly be used for verification without having to
further decode signature data or copy it around.
Repurpose all the existing test vectors for "x962(ecdsa-nist-XXX)" and
create a duplicate of them to test the raw encoding.
Link: https://lore.kernel.org/all/ZoHXyGwRzVvYkcTP@gondor.apana.org.au/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
If <linux/asn1_decoder.h> is the first header included from a .c file
(due to headers being sorted alphabetically), the compiler complains:
include/linux/asn1_decoder.h:18:29: error: unknown type name 'size_t'
Avoid by including <linux/types.h>.
Jonathan notes that the counterpart <linux/asn1_encoder.h> already
includes <linux/types.h>, but additionally includes the unnecessary
<linux/bug.h>. Drop it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The crypto_sig_*() API calls lived in sig.c so far because they needed
access to struct crypto_sig_type: This was necessary to differentiate
between signature algorithms that had already been migrated from
crypto_akcipher to crypto_sig and those that hadn't yet.
Now that all algorithms have been migrated, the API calls can become
static inlines in <crypto/sig.h> to mimic what <crypto/akcipher.h> is
doing.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced and all asymmetric
sign/verify algorithms have been migrated to it.
The sign/verify operations can thus be dropped from akcipher_alg.
It is now purely for asymmetric encrypt/decrypt.
Move struct crypto_akcipher_sync_data from internal.h to akcipher.c and
unexport crypto_akcipher_sync_{prep,post}(): They're no longer used by
sig.c but only locally in akcipher.c.
In crypto_akcipher_sync_{prep,post}(), drop various NULL pointer checks
for data->dst as they were only necessary for the verify operation.
In the crypto_sig_*() API calls, remove the forks that were necessary
while algorithms were converted from crypto_akcipher to crypto_sig
one by one.
In struct akcipher_testvec, remove the "params", "param_len" and "algo"
elements as they were only needed for the ecrdsa verify operation.
Remove corresponding dead code from test_akcipher_one() as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The virtio crypto driver exposes akcipher sign/verify operations in a
user space ABI. This blocks removal of sign/verify from akcipher_alg.
Herbert opines:
"I would say that this is something that we can break. Breaking it
is no different to running virtio on a host that does not support
these algorithms. After all, a software implementation must always
be present.
I deliberately left akcipher out of crypto_user because the API
is still in flux. We should not let virtio constrain ourselves."
https://lore.kernel.org/all/ZtqoNAgcnXnrYhZZ@gondor.apana.org.au/
"I would remove virtio akcipher support in its entirety. This API
was never meant to be exposed outside of the kernel."
https://lore.kernel.org/all/Ztqql_gqgZiMW8zz@gondor.apana.org.au/
Drop sign/verify support from virtio crypto. There's no strong reason
to also remove encrypt/decrypt support, so keep it.
A key selling point of virtio crypto is to allow guest access to crypto
accelerators on the host. So far the only akcipher algorithm supported
by virtio crypto is RSA. Dropping sign/verify merely means that the
PKCS#1 padding is now always generated or verified inside the guest,
but the actual signature generation/verification (which is an RSA
decrypt/encrypt operation) may still use an accelerator on the host.
Generating or verifying the PKCS#1 padding is cheap, so a hardware
accelerator won't be of much help there. This raises the question of
whether virtio crypto support for sign/verify makes sense at all.
It would make sense for the sign operation if the host has a security
chip to store asymmetric private keys. But the kernel doesn't even
have an asymmetric_key_subtype yet for hardware-based private keys.
There's at least one rudimentary driver for such chips (atmel-ecc.c for
ATECC508A), but it doesn't implement the sign operation. The kernel
would first have to grow support for a hardware asymmetric_key_subtype
and at least one driver implementing the sign operation before exposure
to guests via virtio makes sense.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced with the intent of moving all
asymmetric sign/verify algorithms to it one by one.
Migrate the sign/verify operations from rsa-pkcs1pad.c to a separate
rsassa-pkcs1.c which uses the new backend.
Consequently there are now two templates which build on the "rsa"
akcipher_alg:
* The existing "pkcs1pad" template, which is instantiated as an
akcipher_instance and retains the encrypt/decrypt operations of
RSAES-PKCS1-v1_5 (RFC 8017 sec 7.2).
* The new "pkcs1" template, which is instantiated as a sig_instance
and contains the sign/verify operations of RSASSA-PKCS1-v1_5
(RFC 8017 sec 8.2).
In a separate step, rsa-pkcs1pad.c could optionally be renamed to
rsaes-pkcs1.c for clarity. Additional "oaep" and "pss" templates
could be added for RSAES-OAEP and RSASSA-PSS.
Note that it's currently allowed to allocate a "pkcs1pad(rsa)" transform
without specifying a hash algorithm. That makes sense if the transform
is only used for encrypt/decrypt and continues to be supported. But for
sign/verify, such transforms previously did not insert the Full Hash
Prefix into the padding. The resulting message encoding was incompliant
with EMSA-PKCS1-v1_5 (RFC 8017 sec 9.2) and therefore nonsensical.
From here on in, it is no longer allowed to allocate a transform without
specifying a hash algorithm if the transform is used for sign/verify
operations. This simplifies the code because the insertion of the Full
Hash Prefix is no longer optional, so various "if (digest_info)" clauses
can be removed.
There has been a previous attempt to forbid transform allocation without
specifying a hash algorithm, namely by commit c0d20d22e0ad ("crypto:
rsa-pkcs1pad - Require hash to be present"). It had to be rolled back
with commit b3a8c8a5ebb5 ("crypto: rsa-pkcs1pad: Allow hash to be
optional [ver #2]"), presumably because it broke allocation of a
transform which was solely used for encrypt/decrypt, not sign/verify.
Avoid such breakage by allowing transform allocation for encrypt/decrypt
with and without specifying a hash algorithm (and simply ignoring the
hash algorithm in the former case).
So again, specifying a hash algorithm is now mandatory for sign/verify,
but optional and ignored for encrypt/decrypt.
The new sig_alg API uses kernel buffers instead of sglists, which
avoids the overhead of copying signature and digest from sglists back
into kernel buffers. rsassa-pkcs1.c is thus simplified quite a bit.
sig_alg is always synchronous, whereas the underlying "rsa" akcipher_alg
may be asynchronous. So await the result of the akcipher_alg, similar
to crypto_akcipher_sync_{en,de}crypt().
As part of the migration, rename "rsa_digest_info" to "hash_prefix" to
adhere to the spec language in RFC 9580. Otherwise keep the code
unmodified wherever possible to ease reviewing and bisecting. Leave
several simplification and hardening opportunities to separate commits.
rsassa-pkcs1.c uses modern __free() syntax for allocation of buffers
which need to be freed by kfree_sensitive(), hence a DEFINE_FREE()
clause for kfree_sensitive() is introduced herein as a byproduct.
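The byproduct mentioned above, sketched after the existing
DEFINE_FREE(kfree, ...) in include/linux/slab.h, together with a usage
example (the demo function is hypothetical):

  #include <linux/cleanup.h>
  #include <linux/slab.h>

  DEFINE_FREE(kfree_sensitive, void *, if (_T) kfree_sensitive(_T))

  static int demo(unsigned int len)
  {
  	/* zeroized and freed automatically on every return path */
  	void *buf __free(kfree_sensitive) = kmalloc(len, GFP_KERNEL);

  	if (!buf)
  		return -ENOMEM;
  	/* ... use buf for key material ... */
  	return 0;
  }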
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
pkcs1pad_set_pub_key() and pkcs1pad_set_priv_key() are almost identical.
The upcoming migration of sign/verify operations from rsa-pkcs1pad.c
into a separate crypto_template will require another copy of the exact
same functions. When RSASSA-PSS and RSAES-OAEP are introduced, each
will need yet another copy.
Deduplicate the functions into a single one which lives in a common
header file for reuse by RSASSA-PKCS1-v1_5, RSASSA-PSS and RSAES-OAEP.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|